2017-06-08 79 views

Hive view: cannot query a view that depends on a UDTF

I created a table as follows:

CREATE TABLE TEST (ID INT, SCORE INT, NAME STRING); 

and inserted a few records. I want to run a top-k query that returns the top record for each ID, ranked by SCORE.
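For concreteness, the inserted records could look like the sketch below. The rows are hypothetical and do not come from the original post; they only give each ID more than one SCORE so that a top-1 query has something to choose between. It assumes a Hive version that supports INSERT ... VALUES (0.14 or later).

```sql
-- Hypothetical sample rows (not from the original post):
-- two records per ID, with different scores.
INSERT INTO TABLE TEST VALUES
  (1, 90, 'alice'),
  (1, 75, 'bob'),
  (2, 60, 'carol'),
  (2, 85, 'dave');
```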

I am using the each_top_k() function from the Hivemall library, as documented here: https://hivemall.incubator.apache.org/userguide/misc/topk.html

SELECT EACH_TOP_K(1, ID, SCORE, ID, NAME) AS (RANK, SCORE, ID, NAME) FROM (
SELECT * FROM TEST 
CLUSTER BY ID 
) T; 

This successfully returns the top score for each ID. However, I then created a view as follows:

CREATE VIEW TEST_VIEW AS SELECT EACH_TOP_K(1, ID, SCORE, ID, NAME) AS (RANK, SCORE, ID, NAME) FROM (
SELECT * FROM TEST 
CLUSTER BY ID 
) T; 

and it was created successfully. However, running a simple

SELECT * FROM TEST_VIEW; 

returns the following error:

Error: Error while compiling statement: FAILED: SemanticException View test_view is corresponding to UDTF, rather than a SelectOperator. (state=42000,code=40000)

I could not find any mention of this error. Any suggestions?

Answers


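My guess is that Hive has trouble inferring the data type of each field produced by your UDTF at runtime. Wrapping your query in an outer query that casts each column explicitly should fix it, for example: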

CREATE VIEW TEST_VIEW AS 
select cast(rank as bigint) as rank, cast(score as double) as score, cast(id as string) as id, cast(name as string) as name from (
SELECT EACH_TOP_K(1, ID, SCORE, ID, NAME) AS (RANK, SCORE, ID, NAME) FROM (
SELECT * FROM TEST 
CLUSTER BY ID 
) T) t2; 
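One practical detail, sketched under the assumption that the failing TEST_VIEW from the question still exists: CREATE VIEW would then be rejected because the name is taken, so the old definition needs to be dropped first. After recreating the view, DESCRIBE can be used to check which column types Hive recorded for it.

```sql
-- Run before the CREATE VIEW statement above
-- (assumes the earlier, failing TEST_VIEW is still defined).
DROP VIEW IF EXISTS TEST_VIEW;

-- Run after the CREATE VIEW statement above:
-- inspect the column names and types stored for the view...
DESCRIBE TEST_VIEW;

-- ...then retry the query that previously failed.
SELECT * FROM TEST_VIEW;
```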

Thanks, that was it. I had to deal with the function producing data types that do not line up with the original fields. – TheElysian