Hadoop/Hive query to split one column into several
I have two tables in Hive that look (more or less) like this:
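- Table1 is defined as [(Variables: string), (Value1: int), (Value2: int)],
  where the field "Variables" looks like "x0,x1,x2,x3,...,xn"
- Table2 is defined as [(Value1Sum: int), (Value2Sum: int), (X1: string), (X4: string), (X17: string)]
I "convert" Table1 to Table2 with this query: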
INSERT OVERWRITE TABLE table2
SELECT sum(v1), sum(v2), x1, x4, x17
FROM (SELECT
Value1 as v1,
Value2 as v2,
split(Variables, ",")[1] as x1,
split(Variables, ",")[4] as x4,
split(Variables, ",")[17] as x17
FROM Table1) tmp
GROUP BY tmp.x1, tmp.x4, tmp.x17
Does Hive call the split function three times?
Is there a way to make it more elegant?
Is there a way to make it more generic?
Best regards, CC
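For reference, one possible restructuring is sketched below: the inner query calls split once per row and the outer query indexes the resulting array. The alias "parts" is a made-up name for illustration, and this is untested; whether Hive actually evaluates split fewer times this way is an assumption, not something confirmed here.

```sql
-- Sketch only: "parts" is a hypothetical alias for the split result.
INSERT OVERWRITE TABLE table2
SELECT sum(tmp.v1), sum(tmp.v2), tmp.parts[1], tmp.parts[4], tmp.parts[17]
FROM (SELECT
        Value1 AS v1,
        Value2 AS v2,
        split(Variables, ",") AS parts  -- split written once per row
      FROM Table1) tmp
-- Group by the same array elements that appear in the SELECT list.
GROUP BY tmp.parts[1], tmp.parts[4], tmp.parts[17];
```

With this shape, picking different fields would only mean changing the array indexes in the outer query.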