2017-06-21 28 views

How to group by multiple columns in a Pig script

What would be the equivalent Pig script for the following SQL query?

SELECT fld1, fld2, fld3, SUM(fld4) 
FROM Table1 
GROUP BY fld1, fld2, fld3; 

For Table1:

A B C 2 X Y Z 
A B C 3 X Y Z 
A B D 2 X Y Z 
A C D 2 X Y Z 
A C D 2 X Y Z 
A C D 2 X Y Z 

OUTPUT:

A B C 5   
A B D 2   
A C D 6   

Answer


See https://pig.apache.org/docs/r0.11.1/basic.html#GROUP, where you can find an example of grouping on multiple fields.

For your use case, the following code should be enough:

-- Load the comma-separated input and name all seven columns
A = LOAD 'input.csv' USING PigStorage(',') AS (fld1:chararray,fld2:chararray,fld3:chararray,fld4:long,fld5:chararray,fld6:chararray,fld7:chararray);
-- Group on the three key columns, then sum fld4 within each group
B = FOREACH (GROUP A BY (fld1,fld2,fld3)) GENERATE FLATTEN(group) AS (fld1,fld2,fld3), SUM(A.fld4) AS fld4_aggr;
DUMP B;
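To show what the multi-column key looks like, here is a minimal sketch that splits the grouping into its own step. It assumes the sample rows from the question are saved space-delimited, with no trailing delimiter, in a hypothetical file named 'table1.txt'. The file name, the delimiter, and the aliases G and fld4_sum are illustrative assumptions, not part of the original post.

```pig
-- Assumption: 'table1.txt' is a hypothetical file holding the sample rows, separated by single spaces
A = LOAD 'table1.txt' USING PigStorage(' ') AS (fld1:chararray, fld2:chararray, fld3:chararray, fld4:long, fld5:chararray, fld6:chararray, fld7:chararray);

-- Grouping on a parenthesized field list makes the key a tuple, exposed as "group"
G = GROUP A BY (fld1, fld2, fld3);

-- Print the schema of G to inspect the key tuple and the bag of grouped rows
DESCRIBE G;

-- FLATTEN unpacks the key tuple back into three top-level columns
B = FOREACH G GENERATE FLATTEN(group) AS (fld1, fld2, fld3), SUM(A.fld4) AS fld4_sum;
DUMP B;
```

Without FLATTEN, each output row would carry the key as a single nested tuple rather than three separate columns; the alternative is to project the key fields individually as group.fld1, group.fld2, and group.fld3.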

Thanks, it worked... :) I finally came up with:
A = LOAD 'input.csv' USING PigStorage(',') AS (fld1:chararray,fld2:chararray,fld3:chararray,fld4:long,fld5:chararray,fld6:chararray,fld7:chararray);
group_A = GROUP A BY (fld1,fld2,fld3);
B = FOREACH group_A GENERATE group.fld1, group.fld2, group.fld3, SUM(A.fld4) AS sum_fld4;
DUMP B;
– Saurabh