2017-06-06

Pig: COUNT after JOIN and GROUP BY

I'm new to Pig and am trying to understand why I can't count after a join and a group:

A = LOAD 'mary' as (line); 
B = LOAD 'mary' as (line); 

wordsA = foreach A generate flatten(TOKENIZE(line)) as wordA; 
grpdA = group wordsA by wordA; 
cntdA = foreach grpdA generate group, COUNT(wordsA); 

wordsB = foreach B generate flatten(TOKENIZE(line)) as wordB; 
grpdB = group wordsB by wordB; 
cntdB = foreach grpdB generate group, COUNT(wordsB), 'some text'; 

fltB = FILTER cntdB BY $1>1; 

jnd = join cntdA by $1, fltB by $1; 
jnd_n = foreach jnd generate $0; 
grp = group jnd by $0; 
out = foreach grp generate group, count(jnd_n); 

dump jnd_n; 
dump grp; 

Dump of jnd_n:

(was) 
(was) 
(was) 
(lamb) 
(lamb) 
(lamb) 
(Mary) 
(Mary) 
(Mary) 

Dump of grp:

(was,{(was,2,was,2,some text),(was,2,Mary,2,some text),(was,2,lamb,2,some text)}) 
(Mary,{(Mary,2,was,2,some text),(Mary,2,Mary,2,some text),(Mary,2,lamb,2,some text)}) 
(lamb,{(lamb,2,was,2,some text),(lamb,2,Mary,2,some text),(lamb,2,lamb,2,some text)}) 

But I get this error:

Invalid scalar projection: jnd_n : A column needs to be projected from a relation for it to be used as a scalar

If I try changing the code to:

out = foreach grp generate group, count(jnd_n.$0); 

then I get a different error:

Failed to generate logical plan. Nested exception: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not resolve count using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]

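I know I can do this another way, but I want a result like the one below, produced specifically after the two Pig operations JOIN and GROUP BY.

Desired dump: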

(was,3) 
(lamb,3) 
(Mary,3) 

Answers

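COUNT needs to be in upper case. COUNT is a built-in function, and function names in Pig are case sensitive, which is why `count` could not be resolved.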

out = foreach grp generate group, COUNT(jnd_n.$0);

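Thanks for the answer. Yes, COUNT is case sensitive, but I got another error: "ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1045: <file script.pig, line 24, column 34> Could not infer the matching function for org.apache.pig.builtin.COUNT as multiple or none of them fit. Please use an explicit cast." – Dipas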
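
ERROR 1045 probably arises because `jnd_n` is a separate relation, not the bag produced by the GROUP. After `grp = group jnd by $0`, each row of `grp` holds the key `group` and a bag named `jnd`. Pig therefore seems to treat `jnd_n.$0` as a scalar projection, and COUNT, which expects a bag, finds no matching signature. Below is a minimal sketch that counts the grouped bag instead. It assumes the same input file 'mary'; the field names `word`, `cnt` and `note` are introduced here for readability and are not in the original script.

```pig
-- Sketch only: same pipeline as the question, with named fields (assumed names).
A = LOAD 'mary' AS (line:chararray);
B = LOAD 'mary' AS (line:chararray);

-- Word counts for A
wordsA = FOREACH A GENERATE FLATTEN(TOKENIZE(line)) AS wordA;
grpdA  = GROUP wordsA BY wordA;
cntdA  = FOREACH grpdA GENERATE group AS word, COUNT(wordsA) AS cnt;

-- Word counts for B, keeping only words that occur more than once
wordsB = FOREACH B GENERATE FLATTEN(TOKENIZE(line)) AS wordB;
grpdB  = GROUP wordsB BY wordB;
cntdB  = FOREACH grpdB GENERATE group AS word, COUNT(wordsB) AS cnt, 'some text' AS note;
fltB   = FILTER cntdB BY cnt > 1;

-- Join on the count, then group the joined relation by the word from cntdA
jnd = JOIN cntdA BY cnt, fltB BY cnt;
grp = GROUP jnd BY cntdA::word;

-- The bag inside grp is named after the grouped relation (jnd), so count that bag
out = FOREACH grp GENERATE group, COUNT(jnd);

DUMP out;
```

In the question's dump of grp, each group's bag holds three tuples, so this should give a count of 3 for each of was, lamb and Mary. Grouping `jnd_n` instead (`grp = GROUP jnd_n BY $0;` followed by `COUNT(jnd_n)`) should behave the same way, since the bag is then named `jnd_n`.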