2013-02-06 37 views

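How to count the number of unique users with Pig

The following code snippet does not return what I am trying to compute: the number of unique users. Any ideas?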

data = LOAD 'input_initial' AS (user_id,item_id,rating,timestamp); 
data = FOREACH data GENERATE user_id,item_id; 
STORE data INTO 'input_final'; 
data_users = FOREACH data GENERATE user_id; 
group_users = GROUP data_users BY user_id; 
count_users = FOREACH group_users GENERATE COUNT(data_users); 
STORE count_users INTO 'count_users'; 

Answer


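Grouping by user_id alone gives you one count per user (the number of rows for that user), not a single total. You need the last grouping operation to act on ALL rather than on a single field, and then count the per-user groups: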

group_users = GROUP data_users BY user_id; 
grp_all = GROUP group_users ALL; 
count_users = FOREACH grp_all GENERATE COUNT(group_users); 
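
As an alternative sketch of the same idea, the unique users can be deduplicated with DISTINCT before the GROUP ... ALL step. This reuses the input path and field names from the question; the aliases `users` and `uniq_users` are made up here for illustration, and the script has not been run against the asker's data.

```pig
-- Load the same input as in the question (untyped fields, as in the original).
data = LOAD 'input_initial' AS (user_id, item_id, rating, timestamp);

-- Keep only the user column, then drop duplicate user ids.
users = FOREACH data GENERATE user_id;
uniq_users = DISTINCT users;

-- Put every remaining row into a single group so COUNT yields one total.
grp_all = GROUP uniq_users ALL;
count_users = FOREACH grp_all GENERATE COUNT(uniq_users);

STORE count_users INTO 'count_users';
```

Note that COUNT skips tuples whose first field is null, so rows with a null user_id would not be counted; COUNT_STAR includes them, if that is what you want.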

It doesn't work, I'm afraid. Have you tested it? Did it succeed for you? – user706838


Sorry, I missed a step. –


Great! It works! Thank you very much! – user706838
