2013-11-26

Apache Pig: sorting by count

I am reading an Apache access log with Pig and counting the total number of connections from each IP.

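-- Load the space-delimited log; each field is read as a chararray.
A = LOAD 'access.log' using PigStorage(' ') as (f0:chararray,f1:chararray,f2:chararray,f3:chararray,f4:chararray,f5:chararray,f6:chararray);
-- Group by f5 (the field that holds the IP in this log) and count rows per group.
grp_f5 = GROUP A by f5;
counts = FOREACH grp_f5 GENERATE group, COUNT(A);
store counts into '/data/accesslog' using PigStorage(',');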

Result:

2.50.3.29,71 
71.5.94.4,30 
12.0.19.50,6 
12.53.17.3,4 
155.69.4.4,37 
166.77.6.8,12 
218.0.7.30,1956 
5.10.83.28,1 
5.86.82.80,177 
50.18.2.73,1 
59.10.5.53,377 

But the data is not sorted by count. Any ideas?

Answer

If you do not sort the data explicitly, it will not be sorted. Sorting can be done with ORDER BY:

counts = FOREACH grp_f5 GENERATE group, COUNT(A) AS cnt; 
counts_ordered = ORDER counts BY cnt DESC;
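
For completeness, here is a rough sketch of how the whole script from the question might look with the ordering step added. The alias `ip` and the output path '/data/accesslog_sorted' are my own placeholders and not from the question; the rest follows the original script.

```pig
-- Load the space-delimited access log with the same schema as in the question.
A = LOAD 'access.log' USING PigStorage(' ')
    AS (f0:chararray, f1:chararray, f2:chararray, f3:chararray,
        f4:chararray, f5:chararray, f6:chararray);

-- Group by the field that holds the IP in this log (f5).
grp_f5 = GROUP A BY f5;

-- Name both output fields so the count can be referenced later.
-- "ip" is a hypothetical alias chosen for readability.
counts = FOREACH grp_f5 GENERATE group AS ip, COUNT(A) AS cnt;

-- Sort by count, largest first.
counts_ordered = ORDER counts BY cnt DESC;

-- Store the ordered relation, not the unordered one.
-- The output path is a placeholder.
STORE counts_ordered INTO '/data/accesslog_sorted' USING PigStorage(',');
```

The point to watch is that STORE has to be given counts_ordered; storing counts would write the unsorted relation again. Of the rows shown in the question, 218.0.7.30,1956 should then come out ahead of the others.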