2017-01-05 76 views

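Find the average and sort in ascending order in Pig

I have a schema with 9 fields, but I only need two of them (the 6th and 7th, i.e. $5 and $6). I want to compute the average of $5 and sort $6 in ascending order. Could someone help me with how to accomplish this?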

Input data:

N368SW 188 170 175 17 -1 MCO MHT 1142 
N360SW 100 115 87 -10 5 MCO MSY 550 
N626SW 114 115 90 13 14 MCO MSY 550 
N252WN 107 115 84 -10 -2 MCO MSY 550 
N355SW 104 115 85 -1 10 MCO MSY 550 
N405WN 113 110 96 14 11 MCO ORF 655 
N456WN 110 110 92 24 24 MCO ORF 655 
N743SW 144 155 124 7 18 MCO PHL 861 
N276WN 142 150 129 -2 6 MCO PHL 861 
N369SW 153 145 134 30 22 MCO PHL 861 
N363SW 151 145 137 5 -1 MCO PHL 861 
N346SW 141 150 128 51 60 MCO PHL 861 
N785SW 131 145 118 -15 -1 MCO PHL 861 
N635SW 144 155 127 -6 5 MCO PHL 861 
N242WN 298 300 276 68 70 MCO PHX 1848 
N439WN 130 140 111 -4 6 MCO PIT 834 
N348SW 140 135 124 7 2 MCO PIT 834 
N672SW 136 135 122 9 8 MCO PIT 834 
N493WN 151 160 136 -9 0 MCO PVD 1073 
N380SW 170 155 155 13 -2 MCO PVD 1073 
N705SW 164 160 147 6 2 MCO PVD 1073 
N233LV 157 160 143 1 4 MCO PVD 1073 
N786SW 156 160 139 6 10 MCO PVD 1073 
N280WN 160 160 146 1 1 MCO PVD 1073 
N282WN 104 95 81 10 1 MCO RDU 534 
N694SW 89 100 77 3 14 MCO RDU 534 
N266WN 94 95 82 9 10 MCO RDU 534 
N218WN 98 100 77 12 14 MCO RDU 534 
N355SW 47 50 35 15 18 MCO RSW 133 
N388SW 44 45 30 37 38 MCO RSW 133 
N786SW 46 50 31 4 8 MCO RSW 133 
N707SA 52 50 33 10 8 MCO RSW 133 
N795SW 176 185 153 -9 0 MCO SAT 1040 
N402WN 176 185 161 4 13 MCO SAT 1040 
N690SW 123 130 107 -1 6 MCO SDF 718 
N457WN 135 130 105 20 15 MCO SDF 718 
N720WN 144 155 131 13 24 MCO STL 880 
N775SW 147 160 135 -6 7 MCO STL 880 
N291WN 136 155 122 96 115 MCO STL 880 
N247WN 144 155 127 43 54 MCO STL 880 
N748SW 179 185 159 -4 2 MDW ABQ 1121 
N709SW 176 190 158 21 35 MDW ABQ 1121 
N325SW 110 105 97 36 31 MDW ALB 717 
N305SW 116 110 90 107 101 MDW ALB 717 
N403WN 145 165 128 -6 14 MDW AUS 972 
N767SW 136 165 125 59 88 MDW AUS 972 
N730SW 118 120 100 28 30 MDW BDL 777 

I wrote the following code, but it does not work correctly:

a = load '/path/to/file' using PigStorage('\t'); 
b = foreach a generate (int)$5 as field_a:int,(chararray)$6 as field_b:chararray; 
c = group b all; 
d = foreach c generate b.field_b,AVG(b.field_a); 
e = order d by field_b ASC; 
dump e; 

I get an error at the ORDER step:

grunt> a = load '/user/horton/sample_pig_data.txt' using PigStorage('\t'); 
grunt> b = foreach a generate (int)$5 as fielda:int,(chararray)$6 as fieldb:chararray; 
grunt> describe @; 
b: {fielda: int,fieldb: chararray} 
grunt> c = group b all; 
grunt> describe @; 
c: {group: chararray,b: {(fielda: int,fieldb: chararray)}} 
grunt> d = foreach c generate b.fieldb,AVG(b.fielda);                             
grunt> e = order d by fieldb ; 
2017-01-05 15:51:29,623 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1025: 
<line 6, column 15> Invalid field projection. Projected field [fieldb] does not exist in schema: :bag{:tuple(fieldb:chararray)},:double. 
Details at logfile: /root/pig_1483631021021.log 
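
The error message spells out the cause: after `group b all`, relation `d` has only two unnamed top-level fields, a bag of `(fieldb)` tuples and a double, so `fieldb` exists inside the bag but not as a column that `order` can reference. A minimal sketch of one way around this, assuming the goal is a single overall average plus the `fieldb` values sorted ascending, is to sort the bag inside a nested `foreach`. The aliases `b_sorted`, `fieldb_sorted` and `avg_fielda` are made up for illustration, and the path is the placeholder from the question.

```pig
-- Sketch only: assumes the file really is tab-delimited, as the LOAD in the question implies.
a = LOAD '/path/to/file' USING PigStorage('\t');

-- Keep just the two columns of interest ($5 and $6 are zero-based positions).
b = FOREACH a GENERATE (int)$5 AS fielda, (chararray)$6 AS fieldb;

-- One group holding every row, so AVG covers the whole input.
c = GROUP b ALL;

-- Sort the bag inside the nested block, where fieldb is still visible,
-- then emit the sorted values next to the overall average.
d = FOREACH c {
    b_sorted = ORDER b BY fieldb ASC;
    GENERATE b_sorted.fieldb AS fieldb_sorted, AVG(b.fielda) AS avg_fielda;
};

DUMP d;
```

Bags are formally unordered in Pig, so it is worth checking the dumped output on the real data rather than relying on the order being preserved everywhere downstream.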

I want output like this (not related to the input data above):

(({(Bharathi),(Komal),(Archana),(Trupthi),(Preethi),(Rajesh),(siddarth),(Rajiv) }, 
    { (72) , (83) , (87) , (75) , (93) , (90) , (78) , (89) }),83.375) 

Are you getting any error? If not, what does your actual output look like? – Amit


Your output does not come from the same input file, and it is not sorted either! Can you give a proper example? – 54l3d


I have edited the post to include the error; could you take a look and tell me the solution..... –

Answers

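If you have already found the answer, the best practice is to post it, so that others who refer to this question can understand it better.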
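
If the intent is instead an average of `$5` for each distinct value of `$6` (an assumption; the question does not say so), grouping by that field leaves it as a top-level column, and a plain `ORDER` then works. The following is a hedged sketch under that reading; the aliases `fielda`, `fieldb` and `avg_fielda` are illustrative, and the path is the placeholder from the question.

```pig
-- Sketch only: assumes tab-delimited input, as in the question's LOAD statement.
a = LOAD '/path/to/file' USING PigStorage('\t');
b = FOREACH a GENERATE (int)$5 AS fielda, (chararray)$6 AS fieldb;

-- One group per distinct fieldb value.
c = GROUP b BY fieldb;

-- 'group' holds the key; rename it so ORDER can refer to it by name.
d = FOREACH c GENERATE group AS fieldb, AVG(b.fielda) AS avg_fielda;

-- fieldb is now a top-level field, so this projection is valid.
e = ORDER d BY fieldb ASC;
DUMP e;
```

This differs from a single overall average: it yields one row per `fieldb` value, each with its own mean.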