2014-10-19 43 views

I am writing a Pig script and keep hitting an error I have not been able to resolve. ERROR 1200: unexpected symbol?

Here is the code I am trying to run:

--Load files into relations 
month1 = LOAD 'hdfs:/data/big/data/weather/201201hourly.txt' USING PigStorage(','); 
month2 = LOAD 'hdfs:/data/big/data/weather/201202hourly.txt' USING PigStorage(','); 
month3 = LOAD 'hdfs:/data/big/data/weather/201203hourly.txt' USING PigStorage(','); 
month4 = LOAD 'hdfs:/data/big/data/weather/201204hourly.txt' USING PigStorage(','); 
month5 = LOAD 'hdfs:/data/big/data/weather/201205hourly.txt' USING PigStorage(','); 
month6 = LOAD 'hdfs:/data/big/data/weather/201206hourly.txt' USING PigStorage(','); 

--Combine relations 
months = UNION month1, month2, month3, month4, month5, month6; 

/* Splitting relations 
SPLIT months INTO 
     splitMonth1 IF SUBSTRING(date, 4, 6) == '01', 
     splitMonth2 IF SUBSTRING(date, 4, 6) == '02', 
     splitMonth3 IF SUBSTRING(date, 4, 6) == '03', 
     splitRest IF (SUBSTRING(date, 4, 6) == '04' OR SUBSTRING(date, 4, 6) == '04'); 
*/ 

/* Joining relations 

stations = LOAD 'hdfs:/data/big/data/QCLCD201211/stations.txt' USING PigStorage() AS (id:int, name:chararray) 

JOIN months BY wban, stations by id; 

*/ 

--filter out unwanted data 
clearWeather = FILTER months BY SkyCondition == 'CLR'; 

--Transform and shape relation 
shapedWeather = FOREACH clearWeather GENERATE date, SUBSTRING(date, 0, 4) as year, SUBSTRING(date, 4, 6) as month, SUBSTRING(date, 6, 8) as day, skyCondition, dryTemp; 

--Group relation specifying number of reducers 
groupedMonthDay = GROUP shapedWeather BY month, day PARALLEL 10; 

--Aggregate relation 
aggedResults = FOREACH groupedByMonthDay GENERATE group as MonthDay, AVG(shapedWeather.dryTemp), MIN(shapedWeather.dryTemp), MAX(shapedWeather.dryTemp), COUNT(shapedWeather.dryTemp) PARALLEL 10; 

--Sort relation 
sortedResults = SORT aggedResults BY $1 DESC; 

--Store results in HDFS 
STORE SortedResults INTO 'hdfs:/data/big/data/weather/pigresults' USING PigStorage(':'); 

This is the output I get when I run the code:

Pig Stack Trace 
--------------- 
ERROR 1200: <file /home/eduardo/Documentos/pig/weather.pig, line 35, column 52> Syntax error, unexpected symbol at or near 'PARALLEL' 

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during parsing. <file /home/eduardo/Documentos/pig/weather.pig, line 35, column 52> Syntax error, unexpected symbol at or near 'PARALLEL' 
    at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1691) 
    at org.apache.pig.PigServer$Graph.access$000(PigServer.java:1411) 
    at org.apache.pig.PigServer.parseAndBuild(PigServer.java:344) 
    at org.apache.pig.PigServer.executeBatch(PigServer.java:369) 
    at org.apache.pig.PigServer.executeBatch(PigServer.java:355) 
    at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140) 
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:202) 
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173) 
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84) 
    at org.apache.pig.Main.run(Main.java:607) 
    at org.apache.pig.Main.main(Main.java:156) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:606) 
    at org.apache.hadoop.util.RunJar.main(RunJar.java:160) 
Caused by: Failed to parse: <file /home/eduardo/Documentos/pig/weather.pig, line 35, column 52> Syntax error, unexpected symbol at or near 'PARALLEL' 
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:241) 
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:179) 
    at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1678) 
    ... 15 more 
================================================================================ 

Answer


If you are grouping by multiple columns, you must put the columns inside parentheses:

groupedMonthDay = GROUP shapedWeather BY (month, day) PARALLEL 10; 
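For context, here is a minimal sketch of how the grouped relation might then be aggregated and sorted. Without the parentheses, the parser most likely treats the comma as the start of a second relation in a multi-relation GROUP, so it expects `BY` after `day` and stops at `PARALLEL`. The schema in the `LOAD` statement is an assumption made only so the field names resolve (the question declares none, and the real hourly files probably have more columns), and the aliases `avgTemp`, `minTemp`, `maxTemp` and `readings` are illustrative. The sketch also works around a few things in the question's script that would probably fail once the parse error is gone: `groupedByMonthDay` and `SortedResults` do not match the aliases defined earlier (aliases are case sensitive), and Pig sorts with `ORDER ... BY`, not `SORT`.

```pig
-- ASSUMPTION: hypothetical schema; adjust names, order and types to the real file layout.
weather = LOAD 'hdfs:/data/big/data/weather/201201hourly.txt'
          USING PigStorage(',')
          AS (wban:int, date:chararray, skyCondition:chararray, dryTemp:double);

-- Field names are case sensitive, so this must match the schema exactly.
clearWeather = FILTER weather BY skyCondition == 'CLR';

shapedWeather = FOREACH clearWeather GENERATE
    SUBSTRING(date, 4, 6) AS month,
    SUBSTRING(date, 6, 8) AS day,
    dryTemp;

-- Several grouping keys go inside parentheses; 'group' then holds a (month, day) tuple.
groupedMonthDay = GROUP shapedWeather BY (month, day) PARALLEL 10;

-- FLATTEN turns the (month, day) tuple back into two top-level fields.
aggedResults = FOREACH groupedMonthDay GENERATE
    FLATTEN(group) AS (month, day),
    AVG(shapedWeather.dryTemp)   AS avgTemp,
    MIN(shapedWeather.dryTemp)   AS minTemp,
    MAX(shapedWeather.dryTemp)   AS maxTemp,
    COUNT(shapedWeather.dryTemp) AS readings;

-- Sorting uses ORDER ... BY.
sortedResults = ORDER aggedResults BY avgTemp DESC;

STORE sortedResults INTO 'hdfs:/data/big/data/weather/pigresults' USING PigStorage(':');
```

Running `DESCRIBE groupedMonthDay;` before the aggregation is a quick way to confirm that the group key really is a two-field tuple.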

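Another point: you can avoid the multiple LOAD statements and the UNION by using a glob pattern in the path, as in the command below. It loads every file in the directory whose name starts with a digit and ends with hourly.txt.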

allMonths = LOAD 'hdfs:/data/big/data/weather/[0-9]*hourly.txt' USING PigStorage(','); 

If the directory contains other files and you want to load only the six files above, you can load them like this:

allMonths = LOAD 'hdfs:/data/big/data/weather/20120[1-6]*hourly.txt' USING PigStorage(','); 
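A note on the patterns: these are Hadoop file globs, not regular expressions, so `[1-6]` matches exactly one character and `*` matches any run of characters, including none. The sketch below shows two tighter variants, assuming the file names follow the `YYYYMMhourly.txt` form seen in the question; the relation names `firstHalf` and `oddMonths` are made up for illustration.

```pig
-- Exactly the six files 201201hourly.txt .. 201206hourly.txt (no trailing wildcard).
firstHalf = LOAD 'hdfs:/data/big/data/weather/20120[1-6]hourly.txt' USING PigStorage(',');

-- Brace alternation selects a non-contiguous set of months.
oddMonths = LOAD 'hdfs:/data/big/data/weather/2012{01,03,05}hourly.txt' USING PigStorage(',');
```

Dropping the `*` matters only if the directory could contain names such as `201201_extra_hourly.txt`, which the looser pattern would also pick up.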

Thank you for your help, that solved the problem. – 2014-10-21 00:01:16