register s3n://uw-cse344-code/myudfs.jar
-- load the test file into Pig
--raw = LOAD 's3n://uw-cse344-test/cse344-test-file' USING TextLoader as (line:chararray);
-- later you will load to other files, example:
raw = LOAD 's3n://uw-cse344/btc-2010-chunk-000' USING TextLoader as (line:chararray);
-- parse each line into ntriples
ntriples = foreach raw generate FLATTEN(myudfs.RDFSplit3(line)) as (subject:chararray,predicate:chararray,object:chararray);
--filter 1
subjects1 = filter ntriples by subject matches '.*rdfabout\\.com.*' PARALLEL 50;
--filter 2
subjects2 = subjects1;
But I get the error:
2012-03-10 01:19:18,039 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: mismatched input ';' expecting LEFT_PAREN Details at logfile: /home/hadoop/pig_1331342327467.log
So it looks like Pig doesn't like that. How can I accomplish this?