Processing XML files using Apache Pig
My XML file looks like this:
<CATALOG>
<CD>
<TITLE>hadoop developer</TITLE>
<ARTIST>ajay</ARTIST>
<COUNTRY>india</COUNTRY>
<COMPANY>ITC</COMPANY>
<PRICE>10.90</PRICE>
<YEAR>2013</YEAR>
</CD>
</CATALOG>
I used a regular expression to pull out the fields, but I don't know why I am not getting the expected output. My code is as follows:
REGISTER /usr/lib/pig/piggybank.jar
A = load 'input.xml' using org.apache.pig.piggybank.storage.XMLLoader('CATALOG') as (x: chararray);
B = foreach A GENERATE FLATTEN(REGEX_EXTRACT_ALL(x,'<CATALOG>\n*<CD>\n<TITLE>(.*)</TITLE>\n*<ARTIST>(.*)</ARTIST>\n*<COUNTRY>(.*)</COUNTRY>\n*<COMPANY>(.*)</COMPANY>\n*<PRICE>(.*)</PRICE>\n*<YEAR>(.*)</YEAR>\n*</CD>\\n*</CATALOG>')) as (name:chararray, words:chararray);
And my output is as follows:
2013-08-20 12:40:24,043 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2013-08-20 12:40:24,044 [main] WARN org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2013-08-20 12:40:24,047 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2013-08-20 12:40:24,047 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
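What is wrong with it? Thanks.
@mr2ert Thanks for the edit, but please help me figure out what the problem is. –
I just ran the script and everything works fine. You will have to be more specific about what is going wrong. – mr2ert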
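
Since the script works for one person and not the other, one thing worth checking is whether the pattern really matches the record as it is stored in the file. The sketch below is only a way to test that idea, not a confirmed diagnosis. It uses Python's `re` as a stand-in for the Java regex engine behind `REGEX_EXTRACT_ALL`, and it assumes the whole record must match (hence `fullmatch`). The indented record, the `RELAXED` pattern, and the `extract` helper are hypothetical and do not come from the question. Note also that the pattern has six capture groups while the `as (...)` clause in the question names only two fields.

```python
import re

# Pattern from the question, as a Python raw string. It allows no spaces or
# tabs between tags and requires exactly one newline between <CD> and <TITLE>.
STRICT = re.compile(
    r"<CATALOG>\n*<CD>\n<TITLE>(.*)</TITLE>\n*<ARTIST>(.*)</ARTIST>\n*"
    r"<COUNTRY>(.*)</COUNTRY>\n*<COMPANY>(.*)</COMPANY>\n*"
    r"<PRICE>(.*)</PRICE>\n*<YEAR>(.*)</YEAR>\n*</CD>\n*</CATALOG>"
)

# Hypothetical relaxed variant: \s* tolerates indentation, blank lines and \r\n.
TAGS = ["TITLE", "ARTIST", "COUNTRY", "COMPANY", "PRICE", "YEAR"]
FIELDS = r"\s*".join(rf"<{tag}>(.*?)</{tag}>" for tag in TAGS)
RELAXED = re.compile(rf"<CATALOG>\s*<CD>\s*{FIELDS}\s*</CD>\s*</CATALOG>")

# The record exactly as posted: one tag per line, no indentation.
FLAT = "\n".join([
    "<CATALOG>",
    "<CD>",
    "<TITLE>hadoop developer</TITLE>",
    "<ARTIST>ajay</ARTIST>",
    "<COUNTRY>india</COUNTRY>",
    "<COMPANY>ITC</COMPANY>",
    "<PRICE>10.90</PRICE>",
    "<YEAR>2013</YEAR>",
    "</CD>",
    "</CATALOG>",
])

# Assumed variant of the same record with every tag after the first indented.
INDENTED = FLAT.replace("\n<", "\n    <")


def extract(pattern, record):
    """Return the captured fields as a tuple, or None if the record does not match."""
    match = pattern.fullmatch(record)
    return match.groups() if match else None


if __name__ == "__main__":
    for name, record in [("flat", FLAT), ("indented", INDENTED)]:
        print(name, "strict: ", extract(STRICT, record))
        print(name, "relaxed:", extract(RELAXED, record))
```

If the real `input.xml` is indented or has Windows line endings, the strict pattern would find nothing even though the job itself reports success, which might explain why the same script behaves differently on different copies of the file.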