2013-06-03 44 views

How to insert line breaks into content using a Pig script

Here is the data:

123,456,789,Q,W,E,R,20120513

123,77,88,8,JJ,OO,"OOO,\r\n" a "d,\r\ndf,123",20120514

123,77,88,8,JJ,OO,OOO,20120514

I want to replace these literal \r\n sequences with actual line breaks using a Pig script.

Pig Script: 

    REGISTER file:///usr/share/pig/contrib/piggybank/java/piggybank.jar; 

    DEFINE CSVLoader org.apache.pig.piggybank.storage.CSVLoader; 

    RAW = LOAD '/home/bannie/test/test.log' 
      USING CSVLoader() AS (
       a: chararray, 
       b: chararray, 
       c: chararray, 
       d: chararray, 
       e: chararray, 
       f: chararray, 
       g: chararray, 
       h: chararray 
      ); 

    C = FOREACH RAW GENERATE REPLACE(g, '\\\\r\\\\n', '\uxxxx') as max; 

    grunt> C = FOREACH RAW GENERATE REPLACE(g, '\\\\r\\\\n', '\u000f') as max; 
    grunt> C = FOREACH RAW GENERATE REPLACE(g, '\\\\r\\\\n', '\u000e') as max; 
    grunt> C = FOREACH RAW GENERATE REPLACE(g, '\\\\r\\\\n', '\u000d') as max; 
    2013-06-03 17:53:42,629 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: <line 55, column 32> mismatched input '(' expecting SEMI_COLON 
    Details at logfile: /home/bannie/pig_1370249955149.log 
    grunt> C = FOREACH RAW GENERATE REPLACE(g, '\\\\r\\\\n', '\u000a') as max; 
    2013-06-03 17:53:47,601 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: <line 55, column 32> mismatched input '(' expecting SEMI_COLON 
    Details at logfile: /home/bannie/pig_1370249955149.log 
    grunt> C = FOREACH RAW GENERATE REPLACE(g, '\\\\r\\\\n', '\u000b') as max; 
    grunt> C = FOREACH RAW GENERATE REPLACE(g, '\\\\r\\\\n', '\u000c') as max; 

Does anyone know how to insert it?

Already solved it by using REPLACE(f6, '\\\\n', '\n') AS f6 ... – bannie

Answer


You cannot insert anything into a file. Pig has the same limitations as Hadoop MapReduce. By default, the output is a directory containing part files.
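For reference, here is a minimal sketch that combines the asker's comment with the point about output directories. It reuses the escaping the asker reports as working (`'\\\\r\\\\n'` as the pattern, `'\n'` as the replacement) and has not been checked against a particular Pig version. The alias `CLEANED`, the field name `g_fixed`, and the output path are hypothetical names chosen for illustration.

    REGISTER file:///usr/share/pig/contrib/piggybank/java/piggybank.jar;

    DEFINE CSVLoader org.apache.pig.piggybank.storage.CSVLoader;

    -- Same load as in the question, with the schema written compactly.
    RAW = LOAD '/home/bannie/test/test.log'
      USING CSVLoader() AS (
        a: chararray, b: chararray, c: chararray, d: chararray,
        e: chararray, f: chararray, g: chararray, h: chararray
      );

    -- Pattern: the four-backslash form from the question, intended to match
    -- the literal text \r\n inside field g.
    -- Replacement: '\n', as reported in the asker's comment
    -- (the \u000a and \u000d forms produced ERROR 1200 in grunt).
    CLEANED = FOREACH RAW GENERATE
      a, b, c, d, e, f,
      REPLACE(g, '\\\\r\\\\n', '\n') AS g_fixed,
      h;

    -- Hypothetical output path: Pig writes a directory of part files here,
    -- it does not modify the input file in place.
    STORE CLEANED INTO '/home/bannie/test/out' USING PigStorage(',');

Note that once a field contains real line breaks, a single record will likely span several lines in the part files, so whatever reads the output needs to tolerate multi-line records.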