2014-11-14 90 views

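Left-padding a string in Pig

I want to left-pad a string-typed field with 0s. Is there a way to do this? I need values with a fixed length of 40.

Thanks in advance, clairvoyant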

Maybe with a regular expression? – clairvoyant 2014-11-14 15:15:04

Could you post some examples? – 2014-11-14 15:50:03

Answers

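The number of zeros has to be generated dynamically from the length of each string, so I don't think this is possible in native Pig.
It is very doable in a UDF, though.

input.txt: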
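
To make the "dynamic number of zeros" point concrete, here is a minimal plain-Java sketch of the padding logic on its own, with no Pig or commons-lang dependency. The class name LeftPadSketch and the method leftPadZeros are hypothetical names chosen for illustration; they are not part of Pig or of this answer's jar. It assumes that strings already at or over the target width should be returned unchanged rather than truncated.

```java
final class LeftPadSketch {
    // Hypothetical helper: left-pads `input` with '0' until it is `width` characters long.
    static String leftPadZeros(String input, int width) {
        if (input == null) {
            return null; // nothing to pad
        }
        // The number of zeros depends on how long this particular value is.
        int missing = width - input.length();
        if (missing <= 0) {
            return input; // already long enough; returned as-is, not truncated
        }
        StringBuilder sb = new StringBuilder(width);
        for (int i = 0; i < missing; i++) {
            sb.append('0');
        }
        return sb.append(input).toString();
    }

    public static void main(String[] args) {
        String[] samples = {"11111", "33", "apachepig"};
        for (String s : samples) {
            System.out.println(leftPadZeros(s, 40));
        }
    }
}
```

Each value gets a different count of zeros (width minus its own length), which is the part that needs to be computed per row.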

11111 
222222222 
33 
org.apache.hadoop.util.NativeCodeLoader 
apachepig 

PigScript:

REGISTER leftformat.jar; 

A = LOAD 'input.txt' USING PigStorage() AS(f1:chararray); 
B = FOREACH A GENERATE format.LEFTPAD(f1); 
DUMP B; 

Output:

(0000000000000000000000000000000000011111) 
(0000000000000000000000000000000222222222) 
(0000000000000000000000000000000000000033) 
(0org.apache.hadoop.util.NativeCodeLoader) 
(0000000000000000000000000000000apachepig) 

UDF code: the Java class file below is compiled and packaged as leftformat.jar.
LEFTPAD.java

package format; 
import java.io.IOException; 
import org.apache.commons.lang.StringUtils; 
import org.apache.pig.EvalFunc; 
import org.apache.pig.data.Tuple; 

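public class LEFTPAD extends EvalFunc<String> {
    @Override
    public String exec(Tuple arg) throws IOException {
        // Return null for a missing tuple so Pig treats the result as null.
        if (arg == null || arg.size() == 0) {
            return null;
        }
        try {
            String input = (String) arg.get(0);
            // Left-pad with "0" to a fixed width of 40; longer strings come back unchanged.
            return StringUtils.leftPad(input, 40, "0");
        } catch (Exception e) {
            throw new IOException("Caught exception while processing the input row", e);
        }
    }
}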

UPDATE:

1. Download the jar files (apache-commons-lang.jar, piggybank.jar, pig-0.11.0.jar and hadoop-common-2.6.0-cdh5.4.5). The first three are available from the links below:
http://www.java2s.com/Code/Jar/a/Downloadapachecommonslangjar.htm 
http://www.java2s.com/Code/Jar/p/Downloadpiggybankjar.htm 
http://www.java2s.com/Code/Jar/p/Downloadpig0110jar.htm 

2. Add the three downloaded jar files to your classpath
    >> export CLASSPATH=/tmp/pig-0.11.0.jar:/tmp/piggybank.jar:/tmp/apache-commons-lang.jar 

3. Create a directory named format
    >>mkdir format 

4. Compile LEFTPAD.java. Make sure all three jars are on the classpath, otherwise compilation will fail
    >>javac LEFTPAD.java 

5. Move the class file into the format folder
    >>mv LEFTPAD.class format 

6. Create a jar file named leftformat.jar
    >>jar -cf leftformat.jar format/ 

7. The jar file is now created; register it in your Pig script

Example from command line: 
$ mkdir format 
$ javac LEFTPAD.java 
$ mv LEFTPAD.class format/ 
$ jar -cf leftformat.jar format/ 
$ ls 
LEFTPAD.java format  input.txt leftformat.jar script.pig 

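Thanks for your answer. I ran into a problem creating the jar file. I created LEFTPAD.class (as above), then created extension.txt with the following content: Main-Class: LEFTPAD, and ran this command: jar cfm leftformat.jar extension.txt LEFTPAD.class. The jar was created successfully, but I get an error message when calling it from the Pig script. – clairvoyant 2014-11-17 10:05:37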

Error message: <line 113, column 24> Failed to generate logical plan. Nested exception: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not resolve format.LEFTPAD using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.] – clairvoyant 2014-11-17 10:09:54

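Try it like this: create a folder named "format" and compile your Java file LEFTPAD.java. LEFTPAD.class must be inside the "format" folder. Then create the jar file with "jar -cf leftformat.jar format/". After that, register the jar in your Pig script. Let me know if you run into any problems. – 2014-11-17 10:43:59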
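
A short sketch of why the folder matters: a Java class loader looks up a class by turning its fully qualified name into a path inside the jar. The class is declared in package format, so it has to sit at format/LEFTPAD.class; a jar built from a bare LEFTPAD.class has the entry at the root, which would explain the "Could not resolve format.LEFTPAD" error. The class JarEntrySketch and the method entryFor below are hypothetical names used only to illustrate that mapping.

```java
final class JarEntrySketch {
    // Hypothetical helper: the jar entry a class loader looks for,
    // given a fully qualified class name such as "format.LEFTPAD".
    static String entryFor(String className) {
        return className.replace('.', '/') + ".class";
    }

    public static void main(String[] args) {
        // Prints "format/LEFTPAD.class": the path that must exist inside leftformat.jar.
        System.out.println(entryFor("format.LEFTPAD"));
    }
}
```

Listing the archive with "jar -tf leftformat.jar" should show that same format/LEFTPAD.class entry if the jar was built as described.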