2017-04-20 27 views

Find when a specific character is the second-to-last character of a string in Pig

I have the following data:

address|some_mask_value 
123 Main | 10100011110 
124 Main | 10100011100 

I am using Apache Pig version 0.15.0.2.4.2.0-258.

I'm trying to create an indicator that shows whether the second-to-last character of 'some_mask_value' is 1. I have tried:

load_data = LOAD '/myfile.txt' USING PigStorage('|') AS (address:String, some_mask_value:String); 

grunt> case_test = FOREACH load_data GENERATE (CASE trial 
>> WHEN LAST_INDEX_OF(name, '1') 2 THEN yes 
>> ELSE no); 

2017-04-20 16:59:50,522 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: <line 5, column 30> mismatched input '2' expecting THEN 

Basically, if the second-to-last character is 1, I will filter that row out later.

Answer

a = load 'data.txt' using PigStorage('|') 
     as (address: chararray, some_mask_value:chararray); 

If the mask field has a fixed length, as in your sample data, then:

b = foreach a generate $0 .. , (
     CASE SUBSTRING(some_mask_value, 9, 10) 
      WHEN '1' THEN 'YES' 
      ELSE 'NO' 
     END 
    ) as indicator; 

dump b; 
(123 Main,10100011110,YES) 
(124 Main,10100011100,NO) 

If the mask does not have a fixed length:

b = foreach a generate $0 .. , (
     CASE SUBSTRING(some_mask_value, (int)SIZE(some_mask_value) - 2, (int)SIZE(some_mask_value) - 1) 
      WHEN '1' THEN 'YES' 
      ELSE 'NO' 
     END 
    ) as indicator; 
dump b; 
(123 Main,10100011110,YES) 
(124 Main,10100011100,NO) 

This assumes the mask field has no leading or trailing whitespace.
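
If the file does pad the fields with spaces around the '|' delimiter, as the sample in the question appears to, trimming first keeps the character positions stable. The following is only a sketch, not tested against the poster's data: the relation names `trimmed`, `kept` and `kept_alt` and the alias `mask` are made up for illustration, and masks shorter than two characters are not handled.

```pig
-- Load the data as before.
a = LOAD 'data.txt' USING PigStorage('|')
    AS (address:chararray, some_mask_value:chararray);

-- Strip leading/trailing whitespace so the offsets below count mask characters only.
trimmed = FOREACH a GENERATE TRIM(address) AS address,
                             TRIM(some_mask_value) AS mask;

-- Same second-to-last-character test as above, applied to the trimmed mask.
b = FOREACH trimmed GENERATE address, mask, (
        CASE SUBSTRING(mask, (int)SIZE(mask) - 2, (int)SIZE(mask) - 1)
            WHEN '1' THEN 'YES'
            ELSE 'NO'
        END
    ) AS indicator;

-- Drop the rows whose second-to-last character is 1.
kept = FILTER b BY indicator == 'NO';

-- Alternative without the indicator column: 'matches' must match the whole
-- string, so '.*1.' means "any prefix, then 1, then exactly one more character".
kept_alt = FILTER trimmed BY NOT (mask matches '.*1.');
```

Filtering on `indicator` keeps the YES/NO column available for inspection, while the `matches` variant skips it when only the filtered rows are needed.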


Works fine! Thank you! – knobby
