2014-09-25

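Pig: Could not infer the matching function for org.apache.pig.piggybank.evaluation.datetime.convert.ISOToUnix as multiple or none of them fit

I am trying to convert a Pig datetime value to epoch time so that I can do further calculations with it. Here is (part of) my script: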

DEFINE ISOToUnix org.apache.pig.piggybank.evaluation.datetime.convert.ISOToUnix(); 
A = LOAD 's3://hearstlogfiles/google/NetworkBackfillImpressions_271283/2014/09/24/NetworkBackfillImpressions_271283_20140924_00.gz' USING PigStorage(','); 
B = LIMIT A 10; 
C = FOREACH B GENERATE 
(chararray)(CONCAT(CONCAT(SUBSTRING($0, 0,10),' '),SUBSTRING($0, 11,19))) as dt_string:chararray, 
DATE_TIME(CONCAT(CONCAT(SUBSTRING($0, 0,10),' '),SUBSTRING($0, 11,19))) AS dt; 
D = FOREACH C GENERATE 
dt_string, 
dt, 
ISOToUnix(dt)/1000 as epoch:long; 
DUMP D; 

When Pig tries to execute the line below, I get the error shown beneath it. I believe dt is in the correct format.

ISOToUnix(dt)/1000 as epoch:long 
Could not infer the matching function for org.apache.pig.piggybank.evaluation.datetime.convert.ISOToUnix as multiple or none of them fit. Please use an explicit cast. 

When I DUMP C, I get the following, so I know that dt in C is formatted correctly.

(2014-09-24 02:53:54,2014-09-24T02:53:54.000Z) 
(2014-09-24 02:57:54,2014-09-24T02:57:54.000Z) 
(2014-09-24 03:05:06,2014-09-24T03:05:06.000Z) 
(2014-09-24 03:27:30,2014-09-24T03:27:30.000Z) 
(2014-09-24 03:37:00,2014-09-24T03:37:00.000Z) 
(2014-09-24 03:39:18,2014-09-24T03:39:18.000Z) 
(2014-09-24 03:41:24,2014-09-24T03:41:24.000Z) 
(2014-09-24 03:43:18,2014-09-24T03:43:18.000Z) 
(2014-09-24 03:58:12,2014-09-24T03:58:12.000Z) 

Please help.

I am getting a similar error. Did you find a solution? – 2017-09-12 13:06:06

Answer

https://pig.apache.org/docs/r0.7.0/api/org/apache/pig/piggybank/evaluation/datetime/convert/ISOToUnix.html

Here is the example from that page:

REGISTER /Users/me/commiter/piggybank/java/piggybank.jar ; 
REGISTER /Users/me/commiter/piggybank/java/lib/joda-time-1.6.jar ; 
DEFINE ISOToUnix org.apache.pig.piggybank.evaluation.datetime.convert.ISOToUnix(); 
ISOin = LOAD 'test.tsv' USING PigStorage('\t') AS (dt:chararray, dt2:chararray); 

DESCRIBE ISOin; 
ISOin: {dt: chararray,dt2: chararray} 

DUMP ISOin; 
(2009-01-07T01:07:01.000Z,2008-02-01T00:00:00.000Z) 
(2008-02-06T02:06:02.000Z,2008-02-01T00:00:00.000Z) 
(2007-03-05T03:05:03.000Z,2008-02-01T00:00:00.000Z) 
... 

toUnix = FOREACH ISOin GENERATE ISOToUnix(dt) AS unixTime:long; 

DESCRIBE toUnix; 
toUnix: {unixTime: long} 
DUMP toUnix; 
(1231290421000L) 
(1202263562000L) 
(1173063903000L) 
... 

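As you may have noticed, dt (the parameter passed to the ISOToUnix UDF) is a chararray in that example, so you need to make your dt column a chararray as well, as shown below: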

C = FOREACH B 
     GENERATE 
      (chararray)(CONCAT(CONCAT(SUBSTRING($0, 0,10),' '), 
      SUBSTRING($0, 11,19))) as dt_string:chararray, 
      CONCAT(CONCAT(SUBSTRING($0, 0,10),' '),SUBSTRING($0, 11,19)) AS dt:chararray; 

D = FOREACH C 
     GENERATE 
      dt_string, 
      dt, 
      ISOToUnix((chararray)dt)/1000 as epoch:long; 

DUMP D; 
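
The error message suggests that ISOToUnix has no overload that accepts a datetime argument, which fits the documented example above, where the input column is a chararray. Another route, as a minimal sketch assuming Pig 0.11 or later (where the datetime type and the built-in ToDate and ToUnixTime functions are available), is to skip the Piggybank UDF and convert the datetime directly. The input path, the field name ts, the format pattern, and the 'UTC' time zone are assumptions for illustration, not taken from the question.

```pig
-- Hypothetical input; the first field is assumed to hold a timestamp whose
-- characters 0-9 are the date and 11-18 the time, as in the question.
A = LOAD 'input.csv' USING PigStorage(',') AS (ts:chararray);
B = LIMIT A 10;

-- Build a 'yyyy-MM-dd HH:mm:ss' string, then parse it into a datetime.
-- The 'UTC' zone is an assumption; without it Pig uses its default time zone.
C = FOREACH B GENERATE
      ToDate(CONCAT(CONCAT(SUBSTRING(ts, 0, 10), ' '), SUBSTRING(ts, 11, 19)),
             'yyyy-MM-dd HH:mm:ss', 'UTC') AS dt:datetime;

-- ToUnixTime takes a datetime and returns seconds since the epoch as a long.
D = FOREACH C GENERATE
      dt,
      ToUnixTime(dt) AS epoch:long;

DUMP D;
```

ToUnixTime returns seconds, so no division by 1000 is needed. If the Piggybank UDF has to stay, passing it ToString(dt) rather than (chararray)dt may also work, since the built-in ToString renders a datetime as an ISO 8601 string; treat that as an untested suggestion.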

Hope this helps.

Thanks for your help Guarav. I now get the error **Cannot cast datetime to chararray**. If I find another solution, I will let this community know. – 2014-09-28 12:48:21