2014-01-13

Pig schema and type exception

This is a very simple demo that reproduces the problem in Pig 0.11.

=== testSchemaDATA ===

1_a 
2_b 
3_c 

The first script:

a = load 'testSchemaDATA' as (str:chararray); 
a1 = foreach a generate flatten(STRSPLIT(str,'_',2)) as num; 
a2 = foreach a1 generate (int)num as num; 
dump a2; 

This script works and dumps the expected answer.

The second script, which fails (the only difference between the two scripts is the schema declared for a1):

a = load 'testSchemaDATA' as (str:chararray); 
a1 = foreach a generate flatten(STRSPLIT(str,'_',2)) as (num,char); 
a2 = foreach a1 generate (int)num as num; 
dump a2; 

It reports: ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1052: Cannot cast bytearray to int

I don't know how to explain this. Is it a bug?

Answer


This will work:

a = load 'testSchemaDATA' as (str:chararray); 
a1 = foreach a generate flatten(STRSPLIT(str,'_',2)) as (num:int,char:chararray); 
a2 = foreach a1 generate num as num; 
dump a2; 

It gives the output:

(1) 
(2) 
(3) 

And:

a = load 'testSchemaDATA' as (str:chararray); 
a1 = foreach a generate flatten(STRSPLIT(str,'_',2)) as (num:int,char:chararray); 
a2 = foreach a1 generate char as char; 
dump a2; 

It gives the output:

(a) 
(b) 
(c) 

The difference is that in this case the results of STRSPLIT are explicitly typed as int and chararray. If no type is given, the field defaults to bytearray.
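One way to check this is to run describe on a1 after declaring the types. This is a minimal sketch that assumes the same testSchemaDATA file and relation names as in the question. describe only prints the schema Pig has worked out for a relation, so it should list num and char with their declared types rather than bytearray or NULL:

```pig
a = load 'testSchemaDATA' as (str:chararray);
-- Types are declared for both flattened fields, as in the working script above.
a1 = foreach a generate flatten(STRSPLIT(str,'_',2)) as (num:int,char:chararray);
-- describe prints the schema of a1; it does not dump any data.
describe a1;
```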

If you run a1 = foreach a generate flatten(STRSPLIT(str,'_',2)) as num; then describe a1 gives:

a1: {num: bytearray} 

If you run a1 = foreach a generate flatten(STRSPLIT(str,'_',2)) as (num,char); then describe a1 gives:

a1: {num: NULL,char: NULL} 

It looks like the type comes out as NULL in this case. I'm not sure why that happens. It would be nice if someone could explain.