我有加入豬的問題。我將從給你的背景開始。這裏是我的代碼:豬 - 加入不起作用
-- START file loading
start_file = LOAD 'dir/start_file.csv' USING PigStorage(';') as (PARTRANGE:chararray, COD_IPUSER:chararray);
-- trim
A = FOREACH start_file GENERATE TRIM(PARTRANGE) AS PARTRANGE, TRIM(COD_IPUSER) AS COD_IPUSER;
dump A;
這給輸出:
(79.92.147.88,20140310)
(79.92.147.88,20140310)
(109.31.67.3,20140310)
(109.31.67.3,20140310)
(109.7.229.143,20140310)
(109.8.114.133,20140310)
(77.198.79.99,20140310)
(77.200.174.171,20140310)
(77.200.174.171,20140310)
(109.17.117.212,20140310)
加載其他的文件:
-- Chargement du fichier recherche Hadopi
file2 = LOAD 'dir/file2.csv' USING PigStorage(';') as (IP_RECHERCHEE:chararray, DATE_HADO:chararray);
dump file2;
輸出是這樣的:
(2014/03/10 00:00:00,79.92.147.88)
(2014/03/10 00:00:01,79.92.147.88)
(2014/03/10 00:00:00,192.168.2.67)
現在,我想要做一個左外連接。下面的代碼:
result = JOIN file2 by IP_RECHERCHEE LEFT OUTER, A by COD_IPUSER;
dump result;
輸出是這樣的:
(2014/03/10 00:00:00,79.92.147.88,,)
(2014/03/10 00:00:00,192.168.2.67,,)
(2014/03/10 00:00:01,79.92.147.88,,)
所有的「文件2」的記錄都在這裏,這是很好的,但任何start_file都在這裏。這就好像加入失敗了一樣。
你知道問題在哪裏嗎?
謝謝。