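I am new to Apache Pig. I created two files with tab-separated fields, employees.txt and employees2.txt. [There are no blank lines between rows in the actual files; the spacing here is only to satisfy this editor.] The Apache Pig JOIN is not behaving as I expected.
employees.txt contains: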
joe 21 94085 50000.0
Tom 21 94085 50000.0
John 21 94085 50000.0
employees2.txt contains:
joe 4085559898
joe 4085559899
tom 4085559897
tom 4085559896
john 4085559896
I then tried a simple join:
e1 = LOAD 'employees.txt' AS (name, age, zip, salary);
e2 = LOAD 'employees2.txt' AS (name, phone);
e3 = JOIN e1 BY name, e2 BY name;
DUMP e3;
Result:
(joe,21,94085,50000.0,joe,4085559899)
(joe,21,94085,50000.0,joe,4085559898)
I expected:
(joe,21,94085,50000.0,joe,4085559899)
(joe,21,94085,50000.0,joe,4085559898)
(Tom,21,94085,50000.0,Tom,4085559897)
(Tom,21,94085,50000.0,Tom,4085559896)
(joe,21,94085,50000.0,Tom,4085559896)
What am I doing wrong?
Thanks,
Chris