
Answer


You could try the FILTER statement below.

Input:

file1.txt 
file2.PDF 
file3.doc 
file4.ppt 
file5.pdf 

Pig script:

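-- Load one filename per line into a single chararray field.
A = LOAD 'input' USING PigStorage() AS (filename:chararray);
-- Keep only names ending in .pdf or .PDF; matches must cover the whole string.
B = FILTER A BY filename matches '.*\\.(pdf|PDF)$';
DUMP B;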

Output:

(file2.PDF) 
(file5.pdf)
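
If mixed-case extensions such as .Pdf should also be kept, one possible variation is sketched below. It assumes the pattern is handed to Java's regular expression engine, so the inline `(?i)` case-insensitive flag is available; the relation names and the 'input' path are simply reused from the script above.

```pig
-- Sketch only: same 'input' file and schema as the script above.
A = LOAD 'input' USING PigStorage() AS (filename:chararray);
-- (?i) makes the match case-insensitive, so .pdf, .PDF and .Pdf all pass.
-- matches already has to cover the whole string, so no trailing $ is needed.
B = FILTER A BY filename matches '(?i).*\\.pdf';
DUMP B;
```

For the sample input this should keep the same two rows. If you would rather not rely on an inline flag, filtering on `LOWER(filename) matches '.*\\.pdf'` is another option.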