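SAS - Reading the first and last observation of multiple csv files

I want to read the first and last record of a large number of .csv files (several gigabytes) stored in a folder on a Linux machine. Assume they are called have1.csv, have2.csv, and so on.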
So I tried the code below, but it only gives me the first row, not the last one.
%let datapath = ~/somefolder/;
data want;
    length finame $300.;
    /*Reference all CSV files in input data folder*/
    infile "&datapath.have*.csv" delimiter=","
        MISSOVER DSD lrecl=32767 firstobs=2
        eov=eov eof=eof filename=finame end=done;
    /*Define input format of variables*/
    informat Var1 COMMA. Var2 COMMA. Var3 COMMA.;
    /*Loop over files*/
    do while(not done);
        /*Set trailing @ to hold the input open for the next input statement
        this is because we have several files */
        input @;
        /*If first line in file is encountered eov is set to 1,
        however, we have firstobs=2, hence all lines would be skipped.
        So we need to reset EOV to 0.*/
        if eov then
        do;
            /*Additional empty input statement
            handles missing value at first loop*/
            input;
            eov = 2;
        end;
        /*First observation*/
        if eov=2 then do;
            input Var1--Var3;
            fname=finame;
            output;
            eov = 0;
        end;
        /*Last observation*/
        if 0 then do;
            eof: input Var1--Var3;
            fname=finame;
            output;
        end;
        input;
    end;
    stop;
run;
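I would really appreciate your help! If I have misunderstood the concepts or the interplay of infile, end, eov, eof and input @, please let me know! I don't know where my mistake is...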
Are you also trying to skip the header row? Is that what the comment about the FIRSTOBS= option is about? – Tom
Yes, sorry for not replying sooner. –
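One possible way to get both rows, offered as an untested sketch rather than a definitive fix: as far as I understand the INFILE options, the eof= label is only reached when an INPUT statement tries to read past the end of the last file, and by then there is no record left to read. In the posted code the do while(not done) loop also ends before that can happen. In addition, with a wildcard fileref firstobs=2 appears to apply to the concatenated stream as a whole, so only the header of the first file is skipped. The sketch below therefore drops eov=, eof= and firstobs= and instead watches the filename= variable: whenever it changes, the current line is a header, and the retained values from the previous iteration are the last row of the previous file. The counter rownum is my own helper name, and the code assumes every file starts with exactly one header line and has the three columns Var1 to Var3 from the question.

```sas
%let datapath = ~/somefolder/;

data want (drop=rownum);
    length finame fname $300;
    infile "&datapath.have*.csv" delimiter="," MISSOVER DSD lrecl=32767
        filename=finame end=done;
    informat Var1 COMMA. Var2 COMMA. Var3 COMMA.;
    /* Keep the values of the previous record across iterations */
    retain fname Var1 Var2 Var3;

    /* Load the next line and hold it without parsing any variables */
    input @;

    if finame ne fname then do;
        /* File name changed: the held line is the header of a new file.
           Var1-Var3 still hold the last data row of the previous file. */
        if rownum > 1 then output;
        fname = finame;
        rownum = 0;
    end;
    else do;
        /* Ordinary data line: parse the held line */
        input Var1-Var3;
        rownum + 1;                 /* sum statement, retained, starts at 0 */
        if rownum = 1 then output;  /* first data row of the current file */
    end;

    /* Last data row of the last file */
    if done and rownum > 1 then output;
run;
```

A file with a single data row contributes one observation instead of two, because of the rownum > 1 checks. Every line is still read, so the run time grows with the total size of the files, but only two rows per file are written to want.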