Appending and editing multiple CSV files in SAS

I have about 27,000 .csv files in a SAS directory. The files have also been copied to a network drive, so I can use either location.

I need to combine all of the csv files into one dataset and delete any blank rows (the blank ones contain a comma in column A).

Each CSV file has a unique name, but the structure and format are identical.

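In the final dataset, I would like column A to contain the name of the file the data was copied from, with the imported range A1 to S1 (and the rows below it) going into columns B to T.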

I have tried the code below, but it fails after around 1,000 files:

x 'cd C:\temp'; 

filename csv ('*.csv'); 

proc import out=work.LSPImportFiles 
datafile = csv DBMS=CSV REPLACE; 
GETNAMES=yes; 
run;

I also tried the code below, but it seems to miss columns 8 and 9, and it fails after a few hundred files:

data want; 
    length _filename_ $32; 
    infile "C:\temp\*.csv" dlm = ',' filename = _filename_; 
    input @; 
    if _filename_ ne lag1(_filename_) then delete; 
input 
Column_1 :$15. 
Column_2 :$16. 
Column_3 /*this is a number to 2 decimal places*/ 
Column_4 /*this is a number to 2 decimal places*/ 
Column_5 /*this is a number to 2 decimal places*/ 
Column_6 /*this is a number to 2 decimal places*/ 
Column_7 /*this is a number to 2 decimal places*/ 
Column_8 /*this is a percentage*/ 
Column_9 /*this is a percentage*/ 
Column_10 /*this is a number to 2 decimal places*/ 
Column_11 /*this is a number to 2 decimal places*/ 
Column_12 /*this is a number to 2 decimal places*/ 
Column_13 /*this is a number to 2 decimal places*/ 
Column_14 /*this is a blank column*/ 
Column_15 /*this is a number to 2 decimal places*/ 
Column_16 /*this is a number to 2 decimal places*/ 
Column_17 /*this is a number to 2 decimal places*/ ;
run;

Source

2016-02-11 user3319459

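If columns 8 and 9 have percent signs, then you should use the PERCENT informat (or COMMA). I would also include the INFILE statement options DSD and MISSOVER. It would be good to see the log from the data step, at least the notes about data errors and about SAS going to a new line. I don't need to see 27,000 "INFILE IS" notes. I suspect you have flowover and other "problems".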

What does "fails" mean? – Joe

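You will need to define two variables for the name, because the one referenced in the INFILE statement is dropped automatically. If you have a true CSV file, then you will want to use the DSD option so that empty fields are handled properly, especially since you seem to indicate that at least one column is empty. It is better to define your variables explicitly than to let SAS guess their types from the format or informat used. Also, the TRUNCOVER option keeps SAS from moving to the next line when a line has fewer fields than expected.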

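data want; 
    /* _filename_ is filled in by the FILENAME= option and is dropped automatically; */
    /* filename is the copy that is kept. $32 must fit the full path, so increase it if needed. */
    length _filename_ filename $32 Column_1 $15 Column_2 $16 Column_3-Column_17 8; 
    infile "C:\temp\*.csv" dsd dlm = ',' filename = _filename_ truncover; 
    input @;   /* read the next line and hold it so that _filename_ is populated */
    if _filename_ ne lag1(_filename_) then delete;   /* first line of each file is the header row */
    input Column_1 - Column_17 ; 
    filename=_filename_; 
run;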
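
To see what DSD and TRUNCOVER change, here is a small self-contained sketch that reads in-stream data instead of real files. The dataset name dsd_demo and the variables a to d are made up for illustration and are not part of the question.

```sas
/* Illustration only: dsd_demo and a, b, c, d are hypothetical names */
data dsd_demo;
    infile datalines dsd dlm=',' truncover;
    input a b c d;
    datalines;
1,,3,4
5,6
;
run;

proc print data=dsd_demo;
run;
```

With DSD, the two consecutive commas in the first line are read as an empty field, so b should be missing while c and d keep their positions. TRUNCOVER leaves c and d missing for the short second line instead of reading them from the following line. Removing DSD from the INFILE statement and rerunning should show the values shifting left.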

Depending on how the percentages are encoded in the CSV files, you may need to add this statement so that SAS will accept values like 10%.

informat column_8 column_9 percent. ;
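
As a quick check of what that informat does, the following sketch reads a few percentage values from in-stream data. The names pct_demo, id, and rate are hypothetical and only serve the example.

```sas
/* Illustration only: pct_demo, id and rate are hypothetical names */
data pct_demo;
    infile datalines dsd truncover;
    informat rate percent.;   /* list input picks up this informat for rate */
    input id rate;
    datalines;
1,10%
2,2.5%
3,
;
run;

proc print data=pct_demo;
run;
```

The PERCENT informat divides a value by 100 when it is followed by a percent sign, so 10% should be stored as 0.1. The empty field in the last line should simply come through as a missing value.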

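You can add any other processing you need to the data step. For example, to delete rows where the first column is empty (I assume that is what you mean by column A containing a comma), you can add this line before the run statement.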

if missing(column_1) then delete;
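
Putting the pieces above together, the whole step might look like the sketch below. It assumes the percentages really are written with a percent sign and that the full path can be longer than 32 characters, which is why the name variables are widened to $256 here; adjust both to your data.

```sas
data want;
    /* Assumption: $256 is wide enough for the full path of every file */
    length _filename_ filename $256 Column_1 $15 Column_2 $16 Column_3-Column_17 8;
    /* Assumption: columns 8 and 9 hold values such as 10% */
    informat Column_8 Column_9 percent.;
    infile "C:\temp\*.csv" dsd dlm=',' filename=_filename_ truncover;
    input @;                                        /* hold the line so _filename_ is set */
    if _filename_ ne lag1(_filename_) then delete;  /* skip the header line of each file */
    input Column_1 - Column_17;
    if missing(Column_1) then delete;               /* drop rows with an empty first column */
    filename = _filename_;                          /* keep a copy of the file name */
run;
```

Because filename is declared first in the LENGTH statement, it ends up as the first kept variable in the output dataset, followed by Column_1 to Column_17.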

Source

2016-02-11 15:00:51 Tom

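Thank you, this works a treat, but the table is very large because rows with no data in Column_2 are not being deleted. Any idea how to do this in the infile statement rather than in an extra data step? – user3319459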

Also, is there any way to get the date the file was modified? – user3319459

You can add whatever other processing you need. For example, you could add the line 'if missing(column_2) then delete;' before the 'run;' statement. – Tom
