Import multiple csv files and extract one column of data to form a matrix

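I am very new to MATLAB, so I am fairly sure this is a very simple question. I have several output datasets, each with a prefix (e.g. PT_1 to PT_20). I would like to use a for loop to import the data from the second column of each csv file into a new matrix and line it up against time, which is the same in all files.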

The input files look like

PT_1 .....

time param 1 param 2 param 3 
2/01/2001 23:00 11.449428 3 314.322471 
3/01/2001 23:00 11.448935 3 311.683002

PT_2 .....

time param 1 param 2 param 3 
2/01/2001 23:00 11.445892 0 296.523937 
3/01/2001 23:00 11.445393 0 294.0944

and I would like my output to look like

time PT_1 PT_2 
2/01/2001 23:00 11.449428 11.445892 
3/01/2001 23:00 11.448935 11.445393

So far, the code I have is

files = 0:1:21; 
for i=1:21; 
filename = sprintf('WQ_%d.csv', files(i)); 
origdata = importdata (filename); 
end

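I can see that it picks up the file names correctly, but it does not really do what I want, because it overwrites the data on every pass through the loop. Obviously my coding is wrong. Can anyone help me work out how to write suitable code? Many thanks!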

Source

2012-12-14 cwick

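I have just updated my answer because I realised your csv files appear to have a header row. If my solution does not solve the problem, please let me know, as I had to make a few assumptions along the way. I expect the final answer will be very close to what I have now. –

Try this: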

%# Set the directory path and the number of csv files
DirectoryPath = 'FullDirectoryPathHereWithTrailingSlash';
NumFile = 2;

%# Open the first file and get the first column (the date column)
File1Path = [DirectoryPath, 'PT_1.csv'];
fid1 = fopen(File1Path, 'r');
Date = textscan(fid1, '%s %*[^\n]', 'Delimiter', ',', 'HeaderLines', 1);
fclose(fid1);

%# Convert dates to MATLAB date numbers and get the number of rows
Date = datenum(Date{1, 1}, 'dd/mm/yyyy');
T = size(Date, 1);

%# Preallocate a matrix to hold all the data, and add the date column
D = [Date, NaN(T, NumFile)];

%# Loop over the csv files, get the second column and add it to the data matrix
for n = 1:NumFile

    %# Get the current file name
    CurFilePath = [DirectoryPath, 'PT_', num2str(n), '.csv'];

    %# Open the current file for reading and scan in the second column using numerical format
    fid1 = fopen(CurFilePath, 'r');
    CurData = textscan(fid1, '%*s %f %*[^\n]', 'Delimiter', ',', 'HeaderLines', 1);
    fclose(fid1);

    %# Add the current data to column n+1 of the data matrix
    D(:, n+1) = CurData{1, 1};

end

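Hopefully the code is self-explanatory given the comments I have included. One slightly tricky bit is the format string I use for the textscan function. Here is a quick explanation:

1) '%s %*[^\n]' says: get the first column, which is in string format (i.e. %s), and skip all the remaining columns (i.e. %*[^\n]).

2) '%*s %f %*[^\n]' says: skip the first column, which is in string format (i.e. %*s), get the second column, which is a floating point number (i.e. %f), and then skip all the remaining columns (i.e. %*[^\n]).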
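To see the two format strings in isolation, here is a minimal sketch that runs textscan on an in-memory string instead of a file. The variable names SampleText, DateCol and ValueCol are made up for illustration, and the header row is left out so that 'HeaderLines' is not needed.

```matlab
%# Two sample rows in the same layout as the csv files (header row omitted)
SampleText = sprintf('2/01/2001 23:00, 11, 3, 314.322471\n3/01/2001 23:00, 12, 3, 311.683002\n');

%# Format 1: keep the first column as a string, skip the rest of each line
DateCol = textscan(SampleText, '%s %*[^\n]', 'Delimiter', ',');

%# Format 2: skip the first column, read the second as a number, skip the rest
ValueCol = textscan(SampleText, '%*s %f %*[^\n]', 'Delimiter', ',');

%# Each call returns a 1x1 cell array holding one column
disp(DateCol{1});
disp(ValueCol{1});
```

Both calls return a cell array, which is why the main code indexes the result with {1, 1} before using it.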

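Update: I have just updated the code to include a variable at the top that lets you specify the directory where the csv files are located (in case it is not the current directory). Simply replace the text FullDirectoryPathHereWithTrailingSlash with the appropriate path, e.g. /home/username/Documents/ on Linux or C:\Windows\Blah\ on Windows.

I have only tested the code on two test CSV files, named PT_1.csv and PT_2.csv, which look exactly like this: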
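If remembering the trailing slash is a nuisance, one alternative (not what the code above does) is to build the paths with fullfile, which adds the platform's file separator for you. This is only a sketch: the folder name 'data' and the variable FilePaths are hypothetical.

```matlab
%# Hypothetical folder name, used only for illustration (no trailing slash needed)
DirectoryPath = 'data';
NumFile = 2;

%# Build one path per csv file; fullfile inserts the file separator where needed
FilePaths = cell(NumFile, 1);
for n = 1:NumFile
    FilePaths{n} = fullfile(DirectoryPath, sprintf('PT_%d.csv', n));
end

disp(FilePaths);
```

Each entry of FilePaths could then be passed to fopen in place of CurFilePath.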

time, param 1, param 2, param 3 
2/01/2001 23:00, 11, 3, 314.322471 
3/01/2001 23:00, 12, 3, 311.683002

and

time, param 1, param 2, param 3 
2/01/2001 23:00, 13, 0, 296.523937 
3/01/2001 23:00, 14, 0, 294.0944

The result?

>> D 

D = 

     730853   11   13 
     730854   12   14
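The first column of D holds MATLAB serial date numbers rather than readable dates. If you want text dates back for display, a sketch along these lines should work; DateText is a made-up name, and since the values in D are whole numbers there is no time of day to recover.

```matlab
%# Matrix copied from the result shown above
D = [730853 11 13; 730854 12 14];

%# Convert the serial date numbers in column 1 back to text
DateText = datestr(D(:, 1), 'dd/mm/yyyy');

%# DateText is a character matrix with one row per date
disp(DateText);
```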

Source

2012-12-14 02:33:27

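Hi Colin! Thanks for the prompt reply. I followed your code and received a message saying 'Assignment has more non-singleton rhs dimensions than non-singleton subscripts'. What does this mean? Thanks for your help! – cwick

@cwick Can you tell me which line throws the error? –

It is the last line of the loop, 'D(:, n+1) = CurData{1, 1}' – cwick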
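The message alone does not say what is different about the files, but one way to narrow it down might be to compare the size of the scanned column with the T x 1 slot it is being assigned into. The sketch below uses hypothetical stand-in values for T and CurData so that it runs on its own; inside the real loop you would drop those two lines and use the actual variables.

```matlab
%# Hypothetical stand-ins for the variables used in the loop
T = 2;                   %# number of rows taken from the date column
CurData = {[11; 12]};    %# what textscan might return for one file

%# Compare the scanned column with the T x 1 slot in D
Col = CurData{1, 1};
fprintf('Expected %d x 1, got %d x %d\n', T, size(Col, 1), size(Col, 2));

%# Stop with a clearer message if the sizes differ
if ~isequal(size(Col), [T, 1])
    error('Scanned column is %d x %d, expected %d x 1', size(Col, 1), size(Col, 2), T);
end
```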
