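How to import numeric vectors from a complex CSV file into MATLAB

I would like to know how to read a complex CSV file that contains strings, doubles, characters and so on.

For example, could you provide a command that successfully extracts the numeric values from this CSV file?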

Click here.

For example:

yield curve data 2013-10-04  
Yields in percentages per annum.   


Parameters - AAA-rated bonds   
Series key Parameters Description 
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA0 2.03555 Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Beta 0 - Euro, provided by ECB 
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA1 -2.009068 Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Beta 1 - Euro, provided by ECB 
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA2 24.54184 Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Beta 2 - Euro, provided by ECB 
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA3 -21.80556 Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Beta 3 - Euro, provided by ECB 
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.TAU1 5.351378 Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Tau 1 - Euro, provided by ECB 
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.TAU2 4.321162 Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Tau 2 - Euro, provided by ECB

This is part of the information in the file. I tried csvread('yc_latest.csv', 6, 1, [6,1,6,1]) to get the value 2.03555, but it gave me the following error:

Error using dlmread (line 139) 
    Mismatch between file and format string. 
    Trouble reading number from file (row 1u, field 3u) ==> "Euro area (changing composition) - 
    Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous 
    compounding - yield error minimisation - Yield curve parameters, Beta 0 

    Error in csvread (line 50) 
     m=dlmread(filename, ',', r, c, rng);

2013-10-07 Cancan

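Your thanks are premature. Show us your code, your best attempt, and we may be able to help. Linking to a zip file on a dodgy site does not encourage many SO users to follow it.

Can you give us an example of how you want one line to be parsed? (Which data do you actually need?)

Sorry, I have just edited the question. – Cancan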

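I strongly recommend using MATLAB's "Import Data" feature (it is on the "HOME" toolbar).

Note in particular, as in the screenshot, that it can also generate code for you, so that you can run the import automatically in the future.

[screenshot of the Import Data tool]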
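If you prefer to script the import by hand, the following is a rough sketch of reading mixed data into a cell array of strings and converting only the numeric column afterwards. It is not the code that the Import Data tool generates. The two rows are taken from the question with the descriptions shortened, and the variable names (`txt`, `raw`, `cells`, `values`) are made up for illustration.

```matlab
% Two data rows from the question; descriptions are shortened here
% (an assumption for brevity, the real file has much longer text).
txt = sprintf('%s\n%s\n', ...
    'YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA2 24.54184 Euro area - Beta 2', ...
    'YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA3 -21.80556 Euro area - Beta 3');

% Read the first two columns as text and keep the rest of each line
% as a single string.
raw = textscan(txt, '%s %s %[^\n]');

% Combine the three columns into one N-by-3 cell array of strings.
cells = [raw{1}, raw{2}, raw{3}];

% Convert the second column to doubles; entries that are not numbers
% would become NaN instead of raising an error.
values = str2double(cells(:, 2));
```

Keeping everything as text first means a stray non-numeric entry shows up as NaN in `values` rather than aborting the whole import.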

2013-10-08 08:25:44 bdecaf

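For mixed data (numbers and text), I would usually recommend the cell array option.

True, I took the screenshot with the settings MATLAB detected automatically. Presumably quite a lot still needs adjusting. – bdecaf

Here is a very hacky solution. Unfortunately, MATLAB is rather poor at reading CSV files, which makes this kind of hackery an unfortunate necessity. On the bright side, you will probably only need to write code like this once.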

fid = fopen('yc_latest.csv');   % open the file
if fid == -1
    error('Could not open yc_latest.csv');
end

% skip the first six lines, then read each row as:
% text key, number, rest of the line as text
contents = textscan(fid, '%s %f %[^\n]', 'HeaderLines', 6);

% unpack the fields and give them meaningful names
[seriesKey, parameters, description] = contents{:};

fclose(fid);                    % don't forget this!
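As a small usage sketch, once the three columns are unpacked you can look up a parameter by its series key rather than by row position. To keep the example self-contained it parses two rows from the question as text instead of opening the file, with the descriptions shortened; `txt`, `wantedKey` and `beta0` are illustrative names, not part of the original answer.

```matlab
% Two data rows from the question; descriptions are shortened here
% (an assumption for brevity).
txt = sprintf('%s\n%s\n', ...
    'YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA0 2.03555 Euro area - Beta 0', ...
    'YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA1 -2.009068 Euro area - Beta 1');

% Same format string as above, applied to text instead of a file identifier.
contents = textscan(txt, '%s %f %[^\n]');
[seriesKey, parameters, description] = contents{:};

% Select the value whose key matches, independent of the row order.
wantedKey = 'YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA0';
beta0 = parameters(strcmp(seriesKey, wantedKey));
```

Matching on the key with `strcmp` should keep working if the rows in the file are reordered.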

2013-10-07 15:03:55

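Note that you can use `textscan(..., 'HeaderLines', 6)` instead of a loop. **P.S.**: I think MATLAB does a great job of parsing CSV files!

@EitanT Compare it to R, where the code would be `x <- read.csv("yc_latest.csv", skip = 5, header = TRUE, stringsAsFactors = FALSE)`. That is also robust to column names changing order, or to columns being added or removed (which happens a lot where I work!), whereas the MATLAB solution involves extracting the header separately and matching against it. It frustrates me that there is no fully featured CSV reading function built into MATLAB. Good point about 'HeaderLines', though; I will edit to include it.

Well, there is [`csvread`](http://www.mathworks.com/help/matlab/ref/csvread.html), but since this file is not really comma-separated, you cannot blame MATLAB here. It is like saying that C is bad at reading files, which is absolute nonsense. Maybe this functionality is not built into the language, but you can easily write an equivalent yourself. By the way, I think you can improve your code further by using '%f' in the format string for the parameters column, which saves you the trouble of calling `str2double` later.

An alternative to the solution from Chris: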

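fid = fopen('yc_latest.csv');
Rows = textscan(fid, '%s', 'Delimiter', '\n'); % temporary cell array, one cell per line
fclose(fid);
Rows = Rows{1};                                % unwrap the single output of textscan

% look for the lines with a Euro value:
value = strfind(Rows, 'Euro');
Idx = find(~cellfun('isempty', value));

% on those lines, skip the series key, read the number, skip the rest
Columns = cellfun(@(x) textscan(x, '%*s %f %*[^\n]'), Rows(Idx), 'UniformOutput', false);
values = cellfun(@(c) c{1}, Columns);

The indices of all lines that actually contain a Euro value are stored in Idx, and the corresponding numbers in values.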

2013-10-07 16:05:57

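You may want to use textscan this way.

Each line is parsed using the regular delimiters (tab, space). The format used is %*s, where the asterisk skips the first element (YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA0), then %f to get the value of interest, and finally %*[^\n] to skip the rest of the line.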

filename = 'yc_latest.csv';
fid = fopen(filename);
C = textscan(fid, '%*s%f%*[^\n]', 'HeaderLines', 6); 
fclose(fid); 

values = C{1};
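To see what the asterisk does, here is a minimal sketch that parses a single row from the question twice, once keeping every field and once skipping the key and the description. The row text is shortened and the variable names (`row`, `allFields`, `onlyValue`, `tau1`) are illustrative only.

```matlab
% One data row from the question; the description is shortened here
% (an assumption for brevity).
row = 'YC.B.U2.EUR.4F.G_N_A.SV_C_YM.TAU1 5.351378 Euro area - Tau 1';

% Keep every field: key as text, value as double, rest of line as text.
allFields = textscan(row, '%s%f%[^\n]');

% Skip the key and the description with the asterisk, so that only
% the numeric field produces an output cell.
onlyValue = textscan(row, '%*s%f%*[^\n]');

numKept = numel(onlyValue);   % number of output cells when fields are skipped
tau1 = onlyValue{1};          % the numeric value itself
```

Skipped fields produce no output cell at all, which is why `C{1}` in the answer above already holds the numeric column.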

2013-10-08 07:37:16 marsei
