2016-09-28 85 views
0

How do I read specific information from a .txt file in MATLAB?

I have a very long text file that looks like this:

I0927 11:33:18.534551 16932 solver.cpp:244]  Train net output #0: loss = 2.61145 (* 1 = 2.61145 loss) 
I0927 11:33:18.534620 16932 sgd_solver.cpp:106] Iteration 20, lr = 0.001 
I0927 11:33:33.221546 16932 solver.cpp:228] Iteration 40, loss = 0.573027 
I0927 11:33:33.221771 16932 solver.cpp:244]  Train net output #0: loss = 0.573027 (* 1 = 0.573027 loss) 
I0927 11:33:33.221851 16932 sgd_solver.cpp:106] Iteration 40, lr = 0.001 
I0927 11:33:47.883162 16932 solver.cpp:228] Iteration 60, loss = 0.852016 
I0927 11:33:47.884717 16932 solver.cpp:244]  Train net output #0: loss = 0.852016 (* 1 = 0.852016 loss) 
I0927 11:33:47.884812 16932 sgd_solver.cpp:106] Iteration 60, lr = 0.001 
I0927 11:34:02.543320 16932 solver.cpp:228] Iteration 80, loss = 0.385975 
I0927 11:34:02.543442 16932 solver.cpp:244]  Train net output #0: loss = 0.385975 (* 1 = 0.385975 loss) 
I0927 11:34:02.543514 16932 sgd_solver.cpp:106] Iteration 80, lr = 0.001 
I0927 11:34:17.297544 16932 solver.cpp:228] Iteration 100, loss = 0.526758 
I0927 11:34:17.297659 16932 solver.cpp:244]  Train net output #0: loss = 0.526758 (* 1 = 0.526758 loss) 
I0927 11:34:17.297722 16932 sgd_solver.cpp:106] Iteration 100, lr = 0.001 
I0927 11:34:31.962934 16932 solver.cpp:228] Iteration 120, loss = 0.792767 

I want to extract the following information

[ Iteration, Train net output, lr ]

and put it in a MATLAB cell array.

Could you guide me on how to do this?

+0

Should the 'Iteration' in your output come from the 'sgd_solver' lines or from the 'solver' lines? 'regexp' should be able to handle this, but you may need to run it more than once.

+0

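You can read the file line by line, then use 'strfind' to locate each keyword and cut the string accordingly. For example, for "Iteration", find the start index of the keyword "Iteration" (i1), then find the next comma (i2). The value then lies at [i1 + 9:i2-1].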
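
A minimal sketch of the strfind approach from this comment, applied to a single line taken from the log in the question. The variable names (tline, key, commas, iter) are only illustrative, and it relies on str2double ignoring the leading space in front of the number.

```matlab
% One line from the log in the question, used here as sample input.
tline = 'I0927 11:33:33.221546 16932 solver.cpp:228] Iteration 40, loss = 0.573027';

key = 'Iteration';
iter = NaN;                              % stays NaN if the keyword is missing
i1 = strfind(tline, key);                % start indices of the keyword
if ~isempty(i1)
    i1 = i1(1);                          % use the first occurrence
    commas = strfind(tline, ',');        % indices of all commas in the line
    i2 = commas(find(commas > i1, 1));   % first comma after the keyword
    % The value sits between the end of the keyword and that comma.
    iter = str2double(tline(i1 + numel(key) : i2 - 1));
end
```

The same pattern could be repeated for 'loss = ' and 'lr = ', with the end index chosen as the next space or the end of the line instead of a comma.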

+2

Here is an example of how I extract 'Train net output' from the log: https://regex101.com/r/uGus7S/1. You should be able to modify this expression easily and then use it in MATLAB.

Answers

0

As Some Guy suggested, you can use regexp:

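fid = fopen('log.txt', 'r');
output = {};
tline = fgetl(fid);
while ischar(tline)
    l1 = regexp(tline, 'Iteration\s+(\d+),\s+loss\s+=\s+', 'tokens', 'once');
    if ~isempty(l1)
        % We got the first line of an iteration; the next two lines
        % should hold the train net output and the learning rate.
        line2 = fgetl(fid);
        line3 = fgetl(fid);
        if ~ischar(line2) || ~ischar(line3)
            break;  % the log ends in the middle of an iteration record
        end
        l2 = regexp(line2, 'Train net output #0: loss = (\S+)', 'tokens', 'once');
        l3 = regexp(line3, 'Iteration \d+, lr = (\S+)', 'tokens', 'once');
        if ~isempty(l2) && ~isempty(l3)
            output{end+1} = [str2double(l1{1}), str2double(l2{1}), str2double(l3{1})];
        end
    end
    tline = fgetl(fid);
end
fclose(fid);
output = vertcat(output{:});  % one row per iteration: [Iteration, Train net output, lr]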

By the way, are you aware of Caffe's $CAFFE_ROOT/tools/extra/parse_log.py utility?

+1

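Instead of running 'regexp' inside a while loop, you could read everything at once with 'fread' and use 'regexp' a single time to get '[Iteration, Train net output, lr]'. Using 'regexp' the way you do has no advantage over 'strfind'; in fact it is harder to understand.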
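
A rough sketch of what this comment describes: read the whole log as one string and run a single regexp that captures all three values per iteration. To keep it self-contained, the text is built here from a few lines of the log in the question; with a real file it would presumably come from fileread or fread instead. The pattern assumes each 'Iteration N, loss' line is immediately followed by its 'Train net output' line and its 'lr' line, and the names expr, tok and vals are only illustrative.

```matlab
% Sample text standing in for the file contents (two complete records).
txt = sprintf('%s\n', ...
    'I0927 11:33:33.221546 16932 solver.cpp:228] Iteration 40, loss = 0.573027', ...
    'I0927 11:33:33.221771 16932 solver.cpp:244]  Train net output #0: loss = 0.573027 (* 1 = 0.573027 loss)', ...
    'I0927 11:33:33.221851 16932 sgd_solver.cpp:106] Iteration 40, lr = 0.001', ...
    'I0927 11:33:47.883162 16932 solver.cpp:228] Iteration 60, loss = 0.852016', ...
    'I0927 11:33:47.884717 16932 solver.cpp:244]  Train net output #0: loss = 0.852016 (* 1 = 0.852016 loss)', ...
    'I0927 11:33:47.884812 16932 sgd_solver.cpp:106] Iteration 60, lr = 0.001');

% One pattern spanning three consecutive lines, with one token per value.
expr = ['Iteration (\d+), loss = \S+[^\n]*\n' ...
        '[^\n]*Train net output #0: loss = (\S+)[^\n]*\n' ...
        '[^\n]*Iteration \d+, lr = (\S+)'];

tok = regexp(txt, expr, 'tokens');      % 1-by-N cell, each holding 3 strings
vals = str2double(vertcat(tok{:}));     % N-by-3: [Iteration, Train net output, lr]
```

Records that are incomplete, such as a trailing 'Iteration N, loss' line with nothing after it, simply do not match and are skipped.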

1

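I removed the first two lines and the last line of your log to make it consistent, so that each iteration is followed by its Train net output line and its sgd_solver .. lr = line, like this: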

I0927 11:33:33.221546 16932 solver.cpp:228] Iteration 40, loss = 0.573027 
I0927 11:33:33.221771 16932 solver.cpp:244]  Train net output #0: loss = 0.573027 (* 1 = 0.573027 loss) 
I0927 11:33:33.221851 16932 sgd_solver.cpp:106] Iteration 40, lr = 0.001 
I0927 11:33:47.883162 16932 solver.cpp:228] Iteration 60, loss = 0.852016 
I0927 11:33:47.884717 16932 solver.cpp:244]  Train net output #0: loss = 0.852016 (* 1 = 0.852016 loss) 
I0927 11:33:47.884812 16932 sgd_solver.cpp:106] Iteration 60, lr = 0.001 
I0927 11:34:02.543320 16932 solver.cpp:228] Iteration 80, loss = 0.385975 
I0927 11:34:02.543442 16932 solver.cpp:244]  Train net output #0: loss = 0.385975 (* 1 = 0.385975 loss) 
I0927 11:34:02.543514 16932 sgd_solver.cpp:106] Iteration 80, lr = 0.001 
I0927 11:34:17.297544 16932 solver.cpp:228] Iteration 100, loss = 0.526758 
I0927 11:34:17.297659 16932 solver.cpp:244]  Train net output #0: loss = 0.526758 (* 1 = 0.526758 loss) 
I0927 11:34:17.297722 16932 sgd_solver.cpp:106] Iteration 100, lr = 0.001 

You can read the file in as text with fileread and then run regexp with the following code:

txt = fileread('log.txt'); 
it = regexp(txt,'I0927.*solver.cpp:228]\sIteration\s(.*),.*','tokens','dotexceptnewline') 

it = 

    1×4 cell array 

    {1×1 cell} {1×1 cell} {1×1 cell} {1×1 cell} 

net_out = regexp(txt,'I0927.*solver.cpp:244]\s*Train\snet\soutput.*loss\s=\s(\S*).*','tokens','dotexceptnewline'); 
lr = regexp(txt,'I0927.*sgd_solver.cpp:106]\sIteration.*lr\s=\s(\S*)','tokens','dotexceptnewline'); 

The output needs a little conditioning before it can be converted to numbers:

% Get outputs out of their cells 
it = [it{:}]'; 
net_out = [net_out{:}]'; 
lr = [lr{:}]'; 

sim_out = str2double([it net_out lr]); 
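
Since the question asks for a MATLAB cell array rather than a numeric matrix, one possible last step is num2cell. This is only a sketch: the sim_out values below are a hypothetical stand-in for the matrix computed above, and the header row is an optional addition that is not part of the original answer.

```matlab
% Hypothetical values standing in for the sim_out matrix computed above.
sim_out = [40 0.573027 0.001; 60 0.852016 0.001];

% One cell per value: each row is {Iteration, Train net output, lr}.
result = num2cell(sim_out);

% Optional header row to label the columns.
result_with_header = [{'Iteration', 'Train net output', 'lr'}; result];
```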