2016-09-07 18 views
1

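Python: extract a pattern from stdout and save it to CSV

I have log files of roughly 20,000-30,000 lines that contain all kinds of data. Each line starts with the current timestamp, followed by the file path and line number, and then the value of an object along with some additional (unnecessary) information.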

2016/08/31 17:27:43/usr/log/data/old/objec: 540: Adjustment Stat 
2016/08/31 17:27:43/usr/log/data/old/objec: 570: Position: 1 
2016/08/31 17:27:43/usr/log/data/old/object::1150: Adding new object in department xxxx 
2016/08/31 17:27:43/usr/log/data/old/file1.java:: 728: object ID: 0 
2016/08/31 17:27:43/usr/log/data/old/file2.java:: 729: Start location:1 
2016/08/31 17:27:43/usr/log/data/old/file1.java:: 730: End location:55 
2016/08/31 17:27:43/usr/log/data/old/: 728: object ID: 1 
2016/08/31 17:27:43/usr/log/data/old/: 729: Start location:56 
2016/08/31 17:27:43/usr/log/data/old/: 730: End location:67 
2016/08/31 17:27:43/usr/log/data/old/: 728: object ID: 2 
2016/08/31 17:27:43/usr/log/data/old/: 729: Start location:68 
2016/08/31 17:27:43/usr/log/data/old/: 730: End location:110 
Timer to Calculate location of object x took 0.004935 seconds 

.... ... ... the same information repeats for new objects. Each file has 30-40 groups of objects, and they vary (IDs between 0 and 3).

I want to extract this information (starting from the line after Adjustment Stat) and save it in a text file like:
Position ObjectID StartLocation EndLocation 
      0    1    55 
      1    56    67 
      2    68    110 

... ... ...

(there is no object with ID 0 here) ...

Or maybe store it in a CSV file like:
    0,1,55 
    1,56,67 
    2,68,110 

Answers

3
import csv

with open('out.csv', 'w') as output_file, open('in.txt') as input_file:
    writer = csv.writer(output_file)
    for line in input_file:
        # The value is whatever follows the last colon on the line.
        if 'object ID:' in line:
            object_id = line.split(':')[-1].strip()
        elif 'Start location:' in line:
            start_loc = line.split(':')[-1].strip()
        elif 'End location:' in line:
            end_loc = line.split(':')[-1].strip()
            # "End location" closes a record, so write the row here.
            writer.writerow((object_id, start_loc, end_loc))


Version for Python 2.6:

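import csv
import contextlib

# Python 2.6 has no multi-manager "with" statement, so contextlib.nested is used.
with contextlib.nested(open('out.csv', 'w'), open('in.txt')) as (output_file, input_file):
    writer = csv.writer(output_file)
    for line in input_file:
        # The value is whatever follows the last colon on the line.
        if 'object ID:' in line:
            object_id = line.split(':')[-1].strip()
        elif 'Start location:' in line:
            start_loc = line.split(':')[-1].strip()
        elif 'End location:' in line:
            end_loc = line.split(':')[-1].strip()
            # "End location" closes a record, so write the row here.
            writer.writerow((object_id, start_loc, end_loc))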

out.csv (with the in.txt from the OP):

0,1,55 
1,56,67 
2,68,110 
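
On Python 3, the same idea could also be sketched with a regular expression instead of split(':'). This is only an illustration under a few assumptions: the names FIELD_RE, extract_rows and write_csv are made up for this example, the field labels are taken from the log sample in the question, and the output file is opened with newline='' as the csv module documentation recommends for Python 3.

```python
import csv
import re

# Captures the field label and its numeric value, e.g. "Start location:56".
# The labels come from the sample log in the question.
FIELD_RE = re.compile(r'(object ID|Start location|End location):\s*(\d+)')


def extract_rows(lines):
    """Yield (object_id, start, end) tuples from an iterable of log lines."""
    current = {}
    for line in lines:
        match = FIELD_RE.search(line)
        if not match:
            continue  # timestamps, "Adjustment Stat", timer lines, etc.
        key, value = match.groups()
        if key == 'object ID':
            current = {}  # a new object starts; drop any partial record
        current[key] = value
        # Emit a row only when all three fields of one object were seen.
        if key == 'End location' and len(current) == 3:
            yield (current['object ID'],
                   current['Start location'],
                   current['End location'])
            current = {}


def write_csv(in_path, out_path):
    """Read the log at in_path and write one CSV row per object to out_path."""
    with open(in_path) as input_file, \
            open(out_path, 'w', newline='') as output_file:
        csv.writer(output_file).writerows(extract_rows(input_file))
```

Calling write_csv('in.txt', 'out.csv') would then play the role of the script above. Unlike the split-based version, a record that is missing its object ID or Start location line is skipped rather than written with a stale value.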
+0

The logic is really nice. The only problem is that it doesn't work with my old Python (2.6)... could you convert it to work on old Python? Thanks – Naumann

+0

Support for several context managers in a single with statement was introduced in 2.7 - try the edited 2.6 version. –

+0

Awesome... thanks Craig, it works fine on 2.6 too, and more importantly, the way you used contextlib.nested... that will solve many of my other problems. – Naumann
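
A note on the 2.6 discussion above: contextlib.nested was deprecated in Python 2.7 and is not available in current Python 3, and with it a file opened first is not closed by the context manager if a later open() raises. Nesting two plain with statements avoids both issues and runs on 2.6 as well as newer versions. The sketch below reuses the answer's split-based logic; the function name extract_to_csv is hypothetical, and the guard against a missing object ID is an added assumption, not part of the original answer.

```python
import csv


def extract_to_csv(in_path, out_path):
    """Write one (object_id, start, end) CSV row per object found in in_path."""
    # Two nested "with" blocks: each file is closed on exit, even if
    # opening the second file fails.
    with open(in_path) as input_file:
        with open(out_path, 'w') as output_file:
            writer = csv.writer(output_file)
            object_id = start_loc = None
            for line in input_file:
                # The value is whatever follows the last colon on the line.
                value = line.split(':')[-1].strip()
                if 'object ID:' in line:
                    object_id = value
                elif 'Start location:' in line:
                    start_loc = value
                elif 'End location:' in line and object_id is not None:
                    # "End location" closes a record, so write the row here.
                    writer.writerow((object_id, start_loc, value))
```

It would be called as extract_to_csv('in.txt', 'out.csv'), matching the file names used in the answer.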
