2014-09-24 51 views
0

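I'm new to Python programming. I'm working with a text file that is the result file of a piece of software. Basically, whenever we use the software, it writes all of its messages to the result text file (similar to a log file). The task: find the most recently written table in that .txt file.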

My problem is that the file contains many tables like the one below:

it may have some million lines above 
* ============================== INERTIA ============================== 
* File: /home/hamanda/transfer/cradle_vs30_dkaplus01_fwd_dl140606_fem140704_v00.bif 
* Solver: Nastran 
* Date: 24/09/14 
* Time: 10:29:50 
* Text: 
* 
* Area        +1.517220e+06 
* Volume        +5.852672e+06 
* 
* Structural mass     +4.594348e-02 
* MASS elements      +0.000000e+00 
* NSM on property entry    +0.000000e+00 
* NSM by parts (VMAGen and MPBalanc) +0.000000e+00 
* NSM by NSMCreate     +0.000000e+00 
* Total mass       +4.594348e-02 
* 
* Center of gravity 
* in the global   +1.538605e+02 +3.010898e+00 -2.524868e+02 
* coordinate system 
* 
* Moments of inertia +8.346990e+03 +6.187810e-01 +1.653922e+03 
* about the global  +6.187810e-01 +5.476398e+03 +4.176218e+01 
* coordinate system  +1.653922e+03 +4.176218e+01 +7.746156e+03 
* 
* Steiner share   +2.929294e+03 +4.016500e+03 +1.088039e+03 
* 
* Moments of inertia +5.417696e+03 +2.190247e+01 -1.308790e+02 
* about the center  +2.190247e+01 +1.459898e+03 +6.835397e+00 
* of gravity   -1.308790e+02 +6.835397e+00 +6.658117e+03 
* --------------------------------------------------------------------- 
some lines below and this table may repeat if user does any change to area and volume 
values.---------- 

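Now my question is: how do I print the latest table to the console? I can print the first occurrence of the table, but I cannot get its latest occurrence.

I need to print the latest table to the console. How can I do that? Here is my code: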

input = open(fileName,'r')
intable = False
for line in input:
    if line.strip() == "* ============================== INERTIA ==============================":
        intable = True
    if line.strip() == "* ---------------------------------------------------------------------":
        intable = False
        break
    if intable and line.strip().startswith("*"):
        z1 = line.strip()
        print(z1)
+0

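You're off to a good start, but it's not clear where you're stuck. Parse out the dates and compare each one with the latest date seen so far. If it is newer, keep it. At the end of the file, print the one you kept. What are you having trouble with? – tripleee 2014-09-24 05:21:57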
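The comment above can be sketched roughly as follows. This is only an illustration, not tripleee's code: the function names are made up, and it assumes the `Date:` field is day/month/two-digit-year (as `24/09/14` suggests) and that every table ends with the dashed line. Tables without a date or time are skipped, which matters given the asker's follow-up below.

```python
from datetime import datetime

HEADER = "* ============================== INERTIA =============================="
FOOTER = "* ---------------------------------------------------------------------"


def iter_tables(lines):
    """Yield each complete INERTIA table as a list of stripped lines."""
    table = None
    for raw in lines:
        line = raw.strip()
        if line == HEADER:
            table = [line]          # a new table starts here
        elif table is not None:
            table.append(line)
            if line == FOOTER:      # the dashed line closes the table
                yield table
                table = None


def table_timestamp(table):
    """Return the table's Date/Time as a datetime, or None if either is missing."""
    fields = {}
    for line in table:
        body = line.lstrip("* ")
        for key in ("Date:", "Time:"):
            if body.startswith(key):
                fields[key] = body[len(key):].strip()
    try:
        # Assumed format: dd/mm/yy and HH:MM:SS
        stamp = fields["Date:"] + " " + fields["Time:"]
        return datetime.strptime(stamp, "%d/%m/%y %H:%M:%S")
    except (KeyError, ValueError):
        return None


def newest_table(lines):
    """Return the table with the latest timestamp, or None if no table has one."""
    best, best_ts = None, None
    for table in iter_tables(lines):
        ts = table_timestamp(table)
        if ts is not None and (best_ts is None or ts >= best_ts):
            best, best_ts = table, ts
    return best
```

Passing an open file object as `lines` works, since the functions only iterate over it line by line.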

+0

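If you can change the whole process, a better approach might be to save each table to its own file. Better still, whatever generates these tables could write them in a machine-readable format; JSON is very popular and easy to work with. – tripleee 2014-09-24 05:23:41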

+0

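I can do that, but sometimes the date and time may not be written: if you change the area and volume while the software is running, it just writes a new table, and it may not write the date and time at which that new table was created. That is my trouble, because I can't tell the tables apart. @tripleee – ayaan 2014-09-24 05:25:26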

Answers

1

Try this:

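with open(fileName, 'r') as f:
    content = f.readlines()

header = "* ============================== INERTIA =============================="
index = None
# Walk backwards so that the first header found is the last one in the file
for i in range(len(content) - 1, -1, -1):
    if content[i].strip() == header:
        index = i
        break

if index is not None:
    # Print everything from the last header to the end of the file
    for line in content[index:]:
        print(line, end='')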
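Since the asker says the date and time are not always written, file order can stand in for recency. A minimal sketch of that idea, adapted from the asker's own loop (the function name `last_table` is made up for illustration): instead of breaking at the first dashed line, remember each completed table and return whichever one was seen last. It reads the file line by line, so it never holds more than one table in memory.

```python
HEADER = "* ============================== INERTIA =============================="
FOOTER = "* ---------------------------------------------------------------------"


def last_table(lines):
    """Return the lines of the last complete INERTIA table, or [] if there is none."""
    current = None  # table being collected right now, if any
    latest = []     # most recent complete table seen so far
    for raw in lines:
        line = raw.strip()
        if line == HEADER:
            current = [line]                 # start over at every header
        elif current is not None and line.startswith("*"):
            current.append(line)
            if line == FOOTER:               # table finished: remember it
                latest = current
                current = None
    return latest
```

Usage would look like `with open(fileName) as f: print("\n".join(last_table(f)))`. It still scans the whole file once, so it is not the fastest option for very large files.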
+0

Yes, I'll get back to you... – ayaan 2014-09-24 05:27:41

+0

I edited the code, and it works now. Thank you. – ayaan 2014-09-24 05:37:49

0

You can also capture the file data in a list, as shown below:

delimiter = '* ============================== INERTIA ==============================\n'
with open(filepath) as f:
    filedata = f.read().split(delimiter)
print(filedata[-1])  # Prints the latest occurrence of the table (without its header line)

I don't know about the code's efficiency, but it definitely works. If needed, you can also list all the other occurrences of the table.
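Listing every occurrence could look something like the sketch below. The helper name `split_tables` is hypothetical. Unlike the snippet above, the delimiter here has no trailing newline (so trailing spaces after the header do not matter), and each chunk is cut at the dashed line so that unrelated log lines after a table are dropped.

```python
DELIMITER = "* ============================== INERTIA =============================="
FOOTER = "* ---------------------------------------------------------------------"


def split_tables(text):
    """Return every INERTIA table found in text, oldest first."""
    tables = []
    # The first chunk is whatever precedes the first header, so skip it
    for chunk in text.split(DELIMITER)[1:]:
        body, found, _rest = chunk.partition(FOOTER)
        # Re-attach the header, and the footer if this table has one
        tables.append(DELIMITER + body + (FOOTER if found else ""))
    return tables
```

`split_tables(text)[-1]` is then the latest table and the earlier entries are the older ones. Like the original snippet, this reads the whole file into memory.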

+0

This is the quickest approach to write, but it is inefficient. If the result file is very large, it will take a long time. – han058 2014-09-24 06:09:03

+0

Yes, the result file has about 1212567 lines, and it takes a lot of time....... – ayaan 2014-09-24 06:23:36

+0

..., @ayaan then I recommend using bash. Otherwise, the best way to do it in Python is a bit complicated. – han058 2014-09-24 06:29:11

1

If you can use bash, the following is a more efficient approach.

RESULT_FILE="result_text_file_name"
# Line number of the last table header and of the last dashed footer line
START_LINE=$(grep -n "===== INERTIA ====" "$RESULT_FILE" | tail -1 | cut -d":" -f1)
END_LINE=$(grep -n " --------------" "$RESULT_FILE" | tail -1 | cut -d":" -f1)
LINE_COUNT=$(wc -l < "$RESULT_FILE")
# Keep everything from the last header onward, then cut it off at the footer
tail -n $((LINE_COUNT - START_LINE + 1)) "$RESULT_FILE" | head -n $((END_LINE - START_LINE + 1))

Or, if you want Python, read the post "How to read lines from a file in python starting from the end".

I wrote the code below by following that page (it reads the lines in reverse order).

I assume the result file is "test.txt".

#!/usr/bin/env python
import os


class BackwardsReader:
    """Read a file, returning the lines in reverse order for each call of readline().

    This actually just reads blocks (4096 bytes by default) of data from the end of
    the file and returns the last line in an internal buffer. I believe all the corner
    cases are handled, but never can be sure...
    """

    def __init__(self, file, blksize=4096):
        """Initialize the internal structures."""
        # get the file size
        self.size = os.stat(file).st_size
        # how big of a block to read from the file...
        self.blksize = blksize
        # how many blocks we've read
        self.blkcount = 1
        self.f = open(file, 'rb')
        # if the file is larger than the block size, read only the last block,
        # otherwise, read the whole thing...
        if self.size > self.blksize:
            self.f.seek(-self.blksize * self.blkcount, 2)  # read from end of file
        self.data = self.f.read(self.blksize).split(b'\n')
        # strip the last item if it's empty... a byproduct of the last line having
        # a newline at the end of it
        if not self.data[-1]:
            self.data = self.data[:-1]

    def readline(self):
        """Return the previous line (with a trailing newline), or "" at the start of the file."""
        # Only one item left: it may be a partial line, so prepend another block
        while len(self.data) == 1 and (self.blkcount * self.blksize) < self.size:
            self.blkcount = self.blkcount + 1
            line = self.data[0]
            try:
                self.f.seek(-self.blksize * self.blkcount, 2)  # read from end of file
                self.data = (self.f.read(self.blksize) + line).split(b'\n')
            except OSError:  # can't seek before the beginning of the file
                self.f.seek(0)
                remaining = self.size - (self.blksize * (self.blkcount - 1))
                self.data = (self.f.read(remaining) + line).split(b'\n')

        if len(self.data) == 0:
            return ""

        line = self.data[-1]
        self.data = self.data[:-1]
        return line.decode() + '\n'


if __name__ == "__main__":
    f = BackwardsReader("test.txt")
    end_line = "---------------------------------------------------"
    start_line = "========= INERTIA ======="
    lines = []

    intable = False
    line = f.readline()
    while line:
        # Reading backwards, the dashed footer is seen before the header
        if line.find(end_line) >= 0:
            intable = True
        if intable:
            lines.append(line)
            if line.find(start_line) >= 0:
                break
        line = f.readline()

    lines.reverse()

    print("".join(lines))

And the test result!

[my server....]$ wc -l test.txt
34008720 test.txt

[my server....]$ time python test.py
* ============================== INERTIA ==============================
* File: /home/hamanda/transfer/cradle_vs30_dkaplus01_fwd_dl140606_fem140704_v00.bif
* Solver: Nastran
* Date: 24/09/14
* Time: 10:29:50
* Text:
*
* Area        +1.517220e+06
* Volume        +5.852672e+06
*
* Structural mass     +4.594348e-02
* MASS elements      +0.000000e+00
* NSM on property entry    +0.000000e+00
* NSM by parts (VMAGen and MPBalanc) +0.000000e+00
* NSM by NSMCreate     +0.000000e+00
* Total mass       +4.594348e-02
*
* Center of gravity
* in the global   +1.538605e+02 +3.010898e+00 -2.524868e+02
* coordinate system
*
* Moments of inertia +8.346990e+03 +6.187810e-01 +1.653922e+03
* about the global  +6.187810e-01 +5.476398e+03 +4.176218e+01
* coordinate system  +1.653922e+03 +4.176218e+01 +7.746156e+03
*
* Steiner share   +2.929294e+03 +4.016500e+03 +1.088039e+03
*
* Moments of inertia +5.417696e+03 +2.190247e+01 -1.308790e+02
* about the center  +2.190247e+01 +1.459898e+03 +6.835397e+00
* of gravity   -1.308790e+02 +6.835397e+00 +6.658117e+03
* ---------------------------------------------------------------------

real    0m0.025s
user    0m0.018s
sys     0m0.006
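A shorter way to search from the end of the file, offered only as a sketch and not as part of the answer above, is to memory-map the file and use `mmap.rfind`. The function name `last_table_mmap` is made up. Note that it looks for the last header and then the first dashed line after it, whereas the script above looks for the last dashed line and then the header before it, so the two can differ if the final table is incomplete. No timing was measured for this variant.

```python
import mmap
import os

HEADER = b"============================== INERTIA =============================="
FOOTER = b"* ---------------------------------------------------------------------"


def last_table_mmap(path):
    """Return the last INERTIA table in the file at path as text, or "" if none."""
    if os.path.getsize(path) == 0:
        return ""  # an empty file cannot be memory-mapped
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            start = mm.rfind(HEADER)          # search backwards from the end
            if start == -1:
                return ""
            # Back up to the beginning of the header line
            start = mm.rfind(b"\n", 0, start) + 1
            end = mm.find(FOOTER, start)
            # If the table has no footer yet, take everything up to the end
            end = len(mm) if end == -1 else end + len(FOOTER)
            return mm[start:end].decode()
```

The operating system pages in only the parts of the file that are actually touched, so the search does not require reading the file into a Python list first.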

+0

The test.txt on the last line should be changed to "$RESULT_FILE". – han058 2014-09-24 06:30:46

+1

Erm, why not just `awk '/INERTIA/ { i=1; delete a } { a[i++]=$0 } END { for (j=1; j<i; ++j) print a[j] }' "$RESULT_FILE"` – tripleee 2014-09-24 06:46:42

+0

I know that approach. I'm just not good with awk, so I wrote it the inelegant way! – han058 2014-09-24 07:17:10