2017-03-17 158 views
-1

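Read a file, remove the text fields, keep the numeric data

I want to write a small Python script to plot some .dat files. To do that, I first need to process the files. A .dat file looks like this: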

(Real64 
(numDims 1) 
(size 513) 
(data 
    [ 90.0282291905089 90.94377050431068 92.31708247501335 93.38521400778211 94.60593575951782 95.67406729228657 97.04737926298925 97.96292057679104 ...] 
) 
) 

I want to remove the text parts and the "normal" parentheses. I only need the data between [...].

I tried something like this:

from Tkinter import Tk 
from tkFileDialog import askopenfilename 

# just a small GUI to get the file 
Tk().withdraw() 
filename = askopenfilename() 

import numpy as np 

with open(filename) as f: 
    temp = f.readlines(5) #this is the line in the .dat file 

    for i in range(len(temp)-1): 
     if type(temp[i]) == str: 
      del temp[i] 

However, this raises an "index out of range" error. Any help would be appreciated.
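
The index error comes from deleting items from `temp` while the loop still counts up to the original length: every `del` shortens the list, so later values of `i` point past its end. Two further details are worth noting: `readlines(5)` treats `5` as a size hint rather than a line number, and every element returned by `readlines()` is already a `str`, so the type check is always true. A minimal sketch that sidesteps all three, assuming Python 3 and assuming the data sits on a single line that starts with `[` (as in the sample), builds a new list instead of mutating the old one. The helper names `bracketed_lines` and `parse_values` are hypothetical.

```python
def bracketed_lines(lines):
    """Return only the lines that hold the bracketed data."""
    # A comprehension builds a new list, so nothing is deleted mid-iteration.
    return [line for line in lines if line.lstrip().startswith("[")]


def parse_values(line):
    """Turn a line such as '[ 1.0 2.0 ]' into [1.0, 2.0]."""
    return [float(tok) for tok in line.strip().strip("[]").split()]


# Sample lines in the shape readlines() would return them.
sample = [
    "(Real64\n",
    "(numDims 1)\n",
    "(size 513)\n",
    "(data\n",
    "    [ 90.0282291905089 90.94377050431068 92.31708247501335 ]\n",
    ")\n",
    ")\n",
]

values = []
for line in bracketed_lines(sample):
    values.extend(parse_values(line))
print(values)
```

If the bracketed block can span several lines, filtering line by line is not enough and the whole file should be treated as one string instead.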

+1

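Where do you get the '.dat' file from? Can whatever produces it give you another format (such as JSON)? If not, you could replace the spaces with commas and parse it as JSON. –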

+1

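What do you mean by *"remove the text parts"*? Be clear. Show us the expected output for the given input. Should '(size 513)' become '(513)', or '513', or be removed entirely? You can do all of this with regular expressions, but you haven't specified what you want to do. – smci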

+0

Have you tried using regular expressions? – chbchb55
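
The JSON idea from the first comment can be sketched roughly as follows. This assumes Python 3, that the text contains exactly one `[...]` block, and that every token inside it is a plain number. `bracket_block_to_list` is a hypothetical helper name, not part of any library.

```python
import json


def bracket_block_to_list(text):
    """Parse the first [...] block of whitespace-separated numbers via JSON."""
    start = text.index("[")
    end = text.index("]", start)
    tokens = text[start + 1:end].split()
    # Joining the tokens with commas yields a valid JSON array literal.
    return json.loads("[" + ", ".join(tokens) + "]")


sample = """(Real64
(numDims 1)
(size 513)
(data
    [ 90.0282291905089 90.94377050431068 92.31708247501335 ]
)
)"""

numbers = bracket_block_to_list(sample)
print(numbers)
```

Only the bracketed block is converted here; the surrounding parenthesised header is not valid JSON and is skipped.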

Answers

0

I only need the data between [...]

# treat the whole thing as a string 
temp = '''(Real64 
(numDims 1) 
(size 513) 
(data 
    [ 90.0282291905089 90.94377050431068 92.31708247501335 ] 
) 
)''' 

# split() at open bracket; take everything right 
# then split() at close bracket; take everything left 
# strip() trailing/leading white space 
number_string = temp.split('[')[1].split(']')[0].strip() 

# convert to list of floats, because I expect you'll need to 
number_list = [float(i) for i in number_string.split(' ')] 

print number_string 
print number_list 

>>> 90.0282291905089 90.94377050431068 92.31708247501335 
>>> [90.0282291905089, 90.94377050431068, 92.31708247501335] 
+0

This works great, thank you! –
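
The split-based approach in the answer above can also be applied to a file on disk. The sketch below assumes Python 3 and a single `[...]` block per file; `load_dat_values` is a hypothetical helper name and the sample file is made up for the demonstration. It calls `split()` with no arguments so that runs of spaces or line breaks inside the brackets do not produce empty tokens, which `float()` would reject.

```python
import os
import tempfile


def load_dat_values(path):
    """Read a .dat file and return the numbers between '[' and ']'."""
    with open(path) as f:
        text = f.read()
    # Everything right of the first '[' and left of the following ']'.
    inner = text.split("[", 1)[1].split("]", 1)[0]
    # split() without arguments handles any mix of spaces and newlines.
    return [float(tok) for tok in inner.split()]


# Write a small sample file so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".dat", delete=False) as tmp:
    tmp.write(
        "(Real64\n(numDims 1)\n(size 3)\n(data\n"
        "    [ 90.0282291905089  90.94377050431068\n"
        "      92.31708247501335 ]\n)\n)\n"
    )
    sample_path = tmp.name

loaded = load_dat_values(sample_path)
os.remove(sample_path)
print(loaded)
```

The resulting list can be passed to `numpy.array` or a plotting call as needed.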

0
import re  # f is the already opened .dat file
print re.findall(r"\[([0-9. ]+)\]", f.read())

This is called a regular expression, and it says: find me everything made up of digits, periods, and spaces between two square brackets.

\[ # literal left bracket 
( # capture the stuff in here
[0-9. ] # accept 0-9 and . and space 
+ # at least one ... probably more 
) # end capture group 
\] # literal close bracket 
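
The annotated pattern above can be written almost verbatim using `re.VERBOSE`, which lets a pattern carry its own comments. This is a sketch assuming Python 3; the name `BRACKET_BLOCK` is hypothetical. In verbose mode whitespace in the pattern is ignored except inside a character class, so the space in `[0-9. ]` still matches. Note that this character class does not cover minus signs or exponent notation such as `1e-05`, so it would need extending if the data can contain them.

```python
import re

BRACKET_BLOCK = re.compile(r"""
    \[              # literal left bracket
    (               # capture the stuff in here
        [0-9. ]+    # one or more digits, periods, or spaces
    )               # end capture group
    \]              # literal close bracket
""", re.VERBOSE)

sample = """(Real64
(numDims 1)
(size 513)
(data
    [ 90.0282291905089 90.94377050431068 92.31708247501335 ]
)
)"""

# findall returns one string per bracketed block (the captured group).
matches = BRACKET_BLOCK.findall(sample)
# Convert the first block into floats.
regex_values = [float(tok) for tok in matches[0].split()]
print(regex_values)
```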

Or you can use something like pyparsing:

inputdata = '''(Real64 
(numDims 1) 
(size 513) 
(data 
    [ 90.0282291905089 90.94377050431068 92.31708247501335 93.38521400778211 94.60593575951782 95.67406729228657 97.04737926298925 97.96292057679104 ...] 
) 
) 
''' 
from pyparsing import OneOrMore, nestedExpr 

data = OneOrMore(nestedExpr()).parseString(inputdata) 
print "GOT:", data[0][-1][2:-1] 
+0

Thanks, this really helped me out! –