2014-10-04 68 views

Python csv delimiter not working properly

I am trying to write Python code to extract the DV/DL values from an input file. Here is the truncated input:

 A V E R A G E S O V E R 50000 S T E P S 


NSTEP = 50000 TIME(PS) =  300.000 TEMP(K) = 300.05 PRESS = -70.0 
Etot = -89575.9555 EKtot =  23331.1725 EPtot  = -112907.1281 
BOND =  759.8213 ANGLE =  2120.6039 DIHED  =  4231.4019 
1-4 NB =  940.8403 1-4 EEL =  12588.1950 VDWAALS =  13690.9435 
EELEC = -147238.9339 EHBOND =   0.0000 RESTRAINT =   0.0000 
DV/DL =  13.0462 
EKCMT =  10212.3016 VIRIAL =  10891.5181 VOLUME  = 416404.8626 
               Density =   0.9411 
Ewald error estimate: 0.6036E-04 



     R M S F L U C T U A T I O N S 


NSTEP = 50000 TIME(PS) =  300.000 TEMP(K) =  1.49 PRESS = 129.9 
Etot =  727.7890 EKtot =  115.7534 EPtot  =  718.8344 
BOND =  23.1328 ANGLE =  36.1180 DIHED  =  19.9971 
1-4 NB =  12.7636 1-4 EEL =  37.3848 VDWAALS =  145.7213 
EELEC =  739.4128 EHBOND =   0.0000 RESTRAINT =   0.0000 
DV/DL =   3.7510 
EKCMT =  76.6138 VIRIAL =  1195.5824 VOLUME  =  43181.7604 
               Density =   0.0891 
Ewald error estimate: 0.4462E-04 

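Basically, the real input contains many DV/DL values (unlike the truncated input above), and we only need the last two. So we read all of them into a list and keep only the last two. Finally, we write those last two DV/DL values to a csv file. Desired output: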

13.0462,3.7510

However, the script below (Python 2.7) produces the output shown here instead. Can any guru enlighten me? Thanks.

13.0462"" 3.7510""

Here is the script:

import os 
import csv 

DVDL=[] 
filename="input.out" 
file=open(filename,'r') 

with open("out.csv",'wb') as outfile: # define output name 
    line=file.readlines() 
    for a in line: 
     if ' DV/DL =' in a: 
      DVDL.append(line[line.index(a)].split('  ')[1]) # Extract DVDL number 

    print DVDL[-2:]  # We only need the last two DVDL 
    yeeha="".join(str(a) for a in DVDL[-2:]) 
    print yeeha 
    writer = csv.writer(outfile, delimiter=',',lineterminator='\n')#Output the list into a csv file called "outfile" 
    writer.writerows(yeeha) 

By writing `yeeha = "".join(str(a) for a in DVDL[-2:])` you pass a single string to the writer, and the "writer" then breaks the words up into characters. – 2014-10-04 18:52:56
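To illustrate the comment above, here is a minimal sketch of how `csv.writer` treats a string versus a list. It assumes Python 3 (the question uses 2.7), uses `io.StringIO` as a stand-in for the output file, and takes the two sample values from the question; the variable names are illustrative only.

```python
import csv
import io

values = ["13.0462", "3.7510"]  # the last two DV/DL values from the question

# writerows() expects an iterable of rows. A plain string is iterated
# character by character, so every character becomes its own row.
buf_chars = io.StringIO()
csv.writer(buf_chars, lineterminator="\n").writerows("".join(values))
chars_output = buf_chars.getvalue()

# writerow() with a list writes a single row, one field per list item.
buf_row = io.StringIO()
csv.writer(buf_row, lineterminator="\n").writerow(values)
row_output = buf_row.getvalue()

print(repr(chars_output))
print(repr(row_output))  # '13.0462,3.7510\n'
```

Passing the list itself to `writerow` (singular) is what should produce the comma-separated line the question asks for.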

If I use writer.writerows(DVDL[-2:]) instead of writer.writerows(yeeha), the output looks like this: – Chubaka 2014-10-04 18:56:19

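If you only need the lines [DV/DL = 13.0462, DV/DL = 3.7510], then IMHO a compiled **regular expression** search over the whole text file is a smarter way to grab these values, because it hands you the two values directly. – user3666197 2014-10-04 18:56:51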
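The whole-file search suggested in this comment might look roughly like the sketch below. The `text` string is a shortened stand-in for the contents of `input.out` (an assumption; the real file would be read with `open(...).read()`), and the pattern is a simplified one written for this illustration, not the commenter's own.

```python
import re

# Shortened stand-in for the contents of input.out (assumption for this sketch).
text = """\
 EELEC = -147238.9339 EHBOND = 0.0000 RESTRAINT = 0.0000
 DV/DL = 13.0462
 EKCMT = 10212.3016 VIRIAL = 10891.5181 VOLUME = 416404.8626
 DV/DL = 3.7510
"""

# Capture the number that follows "DV/DL =", allowing an optional sign.
pattern = re.compile(r"DV/DL\s*=\s*([+-]?\d+\.\d+)")

# With one capturing group, findall() returns the captured strings in file order.
all_values = pattern.findall(text)
last_two = all_values[-2:]

print(",".join(last_two))  # 13.0462,3.7510
```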

Answers

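Since the commenter who proposed this approach has not had a chance to sketch the code, here is how I would suggest doing it (edited to allow optionally signed floating-point numbers with an optional exponent, as suggested by an answer to "Python regular expression that matches floating point numbers"):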

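import re

# Match "DV/DL = <number>", allowing an optional sign and an optional exponent
pat = re.compile(r"DV/DL += +([+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?)")
values = []
with open("input.out", "r") as infile:
    for line in infile:
        m = pat.search(line)
        if m:
            values.append(m.group(1))  # keep the matched number as text
# Write only the last two values, separated by a comma
with open("out.csv", "w") as outfile:
    outfile.write(",".join(values[-2:]))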

After running this script:

$ cat out.csv 
13.0462,3.7510 

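I did not use the csv module here because it is not really necessary for such a simple output file. However, adding the following lines to the script would use csv to write the same data to out1.csv: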

import csv

# The with block closes the file, so the row is flushed to disk
with open("out1.csv", "w") as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(values[-2:])
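For readers on Python 3, a hedged variant of the csv lines above is sketched here. The `csv` documentation recommends opening the file with `newline=""` on Python 3 (on Python 2 the usual mode was `'wb'`). The temporary directory and the `out_path` name are assumptions made to keep the sketch self-contained; `values` holds the two sample numbers from the question.

```python
import csv
import os
import tempfile

values = ["13.0462", "3.7510"]  # sample DV/DL values from the question

# Hypothetical output location; a temporary directory keeps the sketch self-contained.
out_path = os.path.join(tempfile.mkdtemp(), "out1.csv")

# Python 3: open csv files with newline="" so the writer controls line endings.
with open(out_path, "w", newline="") as csvfile:
    csv.writer(csvfile, lineterminator="\n").writerow(values[-2:])

# Read the file back to check what was written.
with open(out_path, newline="") as csvfile:
    written = csvfile.read()

print(repr(written))  # '13.0462,3.7510\n'
```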

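Thanks! After reading this script and some of the "confusing" Python regular expression manuals, I now understand the basics of regular expressions in Python! – Chubaka 2014-10-04 22:29:19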

This works fine when DV/DL is a positive float, but not when it is a negative float. Can you enlighten me? – Chubaka 2014-10-05 00:37:11

I tried: pat = re.compile("DV/DL \= r'[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?'") and pat = re.compile("DV/DL += +(^-?\d+(\.\d+)?$)"), among others. – Chubaka 2014-10-05 00:37:41
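As a hedged illustration of the negative-number case raised in these comments: the pattern as it appears in the answer includes `[+-]?`, so it should accept signed values. The sample lines below are hypothetical (the question's input has no negative DV/DL value), and the second pattern is the anchored attempt quoted in the comment.

```python
import re

# Pattern from the answer: optional sign, digits, optional fraction and exponent.
pat = re.compile(r"DV/DL += +([+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?)")

# Hypothetical lines for illustration only.
lines = [" DV/DL  =    -13.0462", " DV/DL  =     3.7510", " DV/DL  =  -1.5E-02"]
found = [pat.search(line).group(1) for line in lines]
print(found)  # ['-13.0462', '3.7510', '-1.5E-02']

# Without re.MULTILINE, '^' matches only at the very start of the string,
# so a '^' placed after "DV/DL = " can never match and search() returns None.
anchored = re.compile(r"DV/DL += +(^-?\d+(\.\d+)?$)")
anchored_result = anchored.search(" DV/DL  =    -13.0462")
print(anchored_result)  # None
```

The first attempt quoted in the comment embeds a literal `r'` inside the pattern string, so it would only match input that actually contains those characters.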