Python: parsing an imported text file with the re module

def regexread(): 
    import re 

    result = '' 
    savefileagain = open('sliceeverfile3.txt','w') 

    #text=open('emeverslicefile4.txt','r') 
    text='09,11,14,34,44,10,11, 27886637, 0\n561, Tue, 5,Feb,2013, 06,25,31,40,45,06,07, 19070109, 0\n560, Fri, 1,Feb,2013, 05,21,34,37,38,01,06, 13063500, 0\n559, Tue,29,Jan,2013,' 

    pattern='\d\d,\d\d,\d\d,\d\d,\d\d,\d\d,\d\d' 
    #with open('emeverslicefile4.txt') as text:  
    f = re.findall(pattern,text) 

    for item in f: 
        print(item)

    savefileagain.write(item) 
    #savefileagain.close()

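The function above is meant to parse the text and return the groups of seven numbers. I have three problems.

First, when I read from the 'read' file, which contains exactly the same text as text='09...etc', I get "TypeError: expected string or buffer", which I cannot solve even after reading several posts about it. Second, when I try to write the results to the 'write' file, nothing is written. Third, I am not sure how to get the same output in the file that I get from the print statement, which is three lines of seven numbers each. That is the output I want.

This is the first time I have used regular expressions, so please be gentle!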

Source

2013-02-12 user1478335

This should do the trick; check the comments in the code for an explanation of what I'm doing here =) Good luck.

import re
filename = 'sliceeverfile3.txt'
# Raw string so the \d escapes reach the regex engine unchanged
pattern = r'\d\d,\d\d,\d\d,\d\d,\d\d,\d\d,\d\d'
new_file = []

# Make sure file gets closed after being iterated
with open(filename, 'r') as f:
    # Read the file contents and generate a list with each line
    lines = f.readlines()

# Iterate each line
for line in lines:

    # Regex applied to each line
    match = re.search(pattern, line)
    if match:
        # Make sure to add \n to display correctly when we write it back
        new_line = match.group() + '\n'
        print(new_line)
        new_file.append(new_line)

with open(filename, 'w') as f:
    # go to start of file (redundant here: 'w' mode already truncates the file)
    f.seek(0)
    # actually write the lines
    f.writelines(new_file)

Source

2013-02-12 22:20:40 OmegaOuter

Thank you. This only returns one line of numbers: 09,11,14,34,44,10,11. Maybe I got the indentation wrong? The file I am reading from looks like this: N1,N2,N3,N4,N5,L1,L2,Jackpot,Wins\n562, Fri, 8,Feb,2013, 09,11,14,34,44,10,11, 27886637, 0\n561, Tue, 5,Feb,2013, 06,25,31,40,45,06,07, 19070109, 0\n560, Fri, 1,Feb,2013, 05,21,34,37,38,01,06, 13063500, 0\n559, Tue,29,Jan,2013, 09,16,26,36,39,02,06,643,1250,2155, Fri,25,Jan,2013, 03,10,18,31,37,02,04, 37772357, 1\n557, Tue,22,Jan,2013, Thanks for your help. – user1478335 2013-02-13 09:54:51

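for line in lines: # Regex applied to each line match = re.findall(pattern, line) if match: # Make sure to add \n to display correctly when we write it back #new_line = match.group() + '\n' print(match) new_file.append(match) lines = f.readlines() I changed the script to this and it seems to work. I take it the file is just one continuous "sentence" and is not separated into lines the way it appears in a text editor? – user1478335 2013-02-13 10:10:29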

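Fixed the issue; I had not really tested the code. I had put f.write instead of f.writelines, which is the correct way to write a list of strings to a file. It will only write the matched numbers to the file. If you need different output, modify the contents of new_line so that the change is reflected in the final file. I would also suggest using a different file name for the output file; it is better to keep the original ;) – OmegaOuter 2013-02-13 23:59:05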
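
For reference, here is a minimal sketch of what the comments above converge on. It assumes the input may be one continuous string rather than separate lines, so it reads the whole file and uses re.findall instead of one re.search per line. The function names extract_groups and copy_groups and both path parameters are hypothetical, chosen only for illustration. The output goes to a separate file so the original is kept.

```python
import re

# Seven two-digit fields separated by commas, same pattern as in the question.
PATTERN = re.compile(r'\d\d,\d\d,\d\d,\d\d,\d\d,\d\d,\d\d')


def extract_groups(text):
    """Return every seven-number group found anywhere in text."""
    return PATTERN.findall(text)


def copy_groups(src_path, dst_path):
    """Read src_path as one string and write one group per line to dst_path.

    Both paths are hypothetical parameters; the source file is left untouched.
    """
    with open(src_path, 'r') as src:
        text = src.read()  # a str, not a file object, so re accepts it
    groups = extract_groups(text)
    with open(dst_path, 'w') as dst:
        # writelines does not add newlines itself, so append one per group
        dst.writelines(group + '\n' for group in groups)
    return groups
```

Because findall scans the whole string, it does not matter whether the rows are separated by real line breaks or the file is a single long line.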

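You're sort of on the right track...

You'll want to iterate over the file: How to iterate over the file in python

and apply the regex to each line. The link above should really answer all three of your questions once you realize that you are trying to write 'item' outside the loop.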

Source

2013-02-12 19:45:29 brwnj
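
As a rough illustration of that advice (a sketch, not the answerer's own code): iterate over the file object, apply the regex to each line, and call write inside the loop so every match is saved rather than only the last one. The helper name write_groups is hypothetical, and io.StringIO objects stand in for real files here so the example runs without touching the disk. Passing the file object itself to re.findall is what raises the TypeError from the question, because re expects a string.

```python
import io
import re

PATTERN = re.compile(r'\d\d,\d\d,\d\d,\d\d,\d\d,\d\d,\d\d')


def write_groups(lines, out):
    """Write each seven-number group found in lines to out, one per line.

    lines is any iterable of strings (an open text file works);
    out is any object with a write method. Returns the number written.
    """
    count = 0
    for line in lines:                  # a file object yields one line at a time
        for item in PATTERN.findall(line):
            out.write(item + '\n')      # inside the loop, so every item is written
            count += 1
    return count


# Stand-ins for real files; with real files, use "with open(...)" so they get closed.
source = io.StringIO(
    '561, Tue, 5,Feb,2013, 06,25,31,40,45,06,07, 19070109, 0\n'
    '560, Fri, 1,Feb,2013, 05,21,34,37,38,01,06, 13063500, 0\n'
)
target = io.StringIO()
written = write_groups(source, target)
```

With real files, opening the output in a with block also makes sure it is closed and flushed, which the original function skipped by commenting out savefileagain.close().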
