Extracting data from a text file to use in a Python script?

Basically, I have a file like this:

Url/Host: www.example.com 
Login:  user 
Password: password 
Data_I_Dont_Need: something_else

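How can I use a regular expression to separate out the details and put them into variables?

Sorry if this is a terrible question, I have never been able to get the hang of RegEx. So a second question: could you provide the RegEx, but also explain what each part of it does?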

2010-05-16 Rob

Isn't using str.split(':') an option? – extraneon 2010-05-16 19:02:51

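You should put these entries in a dictionary rather than in a lot of separate variables. Clearly, the keys you're using need not be acceptable as variable names (the slash in 'Url/Host' would be a killer!-), but they will do just fine as string keys into a dictionary.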

import re 

there = re.compile(r'''(?x)  # verbose flag: allows comments & whitespace 
        ^  # anchor to the start 
         ([^:]+) # group with 1+ non-colons, the key 
         :\s*  # colon, then arbitrary whitespace 
         (.*)  # group everything that follows 
         $   # anchor to the end 
        ''')

Then:

configdict = {}
with open('thefile.txt') as f:
    for aline in f:
        mo = there.match(aline)
        if not mo:
            # Report and skip any line that is not of the form "key: value".
            print("Skipping invalid line %r" % aline)
            continue
        k, v = mo.groups()
        configdict[k] = v

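The ability to make RE patterns "verbose" (by starting them with (?x), or by passing re.VERBOSE as the second argument to re.compile) is very useful: it lets you clarify your REs with comments and nicely aligned whitespace. I think it's sadly underused ;-).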
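As a minimal sketch of the re.VERBOSE alternative mentioned above, the same pattern can be compiled with the flag passed as the second argument instead of the inline (?x). The names line_re and sample are made up for this illustration, and the sample text is copied from the question.

```python
import re

# Same pattern as above, with re.VERBOSE passed as a flag instead of (?x).
line_re = re.compile(r"""
    ^         # anchor to the start of the line
    ([^:]+)   # group 1: one or more non-colon characters (the key)
    :\s*      # the colon separator, then optional whitespace
    (.*)      # group 2: everything that follows (the value)
    $         # anchor to the end of the line
    """, re.VERBOSE)

# Sample text copied from the question.
sample = """Url/Host: www.example.com
Login:  user
Password: password
Data_I_Dont_Need: something_else
"""

configdict = {}
for line in sample.splitlines():
    mo = line_re.match(line)
    if mo:  # lines that do not look like "key: value" are ignored
        key, value = mo.groups()
        configdict[key] = value

print(configdict["Url/Host"])
print(configdict["Login"])
```

Unwanted entries such as 'Data_I_Dont_Need' simply stay in the dictionary and can be ignored or deleted afterwards.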

2010-05-16 19:06:32

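Good answer and a nice explanation. I think I would want to strip potential whitespace from the value. I believe that can be done by adding \s* between the value group and the end-of-line anchor '$'? – extraneon 2010-05-16 19:09:23

AttributeError: 'NoneType' object has no attribute 'group' – Rob 2010-05-16 20:58:42

@Rob, you mean 'groups', not 'group'. And yes, I forgot to add the obviously needed 'continue' to actually do the skipping, let me add it. By the way, your question doesn't mention that there can be lines that don't match this pattern, nor what to do when such lines are found. Please edit your Q to add this important information! – 2010-05-17 00:04:52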
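On the trailing-whitespace question raised in the comments above: as far as I can tell, adding \s* alone is not enough, because the greedy (.*) group has already consumed the trailing spaces. The sketch below compares that with a lazy (.*?) group, which leaves the whitespace for \s*$ to match. The names greedy, lazy and line are made up for this illustration.

```python
import re

# \s* added after a greedy value group: the group still keeps the spaces.
greedy = re.compile(r'^([^:]+):\s*(.*)\s*$')
# Lazy value group: it stops as early as possible, so \s* takes the spaces.
lazy = re.compile(r'^([^:]+):\s*(.*?)\s*$')

line = "Url/Host: www.example.com   \n"

print(repr(greedy.match(line).group(2)))
print(repr(lazy.match(line).group(2)))
```

Calling v.strip() on the captured value is a simpler alternative if the pattern is to stay unchanged.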

Well, if you don't know regular expressions, simply change your file to look like this:

Host = www.example.com 
Login = user
Password = password

and use Python's ConfigParser module: http://docs.python.org/library/configparser.html
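A minimal sketch of that approach, assuming Python 3, where the module is named configparser. The parser expects a section header, so the "[site]" line is a hypothetical addition that is not part of the file shown above. The text is read from a string here to keep the example self-contained.

```python
import configparser

# The "[site]" header is an assumption: configparser needs a section header.
text = """[site]
Host = www.example.com
Login = user
Password = password
"""

# interpolation=None keeps a literal '%' in a value (e.g. a password) as-is.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(text)  # for a file on disk: parser.read("file.txt")

site = parser["site"]
host = site["Host"]          # option names are case-insensitive by default
login = site["Login"]
password = site["Password"]
print(host, login, password)
```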

2010-05-16 18:57:59 mkotechno

Modifying the file isn't really an option, but thanks – Rob 2010-05-16 19:00:59

ConfigParser supports the ':' delimiter: http://stackoverflow.com/questions/2845018/extracting-data-from-a-text-file-to-use-in-a-python-script/2845923#2845923 – jfs 2010-05-16 23:29:23

Edit: a better solution

import re

for line in lines:  # lines: any iterable of text lines, e.g. an open file
    key, val = re.search(r'(.*?):\s*(.*)', line).groups()
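Since the question asks what each part of the pattern does: (.*?) is a lazy group that stops at the first colon (the key), :\s* matches the colon and any whitespace after it, and (.*) greedily takes the rest of the line (the value). The sketch below also guards against lines with no colon, where re.search returns None and calling .groups() would raise an AttributeError like the one reported in an earlier comment. The sample lines, including the URL with a port, are made up for this illustration.

```python
import re

pattern = re.compile(r'(.*?):\s*(.*)')

lines = [
    "Url/Host: http://www.example.com:8080",  # hypothetical value with colons
    "Login:  user",
    "no separator on this line",
]

parsed = {}
for line in lines:
    mo = pattern.search(line)
    if mo is None:  # no colon on this line, so there is nothing to unpack
        continue
    key, val = mo.groups()
    parsed[key] = val

print(parsed)
```

Because the key group is lazy, only the first colon acts as the separator, so colons inside the value survive.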

2010-05-16 19:03:01 mikerobi

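For a file as simple as this you don't really need regular expressions. String functions are probably easier to understand. This code:

def parse(data):
    """Parse 'key: value' lines into a dictionary."""
    parsed = {}
    for line in data.split('\n'):
        if not line.strip():
            continue  # blank line
        # Split on the first colon only, so values may contain colons.
        pair = line.split(':', 1)
        parsed[pair[0].strip()] = pair[1].strip()
    return parsed

if __name__ == '__main__':
    test = """Url/Host: www.example.com
    Login:  user
    Password: password
"""
    print(parse(test))

does the job and produces (key order may vary with the Python version):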

{'Login': 'user', 'Password': 'password', 'Url/Host': 'www.example.com'}

2010-05-16 19:56:01 snim2

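The ConfigParser module supports the ':' delimiter.

# Python 2 code: in Python 3 these modules are named configparser and io.
import ConfigParser
from cStringIO import StringIO

class Parser(ConfigParser.RawConfigParser):
    def _read(self, fp, fpname):
        # Prepend a dummy [data] section header, which ConfigParser requires.
        data = StringIO("[data]\n" + fp.read())
        return ConfigParser.RawConfigParser._read(self, data, fpname)

p = Parser()
p.read("file.txt")
print(dict(p.items("data")))

Output: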

{'login': 'user', 'password': 'password', 'url/host': 'www.example.com'}

That said, a regex or manual parsing might be more appropriate in your case.
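For Python 3, a rough equivalent that avoids overriding the private _read method might look like the sketch below. It assumes the configparser module with its read_string method, and sets optionxform to str so that keys keep their original case instead of being lowercased. The sample text is copied from the question. The names text and config are made up for this illustration.

```python
import configparser

# Sample text copied from the question. With a real file, read its
# contents first, e.g. open("file.txt").read().
text = """Url/Host: www.example.com
Login:  user
Password: password
Data_I_Dont_Need: something_else
"""

parser = configparser.RawConfigParser()
parser.optionxform = str  # keep key case instead of lowercasing
# Prepend a dummy [data] section header, as in the subclass above.
parser.read_string("[data]\n" + text)

config = dict(parser.items("data"))
print(config["Url/Host"])
print(config["Login"])
```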

2010-05-16 23:28:52 jfs
