How to match a file against a template in Python

I want to match many files against a few common templates and extract the differences. I'd like suggestions on the best way to do this. For example:

Template A:

<1000 text lines that have to match> 
a=? 
b=2 
c=3 
d=? 
e=5 
f=6 
<more text>

Template B:

<1000 different text lines that have to match> 
h=20 
i=21 
j=? 
<more text> 
k=22 
l=? 
m=24 
<more text>

If I pass in file C:

<1000 text lines that match A> 
a=500 
b=2 
c=3 
d=600 
e=5 
f=6 
<more text>

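I'd like a simple way to say that this matches template A and to extract 'a=500' and 'd=600'.

I could match these with regular expressions, but the files are fairly large, and building the regexes would be painful.

I also tried difflib, but parsing the opcodes and extracting the differences doesn't look ideal.

Does anyone have a better suggestion?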


2013-01-23 Peter Hofmann

You may need to tweak this a bit to handle the extra text, since I don't know the exact format, but it shouldn't be too hard.

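with open('templ.txt') as templ, open('in.txt') as f:
    # Keys whose value is a '?' placeholder in the template
    items = [i.strip().split('=')[0] for i in templ if '=?' in i]
    # Parse only key=value lines; split once so a value may itself contain '='
    d = dict(i.strip().split('=', 1) for i in f if '=' in i)
    print([(i, d[i]) for i in items if i in d])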

Output:

[('a', '500'), ('d', '600')] # With template A 
[]       # With template B

Or, lining the two files up line by line:

from itertools import compress
with open('templ.txt') as templ, open('in.txt') as f:
    # Keep only the input lines whose template line at the same position contains '=?'
    print(list(map(str.strip, compress(f, ('=?' in x for x in templ)))))

Output:

['a=500', 'd=600']


2013-01-23 16:29:55 root

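First of all, thank you. This is a way of extracting the data that I hadn't thought of. It doesn't seem to help with finding the matching template, though: if I set templ.txt to template B and run it, I get ['c=3', 'e=5']. I'm looking for a way to iterate over the templates, find the one that matches, and then extract the data.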

@PeterHofmann - Added an approach that can be used with any template. – root

Without looking into performance:

Load everything into dictionaries, so that you have A = {'a': '?', 'b': 2, ...}, B = {'h': 20, 'i': 21, ...}, C = {'a': 500, 'b': 2, ...}
If A.keys() == C.keys(), you know that C matches A.
Then simply diff both dictionaries.

Improve as needed.
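
The dictionary steps above could be sketched roughly as follows. This is only an illustration under assumptions: the helper names parse_pairs and diff_against_template are made up here, values are kept as strings, and lines without '=' are simply ignored rather than checked.

```python
def parse_pairs(lines):
    """Collect key=value lines into a dict, ignoring lines without '='."""
    pairs = {}
    for line in lines:
        key, sep, value = line.strip().partition('=')
        if sep:
            pairs[key] = value
    return pairs


def diff_against_template(template, candidate):
    """Return the filled-in '?' values if candidate matches template, else None."""
    if template.keys() != candidate.keys():
        return None
    extracted = {}
    for key, expected in template.items():
        if expected == '?':
            extracted[key] = candidate[key]
        elif candidate[key] != expected:
            return None  # a fixed value differs, so this is not a match
    return extracted


# Sample data taken from the question (hypothetical in-memory stand-ins for the files)
A = parse_pairs(["a=?", "b=2", "c=3", "d=?", "e=5", "f=6"])
B = parse_pairs(["h=20", "i=21", "j=?", "k=22", "l=?", "m=24"])
C = parse_pairs(["a=500", "b=2", "c=3", "d=600", "e=5", "f=6"])

print(diff_against_template(A, C))  # {'a': '500', 'd': '600'}
print(diff_against_template(B, C))  # None
```

Comparing the key sets first is a cheap way to rule a template out before looking at any values.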


2013-01-23 16:39:44

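Sorry, I should have made my example clearer: besides the values, the files contain static lines, comments, and many other things that also have to match.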
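
One possible way to cover that requirement is to compare the files line by line, treating template lines that end in '=?' as placeholders and requiring every other line to be identical. The sketch below is not taken from any of the answers; the function names match_template and find_template and the sample lines are hypothetical, and it assumes the template and the file have the same number of lines.

```python
def match_template(template_lines, file_lines):
    """Return {key: value} for '?' placeholders, or None if any line differs."""
    if len(template_lines) != len(file_lines):
        return None
    extracted = {}
    for t_line, f_line in zip(template_lines, file_lines):
        t_line, f_line = t_line.rstrip(), f_line.rstrip()
        if t_line.endswith('=?'):
            prefix = t_line[:-1]  # keep 'key=' and drop the '?'
            if not f_line.startswith(prefix):
                return None
            extracted[prefix[:-1].strip()] = f_line[len(prefix):]
        elif t_line != f_line:
            return None  # a static line (text, comment, fixed value) differs
    return extracted


def find_template(templates, file_lines):
    """Try each named template in turn; return (name, values) for the first match."""
    for name, template_lines in templates.items():
        values = match_template(template_lines, file_lines)
        if values is not None:
            return name, values
    return None, {}


# Hypothetical, shortened stand-ins for the templates and file C from the question
templates = {
    "A": ["# static header", "a=?", "b=2", "c=3", "d=?", "e=5", "f=6", "footer"],
    "B": ["# other header", "h=20", "i=21", "j=?", "note", "k=22", "l=?", "m=24"],
}
file_c = ["# static header", "a=500", "b=2", "c=3", "d=600", "e=5", "f=6", "footer"]

print(find_template(templates, file_c))  # ('A', {'a': '500', 'd': '600'})
```

With real files, the lists could come from open(path).readlines(). Because each template is checked with plain string comparisons, no per-template regular expression has to be written by hand.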
