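I am writing a program that edits a text file. I want it to find duplicate strings and delete n-1 of the lines that contain each one. How can I use the keys of a dictionary to search for the strings?
Here is the script I have so far: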
import re
fname = raw_input("File name - ")
fhand = open(fname, "r+")
fhand.read()
counts = {}
pattern = re.compile(pattern)
# This searches the file for duplicate strings and inserts them into a dictionary with a counter
# as the value
for line in fhand:
    for match in pattern.findall(line):
        counts.setdefault(match, 0)
        counts[match] += 1
pvar = {}
# This creates a new dictionary which contains all of the keys in the previous dictionary with
# count > 1
for match, count in counts.items():
    if count > 1:
        pvar[match] = count
fhand.close()
count = 0
# Here I am trying to delete n - 1 instances of each string that was a key in the previous
# dictionary
with open(fname, 'r+') as fhand:
    for line in fhand:
        for match, count in pvar.items():
            if re.search(match, line) not in line:
                continue
                count += 1
            else:
                fhand.write(line)
print count
fhand.close()
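How can I make the last part of the code work? Is it possible to use the keys in the dictionary to identify the relevant lines and delete n-1 instances? Or am I going about this completely wrong?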
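One possible approach, sketched here in Python 3 under stated assumptions: read every line once, remember each matched string in a dictionary, keep only the first line in which a string appears, and then rewrite the file. The pattern `XYZ\[\d+:\d+\]` and the names `drop_repeated_matches` and `dedupe_file` are hypothetical, since the question does not show the real pattern. This is not necessarily the only or best way to do it.

```python
import re

# Hypothetical pattern: the question does not show the real one.
PATTERN = re.compile(r"XYZ\[\d+:\d+\]")


def drop_repeated_matches(lines, pattern=PATTERN):
    """Keep the first line for each matched string; drop later lines that repeat it."""
    seen = {}  # matched string -> number of times it has been seen
    kept = []
    for line in lines:
        matches = pattern.findall(line)
        # A line counts as a duplicate if any of its matches was seen earlier.
        is_duplicate = any(m in seen for m in matches)
        for m in matches:
            seen[m] = seen.get(m, 0) + 1
        if not is_duplicate:
            kept.append(line)
    return kept, seen


def dedupe_file(fname, pattern=PATTERN):
    """Read the whole file, then reopen it in 'w' mode and write the kept lines."""
    with open(fname) as fhand:
        lines = fhand.readlines()
    kept, _ = drop_repeated_matches(lines, pattern)
    with open(fname, "w") as fhand:
        fhand.writelines(kept)
    return len(lines) - len(kept)  # number of lines removed
```

Reading first and writing afterwards avoids mixing reads and writes on a single `'r+'` handle, where the write position is hard to reason about. Lines with no match at all are kept unchanged.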
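EDIT: A sample from the file. This should be a list with each 'XYZ' instance on a new line with two space characters in front of it. The formatting got a bit messed up, my apologies.
INPUT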
-=XYZ[0:2] &
-=XYZ[0:2] &
-=XYZ[3:5] &
=XYZ[6:8] &
=XYZ[9:11] &
=XYZ[12:14] &
-=XYZ[15:17] &
=XYZ[18:20] &
=XYZ[21:23] &
OUTPUT
= XYZ [0:2]
EDIT
Also, can anyone explain why the last part of the code doesn't return anything?
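A likely reason, shown as a small Python 3 sketch: the `fhand.read()` call near the top of the script consumes the whole file, so the `for line in fhand` loop that follows has nothing left to iterate over, `counts` and `pvar` stay empty, and the last loop never writes anything. The sketch uses an in-memory `io.StringIO` as a stand-in for the file (an assumption made only to keep the example self-contained). It also shows that `re.search` returns a match object or `None`, never a string, so its result should be tested directly, and that a key such as `XYZ[0:2]` needs `re.escape` before being used as a pattern, because `[0:2]` would otherwise be read as a character class.

```python
import io
import re

# In-memory stand-in for the file, so the sketch does not touch the disk.
fhand = io.StringIO("  -=XYZ[0:2] &\n  -=XYZ[0:2] &\n  -=XYZ[3:5] &\n")

fhand.read()                           # consumes everything up to end of file
after_read = [line for line in fhand]  # nothing left to iterate over

fhand.seek(0)                          # rewind to the start of the data
after_seek = [line for line in fhand]

print(len(after_read), len(after_seek))  # prints: 0 3

# Test the result of re.search against None instead of using "in line".
found = re.search(re.escape("XYZ[0:2]"), after_seek[0]) is not None
print(found)  # prints: True
```

Dropping the stray `read()` call, or calling `seek(0)` after it, lets the first loop see the lines.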
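Do you mean the XYZ instances? Sry, I don't really understand. I don't even 'understand' the input file. – ProgrammingIsAwsome
I only want to delete the lines that contain 'XYZ' –
But they all contain 'XYZ' :o – BartoszKP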