Trying to find a string in a huge file and replace it with another string

Below is my code:

import tarfile 
import os 
import sys 
import re 

script, bak = sys.argv 
bakfile = str(bak) 
currentwd = os.path.dirname(os.path.realpath(__file__)) 
file_to_work = tarfile.open(name=currentwd+"/"+bakfile, mode="r") 
file_to_work.extractall() 

currentwd = os.path.dirname(os.path.realpath(__file__)) 
with open(currentwd+"/onedb.xml", "r") as file: 
    f = file.read() 
    words = re.findall(r'{ssha}_\w*?=', f) 
    re.sub(words,r'string_to_replace',f)

I use the tarfile module to extract the gz archive and pick up onedb.xml from the extracted files. Using a regular expression to find the strings works.

Now, when I try to replace the strings I found using re.sub, I get the following error.

Traceback (most recent call last): 
    File "preset.py", line 16, in <module> 
    re.sub(words,r'string_to_replace',f) 
    File "/usr/lib/python2.7/re.py", line 151, in sub 
    return _compile(pattern, flags).sub(repl, string, count) 
    File "/usr/lib/python2.7/re.py", line 232, in _compile 
    p = _cache.get(cachekey) 
TypeError: unhashable type: 'list'


2017-07-08 Vignesh SP

`words = re.findall(r'{ssha}_\w*?=', f)` returns a Python `list`. You are passing that `list` to `re.sub` as your pattern, which is bound to cause an error. – Abdou

@Abdou, thank you. Any suggestions for solving this? –

I'm not quite sure what you want to replace, but if you are trying to replace every element of `words`, you could try: `re.sub('|'.join(words), r'string_to_replace', f)`. – Abdou
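A minimal sketch of the join approach from the comment above, with one addition that is not in the original comment: each match is passed through `re.escape` first, so that characters such as `{` and `}` are matched literally when the list is turned back into a pattern. The sample text is made up for illustration.

```python
import re

# Made-up sample standing in for the contents of onedb.xml.
text = "a={ssha}_abc123= b={ssha}_def456= c=plain"

# findall returns a list of matched strings, not a pattern.
# The braces are escaped here only for clarity.
words = re.findall(r'\{ssha\}_\w*?=', text)

# Build one alternation pattern from the unique matches,
# escaping each so it is treated as a literal string.
unique_words = sorted(set(words), key=len, reverse=True)
pattern = "|".join(re.escape(w) for w in unique_words)

# Guard against an empty list: an empty pattern would match everywhere.
result = re.sub(pattern, "string_to_replace", text) if words else text
print(result)
```

With a very large file the list of matches can itself become large, so the alternation pattern may be slow to compile; passing the original regular expression to `re.sub` directly avoids that step.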

Do it all in a single expression:

re.sub(r'{ssha}_\w*?=', r'string_to_replace', f)
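One thing worth noting about the call above: `re.sub` returns a new string and leaves `f` unchanged, so the result has to be captured (and written back to the file if that is the goal). The following is a small sketch of that; the function name `replace_hashes` and the sample input are hypothetical, and `subn` is used only to also report how many replacements were made.

```python
import re

# Same pattern as in the question; the braces are escaped for clarity.
PATTERN = re.compile(r'\{ssha\}_\w*?=')

def replace_hashes(text, replacement="string_to_replace"):
    """Return (new_text, number_of_replacements) for the given text."""
    return PATTERN.subn(replacement, text)

# Made-up sample input.
new_text, count = replace_hashes("pw={ssha}_abc123= other=1")
print(new_text, count)
```

To persist the change, `new_text` would then be written out with something like `open(path, "w").write(new_text)`.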


2017-07-09 10:56:10 stovfl

Tried this. Since the file is very large, the script just sits there, and I have no idea what is happening behind the scenes. @stovfl –
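For a file too large to handle comfortably in one `file.read()`, one possible approach (a sketch, not something proposed in the thread) is to process it line by line and print progress along the way. This gives the same result as substituting over the whole text, because nothing in this pattern can match a newline, so no match can span two lines. It only helps if the XML is actually split across many lines. The names `replace_stream` and `report_every` are made up for this example.

```python
import io
import re

PATTERN = re.compile(r'\{ssha\}_\w*?=')

def replace_stream(src, dst, replacement="string_to_replace", report_every=100000):
    """Copy src to dst line by line, replacing matches.

    Returns the total number of replacements made.
    """
    total = 0
    for lineno, line in enumerate(src, 1):
        new_line, n = PATTERN.subn(replacement, line)
        dst.write(new_line)
        total += n
        # Periodic progress report so a long run is not silent.
        if lineno % report_every == 0:
            print("processed %d lines, %d replacements" % (lineno, total))
    return total

# In-memory demonstration; with real files, pass open(in_path, "r")
# and open(out_path, "w") instead of the StringIO objects.
src = io.StringIO(u"<a>{ssha}_abc=</a>\n<b>none</b>\n<c>{ssha}_x1=</c>\n")
dst = io.StringIO()
count = replace_stream(src, dst)
print(count)
```

Writing to a separate output file and renaming it over the original afterwards keeps the original intact if the script is interrupted partway through.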
