How can I make this Python program read a large text file faster? My code takes almost five minutes just to read the text file, and I need it to be much faster. I suspect my algorithm is not O(n).
Some sample data (the actual data is 470K+ lines):
Aarika
Aaron
aaron
Aaronic
aaronic
Aaronical
Aaronite
Aaronitic
Aaron's-beard
Aaronsburg
Aaronson
My code:
import string
import re

WORDLIST_FILENAME = "words.txt"

def load_words():
    wordlist = []
    print("Loading word list from file...")
    with open(WORDLIST_FILENAME, 'r') as f:
        for line in f:
            wordlist = wordlist + str.split(line)
    print(" ", len(wordlist), "words loaded.")
    return wordlist

def find_words(uletters):
    wordlist = load_words()
    foundList = []
    for word in wordlist:
        wordl = list(word)
        letters = list(uletters)
        count = 0
        if len(word) == 7:
            for letter in wordl[:]:
                if letter in letters:
                    wordl.remove(letter)
                    # print("word left" + str(wordl))
                    letters.remove(letter)
                    # print(letters)
                    count = count + 1
                    # print(count)
            if count == 7:
                print("Matched:" + word)
                foundList = foundList + str.split(word)
    foundList.sort()
    result = ''
    for items in foundList:
        result = result + items + ','
    print(result[:-1])

# Test cases
find_words("eabauea" "iveabdi")
# pattern = "asa" " qlocved"
# print("letters to look for: " + pattern)
# find_words(pattern)
This sounds like a better fit for http://codereview.stackexchange.com/. – alecxe
It would also help if you could explain what your program is supposed to do. – MYGz
One thing... 'wordlist = wordlist + str.split(line)' copies the whole word list on every line. Do 'wordlist.extend(line.strip().split())' instead. Or, if you also want to drop duplicates and get faster word lookups, make 'wordlist' a 'set' and use '.update'. – tdelaney
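For reference, a minimal sketch of what tdelaney's suggested change could look like (this is not code from the original thread; the filename and the set-based variant are assumptions):

WORDLIST_FILENAME = "words.txt"  # assumed to be the same file used in the question

def load_words():
    # extend() appends in place instead of rebuilding the list on every line,
    # so loading the file stays O(n) in the number of words
    print("Loading word list from file...")
    wordlist = []
    with open(WORDLIST_FILENAME, 'r') as f:
        for line in f:
            wordlist.extend(line.strip().split())
    print(" ", len(wordlist), "words loaded.")
    return wordlist

def load_words_set():
    # set variant: removes duplicates and gives O(1) membership tests
    words = set()
    with open(WORDLIST_FILENAME, 'r') as f:
        for line in f:
            words.update(line.strip().split())
    return words

The list version keeps the file's word order; the set version is the better fit if find_words only needs membership tests and duplicates do not matter.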