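String-splitting problem: split a string into a list of words, using separators passed in as a list.
String: "After the flood ... all the colors came out."
Desired output: ['After', 'the', 'flood', 'all', 'the', 'colors', 'came', 'out']
I wrote the function below. Note that I know there are better ways to split a string using Python's built-in functions, but for the sake of learning I thought I would go about it this way: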
def split_string(source, splitlist):
    result = []
    for e in source:
        if e in splitlist:
            end = source.find(e)
            result.append(source[0:end])
            tmp = source[end+1:]
            for f in tmp:
                if f not in splitlist:
                    start = tmp.find(f)
                    break
            source = tmp[start:]
    return result

out = split_string("After the flood ... all the colors came out.", " .")
print(out)
['After', 'the', 'flood', 'all', 'the', 'colors', 'came out', '', '', '', '', '', '', '', '', '']
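I can't figure out why "came out" is not split into "came" and "out" as two separate words. It is as if the whitespace character between the two words is being ignored. I think the rest of the output is junk that stems from the same problem as the "came out" issue.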
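One thing worth checking, offered as a sketch rather than a full diagnosis: in Python, `for e in source` obtains its iterator from the string object once, so rebinding the name `source` inside the loop body does not change which characters the loop visits. The loop keeps walking the original string while `source.find(e)` searches the shortened one. The helper name `show_iteration` below is hypothetical and exists only to illustrate this.

```python
def show_iteration(source):
    """Rebind `source` inside the loop and record what the loop still yields."""
    seen = []
    for ch in source:
        seen.append(ch)
        # Rebinding the name does not affect the iterator the loop already holds.
        source = ""
    return "".join(seen)


print(show_iteration("came out"))  # prints: came out
```

Every character of the original string is still visited even though `source` is empty after the first pass, which suggests tracking positions by index instead of reassigning the string being looped over.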
EDIT:
I followed @IVC's suggestion and came up with the following code:
def split_string(source, splitlist):
    result = []
    lasti = -1  # index of the most recent separator seen
    for i, e in enumerate(source):
        if e in splitlist:
            tmp = source[lasti+1:i]
            # tmp is empty when two separators are adjacent; skip it
            if tmp not in splitlist:
                result.append(tmp)
            lasti = i
        # the string ends with a word rather than a separator
        if e not in splitlist and i == len(source) - 1:
            tmp = source[lasti+1:i+1]
            result.append(tmp)
    return result

out = split_string("This is a test-of the,string separation-code!", " ,!-")
print(out)
#>>> ['This', 'is', 'a', 'test', 'of', 'the', 'string', 'separation', 'code']
out = split_string("After the flood ... all the colors came out.", " .")
print(out)
#>>> ['After', 'the', 'flood', 'all', 'the', 'colors', 'came', 'out']
out = split_string("First Name,Last Name,Street Address,City,State,Zip Code", ",")
print(out)
#>>> ['First Name', 'Last Name', 'Street Address', 'City', 'State', 'Zip Code']
out = split_string(" After the flood ... all the colors came out...............", " .")
print(out)
#>>> ['After', 'the', 'flood', 'all', 'the', 'colors', 'came', 'out']
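Thanks, everyone, for the great solutions. I went with this one because it forced me to learn the logic rather than use a prebuilt function. Obviously, if I were writing commercial code I would not reinvent the wheel, but for learning purposes I will go with this answer. Thanks to everyone for the help. – codingknob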
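For comparison, the prebuilt route mentioned above could look roughly like the sketch below, which uses the standard library's `re.split`. The function name `split_with_re` is hypothetical, and the sketch assumes `splitlist` is a non-empty string (or list) of single-character separators.

```python
import re


def split_with_re(source, splitlist):
    """Split `source` on any run of characters from `splitlist`, dropping empty pieces."""
    # Assumption: splitlist is non-empty and holds single-character separators.
    # re.escape makes characters such as '.' or '-' literal inside the character class.
    pattern = "[" + re.escape("".join(splitlist)) + "]+"
    return [piece for piece in re.split(pattern, source) if piece]


print(split_with_re("After the flood ... all the colors came out.", " ."))
# ['After', 'the', 'flood', 'all', 'the', 'colors', 'came', 'out']
```

The `+` in the pattern treats a run of adjacent separators as one split point, and the final filter drops the empty string that `re.split` returns when the input starts or ends with a separator.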