我寫代碼,獲取文本標記爲輸入:查找連接令牌
tokens = ["Tap-", "Berlin", "Was-ISt", "das", "-ist", "cool", "oh", "Man", "-Hum", "-Zuh-UH-", "glit"]
的代碼應該查找包含連字符或連接到彼此連字符的所有標記:基本上輸出應該是:
[["Tap-", "Berlin"], ["Was-ISt"], ["das", "-ist"], ["Man", "-Hum", "-Zuh-UH-", "glit"]]
我寫了一個碼,但不知何故,我不是跟hypens得到連接令牌回:要嘗試一下:http://goo.gl/iqov0q
def find_hyphens(self):
tokens_with_hypens =[]
for i in range(len(self.tokens)):
hyp_leng = 0
while self.hypen_between_two_tokens(i + hyp_leng):
hyp_leng += 1
if self.has_hypen_in_middle(i) or hyp_leng > 0:
if hyp_leng == 0:
tokens_with_hypens.append(self.tokens[i:i + 1])
else:
tokens_with_hypens.append(self.tokens[i:i + hyp_leng])
i += hyp_leng - 1
return tokens_with_hypens
我該怎麼做?是否有更高性能的解決方案?由於