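Python text parsing & splitting

I want to write a function that inserts a space before a punctuation mark when the character before it is alphabetic, and a space after it when the character after it is alphabetic. This should not happen when the neighbouring character is a digit. For example: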
("thanks." >>> "thanks ." and "hello?123!lom" >>> "hello ?123! lom")
My code below works fine when a punctuation mark appears only once, but not when the same punctuation mark occurs again. See my code below:
def normalize(utterance):
    # Converting to lowercase & removing multiple white spaces
    utterance = ' '.join(utterance.lower().split())
    # List of punctuations
    punctuations_list = [',','.','?',':',';','!',')','(','\'']
    for punctuation in punctuations_list:
        if punctuation in utterance:
            try:
                char_before = str(utterance[utterance.index(punctuation) -1])
                char_after = str(utterance[utterance.index(punctuation) +1])
            except IndexError:
                char_after = "0"
            if char_before.isdigit()==False and char_before not in punctuations_list:
                utterance = utterance.replace(punctuation, " " + punctuation)
            if char_after.isdigit()==False and char_after not in punctuations_list:
                utterance = utterance.replace(punctuation, punctuation + " ")
    return utterance
normalize("thank you:? the time is 2:30pm")
>>>'thank you :? the time is 2 :30pm'
The output I want is:
'thank you :? the time is 2:30pm'
i.e. with no space inside the time. I believe the problem is that the colon ':' occurs more than once. Can someone help me fix this?
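The suspicion about the repeated colon can be checked directly. The short sketch below is not part of the original code; it only uses the sample sentence from the question to show that `str.index` reports the first occurrence alone, while `str.replace` without a count argument rewrites every occurrence.

```python
text = "thank you:? the time is 2:30pm"

# str.index reports only the first occurrence of the colon,
# so only the neighbours of that first colon are ever inspected.
first = text.index(":")
print(first, repr(text[first - 1]), repr(text[first + 1]))

# The second colon (inside "2:30pm") sits at a different position
# and has different neighbours.
all_positions = [i for i, ch in enumerate(text) if ch == ":"]
print(all_positions)

# str.replace without a count argument changes every occurrence at once,
# including the colon inside the time.
replaced = text.replace(":", " :")
print(replaced)
```

So the decision is made from the neighbours of the first colon only, but it is then applied to all colons in the string.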
The error seems to be in the following line:
utterance = utterance.replace(punctuation, " " + punctuation)
Wherever it matches, it replaces every occurrence of that punctuation mark, but I don't know how to fix this!
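One possible way around this, offered as a minimal sketch rather than a definitive fix, is to decide for each occurrence separately using regular-expression lookarounds instead of `str.replace`. The sketch assumes that "alphabetic" means the ASCII letters a-z (the text is lowercased first) and that a space is added only when the neighbour is a letter, which is one reading of the examples in the question. The name `PUNCT` is introduced here for illustration; the character set is taken from `punctuations_list`.

```python
import re

# Same characters as punctuations_list in the question (assumed complete).
PUNCT = r"[,.?:;!()']"


def normalize(utterance):
    # Lowercase and collapse runs of whitespace, as in the original code.
    utterance = " ".join(utterance.lower().split())
    # Insert a space at every position that has a letter on the left
    # and a punctuation mark on the right.
    utterance = re.sub(r"(?<=[a-z])(?=" + PUNCT + ")", " ", utterance)
    # Insert a space at every position that has a punctuation mark on the left
    # and a letter on the right.
    utterance = re.sub(r"(?<=" + PUNCT + r")(?=[a-z])", " ", utterance)
    return utterance


print(normalize("thanks."))
print(normalize("hello?123!lom"))
print(normalize("thank you:? the time is 2:30pm"))
```

Because each position is tested against its own neighbours, the colon in "you:?" gets a space while the colon in "2:30pm" is left alone. Note that the apostrophe is in the set, so a word such as "don't" would also be split under this sketch.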
This gives the wrong output: 'hank you:? meeting at 1:30' –
Thanks. Updated it. – taras