2017-10-08

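Splitting a text file into sentences

I am trying to split a text file into sentences, using punctuation marks as the delimiters. The code I have so far works, but each delimiter is printed on a line of its own. How can I keep the punctuation mark together with its sentence?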

import re
string = ""
with open("text.txt") as file:
    for line in file:
        for l in re.split(r"(\. |\? |\!)", line):
            string += l + "\n"
print(string)

Example output:

This is the flag of the Prooshi — ous, the Cap and Soracer 
. 
This is the bullet that byng the flag of the Prooshious 
. 
This is the ffrinch that fire on the Bull that bang the flag of the Prooshious 
. 

Answer

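This is actually quite simple: you are appending \n (a newline) on every iteration. Because the pattern contains a capturing group, re.split returns the delimiters as separate items, so if you split Kek. for example, the string variable receives Kek\n and then .\n. You need to do this instead: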

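import re
string = ""
with open("text.txt") as file:
    for line in file:
        # Join the pieces (text and delimiters) without a newline in between
        for l in re.split(r"(\. |\? |\!)", line):
            string += l
        # Append one newline after the whole input line has been reassembled
        string += '\n'
print(string)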
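
Note that the loop above only adds a newline once per input line, so several sentences on the same line stay together. If the goal is one sentence per output line with the punctuation attached, one option is to split on the whitespace that follows a punctuation mark by using a lookbehind. The following is a minimal sketch, not the only way to do it. The helper name split_sentences is hypothetical, and it assumes that sentences are separated by whitespace after ".", "?" or "!" (so abbreviations such as "e.g. this" would also be split, and a "!" with no space after it would not).

```python
import re

def split_sentences(text):
    """Split text after '.', '?' or '!' while keeping the punctuation.

    Hypothetical helper for illustration; assumes whitespace follows
    each sentence-ending punctuation mark.
    """
    # (?<=[.?!]) is a lookbehind: it requires a punctuation mark just before
    # the split point but does not consume it, so only the whitespace matched
    # by \s+ is removed from the result.
    parts = re.split(r"(?<=[.?!])\s+", text)
    # Drop empty pieces and surrounding whitespace (e.g. a trailing newline).
    return [p.strip() for p in parts if p.strip()]

sample = "This is the flag. Is this the bullet? Yes! Done"
for sentence in split_sentences(sample):
    print(sentence)
```

To apply this to the file from the question, the whole file can be read with file.read() and passed in as text, which also handles sentences that span more than one line.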