Python: removing lines that contain complete sentences from a long string

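I have pasted a novel into a text file. I would like to remove all the lines containing the following sentences, because they keep appearing at the top of each page (just removing the sentences themselves from those lines would also do):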

"Thermal Molecular Movement in , Order and Probability"

"Molecular and Ionic Interactions as the Basis for the Formation"

"Interfacial Phenomena and Membranes"

My first attempt was as follows:

mystring = file.read() 
mystring=mystring.strip("Molecular Structure of Biological Systems") 
mystring=mystring.strip("Thermal Molecular Movement in , Order and Probability") 
mystring=mystring.strip("Molecular and Ionic Interactions as the Basis for the Formation") 
mystring=mystring.strip("Interfacial Phenomena and Membranes") 

new_file=open("no_refs.txt", "w") 

new_file.write(mystring) 

file.close()

But this had no effect on the output text file: the content was completely unchanged. I find this odd, because the following toy example works correctly:

>>> "Hello this is a sentence. Please read it".strip("Please read it") 
'Hello this is a sentence.'

As the above did not work, I tried the following instead:

file=open("novel.txt", "r") 
mystring = file.readlines() 
for lines in mystring:
    if "Thermal Molecular Movement in , Order and Probability" in lines:
        mystring.replace(lines, "")
    elif "Molecular and Ionic Interactions as the Basis for the Formation" in lines:
        mystring.replace(lines, "")
    elif "Interfacial Phenomena and Membranes" in lines:
        mystring.replace(lines, "")
    else:
        continue

new_file=open("no_refs.txt", "w") 

new_file.write(mystring) 
new_file.close() 
file.close()

But with this attempt I get this error:

TypeError: expected a string or other character buffer object

Source

2016-10-13 johnny utah

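First, str.strip() only removes characters at the beginning or end of the string (and it treats its argument as a set of characters rather than as a whole phrase), which explains why it seemed to work in your test but is not actually what you want.
Second, you are calling replace on the list rather than on the current line, and you do not assign the result of the replacement back.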
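
Both points are easy to check in an interpreter. The following is a minimal sketch, with sample text invented for illustration, showing that str.strip() trims a set of characters from the two ends only, while str.replace() removes a substring wherever it occurs and returns a new string that has to be assigned:

```python
# Sample text is invented for illustration; it is not taken from the novel.
header = "Interfacial Phenomena and Membranes"
text = "Chapter 1\n" + header + "\nSome prose.\n"

# strip() only looks at the two ends of the string, so a header in the
# middle of the text is left alone and the result equals the input.
stripped = text.strip(header)
print(stripped == text)   # True

# The argument is a set of characters, not a phrase: order does not matter.
trimmed = "abcxyzcba".strip("cab")
print(trimmed)            # xyz

# replace() works anywhere in the string, but strings are immutable,
# so the result must be assigned to a name to have any effect.
cleaned = text.replace(header, "")
print(repr(cleaned))      # 'Chapter 1\n\nSome prose.\n'
```

Note that replace() leaves an empty line behind where the header used to be; dropping the whole line requires filtering the lines instead.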

Here is a fixed version that successfully removes the patterns from the lines:

with open("novel.txt", "r") as file:
    mystring = file.readlines()

patterns = [
    "Thermal Molecular Movement in , Order and Probability",
    "Molecular and Ionic Interactions as the Basis for the Formation",
    "Interfacial Phenomena and Membranes",
]
for i, line in enumerate(mystring):
    for pattern in patterns:
        if pattern in line:
            # update line as well, so that a second pattern on the same
            # line is removed from the already-cleaned text
            line = line.replace(pattern, "")
            mystring[i] = line

# print the processed lines
print("".join(mystring))

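Note the enumerate construct, which lets you iterate over both the values and their indices. Iterating over the values alone would let you find the patterns, but not modify them in the original list.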
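
As a small illustration of that point (the list contents are hypothetical), rebinding the loop variable leaves the list untouched, whereas assigning through the index changes it:

```python
lines = ["keep me\n", "HEADER keep me too\n"]  # hypothetical sample lines

# Rebinding the loop variable only changes the local name, not the list.
for line in lines:
    line = line.replace("HEADER ", "")
unchanged = list(lines)

# Assigning through the index stores the new string back into the list.
for i, line in enumerate(lines):
    lines[i] = line.replace("HEADER ", "")

print(unchanged)  # ['keep me\n', 'HEADER keep me too\n']
print(lines)      # ['keep me\n', 'keep me too\n']
```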

Also note the with open construct, which closes the file as soon as the block is left.

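Here is a version that removes the lines containing the patterns entirely (hang on, there is some one-liner functional programming stuff in there):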

with open("novel.txt", "r") as file:
    mystring = file.readlines()

pattern_list = [
    "Thermal Molecular Movement in , Order and Probability",
    "Molecular and Ionic Interactions as the Basis for the Formation",
    "Interfacial Phenomena and Membranes",
]
# keep only the lines that contain none of the patterns
mystring = "".join(filter(lambda line: all(pattern not in line for pattern in pattern_list), mystring))
# print the processed lines
print(mystring)

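Explanation: the list of lines is filtered according to a condition: a line is kept only if none of the unwanted patterns occurs in it.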
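
The same filter can also be written as a list comprehension and pointed at an output file, which is what the question was ultimately after. This is only a sketch: the helper names drop_lines_containing and clean_file are made up for illustration, and the sample text in the self-check is invented.

```python
import os
import tempfile

# Running headers taken from the question; extend the list as needed.
PATTERNS = [
    "Thermal Molecular Movement in , Order and Probability",
    "Molecular and Ionic Interactions as the Basis for the Formation",
    "Interfacial Phenomena and Membranes",
]


def drop_lines_containing(lines, patterns):
    """Return the lines that contain none of the given patterns."""
    return [line for line in lines if not any(p in line for p in patterns)]


def clean_file(src_path, dst_path, patterns=PATTERNS):
    """Copy src_path to dst_path, skipping every line that matches a pattern."""
    with open(src_path, "r") as src, open(dst_path, "w") as dst:
        dst.writelines(drop_lines_containing(src, patterns))


# Small self-check on a temporary file (the sample text is invented).
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "novel.txt")
    dst_path = os.path.join(tmp, "no_refs.txt")
    with open(src_path, "w") as f:
        f.write("First page.\n12 Interfacial Phenomena and Membranes\nSecond page.\n")
    clean_file(src_path, dst_path)
    with open(dst_path) as f:
        result = f.read()

print(repr(result))  # 'First page.\nSecond page.\n'
```

For the files named in the question, the call would be clean_file("novel.txt", "no_refs.txt"). Because the whole line is dropped, a header line that also carries a page number or section title disappears along with it.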

Source

2016-10-13 22:15:19

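Thanks a lot for this. Do you know how I would remove the whole line rather than just the pattern? For example, for a line like 'Section 3.1 "Energetics and Dynamics of Biological Systems", page 337', I want the entire line removed. I tried "mystring.pop(i)", but it gives: AttributeError: 'str' object has no attribute 'pop'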
