Remove words containing the special characters "\" and "/"

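While analysing tweets, I keep running into "words" that contain \ or / (possibly several occurrences within a single "word"). I would like to remove such words completely, but I can't quite nail it down.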

Here is my attempt:

sen = 'this is \re\store and b\\fre' 
sen1 = 'this i\s /re/store and b//fre/' 

slash_back = r'(?:[\w_]+\\[\w_]+)' 
slash_fwd = r'(?:[\w_]+/+[\w_]+)' 
slash_all = r'(?<!\S)[a-z-]+(?=[,.!?:;]?(?!\S))' 

strt = re.sub(slash_back,"",sen) 
strt1 = re.sub(slash_fwd,"",sen1) 
strt2 = re.sub(slash_all,"",sen1) 
print strt 
print strt1 
print strt2

I would like to get:

this is and 
this i\s and 
this and

However, I get:

and 
this i\s/and/
i\s /re/store b//fre/

To add: in this case a "word" is a string delimited by whitespace or punctuation (as in ordinary text).

Source

2015-11-02 Toly

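Beautifully asked question. I wish there were a question template that forced askers to follow a similar format. – d0nut

@iismathwizard I had to reload the page to double-check that my eyes weren't deceiving me –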

How about this? I added some examples with punctuation:

import re 

sen = r'this is \re\store and b\\fre' 
sen1 = r'this i\s /re/store and b//fre/' 
sen2 = r'this is \re\store, and b\\fre!' 
sen3 = r'this i\s /re/store, and b//fre/!' 

slash_back = r'\s*(?:[\w_]*\\(?:[\w_]*\\)*[\w_]*)' 
slash_fwd = r'\s*(?:[\w_]*/(?:[\w_]*/)*[\w_]*)' 
slash_all = r'\s*(?:[\w_]*[/\\](?:[\w_]*[/\\])*[\w_]*)' 

strt = re.sub(slash_back,"",sen) 
strt1 = re.sub(slash_fwd,"",sen1) 
strt2 = re.sub(slash_all,"",sen1) 
strt3 = re.sub(slash_back,"",sen2) 
strt4 = re.sub(slash_fwd,"",sen3) 
strt5 = re.sub(slash_all,"",sen3) 
print(strt) 
print(strt1) 
print(strt2) 
print(strt3) 
print(strt4) 
print(strt5)

Output:

this is and 
this i\s and 
this and 
this is, and! 
this i\s, and! 
this, and!
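To make the combined pattern easier to read, here is a minimal sketch of the same idea as slash_all, written with re.VERBOSE and wrapped in a helper. The names SLASH_WORD and remove_slash_words are hypothetical and only for illustration. Since \w already matches the underscore, [\w_] is shortened to \w here; the matching behaviour should be unchanged.

```python
import re

# Same structure as slash_all above, spelled out piece by piece.
# SLASH_WORD and remove_slash_words are illustrative names, not from the answer.
SLASH_WORD = re.compile(r"""
    \s*                  # leading whitespace, removed together with the word
    \w*                  # optional word characters before the first slash
    [/\\]                # one slash of either kind
    (?: \w* [/\\] )*     # any further runs of word characters ending in a slash
    \w*                  # optional word characters after the last slash
""", re.VERBOSE)


def remove_slash_words(text):
    """Remove every 'word' that contains at least one / or \\ character."""
    return SLASH_WORD.sub("", text)


# Raw literals keep the backslashes as typed.
print(remove_slash_words(r'this i\s /re/store and b//fre/'))     # this and
print(remove_slash_words(r'this i\s /re/store, and b//fre/!'))   # this, and!
```

Note that the test strings are raw literals. Without the r prefix, the \r in '\re\store' is read as a carriage return rather than a backslash followed by r, so the pattern never sees that backslash.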

Source

2015-11-02 03:22:17

Beautiful! Works like a dream! Thank you so much!! – Toly

You can do this without re. One way is to use join and a comprehension.

# raw strings, so that \r and \s stay as a literal backslash plus a letter
sen = r'this is \re\store and b\\fre' 
sen1 = r'this i\s /re/store and b//fre/' 

remove_back = lambda s: ' '.join(i for i in s.split() if '\\' not in i) 
remove_forward = lambda s: ' '.join(i for i in s.split() if '/' not in i) 

>>> print(remove_back(sen)) 
this is and 
>>> print(remove_forward(sen1)) 
this i\s and 
>>> print(remove_back(remove_forward(sen1))) 
this and
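The two lambdas can also be folded into one function that takes the set of unwanted characters as a parameter. This is only a sketch of that variation; the name remove_words_with and its chars parameter are assumptions for illustration.

```python
def remove_words_with(s, chars='\\/'):
    """Drop every whitespace-delimited token that contains any character in chars.

    remove_words_with and chars are illustrative names, not part of the answer above.
    """
    return ' '.join(tok for tok in s.split()
                    if not any(c in tok for c in chars))


print(remove_words_with(r'this i\s /re/store and b//fre/'))   # this and
print(remove_words_with(r'this is \re\store, and b\\fre!'))   # this is and
```

Because the split is on whitespace only, punctuation attached to a removed token is removed with it (the comma and the exclamation mark in the second example), and runs of whitespace are collapsed to single spaces.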

Source

2015-11-02 03:46:23 BlivetWidget

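Interesting approach! I just think it is a specific solution for a specific case, whereas I am looking for a general one. Mark's solution has so far handled even the wildest strings in my Twitter collection. Thanks! – Toly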
