Python 3.x regex syntax

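When trying to remove all duplicate words from a string, as in the example below, what is the correct syntax for matching one or more repetitions of a word? The example below returns: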

cat cat in the hat hat hat

It ignores runs of more than one repetition and fully collapses only 'in' and 'the', which are each repeated once.

>>> re.sub(r'(\b[a-z]+) \1', r'\1', 'cat cat cat in in the the hat hat hat hat hat hat')
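To see why only some of the repeats disappear, it can help to list what the pattern actually matches. The following is a minimal sketch using `re.finditer`; the variable names are chosen just for illustration. Each match covers exactly two words, and since matches cannot overlap, a run of three words leaves one copy left over.

```python
import re

text = 'cat cat cat in in the the hat hat hat hat hat hat'
pair_pattern = re.compile(r'(\b[a-z]+) \1')

# Each match consumes exactly two copies of a word.
matches = [m.group(0) for m in pair_pattern.finditer(text)]
print(matches)

# Replacing every pair with a single copy shortens each run only once.
result = pair_pattern.sub(r'\1', text)
print(result)
```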

2014-04-04 user1810023

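I'll keep racking my brain until someone confirms, but my gut says regular expressions aren't really meant for this kind of task. I've never been good at matching "dynamic repetition". – Sam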

Try this regex:

(\b[a-z]+)(?: \1)+

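All I had to do was put your \1 into a non-capturing group so that it can be repeated one or more times. Then you can do the replacement the same way you did:

re.sub(r'(\b[a-z]+)(?: \1)+', r'\1', 'cat cat cat in in the the hat hat hat hat hat hat')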

2014-04-04 23:19:02 Sam

Try this:

re.sub(r'(\b[a-z]+)(?: \1)+', r'\1', 'cat cat cat in in the the hat hat hat hat hat hat')

Putting a repetition operator after the back-reference makes it match multiple repetitions.
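As a quick check of the pattern above, here is a small sketch that wraps it in a helper. The function name `collapse_repeats` is made up for this example, and it assumes lowercase words separated by single spaces, as in the question.

```python
import re

# Hypothetical helper name; the pattern itself is the one given above.
REPEAT_RUN = re.compile(r'(\b[a-z]+)(?: \1)+')

def collapse_repeats(text):
    """Replace each run of a repeated word with a single copy."""
    return REPEAT_RUN.sub(r'\1', text)

print(collapse_repeats('cat cat cat in in the the hat hat hat hat hat hat'))
```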

2014-04-04 23:19:42 Barmar

You can use this:

re.sub(r'(\b[a-z]+) (?=\1\b)', '', 'cat cat cat in in the the hat hat hat hat hat hat')
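The lookahead version works differently: instead of replacing a whole run, it deletes every word (plus the following space) that is immediately followed by the same word, so only the last copy of each run survives. The trailing `\b` inside the lookahead matters. A rough sketch of the difference, with variable names and a test string chosen just for this example:

```python
import re

with_boundary = re.compile(r'(\b[a-z]+) (?=\1\b)')
without_boundary = re.compile(r'(\b[a-z]+) (?=\1)')  # for comparison only

text = 'the the then hat hat'

# With \b, 'the' is not treated as a repeat of the start of 'then'.
kept = with_boundary.sub('', text)
print(kept)

# Without \b, the 'the' in front of 'then' is removed as well.
broken = without_boundary.sub('', text)
print(broken)
```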

2014-04-04 23:20:14

This should print the sentence with the repeated words removed:

check_for_repeats = 'cat cat cat in in the the hat hat hat hat hat hat'
words = check_for_repeats.split()
sentence_array = []

# Keep a word only when the word after it is different.
for index, word in enumerate(words[:-1]):
    if word != words[index + 1]:
        sentence_array.append(word)
# The last word has no successor, so it is always kept.
if words:
    sentence_array.append(words[-1])

sentence = ' '.join(sentence_array)
print(sentence)
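The same idea, dropping a word when it equals its neighbour, can also be sketched with `itertools.groupby` from the standard library, which groups consecutive equal items. This is an alternative to the loop above rather than a correction of it, and the function name is invented for illustration.

```python
from itertools import groupby

def drop_adjacent_repeats(sentence):
    """Keep one word from each run of identical consecutive words."""
    # groupby yields one (key, group) pair per run of equal neighbours.
    return ' '.join(word for word, _run in groupby(sentence.split()))

print(drop_adjacent_repeats('cat cat cat in in the the hat hat hat hat hat hat'))
```

Because only neighbouring words are compared, a word that reappears later in the sentence (as in 'a b a') is left alone.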

2014-04-04 23:24:13

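I'm no Python guru, but if I'm reading this correctly, wouldn't this turn 'Hello, I am Sam. Sam, I am.' into something like 'Hello, Sam. Sam.'? That said, this is probably doable without regex, and more efficient. – Sam

I just edited the code so that it works properly. –

Sweet. I'm not going to dig into whether it works, not being a Python person, but I gave you a +1 for a non-regex (and most likely more efficient) answer. – Sam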

A non-regex alternative, when the order of the words in the given sentence does not matter, is:

" ".join(set(string_with_duplicates.split()))

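This first splits the string on whitespace, turns the resulting list into a set (which removes duplicates, since every element of a set is unique), and then joins the items back into a string.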

>>> string_with_duplicates = 'cat cat cat in in the the hat hat hat hat hat hat' 
>>> " ".join(set(string_with_duplicates.split())) 
'the in hat cat'

If the order of the words needs to be preserved, you could write something like this:

>>> unique = []
>>> for w in string_with_duplicates.split():
...     if w not in unique:
...         unique.append(w)
...
>>> " ".join(unique)
'cat in the hat'
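A more compact way to get the same order-preserving result, assuming Python 3.7 or later where dictionaries keep insertion order, is `dict.fromkeys`. This is a sketch of a different idiom, not part of the snippet above. Like that snippet, it removes every later occurrence of a word, not only consecutive repeats.

```python
string_with_duplicates = 'cat cat cat in in the the hat hat hat hat hat hat'

# dict keys are unique and (in Python 3.7+) remember insertion order.
unique_in_order = " ".join(dict.fromkeys(string_with_duplicates.split()))
print(unique_in_order)
```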

2014-04-04 23:46:27 dannymilsom
