Finding words that start and end with a consonant

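I am trying to find the words that start and end with a consonant. Below is what I have tried, and it does not give what I am looking for. I am stuck and need your help or suggestions.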

import re 

a = "Still, the conflicting reports only further served to worsen tensions in the Ukraine crisis, which has grown drastically \
in the past few weeks to a new confrontation between Russia and the West reminiscent of low points in the Cold War."

b = re.findall(" ([b, c, d, f, g, h, j, k, l, m, n, p, q, r, s, t, v, w, x, y, z, ',', '.'].+?[b, c, d, f, g, h, j, k, l, m, n, p, q, r, s, t, v, w, x, y, z, ',', '.']) ", a.lower()) 
print(b)

The output is:

['the conflicting', 'further', 'to worsen', 'the ukraine crisis,', 'has', 'drastically', 'the past', 'weeks', 'new', 'between', 'the west', 'low', 'the cold']

But this output is incorrect. I have to use a regular expression; without one, I think this would be too hard.

Thanks a lot!

Source

2014-03-04 Friendly User

Do you want to include the commas and periods? – 2rs2ts

Try this:

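vowels = ['a', 'e', 'i', 'o', 'u']
# Compare lowercase letters so that capitalized words such as "Ukraine" are treated as starting with a vowel.
words = [w for w in a.split() if w[0].lower() not in vowels and w[-1].lower() not in vowels]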

However, this does not take care of words ending in . and ,

EDIT: If you have to use regular expressions to do the matching:

ending_in_vowel = r'\b\w*[AaEeIiOoUu]\b'  # matches every word ending with a vowel, including one-letter words such as "a"
begin_in_vowel = r'\b[AaEeIiOoUu]\w*\b'  # matches every word beginning with a vowel

We then need to find all the words that neither begin nor end with a vowel:

ignore = [b for b in re.findall(begin_in_vowel, a) if b] 
ignore.extend([b for b in re.findall(ending_in_vowel, a) if b])

The result is then:

result = [word for word in a.split() if word not in ignore]

Source

2014-03-04 03:24:09 shaktimaan

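Thanks a lot. This is a nice solution. Is there any way to do this with regular expressions? –

@FriendlyUser - Sure. Split the sentence on whitespace and strip the punctuation to make a list of words. Use a regex to match the first character of each word and compare it against a list of consonants (or a negated list of vowels, which is less typing). Do the same for the last character. If both are true, save the word to a new list. – MattDMo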
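The steps in the comment above could be sketched roughly as follows. This is only an illustration, not the commenter's own code: the names NOT_VOWEL and consonant_words and the sample sentence are made up here, and "not a vowel" is used as a stand-in for "consonant", so a word starting or ending with a digit would also pass.

```python
import re
import string

# Hypothetical helper: matches a single character that is not a vowel.
NOT_VOWEL = re.compile(r'[^aeiouAEIOU]')

def consonant_words(sentence):
    """Return words whose first and last characters are both non-vowels."""
    kept = []
    for token in sentence.split():
        word = token.strip(string.punctuation)  # drop leading/trailing punctuation
        if not word:
            continue  # the token consisted of punctuation only
        if NOT_VOWEL.match(word[0]) and NOT_VOWEL.match(word[-1]):
            kept.append(word)
    return kept

sample = "Still, the Ukraine crisis has grown in a new Cold War."
print(consonant_words(sample))
# ['Still', 'crisis', 'has', 'grown', 'new', 'Cold', 'War']
```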

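Regex? Try 're.findall(r"\b[^aeiouAEIOU,.\s][^\s]*?[^aeiouAEIOU,.\s]\b", a)', but this may not work for words next to every kind of punctuation. – Turophile

Here is a very explicit solution using startswith() and endswith(). To achieve your goal, you have to strip the special characters yourself and convert the string into a list of words (called s in the code):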

vowels = ('a', 'e', 'i', 'o', 'u') 
[w for w in s if not w.lower().startswith(vowels) and not w.lower().endswith(vowels)]

Source

2014-03-04 03:32:57 pkacprzak

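How is this different from @warunsl's answer? – MattDMo

@MattDMo At least to me, using 'startswith/endswith' is more verbose. – pkacprzak

If 'a' is the input string, then you can get 's' (the list of words) with 's = re.findall(r'\w+', a)'. It is better than 's = a.split()' because it allows punctuation in the input. – jfs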
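As a rough illustration of the comment above, the sketch below contrasts a.split() with re.findall(r'\w+', a) and then applies the startswith/endswith filter from the answer. The shortened sample sentence is an assumption made for brevity.

```python
import re

a = "Still, the Ukraine crisis, which has grown."  # shortened sample sentence

# split() keeps punctuation attached to the words.
print(a.split())
# ['Still,', 'the', 'Ukraine', 'crisis,', 'which', 'has', 'grown.']

# \w+ extracts runs of word characters, leaving punctuation behind.
s = re.findall(r'\w+', a)
print(s)
# ['Still', 'the', 'Ukraine', 'crisis', 'which', 'has', 'grown']

vowels = ('a', 'e', 'i', 'o', 'u')
result = [w for w in s
          if not w.lower().startswith(vowels) and not w.lower().endswith(vowels)]
print(result)
# ['Still', 'crisis', 'which', 'has', 'grown']
```

Note that \w also matches digits and underscores, and that a contraction such as "don't" would be split into "don" and "t", so this is a reasonable starting point rather than a complete tokenizer.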

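First, split() a so that you get each word. Then check whether the first and last letters are in the list consonants. If they are, append the word to matches, and finally print the contents of matches.

consonants = ['b', 'c', 'd', 'f', 'g', 'h', 'j', 'k', 'l', 'm', 'n', 'p', 'q', 'r', 's', 't', 'v', 'w', 'x', 'y', 'z']

a = "Still, the conflicting reports only further served to worsen tensions in the Ukraine crisis, which has grown drastically \
in the past few weeks to a new confrontation between Russia and the West reminiscent of low points in the Cold War."

matches = []  # named matches rather than all, to avoid shadowing the built-in all()

for word in a.split():
    # Compare lowercase letters so that capitalized words such as "Cold" are kept.
    # Words with punctuation attached ("Still,", "crisis,", "War.") are still rejected.
    if word[0].lower() in consonants and word[-1].lower() in consonants:
        matches.append(word)

print(matches)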

Source

2014-03-04 05:52:02

If you are looking to drop the punctuation, this regular expression will work:

>>> re.findall(r'\b[bcdfghj-np-tv-z][a-z]*[bcdfghj-np-tv-z]\b', a.lower()) 
['still', 'conflicting', 'reports', 'further', 'served', 'worsen', 'tensions', 'crisis', 'which', 'has', 'grown', 'drastically', 'past', 'few', 'weeks', 'new', 'confrontation', 'between', 'west', 'reminiscent', 'low', 'points', 'cold', 'war']

However, your original attempt looked like it was trying to keep the commas and periods, so if that is your goal, you could also use this:

>>> re.findall(r'\b[bcdfghj-np-tv-z][a-z]*[bcdfghj-np-tv-z][,.]?(?![a-z])', a.lower()) 
['still,', 'conflicting', 'reports', 'further', 'served', 'worsen', 'tensions', 'crisis,', 'which', 'has', 'grown', 'drastically', 'past', 'few', 'weeks', 'new', 'confrontation', 'between', 'west', 'reminiscent', 'low', 'points', 'cold', 'war.']

I am not sure why \b, as used in my first example, usually does not match after punctuation (the documentation says it will), but these work anyway.
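One likely explanation, based on the documented behavior of Python's re module: \b is a zero-width assertion that matches only between a word character (\w) and a non-word character, or at the edge of the string. After a comma that is followed by a space, both neighbors are non-word characters, so there is no word boundary there. The small check below illustrates this; the sample string and variable names are made up for the illustration.

```python
import re

text = "crisis, which"

# \b matches between "s" (a word character) and "," (a non-word character).
boundary_after_word = re.search(r'crisis\b', text) is not None

# After the comma comes a space: both are non-word characters,
# so a \b placed after the comma cannot match.
boundary_after_comma = re.search(r'crisis,\b', text) is not None

# A negative lookahead only requires that no letter follows, so it succeeds.
lookahead_after_comma = re.search(r'crisis,(?![a-z])', text) is not None

print(boundary_after_word, boundary_after_comma, lookahead_after_comma)
# True False True
```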

If you want to account for contractions, the expression is simply this:

r"\b[bcdfghj-np-tv-z][a-z']*[bcdfghj-np-tv-z][,.]?(?![a-z])"
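To see what the apostrophe in the middle character class changes, here is a small comparison of the two patterns on a made-up sentence. The variable names and the sample text are assumptions for illustration only.

```python
import re

text = "We can't say they don't care."

# Pattern without the apostrophe in the middle character class.
without_apostrophe = r"\b[bcdfghj-np-tv-z][a-z]*[bcdfghj-np-tv-z][,.]?(?![a-z])"
# Pattern that also allows apostrophes inside the word.
with_apostrophe = r"\b[bcdfghj-np-tv-z][a-z']*[bcdfghj-np-tv-z][,.]?(?![a-z])"

plain = re.findall(without_apostrophe, text.lower())
contractions = re.findall(with_apostrophe, text.lower())

print(plain)
# ['can', 'say', 'they', 'don']
print(contractions)
# ["can't", 'say', 'they', "don't"]
```

Without the apostrophe, the match stops at "can" and "don"; with it, the whole contraction is returned. "we" and "care" are skipped in both cases because they end with a vowel.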

Source

2014-03-04 16:20:24 2rs2ts
