Remove a word if it contains non-Spanish characters, using a regular expression

I want to remove every word that is not a Spanish word. For example, if I have:

a = "aveces soñar es muy ließ y también человек"

The output I want is:

"aveces soñar es muy y también"

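I am using the regex `[^\u0000-\u007FáéíóúüñÁÉÍÓÚÜÑ¿¡]+` to match characters that do not belong to Spanish, but I don't know how to remove the whole word when it contains one of those characters.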

Any suggestions?

Source

2017-10-29 looker

What language are you using? –

I am using Python 3.5.4 – looker

Try this regex (I believe it uses the Unicode range you provided):

(?:^|\s)(?=\S*[^\u0000-\u007FáéíóúüñÁÉÍÓÚÜÑ¿¡])\S+

Substitute every match with an empty string.

Explanation:

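(?:^|\s) - matches either the start of the string or a whitespace character
(?=\S*[^\u0000-\u007FáéíóúüñÁÉÍÓÚÜÑ¿¡]) - positive lookahead that checks whether the word that follows contains a non-Spanish character
\S+ - if a non-Spanish character is present (checked in step 2), matches 1+ occurrences of non-whitespace characters, i.e. the whole word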
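
To see the three parts above at work, it can help to list the matches instead of replacing them. This is a minimal sketch, assuming Python 3; the names `NON_SPANISH_WORD` and `sample` are only illustrative and are not part of the original answer.

```python
import re

# Same pattern as in the answer; the name NON_SPANISH_WORD is illustrative.
NON_SPANISH_WORD = re.compile(
    r"(?:^|\s)(?=\S*[^\u0000-\u007FáéíóúüñÁÉÍÓÚÜÑ¿¡])\S+"
)

sample = "aveces soñar es muy ließ y también человек"

# Each match includes the whitespace character that precedes the offending
# word, so replacing the match with "" does not leave a double space behind.
matches = [m.group(0) for m in NON_SPANISH_WORD.finditer(sample)]
print(matches)  # [' ließ', ' человек']
```

Words such as "soñar" and "también" are not matched because ñ and é are listed in the character class and so count as Spanish.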

Python code (generated):

# coding=utf8 
# the above tag defines encoding for this document and is for Python 2.x compatibility 

import re 

regex = r"(?:^|\s)(?=\S*[^\u0000-\u007FáéíóúüñÁÉÍÓÚÜÑ¿¡])\S+" 

test_str = "aveces soñar es muy ließ y también человек" 

subst = "" 

# You can limit the number of replacements with the count argument (0 means replace all)
result = re.sub(regex, subst, test_str, count=0, flags=re.MULTILINE)

if result:
    print(result)

# Note: for Python 2.7 compatibility, use ur"" to prefix the regex and u"" to prefix the test string and substitution.
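
One edge case worth knowing: the pattern removes the whitespace in front of a word, not behind it, so a non-Spanish word at the very start of the string leaves a leading space in the result. If that matters, a split-and-filter variant may be simpler. Note that this is a different technique from the substitution above, and it is only a sketch: the function name `keep_spanish_words` is hypothetical, and it assumes that collapsing runs of whitespace into single spaces is acceptable.

```python
import re

# Any single character outside ASCII and outside the Spanish letters listed above.
NON_SPANISH_CHAR = re.compile(r"[^\u0000-\u007FáéíóúüñÁÉÍÓÚÜÑ¿¡]")


def keep_spanish_words(text):
    """Drop every whitespace-separated word that contains a non-Spanish character."""
    words = text.split()  # splits on any run of whitespace
    return " ".join(w for w in words if not NON_SPANISH_CHAR.search(w))


print(keep_spanish_words("ließ aveces soñar es muy y también человек"))
# aveces soñar es muy y también
```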


Source

2017-10-29 02:56:12 Gurman

This is exactly what I was looking for. Thank you!! – looker

Glad I could help :) – Gurman
