2016-10-09 45 views

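Python regex to ignore sentences with two consecutive uppercase letters

I have a simple problem: I want to reject sentences that contain two or more consecutive uppercase letters, in addition to a few other grammar rules.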

Problem: as written, the regex should not match the string 'This is something with two CAPS.', but it does.

Code:

''' Check if a given sentence conforms to given grammar rules 

    $ Rules 
     * Sentence must start with an uppercase character (e.g. Noun/ I/ We/ He etc.) 
     * Then lowercase character follows. 
     * There must be spaces between words. 
     * Then the sentence must end with a full stop(.) after a word. 
     * Two continuous spaces are not allowed. 
     * Two continuous upper case characters are not allowed. 
     * However the sentence can end after an upper case character. 
''' 

import re 


# Returns a match object if the sentence follows these rules, else None 
def check_sentence(sentence): 
    checker = re.compile(r"^((^(?![A-Z][A-Z]+))([A-Z][a-z]+)(\s\w+)+\.$)") 
    return checker.match(sentence) 

print(check_sentence('This is something with two CAPS.')) 

Output:

<_sre.SRE_Match object; span=(0, 32), match='This is something with two CAPS.'> 

Answers

0

It may be easier to write your regex in the negative (find everything that makes a sentence bad) than in the positive.

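import re

def check_sentence(sentence):
    checker = re.compile(r'([A-Z][A-Z]|[ ][ ]|^[a-z])')  # anything that makes a sentence bad
    check2 = re.compile(r'^[A-Z][a-z].* .*\.$')  # overall shape of a good sentence
    return not checker.search(sentence) and bool(check2.search(sentence))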
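
A quick way to try the negative approach is to run it over a few sample sentences. The sketch below reuses the two patterns from this answer unchanged; the names `BAD`, `SHAPE`, and `is_valid`, and the sample sentences, are made up for illustration.

```python
import re

# Patterns copied from the answer; the names BAD, SHAPE, is_valid are hypothetical.
BAD = re.compile(r'([A-Z][A-Z]|[ ][ ]|^[a-z])')   # two capitals, two spaces, or lowercase start
SHAPE = re.compile(r'^[A-Z][a-z].* .*\.$')        # capital + lowercase, a space somewhere, final full stop


def is_valid(sentence):
    """True when nothing 'bad' is found and the overall shape is right."""
    return not BAD.search(sentence) and bool(SHAPE.search(sentence))


samples = [
    'This is something with two CAPS.',
    'This is fine.',
    'This has  two spaces.',
    'lowercase start.',
]
for s in samples:
    print(repr(s), is_valid(s))
```

Note that `SHAPE` requires a lowercase letter right after the first capital, so a sentence starting with the single-letter word "I" would be rejected; whether that is acceptable depends on how strictly the rules are read.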

It is hardly working. I have a test suite; here is the result of that –


??? Where are the results, @aswin-mohan? – 2ps


More importantly, why is it hardly working? – 2ps

0

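Your negative lookahead only applies at the start of the string being tested.

Second capture group: (^(?![A-Z][A-Z]+))

^ asserts the position at the start of the string.

The negative lookahead (?![A-Z][A-Z]+) therefore only rejects a string that begins with two or more uppercase letters; capitals later in the sentence are never checked.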

"This will NOT fail."

"THIS will fail."
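
The two examples above can be checked directly. The sketch below also includes one possible revision of the pattern; this is an assumption, not the answerer's own fix. It puts `.*` inside the lookahead so that the check covers the whole sentence instead of only its first characters. The names `original` and `revised` and the third sample sentence are illustrative.

```python
import re

# Pattern from the question: the lookahead sits right after ^, so it only
# inspects the first characters of the string.
original = re.compile(r"^((^(?![A-Z][A-Z]+))([A-Z][a-z]+)(\s\w+)+\.$)")

# One possible revision (an assumption, not taken from the thread):
# ".*" inside the lookahead lets it find two capitals anywhere in the sentence.
# A literal space replaces \s so that tabs and newlines do not count as separators.
revised = re.compile(r"^(?!.*[A-Z]{2})[A-Z][a-z]+(?: \w+)+\.$")

for s in ["This will NOT fail.", "THIS will fail.", "This is plan B."]:
    print(repr(s), bool(original.match(s)), bool(revised.match(s)))
```

With the revised pattern, "This will NOT fail." is rejected as well, while a sentence ending in a single capital such as "This is plan B." is still accepted. `\w` also matches digits and underscores, as it does in the question's pattern, so tighten it to `[A-Za-z]` if words must be purely alphabetic.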


Thanks. Could you write out the corrected regex? –
