Python regex: forcing non-greedy matching

I saw an example of non-greedy matching here.

import re

reg_string = r"(.*?)>Title"
path = "<html><head><title>Title</title>"
match = re.match(reg_string, path)
if match:
    print(match.group())

However, I want Python to complain that this does not match, because the first > is not followed by Title. As it stands, the match is:

"<html><head><title>Title"

Source

2017-04-05 Eda Jede
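
The behaviour described in the question follows from how lazy quantifiers work: `.*?` tries the shortest match first, but it still grows one character at a time until the rest of the pattern can match. A minimal sketch using the same pattern and input (the names `whole` and `prefix` are introduced here only for illustration):

```python
import re

reg_string = r"(.*?)>Title"
path = "<html><head><title>Title</title>"

match = re.match(reg_string, path)
# The lazy group keeps expanding past the first two '>' characters,
# because neither of them is followed by "Title".
whole = match.group()
prefix = match.group(1)
print(whole)   # <html><head><title>Title
print(prefix)  # <html><head><title
```

So "non-greedy" only means "as short as possible while still matching"; it does not forbid the group from crossing a `>`.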

Try reg_string = "([^>]*?)>Title"
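
A short sketch of why this helps, assuming the goal is to fail unless Title comes directly after the first `>`: `[^>]` cannot match a `>`, so the group can never run past the first one. The helper name `prefix_before_title` and the choice of `ValueError` are illustrative assumptions, not part of the suggestion itself.

```python
import re

STRICT_PATTERN = re.compile(r"([^>]*?)>Title")


def prefix_before_title(path):
    """Return the text before the first '>' if 'Title' follows it directly."""
    match = STRICT_PATTERN.match(path)
    if match is None:
        # The first '>' is not followed by "Title", so complain loudly.
        raise ValueError("no 'Title' after the first '>' in %r" % path)
    return match.group(1)


print(prefix_before_title("<title>Title</title>"))  # <title

try:
    prefix_before_title("<html><head><title>Title</title>")
except ValueError as exc:
    print("rejected:", exc)
```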

Source

2017-04-05 12:59:13 Guillaume

Works great! Thanks –

@EdaJede If this answers your question, please mark it as accepted. – xlm

You might want to look at BeautifulSoup, a Python library for more direct parsing and processing of HTML:

https://www.crummy.com/software/BeautifulSoup/bs4/doc/
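
As a rough illustration of parser-based extraction, here is a minimal sketch that pulls out the title text without a regex. To stay dependency-free it uses the standard library's `html.parser` as a stand-in rather than BeautifulSoup itself; the `TitleGrabber` class and `extract_title` function are hypothetical names introduced here.

```python
from html.parser import HTMLParser


class TitleGrabber(HTMLParser):
    """Hypothetical helper: collects the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = None

    def handle_starttag(self, tag, attrs):
        # Only start collecting for the first <title> we encounter.
        if tag == "title" and self.title is None:
            self.in_title = True
            self.title = ""

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def extract_title(html):
    """Return the title text, or None if the document has no <title>."""
    parser = TitleGrabber()
    parser.feed(html)
    parser.close()
    return parser.title


print(extract_title("<html><head><title>Title</title>"))  # Title
```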

Source

2017-04-05 12:59:04

As I understand it, you want to capture everything before Title, but it should complain if there is no title text?

import re

# Here we add a zero-to-many length match, delimited by `<` or end of line,
# and capture it in a second group
reg_string = r"(.*?)>(.*?)(<|$)"

path = "<html><head><title>Title</title>"

match = re.match(reg_string, path)
if match:
    if match.group(2) == "":
        # Nothing sits between the first `>` and the next `<` (true for this path)
        raise Exception("No title content")
    else:
        print(match.group(1))
else:
    raise Exception("No match")

Source

2017-04-05 13:04:14 taifwa

It should complain when "Title" does not come right after the first ">". –

Then your main example works. Just add an "else" clause, or an "if not match:" check. – taifwa
