Python regular expression: what's wrong with this?

I want to write a regular expression to extract the error code from this XML.

>>> import re
>>> re_code = re.compile(r'<errorcode>([0-9]+)</errorcode>', re.MULTILINE)
>>> re_code.match('''<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?> 
... <methoderesponse> 
...  <status> 
...   <message/> 
...   <errorcode>515</errorcode> 
...   <value>ERROR</value> 
...  </status> 
... </methoderesponse> 
... ''')

It should be easy, but I don't understand why it doesn't match.


2012-11-19 Natim

When asking about regular expressions, you should also include the input and the expected output so that we can help you better.

@InbarRose: The input and the expected output are in the question. – Tim

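As @Jon Clements said, .match() only works when the expression should match from the beginning of the string, .search() finds the first occurrence anywhere in the string, and .findall() finds all occurrences.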
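To make the difference concrete, here is a minimal sketch that runs all three methods against the XML from the question. The names xml_text, pattern, at_start, first_code and all_codes are chosen here for illustration and do not come from the original post.

```python
import re

# Sample input copied from the question.
xml_text = '''<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>
<methoderesponse>
 <status>
  <message/>
  <errorcode>515</errorcode>
  <value>ERROR</value>
 </status>
</methoderesponse>
'''

pattern = re.compile(r'<errorcode>([0-9]+)</errorcode>')

# match() only succeeds if the pattern matches at position 0 of the string.
# The string starts with '<?xml', so this is None.
at_start = pattern.match(xml_text)

# search() scans forward and returns the first match anywhere in the string.
first = pattern.search(xml_text)
first_code = first.group(1) if first else None

# findall() returns the captured group for every non-overlapping match.
all_codes = pattern.findall(xml_text)
```

With this input, at_start is None, first_code is '515' and all_codes is ['515'], which is why the original call to .match() appeared to fail.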

Regardless, you should change the regular expression to a slightly more readable version:

regex = re.compile(r'<errorcode>(\d+)</errorcode>')

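You don't need the re.MULTILINE flag. It only changes where ^ and $ match, and this pattern uses neither, so it has nothing to do with the problem.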
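As a rough illustration of what re.MULTILINE does and does not change, the sketch below uses a shortened, made-up input (sample) rather than the full XML from the question. The anchored pattern is also hypothetical, added only to show the flag's effect.

```python
import re

# Hypothetical two-line input; the error code is on the second line.
sample = "first line\n<errorcode>515</errorcode>\n"

# Without re.MULTILINE, ^ matches only at the very start of the string,
# so an anchored pattern cannot match the second line.
plain = re.search(r'^<errorcode>(\d+)</errorcode>$', sample)

# With re.MULTILINE, ^ and $ also match at the start and end of each line.
multi = re.search(r'^<errorcode>(\d+)</errorcode>$', sample, re.MULTILINE)

# The flag does not affect match(): it still only tries position 0.
still_none = re.match(r'<errorcode>(\d+)</errorcode>', sample, re.MULTILINE)
```

Here plain is None, multi captures '515', and still_none is None, so the flag cannot rescue a .match() call on text that does not begin with the pattern.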


2012-11-19 10:00:24

.match() tries to match at the start of the string. You want .search() or, more likely, .findall().

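Do take a look at an XML parser, though. It is much better to use XPath or an equivalent to get at your data, and it will handle nuances that the regular expression won't.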

An example that works with your sample XML:

import xml.etree.ElementTree as ET 
tree = ET.fromstring(text) 

>>> tree.findall('.//errorcode')[0].text 
'515'

More information about ElementTree here; personally, I'd check out lxml.
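For completeness, a self-contained variant of the same idea might look like the sketch below. It assumes the response is available as bytes (xml_bytes is a name chosen here), and it uses findtext() instead of findall()[0], which is a substitution made for this example so that a missing element yields None rather than an IndexError.

```python
import xml.etree.ElementTree as ET

# Sample input copied from the question, as bytes so that the parser
# reads the encoding declaration itself.
xml_bytes = b'''<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>
<methoderesponse>
 <status>
  <message/>
  <errorcode>515</errorcode>
  <value>ERROR</value>
 </status>
</methoderesponse>
'''

root = ET.fromstring(xml_bytes)

# findtext() returns the text of the first matching element,
# or the default (None) when nothing matches.
code = root.findtext('.//errorcode')
missing = root.findtext('.//nosuchtag')
```

With this input, code is '515' and missing is None.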


2012-11-19 09:46:54

I don't want to use etree for this. – Natim
