2016-12-28 69 views
1

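How to extract a number from a sentence with Python

Here is a sentence: "a building is 100 m tall and 20 m wide". I want to extract the number for the height, 100, so I use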

import re

question = input(" ")  # read the sentence from the user
height = re.findall(r'(\d+) m tall', question)  # list of matched numbers, as strings

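However, sometimes the sentence says "100 m high" instead of "100 m tall". In that case my program can no longer extract the number I want. Is there a way to improve my program so that it works regardless of whether the sentence contains "tall" or "high"?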

Answers

4

You can check for the "tall or high" condition with the alternation operator |:

(\d+) m (tall|high) 

Demo:

>>> re.findall(r'(\d+) m (tall|high)', 'a building is 100 m tall and 20 m wide') 
[('100', 'tall')] 
>>> re.findall(r'(\d+) m (tall|high)', 'a building is 100 m high and 20 m wide') 
[('100', 'high')] 

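With two capturing groups, findall returns tuples. If you don't want the word itself captured, use a non-capturing group: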

(\d+) m (?:tall|high) 
1
>>> import re 
>>> re.findall(r'(\d+) m (?:tall|high)', "a building is 100 m tall and 20 m wide") 
['100'] 
>>> re.findall(r'(\d+) m (?:tall|high)', "a building is 100 m high and 20 m wide") 
['100'] 
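
If the input is less regular than the example, the same idea can be loosened a little. The sketch below assumes that spacing and letter case may vary (for instance "100m High"); that assumption is not stated in the question, so treat the pattern as illustrative only.

```python
import re

# Assumption: spacing and capitalization may vary; the question only shows "100 m tall".
# \s* allows "100m" as well as "100 m"; re.IGNORECASE accepts "Tall", "HIGH", etc.
pattern = re.compile(r'(\d+)\s*m\s+(?:tall|high)', re.IGNORECASE)

sentences = (
    "a building is 100 m tall and 20 m wide",
    "A building is 100m High and 20 m wide",
)

for sentence in sentences:
    # The width ("20 m wide") is never matched because "wide" is not an alternative.
    print(pattern.findall(sentence))
```

Both sentences should yield only the height, since the non-capturing group still restricts the match to "tall" or "high".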
0

As per your requirement, the regex should match either of the terms "tall" or "high".

  i.e., (?:tall|high) 
     where, (?: ... ) is a non-capturing group (it groups without capturing) 
       and,  | means 'or' 

So the solution can look like this:

>>> re.findall(r'(\d+) m (?:tall|high)', question) 
['100']
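
Since findall returns a list of strings, a small wrapper can be handy when only one height is expected and a number is wanted. This is just a sketch: the function name `extract_height` is hypothetical, and it assumes the first match is the one of interest.

```python
import re

def extract_height(sentence):
    """Return the height as an int, or None if no height is found.

    Hypothetical helper: assumes the first "<number> m tall/high" is the one wanted.
    """
    match = re.search(r'(\d+) m (?:tall|high)', sentence)
    # group(1) is the digits; the (?:...) group is not captured.
    return int(match.group(1)) if match else None

print(extract_height("a building is 100 m high and 20 m wide"))
print(extract_height("a building is 20 m wide"))
```

The first call should print 100 and the second None, because "20 m wide" contains neither "tall" nor "high".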