Python BeautifulSoup: regex to match the text after a string

I'm using BeautifulSoup and Python to scrape a web page. I have a BS element:

a = soup.find('div', class_='section lot-details')

It returns a series of list-item elements, as shown below.

<li><strong>Location:</strong> WA - 222 Welshpool Road, Welshpool</li> 
<li><strong>Deliver to:</strong> Pickup Only WA</li>

I want the string from each one:

WA - 222 Welshpool Road, Welshpool 
Pickup Only WA

How do I get this text out of the BS object that is returned? I'm not sure about the regex, or how it interacts with BeautifulSoup.

2016-05-19 Testy8

How does a find on 'div' return 'li' elements? – AKS

The regex (?:</strong>)(.*)(?:</li>) will do the job; capture group \1, the (.*), holds the text you want.

Python code example:

In [1]: import re
In [2]: test = re.compile(r'(?:</strong>)(.*)(?:</li>)')
In [3]: test.findall(input_string)
Out[3]: [' WA - 222 Welshpool Road, Welshpool', ' Pickup Only WA']

Check it here: https://regex101.com/r/fD0fZ9/1
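For reference, here is a self-contained sketch of the regex approach. The input_string value is an assumption: it is just the two li lines from the question pasted in as a string (in practice it might come from str(a)). The pattern is also a slight variation on the one above, not the original: it uses a non-greedy capture and \s* so the leading space is dropped.

```python
import re

# Assumed input: the two <li> lines from the question, as one string.
input_string = (
    "<li><strong>Location:</strong> WA - 222 Welshpool Road, Welshpool</li>\n"
    "<li><strong>Deliver to:</strong> Pickup Only WA</li>"
)

# Capture whatever sits between </strong> and </li>.
# (.*?) is non-greedy; the surrounding \s* trims whitespace around the value.
pattern = re.compile(r"</strong>\s*(.*?)\s*</li>")

# With a single capture group, findall returns a list of the captured strings.
values = pattern.findall(input_string)
print(values)
```

This only holds up while each value sits on one line directly after the closing strong tag; markup that varies more than that is usually easier to handle with the parser itself.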

2016-05-19 13:29:34

This also works for other, more general cases. – Testy8

You don't really need a regex. If you have your li tags in a list:

>>> for li in li_elems:
...     print(li.find('strong').next_sibling.strip())
...

WA - 222 Welshpool Road, Welshpool 
Pickup Only WA

This assumes there is only one strong in each li and that the text node comes right after it.

Alternatively:

>>> for li in li_elems:
...     print(li.contents[1].strip())
...

WA - 222 Welshpool Road, Welshpool 
Pickup Only WA
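A runnable sketch of both approaches, for Python 3 with the bs4 package installed. The html string is hypothetical: the real page is not shown in the question, so the div and ul wrapper here are only assumed in order to give the li items somewhere to live. It also shows one way to get li_elems from the div, namely find_all('li').

```python
from bs4 import BeautifulSoup

# Hypothetical fragment: the question's two <li> items inside an assumed wrapper.
html = """
<div class="section lot-details">
  <ul>
    <li><strong>Location:</strong> WA - 222 Welshpool Road, Welshpool</li>
    <li><strong>Deliver to:</strong> Pickup Only WA</li>
  </ul>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
a = soup.find("div", class_="section lot-details")

# Collect every <li> nested under the div.
li_elems = a.find_all("li")

# Approach 1: take the text node that immediately follows <strong>.
by_sibling = [li.find("strong").next_sibling.strip() for li in li_elems]

# Approach 2: take the second child of each <li> (index 0 is the <strong> tag).
by_contents = [li.contents[1].strip() for li in li_elems]

for text in by_sibling:
    print(text)
```

Both lists come out the same here. The next_sibling version is a little more tolerant, since contents[1] shifts if anything, even whitespace, appears before the strong tag.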

2016-05-19 13:47:57 AKS
