How to extract a background-image URL using BeautifulSoup

I have a div whose id attribute value is "img-cont":

<div class="img-cont-box" id="img-cont" style='background-image: url("http://example.com/example.jpg");'>

I want to extract the URL of the background image using BeautifulSoup. How can I do that?

Source

2017-04-02 latish

You can use find_all, or find to get only the first match.

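import re
from bs4 import BeautifulSoup

# html_str is assumed to hold the HTML from the question
soup = BeautifulSoup(html_str, 'html.parser')
# Match the div by id and require that it has a style attribute
result = soup.find('div', attrs={'id': 'img-cont', 'style': True})
if result is not None:
    url = re.findall(r'\("(http.*)"\)', result['style'])  # returns a list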

Source

2017-04-02 22:04:39 oshribr

I have already done that part. How do I extract the URL from the result variable? – latish

Thanks, it worked!! Can you explain this part: url = re.findall('\("(http.*)"\)', result['style'])? – latish

result['style'] returns the string 'background-image: url("http://example.com/example.jpg");', and re.findall() is a regular-expression search. To read more about regular expressions, see https://docs.python.org/2/library/re.html – oshribr
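To make the explanation above concrete, here is a minimal sketch that applies the same pattern to the style string directly, without BeautifulSoup. The variable names `style`, `urls`, and `url` are only illustrative and do not come from the answer.

```python
import re

# The value that result['style'] holds for the div in the question
style = 'background-image: url("http://example.com/example.jpg");'

# \(" and "\) match the literal parenthesis and quote around the URL;
# (http.*) is a capture group, so findall returns only the captured text.
urls = re.findall(r'\("(http.*)"\)', style)
print(urls)

# findall returns a list, so take the first element if anything matched
url = urls[0] if urls else None
print(url)
```

Because findall returns a list, the URL itself is `urls[0]`; an empty list means the pattern did not match.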

Try this:

import re
from bs4 import BeautifulSoup

# Backslash-newline inside the string joins the lines into a single tag
html = '''\
<div class="img-cont-box" \
id="img-cont" \
style='background-image: url("http://example.com/example.jpg");'>\
'''

soup = BeautifulSoup(html, 'html.parser')
div = soup.find('div', id='img-cont')
# Capture whatever sits between url(" and ")
print(re.search(r'url\("(.+)"\)', div['style']).group(1))

Source

2017-04-03 07:57:29 hallazzang
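Both patterns above assume the URL is wrapped in double quotes, as it is in the question. CSS also allows single quotes or no quotes inside url(...), so a slightly more tolerant pattern may be useful. The following is only a sketch: the helper name `extract_css_url` is hypothetical and is not part of BeautifulSoup or of either answer.

```python
import re

# Hypothetical helper pattern: optional whitespace and optional quotes
# (single or double) around the value inside url(...).
_CSS_URL = re.compile(r'''url\(\s*['"]?(.*?)['"]?\s*\)''')

def extract_css_url(style):
    """Return the first url(...) value found in a style string, or None."""
    match = _CSS_URL.search(style)
    return match.group(1) if match else None

print(extract_css_url('background-image: url("http://example.com/example.jpg");'))
print(extract_css_url("background-image: url('http://example.com/example.jpg')"))
print(extract_css_url('background-image: url(http://example.com/example.jpg)'))
print(extract_css_url('color: red'))  # no url(...) present, so None
```

With a div found as in the answers, this would be called as `extract_css_url(div['style'])`. Returning None instead of raising avoids the AttributeError that `re.search(...).group(1)` produces when nothing matches.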
