2017-02-12 115 views
4

How can I get a div element nested inside another div element using BeautifulSoup?

This is the sample HTML code:

<div class="cb-col cb-col-25 cb-mtch-blk"><a class="cb-font-12" href="/live-cricket-scores/16947/ind-vs-ban-only-test-bangladesh-tour-of-india-2017" target="_self" title="India v Bangladesh - Only Test"> 
<div class="cb-hmscg-bat-txt cb-ovr-flo "> 
<div class="cb-ovr-flo cb-hmscg-tm-nm">BAN</div> 
<div class="cb-ovr-flo" style="display:inline-block; width:140px">322/6 (104.0 Ovs)</div> 
</div> 

I want to extract texts such as BAN and 322/6 (104.0 Ovs) from the HTML above. This is what I am doing:

soup = BeautifulSoup(html) 
div_class = soup.findAll('div',class_='cb-col cb-col-25 cb-mtch-blk') 
for each in div_class: 
    #I want to get those texts from variable 'each' 

How should I do this?

Answers

3

You can use CSS selectors with BeautifulSoup4:

>>> from bs4 import BeautifulSoup 
>>> html = ... # the html provided in the question 
>>> soup = BeautifulSoup(html, 'lxml') 
>>> name, size = soup.select('div.cb-hmscg-bat-txt.cb-ovr-flo div') 
>>> name.text 
u'BAN' 
>>> size.text 
u'322/6 (104.0 Ovs)' 
+1

On line no. 4, I get an error: "too many values to unpack". What should I do? – ddlj

+0

@ddlj, how about replacing line 4 with: 'print([x.text for x in soup.select("div.cb-hmscg-bat-txt.cb-ovr-flo div")])' – falsetru

+0

@ddlj, by the way, can you share the actual HTML (or the URL you get the HTML from)? As you can see in my answer, I can get both texts using the HTML given in the question. – falsetru
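
One possible cause of the "too many values to unpack" error is that the real page contains more than one match block, so the selector returns more than two divs. This is only a guess, since the actual HTML was not shared. A minimal sketch that scopes the selector to each block and collects the texts in a list instead of unpacking them is below. The second block in the sample markup (team "AAA"), the "#" hrefs, the closing tags, and the use of 'html.parser' instead of 'lxml' are assumptions made for illustration only.

```python
from bs4 import BeautifulSoup

# Sample markup: the question's snippet (closing tags added, href shortened)
# plus a second, made-up block to imitate a page with several matches.
html = """
<div class="cb-col cb-col-25 cb-mtch-blk"><a class="cb-font-12" href="#">
<div class="cb-hmscg-bat-txt cb-ovr-flo ">
<div class="cb-ovr-flo cb-hmscg-tm-nm">BAN</div>
<div class="cb-ovr-flo" style="display:inline-block; width:140px">322/6 (104.0 Ovs)</div>
</div>
</a></div>
<div class="cb-col cb-col-25 cb-mtch-blk"><a class="cb-font-12" href="#">
<div class="cb-hmscg-bat-txt cb-ovr-flo ">
<div class="cb-ovr-flo cb-hmscg-tm-nm">AAA</div>
<div class="cb-ovr-flo" style="display:inline-block; width:140px">100/2 (20.0 Ovs)</div>
</div>
</a></div>
"""

# 'html.parser' ships with Python, so no extra parser needs to be installed.
soup = BeautifulSoup(html, "html.parser")

results = []
for block in soup.select("div.cb-mtch-blk"):
    # Only look at the direct child divs of the score container in this block.
    texts = [d.get_text(strip=True)
             for d in block.select("div.cb-hmscg-bat-txt > div")]
    results.append(texts)

print(results)
```

Because each block yields its own list, a block with more or fewer than two inner divs no longer raises an error; you can check the length of each list before using it.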

1

each refers to the HTML code you provided. You should go down to the div tag nested inside it and get all of its text using stripped_strings:

div_class = soup.findAll('div',class_='cb-col cb-col-25 cb-mtch-blk') 
for each in div_class: 
    name, size = each.div.stripped_strings 
    print(name, size) 

Output:

BAN 322/6 (104.0 Ovs) 
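
A self-contained variant of the loop above, as a rough sketch: it looks up the inner container by class rather than relying on each.div, and it checks the number of strings before unpacking. The closing tags added to the sample markup, the "#" href, the 'html.parser' choice, and the name rows are assumptions for illustration, not part of the original answer.

```python
from bs4 import BeautifulSoup

# The question's snippet with closing tags added and the href shortened.
html = """
<div class="cb-col cb-col-25 cb-mtch-blk"><a class="cb-font-12" href="#">
<div class="cb-hmscg-bat-txt cb-ovr-flo ">
<div class="cb-ovr-flo cb-hmscg-tm-nm">BAN</div>
<div class="cb-ovr-flo" style="display:inline-block; width:140px">322/6 (104.0 Ovs)</div>
</div>
</a></div>
"""

soup = BeautifulSoup(html, "html.parser")

rows = []
for each in soup.find_all("div", class_="cb-mtch-blk"):
    # Find the container that holds the team name and the score.
    inner = each.find("div", class_="cb-hmscg-bat-txt")
    if inner is None:
        continue  # this block has no score container
    strings = list(inner.stripped_strings)
    # Unpack only when exactly two strings are present.
    if len(strings) == 2:
        name, size = strings
        rows.append((name, size))

print(rows)
```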