beautifulsoup4: the right way to use .find_all?

If I parse a website with BS4 and want to print the text "+26.67%" from its source code:

<font color="green"><b><nobr>+26.67%</nobr></b></font>

I have been messing around with the .find_all() command (http://www.crummy.com/software/BeautifulSoup/bs4/doc/) to no avail. What is the correct way to search the source code and print the text?

My code:

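import requests
from bs4 import BeautifulSoup

set_url = "*insert web address here*"
set_response = requests.get(set_url)
set_data = set_response.text
# Name the parser explicitly; otherwise bs4 guesses one and warns.
soup = BeautifulSoup(set_data, "html.parser")
# find() returns None if the fetched page contains no <nobr> element.
e = soup.find("nobr")
print(e.text)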

2014-02-13 user3230554

A small example:

>>> s = """<font color="green"><b><nobr>+26.67%</nobr></b></font>"""
>>> print(s)
<font color="green"><b><nobr>+26.67%</nobr></b></font>
>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup(s, "html.parser")
>>> e = soup.find("nobr")
>>> e.text  # or e.get_text()
'+26.67%'

find returns the first matching Tag; find_all returns a ResultSet:

>>> type(e)
<class 'bs4.element.Tag'>
>>> es = soup.find_all("nobr")
>>> type(es)
<class 'bs4.element.ResultSet'>
>>> for e in es:
...     print(e.get_text())
...
+26.67%
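
The difference is easier to see when the markup contains more than one match. Here is a minimal sketch, assuming a made-up snippet with two nobr elements; the `html` string and its second value are illustrative and do not come from the page in the question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup with two <nobr> elements (illustrative only).
html = (
    '<font color="green"><b><nobr>+26.67%</nobr></b></font>'
    '<font color="red"><b><nobr>-3.10%</nobr></b></font>'
)
soup = BeautifulSoup(html, "html.parser")

# find() stops at the first match and returns a single Tag.
first_text = soup.find("nobr").get_text()

# find_all() returns every match as a list-like ResultSet.
all_texts = [tag.get_text() for tag in soup.find_all("nobr")]

# Attribute filters work with find_all() as well.
green_texts = [
    font.get_text() for font in soup.find_all("font", {"color": "green"})
]

print(first_text)
print(all_texts)
print(green_texts)
```

Because a ResultSet behaves like a list, it can be indexed, iterated, or checked for emptiness before use.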

If you want to require that the nobr sits under b and font, it can be:

>>> soup.find("font", {"color": "green"}).find("b").find("nobr").get_text()
'+26.67%'

Note that chained .find calls raise an exception (AttributeError) if an earlier .find returns None.
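
One way to guard against that is to check each step before descending further. The sketch below is only an illustration: `nested_text` is a hypothetical helper name, not part of bs4, and the second test document is made up.

```python
from bs4 import BeautifulSoup


def nested_text(soup):
    """Return the text of a nobr inside a b inside a green font, or None.

    Hypothetical helper: it stops as soon as one lookup fails instead of
    calling .find on None.
    """
    node = soup.find("font", {"color": "green"})
    for name in ("b", "nobr"):
        if node is None:
            return None
        node = node.find(name)
    return node.get_text() if node is not None else None


matching = BeautifulSoup(
    '<font color="green"><b><nobr>+26.67%</nobr></b></font>', "html.parser"
)
empty = BeautifulSoup("<p>no match here</p>", "html.parser")

found = nested_text(matching)
missing = nested_text(empty)
print(found)
print(missing)
```

Returning None keeps the caller in control of what happens when the page does not have the expected structure.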

2014-02-13 10:11:42 WKPlus

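That is almost the same as what I have with the find_all() method. I think the problem is the way I'm parsing the web page. In your example you set s = """....""", whereas my program uses requests to fetch the page. I'll add my code to the main question; let me know what you think. – user3230554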

@user3230554 So what is the problem with your code? – WKPlus

Use a CSS selector:

>>> s = """<font color="green"><b><nobr>+26.67%</nobr></b></font>"""
>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup(s, "html.parser")
>>> soup.select('font[color="green"] > b > nobr')
[<nobr>+26.67%</nobr>]

Add or remove attributes or element names in the selector string to make the match more or less precise.
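
As a rough illustration of tightening and loosening a selector, the sketch below runs three selectors against a made-up document. The `html` string and the `texts` helper are assumptions for this example only.

```python
from bs4 import BeautifulSoup

# Hypothetical markup (illustrative only): two font-wrapped values plus a bare nobr.
html = (
    '<font color="green"><b><nobr>+26.67%</nobr></b></font>'
    '<font color="red"><b><nobr>-3.10%</nobr></b></font>'
    "<nobr>unrelated</nobr>"
)
soup = BeautifulSoup(html, "html.parser")


def texts(selector):
    """Hypothetical helper: text of every element matched by the selector."""
    return [tag.get_text() for tag in soup.select(selector)]


loose = texts("nobr")                              # any nobr anywhere
medium = texts("font nobr")                        # nobr somewhere inside a font
strict = texts('font[color="green"] > b > nobr')   # exact parent chain and attribute

# select_one() returns only the first match, or None if nothing matches.
first_green = soup.select_one('font[color="green"] nobr')

print(loose)
print(medium)
print(strict)
print(first_green.get_text())
```

The looser the selector, the more likely it is to pick up unrelated elements on a real page, so it is worth starting strict and relaxing only as needed.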

2014-02-13 10:15:26 phihag

Here is my solution:

s = """<font color="green"><b><nobr>+26.67%</nobr></b></font>"""
from bs4 import BeautifulSoup
soup = BeautifulSoup(s, "html.parser")
# select() returns a list; take the first <font> and print its text.
a = soup.select('font')
print(a[0].text)

2014-02-13 12:02:30 combuilder
