2017-01-27 84 views

Python BeautifulSoup: how to extract/find

Can anyone help explain why this doesn't work?

I don't fully understand the BeautifulSoup documentation.

from urllib.request import Request, urlopen
import bs4

req = Request('http://performance.morningstar.com/stock/performance-return.action?p=dividend_split_page&t=D05', headers={'User-Agent': 'Mozilla/5.0'})
webpage = urlopen(req).read()

soup = bs4.BeautifulSoup(webpage, 'lxml')

div = soup.find('div', {'id': 'div_annual_dividends'})

th = div.find('th', text="Dividend Amount")

Then I tried using nextSibling.text to extract the value 0.56.

This is the error I get:

AttributeError: 'NoneType' object has no attribute 'nextSibling' 

How do I store the output in an array?

for tr in soup('th', text="Dividend Amount"):
    row = [td.text for td in tr('td')]
    print(row)

Is this correct?

Answer

This page is rendered by JavaScript; the real data is at this URL:

http://performance.morningstar.com/perform/Performance/stock/annual-dividends.action?&t=XSES:D05&region=sgp&culture=en-US&cur=&ops=clear&ndec=2&y=5 

You can find this URL with Chrome's developer tools.
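
A minimal sketch of why the original attempt fails, assuming the statically served HTML contains the `div_annual_dividends` container but leaves it empty until JavaScript fills it in. The `static_html` string is a hypothetical stand-in, not the real page source. In that situation `find` returns `None`, and any attribute access on the result raises the `AttributeError` shown in the question.

```python
import bs4

# Hypothetical stand-in for the static HTML: the container exists but is empty
static_html = '<html><body><div id="div_annual_dividends"></div></body></html>'

# 'html.parser' is used here only to avoid needing lxml for the sketch
soup = bs4.BeautifulSoup(static_html, 'html.parser')
div = soup.find('div', {'id': 'div_annual_dividends'})

# There is no <th> inside the empty container, so find() returns None
th = div.find('th', string='Dividend Amount')

# Check for None before touching siblings or text
if th is None:
    result = 'no matching <th> found'
else:
    result = th.find_next_sibling('td').get_text(strip=True)
print(result)
```

Checking for `None` before using the result turns the crash into a readable message and makes it obvious that the data is not in the downloaded HTML.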

Code:

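import requests, bs4 

# Request the endpoint that returns the dividend table directly
r = requests.get('http://performance.morningstar.com/perform/Performance/stock/annual-dividends.action?&t=XSES:D05&region=sgp&culture=en-US&cur=&ops=clear&ndec=2&y=5') 
soup = bs4.BeautifulSoup(r.text, 'lxml') 
rows = [] 
# class_=False keeps only the <tr> tags that have no class attribute
for tr in soup('tr', class_=False): 
    # Collect the text of every <td> cell in this row
    row = [td.text for td in tr('td')] 
    rows.append(row) 
print(rows)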

Output:

[['0.56', '0.56', '0.58', '0.60', '0.60'], 
['3.77', '3.27', '2.82', '3.59', '3.46']] 
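
If numbers are needed rather than strings, the nested list above can be converted in plain Python. This is only a sketch: it assumes the first inner list is the "Dividend Amount" row (it contains the 0.56 value from the question), and the variable names are illustrative.

```python
# The list of lists produced by the scraping code, copied from the output
rows = [['0.56', '0.56', '0.58', '0.60', '0.60'],
        ['3.77', '3.27', '2.82', '3.59', '3.46']]

# Convert every cell from str to float
numeric_rows = [[float(cell) for cell in row] for row in rows]

# Assumption: the first row holds the dividend amounts
dividend_amounts = numeric_rows[0]
print(dividend_amounts[0])  # prints 0.56
```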

I get AttributeError: 'NavigableString' object has no attribute 'text' if I do print nextSibling.text
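
One likely cause, offered as an assumption since the exact markup is not shown: when the HTML has whitespace between tags, `nextSibling` returns that whitespace as a `NavigableString` rather than the next tag. The sketch below uses a hypothetical table fragment to show the difference, and uses `find_next_sibling`, which skips text nodes.

```python
import bs4

# Hypothetical fragment; the newlines between tags become text nodes
html = """<table>
<tr>
<th>Dividend Amount</th>
<td>0.56</td>
</tr>
</table>"""

soup = bs4.BeautifulSoup(html, 'html.parser')
th = soup.find('th', string='Dividend Amount')

# next_sibling is the newline after </th>, not the <td>
sibling = th.next_sibling
print(type(sibling).__name__)

# find_next_sibling only considers tags, so it reaches the <td>
value = th.find_next_sibling('td').get_text(strip=True)
print(value)
```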

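Could you add comments and update the code so that I can store the values in an array? I understand that for tr in soup('tr', class_=False): will grab the HTML for "Dividend Amount". But how do I store the td values in an array? Thanks.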
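
A commented sketch of one way to do this, assuming each data row holds its label in a `<th>` and its values in `<td>` cells. The HTML fragment, the header row, and the second row's label are hypothetical placeholders; only the "Dividend Amount" label and the cell values come from this thread.

```python
import bs4

# Hypothetical markup standing in for the real response
html = """<table>
<tr class="header"><th>Label</th></tr>
<tr><th>Dividend Amount</th><td>0.56</td><td>0.56</td><td>0.58</td><td>0.60</td><td>0.60</td></tr>
<tr><th>Second Row</th><td>3.77</td><td>3.27</td><td>2.82</td><td>3.59</td><td>3.46</td></tr>
</table>"""

soup = bs4.BeautifulSoup(html, 'html.parser')

table = {}
for tr in soup.find_all('tr'):
    # Gather the text of every data cell in this row
    cells = [td.get_text(strip=True) for td in tr.find_all('td')]
    if not cells:
        continue  # skip rows that have no <td> cells, such as the header
    # Use the row's <th> text as the key, or an empty string if it has none
    th = tr.find('th')
    label = th.get_text(strip=True) if th else ''
    table[label] = cells

print(table['Dividend Amount'])
```

Keying the rows by label means the dividend values can be looked up by name instead of relying on row order.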

@OOI YI YONG Please post your desired result in the question.