2017-04-10
0

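I am scraping a table for a specific set of 'td' tags and the text inside them. To filter, I am targeting a specific 'img' tag and trying to use a previousSibling call to get the 'td' I want. I have tried previousSibling, previous_sibling, and previous with BS4, and keep getting this error:

AttributeError: 'ResultSet' object has no attribute 'previousSibling'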

Any help would be greatly appreciated.
Here is my code so far.

from urllib2 import urlopen 
import requests 
from bs4 import BeautifulSoup 
base_url = 'http://www.myfxbook.com/forex-economic-calendar' 
response = urlopen(base_url) 
html = response 
soup = BeautifulSoup(html.read().decode('utf-8'), "lxml") 
table = soup.find('table', attrs={'class': 'table center td30'}) 
is_row = table.findAll('img', attrs={'class': 'sprite sprite-common sprite-high-impact'}).previousSibling('td').text 
print is_row 

Answer

1

The image you are searching for has no siblings. What you want (I think) is the previous sibling of the image's parent.

Example:

from bs4 import BeautifulSoup
import requests

base_url = 'http://www.myfxbook.com/forex-economic-calendar'
response = requests.get(base_url)

soup = BeautifulSoup(response.content.decode('utf-8'), "html.parser")

table = soup.find('table', attrs={'class': 'table center td30'})
# findAll returns a ResultSet (a list of tags), so each image is handled in the loop below
is_row = table.findAll('img', attrs={'class': 'sprite sprite-common sprite-high-impact'})

for row in is_row:
    # step up to the image's parent element, then back to the preceding <td>
    print(row.parent.find_previous_sibling("td").get_text(strip=True))

Output:

Fed's Yellen Speech 
FOMC Member Kashkari Speech 
BOE's Governor Carney speech 
Claimant Count Change 
BOC Rate Statement 
BoC Interest Rate Decision 
Bank of Canada Monetary Policy Report 
BoC Press Conference 
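The error in the question appears to come from calling a navigation attribute on the list that findAll returns, rather than on a single tag. Below is a minimal sketch of that difference. The HTML string is a hypothetical stand-in for the real page (its rows and the "Retail Sales" entry are made up for illustration), and the single-class filter `class_="sprite-high-impact"` is an assumption used here in place of the full class string.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in markup, not the real myfxbook page:
# the event name sits in the cell just before the cell holding the icon.
HTML = """
<table class="table center td30">
  <tr><td>Fed's Yellen Speech</td><td><img class="sprite sprite-common sprite-high-impact"></td></tr>
  <tr><td>Retail Sales</td><td><img class="sprite sprite-common sprite-low-impact"></td></tr>
  <tr><td>BOC Rate Statement</td><td><img class="sprite sprite-common sprite-high-impact"></td></tr>
</table>
"""

soup = BeautifulSoup(HTML, "html.parser")
table = soup.find("table")

# find_all (findAll is the older alias) returns a list-like ResultSet of tags.
images = table.find_all("img", class_="sprite-high-impact")

# Navigation attributes belong to individual tags, not to the ResultSet.
try:
    images.previousSibling
except AttributeError as exc:
    error_message = str(exc)

# Iterating gives one tag at a time, and each tag can be navigated.
names = [img.parent.find_previous_sibling("td").get_text(strip=True) for img in images]
print(names)
```

If only one match is expected, `find` returns a single tag (or None) and avoids the loop.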
+0

That is exactly the answer. So I need the parent and then its previous sibling? Why both? –

+1

If you use 'find_previous', you get the image's own parent, so you have to go to the image's parent first and then get the previous 'td'. – Zroq

+1

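Also, 'previous_sibling' looks for nodes at the same level as the one you are searching from. The text you want is at the same level as the parent, not the image. – Zroq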
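The three navigation calls discussed in these comments can be compared side by side. This is a small sketch on a hypothetical one-row table, assuming the image is the only child of its cell, as the answer suggests is the case on the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical single-row markup, not the real page.
HTML = (
    '<table><tr>'
    '<td>BoC Press Conference</td>'
    '<td><img class="sprite-high-impact"></td>'
    '</tr></table>'
)

soup = BeautifulSoup(HTML, "html.parser")
img = soup.find("img")

# The image is the only child of its cell, so it has no sibling before it.
sibling_of_img = img.previous_sibling

# find_previous walks backwards through the document and reaches the
# enclosing cell first, because that cell's opening tag precedes the image.
enclosing_cell = img.find_previous("td")

# Stepping up to the parent puts the search at the level of the name cell.
name_cell = img.parent.find_previous_sibling("td")
print(name_cell.get_text(strip=True))
```

Going through `parent` first is what moves the search from the image's level to the row's cells, which is why both steps are needed.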