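You are taking only the first three elements with [:3] instead of skipping them; that slice returns the first three elements of the list: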
DataGrid.find_all('tr')[:3] # first three elements
It should be:
DataGrid.find_all('tr')[3:]  # all but the first three elements
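To make the difference concrete, here is a small sketch that uses a plain list in place of the result of find_all('tr'). The `rows` list and its contents are made up for illustration; only the slicing behaviour is the point.

```python
# Hypothetical stand-in for DataGrid.find_all('tr'); the values are illustrative.
rows = ["header1", "header2", "header3", "data1", "data2"]

first_three = rows[:3]  # only the first three elements
the_rest = rows[3:]     # everything except the first three

print(first_three)  # ['header1', 'header2', 'header3']
print(the_rest)     # ['data1', 'data2']
```

The two slices never overlap, so together they always cover the whole list.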
from bs4 import BeautifulSoup
import requests

r = requests.get('http://www.virginiaequestrian.com/main.cfm?action=greenpages&GPType=8')
soup = BeautifulSoup(r.content, "html.parser")  # name the parser explicitly
tbl = soup.find("table")
# Skip the first three rows, then print the text of every cell.
for tag in tbl.find_all("tr")[3:]:
    for td in tag.find_all('td'):
        print(td.text)
The same slice of tbl.find_all("tr"), with the output from two different parsers:
In [20]: soup=BeautifulSoup(r.content,"html.parser")
In [21]: tbl = soup.find("table")
In [22]: len(tbl.find_all("tr"))
Out[22]: 364
In [23]: len(tbl.find_all("tr")[3:])
Out[23]: 361
In [24]: soup=BeautifulSoup(r.content,"lxml")
In [25]: tbl = soup.find("table")
In [26]: len(tbl.find_all("tr")[3:])
Out[26]: 361
In [27]: len(tbl.find_all("tr"))
Out[27]: 364
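If what you actually want are the "more" hrefs, then do exactly that: get the a tag from each tr. There are also six tr rows before the ones you actually want, so you need to skip 6: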
tbl = soup.find("table")
out = (tag.find('a') for tag in tbl.find_all("tr")[6:])
for a in out:
    print(a["href"])
Output:
main.cfm?action=greenpages&sub=view&ID=9068
main.cfm?action=greenpages&sub=view&ID=9504
main.cfm?action=greenpages&sub=view&ID=10868
main.cfm?action=greenpages&sub=view&ID=10261
main.cfm?action=greenpages&sub=view&ID=10477
main.cfm?action=greenpages&sub=view&ID=10708
main.cfm?action=greenpages&sub=view&ID=11712
main.cfm?action=greenpages&sub=view&ID=12402
main.cfm?action=greenpages&sub=view&ID=12496
..................
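One thing to watch for: tag.find('a') returns None for a row that has no link, and indexing None with ["href"] raises a TypeError. The sketch below shows one way to guard against that. The HTML string is invented for illustration and is not the real page; it skips two leading rows rather than six only because the sample is smaller.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for the real page; the first two rows have no link.
html = """
<table>
  <tr><td>Title</td></tr>
  <tr><td>Name</td><td>More</td></tr>
  <tr><td>Stable A</td><td><a href="main.cfm?ID=1">more</a></td></tr>
  <tr><td>Stable B</td><td>no link here</td></tr>
  <tr><td>Stable C</td><td><a href="main.cfm?ID=2">more</a></td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
tbl = soup.find("table")

# Skip the leading rows, then keep only rows that contain an <a> with an href.
hrefs = []
for tag in tbl.find_all("tr")[2:]:
    a = tag.find("a", href=True)  # None when the row has no link
    if a is not None:
        hrefs.append(a["href"])

print(hrefs)
```

Collecting into a list instead of a generator also means the results can be looped over more than once.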
To use the links, just prepend the main URL:
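# out is a generator, so rebuild it before iterating a second time.
out = (tag.find('a') for tag in tbl.find_all("tr")[6:])
for a in out:
    print("http://www.virginiaequestrian.com/{}".format(a["href"]))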
Output:
http://www.virginiaequestrian.com/main.cfm?action=greenpages&sub=view&ID=9068
http://www.virginiaequestrian.com/main.cfm?action=greenpages&sub=view&ID=9504
http://www.virginiaequestrian.com/main.cfm?action=greenpages&sub=view&ID=10868
http://www.virginiaequestrian.com/main.cfm?action=greenpages&sub=view&ID=10261
http://www.virginiaequestrian.com/main.cfm?action=greenpages&sub=view&ID=10477
http://www.virginiaequestrian.com/main.cfm?action=greenpages&sub=view&ID=10708
http://www.virginiaequestrian.com/main.cfm?action=greenpages&sub=view&ID=11712
http://www.virginiaequestrian.com/main.cfm?action=greenpages&sub=view&ID=12402
http://www.virginiaequestrian.com/main.cfm?action=greenpages&sub=view&ID=12496
http://www.virginiaequestrian.com/main.cfm?action=greenpages&sub=view&ID=12633
http://www.virginiaequestrian.com/main.cfm?action=greenpages&sub=view&ID=13528
If you open the first link, it takes you to the equestrian listing, i.e. the first piece of data you want.
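As an alternative to string formatting, the standard library's urllib.parse.urljoin can build the absolute URLs. This is only a sketch: the `base` and `hrefs` names are introduced here for illustration, and the two relative links are copied from the output above.

```python
from urllib.parse import urljoin

# Base URL of the site; the trailing slash matters for relative resolution.
base = "http://www.virginiaequestrian.com/"

# Two relative hrefs taken from the output shown earlier.
hrefs = [
    "main.cfm?action=greenpages&sub=view&ID=9068",
    "main.cfm?action=greenpages&sub=view&ID=9504",
]

# Resolve each relative href against the base URL.
full_urls = [urljoin(base, h) for h in hrefs]

for url in full_urls:
    print(url)
```

urljoin also copes with hrefs that are already absolute or that start with a slash, which plain concatenation does not.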
Add what you think the first three rows are.