Scraping table data with Python - BeautifulSoup

I can't figure out how to scrape only the first table cell in a row instead of both.

<tr> 
<td>WheelDust 
</td> 
<td>A large puff of barely visible brown dust 
</td></tr>

I only want WheelDust, but instead I get both WheelDust and "A large puff of barely visible brown dust".

import requests 
from bs4 import BeautifulSoup 


r = requests.get("https://wiki.garrysmod.com/page/Effects") 

soup = BeautifulSoup(r.content, "html.parser") 

for td in soup.find_all("table"):
    # print(td)
    for a in td.find_all("tr"):
        # a.text joins the text of every cell in the row
        print(a.text)


2017-08-17 Lopez Nlak

If you don't want to keep iterating after the first match, you can use soup.find instead of soup.find_all. You can also use 'break' once you've found 'WheelDust'. – Landmaster
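To illustrate the difference between the two calls mentioned in this comment, here is a minimal sketch. It runs against a small inline HTML string rather than the live wiki page; the `SAMPLE_HTML` markup, including the second "Sparks" row, is a made-up stand-in and not copied from the site.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for one table on the wiki page.
SAMPLE_HTML = """
<table>
<tr><td>WheelDust</td><td>A large puff of barely visible brown dust</td></tr>
<tr><td>Sparks</td><td>A short burst of sparks</td></tr>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# find() returns only the first matching tag (or None if nothing matches).
first_cell = soup.find("td")
first_name = first_cell.get_text(strip=True)

# find_all() returns every match; break stops the loop once the target is seen.
seen = []
for cell in soup.find_all("td"):
    seen.append(cell.get_text(strip=True))
    if seen[-1] == "WheelDust":
        break

print(first_name)
print(seen)
```

Note that both approaches stop at the very first cell in the document, so they would not by themselves collect the first cell of every row.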

Yes, but it's a table, so I want to find everything in the first column –

Why don't you call a.find('td') once you're inside the tr? – Landmaster

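I'm still not entirely sure what you're asking, but I believe you're saying you want to access the first td, and only the first, correct? If so, wouldn't this work? I would test it myself, but the site says I don't have access.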

import requests 
from bs4 import BeautifulSoup 


r = requests.get("https://wiki.garrysmod.com/page/Effects") 

soup = BeautifulSoup(r.content, "html.parser") 

for td in soup.find_all("table"):
    # print(td)
    for a in td.find_all("tr"):
        # find() returns only the first <td> in the row (None if the row has none)
        print(a.find('td'))
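If you want the cell's text rather than the tag itself, something along these lines might work. It is only a sketch against an inline sample, since the page could not be fetched here; the `SAMPLE_HTML` string and its header row are assumptions about what the table looks like. The `None` check is there because a header row made of `th` cells has no `td`, so `find('td')` returns `None` and calling `.text` on it would raise an `AttributeError`.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; the header row is an assumption about the page.
SAMPLE_HTML = """
<table>
<tr><th>Name</th><th>Description</th></tr>
<tr>
<td>WheelDust
</td>
<td>A large puff of barely visible brown dust
</td></tr>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

first_column = []
for row in soup.find_all("tr"):
    cell = row.find("td")  # first <td> of this row, or None
    if cell is None:
        continue  # skip rows without <td> cells, e.g. header rows
    # get_text(strip=True) drops the tags and the surrounding whitespace
    first_column.append(cell.get_text(strip=True))

print(first_column)
```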


2017-08-17 09:50:34 Landmaster

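Oh, that's what I was looking for. I didn't think of doing that. Thanks. When I add the text attribute, though, it doesn't return the text to me, but the result with the tags –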

Yup! That sounds reasonable. Once you've solved your problem, feel free to click the check mark so the question is marked as resolved. – Landmaster

Try this. It will give you all the data from that table.

import requests
from bs4 import BeautifulSoup

soup = BeautifulSoup(requests.get("https://wiki.garrysmod.com/page/Effects").text, "html.parser")

# Changing the index number will give you whichever table you like
table = soup.find_all('table', attrs={'class': 'wikitable'})[0]
# One list per row, holding the text of each <td> cell
list_of_rows = [[t_data.text for t_data in item.find_all('td')]
                for item in table.find_all('tr')]

for data in list_of_rows: 
    print(data)
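Since the question only asks for the first column, the row lists built above could be reduced to their first element. The sketch below shows one way to do that on an inline sample instead of the live page; `SAMPLE_HTML` is a hypothetical stand-in, and the `wikitable` class is taken from the snippet above. Rows that contain only `th` cells produce an empty list, so they are filtered out before indexing.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup using the same "wikitable" class as above.
SAMPLE_HTML = """
<table class="wikitable">
<tr><th>Name</th><th>Description</th></tr>
<tr><td>WheelDust</td><td>A large puff of barely visible brown dust</td></tr>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
table = soup.find("table", class_="wikitable")

# One list of cell texts per row; header rows yield an empty list.
list_of_rows = [[cell.get_text(strip=True) for cell in row.find_all("td")]
                for row in table.find_all("tr")]

# Keep the first cell of every non-empty row.
first_column = [row[0] for row in list_of_rows if row]

print(first_column)
```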


2017-08-18 19:21:36 SIM
