Web scraping with Beautiful Soup in Python - JavaScript table

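I'm trying to scrape a table from a website, but I can't seem to figure it out with BeautifulSoup in Python. I don't know whether it's because of the table format, but I basically want to turn this table into a CSV.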

from bs4 import BeautifulSoup
import requests

page = requests.get("https://spotwx.com/products/grib_index.php?model=hrrr_wrfprsf&lat=41.03399&lon=-73.76291&tz=America/New_York&display=table")
soup = BeautifulSoup(page.content, 'html.parser')
# prettify is a method, so it has to be called to print the formatted HTML
print(soup.prettify())

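Any suggestions on how to isolate this data table? I've looked through a lot of BeautifulSoup tutorials, but the HTML looks different from most of the examples they reference. Thanks so much for your help.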


2017-10-05 Frank Drin

Give this a try. The table on that site is generated dynamically, so you can't get the result using requests alone.

import csv

from bs4 import BeautifulSoup
from selenium import webdriver

# Let a real browser run the page's JavaScript, then grab the rendered HTML.
driver = webdriver.Chrome()
driver.get("https://spotwx.com/products/grib_index.php?model=hrrr_wrfprsf&lat=41.03399&lon=-73.76291&tz=America/New_York&display=table")
soup = BeautifulSoup(driver.page_source, 'lxml')
driver.quit()

# Collect the text of every <td> cell, one list per table row.
table = soup.select("table#example")[0]
list_row = [[tab_d.text for tab_d in item.select('td')]
            for item in table.select('tr')]

# Write the rows to CSV; the with-block closes the file even if an error occurs.
with open("spotwx.csv", "w", newline='') as outfile:
    writer = csv.writer(outfile)
    for data in list_row:
        print(' '.join(data))
        writer.writerow(data)


2017-10-05 20:59:46 SIM

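Thank you very much for the reply. I'm not familiar with WebDriver, and I don't need live refreshing (I'd rather not use WebDriver unless it's absolutely necessary). It looks like a plain requests pull already shows the necessary data in the soup.prettify() output, but I don't know how to extract it into a table. Thanks again for your help!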
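If the data really is visible in the raw requests response, it is probably embedded in an inline script as a JavaScript array rather than as table markup. The following is only a rough sketch of how such an array could be pulled out with the standard library. The variable name `aDataSet`, the sample rows, and the helper names `extract_js_array` and `rows_to_csv` are all assumptions made for illustration, not taken from the actual page, and the sketch also assumes the array literal happens to be valid JSON (double-quoted strings, no trailing commas).

```python
import csv
import io
import json
import re

# Hypothetical page source: the variable name `aDataSet` and the row values
# are made up for illustration. Inspect the real response to find the real name.
SAMPLE_HTML = """
<html><body>
<table id="example"></table>
<script>
var aDataSet = [
  ["2017/10/05 15:00", "21.4", "8"],
  ["2017/10/05 16:00", "20.9", "10"]
];
</script>
</body></html>
"""


def extract_js_array(html, var_name):
    """Return the array assigned to `var_name` in an inline script, parsed as JSON."""
    # Capture everything from the opening '[' up to the first ']' that is
    # followed (after optional whitespace) by a semicolon.
    pattern = r"var\s+" + re.escape(var_name) + r"\s*=\s*(\[.*?\])\s*;"
    match = re.search(pattern, html, re.DOTALL)
    if match is None:
        raise ValueError(f"no array named {var_name!r} found")
    return json.loads(match.group(1))


def rows_to_csv(rows):
    """Serialize a list of rows to CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


rows = extract_js_array(SAMPLE_HTML, "aDataSet")
print(rows_to_csv(rows))
```

With a real page, `SAMPLE_HTML` would be replaced by `page.text` from the requests call in the question. If the embedded literal is not valid JSON (single-quoted strings, for example), `json.loads` will raise an error and a more forgiving parser would be needed.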

When I try the code above, I get this error: selenium.common.exceptions.WebDriverException: Message: 'chromedriver' executable needs to be in PATH. Please see https://sites.google.com/a/chromium.org/chromedriver/home

The first one should work. If it doesn't, try the second. 1. 'driver = webdriver.Chrome('C:/path/to/chromedriver.exe')' 2. 'driver = webdriver.Chrome('/path/to/chromedriver')'. Btw, you have to adjust the path to match your own system. Thanks. - SIM
