2017-07-27 47 views
3

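Python requests sometimes returns an empty list

I have been trying to scrape "2005 - 2013" from the text "Drink between 2005 - 2013". At first this code worked for me, but now I only get an empty list back, even though the request still returns a 200 status code.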

import csv
import lxml.html
import requests

# Send a browser-like User-Agent header
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36'}
page = requests.get('http://www.cellartracker.com/wine.asp?iWine=91411', headers=headers)
print(page.status_code)
html = lxml.html.fromstring(page.content)
# Text of every link titled "Source: Community"
content_divs = html.xpath('//a[@title="Source: Community"]/text()')
print(content_divs)

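I am not sure whether I should switch to Selenium for this scraping, since it is a JS site. If so, I do not know how to do that either, so some basic help would be appreciated. Thanks!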
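One way to narrow down a 200 response that yields an empty list is to check whether the phrase is present in the raw response text at all. If it is missing, the server sent a page without the data (for example one that fills it in with JavaScript), and no XPath will find it; if it is present, the XPath is the more likely culprit. The sketch below is only an illustration: `find_drink_window` is a hypothetical helper, and both sample strings are made-up stand-ins for `page.text`, not the site's real markup.

```python
import re

# Hypothetical helper and patterns, not part of the original code.
TAG = re.compile(r"<[^>]+>")
DRINK_WINDOW = re.compile(r"Drink between\s+(\d{4})\s*-\s*(\d{4})")


def find_drink_window(html_text):
    """Return (start_year, end_year) if the phrase occurs in the HTML, else None."""
    # Drop tags and collapse whitespace so markup between the words
    # and the years does not break the match.
    text = " ".join(TAG.sub(" ", html_text).split())
    match = DRINK_WINDOW.search(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Made-up samples standing in for page.text (assumed markup).
with_data = 'Drink between <a title="Source: Community">2005 - 2013</a>'
without_data = '<html><body><script src="app.js"></script></body></html>'

print(find_drink_window(with_data))     # (2005, 2013)
print(find_drink_window(without_data))  # None
```

Calling `find_drink_window(page.text)` on a run that prints an empty list would show which of the two cases applies.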

+0

If it is a JS site, you will definitely need `Selenium` or a similar tool to scrape it – gaback

+0

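I get the expected result, so I am not sure why it stopped working for you. Are you parsing the same site over and over and only sometimes getting an empty list? If you want a reference for scraping with `Selenium`, I just [answered](https://stackoverflow.com/a/45315393/5103802) a question about that. –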

Answers

1

Use Selenium:

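from selenium import webdriver
from selenium.webdriver.common.by import By

url = "https://www.cellartracker.com/wine.asp?iWine=91411"

# Selenium 3 style; Selenium 4 takes a Service object instead of executable_path
driver = webdriver.Chrome(executable_path="chromedriver2.25")
driver.get(url)
# Every <li> whose text contains "review"
reviews = driver.find_elements(By.XPATH, "//li[contains(.,'review')]")
for item in reviews:
    print(item.text)
    print("---")
driver.quit()  # close the browser when done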

Output:

Options 
1/4/2014 - REUBENSHAPCOTT WROTE: 
91 Points 
Delicious! Had no idea that Australia made port this good, and affordable. Terrific, smooth fig and plum. Aged and neither sharp nor grapey. If you see it, buy it. 
Do you find this review helpful? Yes - No/Comment 
--- 
Options 
1/20/2013 - LISAADAM WROTE: 
85 Points 
The wine looks Tawny colored. 
Do you find this review helpful? Yes - No/Comment 
--- 
Options 
12/22/2012 - WINEAGGREGATE LIKES THIS WINE: 
90 Points 
Molasses, pepper, butterscotch candy that's been roasted a bit. Very nice. 
Do you find this review helpful? Yes - No/Comment 
--- 
Options 
10/30/2011 - GTI2TON WROTE: 
87 Points 
Sweeter than average tawny and straightforward, but still has nice richness in its raisin and light carmel notes. Good value. 
Do you find this review helpful? Yes - No/Comment
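If the reviews are needed as data rather than printed text, each block can be split into fields. The following is a rough sketch that assumes every `item.text` has the same line layout as the output above; `parse_review` and its regular expressions are hypothetical names introduced here, and the sample string is copied from that output.

```python
import re
from datetime import datetime

# Patterns inferred from the printed output above (an assumption, not a site spec).
HEADER = re.compile(r"^(\d{1,2}/\d{1,2}/\d{4}) - (.+?) (?:WROTE|LIKES THIS WINE):$")
POINTS = re.compile(r"^(\d+) Points$")
FOOTER = "Do you find this review helpful?"


def parse_review(text):
    """Split one review block into date, user, points and note."""
    record = {"date": None, "user": None, "points": None, "note": ""}
    note_lines = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and the widget lines around the review.
        if not line or line == "Options" or line.startswith(FOOTER):
            continue
        header = HEADER.match(line)
        points = POINTS.match(line)
        if header:
            # Dates in the output are month/day/year.
            record["date"] = datetime.strptime(header.group(1), "%m/%d/%Y").date()
            record["user"] = header.group(2)
        elif points:
            record["points"] = int(points.group(1))
        else:
            note_lines.append(line)
    record["note"] = " ".join(note_lines)
    return record


sample = (
    "Options \n"
    "1/20/2013 - LISAADAM WROTE: \n"
    "85 Points \n"
    "The wine looks Tawny colored. \n"
    "Do you find this review helpful? Yes - No/Comment "
)
review = parse_review(sample)
print(review["date"], review["user"], review["points"], review["note"])
```

In the Selenium loop, `parse_review(item.text)` could replace the `print(item.text)` call; reviews without a score would simply keep `points` as `None`.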