Reading cookies from a Splash request

I am trying to access the cookies after making a request with Splash. Below is how I build the request.
script = """
function main(splash)
  splash:init_cookies(splash.args.cookies)
  assert(splash:go{
    splash.args.url,
    headers=splash.args.headers,
    http_method=splash.args.http_method,
    body=splash.args.body,
  })
  assert(splash:wait(0.5))
  local entries = splash:history()
  local last_response = entries[#entries].response
  return {
    url = splash:url(),
    headers = last_response.headers,
    http_status = last_response.status,
    cookies = splash:get_cookies(),
    html = splash:html(),
  }
end
"""
req = SplashRequest(
    url,
    self.parse_page,
    args={
        'wait': 0.5,
        'lua_source': script,
        'endpoint': 'execute'
    }
)
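The script is an exact copy of the one in the Splash documentation.
So I am trying to access the cookies that are set on the web page. When I don't use Splash, the code below works as I expect, but it does not work when using Splash.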
self.logger.debug('Cookies: %s', response.headers.get('Set-Cookie'))
When using Splash, this returns:
2017-01-03 12:12:37 [spider] DEBUG: Cookies: None
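When I don't use Splash, this code works and returns the cookies provided by the web page.
The Splash documentation shows this code as an example: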
def parse_result(self, response):
    # here response.body contains result HTML;
    # response.headers are filled with headers from last
    # web page loaded to Splash;
    # cookies from all responses and from JavaScript are collected
    # and put into Set-Cookie response header, so that Scrapy
    # can remember them.
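I'm not sure whether I understand this correctly, but I would say I should be able to access the cookies in the same way as when I don't use Splash.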
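To make that expectation concrete: if the cookies really are merged into the Set-Cookie response header, they could be read from the raw header values. Below is a minimal sketch using only the Python standard library. The helper name `parse_set_cookie_headers` and the sample header values are hypothetical, and it assumes the input comes from something like `response.headers.getlist('Set-Cookie')`, which in Scrapy returns a list of byte strings.

```python
from http.cookies import SimpleCookie


def parse_set_cookie_headers(raw_headers):
    """Turn raw Set-Cookie header values into a {name: value} dict.

    raw_headers: iterable of bytes or str, one item per Set-Cookie header.
    """
    cookies = {}
    for raw in raw_headers:
        if isinstance(raw, bytes):
            raw = raw.decode("latin-1")  # header values arrive as bytes
        jar = SimpleCookie()
        jar.load(raw)
        # Attributes such as Path or HttpOnly are stored on the morsel,
        # so only the actual cookie names show up as keys here.
        for name, morsel in jar.items():
            cookies[name] = morsel.value
    return cookies


# Hypothetical header values, for illustration only.
sample = [b"sessionid=abc123; Path=/; HttpOnly", b"lang=en; Path=/"]
parsed = parse_set_cookie_headers(sample)
```

In a spider callback this would be called as `parse_set_cookie_headers(response.headers.getlist('Set-Cookie'))`. Using `getlist` rather than `get` matters when several cookies are set, because `get` returns only a single header value.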
Middleware settings:
# Download middlewares
DOWNLOADER_MIDDLEWARES = {
    # Use a random user agent on each request
    'crawling.middlewares.RandomUserAgentDownloaderMiddleware': 400,
    'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware': None,
    'scrapy.downloadermiddlewares.cookies.CookiesMiddleware': 700,
    # Enable crawlera proxy
    'scrapy_crawlera.CrawleraMiddleware': 600,
    # Enable Splash to render javascript
    'scrapy_splash.SplashCookiesMiddleware': 723,
    'scrapy_splash.SplashMiddleware': 725,
    'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware': 810,
}
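So my question is: how do I access the cookies when using a Splash request?
I added the endpoint to the request, but with no result. response.headers.get('Set-Cookie') still returns None. For response.cookiejar I get an error: AttributeError: 'SplashTextResponse' object has no attribute 'cookiejar' – Casper
@Casper - are you sure all the described options are set in settings.py? Is scrapy_splash.SplashCookiesMiddleware added to DOWNLOADER_MIDDLEWARES? –
I have updated the question with the DOWNLOADER_MIDDLEWARES setting. – Casper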
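One thing that may be worth checking, offered as a guess rather than a confirmed diagnosis: in the request shown in the question, 'endpoint': 'execute' sits inside args, whereas SplashRequest accepts endpoint as its own keyword argument (defaulting to render.html, as far as I know). If the Lua script is never run through the execute endpoint, that would be consistent with getting a SplashTextResponse instead of a JSON response. The sketch below only illustrates separating the two; the helper name `split_endpoint` and the placeholder script are hypothetical.

```python
DEFAULT_ENDPOINT = "render.html"  # assumed SplashRequest default endpoint


def split_endpoint(args):
    """Return (endpoint, args) with any 'endpoint' key moved out of args."""
    args = dict(args)  # copy so the caller's dict is not modified
    endpoint = args.pop("endpoint", DEFAULT_ENDPOINT)
    return endpoint, args


endpoint, splash_args = split_endpoint({
    "wait": 0.5,
    "lua_source": "function main(splash) return {} end",  # placeholder script
    "endpoint": "execute",
})

# In the spider, the request would then be built along the lines of:
#   SplashRequest(url, self.parse_page, endpoint=endpoint, args=splash_args)
```

With the endpoint passed separately, the table returned by the Lua script (including its cookies and headers keys) should come back as JSON, which is what the cookie handling in scrapy-splash appears to rely on.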