Platform: Debian 8 + Python 3.4 + Scrapy 1.3.2. Here is my spider, which downloads some URLs from yahoo.com. Why aren't the error messages logged to the specified file?

import scrapy
import csv


class TestSpider(scrapy.Spider):
    name = "quote"
    allowed_domains = ["yahoo.com"]
    start_urls = ['url1', 'url2', 'url3']  # ... up to 'urls100' (about 100 URLs, elided)

    def parse(self, response):
        # save the response body as /tmp/<ticker>.csv, e.g. /tmp/GLU.csv
        filename = response.url.split("=")[1]
        open('/tmp/' + filename + '.csv', 'wb').write(response.body)

When it runs, some error messages appear:

2017-02-19 21:28:27 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response 
<404 https://chart.yahoo.com/table.csv?s=GLU>: HTTP status code is not handled or not allowed 

https://chart.yahoo.com/table.csv?s=GLU is one of the start_urls.
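For context (this note is not part of the original post): that line comes from Scrapy's built-in HttpErrorMiddleware, which by default drops non-2xx responses before they reach parse(). One project-wide way to let such responses through is the HTTPERROR_ALLOWED_CODES setting; a minimal sketch:

# settings.py -- assumes the default HttpErrorMiddleware is enabled
HTTPERROR_ALLOWED_CODES = [404]   # pass 404 responses on to the spider callback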

Now I want to capture those error messages.

import scrapy
import csv

import logging
from scrapy.utils.log import configure_logging

# Keep Scrapy from installing its own root log handler, then route all log
# records to /tmp/log.txt through the standard logging module
configure_logging(install_root_handler=False)
logging.basicConfig(
    filename='/tmp/log.txt',
    format='%(levelname)s: %(message)s',
    level=logging.INFO
)


class TestSpider(scrapy.Spider):
    name = "quote"
    allowed_domains = ["yahoo.com"]
    start_urls = ['url1', 'url2', 'url3']  # ... up to 'url100' (elided)

    def parse(self, response):
        filename = response.url.split("=")[1]
        open('/tmp/' + filename + '.csv', 'wb').write(response.body)

Why isn't an error message such as

2017-02-19 21:28:27 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://chart.yahoo.com/table.csv?s=GLU>: HTTP status code is not handled or not allowed

logged to /home/log.txt?
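As a side note (this is not part of the original question): instead of wiring up the logging module by hand, Scrapy's own log output can be sent to a file with the built-in LOG_FILE setting, e.g. in the project's settings.py or via scrapy crawl quote -s LOG_FILE=/tmp/log.txt. A minimal settings.py sketch:

# settings.py -- built-in Scrapy settings; /tmp/log.txt is just an example path
LOG_FILE = '/tmp/log.txt'
LOG_LEVEL = 'INFO'   # the "Ignoring response" messages are logged at INFO level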

Following eLRuLL's suggestion, I added handle_httpstatus_list = [404]:

import scrapy
import csv

import logging
from scrapy.utils.log import configure_logging

configure_logging(install_root_handler=False)
logging.basicConfig(
    filename='/home/log.txt',
    format='%(levelname)s: %(message)s',
    level=logging.INFO
)


class TestSpider(scrapy.Spider):
    handle_httpstatus_list = [404]  # let 404 responses reach the callback
    name = "quote"
    allowed_domains = ["yahoo.com"]
    start_urls = ['url1', 'url2', 'url3']  # ... up to 'url100' (elided)

    def parse(self, response):
        filename = response.url.split("=")[1]
        open('/tmp/' + filename + '.csv', 'wb').write(response.body)

The error messages still aren't logged to /home/log.txt. Why?

Answer

Use the handle_httpstatus_list attribute on your spider so that 404 responses are handled instead of being ignored:

class TestSpider(scrapy.Spider): 
    handle_httpstatus_list = [404] 
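With that attribute set, 404 responses are no longer filtered out by HttpErrorMiddleware and are passed to parse(), where they can be logged explicitly. A minimal sketch (not from the original answer) using the spider's built-in logger:

import scrapy


class TestSpider(scrapy.Spider):
    name = "quote"
    allowed_domains = ["yahoo.com"]
    handle_httpstatus_list = [404]         # let 404 responses reach parse()
    start_urls = ['url1', 'url2', 'url3']  # trimmed; the question lists ~100 URLs

    def parse(self, response):
        if response.status == 404:
            # self.logger goes through the standard logging module, so the
            # message ends up wherever logging (or LOG_FILE) is configured to write
            self.logger.info("Got 404 for %s", response.url)
            return
        filename = response.url.split("=")[1]
        with open('/tmp/' + filename + '.csv', 'wb') as f:
            f.write(response.body)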