Choosing the file name in Scrapy


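I have a list of URLs, each associated with a unique ID. I want to use Scrapy to download each URL and save it to a file named after its unique ID. I went through a basic tutorial and ended up with the code below, but I don't know how to get hold of the uid when saving the file in parse.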

import scrapy
import json


class QuotesSpider(scrapy.Spider):
    name = "quotes"

    def start_requests(self):
        # Each entry in url_info.json has a 'url' and a 'uid' key
        with open('url_info.json') as f:
            urls = json.load(f)
        for entity in urls:
            url = entity['url']
            uid = entity['uid']  # unique id
            request_object = scrapy.Request(url=url, callback=self.parse)
            request_object.meta['uid'] = uid
            yield request_object

    def parse(self, response):
        filename = 'quotes-unique-id.html'  # can I access uid here?
        with open(filename, 'wb') as f:
            f.write(response.body)

Source

2017-08-16 comiventor

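Not sure why this question was downvoted. @tomáš-linhart I tried this solution earlier, but it gave me a KeyError. That's why I edited my code and removed the line containing what you suggested. Downvotes without a stated reason are frustrating :( – comiventor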

In the parse method, you can get the uid from the response's meta attribute like this:

filename = 'quotes-{}.html'.format(response.meta['uid'])

Source

2017-08-16 05:33:19

Or f"quotes-{response.meta['uid']}" if you're running py3.6 :) – Granitosaurus

@Granitosaurus [This](https://www.python.org/dev/peps/pep-0498/) is really cool, I didn't know about it. It's probably time to move on from 2.7... :-) –
