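I'm posting here the code I used to produce a MultiCSVItemPipeline, based on drcolossos's answer above.
This pipeline assumes that all Item classes follow the naming convention *Item (e.g. TeamItem, EventItem). It creates the files team.csv, event.csv and so on, and sends each record to the csv file that matches its item type.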
from scrapy import signals
from scrapy.exporters import CsvItemExporter

# Assumption: CSVDir is not defined in the original post. It is the output
# directory prefix (including a trailing separator); '' means the working directory.
CSVDir = ''


def item_type(item):
    # TeamItem => team
    return type(item).__name__.replace('Item', '').lower()


class MultiCSVItemPipeline(object):
    SaveTypes = ['team', 'club', 'event', 'match']

    @classmethod
    def from_crawler(cls, crawler):
        # scrapy.xlib.pydispatch is no longer available in current Scrapy,
        # so the signals are connected through the crawler instead.
        pipeline = cls()
        crawler.signals.connect(pipeline.spider_opened, signal=signals.spider_opened)
        crawler.signals.connect(pipeline.spider_closed, signal=signals.spider_closed)
        return pipeline

    def spider_opened(self, spider):
        # One binary file and one exporter per item type
        self.files = {name: open(CSVDir + name + '.csv', 'w+b') for name in self.SaveTypes}
        self.exporters = {name: CsvItemExporter(self.files[name]) for name in self.SaveTypes}
        for e in self.exporters.values():
            e.start_exporting()

    def spider_closed(self, spider):
        for e in self.exporters.values():
            e.finish_exporting()
        for f in self.files.values():
            f.close()

    def process_item(self, item, spider):
        what = item_type(item)
        if what in self.SaveTypes:
            self.exporters[what].export_item(item)
        return item
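
The routing rule can be checked on its own, without running a spider. Below is a minimal sketch that uses only the standard library. The three classes are hypothetical stand-ins (real ones would subclass scrapy.Item), and target_file is a helper made up for this illustration, not part of the pipeline:

```python
# Hypothetical stand-in item classes; a real project would subclass scrapy.Item.
class TeamItem(dict):
    pass


class EventItem(dict):
    pass


class PlayerItem(dict):
    pass


SAVE_TYPES = ['team', 'club', 'event', 'match']  # mirrors SaveTypes above


def item_type(item):
    # Same rule as in the pipeline: TeamItem => team
    return type(item).__name__.replace('Item', '').lower()


def target_file(item):
    """Return the csv file name an item would be routed to, or None if it is skipped."""
    what = item_type(item)
    return what + '.csv' if what in SAVE_TYPES else None


print(target_file(TeamItem(name='Ajax')))    # team.csv
print(target_file(EventItem(kind='goal')))   # event.csv
print(target_file(PlayerItem(name='X')))     # None: 'player' is not in SAVE_TYPES
```

Items whose type is not listed are not written anywhere, but process_item still returns them, so later pipelines see them unchanged. As with any Scrapy pipeline, this one only runs once it is listed in ITEM_PIPELINES in the project's settings.py.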
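OK, I feel better after writing the MultiCSVItemPipeline :-). I check the item class, as you suggested, to figure out where each item goes. I've posted my own answer to show the code to anyone with the same problem. – Diomedes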