2015-10-22 78 views

Changing Scrapy JSON output

I am using Scrapy to export JSON from my spider through a pipeline, and I want to wrap each JSON item in a product object.

I am using JsonLinesItemExporter.

Currently, my JSON looks like this:

{"name": "Protective iPhone Stand Case",
    "link": "https://things.com/899029978367138670/Strap-On-SoftRack-Roof-Rack-by-Otium",
    "category_old": "Sports & Outdoors",
    "image_url": "https://thingd-media-ec1.com/default/899029978367138670_42120cf10765.jpg",
    "price": "160",
    "interest": "13",
    "company": "ACME",
    "country": "USA"}

I would like it to look like this instead:

"product": {
    "name": "Protective iPhone Stand Case", 
    "link": "https://things.com/899029978367138670/Strap-On-SoftRack-Roof-Rack-by-Otium", 
    "category_old": "Sports & Outdoors", 
    "image_url": "https://thingd-media-ec1.com/default/899029978367138670_42120cf10765.jpg", 
    "price": "160", 
    "interest": "13", 
    "company": "ACME", 
    "country": "USA" 
} 

So how do I wrap each item in a product object?

Here is my pipeline code:

import requests
import time
from scrapy.utils.project import get_project_settings
import sys
import json
from scrapy import signals
from scrapy.exporters import JsonLinesItemExporter

SETTINGS = get_project_settings()

class FancyPipeline(object):

    def __init__(self):
        # Instantiate API connection
        self.files = {}
        url = 'http://unshakable-missile-106309.nitrousapp.com:3000/api/v1/imports'

    @classmethod
    def from_crawler(cls, crawler):
        pipeline = cls()
        crawler.signals.connect(pipeline.spider_opened, signals.spider_opened)
        crawler.signals.connect(pipeline.spider_closed, signals.spider_closed)
        return pipeline

    def spider_opened(self, spider):
        # Open a static/dynamic file to read and write to
        file = open('%s_items.json' % spider.name, 'w+b')
        self.files[spider] = file
        self.exporter = JsonLinesItemExporter(file)
        self.exporter.start_exporting()

    def spider_closed(self, spider):
        self.exporter.finish_exporting()
        file = self.files.pop(spider)
        file.close()

    def process_item(self, item, spider):
        self.exporter.export_item(item)
        return item

Answers


I was able to do this with the following code:

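    def spider_opened(self, spider):
        # Open the output file in binary mode; bytes are written to it below
        file = open('%s_items.json' % spider.name, 'w+b')
        self.files[spider] = file
        # Write the opening of the wrapper object before any items are exported
        file.write(b'''{
    "product": [''')
        self.exporter = JsonLinesItemExporter(file)
        self.exporter.start_exporting()

    def spider_closed(self, spider):
        self.exporter.finish_exporting()
        file = self.files.pop(spider)
        # Close the array and the wrapper object.
        # Note: JsonLinesItemExporter writes one item per line with no commas
        # between them, so the file is strictly valid JSON only for a single item.
        file.write(b"]}")
        file.close()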
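
Another option, if the goal is for every exported line to be its own {"product": {...}} object as in the question, is to skip the exporter and serialize each item with the standard json module. The following is only a minimal sketch under some assumptions: the class name ProductWrapperPipeline is hypothetical, items are assumed to be plain dicts or scrapy.Item instances (so that dict(item) works), and the output file name follows the pattern used in the question. It relies on the open_spider / close_spider / process_item methods that Scrapy calls on item pipelines, so no signal wiring is needed.

```python
import json


class ProductWrapperPipeline:
    """Hypothetical pipeline: writes one {"product": {...}} object per line."""

    def open_spider(self, spider):
        # Text mode with an explicit encoding, so str can be written directly.
        self.file = open('%s_items.json' % spider.name, 'w', encoding='utf-8')

    def close_spider(self, spider):
        self.file.close()

    def process_item(self, item, spider):
        # Assumption: item is a dict or scrapy.Item, both accepted by dict().
        line = json.dumps({'product': dict(item)})
        self.file.write(line + '\n')
        # Return the item so later pipelines still receive it.
        return item
```

Because each line is a complete JSON object, the file can be read back line by line with json.loads, which would not be possible with a single array that lacks commas between its items. The class would still need to be listed in the project's ITEM_PIPELINES setting to take effect.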