I am trying to scrape data from Amazon with Scrapy. I can export the data to CSV, but I cannot insert it into a MySQL database. Please look at my code below: how can my spider insert the scraped data into MySQL?
import scrapy
from craigslist_sample.items import AmazonDepartmentItem
from scrapy.contrib.spiders import CrawlSpider, Rule
from scrapy.contrib.linkextractors import LinkExtractor

class AmazonAllDepartmentSpider(scrapy.Spider):
    name = "amazon"
    allowed_domains = ["amazon.com"]
    start_urls = [
        "http://www.amazon.com/gp/site-directory/ref=nav_sad/187-3757581-3331414"
    ]

    def parse(self, response):
        for sel in response.xpath('//ul/li'):
            item = AmazonDepartmentItem()
            item['title'] = sel.xpath('a/text()').extract()
            item['link'] = sel.xpath('a/@href').extract()
            item['desc'] = sel.xpath('text()').extract()
            # yield inside the loop so every <li> produces an item;
            # a bare `return item` emits at most one item
            yield item
My pipeline code is:
import sys
import MySQLdb
import hashlib
from scrapy.exceptions import DropItem
from scrapy.http import Request

class MySQLStorePipeline(object):
    host = 'rerhr.com'
    user = 'amazon'
    password = 'sads23'
    db = 'amazon_project'

    def __init__(self):
        self.connection = MySQLdb.connect(self.host, self.user, self.password, self.db)
        self.cursor = self.connection.cursor()

    def process_item(self, item, spider):
        try:
            # extract() returns a list of strings, so take the first
            # element before encoding; encode() on the list itself fails
            link = item['link'][0].encode('utf-8') if item['link'] else None
            self.cursor.execute(
                """INSERT INTO amazon_project.ProductDepartment (ProductDepartmentLilnk)
                   VALUES (%s)""",
                (link,))  # parameters must be a tuple
            self.connection.commit()
        except MySQLdb.Error, e:
            print "Error %d: %s" % (e.args[0], e.args[1])
        return item
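One detail worth stressing about the pipeline above: `.extract()` returns a *list* of strings, so `item['link']` is a list, and calling `.encode('utf-8')` directly on it raises `AttributeError` (which the `except` block then swallows, so nothing reaches MySQL). A minimal sketch of pulling out the first extracted value; the helper name `first_or_none` is mine, not part of Scrapy:

```python
def first_or_none(values):
    """Return the first extracted string, or None when the XPath matched nothing."""
    return values[0] if values else None

# Hypothetical values, shaped like what .extract() returns:
links = ['http://www.amazon.com/gp/site-directory/']
print(first_or_none(links))  # first matched href
print(first_or_none([]))     # None when nothing matched
```

Passing `None` into the parameterized `INSERT` stores SQL `NULL`, which is usually preferable to silently dropping the row.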
When I run the following command:

    scrapy crawl amazon -o items.csv -t csv

I can get the data into my CSV. But when I run:

    scrapy crawl amazon

with the code above, no data is inserted into MySQL. Please help me understand what I have to do so that the data gets inserted into MySQL.

Thanks
What is in the console? Any errors? Is the pipeline enabled in the settings? Are you sure you are checking for results in the same database you are inserting into? Thanks. – alecxe 2014-12-05 15:02:09
My pipeline setting is ITEM_PIPELINES = ['projectname.pipelines.MySQLStorePipeline'], and yes, I am checking the same database – wiretext 2014-12-05 15:19:24
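For reference, the pipeline must be registered in the project's `settings.py`. In Scrapy versions of that era a plain list of dotted paths works, as shown in the comment above; current Scrapy expects a dict mapping each dotted path to an order number. A sketch, with `projectname` standing in for the actual project module:

```python
# settings.py -- 'projectname' is a placeholder for your project's module name
ITEM_PIPELINES = {
    'projectname.pipelines.MySQLStorePipeline': 300,  # lower values run earlier (range 0-1000)
}
```

If this setting is missing or the dotted path is misspelled, Scrapy never calls `process_item`, which would also explain an empty table despite a working CSV export.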