2013-08-26 41 views

Scrapy: MySQL pipeline - encountering unexpected errors

I am getting several errors, depending on what is being inserted or updated.

Here is the code that processes the items:

def process_item(self, item, spider):
    try:
        if 'producer' in item:
            self.cursor.execute("""INSERT INTO Producers (title, producer) VALUES (%s, %s)""", (item['title'], item['producer']))
        elif 'actor' in item:
            self.cursor.execute("""INSERT INTO Actors (title, actor) VALUES (%s, %s)""", (item['title'], item['actor']))
        elif 'director' in item:
            self.cursor.execute("""INSERT INTO Directors (title, director) VALUES (%s, %s)""", (item['title'], item['director']))
        else:
            self.cursor.execute("""UPDATE example_movie SET distributor=%S, rating=%s, genre=%s, budget=%s WHERE title=%s""", (item['distributor'], item['rating'], item['genre'], item['budget'], item['title']))
        self.conn.commit()
    except MySQLdb.Error, e:
        print "Error %d: %s" % (e.args[0], e.args[1])

    return item

Here is an example of the items returned by the scraper:

[{'budget': [u'N/A'], 'distributor': [u'Lorimar'], 'genre': [u'Action'], 'rating': [u'R'],'title': [u'Action Jackson']}, {'actor': u'Craig T. Nelson', 'title': [u'Action Jackson']}, {'actor': u'Sharon Stone', 'title': [u'Action Jackson']}, {'actor': u'Carl Weathers', 'title': [u'Action Jackson']}, {'producer': u'Joel Silver', 'title': [u'Action Jackson']}, {'director': u'Craig R. Baxley', 'title': [u'Action Jackson']}] 

Here are the errors returned:

2013-08-25 23:04:57-0500 [ActorSpider] ERROR: Error processing {'budget': [u'N/A'], 
'distributor': [u'Lorimar'], 
'genre': [u'Action'], 
'rating': [u'R'], 
'title': [u'Action Jackson']} 
Traceback (most recent call last): 
    File "/Library/Python/2.7/site-packages/scrapy/middleware.py", line 62, in _process_chain 
    return process_chain(self.methods[methodname], obj, *args) 
    File "/Library/Python/2.7/site-packages/scrapy/utils/defer.py", line 65, in process_chain 
    d.callback(input) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/twisted/internet/defer.py", line 361, in callback 
    self._startRunCallbacks(result) 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/twisted/internet/defer.py", line 455, in _startRunCallbacks 
    self._runCallbacks() 
--- <exception caught here> --- 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/twisted/internet/defer.py", line 542, in _runCallbacks 
    current.result = callback(current.result, *args, **kw) 
    File "/Users/fortylashes/Documents/Management_Work/BoxOfficeMojo/BoxOfficeMojo/pipelines.py", line 53, in process_item 
    self.cursor.execute("""UPDATE example_movie SET distributor=%S, rating=%s, genre=%s, budget=%s WHERE title=%s""", (item['distributor'], item['rating'], item['genre'], item['budget'], item['title'])) 
    File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/MySQLdb/cursors.py", line 159, in execute 
    query = query % db.literal(args) 
exceptions.ValueError: unsupported format character 'S' (0x53) at index 38 

    Error 1064: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '), 'Craig T. Nelson')' at line 1 
    2013-08-25 23:04:57-0500 [ActorSpider] DEBUG: Scraped from <200 http://www.boxofficemojo.com/movies/?id=actionjackson.htm> 
{'actor': u'Craig T. Nelson', 'title': [u'Action Jackson']} 
    Error 1064: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '), 'Sharon Stone')' at line 1 
    2013-08-25 23:04:57-0500 [ActorSpider] DEBUG: Scraped from <200 http://www.boxofficemojo.com/movies/?id=actionjackson.htm> 
{'actor': u'Sharon Stone', 'title': [u'Action Jackson']} 
    Error 1064: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '), 'Carl Weathers')' at line 1 
    2013-08-25 23:04:57-0500 [ActorSpider] DEBUG: Scraped from <200 http://www.boxofficemojo.com/movies/?id=actionjackson.htm> 
{'actor': u'Carl Weathers', 'title': [u'Action Jackson']} 
    Error 1064: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '), 'Joel Silver')' at line 1 
    2013-08-25 23:04:57-0500 [ActorSpider] DEBUG: Scraped from <200 http://www.boxofficemojo.com/movies/?id=actionjackson.htm> 
{'producer': u'Joel Silver', 'title': [u'Action Jackson']} 
    Error 1064: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '), 'Craig R. Baxley')' at line 1 
    2013-08-25 23:04:57-0500 [ActorSpider] DEBUG: Scraped from <200 http://www.boxofficemojo.com/movies/?id=actionjackson.htm> 
{'director': u'Craig R. Baxley', 'title': [u'Action Jackson']} 

Clearly there are a lot of problems. Thanks for reading! Any and all suggestions or ideas are greatly appreciated!

:::: UPDATE/MORE INFO ::::

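It appears that three movies, out of the 52 being tested, are being inserted into the Actors, Producers, and Directors tables. Note that the UPDATE statement is not working at all.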

These movies are: Abraham Lincoln: Vampire Hunter; Ace Ventura: Pet Detective; Ace Ventura: When Nature Calls

Interestingly, these are all movies with a ":" in the title. I don't know what that means, but if anyone has an idea, please share!

::::: INSERT SOLVED :::::

It turns out the problem was caused by the scraper wrapping individual values in lists, so an item looked like {'actor': [u'this one guy']} as opposed to {'actor': u'this one guy'}.
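As a minimal sketch of that fix, the values can be unwrapped before they are passed to cursor.execute(). The helper names first_value and flatten_item are hypothetical (they are not part of Scrapy or MySQLdb), and the sketch assumes every wrapped field holds exactly one element. The "near '), 'Craig T. Nelson')'" fragment in the error log is consistent with a list value being rendered as a parenthesized group inside VALUES (...), though that reading is an inference from the log rather than something the post confirms.

```python
def first_value(value):
    """Return the only element of a one-item list or tuple; otherwise return value unchanged."""
    if isinstance(value, (list, tuple)) and len(value) == 1:
        return value[0]
    return value


def flatten_item(item):
    """Build a plain dict in which every one-element list is replaced by its element."""
    return {key: first_value(value) for key, value in item.items()}


# Example shaped like the scraped items in the question: title arrives as a list.
raw_item = {'actor': 'Craig T. Nelson', 'title': ['Action Jackson']}
clean_item = flatten_item(raw_item)

# These scalars are what would be handed to cursor.execute() as parameters.
params = (clean_item['title'], clean_item['actor'])
print(params)
```

Calling a helper like this at the top of process_item keeps the SQL statements unchanged. Unwrapping in the spider itself, before the item is yielded, would work equally well.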

Answers


You used the wrong format specifier for the string data type on line 53 of your code. It should be a lowercase 's', not an uppercase 'S'.

self.cursor.execute("""UPDATE example_movie SET distributor=%S, rating=%s, genre=%s, budget=%s WHERE title=%s""", (item['distributor'], item['rating'], item['genre'], item['budget'], item['title'])) 

It should look like this:

self.cursor.execute("""UPDATE example_movie SET distributor=%s, rating=%s, genre=%s, budget=%s WHERE title=%s""", (item['distributor'], item['rating'], item['genre'], item['budget'], item['title']))
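The traceback in the question shows MySQLdb building the final statement with Python's % operator (query = query % db.literal(args)), so the typo can be reproduced without a database. The sketch below applies plain %-formatting to the same UPDATE string. The args tuple holds illustrative, pre-quoted stand-ins for what the driver would produce, and try_format is a hypothetical helper used only for this demonstration.

```python
# The UPDATE statement from the question, including the uppercase %S typo.
bad_query = ("UPDATE example_movie SET distributor=%S, rating=%s, "
             "genre=%s, budget=%s WHERE title=%s")
# The corrected statement: every placeholder is a lowercase %s.
good_query = bad_query.replace("%S", "%s")

# Illustrative stand-ins for already-escaped SQL literals (an assumption).
args = ("'Lorimar'", "'R'", "'Action'", "'N/A'", "'Action Jackson'")


def try_format(query, values):
    """Apply %-formatting and return either the result or the ValueError text."""
    try:
        return query % values
    except ValueError as exc:
        return "ValueError: %s" % exc


bad_result = try_format(bad_query, args)
good_result = try_format(good_query, args)
print(bad_result)
print(good_result)
```

Because the failure is a ValueError raised during string formatting, and not a MySQLdb.Error, the except clause in process_item does not catch it. That is why Scrapy reports it as an item-processing error with a full traceback, while the INSERT failures are printed as "Error 1064".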

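Thank you for taking the time to reply! Oops... glad you caught that! I also solved the problem with the INSERT statements. I will update the question with that information, test the code, and accept your answer. Thanks! – DMML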