python scrapy xpath: InternalError: (1136, u"Column count doesn't match value count at row 1")

Here is my code. When I crawl other URLs there is no problem, but when I crawl this URL it tells me the column count doesn't match. I don't understand why the count is the character length rather than the dictionary length.

import scrapy
from scrapy.selector import Selector

from ..items import JikeItem  # assumed project layout; items.py is not shown


class JikespiderSpider(scrapy.Spider):
    name = "jikespider"
    allowed_domains = ["fromgeek.com"]
    start_urls = ['http://www.fromgeek.com/topic/']

    def parse(self, response):
        sel = Selector(response)
        jike_list = sel.xpath('//ul[@id="masonry0"]')
        ll = len(sel.xpath('//ul[@id="masonry0"]/li'))
        # Index-based access: each field is looked up by position in a page-wide list.
        for jike in range(ll):
            item = JikeItem()
            try:
                item['jike_title'] = jike_list.xpath('//li/div/div[@class="n-pic fl"]/a/@title').extract()[jike].strip()
                item['jike_uptime'] = jike_list.xpath('//li/div/div[@class="n-keytime "]/div[@class="time fr"]/text()').extract()[jike].strip()
                item['jike_tag'] = jike_list.xpath('//li/div/div[@class="n-keytime "]/div[@class="key fl"]').xpath('string(.)').extract()[jike].strip()
                print(len(item['jike_title']))
                print(len(item['jike_uptime']))
                print(len(item['jike_tag']))
                print('--------------------------')
                yield item
            except Exception as e:
                print(e)

2017-05-06 fan

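Please show your `items.py` and, if you are storing the scraped items in a database, your pipeline code as well. The items are being scraped; the problem occurs while the scraped items are being processed. – JkShaw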
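
The pipeline is not shown in the question, so the following is only a sketch of the kind of thing to check there. MySQL raises error 1136 when the number of values in an `INSERT` does not match the number of columns, so one way to rule that out is to build the column list and the placeholders from the same item. The helper name `build_insert` and the table name `jike` are hypothetical and not taken from the original code.

```python
def build_insert(table, item):
    """Return (sql, params) with exactly one %s placeholder per column.

    `table` and the column names are assumed to come from trusted code;
    only the values are passed to the driver as parameters.
    """
    columns = sorted(item.keys())
    placeholders = ", ".join(["%s"] * len(columns))
    sql = "INSERT INTO {} ({}) VALUES ({})".format(
        table, ", ".join(columns), placeholders
    )
    # One value per column, in the same order as the column list.
    params = tuple(item[c] for c in columns)
    return sql, params


# Hypothetical item with the three fields used in the spider.
item = {"jike_title": "t", "jike_uptime": "2017-05-06", "jike_tag": "x"}
sql, params = build_insert("jike", item)
print(sql)
print(params)
```

A MySQL driver that uses the `%s` parameter style would then be called as `cursor.execute(sql, params)`, with the values passed as a tuple rather than formatted into the string.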

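I cannot reproduce your error message with this code (Scrapy 1.3.2, Python 2.7.11).

I wonder why you build a counter to access the elements by index instead of looping over the selector list. Nested XPath queries are easier: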

import scrapy

# JikeItem is assumed to be importable from the project's items.py (not shown).


class JikespiderSpider(scrapy.Spider):
    name = "jikespider"
    allowed_domains = ["fromgeek.com"]
    start_urls = ['http://www.fromgeek.com/topic/']

    def parse(self, response):
        # One selector per <li>; each nested query below is relative to it.
        sel_jike_list = response.xpath('//ul[@id="masonry0"]/li')
        for sel_jike in sel_jike_list:
            item = JikeItem()
            item['jike_title'] = sel_jike.xpath('.//div[@class="n-pic fl"]/a/@title').extract_first()
            # ... other fields
            yield item

Note the dot at the beginning of the nested XPath.

2017-05-06 13:30:51
