2015-11-10

Scraping Snapdeal to extract mobile phone features

I want to scrape mobile phone specifications from Snapdeal.

//*[@id="productSpecs"]/div/div[2]/div[2]/div/table[1]/tbody/tr/td/table/tbody/tr/td[2] 
//*[@id="productSpecs"]/div/div[2]/div[2]/div/table[1]/tbody/tr/td/table/tbody/tr/td[1] 

These are the XPaths. In Google Chrome I can see the results with the Scraper extension, but I cannot get them through Scrapy.

import scrapy
# from scrapy.selector import HtmlXPathSelector
from scrapy.selector import Selector
from demo.items import CraigslistSampleItem


class MySpider(scrapy.Spider):
    name = "craigs"
    allowed_domains = ["www.snapdeal.com"]
    start_urls = ["http://www.snapdeal.com/product/samsung-galaxy-j2-8gb/655619199985"]

    def parse(self, response):
        # hxs = HtmlXPathSelector(response)
        sel = Selector(response)
        # XPath copied from Chrome, including its tbody steps
        titles = sel.xpath("//*[@id='productSpecs']/div/div[2]/div[2]/div/table[1]/tbody/tr/td/table/tbody/tr/td[2]")
        print(titles)
        items = []
        # Use a separate loop variable so the titles list is not shadowed
        for title in titles:
            item = CraigslistSampleItem()
            # item["Brand"] = title.extract()
            items.append(item)
        print(items)

titles prints as an empty list. The code above is my sample code.

Answer

Edit your XPath to:

titles = sel.xpath("//*[@id='productSpecs']/div/div[2]/div[2]/div/table[1]/tr/td/table/tr/td[2]") 

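This is because Chrome inserts extra tbody tags into tables when it builds the DOM, so an XPath copied from the browser contains tbody steps that are not in the HTML source Scrapy receives.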
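
To illustrate the point, here is a minimal sketch. It uses the standard library's xml.etree.ElementTree as a stand-in for Scrapy's selector, and the markup is a made-up fragment (not Snapdeal's real page) that assumes the server sends the table without tbody. The helper name cell_texts is also just for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup as a server might send it: the table has no <tbody>.
RAW_HTML = """
<div id="productSpecs">
  <table>
    <tr><td>Brand</td><td>Samsung</td></tr>
    <tr><td>Model</td><td>Galaxy J2</td></tr>
  </table>
</div>
"""

root = ET.fromstring(RAW_HTML)


def cell_texts(path):
    """Return the text of every element matched by `path` under the root."""
    return [cell.text for cell in root.findall(path)]


# Path in the style copied from Chrome: needs a <tbody> the raw markup lacks.
with_tbody = cell_texts("./table/tbody/tr/td[2]")

# Same path without the tbody step matches the raw markup.
without_tbody = cell_texts("./table/tr/td[2]")

# A descendant step (//tr) matches whether or not a <tbody> is present.
tolerant = cell_texts("./table//tr/td[2]")

print(with_tbody)
print(without_tbody)
print(tolerant)
```

The first path finds nothing, which is the same symptom as the empty titles in the question. If you are unsure whether a site sends tbody, the descendant form in the last path is a reasonable way to avoid depending on it.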

Thank you very much. This worked for me :) – Ninjaneer