Can Scrapy alone scrape the content of an iframe?

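I tried copying the element's XPath from the website and pasting it into my spider, but it returned nothing.

Can Scrapy scrape data that sits inside an iframe? If so, how? If not, what else should I do? Thanks!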

# Note: CrawlSpider uses parse() internally, so a Rule callback
# should normally have a different name.
rules = (
    Rule(SgmlLinkExtractor(deny=path_deny_base, restrict_xpaths=('*')),
         callback="parse", follow=True),
)

def parse(self, response):
    # url is meant to be the iframe's URL; it is not defined in this snippet
    yield Request(url, callback=self.parse_iframe)

def parse_iframe(self, response):
    # scrape the content of the iframe
    hxs = HtmlXPathSelector(response)
    titles = hxs.select('//div[2]/h1')
    linker = hxs.select('//div[2]/div[10]/a[1]')
    loc_Con = hxs.select('//div[2]/div[1]/div[2]/span/span/span[1]')
    loc_Reg = hxs.select('//div[2]/div[1]/div[2]/span/span/span[2]')
    loc_Loc = hxs.select('//div[2]/div[1]/div[2]/span/span/span[3]')
    items = []
    for title in titles:  # was "for titles in titles", which shadowed the list
        item = CraigslistSampleItem()
        # item["job_id"] = id.select('text()').extract()[0].strip()
        item["title"] = map(unicode.strip, title.select('text()').extract())
        item["link"] = linker.select('@href').extract()
        item["info"] = response.url
        # take the first text node of each location part, or "" if missing
        temp1 = loc_Con.select('text()').extract()
        temp2 = loc_Reg.select('text()').extract()
        temp3 = loc_Loc.select('text()').extract()
        temp1 = temp1[0] if temp1 else ""
        temp2 = temp2[0] if temp2 else ""
        temp3 = temp3[0] if temp3 else ""
        item["code"] = "{0}-{1}-{2}".format(temp1, temp2, temp3)
        items.append(item)
    return items

Source

2014-06-19 chano

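Scrapy cannot scrape the content of an iframe from the outer page's response, because the iframe is a separate document. Instead, request the iframe's URL yourself, like this: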

def parse(self, response):
    # url must be set to the iframe's URL
    yield Request(url, callback=self.parse_iframe)

def parse_iframe(self, response):
    # your code to scrape the content from the iframe
    pass

where url should be the iframe's URL (for example, https://career-meridia....../jobs).

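Edit:

Replace url with the part underlined in red.

Edit 2:

Make sure you pass every parameter the iframe URL requires; otherwise you will get nothing back. If the iframe is loaded with a POST request, you have to send all of the POST parameters.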

Source

2014-06-19 08:36:38

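If I want to get "Environmental Services Assistant", is the XPath OK? – chano

It will certainly work, as long as you are working with the response of the child iframe itself.

It is hard to read your code in a comment; could you edit your question to include it? Thanks.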

This is how I do it: first get the iframe URLs from the page, then issue new requests for them so they are parsed in turn.

urls = response.css('iframe::attr(src)').extract()
for url in urls:
    yield scrapy.Request(url....)

Source

2017-07-21 11:08:34 chairam
