exceptions.TypeError: Request url must be str or unicode, got list:
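The above is the error I receive. Is it my indentation? This code is supposed to follow the links it extracts with Scrapy and pull the second paragraph from the first div on each linked page, but I get this error instead.
Here is my code: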
from scrapy.spider import BaseSpider
from bathUni.items import BathuniItem
from scrapy.selector import HtmlXPathSelector
from scrapy.http.request import Request
from urlparse import urljoin

class recursiveSpider(BaseSpider):
    name = 'recursive2'
    allowed_domains = ['http://www.bristol.ac.uk/']
    start_urls = ['http://www.bristol.ac.uk/international/countries/']

    def parse(self, response):
        hxs = HtmlXPathSelector(response)
        links = []
        for i in range(1, 154):
            xpath = ('//*[@id="all-countries"]/li[*]/ul/li[*]/a/@href' .format (i+1))
            link = hxs.select(xpath).extract()
            links.append(link)
        for link in links:
            yield Request(link, callback=self.parse_linkpage)

    def parse_linkpage(self, response):
        hxs = HtmlXPathSelector(response)
        item = BathuniItem()
        item ['Qualification'] = hxs.select('//*[@id="uobcms-content"]/div/div/div[1]/p[2]').extract()
        yield item
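How can I get this working so that it follows the links on the first page and extracts data from each linked page? Any examples would be great.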
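Please provide the full traceback. – jonrsharpe
Which version of Scrapy are you using? Also, could you give an example of the information you want to scrape from each country page? – Talvalin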
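The error is probably not about indentation. `extract()` returns a list of strings, so `links.append(link)` builds a list of lists, and each `Request(link, ...)` is then handed a whole list instead of one URL string. The XPath also contains no `{}` placeholder, so `.format(i+1)` changes nothing and every pass of the loop extracts the same list. Below is a minimal sketch of the flattening step only, using the standard library so it runs without Scrapy. The helper name `absolute_links` and the sample hrefs are hypothetical, and the sketch assumes Python 3 (`urllib.parse`), whereas the code in the question is Python 2 (`urlparse`).

```python
from urllib.parse import urljoin  # Python 2 code would import urljoin from `urlparse`


def absolute_links(base_url, hrefs):
    """Return one absolute URL string per extracted href (hypothetical helper)."""
    urls = []
    for href in hrefs:
        href = href.strip()
        if href:  # skip empty or whitespace-only attribute values
            urls.append(urljoin(base_url, href))
    return urls


# Hypothetical sample standing in for what .extract() returns: a list of strings.
extracted = ["/international/countries/china/", "france/"]
page_url = "http://www.bristol.ac.uk/international/countries/"

# What the loop in the question builds: a list whose elements are themselves lists.
links = []
links.append(extracted)
print(type(links[0]).__name__)  # each element is a list, which Request rejects as a URL

# What Request needs instead: one string per link, resolved against the page URL.
for url in absolute_links(page_url, extracted):
    print(url)
```

In the spider, this would presumably mean calling `extract()` once, without the `range` loop, and yielding `Request(url, callback=self.parse_linkpage)` for each string, with `response.url` as the base for relative links. Separately, `allowed_domains` normally takes bare domain names such as `bristol.ac.uk` rather than full URLs, so that setting may need changing as well.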