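I am using Python 2.7.10 and trying to extract data from a web page by running the spider below. When I first installed Scrapy and ran it in my Mac terminal, I got the data. Now I no longer get any data and receive a Traceback instead, so Scrapy cannot run the crawl. Here is the spider: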
import scrapy

class ShopcluesSpider(scrapy.Spider):
    name = 'shopclues'
    allowed_domains = ['www.shopclues.com/mobiles-featured-store-4g-smartphone.html']
    start_urls = ['http://www.shopclues.com/mobiles-featured-store-4g-smartphone.html/']
    #custom_settings = {'FEED_URI' : 'tmp/shopclues.csv'}

    def parse(self, response):
        titles = response.css('img::attr(title)').extract()
        #images = response.css('img::attr(data-img)').extract()
        prices = response.css('.p_price::text').extract()
        discounts = response.css('.prd_discount::text').extract()
        for item in zip(titles,prices,discounts):
            scraped_info = {
                'title' : item[0],
                'price' : item[1],
                #'image_urls' : [item[2]], #Set's the url for scrapy to download images
                'discount' : item[2]
            }
            yield scraped_info
I get the following error:
Traceback (most recent call last):
  File "/usr/local/bin/scrapy", line 11, in <module>
    sys.exit(execute())
  File "/Library/Python/2.7/site-packages/scrapy/cmdline.py", line 148, in execute
    cmd.crawler_process = CrawlerProcess(settings)
  File "/Library/Python/2.7/site-packages/scrapy/crawler.py", line 243, in __init__
    super(CrawlerProcess, self).__init__(settings)
  File "/Library/Python/2.7/site-packages/scrapy/crawler.py", line 134, in __init__
    self.spider_loader = _get_spider_loader(settings)
  File "/Library/Python/2.7/site-packages/scrapy/crawler.py", line 330, in _get_spider_loader
    return loader_cls.from_settings(settings.frozencopy())
  File "/Library/Python/2.7/site-packages/scrapy/spiderloader.py", line 61, in from_settings
    return cls(settings)
  File "/Library/Python/2.7/site-packages/scrapy/spiderloader.py", line 25, in __init__
    self._load_all_spiders()
  File "/Library/Python/2.7/site-packages/scrapy/spiderloader.py", line 47, in _load_all_spiders
    for module in walk_modules(name):
  File "/Library/Python/2.7/site-packages/scrapy/utils/misc.py", line 71, in walk_modules
    submod = import_module(fullpath)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/importlib/__init__.py", line 37, in import_module
    __import__(name)
  File "/Users/acetonemarketing/Documents/scrapy/ourfirstscraper/ourfirstscraper/spiders/shopclues.py", line 16
    for item in zip(titles,prices,discounts):
    ^
IndentationError: unexpected indent
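An 'IndentationError' concerns the formatting of your source code. Python uses indentation to structure source code, so it is sensitive to bad indentation. However, when I copied your code, I had no problems.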
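To illustrate the point about indentation, here is a minimal sketch that reproduces the same kind of error outside Scrapy. The snippet in `source` and the helper name `indentation_error_line` are made up for this example and are not part of the spider. It compiles a string in which a line is indented although no block was opened before it.

```python
# Hypothetical snippet: line 2 is indented even though line 1 does not
# open a block (no if/for/def/class ending in a colon).
source = "prices = []\n    for item in prices:\n        pass\n"


def indentation_error_line(code):
    """Return the line number of the first IndentationError, or None."""
    try:
        # compile() only parses the code; nothing is executed.
        compile(code, "<snippet>", "exec")
    except IndentationError as exc:
        return exc.lineno
    return None


print(indentation_error_line(source))
```

The reported line is the one that carries the unexpected indent, which matches how the traceback above points at the `for` line in shopclues.py.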
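Thanks for the reply, @TomášLinhart. Since you did not get any error, could this be related to the user account I run the spider under? When I installed Scrapy, I had to use sudo -H pip install scrapy to get it done.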
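It has nothing to do with the user account you run the spider under. The indentation in your source file is bad, but that is not evident from what you posted.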
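Bad indentation that does not show up in a pasted post is often caused by tab characters mixed in with spaces, though nothing in this thread confirms that this is the cause here. As a rough sketch under that assumption, the helper below (the name `find_suspicious_indents` is invented for this example) lists the lines of a file whose leading whitespace contains a tab.

```python
import io


def find_suspicious_indents(path):
    """Return (line_number, repr_of_indent) for every line whose
    leading whitespace contains a tab character."""
    hits = []
    with io.open(path, encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            # Everything before the first character that is not a space or tab.
            indent = line[:len(line) - len(line.lstrip(" \t"))]
            if "\t" in indent:
                hits.append((number, repr(indent)))
    return hits
```

Calling it on the file named in the traceback, for example `find_suspicious_indents("ourfirstscraper/spiders/shopclues.py")` from the project directory, would show whether any line there is indented with tabs. An empty list means the problem lies elsewhere, for instance a line indented with more spaces than its neighbours.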