Scrapy: extracting information from a specific div

This is my code:

def parse(self, response):
    return scrapy.FormRequest.from_response(
        response,
        formdata={'uuid': 'user', 'password': 'cons'},
        callback=self.after_login
    )

def after_login(self, response):
    # check that the login succeeded before going on
    if "authentication failed" in response.body:
        self.log("Login failed", level=log.ERROR)
    else:
        self.log('LOGGED')
        sel = Selector(response)
        sel.xpath("//div[@class='amount cSpringGreen']/text()").extract()

However, when I run it, nothing shows up. It is supposed to log in and then display that information from the website. The HTML looks like this:

<h1 class="hide2"></h1> 
<div id="vodaint-local" class="wrapper rhomb"> 
<div class="spring"> 
<script type="text/javascript"> 
<div class="mod mod-selectsizeheader vodaint-local"> 
<div id="mivf" class="content"> 
<div id="navigation-breadcrumb" class="belt"> 
<div class="belt"> 
<div class="miVFR"> 
<div class="mainMiVF cf"> 
<div class="headerMiVF cf"> 
<div class="bodyMiVF cf"> 
<div class="mainNav" style="height: auto;"> 
<div class="mainContent withHeader" style="height: 585px;"> 
<style> 
<div id="contentSpinner" style="margin-bottom: 432px; display: none;"> 
<script> 
<section> 
<script type="text/javascript"> 
<div class="mainContentContainer home"> 
<div class="headerBanner"> 
<script type="text/javascript"> 
<div class="lineContainer "> 
<h6 class="topHeading prepago"> </h6> 
<div class="columnGroup cf"> 
<div class="column newPromo"> 
<div class="columnContent"> 
<p class="cTitle"> Tu saldo</p> 
--THIS IS THE INFO I WANT TO SHOW-- 
<div class="amount cSpringGreen"> 
0, 
<span> 96</span> 
€ 
</div>

Thanks!

Edit: the whole HTML file is in this pastebin: http://pastebin.com/B2HpACCw. What I want to display after logging in is "0'96". Thanks!

Source

2014-11-22 AngelaBR

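The HTML is a bit weird. There are a lot of unclosed 'div' and 'script' elements. There is even a suspicious 'style' element. – dreyescat 2014-11-22 19:15:18

I'm not sure I understand the exact problem: are any of the messages in the `after_login` method printed? What exactly is the issue: is the spider failing to log in to the site, or is the data not being scraped? – elias 2014-11-22 19:29:37

The login works perfectly and LOGGED is shown on screen. The problem is that nothing is displayed after that. – AngelaBR 2014-11-23 20:18:20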

Store it in an item. Edit items.py:

class TestItem(scrapy.Item):
    text = scrapy.Field()

and then in the spider:

item = TestItem()
item['text'] = sel.xpath("//div[@class='amount cSpringGreen']/text()").extract()
print(item['text'])
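Note that `/text()` selects only the text nodes that are direct children of the div, so the " 96" inside the nested `<span>` would not be part of that result even when the div is matched. Selecting `//text()` under the div would also reach the span. The sketch below shows the same distinction using the standard library's `xml.etree.ElementTree` rather than Scrapy's `Selector` (a swapped-in technique, chosen so it runs without Scrapy installed). The fragment is taken from the question and trimmed to a well-formed piece; the helper names `find_amount_div` and `amount_text` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Fragment from the question, trimmed so that it is well-formed (an assumption;
# the real page has many unclosed elements).
FRAGMENT = """
<div class="columnContent">
<p class="cTitle"> Tu saldo</p>
<div class="amount cSpringGreen">
0,
<span> 96</span>
€
</div>
</div>
"""


def find_amount_div(markup):
    """Return the 'amount cSpringGreen' div element, or None if it is absent."""
    root = ET.fromstring(markup.strip())
    return root.find(".//div[@class='amount cSpringGreen']")


def amount_text(markup):
    """Return all text inside the target div with whitespace removed."""
    div = find_amount_div(markup)
    if div is None:
        return None
    # itertext() walks the div and its descendants, so the span is included.
    return "".join("".join(div.itertext()).split())


# .text holds only the text before the first child element of the div.
direct_only = find_amount_div(FRAGMENT).text.strip()
print(direct_only)
print(amount_text(FRAGMENT))
```

If the div is matched at all, joining every text node this way yields the full balance rather than only the part outside the span.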

Source

2014-11-26 11:20:10 Tushar

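I had to change desc = scrapy.Field() to text = scrapy.Field(). After that it prints: 2014-11-26 12:35:48+0100 [login] DEBUG: LOGGED []. The item is empty. – AngelaBR 2014-11-26 11:37:36

Sorry, I just forgot to update... Now you have to import the item from items.py and then access it from your spider. – Tushar 2014-11-26 11:39:59

Yes, I did that, but it still comes out empty :/ – AngelaBR 2014-11-26 11:43:49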
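An empty list means the XPath matched nothing in the response that `after_login` received. One possible cause (an assumption, not confirmed anywhere in this thread) is that the balance is not present in that response at all, for instance because it lives on a different page or is filled in by JavaScript. A quick way to narrow this down is to search the raw body for the class name. The sketch below works on plain bytes so it runs without Scrapy; the helper name `body_contains` and both sample bodies are hypothetical.

```python
def body_contains(body, marker, encoding="utf-8"):
    """Return True if marker occurs in a raw response body (bytes or str)."""
    if isinstance(body, bytes):
        # response.body is bytes on Python 3, so encode the marker before searching.
        return marker.encode(encoding) in body
    return marker in body


# Hypothetical bodies standing in for response.body.
with_balance = b'<div class="amount cSpringGreen">0,<span> 96</span></div>'
without_balance = b'<div id="contentSpinner"></div><script></script>'

print(body_contains(with_balance, "cSpringGreen"))
print(body_contains(without_balance, "cSpringGreen"))
```

Inside `after_login`, the same check could be applied to `response.body` before running the XPath. If the class name is missing there, no selector will find the div in that response.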
