How to use XPath and CSS selectors in Scrapy

I'm a newbie trying to scrape some data with the Scrapy framework, but I'm having trouble: how do I use the XPath and CSS selector functions?

HTML A:

<ul class="tip" id="tip1"> 
    <li id="tip1_0"> 
     <a href="http://***" title="***" target="_self">*** 
     </a> 
    </li> 
    <li id="tip1_1"> 
     <a href="http://***" title="***" target="_self">*** 
     </a> 
    </li> 
    <li id="tip1_2"> 
     <a href="http://***" title="***" target="_self">*** 
     </a> 
    </li> 
</ul>

I use:

f = response.xpath("//*[@id='tip1']//li/a/@href | //*[@id='tip1']//li/a/@title").extract()

The f I get back is a list, which I then convert into a dict (name0 = f[0], value0 = f[1], name1 = f[2], value1 = f[3], and so on). Is there an easier way?

HTML B:

<div class="info"> 
    <a target="_blank" href="***" title="***"> 
    </a> 
</div> 
<div class="info"> 
    <a target="_blank" href="***" title="***"> 
    </a> 
</div> 
<div class="info"> 
    <a target="_blank" href="***" title="***"> 
    </a> 
</div>

In this case:

file = response.xpath('//div[@class="info"]') 
for line in file: 
    f = line.xpath('/a/@href').extract() 
    d = line.xpath('/a/@title').extract()

But it doesn't work: it just returns f = [] and d = []. I'm confused. How can I fix this? Thanks a lot.

Source

2016-10-10 xie

You can make your inner expressions context-specific by prepending a dot:

f = line.xpath('./a/@href').extract() 
d = line.xpath('./a/@title').extract()

Or, point your outer expression to the a elements and get the @href and @title:

file = response.xpath('//div[@class="info"]/a') 
for line in file: 
    f = line.xpath('@href').extract_first() 
    d = line.xpath('@title').extract_first()

Also note the use of the extract_first() method.
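
As for the first snippet in the question, where f is one flat list: if you keep that approach, one possible shortcut is to pair neighbouring items with slicing and zip. This is only a sketch, and it assumes the list strictly alternates href, title, href, title; the values below are made-up stand-ins for what .extract() returns. If an a element could lack one of the attributes, the pairs would shift, so looping over each a element as in the code above is the safer route.

```python
# Hypothetical stand-in for the flat list returned by .extract() in the question.
f = [
    "http://example.com/0", "title 0",
    "http://example.com/1", "title 1",
    "http://example.com/2", "title 2",
]

hrefs = f[0::2]   # items at even positions: the @href values
titles = f[1::2]  # items at odd positions: the @title values

# Map each title to its href; swap the arguments if you want href -> title.
links = dict(zip(titles, hrefs))

print(links)
```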

Source

2016-10-10 18:37:13 alecxe
