How to get table rows nested inside other tables using XPath in a Scrapy spider


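How can I get table rows that are nested inside other tables and a form tag? I have tried several pieces of code, but none of them seem to work.

I used the Python code below, but could not get any rows: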

def parse(self, response): 
    t = response.xpath('//table[@class="DataGrid"]/tbody/tr') 
    for tr_obj in enumerate(t): 
     print(tr_obj.xpath('td[1]/text()').extract_first())

Below is the HTML. From it, I need to get the rows of the table whose class name is gridTable:

<html> 
<body> 
    <table></table> 
    <table> 
     <tbody> 
      <tr> 
       <td> 
        <span></span> 
        <script></script> 
        <form> 
         <table class="dPage1"> 
          <tbody> 
           <tr></tr> 
           <tr> 
            <td> 
             <table> 
              <tbody> 
               <tr> 
                <td> 
                 <table class="gridTable"> 

                 </table> 
                </td> 
               </tr> 
              </tbody> 
             </table> 
            </td> 
           </tr> 
          </tbody> 
         </table> 
        </form> 
       </td> 
      </tr> 
     </tbody> 
    </table> 
</body> 
</html>


Solution

# Iterate over the row selectors directly; enumerate() would yield (index, selector) tuples
for tr_obj in response.xpath('//table[@class="DataGrid"]/tr'):
    print(tr_obj.xpath('td[1]/text()').extract_first())
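
A note on `enumerate()`, which the question's code wraps around the selector list: it yields `(index, item)` tuples, so a loop variable bound to the whole tuple has none of the item's methods (such as `.xpath()`). A minimal sketch of the difference, using a plain list of strings as a stand-in for the selector list (the `rows` values are made up for illustration):

```python
# Stand-in for a Scrapy selector list; the values are hypothetical.
rows = ["row-a", "row-b"]

# enumerate() yields (index, item) tuples, not the items themselves.
pairs = list(enumerate(rows))

# Binding the whole tuple to one name hides the item's own methods.
first = pairs[0]
has_upper = hasattr(first, "upper")

# Unpacking gives access to both the index and the item.
unpacked = [(i, row.upper()) for i, row in enumerate(rows)]

print(pairs)
print(has_upper)
print(unpacked)
```

If the index is not needed, iterating over the list directly is simpler.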

Source

2017-05-22 Sharath

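You can choose which elements an XPath expression matches by specifying a condition (a predicate) in square brackets.

For your example, that would be: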

//table[@class="gridTable"]/...
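
As a rough illustration of the attribute predicate, the sketch below runs a similar path against a simplified, well-formed copy of the nested layout from the question. It uses the standard library's `xml.etree.ElementTree` as a stand-in for Scrapy's selectors, so it supports only a subset of XPath (hence `.text` instead of `text()`), and the cell contents are made up for illustration:

```python
import xml.etree.ElementTree as ET

# Simplified, well-formed version of the nested layout from the question.
# The cell contents ("A1", "B1", ...) are hypothetical.
HTML = """
<html><body>
  <table></table>
  <table><tr><td>
    <form>
      <table class="dPage1"><tr><td>
        <table><tr><td>
          <table class="gridTable">
            <tr><td>A1</td><td>A2</td></tr>
            <tr><td>B1</td><td>B2</td></tr>
          </table>
        </td></tr></table>
      </td></tr></table>
    </form>
  </td></tr></table>
</body></html>
"""

root = ET.fromstring(HTML)

# ".//" searches at any depth, so the outer tables and the form
# do not have to be spelled out in the path.
rows = root.findall(".//table[@class='gridTable']/tr")

# Text of the first cell in each matched row.
first_cells = [row.find("td[1]").text for row in rows]
print(first_cells)  # ['A1', 'B1']
```

In a spider, the equivalent would presumably be `response.xpath('//table[@class="gridTable"]/tr')`, with the class name adjusted to whatever the real page uses.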

Source

2017-05-22 12:17:23 rongon

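I have updated the question with the Python code I used, but I still can't get any rows – Sharath

Can you provide the URL? – rongon

Sorry, I can't disclose the URL here, because we have to submit data before it returns the results, but I have specified exactly what the HTML looks like – Sharath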

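The Scrapy documentation recommends not using tbody in your XPath expressions. Browsers often insert tbody elements into the DOM they display, while Scrapy works on the original HTML as sent by the server, which may not contain them.

So try your expressions without tbody, and/or work around it by using /*/ or //.

Try this: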

def parse(self, response): 
    # Get a Selector list for all rows 
    sel_rows = response.xpath('//table[@class="DataGrid"]/tr') 

    # loop over row selectors ... 
    for sel_row in sel_rows:
        print(sel_row.xpath('td[1]/text()').extract_first())
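
To make the tbody point concrete, here is a small sketch that compares three paths against markup with and without tbody. It uses the standard library's `xml.etree.ElementTree` as a stand-in for Scrapy's selectors (only a subset of XPath is supported there), and both markup snippets and the helper `count_rows` are hypothetical:

```python
import xml.etree.ElementTree as ET

# Raw markup as a server might send it: no <tbody>. Contents are hypothetical.
RAW = """
<table class="DataGrid">
  <tr><td>one</td></tr>
  <tr><td>two</td></tr>
</table>
"""

# The same table as a browser's inspector might show it, with <tbody> inserted.
RENDERED = """
<table class="DataGrid">
  <tbody>
    <tr><td>one</td></tr>
    <tr><td>two</td></tr>
  </tbody>
</table>
"""

def count_rows(markup, path):
    # Wrap the snippet in a root element so ".//table" can match the table itself.
    root = ET.fromstring("<root>" + markup + "</root>")
    return len(root.findall(path))

paths = (
    ".//table[@class='DataGrid']/tbody/tr",  # requires tbody
    ".//table[@class='DataGrid']/tr",        # requires rows directly under table
    ".//table[@class='DataGrid']//tr",       # matches rows at any depth
)

raw_counts = [count_rows(RAW, p) for p in paths]
rendered_counts = [count_rows(RENDERED, p) for p in paths]

print(raw_counts)       # [0, 2, 2]
print(rendered_counts)  # [2, 0, 2]
```

The `//tr` form tolerates both layouts, but it would also pick up rows of any tables nested inside the matched one, so the stricter `/tr` form may be preferable when the raw HTML is known to have no tbody.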

Source

2017-05-22 12:47:11

Hey, thanks, it works – Sharath

I accidentally edited the wrong question. Did you accept the wrong answer? –

It works for me – Sharath
