Find all tr in a table element with XPath?

from lxml import html

def parse_header(table):
    ths = table.xpath('//tr/th')
    if not ths:
        # Here is the problem: this finds tr[1]/td in the whole HTML file instead of just this table
        ths = table.xpath('//tr[1]/td')

    # ... something else

doc = html.fromstring(html_string)
table = doc.xpath("//div[@id='divGridData']/div[2]/table")[0]
parse_header(table)

I want to find all tr[1]/td within my table, but table.xpath("//tr[1]/td") still matches across the whole HTML file. How can I search only within this element instead of the entire HTML file?

Edit:

content = '''

<root>
    <table id="table-one">
     <tr>
      <td>content from table 1</td>
     </tr>
     <table>
      <tr>
       <!-- this is content I do not want to get -->
       <td>content from embedded table</td>
      </tr>
     </table>
    </table>
</root>'''

root = etree.fromstring(content)
# xpath() returns a list, so take the first match
table_one = root.xpath('table[@id="table-one"]')[0]
all_td_elements = table_one.xpath('//td') # so this gives me too much!

Now I don't want the embedded table's content. What should I do?

Source

2015-10-29 roger

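To find elements that are descendants of your context node, put the period operator (.) at the start of the XPath. An expression that begins with // always searches from the document root, no matter which element xpath() is called on. So I think the XPath you are looking for is: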

.//tr[1]/td

This selects the td elements inside the current table rather than across the whole HTML file.

For example:

from lxml import etree

content = '''

<root>
    <table id="table-one">
     <tr>
      <td>content from table 1</td>
     </tr>
    </table>
    <table id="table-two">
     <tr>
      <td>content from table 2</td>
     </tr>
    </table>
</root>'''

root = etree.fromstring(content)
# xpath() returns a list, so take the first match
table_one = root.xpath('table[@id="table-one"]')[0]

# this will select all td elements in the entire XML document (so two elements)
all_td_elements = table_one.xpath('//td')

# this will select only the td inside table_one, because of the leading period
just_sub_td_elements = table_one.xpath('.//td')
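
Applied to the parse_header function from the question, the same change might look like the sketch below. The sample html_string is invented here purely for illustration, and returning the cell text is an assumption about what the function should produce, since the original body is elided.

```python
from lxml import html

# Hypothetical sample page (not from the question): an unrelated table
# followed by the target table inside div#divGridData.
html_string = '''
<html><body>
  <table id="other"><tr><td>unrelated</td></tr></table>
  <div id="divGridData">
    <div>first</div>
    <div>
      <table>
        <tr><td>Name</td><td>Age</td></tr>
        <tr><td>Ann</td><td>30</td></tr>
      </table>
    </div>
  </div>
</body></html>
'''

def parse_header(table):
    # The leading "." anchors each query at `table` instead of the document root.
    cells = table.xpath('.//tr/th')
    if not cells:
        # Fall back to the cells of the first row when there are no <th> cells.
        cells = table.xpath('.//tr[1]/td')
    # Assumption: the caller wants the header text of each cell.
    return [cell.text_content().strip() for cell in cells]

doc = html.fromstring(html_string)
table = doc.xpath("//div[@id='divGridData']/div[2]/table")[0]
header = parse_header(table)
print(header)
```

With the original '//tr[1]/td' the cell from the unrelated table would be included as well.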

Source

2015-10-29 16:35:16 gtlambert

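I have another question, which I have added to my question: how can I ignore the embedded table? – roger

I don't understand the update? – gtlambert

I don't want to use `table_one.xpath('//td')` – roger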
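
The thread does not answer the embedded-table question. One possible approach, sketched here under the assumption that the markup matches the question's sample (with the tr elements properly closed), is either to step through direct children only, or to keep just the cells whose nearest enclosing table is the table you started from. The names direct_cells and own_cells are illustrative.

```python
from lxml import etree

content = '''
<root>
  <table id="table-one">
    <tr>
      <td>content from table 1</td>
    </tr>
    <table>
      <tr>
        <td>content from embedded table</td>
      </tr>
    </table>
  </table>
</root>'''

root = etree.fromstring(content)
table_one = root.xpath('table[@id="table-one"]')[0]

# Option 1: follow direct children only (table -> tr -> td),
# so cells of a nested table are never reached.
direct_cells = table_one.xpath('./tr/td')

# Option 2: take every descendant td, then keep those whose nearest
# ancestor table is table_one itself.
own_cells = [
    td for td in table_one.xpath('.//td')
    if td.xpath('ancestor::table[1]')[0] is table_one
]

print([td.text for td in direct_cells])
print([td.text for td in own_cells])
```

Option 1 depends on the exact nesting. For parsed HTML where rows sit inside tbody, the path would need to account for that, for example './tr/td | ./tbody/tr/td'. Option 2 does not depend on how deep the rows are.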
