2016-05-13 101 views
0

This is probably something very simple, but I keep failing at it: getting the nth element with lxml/XPath does not work.

When root contains one or more <link/> elements, root.xpath('(//link)') returns all of them. But root.xpath('(//link)[0]') returns an empty list. What is wrong?

from unittest import TestCase, TestProgram

class T(TestCase):
    # "_" is an arbitrary prefix bound to the document's default namespace.
    base_path = r'(//_:link)'

    def test0ok(self):
        # Passes: both <link/> elements are returned.
        self._test(2, self.base_path)

    def test1ng(self):
        # Fails: the [0] predicate matches nothing, so 0 elements come back.
        self._test(1, self.base_path + r'[0]')

    def _test(self, expected, path):
        from lxml.etree import fromstring as parse_xml_string
        root = parse_xml_string(_xhtml)
        nsmap = dict(_=root.nsmap[None])
        gotten = root.xpath(path, namespaces=nsmap)
        gotten = len(gotten)
        self.assertEqual(expected, gotten)

# [1:] drops the leading newline so the XML declaration starts the document.
_xhtml = br'''
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC
    "-//W3C//DTD XHTML 1.1//EN"
    "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"
>
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<link rev="made" href="./" />
<link rel="contents" href="./" />
<title>te</title>
</head>
<body>
<h1>st</h1>
</body>
</html>
'''[1:]

if __name__ == r'__main__':
    TestProgram()

Answer

3

This is because positions in XPath start at 1, not 0:

root.xpath('(//link)[1]')

Alternatively, you can index the returned list in Python (0-based) to get the element:

root.xpath('//link')[0]
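
For a document like the one in the question, where the elements live in the XHTML default namespace, both approaches might look roughly like the sketch below. The sample document is a shortened stand-in for the question's XHTML, and the prefix "x" and the variable names are arbitrary choices for illustration, not anything lxml requires.

```python
from lxml import etree

# Shortened stand-in for the XHTML document in the question.
XHTML = b"""<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<link rev="made" href="./" />
<link rel="contents" href="./" />
<title>te</title>
</head>
<body><h1>st</h1></body>
</html>"""

root = etree.fromstring(XHTML)

# XPath 1.0 has no notion of a default namespace, so the document's
# default namespace is bound to a prefix ("x" is an arbitrary name).
ns = {"x": root.nsmap[None]}

# Every <link/> element, as a Python list.
all_links = root.xpath("//x:link", namespaces=ns)

# XPath positions start at 1, so position 0 never matches anything.
zeroth = root.xpath("(//x:link)[0]", namespaces=ns)

# Positions 1 and 2 each select a single element (still wrapped in a list).
first = root.xpath("(//x:link)[1]", namespaces=ns)
second = root.xpath("(//x:link)[2]", namespaces=ns)

print(len(all_links), len(zeroth), len(first))
print(first[0].get("rev"), second[0].get("rel"))

# Indexing the Python list is 0-based and reaches the same element.
print(all_links[0].get("rev") == first[0].get("rev"))
```

Note that the XPath form still returns a list, so the element itself is first[0]. Indexing in Python raises IndexError when nothing matched, whereas the XPath predicate simply yields an empty list.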