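Can I parse XPath with Python, Selenium, and lxml?

So I have been trying to figure out how to use BeautifulSoup, did a quick search, and found that lxml can evaluate XPath expressions against an HTML page. I would be very happy if I could do that, but the tutorials are not intuitive. Can I parse XPath with Python, Selenium, and lxml?

I know how to get an XPath with Firebug, and I am curious whether anyone who has used lxml could explain how to use it to evaluate specific XPaths and print the results... say 5 per line... or flat, if that is possible?!

Selenium is driving Chrome and loads the page correctly; I just need help moving forward.

Thanks!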

2012-12-20 twitch after coffee

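What is BS4? Wikipedia says it is some kind of sedan :) – Himanshu

@Himanshu Sorry, bs4 = beautifulsoup4 –

OK. To work with XML documents in Python, see ElementTree: http://docs.python.org/2/library/xml.etree.elementtree.html#xpath-support. You may not be able to parse every HTML document, because not all of them are valid XML documents. See http://stackoverflow.com/questions/285990/parse-html-via-xpath – Himanshu

lxml's ElementTree has an .xpath() method (note that the ElementTree in the xml package of the standard Python distribution doesn't have it!)

For example: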

# see http://lxml.de/xpathxslt.html 

from lxml import etree 

# root = etree.parse('/tmp/stack-overflow-questions.xml') 
root = etree.XML(''' 
     <answers> 
      <answer author="dlam" question-id="13965403">AAA</answer> 
     </answers> 
''') 

all_answers = root.xpath('.//answer') 

for i, answer in enumerate(all_answers): 
    who_answered = answer.attrib['author'] 
    question_id = answer.attrib['question-id'] 
    answer_text = answer.text 
    print('Answer #{0} by {1}: {2}'.format(i, who_answered, answer_text))
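
Since the question is about HTML pages rather than well-formed XML, a similar sketch can use lxml.html, which tolerates markup that a strict XML parser would reject. The sample page and the helper name `chunked` below are made up for illustration; the five-per-line printing follows what the question asked for.

```python
from lxml import html

# Hypothetical sample page. The unquoted attribute and the unclosed <br>
# are fine for the HTML parser but would be rejected as XML.
PAGE = '''
<html><body>
  <h1>Items</h1><br>
  <ul id=items>
    <li>one</li><li>two</li><li>three</li><li>four</li>
    <li>five</li><li>six</li><li>seven</li>
  </ul>
</body></html>
'''


def chunked(items, size):
    """Yield consecutive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


root = html.fromstring(PAGE)

# An XPath ending in text() returns the matching text nodes as strings.
texts = [t.strip() for t in root.xpath('//ul[@id="items"]/li/text()')]

# Print the results five per line.
for row in chunked(texts, 5):
    print(' '.join(row))
```

To print everything flat instead, `print(' '.join(texts))` would do.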


2012-12-20 07:29:38

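I prefer lxml, because for extracting a large number of elements lxml is more efficient than Selenium. You can use Selenium to fetch the page source and evaluate XPath expressions against it with lxml, instead of calling Selenium's native find_elements_by_xpath.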
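
A minimal sketch of that approach, with assumptions stated up front: the function names `extract_texts` and `fetch_page_source` and the sample markup are hypothetical, and `webdriver.Chrome()` assumes a Chrome driver is available on the machine. Selenium is only used to render the page and hand over `driver.page_source`; all XPath work happens in lxml.

```python
from lxml import html


def extract_texts(page_source, xpath):
    """Parse an HTML string once and return the stripped text of each match.

    The XPath is expected to select elements, not text nodes or attributes.
    """
    root = html.fromstring(page_source)
    return [el.text_content().strip() for el in root.xpath(xpath)]


def fetch_page_source(url):
    """Load `url` in Chrome through Selenium and return the rendered HTML."""
    # Imported here so the parsing helper works without Selenium installed.
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()


# Stand-in for driver.page_source so the parsing step can be tried offline.
sample = ('<html><body><h1>Title</h1>'
          '<p class="x">a</p><p class="x">b</p></body></html>')
print(extract_texts(sample, '//p[@class="x"]'))
```

With a live browser, the same helper would be called as `extract_texts(fetch_page_source(url), xpath)`, so the page is parsed once and every later query runs against the in-memory tree rather than going back through the browser.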


2016-10-27 05:26:09 stamaimer
