Python - Selenium - webscrape xmlns table

<html xmlns="http://www.w3.org/1999/xhtml"> 
    <head>_</head> 
    <body> 
     <form name="Main Form" method="post" action="HTMLReport.aspx?ReportName=..."> 
      <div id="Whole"> 
       <div id="ReportHolder"> 
        <table xmlns:msxsl="urn:schemas-microsoft-com:xslt" width="100%"> 
         <tbody> 
          <tr> 
           <td>_</td> 
           <td>LIVE</td> 
           and the data I need is here between <td> </td>

My code so far is:

import time 
from selenium import webdriver 

chromeOps=webdriver.ChromeOptions() 
chromeOps._binary_location = "C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe" 
chromeOps._arguments = ["--enable-internal-flash"] 

browser = webdriver.Chrome("C:\\Program Files\\Google\\Chrome\\Application\\chromedriver.exe", port=4445, chrome_options=chromeOps) 
time.sleep(3) 

browser.get('website') 
elem=browser.find_element_by_id('MainForm') 
el=elem.find_element_by_xpath('//*[@id="ReportHolder"]')

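The last two lines of code are really just me finding out how far along the path I can go before the XPath breaks down. Trying an xpath to anything beyond this point results in a NoSuchElementException.

Can anyone explain to me how I can extract the data from the table?

My current thinking is that maybe I have to pass "something" into an XML tree API and access the data through that, although I don't know how I would capture it.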
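For illustration, the "XML tree API" idea could look roughly like the sketch below, which uses the standard library's xml.etree.ElementTree. It assumes the markup is well-formed XHTML, which real page source often is not. SAMPLE is a hypothetical stand-in for whatever browser.page_source would return, and table_rows is an illustrative helper name, not part of Selenium. Because the document declares a default xmlns, every tag has to be looked up through that namespace.

```python
import xml.etree.ElementTree as ET

# The default namespace declared on <html>; lookups must be qualified with it.
XHTML_NS = {"x": "http://www.w3.org/1999/xhtml"}

# Hypothetical stand-in for browser.page_source (assumed well-formed XHTML).
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Report</title></head>
<body>
<div id="ReportHolder">
<table width="100%"><tbody>
<tr><td>_</td><td>LIVE</td></tr>
</tbody></table>
</div>
</body>
</html>"""


def table_rows(xhtml_text):
    """Return the text of every <td> under #ReportHolder, grouped by row."""
    root = ET.fromstring(xhtml_text)
    holder = root.find(".//x:div[@id='ReportHolder']", XHTML_NS)
    if holder is None:
        return []
    rows = []
    for tr in holder.iterfind(".//x:tr", XHTML_NS):
        cells = [(td.text or "").strip() for td in tr.iterfind("x:td", XHTML_NS)]
        rows.append(cells)
    return rows


print(table_rows(SAMPLE))
```

This only helps once the table is actually present in the page source, so it does not by itself explain the exception.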

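If anyone could give me the next step it would be greatly appreciated; at the moment it feels a bit like I'm holding a candle in a dark room.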

Source

2014-01-21 Phoenix

To clarify, if you use an xpath such as 'driver.find_element_by_xpath("//div[@id='ReportHolder']/table/tbody/tr")', are you receiving the exception?

Yes, that's correct. In my example, el = browser.find_element_by_xpath('//*[@id="ReportHolder"]/table/tbody/tr') produces a NoSuchElementException – Phoenix

It's quite simple: it's a timing problem.

Solution: put a time.sleep(5) before the xpath request.

browser.get('http://www.mmgt.co.uk/HTMLReport.aspx?ReportName=Fleet%20Day%20Summary%20Report&ReportType=7&CategoryID=4923&Startdate='+strDate+'&email=false') 
time.sleep(5) 
ex=browser.find_element_by_xpath('//*[@id="ReportHolder"]/table/tbody/tr/td')

The xpath is requesting a reference to dynamic content.

The table is dynamic content, and it takes longer to load than the Python program takes to reach the line:

ex=browser.find_element_by_xpath('//*[@id="ReportHolder"]/table/tbody/tr')

from the previous line:

browser.get('http://www.mmgt.co.uk/HTMLReport.aspx?ReportName=Fleet%20Day%20Summary%20Report&ReportType=7&CategoryID=4923&Startdate='+strDate+'&email=false')
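To make the timing issue concrete, here is a small pure-Python sketch that needs neither a browser nor Selenium. SlowTable is a hypothetical stand-in for a page whose table shows up some time after loading, and wait_until is an illustrative helper name. It shows why an immediate lookup comes back empty, and how retrying until the content exists avoids having to guess a fixed sleep length.

```python
import time


def wait_until(condition, timeout=5.0, poll_interval=0.1):
    """Call condition() repeatedly until it returns a truthy value.

    Raises TimeoutError if the timeout elapses first.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


class SlowTable:
    """Hypothetical page whose table rows only exist after a delay."""

    def __init__(self, delay):
        self._ready_at = time.monotonic() + delay

    def rows(self):
        # Empty until the simulated dynamic content has "loaded".
        if time.monotonic() >= self._ready_at:
            return [["_", "LIVE"]]
        return []


page = SlowTable(delay=0.3)
print(page.rows())                         # looked up too early
print(wait_until(page.rows, timeout=2.0))  # polled until the rows exist
```

A fixed sleep works when the guess is long enough, but it always costs the full delay; polling returns as soon as the content is there.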

Source

2014-01-21 13:10:36 Phoenix

Instead of using time.sleep, use the built-in WebDriverWait class and the supporting expected_conditions. For example, 'WebDriverWait(self.driver, 5).until(expected_conditions.presence_of_element_located((By.XPATH, "xpath here")))'
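A fuller sketch of that suggestion, applied to the table from the question, might look like the following. The function name read_report_cells and the 10 second default timeout are assumptions made for illustration; the xpath is the one used in the answer above. WebDriverWait, expected_conditions and By are the standard Selenium Python imports, and they are imported inside the function here only so the sketch can be loaded on a machine where Selenium is not installed.

```python
def read_report_cells(driver, timeout=10):
    """Wait for the report table cells to exist, then return their text.

    `driver` is an already-created Selenium WebDriver that has loaded the page.
    WebDriverWait raises selenium's TimeoutException if no cell appears in time.
    """
    # Local imports: an assumption of this sketch, not a Selenium requirement.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    xpath = '//*[@id="ReportHolder"]/table/tbody/tr/td'
    # Polls the page until at least one matching <td> is present in the DOM.
    cells = WebDriverWait(driver, timeout).until(
        EC.presence_of_all_elements_located((By.XPATH, xpath))
    )
    return [cell.text for cell in cells]
```

It would be called as read_report_cells(browser) right after browser.get(...), in place of the time.sleep(5) line.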
