2017-02-09 61 views
0

I need to capture data from multiple web pages, starting from http://www.forbes.com/companies/icbc/, using Selenium

package selenium;

import java.util.concurrent.TimeUnit;

import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.ie.InternetExplorerDriver;

public class ForbesTest {

    WebDriver driver;
    String url;

    @Before
    public void setUp() throws Exception {
        System.setProperty("webdriver.ie.driver", "D:\\IEDriverServer_x64_2.53.1\\IEDriverServer.exe");
        driver = new InternetExplorerDriver();
        driver.manage().timeouts().implicitlyWait(10, TimeUnit.SECONDS);
        url = "http://www.forbes.com/companies/icbc/";
        driver.get(url);
    }

    @After
    public void tearDown() throws Exception {
        // quit() closes every window and ends the session,
        // so no close() call is needed (or valid) after it.
        driver.quit();
    }

    @Test
    public void test() throws InterruptedException {
        Thread.sleep(10000);
        WebElement tab = driver.findElement(By.className("large"));
        Thread.sleep(1000);
        String text = tab.getText();
        System.out.println(text);

        // findElement returns only the FIRST <dt> on the page
        WebElement col1 = driver.findElement(By.tagName("dt"));
        String industry = col1.getText();
        if ("Industry".equals(industry)) {
            System.out.println(industry);

            // likewise, only the FIRST <dd> on the page
            WebElement col2 = driver.findElement(By.tagName("dd"));
            String industryName = col2.getText();
            System.out.println(industryName);
        }
        String forbesWebsite = driver.getCurrentUrl();
        System.out.println(forbesWebsite);
        WebElement nextPage = driver.findElement(By.className("next-number"));
        nextPage.click();
        // the browser is shut down in tearDown()
    }
}

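I want to capture Rank, Company, Country, Sales, Sales Rank, Profits, Profits Rank, Assets, Assets Rank, Market Value, Market Value Rank, Industry, Founded, Company Website, Employees, Headquarters City, CEO Name, the Forbes.com company information page, and Year.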

+1

And your question is...? – Guy

+1

I need to capture Industry, Founded, and so on, but they all use the same tags. Should I use XPath? If so, how do I write it? – Parithi

Answer

1

To get the text for Industry:

String industryName= driver.findElement(By.xpath("//*[contains(text(),'Industry')]//following::dd[1]")).getText(); 

To get the text for Founded:

String Founded= driver.findElement(By.xpath("//*[contains(text(),'Founded')]//following::dd[1]")).getText(); 

So you only need to replace 'String' in the pattern below with the label text you want:

xpath = //*[contains(text(),'String')]//following::dd[1] 
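
As a rough sketch of how that pattern could be reused for several labels, the helper below builds the XPath string from a label and checks it offline with the JDK's built-in XPath engine. The class name `LabelXPath`, the methods `xpathFor` and `valueFor`, and the `<dl>` fragment with its placeholder values are all hypothetical and are not taken from the real Forbes page; the actual markup may differ.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class LabelXPath {

    // Builds the pattern from the answer for a given label, e.g. "Industry".
    // Assumption: the label contains no single quote.
    public static String xpathFor(String label) {
        return "//*[contains(text(),'" + label + "')]//following::dd[1]";
    }

    // Evaluates the pattern against an XML string using the JDK's XPath 1.0 engine
    // and returns the text of the first matching <dd>.
    public static String valueFor(String xml, String label) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(xpathFor(label), new InputSource(new StringReader(xml))).trim();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical fragment with placeholder values, NOT the real Forbes markup.
        String xml = "<dl><dt>Industry</dt><dd>Banking</dd>"
                   + "<dt>Founded</dt><dd>1984</dd></dl>";
        for (String label : new String[] {"Industry", "Founded"}) {
            System.out.println(label + ": " + valueFor(xml, label));
        }
    }
}
```

With Selenium, the same string would go into the call already shown in this answer, for example `driver.findElement(By.xpath(LabelXPath.xpathFor("Industry"))).getText()`. Because `contains` matches substrings, a label that also appears inside other text on the page may match an earlier element than intended.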
+0

Thank you very much. – Parithi