2017-08-16 56 views
-1

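I want to parse a Steam Market page with `page.asText()`, but it does not work. This probably happens because the items have not been loaded yet about 1 second after the HTML itself has loaded. WebClient (HtmlUnit) does not see some of the elements.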

public static void main(String[] args) throws Exception {
    java.util.logging.Logger.getLogger("com.gargoylesoftware.htmlunit").setLevel(java.util.logging.Level.OFF);
    java.util.logging.Logger.getLogger("org.apache.http").setLevel(java.util.logging.Level.OFF);
    String link = "http://steamcommunity.com/market/search?appid=730#p6_price_asc";
    HtmlPage page;
    WebClient webClient = new WebClient(BrowserVersion.CHROME);
    page = (HtmlPage) webClient.getPage(link);
    System.out.println(page.asText());
}

In the console I see:

Show advanced options... 






< 1 2 3 4 5 6 ... 939 > 
Showing 1-10 of 9389 results 

What I need:

All
Show advanced options... 
PRICE 
QUANTITY 
NAME 
31,218 
Starting at: 
$0.35 USD 
Operation Hydra Case 
Counter-Strike: Global Offensive 
276,582 
Starting at: 
$0.23 USD 
. 
. 
. 

M4A1-S | Decimator (Field-Tested) 
Counter-Strike: Global Offensive 


232 
Starting at: 
$27.06 USD 

AWP | Asiimov (Battle-Scarred) 
Counter-Strike: Global Offensive 


28,068 
Starting at: 
$0.75 USD 

Krakow 2017 Legends Autograph Capsule 
Counter-Strike: Global Offensive 


< 1 2 3 4 5 6 ... 940 > 
Showing 1-10 of 9392 results 

Answer

0

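First, make sure JavaScript is enabled.

webClient.getOptions().setJavaScriptEnabled(true);

What I usually do to wait for more elements to load is:

Thread.sleep(3000);

This gives the page 3 seconds to load all of the additional content.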
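
Putting the two lines above together with the code from the question, a minimal sketch might look like the following. The class name `SteamMarketDump` and the `setThrowExceptionOnScriptError(false)` call are my own assumptions, not part of the original code, and the 3000 ms pause is only the figure suggested above, not a guaranteed bound.

```java
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

// Hypothetical class name, used only for this sketch.
public class SteamMarketDump {
    public static void main(String[] args) throws Exception {
        String link = "http://steamcommunity.com/market/search?appid=730#p6_price_asc";

        // WebClient is AutoCloseable, so try-with-resources closes it at the end.
        try (WebClient webClient = new WebClient(BrowserVersion.CHROME)) {
            webClient.getOptions().setJavaScriptEnabled(true);
            // Assumption: a script error on the page should not abort the run.
            webClient.getOptions().setThrowExceptionOnScriptError(false);

            HtmlPage page = webClient.getPage(link);

            // Pause AFTER getPage(...) so background scripts can fill in the listings.
            // HtmlUnit also offers webClient.waitForBackgroundJavaScript(timeoutMillis)
            // as an alternative to a fixed sleep.
            Thread.sleep(3000);

            // The same page object is read again; it is not fetched a second time.
            System.out.println(page.asText());
        }
    }
}
```

The pause goes between `getPage(link)` and `asText()`: the page object is updated in place while the scripts run, so there is no need to call `getPage` again.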

You can also try any of the other approaches listed by other users here:

HTMLUnit doesn't wait for Javascript
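
As one possible alternative to a fixed sleep, here is a rough sketch of a generic polling helper in plain Java. It is my own illustration, not taken from the linked question: `WaitUtil` and `waitUntil` are hypothetical names, and the timeout and poll interval are arbitrary.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of HtmlUnit.
public class WaitUtil {
    /**
     * Polls the condition until it returns true or the timeout elapses.
     * Returns true if the condition held, false on timeout or interruption.
     */
    public static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Stand-in condition that becomes true after roughly 200 ms.
        boolean ready = waitUntil(() -> System.currentTimeMillis() - start >= 200, 2000, 50);
        System.out.println("ready = " + ready);
    }
}
```

With HtmlUnit the condition could be something like `() -> page.asText().contains("Starting at:")`, so the wait ends as soon as the listings show up instead of always taking the full 3 seconds.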

+1

Where do I need to put "Thread.sleep(3000);"? WebClient webClient = new WebClient(BrowserVersion.CHROME); webClient.getOptions().setJavaScriptEnabled(true); page = (HtmlPage) webClient.getPage(link); System.out.println(page.asText());

+0

You will need to call Thread.sleep() after webClient.getPage(link).

+0

WOW. So the page keeps loading after "getPage(link)"? I thought getPage fetched everything at once. Thank you so much :)