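I only just heard about HtmlUnit while trying to figure out how to dump a website's source code. What I want to do is use a script to dump the source of a web page, but when I run it I get a long list of red error text. HtmlUnit error when trying to connect to a web page.
This is the code I'm using: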
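import java.io.IOException;
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
public class htmlunittest {
    public static void main(String[] args) throws IOException { // MalformedURLException is an IOException
        String url = "http://www.runelocus.com/forums/member.php?102785&tab=aboutme#aboutme";
        WebClient client = new WebClient(BrowserVersion.FIREFOX_3_6);
        HtmlPage page = client.getPage(url); // fetch and parse the page
        System.out.println(page.getWebResponse().getContentAsString()); // print the raw response body
    }
}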
This is the error I'm getting:
Exception in thread "main" org.apache.http.client.ClientProtocolException
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:822)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:776)
at com.gargoylesoftware.htmlunit.HttpWebConnection.getResponse(HttpWebConnection.java:152)
at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseFromWebConnection(WebClient.java:1439)
at com.gargoylesoftware.htmlunit.WebClient.loadWebResponse(WebClient.java:1358)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:307)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:373)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:358)
at HTMLDumping.htmlunittest.main(htmlunittest.java:18)
Caused by: org.apache.http.ProtocolException: Invalid header: blcc_proxy
at org.apache.http.impl.io.AbstractMessageParser.parseHeaders(AbstractMessageParser.java:224)
at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:281)
at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:247)
at org.apache.http.impl.conn.AbstractClientConnAdapter.receiveResponseHeader(AbstractClientConnAdapter.java:219)
at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:298)
at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:645)
at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:464)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:820)
If anyone knows how to solve this problem, please share your suggestions. All feedback is welcome.
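Do you plan to use 'HTMLUnit' for testing purposes, or do you want to 'scrape' the website? If you just want to 'scrape' the website, then [JSOUP](http://jsoup.org/) is a better choice. – radimpe 2012-07-10 12:31:04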
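A note on reading the trace: the `Caused by: org.apache.http.ProtocolException: Invalid header: blcc_proxy` line suggests the response contained a header line consisting of just `blcc_proxy`, with no `name: value` shape, so the failure happens while the response headers are parsed, before any HTML is read. The sketch below only illustrates that idea with plain Java. It is not HttpClient's actual parser, and `HeaderLineCheck` and `isWellFormedHeader` are made-up names for illustration. What the server really sends has not been verified here.

```java
class HeaderLineCheck {
    // Hypothetical helper: true if the line has a non-empty name followed by a colon,
    // which is the basic shape of an HTTP header line ("name: value").
    static boolean isWellFormedHeader(String line) {
        int colon = line.indexOf(':');
        return colon > 0 && !line.substring(0, colon).trim().isEmpty();
    }

    public static void main(String[] args) {
        // One ordinary header line and the line reported in the stack trace.
        String[] lines = { "Content-Type: text/html", "blcc_proxy" };
        for (String line : lines) {
            String verdict = isWellFormedHeader(line) ? "ok" : "invalid header";
            System.out.println(line + " -> " + verdict);
        }
    }
}
```

If that reading is right, the problem is on the response side (the server or a proxy in between) rather than in the four lines of HtmlUnit code, which would explain why the same code can work against other URLs.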