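How to retrieve content from a specific HTML page via Java

I am writing an application that parses the HTML content of arbitrary websites, but today I found a case where my implementation fails. I want to fetch the content of this URL: http://tomfishburne.com/2014/09/socialmedia.html, and I always get the following error: java.io.IOException: Server returned HTTP response code: 403. I am using the Jsoup library. I also tried two other options, including one that does not use Jsoup, but they failed as well. The page can be opened in a browser, but not from Java. Could you please give me some advice?

Thanks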
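import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;

import org.jsoup.Connection.Response;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

// Wrapper class and main method added so the snippet compiles; the names are placeholders.
public class FetchPage {
    public static void main(String[] args) {
        Document doc = null;
        String url = "http://tomfishburne.com/2014/09/socialmedia.html";
        try {
            // Attempt 1: Jsoup, full request with a Firefox User-Agent
            Response response = Jsoup
                    .connect(url)
                    .ignoreContentType(true)
                    .userAgent(
                            "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:25.0) Gecko/20100101 Firefox/25.0")
                    .timeout(12000)
                    .followRedirects(true)
                    .execute();
            doc = response.parse();
        } catch (Exception e) {
            try {
                // Attempt 2: Jsoup, plain GET with a different User-Agent
                doc = Jsoup.connect(url)
                        .userAgent(
                                "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:25.0) Gecko/20100101 Firefox/25.0")
                        .get();
            } catch (IOException e1) {
                try {
                    // Attempt 3: HttpURLConnection, without Jsoup
                    URL url2 = new URL(url);
                    HttpURLConnection conn = (HttpURLConnection) url2.openConnection();
                    conn.setRequestProperty(
                            "User-Agent",
                            "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.124 Safari/537.36");
                    BufferedReader in = new BufferedReader(new InputStreamReader(
                            conn.getInputStream(), "UTF-8"));
                } catch (UnsupportedEncodingException e2) {
                    // Not expected: UTF-8 is always supported
                } catch (IOException e2) {
                    // This exception is always thrown because of the 403 error code
                }
            }
        }
    }
}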
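
Since all three attempts end in the same 403, it may help to look at what the server actually sends back along with that status before changing anything else. Below is a minimal diagnostic sketch using only the JDK. It does not claim to fix the 403; it only reads the status code and the error body instead of letting getInputStream() throw. The class name HttpStatusProbe, the Result holder, and the probe method are hypothetical names introduced for this illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the question's code.
public class HttpStatusProbe {

    /** Hypothetical holder for the status code and body the server returned. */
    public static final class Result {
        public final int status;
        public final String body;

        Result(int status, String body) {
            this.status = status;
            this.body = body;
        }
    }

    /** Requests the address and returns status and body, including for 4xx/5xx responses. */
    public static Result probe(String address, String userAgent) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        conn.setRequestProperty("User-Agent", userAgent);
        conn.setConnectTimeout(12000);
        conn.setReadTimeout(12000);
        try {
            int status = conn.getResponseCode();
            // getInputStream() throws for 4xx/5xx; the error body comes from getErrorStream().
            InputStream stream = status >= 400 ? conn.getErrorStream() : conn.getInputStream();
            String body = "";
            if (stream != null) { // getErrorStream() may return null when there is no body
                try (InputStream in = stream) {
                    body = new String(in.readAllBytes(), StandardCharsets.UTF_8);
                }
            }
            return new Result(status, body);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        String url = "http://tomfishburne.com/2014/09/socialmedia.html";
        String agent = "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:25.0) Gecko/20100101 Firefox/25.0";
        try {
            Result r = probe(url, agent);
            System.out.println("HTTP " + r.status);
            System.out.println(r.body);
        } catch (IOException e) {
            System.out.println("Request failed: " + e.getMessage());
        }
    }
}
```

The body of a 403 response sometimes states why the request was refused, which can hint at what differs between the browser request and the Java one. This sketch assumes Java 9 or later because of readAllBytes().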
Thanks, Szymom. I had forgotten about this secret method :) – yhony 2014-10-18 17:40:06
Great, I'm glad I could help you :) – 2014-10-18 18:12:52