If you print the document, you will see that the redirect is implemented in JavaScript:
[...]
window.location.href = '../oilnew/';
[...]
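You can either parse the script tags manually, find the window.location.href assignment, check whether it is triggered on load, extract the target and request it yourself,
or use HtmlUnit, which executes the JavaScript and can follow the redirect (although it is slow).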
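For the manual route, here is a minimal sketch of pulling the target out of the script text with a regular expression. The class name `JsRedirectExtractor`, its two methods, and the `example.com` page URL are hypothetical and only there for illustration. It assumes the script assigns a plain string literal to `window.location` or `window.location.href`; it will not catch URLs that are built dynamically, and it does not check whether the assignment is actually executed.

```java
import java.net.URI;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class JsRedirectExtractor {

    // Matches  window.location.href = '...'  or  window.location = "..."
    // Deliberately narrow: only a quoted string literal on the right-hand side.
    private static final Pattern REDIRECT = Pattern.compile(
            "window\\.location(?:\\.href)?\\s*=\\s*(['\"])(.*?)\\1");

    /** Returns the raw redirect target found in the script text, if any. */
    public static Optional<String> extractTarget(String scriptText) {
        Matcher m = REDIRECT.matcher(scriptText);
        return m.find() ? Optional.of(m.group(2)) : Optional.empty();
    }

    /** Resolves a possibly relative target against the URL of the page it came from. */
    public static String resolve(String pageUrl, String target) {
        return URI.create(pageUrl).resolve(target).toString();
    }

    public static void main(String[] args) {
        // In practice the script text would come from the parsed page,
        // e.g. the contents of each <script> element.
        String script = "window.location.href = '../oilnew/';";
        String pageUrl = "http://example.com/old/index.html"; // hypothetical page URL

        extractTarget(script)
                .map(target -> resolve(pageUrl, target))
                .ifPresent(System.out::println);
    }
}
```

The resolved URL can then be passed to `Jsoup.connect(...)` as usual. This avoids starting a full browser engine, at the cost of breaking as soon as the page changes how it redirects.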
Example code
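// Imports needed by the snippet (the statements below belong inside a method)
import java.io.IOException;
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException;
import com.gargoylesoftware.htmlunit.WebClient;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

String userAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36";
String url = "http://www.oil-india.com/";
Document doc;
// Silence HtmlUnit's verbose logging
java.util.logging.Logger.getLogger("com.gargoylesoftware").setLevel(java.util.logging.Level.OFF);
final WebClient webClient = new WebClient(BrowserVersion.CHROME);
webClient.getOptions().setJavaScriptEnabled(true);
webClient.getOptions().setRedirectEnabled(true);
try {
    // Let HtmlUnit run the page's JavaScript and take the URL it ends up on
    url = webClient.getPage(url).getUrl().toString(); // HtmlUnit
    // Fetch and parse the final URL with jsoup
    doc = Jsoup.connect(url).userAgent(userAgent).followRedirects(true).get(); // jsoup
    System.out.println(doc.toString());
} catch (FailingHttpStatusCodeException | IOException e) {
    e.printStackTrace();
}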
Output
<a href="#" class="close">Close</a>
<a href="default.aspx"><img src="oilindia-img/logo.jpg" alt="Oil India" style="height:95px;"></a>
<a href="screenreader.aspx"><img src="oilindia-img/screen_reader_icon.png" style="vertical-align:middle;" alt="top"><span id="MenuBarTop_link_screenreader" class="link_screenreader">Screen Reader Access</span> </a>
<a href="javascript:decreaseFontSize();" class="toplink"> <img alt="orange color" src="oilindia-img/a-.png" id="Img1"> </a>
[...]
Check the response code and base your condition on it: [link](http://stackoverflow.com/questions/6467848/how-to-get-http-response-code-for-a-url-in-java)