Why do you say there are two base URLs? All relative links point to http://superior.edu.pk/presentation/user/ (don't they?).
Try the following code:
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// When you connect to a URL you do not have to specify the base URL
Document doc = Jsoup.connect("http://superior.edu.pk/presentation/user/Default.aspx").get();
// When you parse a file or a String you do. The base URL is http://superior.edu.pk/presentation/user/, of course
//Document doc = Jsoup.parse(Main.class.getResourceAsStream("page.htm"), "utf-8", "http://superior.edu.pk/presentation/user/");
// Just an example. You can select any anchors you wish.
Elements links = doc.select("div.footerMaterial > a");
for (Element link : links) {
    // absUrl resolves the href against the document's base URL
    String attr = link.absUrl("href");
    System.out.println(attr);
}
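You will see that all the absolute URLs are correct. Those obtained from relative links point to superior.edu.pk, and the links that were already absolute point to their respective domains (www.digitallibrary.edu.pk and www.google.com).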
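If you want to check this behaviour without fetching the page, here is a minimal sketch that uses the standard library's java.net.URI instead of jsoup. It is not jsoup's implementation, only an illustration of the same idea: a relative href is resolved against the one base URL, while an absolute href is returned unchanged. The class name, the helper absUrl and the sample hrefs are hypothetical and chosen only for this example.

```java
import java.net.URI;

class HrefResolutionSketch {
    // Resolve an href against a single base URL (stand-in for jsoup's absUrl).
    static String absUrl(String baseUrl, String href) {
        return URI.create(baseUrl).resolve(href).toString();
    }

    public static void main(String[] args) {
        String base = "http://superior.edu.pk/presentation/user/";
        // Hypothetical hrefs: one relative, one already absolute.
        String[] hrefs = { "Default.aspx", "http://www.google.com/" };
        for (String href : hrefs) {
            System.out.println(absUrl(base, href));
        }
    }
}
```

The relative href ends up under superior.edu.pk/presentation/user/, and the absolute one keeps its own domain, so a single base URL is enough to explain both kinds of result.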
(Edit)
You can also test this code:
Element link = doc.select(".logo > a:nth-child(1) > img:nth-child(1)").first();
String attr = link.absUrl("src");
System.out.println(attr);
It will give you:
http://superior.edu.pk/images/logo.jpg
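Which is correct!
The explanation is that the relative URL is ../../images/logo.jpg, which is http://superior.edu.pk/presentation/user/../../images/logo.jpg, which in turn resolves to http://superior.edu.pk/images/logo.jpg. Each ".." removes one path segment, so the two of them cancel out "user/" and "presentation/".
A page can only have one base URL!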
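The ".." handling can be reproduced with java.net.URI from the standard library. This is only a sketch under that assumption (jsoup is not involved, and the class and method names are made up for the example): normalize collapses the ".." segments of the joined URL, and resolve does the join and the collapse in one step starting from the page's own address.

```java
import java.net.URI;

class DotSegmentSketch {
    // Collapse "." and ".." segments in an already joined URL.
    static String normalize(String url) {
        return URI.create(url).normalize().toString();
    }

    // Join a relative reference to the page URL and collapse the result.
    static String resolve(String pageUrl, String relative) {
        return URI.create(pageUrl).resolve(relative).toString();
    }

    public static void main(String[] args) {
        // Both lines print http://superior.edu.pk/images/logo.jpg
        System.out.println(normalize("http://superior.edu.pk/presentation/user/../../images/logo.jpg"));
        System.out.println(resolve("http://superior.edu.pk/presentation/user/Default.aspx", "../../images/logo.jpg"));
    }
}
```

Both calls start from the same base, http://superior.edu.pk/presentation/user/, so the missing "/presentation/user/" in the result is a consequence of the two ".." segments, not of a second base URL.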
For example, if you check "../../images/logo.jpg", it resolves to http://www.superior.edu.pk/images/logo.jpg, and "/presentation/user/" does not appear in that URL. That is why I said the relative URLs are resolved with 2 base URLs. – 2014-11-08 10:50:42
Well... ../../images/logo.jpg is relative to superior.edu.pk/presentation/user/ too! Check it: http://superior.edu.pk/presentation/user/../../images/logo.jpg – fonkap 2014-11-08 10:58:23
@user2866518 I edited the answer – fonkap 2014-11-09 19:04:03