Parsing the HTML href attribute

I am working on a project where I need to parse HTML to extract data from a web page. I am using Jsoup in Java. I need to extract data from the following markup.

<tr> 
      <td><small><a href="http://www.timeanddate.com/worldclock/fixedtime.html?iso=20160821T2100&amp;p1=248" target="_blank">2016/08/21 21:00</a></small></td> 
      <td><small><a href="https://agc003.contest.atcoder.jp">AtCoder Grand Contest 003</a></small></td> 

</tr>

I can get the contest name and time, but how do I extract the URL? I want the contest URL, https://agc003.contest.atcoder.jp. How can I get it?

Edit: here is my code

private void getAC() throws IOException {
    Document doc = Jsoup.connect("https://atcoder.jp/").userAgent(Desktop.getDesktop().toString()).get();
    Element table = doc.getElementsByClass("table-responsive").get(1);
    Elements contestStartTime = table.getElementsByTag("td");
    int cnt = 1;
    for (Element i : contestStartTime) {
        System.out.println(cnt + ". " + i.html());
        cnt++;
    }
}


2016-08-19 Meghla Khan

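I'm not familiar with JSoup or Java, but I would load the file, read it line by line, use a regular expression pattern to search for what you need, and then parse the URL out of that line. – dinotom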
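The line-by-line regex idea from the comment above might look roughly like the sketch below. It uses only the Java standard library; the class name `HrefRegexSketch` and the method `extractHrefs` are made up for illustration, and the pattern assumes double-quoted `href` values. Treat it as a rough sketch: regular expressions are fragile on real-world HTML, and a parser such as Jsoup is usually the more robust choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HrefRegexSketch {
    // Captures the value of href="..."; assumes the attribute is double-quoted.
    private static final Pattern HREF = Pattern.compile("href=\"([^\"]*)\"");

    /** Scans the input one line at a time and collects every href value found. */
    public static List<String> extractHrefs(String html) {
        List<String> urls = new ArrayList<>();
        for (String line : html.split("\\R")) {   // \R matches any line break
            Matcher m = HREF.matcher(line);
            while (m.find()) {
                urls.add(m.group(1));
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        // The table row from the question, one cell per line.
        String html = "<tr>\n"
                + "<td><small><a href=\"http://www.timeanddate.com/worldclock/fixedtime.html?iso=20160821T2100&amp;p1=248\" target=\"_blank\">2016/08/21 21:00</a></small></td>\n"
                + "<td><small><a href=\"https://agc003.contest.atcoder.jp\">AtCoder Grand Contest 003</a></small></td>\n"
                + "</tr>";
        for (String url : extractHrefs(html)) {
            System.out.println(url);
        }
    }
}
```

Note that this approach returns the raw attribute text, so entities such as `&amp;` in the first link are left undecoded.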

Can you add the code you use to get the contest name and time? – TDG

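Since the tags don't seem to have an id or anything else to target them by, I'm not really sure. But once you have found the element, getting the URL is easy: 'Elements.attr("href")' should return the value. –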

JSoup has a rich API for working with the DOM; look at this functionality:

Element content = doc.getElementById("content"); 
Elements links = content.getElementsByTag("a"); 
for (Element link : links) { 
    String linkHref = link.attr("href"); 
    String linkText = link.text(); 
}

You can also get your links like this:

Elements links = doc.select("table a[href]");
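Applied to the row from the question, the same calls can be combined as in the sketch below. It assumes the contest link always sits in the second cell of each row, as in the posted snippet; the class name `ContestLinkSketch` and the method `contestUrls` are hypothetical. The row is wrapped in a `<table>` element here because Jsoup's HTML parser discards `<tr>` and `<td>` tags that appear outside a table.

```java
import java.util.LinkedHashMap;
import java.util.Map;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

class ContestLinkSketch {
    /** Maps each contest name to its URL, reading the link in the second cell of every row. */
    public static Map<String, String> contestUrls(String html) {
        Map<String, String> result = new LinkedHashMap<>();
        Document doc = Jsoup.parse(html);
        for (Element row : doc.select("tr")) {
            Elements cells = row.select("td");
            if (cells.size() < 2) {
                continue; // not a contest row
            }
            // Assumption: cell 0 holds the start time, cell 1 the contest link.
            Element link = cells.get(1).select("a[href]").first();
            if (link != null) {
                result.put(link.text(), link.attr("href"));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String html = "<table><tr>"
                + "<td><small><a href=\"http://www.timeanddate.com/worldclock/fixedtime.html?iso=20160821T2100&amp;p1=248\" target=\"_blank\">2016/08/21 21:00</a></small></td>"
                + "<td><small><a href=\"https://agc003.contest.atcoder.jp\">AtCoder Grand Contest 003</a></small></td>"
                + "</tr></table>";
        contestUrls(html).forEach((name, url) -> System.out.println(name + " -> " + url));
    }
}
```

In the asker's `getAC()` method, the same loop could run over `table.select("tr")` instead of a parsed string. If the page uses relative links, `link.absUrl("href")` resolves them against the document's base URI.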


2016-08-19 09:23:19 degr

Thanks. It's working! :D –
