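Getting a substring from a given string in Java

I am reading the content of a web page and then parsing it with the Jsoup parser so that I get only the hyperlinks present in the body section. This is the output I get: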
<a href="/sports/sports.asp" style="TEXT-DECORATION: NONE"><font color="#0000FF">Sports</font></a>
<a href="/titanic/titanic.asp" style="TEXT-DECORATION: NONE"><font color="#0000FF">Titanic</font></a>
<a href="gastheft.asp" onmouseover="window.status='License Plate Theft';return true" onmouseout="window.status='';return true">license plates</a>
<a href="miracle.asp" onmouseover="window.status='Miracle Cars';return true" onmouseout="window.status='';return true">miracle cars</a>
<a href="/crime/warnings/clear.asp" onmouseover="window.status='Clear Loss';return true" onmouseout="window.status='';return true" target="clear">Clear</a>
and even more hyperlinks.
Out of all of these, the only parts I am interested in are values like:
/sports/sports.asp
/titanic/titanic.asp
gastheft.asp
miracle.asp
/crime/warnings/clear.asp
How can I do this with String operations, or is there some other way or method to extract this information using the Jsoup parser itself?
http://jsoup.org/cookbook/extracting-data/attributes-text-html – helderdarocha
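
For the String-based route the question asks about, here is a minimal sketch that uses only the JDK's java.util.regex. The class name HrefExtractor and the method extractHrefs are hypothetical names chosen for illustration, and the pattern assumes each href value is wrapped in single or double quotes, as in the output shown above. With Jsoup itself, the cookbook page linked in the comment covers the more robust route: select the anchors with select("a[href]") and read each one with attr("href").

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Jsoup.
class HrefExtractor {
    // Matches an opening <a ...> tag and captures the quoted href value in group 2.
    // Assumption: the href value is enclosed in single or double quotes.
    private static final Pattern HREF = Pattern.compile(
            "<a\\b[^>]*?\\shref\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE);

    // Returns every href value found in the given HTML, in document order.
    static List<String> extractHrefs(String html) {
        List<String> hrefs = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            hrefs.add(m.group(2));
        }
        return hrefs;
    }

    public static void main(String[] args) {
        // Two of the anchors from the question, used as sample input.
        String html =
                "<a href=\"/sports/sports.asp\" style=\"TEXT-DECORATION: NONE\">"
                + "<font color=\"#0000FF\">Sports</font></a>\n"
                + "<a href=\"gastheft.asp\" onmouseover=\"window.status='License Plate Theft';"
                + "return true\">license plates</a>";
        for (String href : extractHrefs(html)) {
            System.out.println(href);
        }
    }
}
```

A regular expression like this is fine for a fixed, well-formed snippet, but it is brittle on arbitrary HTML (unquoted attribute values, a `>` inside an earlier attribute, and so on), which is why reading the attribute through the parser is usually the safer choice.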