Retrieving and parsing a string with Jsoup on Android

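I'm writing an Android app that reads some information from a website and displays it on the app's screen. I'm using the Jsoup library to get the information as a string. First, here is what the website's HTML looks like: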

<strong> 
    Now is the time<br /> 
    For all good men<br /> 
    To come to the aid<br /> 
    Of their country<br /> 
</strong>

Here is how I retrieve the text and try to parse it:

Document document = Jsoup.connect(WEBSITE_URL).get(); 
resultAggregator = ""; 

Elements nodePhysDon = document.select("strong"); 

//check results 
if (nodePhysDon.size()> 0) { 
    //get value 
    donateResult = nodePhysDon.get(0).text(); 
    resultAggregator = donateResult; 
} 

if (resultAggregator != "") { 
    // split resultAggregator into an array breaking up with br/
    String donateItems[] = resultAggregator.split("<br />"); 
}

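But then donateItems[0] is not just "Now is the time"; it is all four strings run together. I also tried it without the space between "br" and "/" and got the same result. If I do resultAggregator.split("br"); then donateItems[0] is just the first word: "Now".

I suspect the problem is that the Jsoup select method is stripping the tags out?

Any suggestions? I can't change the website's HTML. I have to work with it as it is.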


2015-09-13 Jungle Jim

Possible duplicate of "How do I preserve line breaks when using jsoup to convert html to plain text?" (//stackoverflow.com/questions/5640334/how-do-i-preserve-line-breaks-when-using-jsoup-to-convert-html-to-plain-text) – luksch

Try this:

//check results 
if (nodePhysDon.size()> 0) { 
    //use toString() to get the selected block with tags included 
    donateResult = nodePhysDon.get(0).toString(); 
    resultAggregator = donateResult; 
} 

if (!resultAggregator.isEmpty()) { // compare content; != on strings compares references 
    // remove <strong> and </strong> tags 
    resultAggregator = resultAggregator.replace("<strong>", ""); 
    resultAggregator = resultAggregator.replace("</strong>", ""); 
    //then split with <br> 
    String donateItems[] = resultAggregator.split("<br>"); 
}

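Make sure to split on <br>, not <br />: jsoup's HTML output writes the tag as <br>, even though the page source has <br />.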
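If you would rather not depend on exactly how the tag is spelled, a regular expression can accept all the common forms. This is only a minimal sketch using plain Java string handling; the class name BrSplitter and the method splitOnBr are made up for illustration, and it assumes the <strong> tags have already been removed as in the code above.

```java
import java.util.ArrayList;
import java.util.List;

public class BrSplitter {

    // Hypothetical helper: splits an HTML fragment on <br>, <br/> or <br />
    // (any letter case) and drops pieces that are empty after trimming.
    static List<String> splitOnBr(String html) {
        List<String> lines = new ArrayList<>();
        // (?i) makes the match case-insensitive; \s* allows optional
        // whitespace before the optional closing slash.
        for (String piece : html.split("(?i)<br\\s*/?>")) {
            String trimmed = piece.trim();
            if (!trimmed.isEmpty()) {
                lines.add(trimmed);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        String html = "Now is the time<br /> For all good men<br>"
                + "To come to the aid<br/>Of their country<br>";
        for (String line : splitOnBr(html)) {
            System.out.println(line);
        }
    }
}
```

Trimming each piece also takes care of the line breaks and indentation that end up around the text when the element is turned back into a string.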


2015-09-13 07:13:02

Your suggestion works. Thanks Joel! I didn't realize there was a difference between text() and toString() –
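To make that difference concrete, here is a small sketch that may help. It assumes the jsoup library is on the classpath, and the class name StrongLines and the helper linesOf are invented for this example. text() returns the element's text with the tags stripped, html() returns the inner markup with the br tags still in it, and textNodes() gives the pieces of text between the tags, so no string splitting is needed at all.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.TextNode;

import java.util.ArrayList;
import java.util.List;

public class StrongLines {

    // Hypothetical helper: collects the element's direct text nodes
    // (the text between the <br> tags), skipping whitespace-only ones.
    static List<String> linesOf(Element element) {
        List<String> lines = new ArrayList<>();
        for (TextNode node : element.textNodes()) {
            if (!node.isBlank()) {
                lines.add(node.text().trim());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Same shape as the page in the question, inlined so no network is needed.
        String html = "<strong> Now is the time<br /> For all good men<br /> "
                + "To come to the aid<br /> Of their country<br /> </strong>";
        Element strong = Jsoup.parse(html).select("strong").first();

        System.out.println(strong.text()); // tags stripped, nothing left to split on
        System.out.println(strong.html()); // inner markup, br tags still present
        for (String line : linesOf(strong)) {
            System.out.println(line);      // one entry per line of the verse
        }
    }
}
```

In the app itself the Element would come from the Document returned by Jsoup.connect(WEBSITE_URL).get() instead of a parsed string.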
