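Jsoup lets you get elements by CSS selector through document.select("CSS SELECTOR").
If you want only the td cells that belong directly to a table (and not those of nested tables), you can use the CSS child combinator >, which matches direct children only. For your case, then, you should use: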
#tableID > tbody > tr > td
This gets you all the first-level td elements of the #tableID table. A few things to note:
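- You have to include the intermediate > tbody > tr. Even if your original HTML markup omits them (tbody is commonly left out), Jsoup creates them when it parses the HTML.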
- You don't need an ID in the first part. Any selector works there. For example, all first-level td elements of every table with the class pretty: table.pretty > tbody > tr > td
In Jsoup:
Elements tds = document.select("#tableID > tbody > tr > td");
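Or, if you want to select the table first (or have already selected it):
// select() returns Elements, so take the first match; first() returns null if nothing matched
Element myTable = document.select("#tableID").first();
Elements tds = myTable.select("> tbody > tr > td");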
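The notes above can be checked with a small sketch. It assumes jsoup 1.11.1 or later on the classpath (for selectFirst); the markup, the id t, and the class name ImplicitTbodyDemo are made up for illustration and are not part of the question.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

class ImplicitTbodyDemo {
    // Hypothetical markup: no <tbody> is written, and the table has class "pretty".
    static final String HTML =
        "<table id=\"t\" class=\"pretty\"><tr><td>a</td><td>b</td></tr></table>";

    // Number of elements a selector matches in the parsed document.
    static int count(String selector) {
        Document doc = Jsoup.parse(HTML);
        return doc.select(selector).size();
    }

    // Select the table first, then query relative to it with a leading ">".
    static int countFromTable() {
        Document doc = Jsoup.parse(HTML);
        Element table = doc.selectFirst("#t"); // null when nothing matches
        if (table == null) {
            return 0;
        }
        Elements tds = table.select("> tbody > tr > td");
        return tds.size();
    }

    public static void main(String[] args) {
        // Skips the tbody level that the parser adds.
        System.out.println(count("#t > tr > td"));
        // Includes the intermediate tbody and tr levels.
        System.out.println(count("#t > tbody > tr > td"));
        // Same chain, starting from a class instead of an ID.
        System.out.println(count("table.pretty > tbody > tr > td"));
        // Two-step variant: table first, then its cells.
        System.out.println(countFromTable());
    }
}
```

If Jsoup inserts tbody as described, the first selector matches nothing and the other three each match both cells.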
Last but not least, here is sample code that gets the td elements from your example:
import org.jsoup.Jsoup;
import org.jsoup.nodes.*;
import org.jsoup.select.*;

public class JsoupHtmlDirectChildren {
    public static void main(String[] args) {
        String html = "" +
            "<html> " +
            " <body> " +
            " <span>HELLO!</span> " +
            " <table id=\"myTable\"> " +
            " <tbody> " +
            " <tr> " +
            " <th>header</th> " +
            " <!-- <td> tags on a high level in the hierarchy. --> " +
            " <td>high level1 " +
            " <table> " +
            " <tbody> " +
            " <tr> " +
            " <!-- <td> tags on a low level in the hierarchy. --> " +
            " <td>low level1</td> " +
            " <td>low level2</td> " +
            " <td>low level3</td> " +
            " </tr> " +
            " </tbody> " +
            " </table> " +
            " </td> " +
            " <td>high level2</td> " +
            " <td>high level3</td> " +
            " </tr> " +
            " </tbody> " +
            " </table> " +
            " </body> " +
            "</html> ";
        Document doc = Jsoup.parse(html);
        // all first level children TD of the #myTable table
        Elements highLevelTDs = doc.select("#myTable > tbody > tr > td");
        System.out.println("QUANTITY FOUND: " + highLevelTDs.size());
        for (Element td : highLevelTDs) {
            System.out.println("\n\n###HIGH LEVEL TD: " + td);
        }
    }
}
Output:
QUANTITY FOUND: 3
###HIGH LEVEL TD: <td>high level1
<table>
<tbody>
<tr>
<!-- <td> tags on a low level in the hierarchy. -->
<td>low level1</td>
<td>low level2</td>
<td>low level3</td>
</tr>
</tbody>
</table> </td>
###HIGH LEVEL TD: <td>high level2</td>
###HIGH LEVEL TD: <td>high level3</td>
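For contrast, here is a rough sketch of what happens without the > chain. It uses a condensed copy of the nested-table markup from the example; the class name ChildVsDescendantDemo and the count helper are only illustrative. A plain descendant selector such as #myTable td should also pick up the cells of the inner table.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class ChildVsDescendantDemo {
    // Condensed version of the nested-table markup from the example.
    static final String HTML =
        "<table id=\"myTable\"><tbody><tr>"
      + "<th>header</th>"
      + "<td>high level1"
      + "<table><tbody><tr>"
      + "<td>low level1</td><td>low level2</td><td>low level3</td>"
      + "</tr></tbody></table>"
      + "</td>"
      + "<td>high level2</td><td>high level3</td>"
      + "</tr></tbody></table>";

    // Number of elements a selector matches in the parsed document.
    static int count(String selector) {
        Document doc = Jsoup.parse(HTML);
        return doc.select(selector).size();
    }

    public static void main(String[] args) {
        // Descendant selector: every td anywhere below #myTable, nested table included.
        System.out.println("descendant:  " + count("#myTable td"));
        // Child chain: only the td cells of #myTable's own rows.
        System.out.println("child chain: " + count("#myTable > tbody > tr > td"));
    }
}
```

The descendant form counts the three high-level cells plus the three low-level ones, while the child chain stays at the three high-level cells shown in the output.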
You solved my problem! Again! Thank you very much! – ian