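Extracting content from HTML represented as a string
I have a large HTML document in a String variable and I want to get the content of one particular div. I cannot rely on a regular expression because the divs can be nested. So let's assume I have the following string: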
String test = "<div><div id=\"mainContent\">foo bar<div>good best better</div> <div>test test</div></div><div>foo bar</div></div>";
How can I get the following out of it with a simple Java program?
<div id="mainContent">foo bar<div>good best better</div> <div>test test</div></div>
My strategy so far is the following (it may be horrible; I am still working on getting it right):
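public class ExtractDiv {
    public static void main(String[] args) {
        String s = "<div><div id=\"mainContent\">foo bar<div>good best better</div> <div>test test</div></div><div>foo bar</div></div>";
        // Start at the target div instead of at the beginning of the string.
        int start = s.indexOf("<div id=\"mainContent\"");
        int count = 0; // nesting depth relative to the target div
        int fl = -1;   // index just past the matching </div>
        int i = start;
        while (i > -1 && i < s.length()) {
            int open = s.indexOf("<div", i);
            int close = s.indexOf("</div>", i);
            if (close == -1) {
                break; // unbalanced input: no closing tag left
            }
            if (open > -1 && open < close) {
                count++;        // an opening tag comes first
                i = open + 4;   // skip past "<div"
            } else {
                count--;        // a closing tag comes first
                i = close + 6;  // skip past "</div>"
            }
            System.out.println(count + " -- " + i);
            if (count == 0) {
                fl = i;
                break;
            }
        }
        System.out.println("final ind - " + fl);
        if (fl > -1) {
            System.out.println("final String - " + s.substring(start, fl));
        }
    }
}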
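
For reference, the same depth-counting idea could perhaps be packaged as a reusable method. The sketch below is only an illustration: the names `DivExtractor`, `extractById`, and `DIV_TAG` are made up here, and it assumes the id is written exactly as `id="..."` with double quotes and that no attribute value contains a `>` character. The regular expression only locates individual div tags; the nesting itself is tracked by a counter. For HTML that does not meet those assumptions, a real parser such as jsoup is likely to be more robust.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DivExtractor {
    // Matches a single opening or closing div tag; group(1) is "/" for a closing tag.
    private static final Pattern DIV_TAG =
            Pattern.compile("<(/?)div\\b[^>]*>", Pattern.CASE_INSENSITIVE);

    /**
     * Returns the div with the given id, including its nested divs,
     * or null if the id is not found or the divs are unbalanced.
     */
    static String extractById(String html, String id) {
        String idAttr = "id=\"" + id + "\""; // assumption: double-quoted id attribute
        Matcher m = DIV_TAG.matcher(html);
        int start = -1; // start index of the target div, -1 until it is found
        int depth = 0;  // nesting depth relative to the target div
        while (m.find()) {
            boolean closing = !m.group(1).isEmpty();
            if (start < 0) {
                // Still looking for the opening tag that carries the id.
                if (!closing && m.group().contains(idAttr)) {
                    start = m.start();
                    depth = 1;
                }
            } else {
                depth += closing ? -1 : 1;
                if (depth == 0) {
                    // This closing tag balances the target's opening tag.
                    return html.substring(start, m.end());
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String test = "<div><div id=\"mainContent\">foo bar<div>good best better</div> <div>test test</div></div><div>foo bar</div></div>";
        System.out.println(extractById(test, "mainContent"));
    }
}
```

Run against the string from the question, `main` prints the `mainContent` div together with its two nested divs and stops before the sibling `<div>foo bar</div>`.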