Strange characters() behavior with SAX + Java
In my XML I have an element whose content spans multiple lines:
<tag id="sometag" ...>
| first line
| second line
| third line
| fourth line
<tag ...>
....
<tag id="someothertag" ...>
| ANOTHER FIRST LINE
| ANOTHER SECOND LINE
| ANOTHER THIRD LINE
| ANOTHER FORTH LINE
<tag ...>
Then in Java I have the required startElement, endElement, and characters methods, but I'm seeing some strange behavior in characters:
public void characters(char[] ch, int start, int length){
    Log.d(TAG, "characters(\"" + (new String(ch)).replaceAll("[\r\n]", "\\n") + "\", " + start + ", " + length + ")");
}
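Other than that, I do nothing with the characters. I basically create two instances of a parser. In one instance, I'm looking for sometag. When I find what I'm looking for, I throw an exception and return that element.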
D/MyProgram(1565): STARTING document parsing...
D/MyProgram(1565): characters("n ", 0, 1)
D/MyProgram(1565): characters(" | first line", 0, 20)
D/MyProgram(1565): characters("n | first line", 0, 1)
D/MyProgram(1565): characters(" | second line", 0, 23)
D/MyProgram(1565): characters("n | second line", 0, 1)
D/MyProgram(1565): characters(" | third line", 0, 26)
D/MyProgram(1565): characters("n | third line", 0, 1)
D/MyProgram(1565): characters(" | fourth lineline", 0, 22)
D/MyProgram(1565): characters("n | fourth lineline", 0, 1)
D/MyProgram(1565): characters(" | fourth lineline", 0, 4)
D/MyProgram(1565): Successfully found "sometag"!
...and in another, brand-new instance, I'm looking for someothertag. I do the same thing as before.
D/MyProgram(1565): STARTING document parsing...
D/MyProgram(1565): characters("n", 0, 1)
D/MyProgram(1565): characters(" ", 0, 4)
D/MyProgram(1565): characters("n ", 0, 1)
D/MyProgram(1565): characters(" | first line", 0, 20)
D/MyProgram(1565): characters("n | first line", 0, 1)
D/MyProgram(1565): characters(" | second line", 0, 23)
D/MyProgram(1565): characters("n | second line", 0, 1)
D/MyProgram(1565): characters(" | third line", 0, 26)
D/MyProgram(1565): characters("n | third line", 0, 1)
D/MyProgram(1565): characters(" | fourth lineline", 0, 22)
D/MyProgram(1565): characters("n | fourth lineline", 0, 1)
D/MyProgram(1565): characters(" | fourth lineline", 0, 4)
D/MyProgram(1565): Successfully found "someothertag"!
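I understand that XML parsing is stream-based (it parses chunks rather than the whole string), but this is very strange behavior. Here are a few things I noticed that really baffle me: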
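- With each call to characters(), the parser neither picks up where it left off nor clears out characters it has already finished parsing: I still get the very first character of the array ('\n', the newline) on later calls.
- ch contains extra characters that were not there originally: "line" is appended to "fourth line".
- When I create a brand-new parser instance, these characters are "re-read". The second run should read something like this: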
D/MyProgram(1565): characters("n", 0, 1)
D/MyProgram(1565): characters(" ", 0, 4)
D/MyProgram(1565): characters("n ", 0, 1)
D/MyProgram(1565): characters(" | ANOTHER FIRST LINE", 0, 20)
D/MyProgram(1565): characters("n | ANOTHER SECOND LINE", 0, 1)
...and so on.
Any idea what I'm doing wrong? Thanks in advance.
Looks like you're not respecting start and length. – bmargulies
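To illustrate the comment above: SAX hands characters() a buffer that the parser may reuse between calls, and only the range from ch[start] to ch[start + length - 1] is valid for that call. Calling new String(ch) converts the whole buffer, so leftovers from earlier chunks (such as the stray "line" suffix) show up in the log. The following is only a minimal sketch of a handler that respects start and length. The class name CharactersDemo, the helper textOf, and the sample XML are made up for illustration, and it assumes flat (non-nested) tag elements and a non-namespace-aware parser where qName carries the element name.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class CharactersDemo {

    /** Hypothetical helper: returns the text of the first <tag> whose id matches. */
    public static String textOf(String xml, final String wantedId) throws Exception {
        final StringBuilder text = new StringBuilder();

        DefaultHandler handler = new DefaultHandler() {
            private boolean inside = false; // currently inside the wanted element
            private boolean done = false;   // wanted element already closed

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (!done && "tag".equals(qName) && wantedId.equals(atts.getValue("id"))) {
                    inside = true;
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (inside && "tag".equals(qName)) {
                    inside = false;
                    done = true;
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Only ch[start .. start + length - 1] belongs to this call.
                // The text of one element may also arrive in several chunks,
                // so append each chunk instead of overwriting.
                if (inside) {
                    text.append(ch, start, length);
                }
            }
        };

        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample input shaped like the XML in the question.
        String xml = "<root>"
                + "<tag id=\"sometag\">\n| first line\n| second line\n</tag>"
                + "<tag id=\"someothertag\">\n| ANOTHER FIRST LINE\n</tag>"
                + "</root>";
        System.out.println(textOf(xml, "someothertag"));
    }
}
```

For logging inside the original handler, the equivalent change would be new String(ch, start, length) instead of new String(ch). The sketch parses the whole document rather than throwing an exception to stop early; that part of the original approach is left out here.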