2013-10-08 37 views
1

Strange behavior of characters() with SAX + Java

In my XML I have an element whose content spans multiple lines:

<tag id="sometag" ...> 
    | first line 
    |  second line 
    |   third line 
    |  fourth line 
<tag ...> 
.... 
<tag id="someothertag" ...> 
    | ANOTHER FIRST LINE 
    |  ANOTHER SECOND LINE 
    |   ANOTHER THIRD LINE 
    |  ANOTHER FORTH LINE 
<tag ...> 

Then in Java I have the necessary startElement, endElement, and characters methods, but I'm seeing some strange behavior with characters:

public void characters(char[] ch, int start, int length){ 
    Log.d(TAG, "characters(\"" + (new String(ch)).replaceAll("[\r\n]", "\\n") + "\", " + start + ", " + length + ")"); 
} 

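Other than that, I do nothing with the characters. I basically create two instances of the parser. With one instance, I'm looking for sometag. Once I find what I'm looking for, I throw an exception and return that element.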

D/MyProgram(1565): STARTING document parsing... 
D/MyProgram(1565): characters("n ", 0, 1) 
D/MyProgram(1565): characters("  | first line", 0, 20) 
D/MyProgram(1565): characters("n  | first line", 0, 1) 
D/MyProgram(1565): characters("  | second line", 0, 23) 
D/MyProgram(1565): characters("n  | second line", 0, 1) 
D/MyProgram(1565): characters("  |  third line", 0, 26) 
D/MyProgram(1565): characters("n  |  third line", 0, 1) 
D/MyProgram(1565): characters("  | fourth lineline", 0, 22) 
D/MyProgram(1565): characters("n  | fourth lineline", 0, 1) 
D/MyProgram(1565): characters("  | fourth lineline", 0, 4) 
D/MyProgram(1565): Successfully found "sometag"! 

...and with another, brand-new instance, I'm looking for someothertag. I do the same thing as before.

D/MyProgram(1565): STARTING document parsing... 
D/MyProgram(1565): characters("n", 0, 1) 
D/MyProgram(1565): characters(" ", 0, 4) 
D/MyProgram(1565): characters("n ", 0, 1) 
D/MyProgram(1565): characters("  | first line", 0, 20) 
D/MyProgram(1565): characters("n  | first line", 0, 1) 
D/MyProgram(1565): characters("  | second line", 0, 23) 
D/MyProgram(1565): characters("n  | second line", 0, 1) 
D/MyProgram(1565): characters("  |  third line", 0, 26) 
D/MyProgram(1565): characters("n  |  third line", 0, 1) 
D/MyProgram(1565): characters("  | fourth lineline", 0, 22) 
D/MyProgram(1565): characters("n  | fourth lineline", 0, 1) 
D/MyProgram(1565): characters("  | fourth lineline", 0, 4) 
D/MyProgram(1565): Successfully found "someothertag"! 

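I understand that XML parsing is stream-based (it parses chunks rather than the whole string), but this is very strange behavior. Here are a few things I noticed that are really baffling:

  • With each call to characters(), the parser neither starts where it left off nor clears the characters it has, supposedly, finished parsing: I even keep getting the very first character in the array ('n', which is the newline) from several calls earlier.
  • ch has extra characters that were not there originally: "line" is appended to "fourth line".
  • When I create a brand-new instance of the parser, those characters are "re-read". The second execution should read something like: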

...this...

D/MyProgram(1565): characters("n", 0, 1) 
D/MyProgram(1565): characters(" ", 0, 4) 
D/MyProgram(1565): characters("n ", 0, 1) 
D/MyProgram(1565): characters("  | ANOTHER FIRST LINE", 0, 20) 
D/MyProgram(1565): characters("n  |  ANOTHER SECOND LINE", 0, 1) 

...and so on.

Any idea what I'm doing wrong? Thanks in advance.

+3

It looks like you're not respecting start and length. – bmargulies

Answers

3

As bmargulies says, you are not using start and length with the character array that is passed in:

public void characters(char[] ch, int start, int length) { 
    // Use only the indicated segment; the rest of ch may hold stale data.
    String str = new String(ch, start, length); 
    // "\\\\n" makes replaceAll emit a literal backslash followed by n.
    Log.d(TAG, "characters(\"" + str.replaceAll("[\r\n]", "\\\\n") + "\", " + start + ", " + length + ")"); 
} 
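
To see why the logs in the question show leftovers such as "lineline", here is a minimal sketch that involves no SAX at all. It assumes the parser reuses one char[] buffer between calls, which is a common implementation choice but not something the question confirms; the class name SegmentDemo and its two helper methods are made up for illustration.

```java
// Hypothetical demo class: imitates a parser that reuses one char[] buffer.
final class SegmentDemo {

    // What the question's code does: converts the entire buffer.
    static String wholeBuffer(char[] ch, int start, int length) {
        return new String(ch);
    }

    // What the answer's code does: converts only the reported segment.
    static String segment(char[] ch, int start, int length) {
        return new String(ch, start, length);
    }

    public static void main(String[] args) {
        // First "callback": the buffer holds eight fresh characters.
        char[] buffer = "ABCDEFGH".toCharArray();

        // Second "callback": only three characters are overwritten,
        // so the tail of the buffer still holds the old data.
        "xyz".getChars(0, 3, buffer, 0);

        System.out.println(wholeBuffer(buffer, 0, 3)); // prints xyzDEFGH
        System.out.println(segment(buffer, 0, 3));     // prints xyz
    }
}
```

Only the characters from start up to start + length belong to the current call; anything else in ch should be treated as undefined.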
+0

Another problem I had was that the parser's StringBuilder was static. I needed to reset it with builder.setLength(0). – i41
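
Both points (honoring start and length, and resetting the builder) can be combined in one handler. The following is only a sketch under assumptions: the class TagTextCollector, its collect helper, and the map keyed by the id attribute are invented for illustration and are not the asker's code, and nested tag elements are not handled. It uses the standard javax.xml.parsers and org.xml.sax APIs rather than anything Android-specific.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the text of each <tag id="..."> element.
final class TagTextCollector extends DefaultHandler {
    // Instance field, not static, so each parser run gets its own builder.
    private final StringBuilder builder = new StringBuilder();
    private final Map<String, String> textById = new LinkedHashMap<>();
    private String currentId;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        if ("tag".equals(qName)) {
            currentId = attrs.getValue("id");
            builder.setLength(0); // drop text left over from the previous element
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times per element, so append each chunk,
        // and only the segment the parser reports.
        if (currentId != null) {
            builder.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("tag".equals(qName) && currentId != null) {
            textById.put(currentId, builder.toString());
            currentId = null;
        }
    }

    // Convenience wrapper: parses an XML string and returns id -> text.
    static Map<String, String> collect(String xml) {
        TagTextCollector handler = new TagTextCollector();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("XML parsing failed", e);
        }
        return handler.textById;
    }
}
```

Calling TagTextCollector.collect(xml) on a document with several tag elements should then give each id its own text, with nothing carried over from an earlier element or an earlier parser instance.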