Java - determining the size of an XML document

I have simple code that fetches an XML document from a given URL:

DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(link);

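The code returns an XML document (an org.w3c.dom.Document). I just need to get the size of the resulting XML document. Is there an elegant way to do this that doesn't involve third-party jars?

P.S. I mean the size in KB or MB, not the number of nodes.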

2012-07-05 guest86

Size in KB? Or the number of nodes? – 2012-07-05 11:52:57

KB. I've edited my post. – guest86 2012-07-05 11:54:02

First, the crude version: load the file into a local buffer. Then you know how long your input is. After that, parse the XML from the buffer:

URL url = new URL("...");
ByteArrayOutputStream buffer = new ByteArrayOutputStream();
// Copy the whole response into memory; try-with-resources closes the stream.
try (InputStream in = new BufferedInputStream(url.openStream())) {
    int c;
    while ((c = in.read()) >= 0) {
        buffer.write(c);
    }
}

// size() reports the byte count without copying the buffer.
System.out.println(String.format("Length in bytes: %d", buffer.size()));

// Parse the XML from the buffered bytes.
Document doc = DocumentBuilderFactory.newInstance()
    .newDocumentBuilder().parse(new ByteArrayInputStream(buffer.toByteArray()));

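The drawback is the additional buffer in RAM.

Second, a more elegant version: wrap the input stream in a custom java.io.FilterInputStream that counts the bytes flowing through it: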

URL url = new URL("..."); 
CountInputStream in = new CountInputStream(url.openStream()); 
Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in); 
System.out.println(String.format("Bytes: %d", in.getCount()));

Here is CountInputStream. The read() methods are overridden to delegate to the superclass and count the bytes actually read:

import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class CountInputStream extends FilterInputStream {

    private long count = 0L;

    public CountInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        final int c = super.read();
        if (c >= 0) {
            count++; // exactly one byte was read
        }
        return c;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        final int bytesRead = super.read(b, off, len);
        if (bytesRead > 0) {
            count += bytesRead;
        }
        return bytesRead;
    }

    // read(byte[] b) is deliberately not overridden: FilterInputStream
    // implements it as read(b, 0, b.length), so counting there as well
    // would count every byte twice.

    public long getCount() {
        return count;
    }
}
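
As a quick self-contained check of the counting approach, here is a minimal sketch that parses an in-memory document through a counting wrapper and compares the count with the array length. The names CountingDemo, CountingStream and parseAndCount are hypothetical, and the ByteArrayInputStream is only a stand-in for url.openStream().

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class CountingDemo {

    // Compact counting wrapper (hypothetical name). Only read() and
    // read(byte[], int, int) are overridden, because
    // FilterInputStream.read(byte[]) forwards to the three-argument form.
    static class CountingStream extends FilterInputStream {
        private long count;

        CountingStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            int c = super.read();
            if (c >= 0) {
                count++;
            }
            return c;
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            int n = super.read(b, off, len);
            if (n > 0) {
                count += n;
            }
            return n;
        }

        long getCount() {
            return count;
        }
    }

    // Parses the bytes through the counting wrapper and returns the byte count.
    static long parseAndCount(byte[] xml) throws Exception {
        CountingStream in = new CountingStream(new ByteArrayInputStream(xml));
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
        return in.getCount();
    }

    public static void main(String[] args) throws Exception {
        // Assumption: an in-memory document replaces the real URL stream.
        byte[] xml = "<root><item>caf\u00e9</item></root>".getBytes(StandardCharsets.UTF_8);
        System.out.println("bytes in array: " + xml.length);
        System.out.println("bytes counted:  " + parseAndCount(xml));
    }
}
```

The two numbers should agree as long as the parser consumes the whole stream, which it normally has to do in order to check for content after the root element.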

2012-07-05 22:19:07 vanje

Maybe this:

document.getTextContent().getBytes().length;

2012-07-05 11:56:56 Phebus40

No, getTextContent returns null even though the document is populated :\ – guest86 2012-07-05 12:06:37
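
The null result is expected: in the DOM API, getTextContent() is defined to return null for a Document node. The sketch below illustrates this, and also that calling it on the root element returns only the character data, without tags or attributes, so it would not measure the document size either. TextContentDemo and parse are hypothetical names used only for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class TextContentDemo {

    // Helper (hypothetical): parses an XML string into a DOM document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<root><item id=\"1\">abc</item></root>");

        // Document nodes have no text content of their own.
        System.out.println(doc.getTextContent());

        // The root element yields only the concatenated character data;
        // the markup and the id attribute are not part of the result.
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```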

An inelegant way: write it to a .xml file and use file.length() – Phebus40 2012-07-05 12:11:56

You can do it like this:

long start = Runtime.getRuntime().freeMemory();

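Build your XML document object. Then call the above method again.

Document document = parser.getDocument();

long now = Runtime.getRuntime().freeMemory();

// Free memory shrinks as objects are allocated, so subtract "now" from "start".
System.out.println("Size of Document: " + (start - now));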

2012-07-05 12:52:32 UVM

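This won't work - many objects (such as DOM nodes) will have memory allocated for them, not just a string containing the document content. – 2012-07-05 15:43:00

Once the XML file has been parsed into a DOM tree, the source document (as a string) no longer exists. You only have a tree of nodes built from that document, so it is no longer possible to determine the exact size of the source document from the DOM document.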

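You can transform the DOM document back into an XML file using the identity transform, but that is a very roundabout way of getting the size, and the result still won't exactly match the size of the source document.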
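
For illustration, the identity transform can be sketched with the standard javax.xml.transform API, serializing into an in-memory buffer and reading off its length. SerializedSizeDemo and serializedSize are hypothetical names, and the sample document is made up.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

public class SerializedSizeDemo {

    // Serializes the DOM with an identity transform and returns the byte count.
    static int serializedSize(Document doc) throws Exception {
        // newTransformer() without a stylesheet gives the identity transform.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        identity.transform(new DOMSource(doc), new StreamResult(out));
        return out.size();
    }

    public static void main(String[] args) throws Exception {
        // Assumption: a small in-memory document with unusual spacing and quoting.
        byte[] source = "<root   a='1' ><item/></root>".getBytes(StandardCharsets.UTF_8);
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(source));

        System.out.println("source bytes:     " + source.length);
        System.out.println("serialized bytes: " + serializedSize(doc));
    }
}
```

The serializer typically adds an XML declaration and normalizes whitespace inside tags and attribute quoting, which is why the two numbers usually differ.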

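For what you are trying to do, the best approach is to download the document yourself, note its size, and then pass it to the DocumentBuilder.parse method as an InputStream.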
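
A minimal sketch of that approach, which also reports the size in KB or MB as the question asks. SizedParse, readAll and humanSize are hypothetical names, 1 KB is assumed to mean 1024 bytes, and an in-memory stream stands in for the real download.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class SizedParse {

    // Reads the stream to the end in chunks and returns all bytes.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }

    // Formats a byte count as B, KB or MB (assumption: 1 KB = 1024 bytes).
    static String humanSize(long bytes) {
        if (bytes < 1024) {
            return bytes + " B";
        }
        if (bytes < 1024 * 1024) {
            return String.format(Locale.ROOT, "%.1f KB", bytes / 1024.0);
        }
        return String.format(Locale.ROOT, "%.1f MB", bytes / (1024.0 * 1024.0));
    }

    public static void main(String[] args) throws Exception {
        // Assumption: replace this stand-in with url.openStream() for a real download.
        InputStream source = new ByteArrayInputStream(
                "<root><item>abc</item></root>".getBytes(StandardCharsets.UTF_8));

        byte[] data = readAll(source);
        System.out.println("Size: " + humanSize(data.length));

        // Parse from the downloaded bytes, so the document is fetched only once.
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(data));
        System.out.println("Root element: " + doc.getDocumentElement().getNodeName());
    }
}
```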

2012-07-05 15:59:03
