Java SAX parser progress monitoring

9

Use javax.swing.ProgressMonitorInputStream.
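A minimal sketch of that suggestion, not taken from the answer itself: the class name, the counting handler, and the sample XML are made up for illustration. It wraps the input in a ProgressMonitorInputStream before handing it to the SAX parser. The monitor takes its maximum from the stream's available() and only shows a dialog when the read looks slow, so nothing pops up for a tiny input like this one.

```java
import java.awt.Component;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.swing.ProgressMonitorInputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class ProgressMonitorParseSketch {

    /** Hypothetical handler that only counts elements; stands in for a real one. */
    static class CountingHandler extends DefaultHandler {
        int elements = 0;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            elements++;
        }
    }

    /** Parses the stream through a ProgressMonitorInputStream and returns the element count. */
    static int parseWithMonitor(Component parent, InputStream in) {
        CountingHandler handler = new CountingHandler();
        // parent may be null; the message is shown in the dialog if one appears.
        try (InputStream monitored = new ProgressMonitorInputStream(parent, "Parsing XML", in)) {
            SAXParserFactory.newInstance().newSAXParser().parse(monitored, handler);
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
        return handler.elements;
    }

    public static void main(String[] args) {
        byte[] xml = "<articles><article/><article/></articles>".getBytes(StandardCharsets.UTF_8);
        System.out.println(parseWithMonitor(null, new ByteArrayInputStream(xml)));
    }
}
```

For a real file you would pass a FileInputStream and the owning window as the parent component.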

Source

2010-06-23 10:48:21 EJP

+0

I think this will be close enough. Thanks! – Danijel 2010-06-24 10:38:02

+0

Could any answer be simpler than this?! :) – Matthieu 2013-07-17 07:12:39

1

Assuming you know how many articles you have, can't you just keep a counter in the handler? For example:

public void startElement (String uri, String localName, 
          String qName, Attributes attributes) 
          throws SAXException { 
    if ("article".equals(qName)) { 
     counter++; // counter is a field of the handler 
    } 
    ... 
}

(I don't know whether you are actually parsing "articles"; this is just an example.)

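If you don't know the number of articles in advance, you need to count them first. Then you can print the status as "tags read / total number of tags", say every 100 tags (counter % 100 == 0).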

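You could even have another thread monitor the progress. In that case you might want to synchronize access to the counter, but that is not strictly necessary, because it doesn't need to be very accurate.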
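The counter idea above could look roughly like the following sketch. The class and method names, the sample XML, and the choice of AtomicLong (so a monitoring thread can poll the count safely) are assumptions for illustration, and the total is assumed to be known or counted in a first pass.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicLong;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class ArticleCounterSketch {

    /** Counts <article> start tags and reports every reportEvery articles. */
    static class ArticleCountingHandler extends DefaultHandler {
        private final AtomicLong counter = new AtomicLong(); // safe to poll from another thread
        private final long total;      // assumed known, or counted in a first pass
        private final int reportEvery;

        ArticleCountingHandler(long total, int reportEvery) {
            this.total = total;
            this.reportEvery = reportEvery;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            if ("article".equals(qName)) {
                long read = counter.incrementAndGet();
                if (read % reportEvery == 0) {
                    System.out.println(read + "/" + total + " articles read");
                }
            }
        }

        long articlesRead() {
            return counter.get();
        }
    }

    /** Builds a small sample document with the given number of <article/> elements. */
    static String sampleXml(int articles) {
        StringBuilder sb = new StringBuilder("<articles>");
        for (int i = 0; i < articles; i++) sb.append("<article/>");
        return sb.append("</articles>").toString();
    }

    /** Parses the XML and returns how many articles the handler saw. */
    static long parse(String xml, long total, int reportEvery) {
        ArticleCountingHandler handler = new ArticleCountingHandler(total, reportEvery);
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
        return handler.articlesRead();
    }

    public static void main(String[] args) {
        System.out.println(parse(sampleXml(250), 250, 100));
    }
}
```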

My 2 cents

Source

2010-06-23 08:37:02 ewernli

+0

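I figured that, but I'm looking for a way to do this without having to count the articles first. I thought maybe there is a way to find out the parser's position in the file, since I can easily get the file size. – Danijel 2010-06-23 09:15:41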

2

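You can get an estimate of the current line/column in your file by overriding the setDocumentLocator method of org.xml.sax.helpers.DefaultHandler/BaseHandler. The parser calls this method with an object from which you can obtain an approximation of the current line/column whenever you need it.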

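Edit: As far as I know, there is no standard way to get the absolute position. However, I believe some SAX implementations provide this kind of information.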
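A rough sketch of the locator approach, with made-up class and method names: the handler keeps the Locator passed to setDocumentLocator and reads the line number on each start tag. Parsers are encouraged but not required to supply a locator, hence the null check.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

class LocatorSketch {

    /** Remembers the approximate line of the most recent start tag. */
    static class LineTrackingHandler extends DefaultHandler {
        private Locator locator;            // supplied by the parser, if it supports locators
        private volatile int lastLine = -1; // -1 until a start tag has been seen

        @Override
        public void setDocumentLocator(Locator locator) {
            this.locator = locator;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            if (locator != null) {
                lastLine = locator.getLineNumber(); // approximate; column via getColumnNumber()
            }
        }

        int lastLine() {
            return lastLine;
        }
    }

    /** Parses the XML and returns the line reported for the last start tag. */
    static int lastLineSeen(String xml) {
        LineTrackingHandler handler = new LineTrackingHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
        return handler.lastLine();
    }

    public static void main(String[] args) {
        System.out.println(lastLineSeen("<a>\n<b/>\n<c/>\n</a>"));
    }
}
```

Turning the line number into a percentage still requires the total number of lines in the file.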

Source

2010-06-23 08:54:44

+0

Close, but then I would have to know the number of lines in the file, right? – Danijel 2010-06-23 09:17:10

+0

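Indeed. Another idea is the one pointed out by the cryptic EJP: you can use the advance through the input stream to estimate progress. However, that is not the progress of the parsing itself, because there may be buffering and lookahead. – 2010-06-23 12:20:10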

0

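I would use the position in the input stream. Write your own trivial stream class that delegates to or inherits from the "real" one and keeps track of the bytes read. As you said, getting the total file size is easy. I wouldn't worry about buffering, lookahead and so on - for big files like these it is chicken feed. On the other hand, I would cap the position at "99%".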
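A minimal sketch of such a stream, assuming the total size is passed in by the caller (for a file, File.length()). The class and helper names are made up for illustration. The percentage is capped at 99 because the parser may still be working on buffered data when the last byte has been read.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class CountingStreamSketch {

    /** Delegates to the wrapped stream and keeps track of the bytes read. */
    static class CountingInputStream extends FilterInputStream {
        private final long totalBytes;
        private volatile long bytesRead = 0; // written only by the reading thread

        CountingInputStream(InputStream in, long totalBytes) {
            super(in);
            this.totalBytes = totalBytes;
        }

        @Override
        public int read() throws IOException {
            int b = super.read();
            if (b != -1) bytesRead++;
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            if (n > 0) bytesRead += n;
            return n;
        }

        @Override
        public long skip(long n) throws IOException {
            long skipped = super.skip(n);
            if (skipped > 0) bytesRead += skipped;
            return skipped;
        }

        // Keep the count simple: do not let callers rewind the stream.
        @Override
        public boolean markSupported() {
            return false;
        }

        /** Percentage of bytes consumed so far, capped at 99. */
        int percent() {
            if (totalBytes <= 0) return 0;
            return (int) Math.min(99, bytesRead * 100 / totalBytes);
        }
    }

    /** Reads bytesToRead bytes from an in-memory stream of totalBytes and returns the percentage. */
    static int percentAfterReading(int totalBytes, int bytesToRead) {
        try (CountingInputStream in =
                     new CountingInputStream(new ByteArrayInputStream(new byte[totalBytes]), totalBytes)) {
            byte[] buf = new byte[bytesToRead];
            int off = 0;
            while (off < bytesToRead) {
                int n = in.read(buf, off, bytesToRead - off);
                if (n < 0) break; // end of stream
                off += n;
            }
            return in.percent();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(percentAfterReading(200, 50));
    }
}
```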

Source

2011-07-01 17:48:20

10

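Thanks to EJP's suggestion of ProgressMonitorInputStream, in the end I extended FilterInputStream so that a ChangeListener can be used to monitor the current read position in bytes.

With this you have finer control, for example to show multiple progress bars for large XML files being read in parallel. Which is exactly what I did.

So, a simplified version of the monitored stream: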

import java.io.FilterInputStream; 
import java.io.IOException; 
import java.io.InputStream; 
import java.util.concurrent.CopyOnWriteArrayList; 
import javax.swing.event.ChangeEvent; 
import javax.swing.event.ChangeListener; 

/** 
* A class that monitors the read progress of an input stream. 
* 
* @author Hermia Yeung "Sheepy" 
* @since 2012-04-05 18:42 
*/ 
public class MonitoredInputStream extends FilterInputStream { 
    private volatile long mark = 0; 
    private volatile long lastTriggeredLocation = 0; 
    private volatile long location = 0; 
    private final int threshold; 
    // Copy-on-write list, so listeners can be added or removed while an event is being dispatched. 
    private final CopyOnWriteArrayList<ChangeListener> listeners = new CopyOnWriteArrayList<>(); 

    /** 
    * Creates a MonitoredInputStream over an underlying input stream. 
    * @param in Underlying input stream, should be non-null because of no public setter 
    * @param threshold Min. position change (in bytes) to trigger a change event. 
    */ 
    public MonitoredInputStream(InputStream in, int threshold) { 
     super(in); 
     this.threshold = threshold; 
    } 

    /** 
    * Creates a MonitoredInputStream over an underlying input stream. 
    * Default threshold is 16KB; a small threshold may impact performance on larger streams. 
    * @param in Underlying input stream, should be non-null because of no public setter 
    */ 
    public MonitoredInputStream(InputStream in) { 
     this(in, 1024*16); 
    } 

    public void addChangeListener(ChangeListener l) { listeners.addIfAbsent(l); } 
    public void removeChangeListener(ChangeListener l) { listeners.remove(l); } 
    public long getProgress() { return location; } 

    /** Notifies listeners if the position moved at least threshold bytes since the last event. */ 
    protected void triggerChanged(final long location) { 
     if (threshold > 0 && Math.abs(location-lastTriggeredLocation) < threshold) return; 
     lastTriggeredLocation = location; 
     if (listeners.isEmpty()) return; 
     final ChangeEvent evt = new ChangeEvent(this); 
     for (ChangeListener l : listeners) l.stateChanged(evt); 
    } 


    @Override public int read() throws IOException { 
     final int i = super.read(); 
     if (i != -1) triggerChanged(++location); // report the position after this byte, as the other read methods do 
     return i; 
    } 

    @Override public int read(byte[] b, int off, int len) throws IOException { 
     final int i = super.read(b, off, len); 
     if (i > 0) triggerChanged(location += i); 
     return i; 
    } 

    @Override public long skip(long n) throws IOException { 
     final long i = super.skip(n); 
     if (i > 0) triggerChanged(location += i); 
     return i; 
    } 

    @Override public void mark(int readlimit) { 
     super.mark(readlimit); 
     mark = location; 
    } 

    @Override public void reset() throws IOException { 
     super.reset(); 
     if (location != mark) triggerChanged(location = mark); 
    } 
}

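It doesn't know - or care - how big the underlying stream is, so you need to get that some other way, such as from the file itself.

So, here is a simplified sample usage: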

try (
    MonitoredInputStream mis = new MonitoredInputStream(new FileInputStream(file), 65536*4) 
) { 

    // Setup max progress and listener to monitor read progress 
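    // progressBar is not declared in this snippet; if it is a javax.swing.JProgressBar, the corresponding methods are setMaximum(int) and setValue(int). 
    progressBar.setMaxProgress((int) file.length()); // Swing thread or before display please; the int cast overflows for files over 2 GB 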
    mis.addChangeListener(new ChangeListener() { @Override public void stateChanged(ChangeEvent e) { 
     SwingUtilities.invokeLater(new Runnable() { @Override public void run() { 
     progressBar.setProgress((int) mis.getProgress()); // Promise me you WILL use MVC instead of this anonymous class mess! 
     }}); 
    }}); 
    // Start parsing. Listener would call Swing event thread to do the update. 
    SAXParserFactory.newInstance().newSAXParser().parse(mis, this); 

} catch (IOException | ParserConfigurationException | SAXException e) { 

    e.printStackTrace(); 

} finally { 

    progressBar.setVisible(false); // Again please call this in swing event thread 

}

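In my case the progress advances nicely from left to right without abnormal jumps. Adjust the threshold for the best balance between performance and responsiveness. Too small, and the reading time can double on small devices; too big, and the progress will not be smooth.

Hope it helps. Feel free to edit if you find mistakes or typos, or vote up to send me some encouragement! :D

Source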

2012-04-09 09:30:57 Sheepy

+0

Very nice! Exactly what I was looking for. I'll adapt it, thanks! :) – Matthieu 2013-07-17 07:15:25
