gwtwiki "Usage: Parser <XML-FILE>" error

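I am trying to process a Wikimedia dump file (for example: http://dumps.wikimedia.org/enwiki/20150304/enwiki-20150304-pages-meta-history9.xml-p000897146p000925000.bz2) using gwtwiki and Java. I am very new to Java (I can understand and write simple Java programs) and I use Eclipse. I have imported the gwtwiki project and tried to run DumpExample.java, but I get the error response Usage: Parser <XML-FILE>.

I don't know where to define the path to the .bz2 dump file. I tried to at least change the Usage: Parser <XML-FILE> error message to something else, but I get the same result even when I step through the code or add a few lines such as System.out.println("test");

The documentation does not explain how to do this, presumably because it should be self-explanatory to anyone familiar with Java.

I am not asking for a step-by-step tutorial. I just want a starting point or some hints, and I will learn the rest myself. After several days of searching, I found that I don't even know where to start. I also know you could say:

Learn more Java!

But I always learn better by working on projects like this one.

DumpExample.java: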

package info.bliki.wiki.dump; 

import org.xml.sax.SAXException; 

/** 
* Demo application which reads a compressed or uncompressed Wikipedia XML dump 
* file (depending on the given file extension <i>.gz</i>, <i>.bz2</i> or 
* <i>.xml</i>) and prints the title and wiki text. 
* 
*/ 
public class DumpExample { 
    /** 
    * Print title and content of all the wiki pages in the dump. 
    * 
    */ 
    static class DemoArticleFilter implements IArticleFilter { 

     public void process(WikiArticle page, Siteinfo siteinfo) throws SAXException { 
      System.out.println("----------------------------------------"); 
      System.out.println(page.getId()); 
      System.out.println(page.getRevisionId()); 
      System.out.println(page.getTitle()); 
      System.out.println("----------------------------------------"); 
      System.out.println(page.getText()); 
     } 
    } 

    /** 
    * Print all titles of the wiki pages which have &quot;Real&quot; content 
    * (i.e. the title has no namespace prefix) (key == 0). 
    */ 
    static class DemoMainArticleFilter implements IArticleFilter { 

     public void process(WikiArticle page, Siteinfo siteinfo) throws SAXException { 
      if (page.isMain()) { 
       System.out.println(page.getTitle()); 
      } 
     } 

    } 

    /** 
    * Print all titles of the wiki pages which are templates (key == 10). 
    */ 
    static class DemoTemplateArticleFilter implements IArticleFilter { 

     public void process(WikiArticle page, Siteinfo siteinfo) throws SAXException { 
      if (page.isTemplate()) { 
       System.out.println(page.getTitle()); 
      } 
     } 

    } 

    /** 
    * Print all titles of the wiki pages which are categories (key == 14). 
    */ 
    static class DemoCategoryArticleFilter implements IArticleFilter { 

     public void process(WikiArticle page, Siteinfo siteinfo) throws SAXException { 
      if (page.isCategory()) { 
       System.out.println(page.getTitle()); 
      } 
     } 

    } 

    /** 
    * @param args 
    */ 
    public static void main(String[] args) { 
     if (args.length == 1) { 
      System.out.println("test"); 
      System.out.println("test"); 
      System.out.println("test"); 
      System.out.println("test"); 
      System.err.println("Usagessss: Parser <XML-FILEZZZZZZ>"); 
      System.out.println("test2"); 
      System.exit(-1); 
     } 
     // String bz2Filename = 
     // "c:\\temp\\dewikiversity-20100401-pages-articles.xml.bz2"; 
     String bz2Filename = args[0]; 
     try { 
      IArticleFilter handler = new DemoArticleFilter(); 
      WikiXMLParser wxp = new WikiXMLParser(bz2Filename, handler); 
      wxp.parse(); 
     } catch (Exception e) { 
      e.printStackTrace(); 
     } 
    } 
}

Source

2015-06-06 asknomore

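I don't have an answer, but I just want to say: well done for starting to document your code early! It is a good habit. You may have terrible design patterns (I can't tell), but documentation makes your code at least understandable 90% of the time. Of course writing good code is still better, but when you refactor you can reuse the documentation.

Late answer; maybe it will help you, or maybe you have already moved on, and maybe it will help someone who stumbles on this post in the future. I use this implementation: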

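// Requires imports: java.io.File, java.io.IOException, org.xml.sax.SAXException
// handler is any IArticleFilter, e.g. new DemoArticleFilter() from DumpExample
File f = new File("c:/path/to/svwiki-20151102-pages-meta-current.xml");
WikiXMLParser wxp;
try {
    wxp = new WikiXMLParser(f, handler);
    wxp.parse();
} catch (IOException e) {
    e.printStackTrace();
} catch (SAXException e) {
    e.printStackTrace();
}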
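
As a variation on the snippet above, here is a minimal sketch of taking the dump path from the first program argument when one is given and falling back to a hard-coded path otherwise. DumpExample reads args[0], so the usage message appears when no program argument is passed; in Eclipse, program arguments are set under Run Configurations > Arguments. The class name DumpPathResolver and the method resolveDumpPath are hypothetical names used only for illustration, and the fallback path is a placeholder.

```java
import java.io.File;

public class DumpPathResolver {

    // Hypothetical helper: returns the single program argument if exactly
    // one was given, otherwise the fallback path.
    public static String resolveDumpPath(String[] args, String fallback) {
        if (args != null && args.length == 1) {
            return args[0];
        }
        return fallback;
    }

    public static void main(String[] args) {
        // Placeholder fallback path; replace it with the location of your dump.
        String path = resolveDumpPath(args, "c:/path/to/svwiki-20151102-pages-meta-current.xml");
        File dump = new File(path);
        if (!dump.isFile()) {
            System.err.println("Dump file not found: " + dump.getAbsolutePath());
            System.exit(-1);
        }
        // From here the file could be handed to WikiXMLParser as in the snippet above.
        System.out.println("Would parse: " + dump.getAbsolutePath());
    }
}
```

Checking that the file exists before parsing gives a clearer message than a stack trace when the path is mistyped.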

Source

2015-11-17 08:08:57 Fluff
