2015-05-29 170 views


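I wrote a class called Splitter. Its job is to read an XML file and split it into smaller files based on a specific XML start/end tag, and each smaller file must also be smaller than a given maximum file size.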



import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class Splitter {

    public static void split(String directory, String fileName,
            String transactionTag, int fileSize) throws IOException {
        String startTag = "<" + transactionTag + ">";
        String endTag = "</" + transactionTag + ">";
        File f = new File(directory + fileName);
        File output = new File(directory + "Output/" + fileName);
        Splitter sp = new Splitter();
        int fileCount = 0;
        int len;
        int maxFileSize = fileSize;
        byte[] buf = new byte[maxFileSize];
        // Timestamp appended to the names of archived files.
        SimpleDateFormat sdf = new SimpleDateFormat("yyyy_MM_dd_hh_mm_ss");
        String fileTime = sdf.format(new Date());
        try (BufferedInputStream in = new BufferedInputStream(new FileInputStream(f))) {
            while ((len = in.read(buf)) > 0) {
                // Move an output file left over from a previous run into Archive.
                try {
                    File afile = new File(directory + "Output\\" + fileName + "." + fileCount);
                    if (afile.exists()) {
                        if (!afile.renameTo(new File(directory + "Output\\Archive\\" + fileName + "." + fileCount + "-" + fileTime))) {
                            System.out.println("Files failed to be archived. ");
                        }
                    } else {
                        System.out.println("This file does not exist.");
                    }
                } catch (Exception e) {
                    e.printStackTrace();
                }
                String newInput = new String(buf, 0, len); // holds at most maxFileSize bytes of input
                String value = sp.getXML(newInput, transactionTag);

                // This part is incomplete.
                // Do something with value to make this class split the file by XML tags.
                // Also make sure any leftover content before the first start tag and
                // after the last end tag is put into smaller files too.

                int start = value.indexOf(startTag);
                int end = value.lastIndexOf(endTag);

                // For now each chunk is written out unchanged, so the split is by size only.
                try (BufferedOutputStream out = new BufferedOutputStream(new FileOutputStream(output + "." + fileCount))) {
                    out.write(buf, 0, len);
                }
                fileCount++;
            }
        }
    }

    // Returns the text between the first <tagName> and the next </tagName>, or "" if either is missing.
    public String getXML(String content, String tagName) {
        String startTag = "<" + tagName + ">";
        String endTag = "</" + tagName + ">";
        int startposition = content.indexOf(startTag);
        if (startposition == -1) return "";
        int endposition = content.indexOf(endTag, startposition);
        if (endposition == -1) return "";
        startposition += startTag.length();
        return content.substring(startposition, endposition);
    }

    public static void main(String[] args) throws IOException {
        int num = 100;
        int kb = num * 1024; // 100 KB maximum per output file
        split("C:/SplitUp/", "fileSplit.xml", "blah1", kb);
        System.out.println("Program ran");
    }
}

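IIUC, your single input file (`fileSplit.xml`) has multiple `start` and `end` tags, and you want the content between each pair of `start` and `end` tags split into its own separate file, right? –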


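Yes, that's exactly right. I actually have this code working to the point where it splits the file by fileSize, but I also need to split it by those start and end tags. I have the getXML method, which looks at the content between a start and an end tag, and I know I need to call it from the split method and run some kind of loop to split everything up, but I don't know how to go about doing that. I also need to handle the "leftovers", meaning the content before the first start tag and the content after the last end tag each go into a file of their own. I'd appreciate any insight. – Galvatron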
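
The loop described in this comment could be sketched roughly as follows. This is only a minimal illustration, not code from the thread: the class name `TagSegments` and the methods `splitByTag` and `addIfNotBlank` are hypothetical, and it assumes the whole file fits in memory as one String, that the tag carries no attributes, and that tags of the same name are not nested.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original Splitter class.
final class TagSegments {

    /**
     * Splits content into the text before the first start tag, each
     * <tagName>...</tagName> block (tags included), and the text after
     * the last end tag. Whitespace-only leftovers are skipped.
     */
    static List<String> splitByTag(String content, String tagName) {
        String startTag = "<" + tagName + ">";
        String endTag = "</" + tagName + ">";
        List<String> segments = new ArrayList<>();
        int pos = 0;
        while (true) {
            int start = content.indexOf(startTag, pos);
            if (start == -1) {
                break; // no more blocks
            }
            int end = content.indexOf(endTag, start + startTag.length());
            if (end == -1) {
                break; // unterminated block: the rest is treated as leftover
            }
            end += endTag.length();
            addIfNotBlank(segments, content.substring(pos, start)); // leftover before this block
            segments.add(content.substring(start, end));            // the block itself
            pos = end;
        }
        addIfNotBlank(segments, content.substring(pos)); // leftover after the last end tag
        return segments;
    }

    private static void addIfNotBlank(List<String> segments, String text) {
        if (!text.trim().isEmpty()) {
            segments.add(text);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><blah1>a</blah1>\n<blah1>b</blah1></root>";
        for (String segment : splitByTag(xml, "blah1")) {
            System.out.println("[" + segment + "]");
        }
    }
}
```

Each returned segment would then be written to its own numbered output file. Keeping the tags inside each block (unlike getXML, which strips them) is a design choice made here so that every sub-file still shows which element it came from.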




    <!-- Some XML metadata --> 
    <!-- Some XML data --> 
    <!-- Some XML data --> 
    <!-- Some XML data --> 
    <!-- Some XML data --> 
    <!-- Some XML metadata --> 



  1. java.nio.file.Files.readAllLines(Path path, Charset cs) to read your C:/SplitUp/fileSplit.xml
  2. java.io.FileWriter to write to all the sub-files.

In essence (for Java 7+), you can do something like this:

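import java.io.FileWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// directory, fileName, fileCount and maxFileSize are the variables from your code above

// read the entire fileSplit.xml into a list of strings, one per line
List<String> fileContent = Files.readAllLines(Paths.get("C:/SplitUp/fileSplit.xml"), StandardCharsets.UTF_8);

// iterate through the list to split the file content into sub-files
String subFileContent = "";
for (String line : fileContent) {
    String tag = line.trim();
    if (!tag.equalsIgnoreCase("<start>") && !tag.equalsIgnoreCase("<footer>")) {
        // keep reading if this line is neither a <start> nor a <footer>
        subFileContent += line + System.lineSeparator();
    } else {
        // this line is a <start> or a <footer>: write all the content so far into a new sub-file
        // sub-file names taken from your code above. Make sure they are unique!
        try (FileWriter fileWriter = new FileWriter(directory + "Output\\" + fileName + "." + fileCount++)) {
            // this will write only up to maxFileSize characters.
            // how do you want to handle spillover?
            fileWriter.write(subFileContent, 0, Math.min(subFileContent.length(), maxFileSize));
        }

        // reset subFileContent so the next sub-file starts with the current line
        subFileContent = line + System.lineSeparator();
    }
}
// whatever is still in subFileContent here (the last block) has to be written out the same way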



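You can change the last else into an else if, so that when the length() of subFileContent exceeds maxFileSize you write subFileContent out and make sure the remainder goes into a second sub-file. That said, I'd suggest getting the content split into sub-files first, before tackling that second requirement.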
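
One possible way to handle the spillover, sketched under a few assumptions: the class `SpilloverChunks` and its method `chunk` are hypothetical names, and the limit is counted in characters (as FileWriter.write does) rather than in bytes, so with multi-byte encodings the files on disk can be larger than the limit.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the code in the question or the answer.
final class SpilloverChunks {

    /** Cuts content into consecutive pieces of at most maxChars characters each. */
    static List<String> chunk(String content, int maxChars) {
        if (maxChars <= 0) {
            throw new IllegalArgumentException("maxChars must be positive");
        }
        List<String> chunks = new ArrayList<>();
        for (int i = 0; i < content.length(); i += maxChars) {
            int end = Math.min(content.length(), i + maxChars);
            chunks.add(content.substring(i, end));
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Each piece would be written to its own sub-file, incrementing fileCount every time.
        for (String piece : chunk("abcdefg", 3)) {
            System.out.println(piece);
        }
    }
}
```

With something like this, the write inside the loop becomes "for each piece of chunk(subFileContent, maxFileSize), open a new FileWriter and write the piece", so nothing past maxFileSize is silently dropped. Note that a fixed-size cut can land in the middle of an XML element; whether that is acceptable depends on how the sub-files are consumed.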