2017-09-23 132 views
1

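I am new to Apache POI. Unable to write a new Excel file with Apache POI after removing duplicate rows

I wrote a small piece of code to remove duplicate records from an Excel file. I can successfully identify the duplicate records across sheets, but when I write to a new file after removing the records, no output is generated.

Please help me figure out where I am going wrong.

Am I writing the file correctly? Or am I missing something?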

public static void main(String args[]) { 
    DataFormatter formatter = new DataFormatter(); 
    HSSFWorkbook input_workbook; 
    HSSFWorkbook workbook_Output_Final; 

    HSSFSheet input_workbook_sheet; 

    HSSFRow row_Output; 
    HSSFRow row_1_index; 
    HSSFRow row_2_index; 

    String value1 = ""; 
    String value2 = ""; 
    int count; 


    //main try catch block starts 
    try { 

     FileInputStream input_file = new FileInputStream("E:\\TEST\\Output.xls"); //reading from input file 
     input_workbook = new HSSFWorkbook(new POIFSFileSystem(input_file)); 

     for (int sheetnum = 0; sheetnum < input_workbook.getNumberOfSheets(); sheetnum++) { //traversing sheets 

      input_workbook_sheet = input_workbook.getSheetAt(sheetnum); 

      int input_workbook_sheet_total_row = input_workbook_sheet.getLastRowNum(); //fetching last row number 

      for (int input_workbook_sheet_row_1 = 0; input_workbook_sheet_row_1 <= input_workbook_sheet_total_row; input_workbook_sheet_row_1++) { //traversing row 1 

       for (int input_workbook_sheet_row_2 = 0; input_workbook_sheet_row_2 <= input_workbook_sheet_total_row; input_workbook_sheet_row_2++) { 

        row_1_index = input_workbook_sheet.getRow(input_workbook_sheet_row_1); //fetching one iteration row index 
        row_2_index = input_workbook_sheet.getRow(input_workbook_sheet_row_2); //fetching sec iteration row index 

        if (row_1_index != row_2_index) { 
         count = 0; 
         value1 = ""; 
         value2 = ""; 
         for (int row_1_index_cell = 0; row_1_index_cell < row_1_index.getLastCellNum(); row_1_index_cell++) { //traversing cell for each row 
          try { 
           value1 = value1 + formatter.formatCellValue(row_1_index.getCell(row_1_index_cell)); //fetching row cells value 
           value2 = value2 + formatter.formatCellValue(row_2_index.getCell(row_1_index_cell)); //fetching row cells value 

          } catch (NullPointerException e) { 
          } 
          count++; 
          if (count == row_1_index.getLastCellNum()) { 

           if (value1.hashCode() == value2.hashCode()) { //remove the duplicate logic 
            System.out.println("deleted : " + row_2_index); 
            System.out.println("------------------"); 
            input_workbook_sheet.removeRow(row_2_index); 
           } 

          } 
         } 

        } 
       } 
      } 

     } 
     FileOutputStream fileOut = new FileOutputStream("E:\\TEST\\workbook.xls"); 
     input_workbook.write(fileOut); 
     fileOut.close(); 
     input_file.close(); 
    } catch (Exception e) { 
     //e.printStackTrace(); 
    } 
    //main try catch block ends 

} 

Answers

1

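A few things to note:

  1. You swallow every kind of exception. With my test data I got some NullPointerExceptions, and those prevent the workbook from being written.

  2. When deleting rows, walking through the row numbers backwards is an old trick, because then you do not have to adjust the row number for the row you just deleted.

  3. The code empties the row, but it does not move the following rows up (= there is a gap after the deletion). If you want to close that gap, you can use shiftRows.

  4. You compare using hashCode, which is possible (in some use cases), but I think you want to use .equals(). See also Relationship between hashCode and equals method in Java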
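To illustrate point 4, here is a minimal sketch (not part of the original answer; the class and method names are hypothetical) showing that equal hash codes do not prove equal content. The strings "Aa" and "BB" are different, yet both hash to 2112, so a hashCode-only comparison would treat two such rows as duplicates.

```java
// Hypothetical demo class, only meant to illustrate hashCode vs. equals.
class HashVsEquals {

    // Compares by hash code only, the way the question's code does.
    static boolean sameByHash(String a, String b) {
        return a.hashCode() == b.hashCode();
    }

    // Compares by actual content.
    static boolean sameByEquals(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        String first = "Aa";
        String second = "BB";
        System.out.println(sameByHash(first, second));   // prints true (both hash to 2112)
        System.out.println(sameByEquals(first, second)); // prints false
    }
}
```

Equal strings always have equal hash codes, so a hash comparison never misses a real duplicate; the risk is only the false positive shown here.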

Here is some code that worked with my test data. Feel free to comment if something does not work with your data:

import java.io.FileInputStream; 
import java.io.FileOutputStream; 
import java.io.IOException; 
import org.apache.poi.hssf.usermodel.HSSFRow; 
import org.apache.poi.hssf.usermodel.HSSFSheet; 
import org.apache.poi.hssf.usermodel.HSSFWorkbook; 
import org.apache.poi.poifs.filesystem.POIFSFileSystem; 
import org.apache.poi.ss.usermodel.DataFormatter; 

public class RemoveDuplicateRows { // class name is arbitrary 

public static void main(String args[]) throws IOException { 
    DataFormatter formatter = new DataFormatter(); 

    // try-with-resources closes the stream and the workbook even if an exception is thrown 
    try (FileInputStream input_file = new FileInputStream("c:\\temp\\test.xls"); 
      HSSFWorkbook input_workbook = new HSSFWorkbook(new POIFSFileSystem(input_file))) { 

     for (int sheetnum = 0; sheetnum < input_workbook.getNumberOfSheets(); sheetnum++) { 

      HSSFSheet input_workbook_sheet = input_workbook.getSheetAt(sheetnum); 
      int input_workbook_sheet_total_row = input_workbook_sheet.getLastRowNum(); 

      // walk backwards so that removing a row never disturbs rows still to be visited 
      for (int input_workbook_sheet_row_1 = input_workbook_sheet_total_row; input_workbook_sheet_row_1 > 0; input_workbook_sheet_row_1--) { 

       // compare only against earlier rows, so the last copy of each duplicate is kept 
       for (int input_workbook_sheet_row_2 = input_workbook_sheet_row_1 - 1; input_workbook_sheet_row_2 >= 0; input_workbook_sheet_row_2--) { 

        HSSFRow row_1_index = input_workbook_sheet.getRow(input_workbook_sheet_row_1); 
        HSSFRow row_2_index = input_workbook_sheet.getRow(input_workbook_sheet_row_2); 

        if (row_1_index == null || row_2_index == null) { 
         continue; // getRow returns null for rows that were never created 
        } 

        // cover every cell of the longer row; getLastCellNum() is already last index + 1 
        int max_cell = Math.max(row_1_index.getLastCellNum(), row_2_index.getLastCellNum()); 
        boolean duplicate = true; 
        for (int cell = 0; cell < max_cell && duplicate; cell++) { 
         // formatCellValue returns "" for a missing (null) cell 
         String value1 = formatter.formatCellValue(row_1_index.getCell(cell)); 
         String value2 = formatter.formatCellValue(row_2_index.getCell(cell)); 
         duplicate = value1.equals(value2); 
        } 

        // decide only after all cells have been compared 
        if (duplicate) { 
         System.out.println("deleted : " + row_2_index.getRowNum()); 
         System.out.println("------------------"); 
         input_workbook_sheet.removeRow(row_2_index); 

         // close the gap: move every row below the removed one up by one 
         input_workbook_sheet.shiftRows(
           input_workbook_sheet_row_2 + 1, 
           input_workbook_sheet_total_row, 
           -1, 
           true, 
           true); 

         // the kept row and the last row have both moved up by one 
         input_workbook_sheet_row_1--; 
         input_workbook_sheet_total_row--; 
        } 
       } 
      } 
     } 

     try (FileOutputStream fileOut = new FileOutputStream("c:\\temp\\workbook.xls")) { 
      input_workbook.write(fileOut); 
     } 
    } 
} 
} 
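The backward-iteration idea from point 2 can be tried without a spreadsheet. The following is only a sketch under assumptions: a `List<List<String>>` stands in for the sheet (each inner list is one row of already formatted cell values), and `BackwardDedupe` and `removeDuplicates` are hypothetical names, not POI API. It keeps the last copy of each duplicated row and shows the index adjustment that becomes necessary once the rows below a removed row move up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a sheet: each inner list is one row of cell values.
class BackwardDedupe {

    // Removes earlier duplicates in place, keeping the last occurrence of each row.
    static void removeDuplicates(List<List<String>> rows) {
        for (int i = rows.size() - 1; i > 0; i--) {
            // Only look at rows above row i, walking upwards.
            for (int j = i - 1; j >= 0; j--) {
                if (rows.get(i).equals(rows.get(j))) {
                    rows.remove(j); // every row below j moves up by one
                    i--;            // so the row we are keeping is now at i - 1
                }
            }
        }
    }

    public static void main(String[] args) {
        List<List<String>> rows = new ArrayList<>(Arrays.asList(
                Arrays.asList("a", "1"),
                Arrays.asList("b", "2"),
                Arrays.asList("a", "1"),
                Arrays.asList("c", "3"),
                Arrays.asList("b", "2")));
        removeDuplicates(rows);
        System.out.println(rows); // prints [[a, 1], [c, 3], [b, 2]]
    }
}
```

Because the inner loop also runs backwards, removing row `j` never changes the index of the rows it still has to visit; only the index of the kept row needs the `i--` correction.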
+0

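Thanks a lot for the pointers. :) I would like to mention a few points: 1. Your code deletes all of the duplicated rows, unlike mine, which keeps one copy and deletes the rest. Maybe I was not clear about what I meant by removing duplicates, sorry about that. 2. As you said, removing the duplicates leaves the rows blank but does not delete them. 3. Why would comparing hashes not be a good approach? – Akash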

+0

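I have added shiftRows and a link to a more in-depth explanation of hashCode/equals. About 1: it should actually keep one row (the last one). It works with my test data, but if it does not work for you, I need more information about the data you are comparing. – JensS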

+0

Thanks for your help. I was able to modify my code and get it working. :) – Akash