Reading a large file fails with "OutOfMemoryError" (Java)

Sorry for my English. I want to read a large file, but an OutOfMemoryError occurs while reading it. I don't understand how to manage memory in my application. The following code does not work:

try { 

    StringBuilder fileData = new StringBuilder(1000); 
    BufferedReader reader = new BufferedReader(new FileReader(file)); 

    char[] buf = new char[8192]; 
    int bytesread = 0, 
     bytesBuffered = 0; 

    while((bytesread = reader.read(buf)) > -1) { 

     String readData = String.valueOf(buf, 0, bytesread); 
     bytesBuffered += bytesread; 

     fileData.append(readData); //this is error 

     if (bytesBuffered > 1024 * 1024) { 
      bytesBuffered = 0; 
     } 
    } 

    System.out.println(fileData.toString().toCharArray()); 
} finally { 

}


2015-02-07 qazqwerty

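What is the highest Java version you can use? Reading files this way is very outdated, unless you are stuck on Java 6 because of Android or for some other reason. Otherwise, you should use the Java 8 Stream API. – Bevor 2015-02-07 14:40:46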

I use 1.7.0_71. I need to read the large file without readLine(), because the file (5 GB) may contain only a single line. – qazqwerty 2015-02-07 14:57:18

You need to preallocate a large buffer to avoid reallocation.

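File file = ...; 
// File has no size() method; length() returns a long, while a StringBuilder
// capacity is an int, so this only works for files smaller than 2 GB.
StringBuilder fileData = new StringBuilder((int) file.length());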

And run with a large heap size:

java -Xmx2G
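
Whether preallocation can work at all depends on the file size, so a quick check may be worth running first. The sketch below is only an illustration: the class name `HeapCheck` and its methods are hypothetical, it assumes one character per byte of file, 2 bytes per character in memory (as on Java 7), and an array limit of `Integer.MAX_VALUE - 8`, which is a common but VM-dependent value.

```java
import java.io.File;

class HeapCheck {
    // Assumed upper bound for array length; the exact limit depends on the VM.
    static final long MAX_CHARS = Integer.MAX_VALUE - 8;

    /** Bytes needed to hold the given number of chars (assumes 2 bytes per char, as on Java 7). */
    static long bytesNeeded(long chars) {
        return chars * 2L;
    }

    /** True if that many chars fit both the array limit and the given heap size. */
    static boolean fitsInMemory(long chars, long maxHeapBytes) {
        return chars <= MAX_CHARS && bytesNeeded(chars) < maxHeapBytes;
    }

    public static void main(String[] args) {
        File file = new File(args.length > 0 ? args[0] : "path_to_your_file");
        // maxMemory() reflects the -Xmx setting of the running JVM
        long maxHeap = Runtime.getRuntime().maxMemory();
        System.out.println("file length: " + file.length() + " bytes, max heap: " + maxHeap + " bytes");
        System.out.println("fits in a StringBuilder: " + fitsInMemory(file.length(), maxHeap));
    }
}
```

Under these assumptions a 5 GB file fails the check for any heap size, because it has more characters than a single array can hold.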

==== Update

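The while loop reuses its buffer and does not need much memory to run. Treat the input as a stream and match the search string against the stream. This is a very simple state machine. If you need to search for several words, you can look for a trie implementation that supports streams.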

// the match state model 
...xxxxxxabxxxxxaxxxxxabcdexxxx... 
         ab     a     abcd 

    // needs: import java.io.BufferedReader; java.io.File; java.io.FileReader; java.io.IOException
    File file = new File("path_to_your_file"); 
    String yourSearchWord = "abcd"; 
    int matchIndex = 0; // number of characters of the word matched so far 
    long position = 0;  // number of characters read so far 
    try (BufferedReader reader = new BufferedReader(new FileReader(file))) { 
     int chr; 
     while ((chr = reader.read()) != -1) { 
      position++; 
      if (chr == yourSearchWord.charAt(matchIndex)) { 
       matchIndex++; 
      } else { 
       // mismatch: the current character may still start a new match. 
       // This simple restart is exact only when the first character does not 
       // repeat inside the word (as in "abcd"). 
       matchIndex = (chr == yourSearchWord.charAt(0)) ? 1 : 0; 
      } 
      if (matchIndex == yourSearchWord.length()) { 
       // match!! 
       System.out.println("match ends at character " + position); 
       matchIndex = 0; 
      } 
     } 
    }
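
For search words in which a prefix repeats (for example "aab"), the state machine has to fall back to a partial match instead of restarting from zero. One common way to do that is the Knuth-Morris-Pratt algorithm; this is a swapped-in technique, not the code from this answer. The sketch below is a minimal version under these assumptions: the class `StreamSearch` and its method names are hypothetical, positions are counted in Java chars rather than bytes, and the input is any `Reader`.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class StreamSearch {
    /** fail[i] = length of the longest proper prefix of word[0..i] that is also a suffix of it. */
    static int[] failureTable(String word) {
        int[] fail = new int[word.length()];
        int k = 0;
        for (int i = 1; i < word.length(); i++) {
            while (k > 0 && word.charAt(i) != word.charAt(k)) {
                k = fail[k - 1];
            }
            if (word.charAt(i) == word.charAt(k)) {
                k++;
            }
            fail[i] = k;
        }
        return fail;
    }

    /** Returns the 0-based char offsets at which word starts in the stream. */
    static List<Long> findAll(Reader in, String word) throws IOException {
        if (word.isEmpty()) {
            throw new IllegalArgumentException("word must not be empty");
        }
        int[] fail = failureTable(word);
        List<Long> hits = new ArrayList<Long>();
        char[] buf = new char[8192]; // the only buffer; memory use does not grow with file size
        int matched = 0;             // chars of word matched so far
        long position = 0;           // offset of the char being examined
        int n;
        while ((n = in.read(buf)) != -1) {
            for (int i = 0; i < n; i++, position++) {
                char c = buf[i];
                // on mismatch, fall back to the longest prefix that can still match
                while (matched > 0 && c != word.charAt(matched)) {
                    matched = fail[matched - 1];
                }
                if (c == word.charAt(matched)) {
                    matched++;
                }
                if (matched == word.length()) {
                    hits.add(position - word.length() + 1);
                    matched = fail[matched - 1]; // keep going, allowing overlapping matches
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(findAll(new StringReader("xxabxxaxxabcdexx"), "abcd")); // prints [9]
        System.out.println(findAll(new StringReader("aaab"), "aab"));              // prints [1]
    }
}
```

For a real file, the `StringReader` could be replaced by something like `new InputStreamReader(new FileInputStream(file), "UTF-8")`; the charset here is an assumption and must match the file's actual encoding. The list of hits grows with the number of matches, so for very frequent words it may be better to process each hit immediately.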


2015-02-07 13:44:35 javamonk

Thanks for your reply. Is this still OK if, for example, the file is 10 GB and I use 'StringBuilder fileData = new StringBuilder(file.size());'? – qazqwerty 2015-02-07 14:13:28

Can you describe what you want to do with the 10 GB file? The approach may differ depending on your task. – javamonk 2015-02-07 14:17:38

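In a large file (5-10 GB) I need to find the string that contains the desired word. I don't know how to do it, maybe character by character or by loading the file in parts. I would be glad to find an example of that. – qazqwerty 2015-02-07 14:21:41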

Try this. It might help:

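try (BufferedReader reader = new BufferedReader(new FileReader(file))) { 
    char[] buf = new char[8192]; 
    int n; 
    // read(char[]) returns the number of chars read, or -1 at end of stream; 
    // unlike readLine(), it never has to hold a whole line in memory 
    while ((n = reader.read(buf)) != -1) { 
     System.out.print(new String(buf, 0, n)); 
    } 
} catch (IOException e) { 
    System.out.println("Error : " + e.getMessage()); 
}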


2015-02-07 13:43:05 khandelwaldeval

Thanks for your answer. There is an error in 'txt = reader) != null': Unresolved compilation problem: Type mismatch: cannot convert from BufferedReader to String. – qazqwerty 2015-02-07 14:09:59

@qazqwerty Sorry, my bad... it's 'reader.read()'. See the edited code. – khandelwaldeval 2015-02-07 14:13:41

@qazqwerty If it helps you, don't forget to accept (tick) the answer. – khandelwaldeval 2015-02-07 14:14:51

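You should not hold such a large file in memory, because you will run out of it, as you have seen. Since you use Java 7, you need to read the file manually as a stream and check the content on the fly. Otherwise, you could use the Java 8 Stream API. This is just an example. It works, but keep in mind that the position of the found word may vary because of encoding issues, so this is not production code: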

import java.io.File; 
import java.io.FileInputStream; 
import java.io.IOException; 

public class FileReader 
{ 
    private static String wordToFind = "SEARCHED_WORD"; 
    private static File file = new File("YOUR_FILE"); 
    private static int currentMatchingPosition; 
    // long, not int: a 5-10 GB file has more bytes than an int can count 
    private static long foundAtPosition = -1; 
    private static long charsRead; 

    public static void main(String[] args) throws IOException 
    { 
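     // buffer the stream: an unbuffered FileInputStream.read() does one native read per byte 
     try (java.io.InputStream fis = new java.io.BufferedInputStream(new FileInputStream(file))) 
     { 
      // available() returns an int and is only an estimate; File.length() is exact and a long 
      System.out.println("Total size to read (in bytes) : " + file.length()); 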

      int c; 
      while ((c = fis.read()) != -1) 
      { 
       charsRead++; 
       checkContent(c); 
      } 

      if (foundAtPosition > -1) 
      { 
       System.out.println("Found word at position: " + (foundAtPosition - wordToFind.length())); 
      } 
      else 
      { 
       System.out.println("Didn't find the word!"); 
      } 

     } 
     catch (IOException e) 
     { 
      e.printStackTrace(); 
     } 
    } 

    private static void checkContent(int c) 
    { 
     if (currentMatchingPosition >= wordToFind.length()) 
     { 
      //already found.... 
      return; 
     } 

     if (wordToFind.charAt(currentMatchingPosition) == (char)c) 
     { 
      foundAtPosition = charsRead; 
      currentMatchingPosition++; 
     } 
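     else if (wordToFind.charAt(0) == (char)c) 
     { 
      // mismatch, but this byte may start a new match 
      // (exact only when the first character does not repeat inside the word) 
      foundAtPosition = charsRead; 
      currentMatchingPosition = 1; 
     } 
     else 
     { 
      currentMatchingPosition = 0; 
      foundAtPosition = -1; 
     } 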
    } 
}
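
The Java 8 alternative mentioned in this answer could look roughly like the sketch below. It is only an illustration: the class `LineSearch` and its method are hypothetical names, UTF-8 is an assumed encoding, and the approach is line-based, so every single line must still fit in memory. For a 5 GB file that consists of one line it would not help.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

class LineSearch {
    /** Counts the lines that contain word; lines are read lazily, one at a time. */
    static long countMatchingLines(Path path, String word) throws IOException {
        // try-with-resources closes the underlying file when the stream is done
        try (Stream<String> lines = Files.lines(path, StandardCharsets.UTF_8)) {
            return lines.filter(line -> line.contains(word)).count();
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Paths.get(args.length > 0 ? args[0] : "YOUR_FILE");
        System.out.println(countMatchingLines(path, "SEARCHED_WORD"));
    }
}
```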


2015-02-07 16:01:43 Bevor
