Problem using a Base64 encoder with InputStreamReader

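I have some CLOB columns in a database that I need to put Base64-encoded binary files into. These files can be large, so I need to stream them; I can't read the whole thing at once.

I'm using org.apache.commons.codec.binary.Base64InputStream to do the encoding, and I'm running into a problem. My code is essentially this: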

FileInputStream fis = new FileInputStream(file); 
Base64InputStream b64is = new Base64InputStream(fis, true, -1, null); 
BufferedReader reader = new BufferedReader(new InputStreamReader(b64is)); 

preparedStatement.setCharacterStream(1, reader);

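When I run the code above, I get a java.io.IOException: Underlying input stream returned zero bytes during execution of the update. It is thrown deep inside the InputStreamReader code.

Why would this not work? It seems to me that the reader would try to read from the Base64 stream, which would read from the file stream, and everything should be happy.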


2010-05-30 karoberts

This appears to be a bug in Base64InputStream. You are calling it correctly.

You should report this to the Apache Commons Codec project.

A simple test case:

import java.io.*;
import org.apache.commons.codec.binary.Base64InputStream;

class tmp {
    public static void main(String[] args) throws IOException {
        FileInputStream fis = new FileInputStream(args[0]);
        // doEncode = true, lineLength = -1 (no line breaks), no line separator
        Base64InputStream b64is = new Base64InputStream(fis, true, -1, null);

        while (true) {
            byte[] c = new byte[1024];
            int n = b64is.read(c);
            if (n < 0) break;                                 // end of stream
            if (n == 0) throw new IOException("returned 0!"); // contract violation
            for (int i = 0; i < n; i++) {
                System.out.print((char) c[i]);
            }
        }
    }
}

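The read(byte[]) call of InputStream is not allowed to return 0 (unless the array passed in has length zero). Here it does return 0 for any file whose length is a multiple of 3 bytes.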
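To see the contract at work without Commons Codec, here is a minimal sketch. The class names `ZeroOnceStream` and `NonZeroReadStream` are hypothetical and not part of any library: the first returns 0 once from a bulk read to mimic the bug, and the second is a possible caller-side workaround that retries until the wrapped stream yields data or EOF.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class ZeroReadDemo {

    /** Hypothetical misbehaving stream: its first bulk read returns 0. */
    static class ZeroOnceStream extends FilterInputStream {
        private boolean returnedZero = false;

        ZeroOnceStream(InputStream in) { super(in); }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            if (!returnedZero && len > 0) {
                returnedZero = true;
                return 0; // not allowed by the InputStream contract when len > 0
            }
            return super.read(b, off, len);
        }
    }

    /** Hypothetical workaround: retry until the wrapped stream returns data or EOF. */
    static class NonZeroReadStream extends FilterInputStream {
        NonZeroReadStream(InputStream in) { super(in); }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            if (len == 0) return 0;
            int n;
            do {
                n = super.read(b, off, len);
            } while (n == 0); // would spin forever if the stream never makes progress
            return n;
        }
    }

    /** Reads the whole stream through an InputStreamReader, as JDBC would. */
    static String readAll(InputStream in) throws IOException {
        Reader reader = new InputStreamReader(in, StandardCharsets.US_ASCII);
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[64];
        int n;
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "aGVsbG8=".getBytes(StandardCharsets.US_ASCII);
        try {
            readAll(new ZeroOnceStream(new ByteArrayInputStream(data)));
        } catch (IOException e) {
            System.out.println("unwrapped: " + e.getMessage());
        }
        System.out.println("wrapped: " + readAll(
                new NonZeroReadStream(new ZeroOnceStream(new ByteArrayInputStream(data)))));
    }
}
```

Wrapping the Base64InputStream in something like `NonZeroReadStream` before handing it to InputStreamReader may be enough to get past the exception until the library itself is fixed.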


2010-05-30 03:34:54

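Yes, you're right. This is a bug in Base64InputStream. +1 for the test case confirming it. – BalusC 2010-05-30 03:46:32

Reported it, by the way: https://issues.apache.org/jira/browse/CODEC-101 That said, I'm still wondering about the coincidence that my test file was indeed a multiple of 3 bytes long :o) – BalusC 2010-05-30 04:08:48

Wow, thanks for confirming. I must say I'm surprised that I (unintentionally) found such a bug. – karoberts 2010-05-30 05:27:27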

"For top efficiency, consider wrapping an InputStreamReader within a BufferedReader. For example:"

BufferedReader in = new BufferedReader(new InputStreamReader(b64is));

Addendum: As Base64 is padded to a multiple of 4 characters, verify that the source isn't truncated. A flush() may be required.
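The padding rule can be checked with a short sketch. It uses java.util.Base64 from Java 8+, which is an assumption on my part and not the Commons Codec class from the question. Every 3 input bytes become 4 output characters, and a trailing group of 1 or 2 bytes is padded with `=`; an input that is a multiple of 3 bytes long needs no padding at all.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class Base64PaddingDemo {

    /** Encodes an ASCII string with the standard (non-MIME) Base64 alphabet. */
    static String encode(String s) {
        return Base64.getEncoder().encodeToString(s.getBytes(StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) {
        for (String s : new String[] {"a", "ab", "abc", "abcd"}) {
            String enc = encode(s);
            // The encoded length is always a multiple of 4 characters.
            System.out.println(s.length() + " byte(s) -> " + enc + " (" + enc.length() + " chars)");
        }
    }
}
```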


2010-05-30 01:41:58 trashgod

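Maybe it's more efficient, but it doesn't solve the problem. – karoberts 2010-05-30 02:08:49

Is your stream truncated? IIRC, `base64` is framed. – trashgod 2010-05-30 02:25:23

Question updated. Can you elaborate on what you mean by "base64 is framed"? The stream comes straight from a file. – karoberts 2010-05-30 02:31:44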

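Interesting. I did some tests here, and it does indeed throw that exception when you read the Base64InputStream through an InputStreamReader, regardless of where the stream comes from, yet it works flawlessly when you read it as a binary stream. As trashgod mentioned, Base64 encoding is framed. The InputStreamReader should in fact have invoked flush() on the Base64InputStream once more to see whether it returns any more data.

~~I don't see any other way to fix this than implementing your own Base64InputStreamReader or Base64Reader~~. This is actually a bug, see Keith's answer.

As a workaround, you could also store it in a BLOB instead of a CLOB in the database and use PreparedStatement#setBinaryStream() instead. It doesn't matter whether it's stored as binary data or not. You don't want such large Base64 data to be indexable or searchable anyway.

Update: since that is not an option, and since having the Apache Commons Codec guys fix the Base64InputStream bug (which I reported as CODEC-101) may take some time, you could consider using another third-party Base64 API. I found one here (public domain, so you can do whatever you want with it, even place it in your own package). I've tested it here and it works fine.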

InputStream base64 = new Base64.InputStream(input, Base64.ENCODE);
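On Java 8 or later, a similar effect may be achievable without a third-party class. The sketch below is an assumption-laden illustration, not the public-domain class mentioned above: `ChunkedBase64EncodingStream` is a hypothetical name, and it relies on java.util.Base64, which did not exist when this thread was written. It reads the source in chunks that are a multiple of 3 bytes, so only the last chunk can carry `=` padding, and its bulk read never returns 0 for a non-empty buffer.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

/** Hypothetical streaming Base64 encoder built on java.util.Base64. */
class ChunkedBase64EncodingStream extends InputStream {
    private final InputStream src;
    // Multiple of 3 bytes: every chunk except the last encodes without padding.
    private final byte[] raw = new byte[3 * 1024];
    private byte[] encoded = new byte[0];
    private int pos = 0;
    private boolean eof = false;

    ChunkedBase64EncodingStream(InputStream src) {
        this.src = src;
    }

    /** Reads and encodes the next chunk; returns false once the source is exhausted. */
    private boolean fill() throws IOException {
        if (eof) return false;
        int filled = 0;
        while (filled < raw.length) {
            int n = src.read(raw, filled, raw.length - filled);
            if (n < 0) { eof = true; break; }
            filled += n;
        }
        if (filled == 0) return false;
        encoded = Base64.getEncoder().encode(Arrays.copyOf(raw, filled));
        pos = 0;
        return true;
    }

    @Override
    public int read() throws IOException {
        if (pos >= encoded.length && !fill()) return -1;
        return encoded[pos++] & 0xFF;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        if (len == 0) return 0;
        if (pos >= encoded.length && !fill()) return -1;
        int n = Math.min(len, encoded.length - pos);
        System.arraycopy(encoded, pos, b, off, n);
        pos += n;
        return n; // at least 1 here, so InputStreamReader does not complain
    }

    @Override
    public void close() throws IOException {
        src.close();
    }

    /** Drains a stream through an InputStreamReader into a String. */
    static String readAll(InputStream in) throws IOException {
        Reader reader = new InputStreamReader(in, StandardCharsets.US_ASCII);
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[512];
        int n;
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[6144]; // a multiple of 3 bytes, the case that triggered the bug
        Arrays.fill(data, (byte) 'x');
        String streamed = readAll(new ChunkedBase64EncodingStream(new ByteArrayInputStream(data)));
        System.out.println(streamed.equals(Base64.getEncoder().encodeToString(data)));
    }
}
```

To feed this to `setCharacterStream`, one would wrap it in `new InputStreamReader(stream, StandardCharsets.US_ASCII)`, just as with the Base64InputStream in the question.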

Update 2: the Commons Codec guys fixed it pretty quickly.

Index: src/java/org/apache/commons/codec/binary/Base64InputStream.java 
=================================================================== 
--- src/java/org/apache/commons/codec/binary/Base64InputStream.java (revision 950817) 
+++ src/java/org/apache/commons/codec/binary/Base64InputStream.java (working copy) 
@@ -145,21 +145,41 @@ 
     } else if (len == 0) { 
      return 0; 
     } else { 
-   if (!base64.hasData()) { 
-    byte[] buf = new byte[doEncode ? 4096 : 8192]; 
-    int c = in.read(buf); 
-    // A little optimization to avoid System.arraycopy() 
-    // when possible. 
-    if (c > 0 && b.length == len) { 
-     base64.setInitialBuffer(b, offset, len); 
+   int readLen = 0; 
+   /* 
+    Rationale for while-loop on (readLen == 0): 
+    ----- 
+    Base64.readResults() usually returns > 0 or EOF (-1). In the 
+    rare case where it returns 0, we just keep trying. 
+ 
+    This is essentially an undocumented contract for InputStream 
+    implementors that want their code to work properly with 
+    java.io.InputStreamReader, since the latter hates it when 
+    InputStream.read(byte[]) returns a zero. Unfortunately our 
+    readResults() call must return 0 if a large amount of the data 
+    being decoded was non-base64, so this while-loop enables proper 
+    interop with InputStreamReader for that scenario. 
+    ----- 
+    This is a fix for CODEC-101 
+   */ 
+   while (readLen == 0) { 
+    if (!base64.hasData()) { 
+     byte[] buf = new byte[doEncode ? 4096 : 8192]; 
+     int c = in.read(buf); 
+     // A little optimization to avoid System.arraycopy() 
+     // when possible. 
+     if (c > 0 && b.length == len) { 
+      base64.setInitialBuffer(b, offset, len); 
+     } 
+     if (doEncode) { 
+      base64.encode(buf, 0, c); 
+     } else { 
+      base64.decode(buf, 0, c); 
+     } 
       } 
-    if (doEncode) { 
-     base64.encode(buf, 0, c); 
-    } else { 
-     base64.decode(buf, 0, c); 
-    } 
+    readLen = base64.readResults(b, offset, len); 
      } 
-   return base64.readResults(b, offset, len); 
+   return readLen; 
     } 
    }

I tried it here and it works fine.


2010-05-30 03:43:09 BalusC

+1 Nice workaround. – trashgod 2010-05-30 03:59:45

Unfortunately, I can't use a BLOB, because sometimes the data there will be text. – karoberts 2010-05-30 05:33:48

+1 Thanks, that class will work nicely. – karoberts 2010-05-30 16:19:31
