2012-07-20 51 views
1
Opening an ODT file created in Java with ZipOutputStream fails in OpenOffice

I have the following method to add zip entries to a ZipOutputStream:

// Needs java.io.IOException and java.util.zip.{CRC32, ZipEntry, ZipOutputStream}.
private void addFile(String filename, byte[] bytes, ZipOutputStream zos, boolean encrypt) throws IOException { 
     ZipEntry entry = new ZipEntry(filename); 
     if (encrypt) { 
      // Despite its name, this flag selects compression (DEFLATED), not encryption.
      entry.setMethod(ZipEntry.DEFLATED); 
     } else { 
      // STORED entries need their size and CRC-32 set before putNextEntry().
      entry.setMethod(ZipEntry.STORED); 
      CRC32 crc32 = new CRC32(); 
      crc32.update(bytes); 
      entry.setCrc(crc32.getValue()); 
      entry.setSize(bytes.length); 
      entry.setCompressedSize(bytes.length); 
     } 
     zos.putNextEntry(entry); 
     zos.write(bytes); 
     zos.flush(); 
     zos.closeEntry(); 
    } 

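...I use it by opening a new ZipOutputStream (ZipOutputStream zos = new ZipOutputStream(new BufferedOutputStream(new FileOutputStream(new File(path))));), setting its compression method (zos.setMethod(ZipOutputStream.DEFLATED);), and then calling the method for the following files (in this order):

  1. "mimetype" (for this file, I set the ZipEntry method to STORED)
  2. "manifest.xml" in a subfolder called "META-INF" (META-INF/manifest.xml)
  3. "content.xml"
  4. "styles.xml"
  5. "meta.xml"
  6. "thumbnail.png" in a subfolder called "Thumbnails" ("Thumbnails/thumbnail.png")
  7. "settings.xml"

...and finally, I call the close method of the ZipOutputStream (zos.close();).

If I try to open the result directly with OpenOffice, it asks me what kind of file I want to open, says the file is corrupted, and finally opens the file... But if I unzip the file (I use WinRAR) and then zip it again with the same tool (WinRAR, I mean) without changing anything, OpenOffice is able to open the document without any problem...

Any help? Thanks in advance!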

+0

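That sounds a bit odd. Are you handling exceptions *properly*? – 2012-07-20 23:42:32

+0

I have one big "try-catch" around all the code, and no exception is thrown. – 2012-07-20 23:54:07

+0

Do you have any small "try-catch" blocks that squash exceptions? (I asked whether you are handling exceptions *properly*...) – 2012-07-21 00:38:41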

Answers

0
**I don't see a ZipEntry for the directory META-INF/ in your list of files. Is the manifest.xml file ending up in the root directory? Then again, I'm guessing that's not the problem.**

> 1. "mimetype" (for this file, I set the ZipEntry method to STORED) 
> 2. "manifest.xml" in a subfolder called "META-INF" (META-INF/manifest.xml) 
> 3. "content.xml" 
> 4. "styles.xml" 
> 5. "meta.xml" 
> 6. "thumbnail.png" in a subfolder called "Thumbnails" ("Thumbnails/thumbnail.png") 
> 7. "settings.xml" 

*That's why OO freaks out for a moment. When you open the archive in WinRAR, WinZip or any other zip utility and re-zip it, the utility creates entries for the directories in the zip file, because it notices that the entry name is a valid path on the current OS. I bet that if you took the bad zip file with no directory entries and opened it in OO, it might not be able to recover, because the File.separator is leaning the other way.*

*So when you open it directly in OO, it can't determine what type of content it is opening, maybe because it can't locate the manifest.xml, which it expects to unpack into the META-INF/ directory.*

*This causes a chain of side effects. Files that are inside directories are seen as plain files with a path-like name, "Configurations2/accelerator/current.xml" for example. Because the name (the string representation of the path) does not come with actual folder entries, it's just a string that looks like a path, so OO thinks "Configurations2/accelerator/current.xml" is the name of the file. It does not recognize it as two directories with a file in them.*


**So why does the zip utility, or OO itself, fix it at some point?**

*What's most likely going on is that when OO unpacks the files, it tries to find the entry named manifest.xml but can't. There is an entry or resource named "META-INF/manifest.xml", but it does a compare == 0 and withInString to find the keys.*

*This gets fixed when the application writes the entry "META-INF/manifest.xml" back into a zip file. Because of the "/", it knows that this is a path containing directories and adds those directories as entries in the zip file. So when it reopens the document, or recovers by trying to save and reload it, it now has a directory entry META-INF/. It adds that first, then the manifest.xml file into that directory, and the ODF structure is the way it likes it.*

**Important: when adding files (images, data, video, audio or any other type) you should either:
- Create a ZipEntry for each folder in the path. For /MyCustomImgDir/Trip/swimming.png:
    -- entry name MyCustomImgDir/ should be added as a directory entry
    -- and entry name MyCustomImgDir/Trip/ should be added as another directory entry
- OR list the file in META-INF/manifest.xml and provide its full-path location, so OO can find it and put it in the proper place.**
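
The first option above can be sketched as a small helper. This is only an illustration under stated assumptions: the class name `ParentDirEntries`, the method names and the `written` set are hypothetical, and whether OpenOffice actually requires these directory entries is this answer's claim, not something the sketch proves.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class ParentDirEntries {

    /** Removes a leading "/" so that entry names are relative to the zip root. */
    public static String stripLeadingSlash(String entryName) {
        return entryName.startsWith("/") ? entryName.substring(1) : entryName;
    }

    /** Returns the directory entry names implied by an entry name, parents first. */
    public static List<String> parentDirs(String entryName) {
        String name = stripLeadingSlash(entryName);
        List<String> dirs = new ArrayList<>();
        for (int slash = name.indexOf('/'); slash >= 0; slash = name.indexOf('/', slash + 1)) {
            dirs.add(name.substring(0, slash + 1)); // keep the trailing "/"
        }
        return dirs;
    }

    /** Writes any parent directory entries not written yet, then the file entry itself. */
    public static void addWithParents(ZipOutputStream zos, Set<String> written,
                                      String entryName, byte[] bytes) throws IOException {
        for (String dir : parentDirs(entryName)) {
            if (written.add(dir)) {                  // true only the first time we see this dir
                zos.putNextEntry(new ZipEntry(dir)); // a name ending in "/" is a directory entry
                zos.closeEntry();
            }
        }
        zos.putNextEntry(new ZipEntry(stripLeadingSlash(entryName)));
        zos.write(bytes);
        zos.closeEntry();
    }
}
```

The caller would keep one `Set<String>` (for example a `HashSet`) per archive and pass it to every `addWithParents` call, so that each directory entry is written once.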


********************************************************************************** 
* Optional reading for more detail if you're not familiar with zip files and paths * 
********************************************************************************** 
* Why the section below? - I spent many years working on a platform that used * 
*       the ODF format, when the API was limited * 
*       and the only way to do many things was to rip apart the * 
*       XML and add, delete, modify and move around the bytes all over. * 
*       Major pain in the butt; I just want to save someone a few weeks * 
*       of heck. (I can say that, right?) 
********************************************************************************** 


To go into more detail, here is an example: I have a file.ods, and when I list the contents of the file, this is what I get, with 15 files:

[email protected]~/temp/eKits/variableData>unzip -l ../variabledata.ods 
Archive: ../variabledata.ods 
  Length     Date   Time    Name
 --------    ----   ----    ----
       46  04-11-08 18:36   mimetype
        0  04-11-08 18:36   Configurations2/statusbar/
        0  04-11-08 18:36   Configurations2/accelerator/current.xml
        0  04-11-08 18:36   Configurations2/floater/
        0  04-11-08 18:36   Configurations2/popupmenu/
        0  04-11-08 18:36   Configurations2/progressbar/
        0  04-11-08 18:36   Configurations2/menubar/
        0  04-11-08 18:36   Configurations2/toolbar/
        0  04-11-08 18:36   Configurations2/images/Bitmaps/
    61403  04-11-08 18:36   content.xml
     6909  04-11-08 18:36   styles.xml
     1037  04-11-08 18:36   meta.xml
     1355  04-11-08 18:36   Thumbnails/thumbnail.png
     7668  04-11-08 18:36   settings.xml
     1873  04-11-08 18:36   META-INF/manifest.xml
 --------                   -------
    80291                   15 files


***************************************************************************** 
and unzipping it produces (15 files, as listed):

[email protected]~/temp/eKits>unzip -d variabledata variabledata.ods 
Archive: variabledata.ods 
extracting:/mimetype 
    creating:/Configurations2/statusbar/ 
    inflating:/Configurations2/accelerator/current.xml 
    creating:/Configurations2/floater/ 
    creating:/Configurations2/popupmenu/ 
    creating:/Configurations2/progressbar/ 
    creating:/Configurations2/menubar/ 
    creating:/Configurations2/toolbar/ 
    creating:/Configurations2/images/Bitmaps/ 
    inflating:/content.xml 
    inflating:/styles.xml 
extracting:/meta.xml 
    inflating:/Thumbnails/thumbnail.png 
    inflating:/settings.xml 
    inflating:/META-INF/manifest.xml 

***************************************************************************** 
So from my unzipped files and folders I re-zip everything without any change, and the results show:

[email protected]~/temp/eKits/variableData>zip -ru newDocZip.zip * 
    zip warning: newDocZip.zip not found or empty 
    adding: Configurations2/ (stored 0%) 
    adding: Configurations2/accelerator/ (stored 0%) 
    adding: Configurations2/accelerator/current.xml (stored 0%) 
    adding: Configurations2/floater/ (stored 0%) 
    adding: Configurations2/images/ (stored 0%) 
    adding: Configurations2/images/Bitmaps/ (stored 0%) 
    adding: Configurations2/menubar/ (stored 0%) 
    adding: Configurations2/popupmenu/ (stored 0%) 
    adding: Configurations2/progressbar/ (stored 0%) 
    adding: Configurations2/statusbar/ (stored 0%) 
    adding: Configurations2/toolbar/ (stored 0%) 
    adding: META-INF/ (stored 0%) 
    adding: META-INF/manifest.xml (deflated 83%) 
    adding: Thumbnails/ (stored 0%) 
    adding: Thumbnails/thumbnail.png (deflated 44%) 
    adding: content.xml (deflated 89%) 
    adding: meta.xml (deflated 59%) 
    adding: mimetype (deflated 4%) 
    adding: settings.xml (deflated 87%) 
    adding: styles.xml (deflated 78%) 

****************************************************************************** 
*****now there are 20 entries******************* 
Why doesn't OO include a ZipEntry for folders that are in the middle of a path? 

For example, there is an entry at: Configurations2/accelerator/current.xml 
but you do not see entries for the directories: 
- Configurations2/ 
and 
- Configurations2/accelerator/ 
Why? Because it knows when it keeps 

****************************************************************************** 

**Notice that mimetype is the first ZipEntry in the zip file created by OO, while it can end up anywhere when the archive is created by a zip utility.** 

*Notice META-INF/, Thumbnails/ and the other folders, which are added in order (parent dir before its children)!* 

**So let's check the mimetype of the .ods and .zip files with a command-line utility:** 

[email protected]~/temp/eKits>file --mime variabledata.ods 
variabledata.ods: application/vnd.oasis.opendocument.spreadsheet; charset=binary 

*You see that the mimetype shows as application/vnd.oasis.opendocument.spreadsheet. When you do the same for the new zip file (just the unzipped .ods content re-zipped by the zip utility, with no changes to paths or files) you get:* 
****************************************************************************** 

[email protected]~/temp/eKits/variabledata>file --mime newDoc.zip 
newDoc.zip: application/zip; charset=binary 

*It's no longer a spreadsheet; it now has the content type of a plain zip file.* 
****************************************************************************** 

*Also, just for fun, you may want to output the raw bytes on the command line or console by running 'cat', 'head' or 'print' on the same files. You get trash, but there is something there if you look closer.* 
****************************************************************************** 


[email protected]~/temp/eKits/variabledata>head ../variabledata.ods 
PK???8?l9?.mimetypeapplication/vnd.oasis.opendocument.spreadsheetPK???8Configurations2/statusbar/???8' 

**And for the re-zipped file, we've lost the content-type header and much more.** 
****************************************************************************** 

[email protected]~/temp/eKits/variabledata>head newDoc.ods 
PK 
[email protected]/UT ?? PF? Pux 
              ?PK 

****************************************************************************** 
***What to watch out for*** 
****************************************************************************** 

**It is not necessary for the content type of the zip file to be application/vnd.oasis.opendocument.spreadsheet; it can be application/zip and OpenOffice will still be able to use it. It is good to have the correct content type there, but for OpenOffice documents it is more important to have the mimetype file in the root dir and the manifest.xml file in the META-INF/ directory. 

- The application does not look at the extension to determine the file type. The OS does, in order to launch the application linked to that extension, and if there is none it tries to find the content type from the file header and the appropriate application to handle the file. Note that you can assign .txt or .jpeg to open with OpenOffice and it will open them even though they are not zip files. 

- OO starts by looking at the file header to find the content type, but it can live without it. 

- It then tries to find the manifest.xml file. It placed it there, so it can trust it. From that file it makes a checklist of the items listed in manifest.xml and cross-checks it against the actual files (ZipEntries) extracted from the zip file. 
    Directories listed in manifest.xml that are not in the zip content as entries won't cause a problem, as long as it is not expecting a file in that directory. 

- When it fails to read the file header (because it is of type application/zip) and there is no mimetype file, it will wig out and really want to find out what type this file is. The same happens if a file (i.e. a ZipEntry) is missing, or if it thinks one is missing because the name is a string literal with the full path that it can't resolve to a ZipEntry name key. Even though the real, readable binary data is in the zip file, it will think something is wrong and ask to repair it, reproducing the zip content as best it can. 

- The ODF file is a zip-format file, but it's picky and expects things to be in a certain place.** 

****************************************************************************** 
**So what's the fix to avoid this problem? Simply: you need to recursively walk the directories, adding all of the sub-dirs and files.** 
****************************************************************************** 

**In more detail:** 
1. Start at the root directory. 
2. Call RootDirectory.listFiles(). 
2a. For each entry returned by listFiles(): 
    2a-1) If aFile from the list is a regular file, add it as a ZipEntry as you normally would, 
     * then move on to the next item in your listFiles() result. 
    2a-2) If aFile from the list is a directory, add it to the zip file as a directory entry by appending "/" to the end of its path, 
     * then call listFiles() on that directory to get its files and dirs, and restart from step 1 with the sub-directory as the RootDirectory, adding its dirs, sub-dirs and files. 
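
The steps above could look roughly like this in Java. It is a minimal sketch, not a definitive implementation: the class name `ZipTreeWalker` and its method names are made up for illustration, each file is read fully into memory for brevity, and the sketch does not deal with writing `mimetype` first as a STORED entry, which the question already handles separately.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Arrays;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class ZipTreeWalker {

    /**
     * Recursively adds the children of dir to the zip.
     * prefix is the entry path of dir inside the zip ("" for the root).
     */
    public static void addDirectory(ZipOutputStream zos, File dir, String prefix)
            throws IOException {
        File[] children = dir.listFiles();
        if (children == null) {
            throw new IOException("Cannot list " + dir);
        }
        Arrays.sort(children); // stable, predictable entry order
        for (File child : children) {
            if (child.isDirectory()) {
                // Step 2a-2: directory entry = path with a trailing "/", then recurse.
                String name = prefix + child.getName() + "/";
                zos.putNextEntry(new ZipEntry(name));
                zos.closeEntry();
                addDirectory(zos, child, name);
            } else {
                // Step 2a-1: regular file, always joined with "/" (never File.separator).
                zos.putNextEntry(new ZipEntry(prefix + child.getName()));
                zos.write(Files.readAllBytes(child.toPath()));
                zos.closeEntry();
            }
        }
    }

    /** Zips the contents of root (not root itself) into target. */
    public static void zipTree(File root, File target) throws IOException {
        try (ZipOutputStream zos = new ZipOutputStream(new FileOutputStream(target))) {
            addDirectory(zos, root, "");
        }
    }
}
```

Because each directory entry is written before the recursion descends into it, parents always appear before their children, matching the order seen in the re-zipped listing earlier in this answer.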

****************************************************************************** 
* Another note to keep in mind: 
****************************************************************************** 
** Of course, there is also the matter of using the right encoding for the XML files and other files. When new OutputStreams and Writers are created without an explicit charset, Windows typically defaults to a legacy code page (such as windows-1252, which is close to ISO-8859-1), while Linux/Unix/Mac usually default to UTF-8. So if something works on *nix and not on Windows, a good place to look is your streams.** 
****************************************************************************** 
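
One way to take the platform default out of the picture is to name the charset explicitly whenever text is turned into bytes. A minimal sketch, assuming the XML parts are held as Java strings and should be written as UTF-8; the class and method names are hypothetical:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class Utf8Xml {

    /** Encodes XML text as UTF-8 regardless of the platform default charset. */
    public static byte[] toUtf8Bytes(String xml) {
        return xml.getBytes(StandardCharsets.UTF_8);
    }

    /** Same result through a Writer, for code that streams its output. */
    public static byte[] toUtf8BytesViaWriter(String xml) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Passing the charset here avoids the platform-dependent default.
        try (Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            writer.write(xml);
        }
        return out.toByteArray();
    }
}
```

The resulting byte array can be passed directly to a method such as the question's `addFile`, provided the XML declaration in the document also says `encoding="UTF-8"`.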

Thanks, I hope you had as much fun reading this as I did writing it. I just hope it makes sense. 

ps - sorry about the formatting job. 
+0

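Do I need to create the root directory ("/" or "./")? – 2012-07-21 15:02:42

+0

The name of the zip file itself is normally not used when zipping or unzipping. You can create the zip file and safely rename it to anything; unzipping will simply use the new name. We append the time and a processed-by status so that the file name reflects the file's current state and history. – Pareshkumar 2012-07-21 18:20:47

+0

Good luck, I hope you nail it. It may also be irrelevant, but I did run into problems when writing all the data at once. It is a blocking call, so if you zip a 200 MB image your application can freeze for a bit. After I started reading/writing 1024 bytes at a time, I never had the problem again. – Pareshkumar 2012-07-21 18:26:59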
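
The chunked approach mentioned in the comment above might look like the sketch below. The 1024-byte buffer is taken from the comment; the class and method names are hypothetical, and the output stream would typically be the ZipOutputStream after a putNextEntry() call.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class ChunkedCopy {

    /** Copies in to out 1024 bytes at a time and returns the number of bytes copied. */
    public static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[1024];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read); // write only the bytes actually read
            total += read;
        }
        return total;
    }
}
```

Note that this only works for DEFLATED entries as written: a STORED entry still needs its size and CRC-32 before putNextEntry(), so its content has to be known (or read once beforehand) in any case.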

0

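Raul,

It looks like your zip file is complete and correct, so it is not the zipping process that is causing the problem.

The problem is that when your process creates the XML for the manifest.xml file, the DOCTYPE DTD location is invalid.

The line in manifest.xml:

Also, you have a manifest.rdf file in the root of the zip, which easily throws OO off when it tries to put things back together. I don't know what process you use to create the XML output for the OpenOffice document, but that is where the problem lies. You need to make sure the path to the DTD is correct, or remove the DOCTYPE from the XML file.

Here is how you can manually fix the OpenOffice ODT file:

  • Remove manifest.rdf from the zip root
  • Remove the DOCTYPE line from META-INF/manifest.xml
  • Add the directory manifest entries listed below (note the difference in the Configurations2 entry)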

    <manifest:file-entry manifest:media-type="" manifest:full-path="Configurations2/statusbar/"/>

    <manifest:file-entry manifest:media-type="" manifest:full-path="Configurations2/accelerator/current.xml"/>

    <manifest:file-entry manifest:media-type="" manifest:full-path="Configurations2/accelerator/"/>

    <manifest:file-entry manifest:media-type="" manifest:full-path="Configurations2/floater/"/>

    <manifest:file-entry manifest:media-type="" manifest:full-path="Configurations2/popupmenu/"/>

    <manifest:file-entry manifest:media-type="" manifest:full-path="Configurations2/progressbar/"/>

    <manifest:file-entry manifest:media-type="" manifest:full-path="Configurations2/toolpanel/"/>

    <manifest:file-entry manifest:media-type="" manifest:full-path="Configurations2/menubar/"/>

    <manifest:file-entry manifest:media-type="" manifest:full-path="Configurations2/toolbar/"/>

    <manifest:file-entry manifest:media-type="" manifest:full-path="Configurations2/images/Bitmaps/"/>

    <manifest:file-entry manifest:media-type="" manifest:full-path="Configurations2/images/"/>

    <manifest:file-entry manifest:media-type="application/vnd.sun.xml.ui.configuration" manifest:full-path="Configurations2/"/>

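I think WinRAR may be fixing the DTD path problem for you. OpenOffice does the same when it repairs the file. OO can do without a DTD definition, since it knows what to look for, but it does not like a wrong DTD.

I hope that solves it for you. I would like to know what you are using to write the XML.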

+0

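I have also sent you the ODT file created by my code, the ODT file created by OpenOffice, and the ODT file after the repair... As you can see, there are some differences in the sizes of some of the compressed elements... Do you know why? Could it be an issue with the Java zip implementation? – 2012-07-27 09:28:15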

0

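I found that the problem can occur on Windows when you supply backslashes in the file name passed to the ZipEntry constructor. I changed them to forward slashes, and after that OpenOffice no longer complained that the ODT was corrupted.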
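
A minimal sketch of that fix, assuming the entry names are built from Windows-style relative paths; the class and method names are hypothetical. The ZIP format expects forward slashes in entry names, so the conversion is safe on every platform.

```java
public class EntryNames {

    /** Converts a relative path that may contain backslashes into a zip entry name. */
    public static String toZipEntryName(String relativePath) {
        // Zip entry names use "/" as the separator, regardless of the OS.
        return relativePath.replace('\\', '/');
    }
}
```

It would be used at the point where the entry is created, for example `new ZipEntry(EntryNames.toZipEntryName("META-INF\\manifest.xml"))`.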