2014-07-18 100 views
1

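Copying the contents of a zipfile in Python

I want to recursively search a network directory to find all .xls files inside zip files. For each XLS file found in a zip file, I want to copy it to a local location on C:. Here is my script so far: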

import os 
import zipfile 
import fnmatch 
import shutil 

rootPath = "L:\Data\Cases" 
destPath = "C:\Test" 
allFileList = [] 
zipList = [] 

# Create a list containing all files contained within L:\Data\Cases 
for dirname, dirnames, filenames in os.walk(rootPath):
    for filename in filenames:
        allFileList.append(os.path.join(dirname, filename))

# Return a list of filepaths containing zipfiles.
for file in allFileList:
    if file.endswith(".zip"):
        zipList.append(file)

for file in zipList:
    with zipfile.ZipFile(file) as zip_file:
        for member in zip_file.namelist():
            if member.endswith(".xls"):
                filename = os.path.basename(member)
                if not filename:
                    continue
                source = zip_file.open(member)
                target = os.path.join(destPath, filename)
                shutil.copy2(source, target)

The error is shown below. I think it is caused by copying the file from inside the zip container to the destination path.

Traceback (most recent call last): 
    File "C:/Users/user/Desktop/parsecsv.py", line 30, in <module> 
    shutil.copy2(source, target) 
    File "C:\Program Files\Python278\lib\shutil.py", line 130, in copy2 
    copyfile(src, dst) 
    File "C:\Program Files\Python278\lib\shutil.py", line 68, in copyfile 
    if _samefile(src, dst): 
    File "C:\Program Files\Python278\lib\shutil.py", line 63, in _samefile 
    return (os.path.normcase(os.path.abspath(src)) == 
    File "C:\Program Files\Python278\lib\ntpath.py", line 487, in abspath 
    path = _getfullpathname(path) 

Any suggestions?

+0

`rootPath = "L:\Data\"`? The trailing backslash would escape the last " (double quote) character. – Nilesh
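As a side note on the path literals, here is a small sketch of how backslashes behave in Python string literals. The paths are purely illustrative, and `ntpath` is used only so the result is the same on any platform (it is the module behind `os.path` on Windows).

```python
import ntpath

# "\t" is an escape sequence (a tab), so this literal is not the path it looks like.
broken = "C:\test"
# A raw string keeps every backslash exactly as written.
raw = r"C:\test"
# Doubling the backslashes produces the same string as the raw literal.
escaped = "C:\\test"

print(len(broken), len(raw), raw == escaped)

# A string literal (raw or not) cannot end in a single backslash,
# so let the path module add separators instead of typing a trailing one.
cases_dir = ntpath.join(r"L:\Data", "Cases")
print(cases_dir)
```

Raw strings or doubled backslashes both work; the point is simply to avoid relying on which letters happen to follow a backslash.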

+0

Can you edit the question and show the error? That would help. –

+0

Edited with slight changes and the latest error. – thefragileomen

Answers

2

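As Bruno mentioned, I don't think you can work with the contents of the zip file that way. I think a cleaner approach is to extract the files first and then delete whatever you don't need afterwards; you can use shutil.rmtree to remove the other stuff.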

import os
import zipfile

def main():
    rootPath = "C:\\rootpath"
    destPath = "C:\\Test"
    allFileList = []
    zipList = []
    # Create a list containing all files found under rootPath
    for dirname, dirnames, filenames in os.walk(rootPath):
        for filename in filenames:
            allFileList.append(os.path.join(dirname, filename))

    # Build a list of the file paths that are zip files.
    for file in allFileList:
        if file.endswith(".zip"):
            zipList.append(file)

    # Extract every .xls member (with its folder structure) into destPath.
    for file in zipList:
        with zipfile.ZipFile(file) as zip_file:
            for member in zip_file.namelist():
                if member.endswith(".xls"):
                    zip_file.extract(member, destPath)

    # Remove anything under destPath that is not an .xls file.
    # shutil.rmtree() only removes directory trees, so single files
    # are deleted with os.remove() on their full path.
    for dirname, dirnames, filenames in os.walk(destPath):
        for filename in filenames:
            if not filename.endswith(".xls"):
                os.remove(os.path.join(dirname, filename))

if __name__ == '__main__':
    main()
2

ZipFile.open() does not return a filesystem path; it returns a file-like object, a ZipExtFile. What you want is ZipFile.extract() (and then you don't need shutil.copy() at all):

# NB : untested code, refer to the doc for more infos 
for file in zipList:
    with zipfile.ZipFile(file) as zip_file:
        for member in zip_file.namelist():
            if member.endswith(".xls"):
                zip_file.extract(member, destPath)
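To see why shutil.copy2() chokes on the result of ZipFile.open(), here is a small sketch that builds a throwaway archive in memory and inspects what open() hands back. The member name and its contents are made up for illustration.

```python
import io
import zipfile

# Build a tiny archive in memory; the member name and bytes are made up.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("reports/2014/summary.xls", b"dummy spreadsheet bytes")

with zipfile.ZipFile(buffer) as archive:
    with archive.open("reports/2014/summary.xls") as member_file:
        # open() yields a file-like object, not a path string,
        # so path-based helpers such as shutil.copy2() cannot use it.
        is_path = isinstance(member_file, str)
        type_name = type(member_file).__name__
        data = member_file.read()

print(type_name, is_path, len(data))
```

The object can be read like any binary file, which is what makes extract() (or a manual read and write) the right tool here rather than a path-based copy.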

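Also, FWIW, you don't need to first build a list of all files, then build a list of zip files, and then iterate over that list; you can do the whole thing in a single pass, e.g.: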

for dirname, dirnames, filenames in os.walk(rootPath):
    for filename in filenames:
        if not filename.endswith(".zip"):
            continue
        fullpath = os.path.join(dirname, filename)
        with zipfile.ZipFile(fullpath) as zip_file:
            for member in zip_file.namelist():
                if member.endswith(".xls"):
                    zip_file.extract(member, destPath)
+0

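Many thanks for the help, @bruno. The above works well, but it extracts the xls files together with the folder structure that contains them. Is it possible to simply extract the xls files without the folder structure? Thanks – thefragileomen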

+0

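Not directly, AFAICT. If you don't care about the file's metadata (timestamps etc.), you can read its contents via ZipFile.open() and write them to a newly created file; otherwise, you have to first extract the file to a temporary destination and then shutil.move() it to its intended destination. –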
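The first option in the comment above could be sketched roughly as follows. `extract_flat` and its parameters are hypothetical names chosen for this example, not part of the zipfile module, and the demo archive at the bottom is made up. Note that members sharing a base name would overwrite each other in the destination.

```python
import os
import shutil
import tempfile
import zipfile


def extract_flat(zip_path, dest_dir, suffix=".xls"):
    """Copy members ending in `suffix` into dest_dir without their folders.

    Hypothetical helper: contents are streamed through ZipFile.open(),
    so timestamps and other metadata are not carried over.
    """
    written = []
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.namelist():
            filename = os.path.basename(member)
            if not filename or not member.endswith(suffix):
                continue  # skip directory entries and other file types
            target = os.path.join(dest_dir, filename)
            with archive.open(member) as source, open(target, "wb") as out:
                shutil.copyfileobj(source, out)
            written.append(target)
    return written


# Demo on a throwaway archive (all names below are made up).
work_dir = tempfile.mkdtemp()
demo_zip = os.path.join(work_dir, "demo.zip")
with zipfile.ZipFile(demo_zip, "w") as archive:
    archive.writestr("cases/2014/report.xls", b"dummy data")
    archive.writestr("cases/readme.txt", b"ignored")

dest_dir = os.path.join(work_dir, "out")
os.mkdir(dest_dir)
copied = extract_flat(demo_zip, dest_dir)
print(os.listdir(dest_dir))
```

In the one-pass loop from the answer, a call like `extract_flat(fullpath, destPath)` would take the place of the inner `extract()` loop.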