Creating an Excel file from Python

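My project processes several different Excel files. To do that, I want to create one file that contains some of the data from the earlier files, so that I have my own database. The goal is to produce charts from this data, all of it automatically.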

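I wrote this program in Python, but it takes 20 minutes to run. How can I optimize it? Also, some of the files contain the same variables, and I do not want those variables repeated in the final file. How can I do that?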

Here is my program:

import os 
import xlrd 
import xlsxwriter 
from xlrd import open_workbook 

wc = xlrd.open_workbook("U:\\INSEE\\table-appartenance-geo-communes-16.xls") 
sheet0=wc.sheet_by_index(0) 

# creation

with xlsxwriter.Workbook('U:\\INSEE\\Department61.xlsx') as bdd: 
    dept61 = bdd.add_worksheet('deprt61') 

folder_path = "U:\\INSEE\\2013_telechargement2016" 

col=8 
constante3=0 
lastCol=0 
listeV = list() 

for path, dirs, files in os.walk(folder_path):
    for filename in files:
        filename = os.path.join(path, filename)
        wb = xlrd.open_workbook(filename, '.xls')
        sheet1 = wb.sheet_by_index(0)
        lastRow=sheet1.nrows
        lastCol=sheet1.ncols
        colDep=None
        firstRow=None
        for ligne in range(0,lastRow):
            for col2 in range(0,lastCol):
                if sheet1.cell_value(ligne, col2) == 'DEP':
                    colDep=col2
                    firstRow=ligne
                    break
            if colDep is not None:
                break
        col=col-colDep-2-constante3
        constante3=0
        for nCol in range(colDep+2,lastCol):
            constante=1
            for ligne in range(firstRow,lastRow):
                if sheet1.cell(ligne, colDep).value=='61':
                    Q=(sheet1.cell(firstRow, nCol).value in listeV)
                    if Q==False:
                        V=sheet1.cell(firstRow, nCol).value
                        listeV.append(V)
                        dept61.write(0,col+nCol,sheet1.cell(firstRow, nCol).value)
                        for ligne in range(ligne,lastRow):
                            if sheet1.cell(ligne, colDep).value=='61':
                                dept61.write(constante,col+nCol,sheet1.cell(ligne, nCol).value)
                            constante=constante+1

                    elif Q==True:
                        constante3=constante3+1 # I have a problem here. I would like to count the number of variables that already exists but I find huge numbers.
            break
        col=col+lastCol

bdd.close()

Thanks in advance for your help. :)

2017-05-02 Jen

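IMHO, the whole block of code after `for filename in files:` needs to be indented one level, everything except `bdd.close()`, for the loop to make sense. I have made that edit. If it is wrong, edit it again. – aneroid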

This may be too broad for SO, so here are some pointers on where you can optimize. Perhaps add a screenshot of a sample sheet.

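wrt `if sheet1.cell_value(ligne, col2) == 'DEP':` Can DEP appear more than once in a sheet? If it definitely occurs only once, then break out of both loops as soon as you have the values for colDep and firstRow. To do that, add a break to end the inner loop, then check a flag value and break out of the outer loop before it iterates again. Like this: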

colDep = None  # initialise to None
firstRow = None  # initialise to None
for ligne in range(0, lastRow):
    for col2 in range(0, lastCol):
        if sheet1.cell_value(ligne, col2) == 'DEP':
            colDep = col2
            firstRow = ligne
            break  # break out of the `col2 in range(0,lastCol)` loop
    if colDep is not None:  # or just `if colDep:` if colDep will never be 0.
        break  # break out of the `ligne in range(0,lastRow)` loop
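
As a variation on the flag check above, the search can also be moved into a small function that returns as soon as it finds the header, which leaves both loops at once. This is only a sketch: `find_header` is a made-up name, the sample rows are invented, and it assumes the sheet has already been read into a list of rows (for example with `sheet1.row_values(r)` for each row), which is not how the program in the question is organised.

```python
def find_header(rows, label='DEP'):
    """Return (row_index, col_index) of the first cell equal to `label`, or None."""
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            if value == label:
                return r, c  # returning leaves both loops at once
    return None


# Hypothetical sample data standing in for one sheet:
# two filler rows, then the header row that contains 'DEP'.
sample_rows = [
    ['', '', '', ''],
    ['Title', '', '', ''],
    ['CODGEO', 'DEP', 'LIBGEO', 'POP'],
    ['61001', '61', 'A', 10],
]

found = find_header(sample_rows)
if found is not None:
    firstRow, colDep = found
```

Checking `found is not None` also covers a sheet with no DEP cell, where the original loops would leave colDep as None and the later arithmetic on it would fail.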

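I think the range in the block where you write to bdd, `for ligne in range(0,lastRow):`, should start at firstRow, since you know that rows 0 to firstRow-1 of sheet1 are empty: you have just read through them looking for the header.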
```
for ligne in range(firstRow, lastRow): 
```
This avoids wasting time reading the empty rows above the header.
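
To tie this to the two things asked in the question (speed and repeated variables), here is a rough sketch that scans the rows once, starting at the header row, and uses a set to skip variable names already copied from an earlier file. It is not the code from this answer: `collect_columns`, `seen` and the list-of-rows input are assumptions made for illustration, and the sample values are invented.

```python
def collect_columns(rows, first_row, col_dep, seen, dep='61'):
    """Return {header: [values]} for columns whose header is not yet in `seen`.

    rows      -- list of row lists for one sheet (assumed already read)
    first_row -- index of the header row (the row containing 'DEP')
    col_dep   -- index of the 'DEP' column
    seen      -- set of headers copied from earlier sheets; updated in place
    """
    header = rows[first_row]
    # Pick the rows of the wanted department once, not once per column.
    wanted = [row for row in rows[first_row + 1:] if row[col_dep] == dep]

    columns = {}
    # Data columns start two places after DEP, as in the question's code.
    for c in range(col_dep + 2, len(header)):
        name = header[c]
        if name in seen:  # already taken from another file: skip it
            continue
        seen.add(name)
        columns[name] = [row[c] for row in wanted]
    return columns


# Hypothetical data for two sheets that share the variable 'POP'.
sheet_a = [['', '', '', ''],
           ['CODGEO', 'DEP', 'LIBGEO', 'POP'],
           ['61001', '61', 'A', 10],
           ['14001', '14', 'B', 20],
           ['61002', '61', 'C', 30]]
sheet_b = [['CODGEO', 'DEP', 'LIBGEO', 'POP', 'LOG'],
           ['61001', '61', 'A', 10, 4],
           ['61002', '61', 'C', 30, 7]]

seen = set()
first = collect_columns(sheet_a, 1, 1, seen)   # takes 'POP'
second = collect_columns(sheet_b, 0, 1, seen)  # skips 'POP', takes 'LOG'
```

Each returned column could then be written to the next free column of the output sheet, so a running count of skipped variables should no longer be needed to compute the offset.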

Other notes for cleaner code:

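- Use the `with xlsxwriter.Workbook('U:\\INSEE\\Department61.xlsx') as bdd:` syntax for clarity.
  - And always use double backslashes, even when the character after the `\` does not form a control character inside the string: 'U:\\INSEE\\Department61.xlsx'
- You have used both sheet1.cell_value() and sheet1.cell().value for your read operations. Pick one, unless you need the extended cell information in the value=='61' case.
- Read PEP-8 to learn how to write more readable code.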
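
A short sketch of the first two notes. The workbook is closed when the `with` block ends, so every `write` call has to happen inside it, and a separate `bdd.close()` is then unnecessary. The raw-string spelling of the path is an alternative that is not mentioned above, and `write_headers` is a hypothetical helper name.

```python
# Both spellings produce the same string. The raw string (r'...') is an
# alternative to doubling every backslash; it is not part of the notes above.
path_escaped = 'U:\\INSEE\\Department61.xlsx'
path_raw = r'U:\INSEE\Department61.xlsx'
same_path = (path_escaped == path_raw)


def write_headers(out_path, headers):
    """Write `headers` into row 0 of a new workbook (hypothetical helper)."""
    # Third-party package; imported here so the path comparison above
    # can be run even where xlsxwriter is not installed.
    import xlsxwriter

    with xlsxwriter.Workbook(out_path) as bdd:
        dept61 = bdd.add_worksheet('deprt61')
        for col, name in enumerate(headers):
            dept61.write(0, col, name)  # must happen inside the with block
    # The workbook has been closed and saved here; no bdd.close() needed.
```

For example, `write_headers(path_raw, ['POP', 'LOG'])` would create the file with two header cells, assuming the U: drive from the question exists.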

2017-05-02 22:21:51 aneroid

Thanks for your help. I will read it. – Jen

I changed some of my code. However, I have some problems. – Jen
