Python - Question about the path in code I am using from here

http://pythoncentral.io/finding-duplicate-files-with-python/

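I am using it to find duplicate files in a folder.

These are my first steps in Python (I come from VBA for Excel), so my question may be very simple, but I have tried several things without success. After running the code, I get this message: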

-f is not a valid path, please verify 
An exception has occurred, use %tb to see the full traceback.

%tb produces:

SystemExit        Traceback (most recent call last) 
<ipython-input-118-31268a802b4a> in <module>() 
    11    else: 
    12     print('%s is not a valid path, please verify' % i) 
---> 13     sys.exit() 
    14   printResults(dups) 
    15  else: 

SystemExit:

The code I am using is:

# dupFinder.py
import os, sys
import hashlib

def findDup(parentFolder):
    # Dups in format {hash:[names]}
    dups = {}
    for dirName, subdirs, fileList in os.walk(parentFolder):
        print('Scanning %s...' % dirName)
        for filename in fileList:
            # Get the path to the file
            path = os.path.join(dirName, filename)
            # Calculate hash
            file_hash = hashfile(path)
            # Add or append the file path
            if file_hash in dups:
                dups[file_hash].append(path)
            else:
                dups[file_hash] = [path]
    return dups


# Joins two dictionaries
def joinDicts(dict1, dict2):
    for key in dict2.keys():
        if key in dict1:
            dict1[key] = dict1[key] + dict2[key]
        else:
            dict1[key] = dict2[key]


def hashfile(path, blocksize = 65536):
    afile = open(path, 'rb')
    hasher = hashlib.md5()
    buf = afile.read(blocksize)
    while len(buf) > 0:
        hasher.update(buf)
        buf = afile.read(blocksize)
    afile.close()
    return hasher.hexdigest()


def printResults(dict1):
    results = list(filter(lambda x: len(x) > 1, dict1.values()))
    if len(results) > 0:
        print('Duplicates Found:')
        print('The following files are identical. The name could differ, but the content is identical')
        print('___________________')
        for result in results:
            for subresult in result:
                print('\t\t%s' % subresult)
            print('___________________')

    else:
        print('No duplicate files found.')


if __name__ == '__main__':
    path='C:/DupTestFolder/' #this is the path to analyze for duplicated files
    if len(sys.argv) > 1:
        dups = {}
        folders = sys.argv[1:]
        for i in folders:
            # Iterate the folders given
            if os.path.exists(i):
                # Find the duplicated files and append them to the dups
                joinDicts(dups, findDup(i))
            else:
                print('%s is not a valid path, please verify' % i)
                sys.exit()
        printResults(dups)
    else:
        print('Usage: python dupFinder.py folder or python dupFinder.py folder1 folder2 folder3')

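I tried the path both with and without the trailing "/" at the end, but the result is the same.

I am running this in Jupyter with Python 3.

Thanks in advance for your help!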

2017-09-15 Pegaso

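The path variable is not used in your code.

All the code does is iterate over sys.argv[1:], which holds the script's command-line arguments, and treat each argument as a directory path.

In a Windows console, you can try: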

python dupFinder.py C:\DupTestFolder

That should work.
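
The "-f" in the error message is consistent with this: a Jupyter kernel is typically started with arguments of its own (a -f flag followed by a connection file), so sys.argv[1:] holds those rather than a folder. Below is a minimal sketch of one way to let the otherwise unused path act as a fallback. The names pick_folders and default are hypothetical and not part of the original script.

```python
import os
import sys


def pick_folders(argv, default):
    """Return the arguments that are existing directories, else [default].

    Hypothetical helper: anything that is not a directory (for example
    the '-f' flag a Jupyter kernel receives) is ignored.
    """
    folders = [arg for arg in argv[1:] if os.path.isdir(arg)]
    return folders if folders else [default]


if __name__ == '__main__':
    # 'C:/DupTestFolder/' is the folder named in the question.
    print(pick_folders(sys.argv, 'C:/DupTestFolder/'))
```

The sketch checks os.path.isdir rather than os.path.exists, because the kernel's connection file exists on disk but is not a folder to scan.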

2017-09-15 03:34:40

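I am running in Jupyter, and the suggested solution, python dupFinder.py C:\DupTestFolder, does not work here. I tried it and got: invalid syntax. I also tried giving the full path to dupFinder.py, but that did not help either: python C:\AnacondaProjects\dupFinder.py C:\DupTestFolder also returns invalid syntax. I am sure I am doing something wrong, but I cannot figure out what. – Pegaso

Thank you! I can run the code now. I had to do two things: 1. Save dupFinder.py to the same folder my Python installation runs from, in my case C:\Users\Pepe – Pegaso

sys.argv works in a command-line window, where the script is started with arguments. It naturally does not work that way in a Jupyter notebook, unless you find a command that passes arguments from within the notebook.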
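
One way to pass arguments from a notebook cell is to set sys.argv temporarily and execute the script with the standard-library runpy module. This is a minimal sketch, assuming dupFinder.py sits in the notebook's working directory. The name run_with_args is a hypothetical helper.

```python
import runpy
import sys


def run_with_args(script, *args):
    """Run `script` as __main__ with sys.argv set to [script, *args]."""
    saved = sys.argv
    sys.argv = [script, *args]
    try:
        # run_path executes the file and returns its global namespace.
        return runpy.run_path(script, run_name='__main__')
    finally:
        sys.argv = saved  # restore the notebook's own arguments


# Example call (names taken from the question):
# run_with_args('dupFinder.py', 'C:/DupTestFolder')
```

IPython's %run magic (%run dupFinder.py C:\DupTestFolder) should have a similar effect without any helper code.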

2017-09-15 04:40:49 Cece

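Thank you! I can run the code now. I had to do two things:

1. Save dupFinder.py to the same folder my Python installation runs from, in my case C:\Users\Pepe
2. Open a cmd window from Anaconda (this creates a cmd window already located in the folder where Python runs). I presume I could get the same result by opening an ordinary command window and navigating (with the cd command) to that folder.

Finally, run python dupFinder.py C:\DupTestFolder.

Now I need to work out how to save the results to a .txt file for future use. I will search for that before posting a question. Thanks for your help!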
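
For the follow-up about saving results, here is a minimal sketch that writes the duplicate groups to a text file instead of printing them. It assumes the {hash: [paths]} dictionary built by findDup. The name save_results and the output file name are hypothetical.

```python
def save_results(dups, out_path):
    """Write each group of identical files to out_path; return the group count."""
    # Keep only hashes shared by more than one file.
    groups = [paths for paths in dups.values() if len(paths) > 1]
    with open(out_path, 'w', encoding='utf-8') as out:
        if not groups:
            out.write('No duplicate files found.\n')
        for paths in groups:
            for path in paths:
                out.write(path + '\n')
            out.write('___________________\n')
    return len(groups)


# Example call with made-up data:
# save_results({'abc': ['a.txt', 'b.txt'], 'def': ['c.txt']}, 'duplicates.txt')
```

Redirecting the script's output in the console, for example python dupFinder.py C:\DupTestFolder > duplicates.txt, is another option that needs no code change.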

2017-09-16 17:32:13 Pegaso
