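Word count and line breaks in Python

I'm trying to write a script that pulls the word count of many files in a directory. What I have is fairly close to what I want, but one part is throwing me off. The code so far is: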

import glob

directory = "/Users/.../.../files/*"
output = "/Users/.../.../output.txt"

filepath = glob.glob(directory)

def wordCount(filepath):
    for file in filepath:
        name = file
        fileO = open(file, 'r')
        for line in fileO:
            sentences = 0
            sentences += line.count('.') + line.count('!') + line.count('?')

            tempwords = line.split()
            words = 0
            words += len(tempwords)

            outputO = open(output, "a")
            outputO.write("Name: " + name + "\n" + "Words: " + str(words) + "\n")

wordCount(filepath)

This writes the word counts to a file named "output.txt" and gives me output that looks like this:

Name: /Users/..../..../files/Bush1989.02.9.txt 
Words: 10 
Name: /Users/..../..../files/Bush1989.02.9.txt 
Words: 0 
Name: /Users/..../..../files/Bush1989.02.9.txt 
Words: 3 
Name: /Users/..../..../files/Bush1989.02.9.txt 
Words: 0 
Name: /Users/..../..../files/Bush1989.02.9.txt 
Words: 4821

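This repeats for every file in the directory. As you can see, it gives me multiple counts for each file. The files are formatted like this:

Address on Administration Goals Before a Joint Session of Congress

February 9, 1989

Mr. Speaker, Mr. President, and distinguished Members of the House and Senate...

So it seems the script is giving me a count for each "part" of the file: 10 words for the first line, 0 for the line break, 3 for the next line, 0 for the next line break, and then the count for the body of the text.

What I'm looking for is a single count per file. Any help or direction is appreciated.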

Source

2012-04-01 user1074057

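'x = 0' followed by 'x += something' in the same loop makes no sense. – tokland 2012-04-01 13:58:33

The last two lines of the inner loop, which write the file name and word count, should be part of the outer loop rather than the inner loop, because as written they run once per line.

You are also resetting the sentence and word counts on every line. Those initializations belong in the outer loop, before the inner loop starts.

Here is what your code should look like after the changes: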

import glob

directory = "/Users/.../.../files/*"
output = "/Users/.../.../output.txt"

filepath = glob.glob(directory)

def wordCount(filepath):
    for file in filepath:
        name = file
        fileO = open(file, 'r')
        sentences = 0
        words = 0
        for line in fileO:
            sentences += line.count('.') + line.count('!') + line.count('?')

            tempwords = line.split()
            words += len(tempwords)

        outputO = open(output, "a")
        outputO.write("Name: " + name + "\n" + "Words: " + str(words) + "\n")

wordCount(filepath)

Source

2012-04-01 13:58:34

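Thank you very much for your help! – user1074057 2012-04-01 14:04:56

@user1074057: You were also opening the output file once per input line! The code above opens it once per input file, which is still very inefficient. Open it once at the start of your code. Also: you count "sentences" but never write the result. – 2012-04-01 20:55:12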
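For illustration, here is a minimal sketch of what that comment suggests: the output file is opened once, each input file is closed after reading, and the sentence count is written alongside the word count. The function names `count_words_and_sentences` and `write_counts`, the "Sentences:" output line, and the use of "w" (overwrite) instead of "a" (append) are assumptions of this sketch, not part of the original code.

```python
import glob


def count_words_and_sentences(path):
    """Return (words, sentences) for one text file, totalled over all lines."""
    words = 0
    sentences = 0
    # 'with' closes the input file as soon as the block ends.
    with open(path, "r") as handle:
        for line in handle:
            words += len(line.split())
            sentences += line.count(".") + line.count("!") + line.count("?")
    return words, sentences


def write_counts(pattern, output_path):
    """Open the output file once and write one entry per matching input file."""
    # Assumption: overwrite the output on each run ("w") rather than append ("a").
    with open(output_path, "w") as out:
        for path in sorted(glob.glob(pattern)):
            words, sentences = count_words_and_sentences(path)
            out.write("Name: " + path + "\n")
            out.write("Words: " + str(words) + "\n")
            out.write("Sentences: " + str(sentences) + "\n")
```

With the variables from the question this would be called as write_counts(directory, output). Note that counting '.', '!' and '?' is only a rough proxy for sentences, since abbreviations such as "Mr." are counted too.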

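Isn't your indentation wrong? I mean, the last line runs once for every input line, but you really meant it to run once per file, didn't you?

(Also, try to avoid "file" as an identifier: it is a built-in type in Python 2.)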

Source

2012-04-01 14:01:51 tiwo
