2017-08-16
1

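Loop a task over all input files in Python

I want to count all the As, Bs, and Cs in every .txt file and produce a .csv file that lists how many of each of these letters each file contains.

The code below does what I want, but only for the last file I provide, not for all of them.

What am I doing wrong?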

import glob 
import csv 

#This will print out all files loaded in the same directory and print them out 
for filename in glob.glob('*.txt*'): 
    print(filename) 

#A B and C 
substringA = "A" 
Head1 = (open(filename, 'r').read().count(substringA)) 
substringB = "B" 
Head2 = (open(filename, 'r').read().count(substringB)) 
substringC = "C" 
Head3 = (open(filename, 'r').read().count(substringC)) 
header = ("File", "A Counts" ,"B Counts" ,"C Counts") 
analyzed = (filename, Head1, Head2, Head3) 

#This will write a file named Analyzed.csv 
with open('Analyzed.csv', 'w', newline='') as csvfile: 
    writer = csv.writer(csvfile) 
    writer.writerow(header) 
    writer.writerow(analyzed) 
+0

Is the code that counts 'A', 'B', and 'C' inside the for loop or outside it? –

+2

Just move your counting code 4 spaces to the right so that it is inside the 'for' loop :) – 9dogs

+0

I think that is exactly my problem. I don't know how to make the code loop over all the files. – Scarlett

Answers
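The symptom follows from how Python treats a loop variable: once a 'for' loop ends, the variable keeps the last value it was given, so unindented code after the loop runs once, with that value. A minimal sketch, assuming a made-up list of file names in place of glob.glob('*.txt*'):

```python
# Hypothetical file names standing in for glob.glob('*.txt*')
filenames = ["one.txt", "two.txt", "three.txt"]

processed_outside = []
for filename in filenames:
    pass  # only indented lines belong to the loop body
# Not indented: runs once, after the loop, with the last value of filename
processed_outside.append(filename)

processed_inside = []
for filename in filenames:
    # Indented: runs once per file
    processed_inside.append(filename)

print(processed_outside)  # ['three.txt']
print(processed_inside)   # ['one.txt', 'two.txt', 'three.txt']
```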

1

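Besides fixing the indentation, there is one more small change you need to make: open the output file in append mode rather than write mode. Note that opening in append mode does not overwrite anything that was already there, so I added a section at the top that clears whatever is already in the csv and writes the header.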

import glob 
import csv 


#This will delete anything in Analyzed.csv if it exists and replace it with the header 
with open('Analyzed.csv', 'w', newline='') as csvfile: 
    writer = csv.writer(csvfile) 
    header = ("File", "A Counts" ,"B Counts" ,"C Counts") 
    writer.writerow(header) 

for filename in glob.glob('*.txt*'): 
    print(filename) 

    #A B and C 
    substringA = "A" 
    Head1 = (open(filename, 'r').read().count(substringA)) 
    substringB = "B" 
    Head2 = (open(filename, 'r').read().count(substringB)) 
    substringC = "C" 
    Head3 = (open(filename, 'r').read().count(substringC)) 
    analyzed = (filename, Head1, Head2, Head3) 

    #This will append one row per file to Analyzed.csv 
    with open('Analyzed.csv', 'a', newline='') as csvfile: 
        writer = csv.writer(csvfile) 
        writer.writerow(analyzed) 

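The above is my solution, keeping as much of your code unchanged as possible. Ideally, though, you would open the output file only once, at the start of the script. Here is how you would do that: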

import glob 
import csv 


with open('Analyzed.csv', 'w', newline='') as csvfile: 
    writer = csv.writer(csvfile) 
    header = ("File", "A Counts", "B Counts", "C Counts") 
    writer.writerow(header) 

    for filename in glob.glob('*.txt*'): 
        print(filename) 

        #A B and C 
        substringA = "A" 
        Head1 = (open(filename, 'r').read().count(substringA)) 
        substringB = "B" 
        Head2 = (open(filename, 'r').read().count(substringB)) 
        substringC = "C" 
        Head3 = (open(filename, 'r').read().count(substringC)) 
        analyzed = (filename, Head1, Head2, Head3) 

        #One row per input file, written while the csv is still open 
        writer.writerow(analyzed) 
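The same idea can be arranged so that each input file is read only once and closed right away. This is only a sketch of one possible layout, not part of the original answer: the helper names count_letters and write_counts and their parameters are made up for illustration.

```python
import csv


def count_letters(path, letters=("A", "B", "C")):
    """Return how often each letter occurs in the file at path (case-sensitive)."""
    with open(path, "r") as infile:
        text = infile.read()  # read once; the with block closes the file
    return [text.count(letter) for letter in letters]


def write_counts(paths, out_path="Analyzed.csv"):
    """Write one header row, then one row of counts per input file."""
    with open(out_path, "w", newline="") as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(("File", "A Counts", "B Counts", "C Counts"))
        for path in paths:
            writer.writerow([path] + count_letters(path))
```

With these helpers, the script body would reduce to a single call such as write_counts(glob.glob('*.txt*')), after importing glob.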
+0

Thank you very much. This works well. How do I stop it from writing the header multiple times? writer.writerow(header) should only run the first time. – Scarlett

+0

Oops, I didn't notice that part. I'll add a fix – bendl

+0

See my edit for the fix. – bendl

2

The indentation was missing; also open Analyzed.csv in append mode 'a'.

import glob 
import csv 
import os 

#This will print out every matching file in the current directory as it is processed 
for filename in glob.glob('*.txt*'): 
    print(filename) 

    #A B and C 
    substringA = "A" 
    Head1 = (open(filename, 'r').read().count(substringA)) 
    substringB = "B" 
    Head2 = (open(filename, 'r').read().count(substringB)) 
    substringC = "C" 
    Head3 = (open(filename, 'r').read().count(substringC)) 
    header = ("File", "A Counts", "B Counts", "C Counts") 
    analyzed = (filename, Head1, Head2, Head3) 

    #Write the header only when Analyzed.csv is missing or still empty 
    needs_header = not os.path.exists('Analyzed.csv') or os.path.getsize('Analyzed.csv') == 0 

    #This will append to a file named Analyzed.csv 
    with open('Analyzed.csv', 'a') as csvfile: 
        writer = csv.writer(csvfile) 
        if needs_header: 
            writer.writerow(header) 
        writer.writerow(analyzed) 

Edit: removed the unsupported newline="" argument

+2

This will overwrite Analyzed.txt – bendl

+0

You must either open the output file before the for loop or open it in append mode (but append mode will not erase data from previous runs). – Malexandre

+0

I changed it to open in append mode 'a' – Andras
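The difference between the two modes discussed in these comments can be checked with a small sketch. It assumes a throwaway temporary file (the name demo.csv is made up) so that no real Analyzed.csv is touched:

```python
import os
import tempfile

# Temporary path so the demonstration does not touch a real Analyzed.csv
path = os.path.join(tempfile.mkdtemp(), "demo.csv")

# 'w' truncates the file on every open, so only the last write survives
for row in ["first", "second", "third"]:
    with open(path, "w") as f:
        f.write(row + "\n")
with open(path) as f:
    after_write_mode = f.read().splitlines()

# 'a' keeps what is already there, including the line left over from above
for row in ["first", "second", "third"]:
    with open(path, "a") as f:
        f.write(row + "\n")
with open(path) as f:
    after_append_mode = f.read().splitlines()

print(after_write_mode)   # ['third']
print(after_append_mode)  # ['third', 'first', 'second', 'third']
```

The leftover 'third' line in the second result is the stale data from an earlier run that append mode never clears.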

0

You can try this:

import csv 
import glob 
from collections import Counter 
from itertools import chain 

#Open the output once so that each file adds a row instead of overwriting it 
with open('Analyzed.csv', 'w', newline='') as csvfile: 
    writer = csv.writer(csvfile) 
    writer.writerow(("File", "A Counts", "B Counts", "C Counts")) 
    for filename in glob.glob('*.txt*'): 
        #Flatten the lines of the file into one stream of characters and count them 
        with open(filename) as infile: 
            the_count = Counter(chain.from_iterable(infile)) 
        #writerow expects a sequence of fields; a bare string would be split into characters 
        writer.writerow((filename, the_count["A"], the_count["B"], the_count["C"])) 