2015-04-16 148 views
1

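Iterating over strings to find character frequency

I need to iterate over strings I have already created: the program writes 10 random strings to a text file, then reads them back in and should find the frequency of each letter in each string. I have the file being read back in, but I don't know how to count the characters. Any help?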

import string 
import random 
from collections import Counter 

print "******************************" 

print "********* EXERCISE 5 *********" 

print "******************************" 

print "\n**** BEGIN RANDOM STRING *****\n" 


def random_string_generator():
    size = random.randint(20, 80)
    return "".join(random.choice(string.ascii_lowercase + string.ascii_uppercase)
                   for _ in range(size))

def main():
    with open("exercise_five.dat", "w+") as f:
        for x in range(0, 10):
            data = random_string_generator()
            f.write(data + "\n")
    with open("exercise_five.dat", 'r') as f:
        count = 0
        c = Counter()
        for i in f:
            print i
        print "Count: %i" % count

if __name__ == '__main__': main()
print "*******************************" 

The final output should look like this:

***** BEGIN RANDOM STRING ***** 
xGYMSlMHGQAMNrSzXWqphkGntMpyjMoHyRDzaNOcmVtoeAZzcV 
A ==> 2 D ==> 1 G ==> 3 H ==> 2 M ==> 5 O ==> 1 N ==> 2 Q ==> 1 S ==> 2 
R ==> 1 W ==> 1 V ==> 2 Y ==> 1 X ==> 1 Z ==> 1 a ==> 1 c ==> 2 e ==> 1 
h ==> 1 k ==> 1 j ==> 1 m ==> 1 l ==> 1 o ==> 2 n ==> 1 q ==> 1 p ==> 2 
r ==> 1 t ==> 2 y ==> 2 x ==> 1 z ==> 3 
******************************* 

My code's output currently looks like this:

**** BEGIN RANDOM STRING ***** 

QheDRPpVwDnfYWYMJQwEedJsjApRVafvMYUYuepYSerkoMgCTnHLSHwCitBr 

zOFvifcwkrwXLxTrodqkxNxWVHdHDJZbYlcYjAUKz 

DRgFXVkbtwpRfXPjzJmXYW 

mpkVgUyvHEHAKUWpMZBYIKenicfdcBhxlqCZHFgxoFEmJjtrPykCzvQnFkTHfVthII 

zEXLmudQVlpVQYexAvGFTBeUuZvqTO 

KSRcpBlfNwcMoNViHFhS 

QhTiBLuGCsClezAiVFYODiJXAQCQjwnBnHjWqlsZlljA 

iYHznFLFeKwLtynubHTRtGGwjACdGlCpZSQcqnTSWVmufpHQRkwWYiajarnqNuzUzSC 

NWlGeJFFcYwacXuUHWqmzSJmsrnWRvpmdSesXXmECuvAMkxGYpHv 

WVAAiDgGaGnovCbbdazNGmWXARgdSfqCSztsNTPBdLumIXiDh 

******************************* 

What is your code's output?

Answers

2

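Picking up from where you define the Counter: you can initialize a Counter with each line read from the file. That gives you a Counter instance with keys and values, much like a dictionary: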

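from collections import Counter

with open("exercise_five.dat", 'r') as f:
    for line in f:
        line = line.strip()  # drop the trailing newline so it is not counted
        c = Counter(line)
        print(' '.join('{} ==> {}'.format(key, val) for key, val in c.items()))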

A more in-depth explanation of the last line:

>>> c = Counter("text") # initialize a Counter object with the string "text" 
>>> c.keys() # this instance has `keys` and `values`, similar to a dictionary 
dict_keys(['e', 't', 'x']) 
>>> c.items() # you can access both keys and values at the same time with `items` 
dict_items([('e', 1), ('t', 2), ('x', 1)]) 
>>> c 
Counter({'t': 2, 'e': 1, 'x': 1}) 
>>> for key, val in c.items(): 
...  print(key, val) 
... 
e 1 
t 2 
x 1 

At this point you only need some string formatting to get the output format you want, which is what the print(' '.join(...)) construct does.
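To get closer to the layout in the question (the counts wrapped into rows), the idea can be sketched roughly as follows. This assumes Python 3. The helper name `format_counts` and its `per_row` parameter are made up for illustration, and the row width of nine is only read off the sample output. Keys are sorted here so the result is deterministic, so the ordering may differ from the sample.

```python
from collections import Counter


def format_counts(text, per_row=9):
    """Return 'char ==> count' pairs for text, with per_row pairs per line.

    Hypothetical helper, not part of the original code.
    """
    # strip() keeps the trailing newline of a file line out of the counts
    counts = Counter(text.strip())
    pairs = ['{} ==> {}'.format(ch, n) for ch, n in sorted(counts.items())]
    # Slice the flat list of pairs into rows of at most per_row entries
    rows = [' '.join(pairs[i:i + per_row]) for i in range(0, len(pairs), per_row)]
    return '\n'.join(rows)


if __name__ == '__main__':
    sample = "xGYMSlMHGQ\n"  # stand-in for one line read from the file
    print(sample.strip())
    print(format_counts(sample))
```

Inside the file loop you would call `format_counts(line)` for each line instead of building the joined string inline.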

1

defaultdict is a great tool for this:

import collections 

occurrences = collections.defaultdict(int) 

word = 'ASDqasdqASD' 

for c in word: 
    occurrences[c] += 1 
print occurrences 
> defaultdict(<type 'int'>, {'A': 2, 'a': 1, 'D': 2, 's': 1, 'q': 2, 'S': 2, 'd': 1})
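
For comparison, here is a small sketch in Python 3 syntax, using only the standard library, which suggests that the manual defaultdict loop and Counter arrive at the same tallies. The variable names are illustrative.

```python
from collections import Counter, defaultdict

word = 'ASDqasdqASD'

# Manual counting: a missing key starts at int() == 0
occurrences = defaultdict(int)
for ch in word:
    occurrences[ch] += 1

# Counter does the same tallying in a single call
counts = Counter(word)

# Both map each character to the number of times it appears
print(dict(occurrences) == dict(counts))
```

Counter is usually the shorter choice for plain counting, while defaultdict generalizes to other default values such as lists.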