Frequency analysis in Python: printing letters instead of numbers alongside the frequencies

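import collections
from numpy import arange  # arange is assumed to be NumPy's

s = array1  # user inputs an array with text in it
n = len(s)
f = arange(0, 26, 1)  # the numbers 0..25, one per letter
dict = collections.defaultdict(int)  # note: this name shadows the built-in dict
for c in s:
    dict[c] += 1

for c in f:
    print c, dict[c]/float(n)  # Python 2 print statement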

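In the output, c is a number rather than a letter, and I don't know how to convert it back to a letter.

Also, is there any way to put the frequencies/letters into an array so that they can be plotted in a histogram?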

2011-05-08 PythonAlex

What does the IntArrayToText call return? Is it a string? – 2011-05-08 03:48:52

To convert a number to the letter it represents, just use the built-in chr:

>>> chr(98) 
'b' 
>>> chr(66) 
'B' 
>>>
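
For the 0 to 25 indices used in the question, chr alone is not enough, since chr(0) is a control character rather than 'a'. A minimal sketch, assuming index 0 stands for 'a' and 25 for 'z' (the question does not state this mapping, and index_to_letter is a hypothetical helper name), adds the code point of 'a' first:

```python
# Assumption: index 0 means 'a', 1 means 'b', ..., 25 means 'z'.
def index_to_letter(i):
    """Map an index 0..25 to a lowercase letter (hypothetical helper)."""
    return chr(ord('a') + i)

# Build the full alphabet from the indices 0..25.
letters = [index_to_letter(i) for i in range(26)]
print(letters[0], letters[1], letters[25])
```

If the numbers are already character codes (such as 98 for 'b'), plain chr as shown above is sufficient.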

2011-05-08 03:42:22

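It should be noted that you are not calling map with the right type of arguments (hence the TypeError). It takes a function and one or more iterables, and applies the function to their elements. Your second argument is toChar[i], which would be a string. Iterables generally implement __iter__. To illustrate: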

>>> l, t = [],() 
>>> l.__iter__ 
<<< <method-wrapper '__iter__' of list object at 0x7ebcd6ac> 
>>> t.__iter__ 
<<< <method-wrapper '__iter__' of tuple object at 0x7ef6102c>
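
As a sketch of what map expects (the list of code points here is made up for illustration, not taken from the question), passing a function plus an iterable works, while passing a single integer as the second argument raises a TypeError:

```python
codes = [104, 105]  # made-up sample data, not from the question

# A function plus an iterable: chr is applied to each element.
as_letters = list(map(chr, codes))
print(as_letters)

# A non-iterable second argument (an int) is rejected with a TypeError.
try:
    map(chr, 104)
    raised = False
except TypeError:
    raised = True
print(raised)
```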

DTing's answer reminded me of collections.Counter:

>>> from collections import Counter 
>>> a = 'asdfbasdfezadfweradf' 
>>> dict((k, float(v)/len(a)) for k,v in Counter(a).most_common()) 
<<< 
{'a': 0.2, 
'b': 0.05, 
'd': 0.2, 
'e': 0.1, 
'f': 0.2, 
'r': 0.05, 
's': 0.1, 
'w': 0.05, 
'z': 0.05}

2011-05-08 03:53:08 zeekay

+1 I've never used that before, thanks! =) – DTing 2011-05-08 05:21:50

>>> a = "asdfbasdfezadfweradf" 
>>> import collections 
>>> counts = collections.defaultdict(int) 
>>> for letter in a: 
...  counts[letter]+=1 
... 
>>> print counts 
defaultdict(<type 'int'>, {'a': 4, 'b': 1, 'e': 2, 'd': 4, 'f': 4, 's': 2, 'r': 1, 'w': 1, 'z': 1}) 
>>> hist = dict((k, float(v)/len(a)) for k,v in counts.iteritems()) 
>>> print hist 
{'a': 0.2, 'b': 0.05, 'e': 0.1, 'd': 0.2, 'f': 0.2, 's': 0.1, 'r': 0.05, 'w': 0.05, 'z': 0.05}
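
The transcript above uses Python 2 syntax (the print statement and iteritems). A rough Python 3 equivalent, offered as a sketch rather than as part of the original answer, could look like this:

```python
import collections

a = "asdfbasdfezadfweradf"

# Count each character; missing keys start at 0.
counts = collections.defaultdict(int)
for letter in a:
    counts[letter] += 1

# Python 3: items() replaces iteritems(), and / is already true division,
# so the float() conversion is no longer needed.
hist = {k: v / len(a) for k, v in counts.items()}
print(hist)
```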

2011-05-08 04:33:39 DTing

Nice! Reminds me of 'collections.Counter'. – zeekay 2011-05-08 05:03:05

To put the frequencies/letters into an array:

hisArray = [dict[c]/float(n) for c in f]
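
A self-contained sketch of the same idea that also keeps the letters alongside the frequencies, so the two lists can serve as bar labels and bar heights. It assumes the input is plain text and that only the letters a to z (ignoring case) are of interest; the sample string and the variable names are illustrative, not from the question:

```python
import string
from collections import Counter

text = "I want to count frequencies."  # sample input, an assumption

# Count only ASCII letters, ignoring case.
letters_only = [ch for ch in text.lower() if ch in string.ascii_lowercase]
counts = Counter(letters_only)
total = len(letters_only)

# Parallel lists: one entry per letter a..z, 0.0 for letters that never occur.
letters = list(string.ascii_lowercase)
freqs = [counts[ch] / total for ch in letters]

# Print only the letters that actually appear in the text.
for ch, fr in zip(letters, freqs):
    if fr:
        print(ch, round(fr, 3))
```

If matplotlib is available, these two lists could probably be handed to a bar-chart call such as pyplot.bar(letters, freqs) in recent versions; that step is left out here to keep the sketch dependency-free.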

2011-05-08 04:35:45

If you are using Python 2.7 or later, you can use collections.Counter.

Python 2.7+

>>> import collections 
>>> s = "I want to count frequencies." 
>>> counter = collections.Counter(s) 
>>> counter 
Counter({' ': 4, 'e': 3, 'n': 3, 't': 3, 'c': 2, 'o': 2, 'u': 2, 'a': 1, 'f': 1, 'I': 1,  'q': 1, 'i': 1, 's': 1, 'r': 1, 'w': 1, '.': 1}) 
>>> n = sum(counter.values()) * 1.0 # Convert to float so division returns float. 
>>> n 
28.0
>>> [(char, count/n) for char, count in counter.most_common()] 
[(' ', 0.14285714285714285), ('e', 0.10714285714285714), ('n', 0.10714285714285714), ('t', 0.10714285714285714), ('c', 0.07142857142857142), ('o', 0.07142857142857142), ('u', 0.07142857142857142), ('a', 0.03571428571428571), ('f', 0.03571428571428571), ('I', 0.03571428571428571), ('q', 0.03571428571428571), ('i', 0.03571428571428571), ('s', 0.03571428571428571), ('r', 0.03571428571428571), ('w', 0.03571428571428571), ('.', 0.03571428571428571)]

Python 3+

>>> import collections 
>>> s = "I want to count frequencies." 
>>> counter = collections.Counter(s) 
>>> counter 
Counter({' ': 4, 'e': 3, 'n': 3, 't': 3, 'c': 2, 'o': 2, 'u': 2, 'a': 1, 'f': 1, 'I': 1,  'q': 1, 'i': 1, 's': 1, 'r': 1, 'w': 1, '.': 1}) 
>>> n = sum(counter.values()) 
>>> n 
28 
>>> [(char, count/n) for char, count in counter.most_common()] 
[(' ', 0.14285714285714285), ('e', 0.10714285714285714), ('n', 0.10714285714285714), ('t', 0.10714285714285714), ('c', 0.07142857142857142), ('o', 0.07142857142857142), ('u', 0.07142857142857142), ('a', 0.03571428571428571), ('f', 0.03571428571428571), ('I', 0.03571428571428571), ('q', 0.03571428571428571), ('i', 0.03571428571428571), ('s', 0.03571428571428571), ('r', 0.03571428571428571), ('w', 0.03571428571428571), ('.', 0.03571428571428571)]

This also returns the (char, frequency) tuples in descending order of frequency.

2011-05-08 05:08:00
