I want to create a word cloud in Python from the strings in an Excel document and their corresponding data values. The generate_from_frequencies method requires a frequencies parameter, and the docs say it should be an array of tuples.
Part of the code from the wordcloud source code:
def generate_from_frequencies(self, frequencies):
    """Create a word_cloud from words and frequencies.

    Parameters
    ----------
    frequencies : array of tuples
        A tuple contains the word and its frequency.

    Returns
    -------
    self
    """
    # make sure frequencies are sorted and normalized
    frequencies = sorted(frequencies, key=item1, reverse=True)
    frequencies = frequencies[:self.max_words]
    # largest entry will be 1
    max_frequency = float(frequencies[0][1])
    frequencies = [(word, freq/max_frequency) for word, freq in frequencies]
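For reference, the sort-and-normalize step in that excerpt can be tried on its own, outside the library. This is only a minimal sketch: the helper name normalize_frequencies and its max_words default are hypothetical, and it assumes that item1 in the excerpt is operator.itemgetter(1). It suggests that a plain list of (word, frequency) tuples is enough for this part of the code.

```python
from operator import itemgetter

# Assumption: `item1` in the quoted excerpt behaves like itemgetter(1),
# i.e. it picks the frequency out of each (word, frequency) pair.
item1 = itemgetter(1)


def normalize_frequencies(frequencies, max_words=200):
    """Hypothetical helper: sort, truncate, and scale (word, freq) pairs."""
    # Sort by frequency, highest first
    frequencies = sorted(frequencies, key=item1, reverse=True)
    # Keep at most max_words entries
    frequencies = frequencies[:max_words]
    # Scale so that the largest entry becomes 1.0
    max_frequency = float(frequencies[0][1])
    return [(word, freq / max_frequency) for word, freq in frequencies]


pairs = [("hi", 6), ("seven", 17)]
normalized = normalize_frequencies(pairs)
print(normalized)
```

Here every element of pairs is a two-item tuple, so both the sort key and the word, freq unpacking succeed.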
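I tried a plain list, and then a numpy ndarray, but PyCharm shows a warning that the parameter type should be array.py, which I read is only supposed to hold characters, integers, and floating point numbers (array.py docs):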
This module defines an object type which can compactly represent an array of basic values: characters, integers, floating point numbers.
My test code:
import os
import numpy
import wordcloud
d = os.path.dirname(__file__)
cloud = wordcloud.WordCloud()
array = numpy.array([("hi", 6), ("seven"), 17])
cloud.generate_from_frequencies(array) # <= what should go in the parentheses
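If I ignore the PyCharm warning and run the code above, I get the following error, which I assume is another way of telling me that it can't accept the ndarray type: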
  File "C:/Users/Caitlin/Documents/BioDataSorter/tag_cloud_test.py", line 8, in <module>
    cloud.generate_from_frequencies(array) # <= what should go in the parentheses
  File "C:\Python34\lib\site-packages\wordcloud\wordcloud.py", line 263, in generate_from_frequencies
    frequencies = sorted(frequencies, key=item1, reverse=True)
TypeError: 'int' object is not subscriptable
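One possible reading of this traceback is that the container type is not the real issue. The literal in the test code closes a parenthesis too early, so it holds a tuple, a bare string, and a bare int rather than two tuples. The sketch below reproduces the sorting step without numpy or wordcloud, again assuming item1 is operator.itemgetter(1); the variable names are only illustrative.

```python
from operator import itemgetter

# The list from the test code: ("seven") is just the string "seven",
# and 17 sits on its own as a third element.
malformed = [("hi", 6), ("seven"), 17]
print([type(item).__name__ for item in malformed])

try:
    # itemgetter(1) indexes each element with [1]; the bare int fails
    sorted(malformed, key=itemgetter(1), reverse=True)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Moving the parenthesis so that 17 is inside the second tuple
# gives two (word, frequency) pairs, which sort without error.
well_formed = [("hi", 6), ("seven", 17)]
ranked = sorted(well_formed, key=itemgetter(1), reverse=True)
print(ranked)
```

If this reading is right, passing a list such as well_formed would get past the line that fails in the traceback.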
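Another potential problem might be that wordcloud was written in Python 2, but I am using Python 3.4, which could make some of the code unusable. What type should I pass to the method?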
Thanks for your answer! I changed the line to: 'cloud.generate_from_frequencies((("hi", float(6 / (6 + 17)), ("seven", float(17 / (6 + 17))))))', but I got a TypeError: 'float' object is not subscriptable, so I changed float to int and it worked. – CCCodes