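Creating a simple dot plot in Python

I am supposed to count n-grams in a corpus and create a dot plot showing the rank of words against their counts, as an exercise in verifying Zipf's law. The final result should look something like this: [image: example dot plot]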

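import nltk

# The with block closes the file automatically, so no explicit close() is needed.
with open(r'./1.txt', 'r') as file:
    text = file.read()

tokens = nltk.word_tokenize(text)
# Lowercase every token and drop single-character tokens.
tokens = [token.lower() for token in tokens if len(token) > 1]
fdist = nltk.FreqDist(tokens)
# List of (word, count) tuples, most common first.
ranks = fdist.most_common()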

This gives me a long list of 2-tuples: every word with its count, ordered from most to least common. This is how I extracted the distribution with NLTK (here only for unigrams).

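I would like to know where to go from here. I just need to plot this on a two-axis plane. I do not have matplotlib/numpy installed and have no experience with either library. I do have Microsoft Excel, though, so I wonder whether I can export this data in an Excel-readable format and plot it there.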

2015-01-14 Morteza R

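You can export to CSV, which Excel can read, using the `csv` module from the standard library. Generating this plot with `matplotlib` is also easy once you have it installed. – Marius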
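A minimal sketch of the CSV export suggested above, assuming `ranks` is the list of (word, count) tuples from the question. The sample data, the file name `ranks.csv`, and the column headers are placeholders chosen for illustration.

```python
import csv

# Hypothetical sample standing in for fdist.most_common() from the question.
ranks = [("the", 12), ("of", 7), ("and", 5)]

# newline="" stops the csv module from writing blank rows on Windows.
with open("ranks.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["rank", "word", "count"])  # header row
    # Ranks start at 1 for the most common word.
    for rank, (word, count) in enumerate(ranks, start=1):
        writer.writerow([rank, word, count])
```

Opening `ranks.csv` in Excel should give three columns, and the rank and count columns can then be selected for a scatter chart.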

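Thanks, but I do not know how to use matplotlib. If you elaborate so that I can see how to use the library for my purpose, I will accept it as the correct answer. –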

@schmutter Have you ever used MATLAB to make a plot? The `matplotlib` API is very similar, so that experience carries over. [Here](http://matplotlib.org/users/pyplot_tutorial.html) is a very simple tutorial. – jme

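If you intend to do your plotting in Python, install matplotlib. Split your data into two vectors, x and y, so that corresponding entries hold the x and y values of each point.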
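One possible way to do the split, assuming `ranks` is the list of (word, count) tuples from the question (the sample values here are made up):

```python
# Hypothetical sample standing in for fdist.most_common() from the question.
ranks = [("the", 12), ("of", 7), ("and", 5)]

# x holds the rank of each word (1 = most common), y holds its count.
x = list(range(1, len(ranks) + 1))
y = [count for _word, count in ranks]

print(x)
print(y)
```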

Then simply

import pylab 
pylab.plot(x, y, '.') 
pylab.savefig('myfilename.pdf')

The '.' tells it to plot points.

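You can save in many formats besides .pdf. To save in another format, just change the .pdf extension to the one you want; as long as the format is supported, that is all it takes.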

2015-01-14 03:38:20 Joel

The following lines will plot your data the way you requested using matplotlib:

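import matplotlib.pyplot as plt
# x: 0-based rank of each word; y: its count; 'ro' draws red circles.
plt.plot(range(len(ranks)), [r[1] for r in ranks], 'ro')
# Adjust these axis limits to fit your data.
plt.ylim([0, 12])
plt.xlim([0, 10])
plt.show()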

Installing matplotlib is straightforward. See the instructions for your operating system here: http://matplotlib.org/users/installing.html
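Since the exercise is about Zipf's law, it may help to look at the data on log-log axes: if count is roughly proportional to 1/rank, then log(count) against log(rank) falls near a straight line with slope close to -1. The sketch below estimates that slope with the standard library only. The sample counts are invented to follow 120/rank exactly, so they are not real corpus data.

```python
import math

# Hypothetical idealized sample: count = 120 / rank, not real corpus data.
ranks = [("the", 120), ("of", 60), ("and", 40), ("to", 30)]

# Ranks start at 1 so that log10(rank) is defined.
log_rank = [math.log10(i) for i in range(1, len(ranks) + 1)]
log_count = [math.log10(count) for _word, count in ranks]

# Ordinary least-squares slope of log(count) against log(rank).
n = len(ranks)
mean_x = sum(log_rank) / n
mean_y = sum(log_count) / n
numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_rank, log_count))
denominator = sum((x - mean_x) ** 2 for x in log_rank)
slope = numerator / denominator

print(round(slope, 2))  # -1.0 for this idealized sample
```

With real text the slope will only approximate -1, and the fit is usually worst for the rarest words.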

2015-01-14 03:45:07 bsa

You can create an Excel scatter plot using XlsxWriter.

2015-01-14 10:26:59 jmcnamara
