2016-09-07

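Matplotlib plotting is very slow

I have several functions that take an array or a dictionary plus a path as arguments, and each one saves a plot to the given path.

I tried to keep the example as small as possible, but here are two of the functions: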

def valueChartPatterns(dict,path): 
    seen_values = Counter() 

    for data in dict.itervalues():
        seen_values += Counter(data.values())

    seen_values = seen_values.most_common()
    seen_values_pct = map(itemgetter(1), tupleCounts2Percents(seen_values))
    seen_values_pct = ['{:.2%}'.format(item) for item in seen_values_pct]

    plt.figure() 

    numberchart = plt.bar(range(len(seen_values)), map(itemgetter(1), seen_values), width=0.9,align='center') 
    plt.xticks(range(len(seen_values)), map(itemgetter(0), seen_values)) 

    plt.title('Values in Pattern Dataset') 
    plt.xlabel('Values in Data') 
    plt.ylabel('Occurrences') 

    plt.tick_params(axis='both', which='major', labelsize=6) 
    plt.tick_params(axis='both', which='minor', labelsize=6) 
    plt.tight_layout() 

    plt.savefig(path) 
    plt.clf() 

def countryChartPatterns(dict,path): 
    seen_countries = Counter() 

    for data in dict.itervalues():
        seen_countries += Counter(data.keys())

    seen_countries = seen_countries.most_common()

    seen_countries_percentage = map(itemgetter(1), tupleCounts2Percents(seen_countries))
    seen_countries_percentage = ['{:.2%}'.format(item) for item in seen_countries_percentage]

    yvals = map(itemgetter(1), seen_countries) 
    xvals = map(itemgetter(0), seen_countries) 

    plt.figure() 

    countrychart = plt.bar(range(len(seen_countries)), yvals, width=0.9,align='center') 
    plt.xticks(range(len(seen_countries)), xvals) 

    plt.title('Countries in Pattern Dataset') 
    plt.xlabel('Countries in Data') 
    plt.ylabel('Occurrences') 

    plt.tick_params(axis='both', which='major', labelsize=6) 
    plt.tick_params(axis='both', which='minor', labelsize=6) 
    plt.tight_layout() 

    plt.savefig(path) 
    plt.clf() 

A very small example dict is shown below, but the actual dictionary contains 56,000 values:

dict = {
    "a": {"Germany": 20006.0, "United Kingdom": 20016.571428571428},
    "b": {"Chad": 13000.0, "South Africa": 3000000.0},
    "c": {"Chad": 200061.0, "South Africa": 3000000.0},
}

And this is how I call them:

if __name__ == "__main__": 

    plt.close('all') 

    print "Starting pattern charting...\n" 

    countryChartPatterns(dict, 'newPatternCountries.png')

    valueChartPatterns(dict, 'newPatternValues.png')

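Note that I load matplotlib with import matplotlib.pyplot as plt.

When I run this script in PyCharm, I get Starting pattern charting... in the console, but the functions take a very long time to produce the plots.

What am I doing wrong? Should I use a histogram instead of a bar chart, since that should achieve the same goal of showing the number of occurrences of each country/value? Can I somehow change my GUI backend? Any suggestions are welcome.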

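I would time how long the 'for' loop over the dictionary takes; to me that looks like the prime suspect for slowing things down. Beyond that, without a working example we can only guess... – Bart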

Are there any standard things one can do to speed up 'matplotlib'? Trying it with a small dataset works just fine. –

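Are you sure it is 'matplotlib' that is slow? A small dataset also simplifies the pre-processing. Don't start optimizing before you really know which part is slow! I would use the simplest of all timers (or look at the Python profiler): 'import time; t0 = time.time(); your_code; print(time.time() - t0)', and put such a timer around (1) the pre-processing of the data (the 'for' loop and everything else) and (2) the plotting part. I'm curious. – Bart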
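As an alternative to the manual timer suggested in the comment above, here is a minimal sketch using the standard-library profiler. The names `preprocess` and `profile_call` are hypothetical helpers introduced only for illustration, the sample data is made up, and the code is written for Python 3 (so it uses `.values()` rather than `.itervalues()`).

```python
import cProfile
import io
import pstats
from collections import Counter


def preprocess(records):
    # Hypothetical stand-in for the counting loop in the question.
    seen = Counter()
    for data in records.values():
        seen += Counter(data.keys())
    return seen


def profile_call(func, *args):
    """Run func(*args) under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Show the five entries with the largest cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


# Made-up sample data, only to have something to profile.
records = {i: {"Chad": 1.0, "Germany": 2.0} for i in range(10000)}
counts, report = profile_call(preprocess, records)
print(report)
```

Wrapping the plotting part in the same way would show which of the two steps dominates the run time.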

Answers

Here is the test I suggested in my comments above, and its result:

Elapsed pre-processing = 13.79 s 
Elapsed plotting = 0.17 s 
Pre-processing/plotting = 83.3654562565 

Test script:

import matplotlib.pyplot as plt
from collections import Counter 
from operator import itemgetter 
import time 

def countryChartPatterns(dict,path): 
    # pre-processing ------------------- 
    t0 = time.time() 

    seen_countries = Counter() 

    for data in dict.itervalues():
        seen_countries += Counter(data.keys())

    seen_countries = seen_countries.most_common() 

    yvals = map(itemgetter(1), seen_countries) 
    xvals = map(itemgetter(0), seen_countries) 

    dt1 = time.time() - t0 
    print("Elapsed pre-processing = {0:.2f} s".format(dt1)) 

    t0 = time.time() 

    # plotting ------------------- 
    plt.figure() 

    countrychart = plt.bar(range(len(seen_countries)), yvals, width=0.9,align='center') 
    plt.xticks(range(len(seen_countries)), xvals) 

    plt.title('Countries in Pattern Dataset') 
    plt.xlabel('Countries in Data') 
    plt.ylabel('Occurrences') 

    plt.tick_params(axis='both', which='major', labelsize=6) 
    plt.tick_params(axis='both', which='minor', labelsize=6) 
    plt.tight_layout() 

    plt.savefig(path) 
    plt.clf() 

    dt2 = time.time() - t0 
    print("Elapsed plotting = {0:.2f} s".format(dt2)) 
    print("Pre-processing/plotting = {}".format(dt1/dt2)) 

if __name__ == "__main__": 
    import random as rd 
    import numpy as np 

    countries = ["United States of America", "Afghanistan", "Albania", "Algeria", "Andorra", "Angola", "Antigua & Deps", "Argentina", "Armenia", "Australia", "Austria", "Azerbaijan"] 

    def item():
        return {rd.choice(countries): np.random.randint(1e3), rd.choice(countries): np.random.randint(1e3)}

    dict = {}
    for i in range(1000000):
        dict[i] = item()

    print("Starting pattern charting...") 

    countryChartPatterns(dict,'newPatternCountries.png') 
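
Since the timings above point at the pre-processing loop rather than the plotting, one possible improvement is to avoid building a temporary `Counter` for every record and adding it to the running total. The sketch below is not part of the original test; `count_keys_slow` and `count_keys_fast` are hypothetical names, the code assumes Python 3, and the actual speed-up should be measured on your own data.

```python
import time
from collections import Counter
from itertools import chain


def count_keys_slow(records):
    # Same pattern as the original loop: one temporary Counter per record.
    seen = Counter()
    for data in records.values():
        seen += Counter(data.keys())
    return seen


def count_keys_fast(records):
    # Feed every key of every inner dict to a single Counter in one pass.
    # Iterating over a dict yields its keys.
    return Counter(chain.from_iterable(records.values()))


# The small example dict from the question.
records = {
    "a": {"Germany": 20006.0, "United Kingdom": 20016.571428571428},
    "b": {"Chad": 13000.0, "South Africa": 3000000.0},
    "c": {"Chad": 200061.0, "South Africa": 3000000.0},
}

slow = count_keys_slow(records)
fast = count_keys_fast(records)

# Rough timing on a larger, made-up dataset.
big = {i: {"Chad": float(i), "Germany": float(i + 1)} for i in range(20000)}
for func in (count_keys_slow, count_keys_fast):
    t0 = time.perf_counter()
    func(big)
    print("{} took {:.3f} s".format(func.__name__, time.perf_counter() - t0))
```

To count the values instead of the keys, as in valueChartPatterns, the same idea would be `Counter(chain.from_iterable(d.values() for d in records.values()))`.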
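
Understood. Thanks for pointing that out. I had read some things about changing the GUI backend, but apparently that was not my problem. Thanks –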