2017-05-13

Seaborn: setting the distplot bin range?

So I have this dataset that shows the GDP of countries in billions (so a GDP of 1 trillion = 1000).

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

df = pd.read_csv('2014_World_GDP')
# df.sort() was removed from pandas; sort_values() is the replacement
df.sort_values('GDP (BILLIONS)', ascending=False, inplace=True)
gdp = df['GDP (BILLIONS)']  # renamed from `sorted` to avoid shadowing the builtin

fig, ax = plt.subplots(figsize=(12, 8))
sns.distplot(gdp, bins=8, kde=False, ax=ax)

The code above gives me the following plot: (image)

What I want to do is somehow set the bin ranges myself, so that they look more like [250, 500, 750, 1000, 2000, 5000, 10000, 20000, whatever].

Is there a way to do this in seaborn?


Per the API docs, use the 'hist_kws' parameter: http://seaborn.pydata.org/generated/seaborn.distplot.html#seaborn.distplot
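To make the hint in the comment above concrete: as far as I can tell from the docs, distplot hands its `bins` argument (and anything in `hist_kws`) straight to matplotlib's `hist`, and `hist` accepts a sequence of bin edges. So here is a minimal sketch that calls `Axes.hist` directly with the edges from the question. The data is a made-up lognormal sample, not the real GDP file, and the variable names are mine. The same `edges` list should be usable as `bins=edges` in `sns.distplot`.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical stand-in for the 'GDP (BILLIONS)' column: a heavy-tailed sample.
rng = np.random.default_rng(0)
gdp = rng.lognormal(mean=np.log(1500), sigma=1.0, size=250)

# Bin edges copied from the question. Values below 250 or above 20000
# fall outside every bin and are simply not counted.
edges = [250, 500, 750, 1000, 2000, 5000, 10000, 20000]

fig, ax = plt.subplots(figsize=(8, 3))
# hist returns the per-bin counts, the edges it used, and the bar patches.
counts, bin_edges, patches = ax.hist(gdp, bins=edges)
ax.set_xscale('log')  # the bins are uneven, so a log axis keeps the bars readable
ax.set_xlabel('GDP (BILLIONS)')
ax.set_title('Explicit bin edges')
```

Call `plt.show()` (or `fig.savefig(...)`) afterwards to look at the result. If you need every country counted, extend the list at both ends, for example down to the smallest value in your data.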

Answer


You can use logarithmic bins, which work well for data distributed like yours. Here is an example:

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Synthetic heavy-tailed data standing in for the GDP column
df = pd.DataFrame()
# 1 - random() lies in (0, 1], so the division can never hit zero
df['GDP (BILLIONS)'] = 2000 * 1. / (1 - np.random.random(250))
df.sort_values(by='GDP (BILLIONS)', ascending=False, inplace=True)
gdp = df['GDP (BILLIONS)'].values

fig, ax = plt.subplots(1, 2, figsize=(8, 3))

# Left panel: 8 equal-width bins
sns.distplot(gdp, bins=8, kde=False, ax=ax[0])
ax[0].set_title('Linear Bins')

# Right panel: log-spaced bin edges; 9 edges give 8 bins, matching the left panel
log_min, log_max = np.log10(gdp.min()), np.log10(gdp.max())
new_bins = np.logspace(log_min, log_max, 9)
sns.distplot(gdp, bins=new_bins, kde=False, ax=ax[1])
ax[1].set_xscale('log')
ax[1].set_title('Log Bins')

plt.show()

(image: two histograms side by side, titled 'Linear Bins' and 'Log Bins')
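If you want to check what log-spaced bins do to the counts without plotting anything, here is a small NumPy-only sketch. Note that I swapped in `np.geomspace` for the `np.logspace` call: it produces the same geometric spacing but takes the min and max directly, and it pins the first and last edge to exactly those values. The sample data is again synthetic and the names are my own.

```python
import numpy as np

# Synthetic heavy-tailed sample; 1 - random() lies in (0, 1], so no division by zero.
rng = np.random.default_rng(0)
gdp = 2000.0 / (1.0 - rng.random(250))

# 9 geometrically spaced edges from min to max give 8 bins.
edges = np.geomspace(gdp.min(), gdp.max(), 9)
counts, _ = np.histogram(gdp, bins=edges)

# On a log axis every bin has the same width: each edge is a constant
# multiple of the previous one.
ratios = edges[1:] / edges[:-1]

for lo, hi, n in zip(edges[:-1], edges[1:], counts):
    print(f"{lo:12.1f} to {hi:12.1f}: {n}")
```

Because the outer edges coincide with the data's min and max, every point lands in some bin, and the constant ratio between neighbouring edges is what makes the bars look evenly wide once the x-axis is set to log scale.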