Plotting mean lines for different 'hue' data on a Seaborn FacetGrid plot

I am working with the Titanic passenger dataset (from Kaggle) as part of a Udacity course. I am using a Seaborn FacetGrid to get an overview of the passenger age distribution by travel class and sex, with hue set to 'Survived' (1/0).

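The plot works well, and I want to add a vertical mean line to each subplot, but with a different color (and a different annotation) for each of the two 'hues' (1/0) in each subplot. The 'vertical_mean_line_survived' function in the code below works well on plots without multiple 'hue' data, but I cannot find a way to draw a different line for each hue.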

Any ideas whether this is possible in Seaborn?

Current Seaborn FacetGrid plot output:

Seaborn FacetGrid plot

Code:

import matplotlib.pyplot as plt
import seaborn as sns

sns.set()
sns.set_context('talk')
sns.set_style('darkgrid')
# Note: newer seaborn releases rename FacetGrid's `size` to `height` and kdeplot's `shade` to `fill`
grid = sns.FacetGrid(titanic_data.loc[titanic_data['is_child_def'] == False], col='Sex', row='Pclass', hue='Survived', size=3.2, aspect=2)
grid.map(sns.kdeplot, 'Age', shade=True)
grid.set(xlim=(14, titanic_data['Age'].max()), ylim=(0, 0.06))
grid.add_legend()


# Add vertical lines for mean age on each plot 
def vertical_mean_line_survived(x, **kwargs): 
    plt.axvline(x.mean(), linestyle = '--', color = 'g') 
    #plt.text(x.mean()+1, 0.052, 'mean = '+str('%.2f'%x.mean()), size=12) 
    #plt.text(x.mean()+1, 0.0455, 'std = '+str('%.2f'%x.std()), size=12) 

grid.map(vertical_mean_line_survived, 'Age') 

# Add text to each plot for relevant population size
# NOTE - don't need to filter on ['Age'].isnull() for children, as 'is_child'=True only possible for children with 'Age' data 
for row in range(grid.axes.shape[0]): 
    grid.axes[row, 0].text(60.2, 0.052, 'Survived n = '+str(titanic_data.loc[titanic_data['Pclass']==row+1].loc[titanic_data['is_child_def']==False].loc[titanic_data['Age'].isnull()==False].loc[titanic_data['Survived']==1]['is_male'].sum()), size = 12) 
    grid.axes[row, 1].text(60.2, 0.052, 'Survived n = '+str(titanic_data.loc[titanic_data['Pclass']==row+1].loc[titanic_data['is_child_def']==False].loc[titanic_data['Age'].isnull()==False].loc[titanic_data['Survived']==1]['is_female'].sum()), size = 12) 
    grid.axes[row, 0].text(60.2, 0.047, 'Perished n = '+str(titanic_data.loc[titanic_data['Pclass']==row+1].loc[titanic_data['is_child_def']==False].loc[titanic_data['Age'].isnull()==False].loc[titanic_data['Survived']==0]['is_male'].sum()), size = 12) 
    grid.axes[row, 1].text(60.2, 0.047, 'Perished n = '+str(titanic_data.loc[titanic_data['Pclass']==row+1].loc[titanic_data['is_child_def']==False].loc[titanic_data['Age'].isnull()==False].loc[titanic_data['Survived']==0]['is_female'].sum()), size = 12) 



grid.set_ylabels('Frequency density', size=12) 

# Squash down a little and add title to facetgrid  
plt.subplots_adjust(top=0.9) 
grid.fig.suptitle('Age distribution of adults by Pclass and Sex for Survived vs. Perished')


2017-07-06 chrisrb10

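It took me a while to reproduce the problem. Next time you ask a question, could you please provide a [mcve] that can be copied and pasted directly? You don't really need such a complex dataframe to ask a question about hue in a FacetGrid mapping, do you? – ImportanceOfBeingErnest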

The kwargs contain the label and the color of the respective hue. Hence, using

def vertical_mean_line_survived(x, **kwargs): 
    ls = {"0":"-","1":"--"} 
    plt.axvline(x.mean(), linestyle =ls[kwargs.get("label","0")], 
       color = kwargs.get("color", "g")) 
    txkw = dict(size=12, color = kwargs.get("color", "g"), rotation=90) 
    tx = "mean: {:.2f}, std: {:.2f}".format(x.mean(),x.std()) 
    plt.text(x.mean()+1, 0.052, tx, **txkw)

we would get a separate mean line and annotation for each hue, drawn in that hue's color.


2017-07-07 01:35:08 ImportanceOfBeingErnest

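Thanks very much, this works well. And apologies for the overly long code in the question; I'm a relative stackoverflow newbie. On the real data, the mean lines for Survived 0 and 1 are very close together, so with the x.mean()+1 alignment the annotations can overwrite each other. This raises 2 follow-ups: 1) How can I vary the text position by the hue parameter? 2) Is there a function that returns the maximum y value of the kde curve (so I can set the y coordinates relative to that)? Many thanks. – chrisrb10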

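1. You get the hue value as 'kwargs.get("label")', so you can do 'if kwargs.get("label") == "0": ... else: ...' and set different positions for the two cases. 2. The problem is that you would need to get the y values of the kde curve inside the marking function. I guess you could recalculate the kde curve inside it, e.g. using [scipy.stats.gaussian_kde](https://docs.scipy.org/doc/scipy-0.19.0/reference/generated/scipy.stats.gaussian_kde.html), and then take its maximum, but that seems a bit of overkill. – ImportanceOfBeingErnest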
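Both suggestions in this comment can be sketched roughly as follows. The names `kde_peak_height` and `annotated_mean_line`, the 200-point evaluation grid and the 0.9 scaling factor are assumptions chosen for illustration, not part of seaborn or of the original answer. The peak is recomputed with `scipy.stats.gaussian_kde` at its default bandwidth, so it may differ slightly from the curve seaborn actually draws.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line to use an interactive one
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import gaussian_kde

def kde_peak_height(x, num_points=200):
    """Approximate the maximum of a Gaussian KDE fitted to x (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    x = x[~np.isnan(x)]                      # ignore missing ages
    kde = gaussian_kde(x)                    # default (Scott) bandwidth
    grid_points = np.linspace(x.min(), x.max(), num_points)
    return kde(grid_points).max()

def annotated_mean_line(x, **kwargs):
    """Draw a mean line and place its annotation depending on the hue label."""
    label = kwargs.get("label", "0")         # hue level, passed as a string
    color = kwargs.get("color", "g")
    mean = np.nanmean(x)
    plt.axvline(mean, linestyle="--" if label == "1" else "-", color=color)
    # Put the two annotations on opposite sides of their lines so that
    # they do not overlap when the two means are close together.
    if label == "0":
        x_text, align = mean - 1, "right"
    else:
        x_text, align = mean + 1, "left"
    # Set the text height relative to the recomputed KDE peak.
    plt.text(x_text, 0.9 * kde_peak_height(x), "mean: {:.2f}".format(mean),
             color=color, ha=align, size=12)
```

The function would be mapped in the same way as before, for example `grid.map(annotated_mean_line, 'Age')`.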

Thanks. 'kwargs.get('label')' works perfectly. Agreed that recalculating the kde curve just for the label position is overkill; that is too ambitious for now. – chrisrb10
