2017-09-01

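Scipy: linking labels to the output Z

I compute distances between words/sentences and pass them to scipy's linkage function, but I need to know how to relate the output back to the original input. Because linkage does not accept labels, I lose them along the way.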

tl;dr: I don't know how to map my labels (var X) back onto the output of the linkage function.

import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage

X = [
    "the weather is good", 
    "it is a rainy day", 
    "it is raining today", 
    "This has something to do with today", 
    "This has something to do with tomorrow", 
] 

# my magic function 

result_set = [['this has something to do with today', 'this has something to do with tomorrow', 0.95044514149501169], 
    ['this has something to do with today', 'it is a rainy day', 0.27315656750393491], 
    ['this has something to do with today', 'it is raining today', 0.21404567560988952], 
    ['this has something to do with today', 'the weather is good', 0.12284646267479128], 
    ['this has something to do with tomorrow', 'it is a rainy day', 0.28564020977046212], 
    ['this has something to do with tomorrow', 'it is raining today', 0.19174771483161279], 
    ['this has something to do with tomorrow', 'the weather is good', 0.12920110156248313], 
    ['it is a rainy day', 'it is raining today', 0.54390124565447373], 
    ['it is a rainy day', 'the weather is good', 0.20843820300588964], 
    ['it is raining today', 'the weather is good', 0.19278767792873652]] 

sims = np.array(result_set)[:, 2] 
# sims (printed output):
# ['0.950445141495' '0.273156567504' '0.21404567561' '0.122846462675'
#  '0.28564020977' '0.191747714832' '0.129201101562' '0.543901245654'
#  '0.208438203006' '0.192787677929']

Z = linkage(sims, 'ward') 
# Z (printed output):
# [[ 0.   4.   0.12284646 2.  ]
#  [ 1.   3.   0.19174771 2.  ]
#  [ 2.   5.   0.27143491 3.  ]
#  [ 6.   7.   0.70328415 5.  ]]
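
As a reading aid for the Z printed above: each row of a linkage matrix records one merge as [cluster a, cluster b, distance, number of observations in the new cluster]. Ids below n (here 5) are the original observations in the order they were passed in, and id n + i is the cluster created by row i. The sketch below walks the rows and expands each id back into labels. `describe_merges` is a hypothetical helper, not part of scipy, and the label order is an assumption taken from the pair order in result_set (X reversed and lowercased): the first merge at 0.12284646 matches the "today" / "the weather is good" pair only under that order.

```python
def describe_merges(Z, labels):
    """For each row of a linkage matrix, list the labels in the two merged clusters.

    Hypothetical helper for illustration; not a scipy function.
    """
    n = len(labels)
    # Ids 0..n-1 are the original observations, one label each.
    clusters = {i: [label] for i, label in enumerate(labels)}
    merges = []
    for step, (a, b, dist, _count) in enumerate(Z):
        left, right = clusters[int(a)], clusters[int(b)]
        # Row `step` creates a new cluster with id n + step.
        clusters[n + step] = left + right
        merges.append((left, right, float(dist)))
    return merges


# The linkage output printed in the question, copied as-is.
Z = [
    [0, 4, 0.12284646, 2],
    [1, 3, 0.19174771, 2],
    [2, 5, 0.27143491, 3],
    [6, 7, 0.70328415, 5],
]

# Assumed observation order: the order implied by the pairs in result_set.
labels = [
    "this has something to do with today",
    "this has something to do with tomorrow",
    "it is a rainy day",
    "it is raining today",
    "the weather is good",
]

merges = describe_merges(Z, labels)
for left, right, dist in merges:
    print(f"{dist:.4f}: {left} + {right}")
```

The same list of labels, in the same order, is what `dendrogram(Z, labels=...)` expects.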

Answer

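It turns out I was feeding similarities into a function that expects distances, so after inverting sims the result does make sense. The following displays the labels correctly: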

dendrogram(
    Z, 
    labels=X, 
    orientation="right", 
    leaf_rotation=0, # rotation angle of the leaf labels
    leaf_font_size=8, # font size of the leaf labels
)
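
A minimal sketch of the fix described above, under a few assumptions: the scores in result_set are similarities in [0, 1], `1 - sim` is used as the distance (the answer only says the sims were inverted, so this is one possible reading), and `"average"` linkage replaces `"ward"` because scipy documents Ward as well defined only for Euclidean distances. The values are rounded copies of those in the question. The label order is fixed explicitly from result_set so the same list drives both the condensed vector passed to linkage and the labels passed to dendrogram. With the pair order shown in the question, that order is X reversed rather than X.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage
from scipy.spatial.distance import squareform

# Pairwise similarities from the question (rounded to 6 decimals).
result_set = [
    ["this has something to do with today", "this has something to do with tomorrow", 0.950445],
    ["this has something to do with today", "it is a rainy day", 0.273157],
    ["this has something to do with today", "it is raining today", 0.214046],
    ["this has something to do with today", "the weather is good", 0.122846],
    ["this has something to do with tomorrow", "it is a rainy day", 0.285640],
    ["this has something to do with tomorrow", "it is raining today", 0.191748],
    ["this has something to do with tomorrow", "the weather is good", 0.129201],
    ["it is a rainy day", "it is raining today", 0.543901],
    ["it is a rainy day", "the weather is good", 0.208438],
    ["it is raining today", "the weather is good", 0.192788],
]

# Fix the observation order explicitly: order of first appearance in result_set.
labels = []
for a, b, _ in result_set:
    for sentence in (a, b):
        if sentence not in labels:
            labels.append(sentence)
index = {sentence: i for i, sentence in enumerate(labels)}

# Symmetric distance matrix; distance = 1 - similarity (an assumption).
D = np.zeros((len(labels), len(labels)))
for a, b, sim in result_set:
    D[index[a], index[b]] = D[index[b], index[a]] = 1.0 - float(sim)

# squareform converts the square matrix to the condensed vector linkage expects.
condensed = squareform(D)
Z = linkage(condensed, "average")

# no_plot=True computes the leaf order without drawing anything.
leaves = dendrogram(Z, labels=labels, no_plot=True)["ivl"]
print(leaves)
```

Dropping `no_plot=True` draws the figure with matplotlib, and the `orientation` and `leaf_font_size` arguments from the answer can be passed unchanged.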