2011-08-05 65 views
9

我有一套隨機生成的形式圖,我想計算每一個的熵。同樣的問題換句話說:我有幾個網絡,並且想要計算每個網絡的信息內容。如何計算圖的熵?

下面是含有圖熵的正式定義兩個來源:
http://www.cs.washington.edu/homes/anuprao/pubs/CSE533Autumn2010/lecture4.pdf(PDF) http://arxiv.org/abs/0711.4175v1

我尋找的代碼採用的曲線圖作爲輸入(爲邊緣列表或鄰接矩陣)和輸出信息內容的許多比特或其他度量。

因爲我無法在任何地方找到這個實現,所以我打算根據正式定義從頭開始編碼。如果任何人已經解決了這個問題,並願意分享代碼,它將不勝感激。

回答

5

最後我用不同的文件供圖熵的定義:
複雜網絡的信息理論:關於進化和建築約束
R.V.鞋底和S.巴爾韋德(2004)

網絡基於拓撲結構及其在隨機網絡計算熵
B.H。 Wang,W.X.王和T.周

計算每個的代碼如下。該代碼假定你有一個沒有自循環的無向圖,未加權圖。它將一個鄰接矩陣作爲輸入,並以位爲單位返回熵的數量。它在R中實現並使用sna package

graphEntropy <- function(adj, type="SoleValverde") { 
    if (type == "SoleValverde") { 
    return(graphEntropySoleValverde(adj)) 
    } 
    else { 
    return(graphEntropyWang(adj)) 
    } 
} 

graphEntropySoleValverde <- function(adj) { 
    # Calculate Sole & Valverde, 2004 graph entropy 
    # Uses Equations 1 and 4 
    # First we need the denominator of q(k) 
    # To get it we need the probability of each degree 
    # First get the number of nodes with each degree 
    existingDegrees = degree(adj)/2 
    maxDegree = nrow(adj) - 1 
    allDegrees = 0:maxDegree 
    degreeDist = matrix(0, 3, length(allDegrees)+1) # Need an extra zero prob degree for later calculations 
    degreeDist[1,] = 0:(maxDegree+1) 
    for(aDegree in allDegrees) { 
    degreeDist[2,aDegree+1] = sum(existingDegrees == aDegree) 
    } 
    # Calculate probability of each degree 
    for(aDegree in allDegrees) { 
    degreeDist[3,aDegree+1] = degreeDist[2,aDegree+1]/sum(degreeDist[2,]) 
    } 
    # Sum of all degrees mult by their probability 
    sumkPk = 0 
    for(aDegree in allDegrees) { 
    sumkPk = sumkPk + degreeDist[2,aDegree+1] * degreeDist[3,aDegree+1] 
    } 
    # Equivalent is sum(degreeDist[2,] * degreeDist[3,]) 
    # Now we have all the pieces we need to calculate graph entropy 
    graphEntropy = 0 
    for(aDegree in 1:maxDegree) { 
    q.of.k = ((aDegree + 1)*degreeDist[3,aDegree+2])/sumkPk 
    # 0 log2(0) is defined as zero 
    if (q.of.k != 0) { 
     graphEntropy = graphEntropy + -1 * q.of.k * log2(q.of.k) 
    } 
    } 
    return(graphEntropy) 
} 

graphEntropyWang <- function(adj) { 
    # Calculate Wang, 2008 graph entropy 
    # Uses Equation 14 
    # bigN is simply the number of nodes 
    # littleP is the link probability. That is the same as graph density calculated by sna with gden(). 
    bigN = nrow(adj) 
    littleP = gden(adj) 
    graphEntropy = 0 
    if (littleP != 1 && littleP != 0) { 
    graphEntropy = -1 * .5 * bigN * (bigN - 1) * (littleP * log2(littleP) + (1-littleP) * log2(1-littleP)) 
    } 
    return(graphEntropy) 
} 
+4

順便說一下,一旦我實現這些函數並計算真實圖的熵,我對這些措施感到失望。 Wang度量僅取決於圖的大小和密度,並且根本不考慮圖的結構。它主要是衡量密度。唯一度量反映了節點之間剩餘度數的多樣性。它比其他任何東西都更能反映出一種協調性。我仍然無法量化多麼複雜或不是一個圖。 –

1

如果你有一個加權圖,一個好的開始將是對所有權重進行排序和計數。然後,您可以使用公式-log(p)+ log(2)(http://en.wikipedia.org/wiki/Binary_entropy_function)來確定代碼所需的位數。也許這不起作用,因爲它是二元熵函數?

0

您可以使用Koerner's entropy(=香農熵應用於圖)。文獻的一個很好的參考是here。但請注意,計算通常是NP-hard(對於需要搜索頂點所有子集的愚蠢原因)。