
python igraph: error at fast_community.c:553

First of all, thank you for reading this and perhaps responding.

The problem: I am on Python 2.7, and I get the following error when trying to find communities in my graph with the fastgreedy algorithm:

--------------------------------------------------------------------------- 
InternalError        Traceback (most recent call last) 
<ipython-input-180-3b8456851658> in <module>() 
----> 1 dendrogram = g_summary.community_fastgreedy(weights=edge_frequency.values()) 

/usr/local/lib/python2.7/site-packages/igraph/__init__.pyc in community_fastgreedy(self, weights) 
    959   in very large networks. Phys Rev E 70, 066111 (2004). 
    960   """ 
--> 961   merges, qs = GraphBase.community_fastgreedy(self, weights) 
    962 
    963   # qs may be shorter than |V|-1 if we are left with a few separated 

InternalError: Error at fast_community.c:553: fast-greedy community finding works only on graphs without multiple edges, Invalid value 

This is how I built my graph:

import igraph as ig 
import numpy as np 
from itertools import combinations 

vertices = words # about 600 words from a number of news articles: ['palestine', 'israel', 'hamas', 'nasa', 'mercury', 'water', ...] 

gen = ig.UniqueIdGenerator() 
[gen[word] for word in vertices] #generate word-to-integer mapping as each edge has to be between integer ids (words) 

edges = [] 
for ind in xrange(articles.shape[0]): # articles is a pandas dataframe; each row corresponds to an article; one column is 'top_words' which includes the top few words of each article. The above list *words* is the unique union set of top_words for all articles. 
    words_i = articles['top_words'].values[ind] # for one article, this looks like ['palestine','israel','hamas'] 
    edges.extend([(gen[x[0]],gen[x[1]]) for x in combinations(words_i,2)]) #basically there is an edge for each pair of top_words in a given article. For the example article above, we get edges between israel-palestine, israel-hamas, palestine-hamas. 

unique_edges = list(set(edges)) 
unique_edge_frequency = {} 
for e in unique_edges: 
    unique_edge_frequency[e] = edges.count(e)  

g = ig.Graph(vertex_attrs={"label": vertices}, edges=unique_edges, directed=False) 
g.es['width'] = np.asarray([unique_edge_frequency[e] for e in unique_edges])*1.0/max(unique_edge_frequency.values()) # iterate in edge order (not dict key order) so widths line up with g.es 

And this is the line that triggers the error:

dendrogram = g.community_fastgreedy(weights=g.es['width']) 

What am I doing wrong?

Answer


Your graph contains multiple edges (i.e., more than one edge between the same pair of vertices). Fast-greedy community detection does not work on such graphs; you have to collapse the multiple edges into single ones with g.simplify().
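A quick way to confirm this on the graph built in the question (a minimal sketch, assuming the g from above; has_multiple() and count_multiple() are igraph's checks for parallel edges). A likely source of the duplicates: (gen[a], gen[b]) and (gen[b], gen[a]) are different tuples, so both survive the set() deduplication, yet they describe the same undirected edge.

print(g.has_multiple())   # True -> some vertex pair is connected by more than one edge 
mult = g.count_multiple()   # multiplicity of every edge 
dupes = [e.tuple for e, m in zip(g.es, mult) if m > 1] 
print(dupes[:10])   # a few of the offending vertex pairs 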

It also looks like you are trying to set the "width" attribute of each edge based on how many edges there are between the same pair of vertices. Instead of building unique_edges and then unique_edge_frequency, you can simply do this:

g = ig.Graph(edges, directed=False)          # build the graph with all the parallel edges 
g.es["width"] = 1                            # every edge starts with width 1 
g.simplify(combine_edges={ "width": "sum" }) # collapse parallel edges, summing their widths 

This first creates a graph with the multiple edges, then assigns a width of 1 to every edge, and finally collapses the multiple edges into single ones while summing up their widths.
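Putting it together with the community detection call from the question (a sketch, assuming edges and vertices are built as in the question; passing n explicitly keeps words that never co-occur as isolated vertices, and passing the attribute name "width" as weights is equivalent to passing g.es['width']):

import igraph as ig 

g = ig.Graph(n=len(vertices), edges=edges, directed=False)   # parallel edges still present here 
g.vs["label"] = vertices 
g.es["width"] = 1                                            # each co-occurrence counts once 
g.simplify(combine_edges={ "width": "sum" })                 # merge duplicates, summing their widths 

dendrogram = g.community_fastgreedy(weights="width") 
clusters = dendrogram.as_clustering()                        # cut at the maximum-modularity level 
print(len(clusters))                                         # number of communities found 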


Elegant as always! – JRun