2013-05-08 81 views

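Python KeyError when visualizing data with the networkx package

I am trying to visualize Twitter account and follower information that I downloaded, by building a relationship graph with the NetworkX Python package and Gephi. This is the script I wrote to build the graph from the downloaded data: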

    import networkx as nx
    import MySQLdb

    conn = MySQLdb.connect(host="localhost",  # your host, usually localhost
                           user="root",       # your username
                           passwd="123456",   # your password
                           db="twitterbank")  # name of the database
    cur = conn.cursor()

    def get_user_info(m):
        cur.execute("SELECT tweeter_name FROM tweets_fetch where tweeter_id=%s" % m)

    g = nx.Graph()

    def add_node_tw(n, weight=None, time=None, location=None):
        if not g.has_node(n):
            screen_name = get_user_info(n)
            g.add_node(n)
            g.node[n]['weight'] = 1
            g.node[n]["screen_name"] = screen_name
        else:
            g.node[n]['weight'] += 1

    def add_edge_tw(n1, n2, weight=None):
        if not g.has_edge(n1, n2):
            g.add_edge(n1, n2)
            g[n1][n2]['weight'] = 1
        else:
            g[n1][n2]['weight'] += 1

    # generate set of users

    users = set()
    cur.execute("SELECT distinct tweeter_id FROM tweets_fetch")
    cur.fetchall()
    for row in cur:
        users.add(row[0])

    g = nx.DiGraph()

    for u_id in users:
        add_node_tw(u_id)
        cur.execute("select * from tweeter_followers where tweeter_id=%s" % u_id)
        cur.fetchall()
        for row1 in cur:
            if row1[0] in users:
                add_node_tw(row1[0])
                add_edge_tw(row1[0], row1[1])
    nx.write_graphml(g, 'relationship_graphml')

The two tables are:
tweets_fetch: with columns (tweeter_id, tweeter_name, tweet_content, datetime...)
tweeter_followers: with columns (tweeter_id, follower_id)

When I run the code above, the following error appears:

    Traceback (most recent call last):
      File "D:\Sepups\eclipse-SDK-3.7.1-win32-x86_64\eclipse\plugins\org.python.pydev_2.7.3.2013031601\pysrc\pydevd.py", line 1397, in <module>
        debugger.run(setup['file'], None, None)
      File "D:\Sepups\eclipse-SDK-3.7.1-win32-x86_64\eclipse\plugins\org.python.pydev_2.7.3.2013031601\pysrc\pydevd.py", line 1090, in run
        pydev_imports.execfile(file, globals, locals) #execute the script
      File "D:\java\python\workspace\tweetsHarvest\src\tweet_graph.py", line 47, in <module>
        add_node_tw(u_id)
      File "D:\java\python\workspace\tweetsHarvest\src\tweet_graph.py", line 24, in add_node_tw
        g.node[n]['weight']+=1
    KeyError: 'weight'

Does anyone know how to fix this? I am really new to Python and Gephi. The code I based mine on is at http://giladlotan.com/blog/mapping-twitters-python-data-science-communities/

Answer


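I wrote a script based on the same code and got the same error with one of my datasets. If your problem is the same as mine, then a few rows in your data are causing trouble. For me it was only a small fraction of several thousand edges. To find out where things go wrong, print each row just before the add_edge_tw call and wrap that call in a try/except clause.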
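As a rough sketch of that diagnostic (not the original script): the helper name `add_edges_reporting_failures` and the stand-in `fake_add_edge` below are made up for illustration. In the real script you would pass `add_edge_tw` and the rows fetched from `tweeter_followers` instead.

```python
def add_edges_reporting_failures(rows, add_edge):
    """Call add_edge(row[0], row[1]) for every row.

    Rows that raise KeyError are printed and collected instead of
    stopping the whole run. `add_edge` is any two-argument callable;
    this helper is hypothetical, not part of NetworkX.
    """
    failed = []
    for row in rows:
        try:
            add_edge(row[0], row[1])
        except KeyError as err:
            print("problem row:", row, "missing key:", err)
            failed.append(row)
    return failed


# Stand-in for add_edge_tw so the sketch runs without a database:
# it only knows the pair (1, 2) and raises KeyError for anything else.
weights = {(1, 2): 1}

def fake_add_edge(n1, n2):
    weights[(n1, n2)] += 1

bad_rows = add_edges_reporting_failures([(1, 2), (3, 4)], fake_add_edge)
```

Once you have the list of failing rows, you can look those ids up in the database and see what they have in common.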

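I am sure others who are more skilled with Python and NetworkX can give a better answer, but hopefully this gives you a quick fix while you diagnose the problem.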
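One possible cause, offered as a guess rather than a confirmed diagnosis: NetworkX's `add_edge` silently creates any endpoint that is not yet in the graph, and such a node has no attributes. A follower id that first enters the graph through add_edge_tw would therefore have no 'weight', and a later add_node_tw call on it would take the `else` branch and raise KeyError: 'weight'. The sketch below shows a more tolerant add_node_tw. Passing the graph as a parameter and using `g.nodes` (the spelling in newer NetworkX releases, where the question uses `g.node`) are my assumptions, not the original code.

```python
import networkx as nx


def add_node_tw(g, n):
    """Add node n with weight 1, or increment its weight if it exists.

    .get() covers nodes that add_edge created implicitly, which carry
    no 'weight' attribute yet.
    """
    if not g.has_node(n):
        g.add_node(n)
    attrs = g.nodes[n]
    attrs['weight'] = attrs.get('weight', 0) + 1


g = nx.DiGraph()
g.add_edge(1, 2)     # nodes 1 and 2 are created implicitly, without attributes
print(g.nodes[2])    # {}
add_node_tw(g, 2)    # no KeyError: weight starts from 0 and becomes 1
add_node_tw(g, 2)    # weight becomes 2
add_node_tw(g, 3)    # brand-new node gets weight 1
```

Another option would be to call add_node_tw on both ids before add_edge_tw, so that every node is created through the same path.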