2017-06-29

Reshaping a pandas DataFrame by stacking columns

How can I produce something like this with pandas?

in: 
data = {'post1': ['like1', 'like2'],
        'post2': ['like1', 'like2', 'like3', 'like4'],
        'post3': ['like1', 'like2', 'like3']
        }

out: 
post1 like1 
post1 like2 
post2 like1 
post2 like2 
post2 like3 
post2 like4 
post3 like1 
post3 like2 
post3 like3 

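I tried the code below, but it fails because the lists have different lengths. I could get there by building many DataFrames and appending them, but that is slow.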

from pandas import DataFrame

def run():
    result = {}

    for link in links:
        result[link] = id2screen(get_likes(link))

    # Fails here with a ValueError when the lists differ in length
    df = DataFrame.from_dict(result)
    stacked = df.set_index(keys).stack()

    stacked.to_excel(r'C:\Users\user\Desktop\out.xlsx',
                     index=False)

run()

Answer

DataFrame.from_dict with orient='index' is more tolerant of lists of different lengths; shorter rows are padded with None:

pd.DataFrame.from_dict(data, orient='index') 
Out[32]: 
           0      1      2      3
post1  like1  like2   None   None
post3  like1  like2  like3   None
post2  like1  like2  like3  like4

However,

pd.DataFrame.from_dict(data, orient='index').stack() 

gives:

Out[40]: 
post1  0    like1
       1    like2
post3  0    like1
       1    like2
       2    like3
post2  0    like1
       1    like2
       2    like3
       3    like4
dtype: object 

So, to get the pictured target output, you can add .reset_index(level=1, drop=True), which drops the inner position level of the index:

pd.DataFrame.from_dict(data, orient='index').stack().reset_index(level=1, 
                   drop=True) 
Out[34]: 
post1 like1 
post1 like2 
post3 like1 
post3 like2 
post3 like3 
post2 like1 
post2 like2 
post2 like3 
post2 like4 
dtype: object
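
As a self-contained variation on the same idea, here is a minimal sketch that goes from the dict of unequal lists to one row per (post, like) pair. It is not the method from the answer above: it swaps in Series.explode (available in pandas 0.25 and later) so the None padding never has to be created. The function name stack_lists and the column names "post" and "like" are hypothetical, chosen only for illustration.

```python
import pandas as pd


def stack_lists(data):
    """Turn {key: [values, ...]} into one row per (key, value) pair."""
    # A Series stores each list as a single object, so unequal lengths are fine.
    s = pd.Series(data, dtype=object)
    # explode() gives every list element its own row and repeats the key.
    exploded = s.explode()
    # "post" and "like" are illustrative column names, not from the question.
    return exploded.rename_axis("post").reset_index(name="like")


data = {
    "post1": ["like1", "like2"],
    "post2": ["like1", "like2", "like3", "like4"],
    "post3": ["like1", "like2", "like3"],
}

pairs = stack_lists(data)
print(pairs)
```

Because the result is an ordinary two-column DataFrame, it could then be written out with to_excel(..., index=False) as in the question, assuming an Excel writer engine is installed.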