Create a DataFrame from a dictionary whose values are variable-length lists

I have a dictionary whose values are lists, for example:

my_dict = {1: [964725688, 6928857], 
      ... 

      22: [1667906, 35207807, 685530997, 35207807], 
      ... 
      }

In this example the longest list has 4 items, but it could be longer.

I want to convert it into a DataFrame like:

1 964725688 
1 6928857 
... 
22 1667906 
22 35207807 
22 685530997 
22 35207807

Source

2017-05-11 spitfiredd

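Slightly different. That question has a fixed number of items in each list, whereas in my case the number of items per list varies. – spitfiredd

Possible duplicate of [Converting a dictionary with lists for values into a dataframe] (http://stackoverflow.com/questions/25292568/converting-a-dictionary-with-lists-for-values-into-a-dataframe) –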

import pandas as pd

# Load the dict directly into a DataFrame without loops (shorter lists are padded with NaN)
df = pd.DataFrame.from_dict(my_dict, orient='index')

# Unstack, drop the NaN padding, and sort if you need to.
df.unstack().dropna().sort_index(level=1)
Out[382]: 
0 1  964725688.0 
1 1  6928857.0 
0 22  1667906.0 
1 22  35207807.0 
2 22 685530997.0 
3 22  35207807.0 
dtype: float64
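
The float64 dtype in the output above comes from the NaN padding that fills the shorter rows. Below is a minimal sketch, assuming the sample dictionary from the question, that casts the values back to integers and returns a two-column frame. The column names `key` and `value` and the variable names are illustrative and not from the answer.

```python
import pandas as pd

my_dict = {1: [964725688, 6928857],
           22: [1667906, 35207807, 685530997, 35207807]}

# Wide frame: one row per key, shorter lists padded with NaN
wide = pd.DataFrame.from_dict(my_dict, orient="index")

long_df = (
    wide.unstack()                       # Series indexed by (position, key)
        .dropna()                        # remove the NaN padding
        .sort_index(level=1)             # group the rows by key
        .astype("int64")                 # undo the float upcast caused by NaN
        .reset_index(level=0, drop=True) # drop the position level, keep the key
        .rename_axis("key")              # hypothetical column name
        .reset_index(name="value")       # hypothetical column name
)

print(long_df)
```

The cast back to int64 is exact only while every value fits in a float64 without rounding (below 2**53), which holds for the sample data.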

Source

2017-05-11 21:36:34 Allen

My solution is similar to this one. – spitfiredd

import pandas as pd

my_dict = {1: [964725688, 6928857], 22: [1667906, 35207807, 685530997, 35207807]}

# One [key, element] row per list item
df = pd.DataFrame([[k, ele] for k, v in my_dict.items() for ele in v])

print(df)

    0 1   
0 1 964725688 
1 1 6928857 
2 22 1667906 
3 22 35207807 
4 22 685530997 
5 22 35207807

Source

2017-05-11 18:41:34 galaxyan

Nicely done. =) – Moondra

This is a nice solution! – hjmnzs

One idea
pandas

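import numpy as np
import pandas as pd

s = pd.Series(my_dict)
pd.Series(
    np.concatenate(s.values),     # all list items, flattened
    s.index.repeat(s.str.len())   # each key repeated once per item
)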

1  964725688 
1  6928857 
22  1667906 
22  35207807 
22 685530997 
22  35207807 
dtype: int64

Faster!
numpy

values = list(my_dict.values()) 
lens = [len(value) for value in values] 
keys = list(my_dict.keys()) 
pd.Series(np.concatenate(values), np.repeat(keys, lens)) 

1  964725688 
1  6928857 
22  1667906 
22  35207807 
22 685530997 
22  35207807 
dtype: int64

Interesting
pd.concat

pd.concat({k: pd.Series(v) for k, v in my_dict.items()}).reset_index(1, drop=True) 

1  964725688 
1  6928857 
22  1667906 
22  35207807 
22 685530997 
22  35207807 
dtype: int64
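
As a quick sanity check, the three options above can be compared side by side. This is only a sketch under the sample dictionary from the question. The variable names `opt_pandas`, `opt_numpy`, and `opt_concat` are made up for illustration, and the comparison goes through plain Python lists so that integer-width differences between platforms do not matter.

```python
import numpy as np
import pandas as pd

my_dict = {1: [964725688, 6928857],
           22: [1667906, 35207807, 685530997, 35207807]}

# Option 1: pandas string accessor to get the list lengths
s = pd.Series(my_dict)
opt_pandas = pd.Series(np.concatenate(s.values), s.index.repeat(s.str.len()))

# Option 2: plain Python lists plus numpy
values = list(my_dict.values())
lens = [len(value) for value in values]
keys = list(my_dict.keys())
opt_numpy = pd.Series(np.concatenate(values), np.repeat(keys, lens))

# Option 3: concat of one Series per key, then drop the inner position level
opt_concat = pd.concat(
    {k: pd.Series(v) for k, v in my_dict.items()}
).reset_index(level=1, drop=True)

# All three should carry the same keys and the same values
same_keys = list(opt_pandas.index) == list(opt_numpy.index) == list(opt_concat.index)
same_values = opt_pandas.tolist() == opt_numpy.tolist() == opt_concat.tolist()
print(same_keys, same_values)
```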

Source

2017-05-11 18:46:13 piRSquared

Slightly on the functional side, using zip and reduce:

from functools import reduce # if working with Python3 
import pandas as pd 


d = {1: [964725688, 6928857], 22: [1667906, 35207807, 685530997, 35207807]} 

df = pd.DataFrame(reduce(lambda x,y: x+y, [list(zip([k]*len(v), v)) for k,v in d.items()])) 

print(df) 

#  0   1 
# 0 1 964725688 
# 1 1 6928857 
# 2 22 1667906 
# 3 22 35207807 
# 4 22 685530997 
# 5 22 35207807

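We zip each key with its values to create records (the reduce operation joins the per-key lists into one). The records are then passed to the pd.DataFrame function.

I hope this helps.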
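
To make the two steps visible, here is a sketch that splits the one-liner into its intermediate pieces, using the same sample dictionary. The names `per_key`, `records`, and `records_chain` and the column names are illustrative. The `itertools.chain` variant is an alternative way to flatten that is not part of the answer; it avoids building a new list at every reduce step.

```python
from functools import reduce
from itertools import chain

import pandas as pd

d = {1: [964725688, 6928857], 22: [1667906, 35207807, 685530997, 35207807]}

# Step 1: zip pairs each key with every value in its list
per_key = [list(zip([k] * len(v), v)) for k, v in d.items()]
# per_key[0] is [(1, 964725688), (1, 6928857)]

# Step 2: reduce concatenates the per-key lists into one flat list of records
records = reduce(lambda x, y: x + y, per_key)

# Alternative flattening (not from the answer): chain the lists lazily
records_chain = list(chain.from_iterable(per_key))

# Column names are hypothetical; the answer leaves the default 0 and 1
df = pd.DataFrame(records, columns=["key", "value"])
print(df)
```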

Source

2017-05-11 19:13:28 Abdou
