Pandas DataFrame from list/dict/list

I have some data in this form:

a = [{'table': 'a', 'field':['apple', 'pear']}, 
    {'table': 'b', 'field':['grape', 'berry']}]

I want to create a DataFrame that looks like this:

   field table
0  apple     a
1   pear     a
2  grape     b
3  berry     b

When I try this:

pd.DataFrame.from_records(a)

I get this:

            field table
0   [apple, pear]     a
1  [grape, berry]     b

I'm currently using a loop to restructure my raw data, but I think there must be a more direct and simpler way.

Source

2017-08-25 Mike Woodward

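How do you infer 'berry c'? Shouldn't it be 'b'? – umutto

@umutto is correct - I'll edit the question –

You can use a list comprehension to concatenate a series of DataFrames, one for each dictionary in a.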

>>> pd.concat([pd.DataFrame({'table': d['table'], # Per @piRSquared for simplification. 
          'field': d['field']}) 
       for d in a]).reset_index(drop=True) 
   field table
0  apple     a
1   pear     a
2  grape     b
3  berry     b
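
For anyone who wants to run this as a script rather than in a REPL, here is a minimal self-contained sketch of the same idea. It assumes `ignore_index=True` as a stand-in for the `reset_index(drop=True)` call above, and the names `frames` and `result` are mine, not part of the answer. Column order may differ from the output shown above depending on the pandas version.

```python
import pandas as pd

a = [{'table': 'a', 'field': ['apple', 'pear']},
     {'table': 'b', 'field': ['grape', 'berry']}]

# One small frame per dictionary: the scalar d['table'] is broadcast
# against the list d['field'] by the DataFrame constructor.
frames = [pd.DataFrame({'table': d['table'], 'field': d['field']}) for d in a]

# ignore_index=True renumbers the rows 0..n-1 while concatenating
# (assumption: used here in place of reset_index(drop=True)).
result = pd.concat(frames, ignore_index=True)
print(result)
```

Because each dictionary becomes its own frame, the lists in `field` can have different lengths from one dictionary to the next.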

Source

2017-08-25 01:29:30 Alexander

I like that! Clever! – piRSquared

This is the solution I used. Perfect. –

Option 1
Comprehension

pd.DataFrame([{'table': d['table'], 'field': f} for d in a for f in d['field']]) 

   field table
0  apple     a
1   pear     a
2  grape     b
3  berry     b

Option 2
Reconstruct

d1 = pd.DataFrame(a) 
pd.DataFrame(dict(
    table=d1.table.repeat(d1.field.str.len()), 
    field=np.concatenate(d1.field) 
)).reset_index(drop=True) 

   field table
0  apple     a
1   pear     a
2  grape     b
3  berry     b
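
Option 2 packs three steps into one expression, so here is a rough step-by-step sketch of what I understand it to be doing. The intermediate names (`lengths`, `tables`, `fields`, `out`) are mine, and calling `.tolist()` before `np.concatenate` is my own assumption to keep the input an ordinary list of lists.

```python
import numpy as np
import pandas as pd

a = [{'table': 'a', 'field': ['apple', 'pear']},
     {'table': 'b', 'field': ['grape', 'berry']}]

d1 = pd.DataFrame(a)

# Number of items in each list of the 'field' column.
lengths = d1['field'].str.len()

# Repeat each table label once per item in its list.
tables = d1['table'].repeat(lengths)

# Flatten the lists into a single one-dimensional array.
fields = np.concatenate(d1['field'].tolist())

# Use plain arrays so the repeated index of `tables` does not matter.
out = pd.DataFrame({'table': tables.values, 'field': fields})
print(out)
```

The repeat counts and the flattened array come from the same column, so the two always line up row by row.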

Option 3
Rubik's Cube

pd.DataFrame(a).set_index('table').field.apply(pd.Series) \
    .stack().reset_index('table', name='field').reset_index(drop=True)

  table  field
0     a  apple
1     a   pear
2     b  grape
3     b  berry
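
Since the one-liner is hard to read at a glance, here is a sketch that splits it into its intermediate steps. The variable names (`s`, `wide`, `stacked`, `out`) are mine and only there for illustration; the calls themselves are the ones used in Option 3.

```python
import pandas as pd

a = [{'table': 'a', 'field': ['apple', 'pear']},
     {'table': 'b', 'field': ['grape', 'berry']}]

# Series of lists, indexed by the table label.
s = pd.DataFrame(a).set_index('table')['field']

# Expand each list into its own row of columns: one column per list position.
wide = s.apply(pd.Series)

# Pivot the position columns into the index, giving a (table, position) MultiIndex.
stacked = wide.stack()

# Move 'table' back into a column, name the values 'field', then drop the position index.
out = stacked.reset_index('table', name='field').reset_index(drop=True)
print(out)
```

Printing `wide` and `stacked` along the way is probably the easiest way to see why the final reshaping works.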

Source

2017-08-25 01:23:58 piRSquared

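I prefer Option 1. Given that 'table' is a scalar, I can just take its value. – Alexander

Option 3 is an interesting approach, although I wouldn't want to revisit it six months from now and ask WTF I was writing back then... (-; – Alexander

Amen to that, but hey! It's a one-liner, and I hear one-liners make things go faster and marshmallowy like unicorns. – piRSquared

Or you can try pd.wide_to_long. I wanted to use lreshape, but it is undocumented and I personally don't recommend it... T_T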

a = [{'table': 'a', 'field':['apple', 'pear']}, 
    {'table': 'b', 'field':['grape', 'berry']}] 
df=pd.DataFrame.from_records(a)

df[['Feild1','Feild2']]=df.field.apply(pd.Series) 
pd.wide_to_long(df,['Feild'],'table','lol').reset_index().drop('lol',axis=1).sort_values('table') 

Out[74]: 
  table  Feild
0     a  apple
2     a   pear
1     b  grape
3     b  berry
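
To make the role of the stub name a bit clearer, here is a small self-contained sketch of the same `pd.wide_to_long` idea. It is a variation, not the exact code above: the column names `field1`/`field2` and the suffix column `pos` are my own choices, and it assumes every list has exactly two items.

```python
import pandas as pd

a = [{'table': 'a', 'field': ['apple', 'pear']},
     {'table': 'b', 'field': ['grape', 'berry']}]

# Build the wide frame: one column per list position, sharing the stub 'field'.
# Assumption: every list has exactly two items.
wide = pd.DataFrame([d['field'] for d in a], columns=['field1', 'field2'])
wide.insert(0, 'table', [d['table'] for d in a])

# Columns 'field1' and 'field2' are melted into one 'field' column;
# their numeric suffixes end up in the index level named 'pos'.
long_df = pd.wide_to_long(wide, stubnames='field', i='table', j='pos')

# Sort by table and position, then drop the helper column and renumber the rows.
out = (long_df.reset_index()
              .sort_values(['table', 'pos'])
              .drop(columns='pos')
              .reset_index(drop=True))
print(out)
```

Sorting on both `table` and `pos` keeps the items within each table in their original list order.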

Source

2017-08-25 02:18:51 Wen
