2017-07-02 103 views
-1

Splitting a pandas column of tuples

I have a dictionary of the form:

data = {'A':[(1,2),(3,4),(5,6),(7,8),(8,9)], 
     'B':[(3,4),(4,5),(5,6),(6,7)], 
     'C':[(10,11),(12,13)]} 

The DataFrame is created with:

import pandas as pd

# wrap each list in a Series so shorter columns are padded with NaN
df = pd.DataFrame({k: pd.Series(v) for k, v in data.items()})

which gives:

     A      B        C
 (1,2)  (3,4)  (10,11)
 (3,4)  (4,5)  (12,13)
 (5,6)  (5,6)      NaN
 (7,8)  (6,7)      NaN
 (8,9)    NaN      NaN

Is there a way to go from the DataFrame above to the one below?

  A       B       C
one two one two one two
  1   2   3   4  10  11
  3   4   4   5  12  13
  5   6   5   6 NaN NaN
  7   8   6   7 NaN NaN
  8   9 NaN NaN NaN NaN
+1

Possible duplicate of [Splitting a list of tuples in a pandas dataframe column](https://stackoverflow.com/questions/31069018/splitting-a-list-of-tuples-in-a-pandas-dataframe-列) – Wen

+1

@idjaw You are right, my question was not written very well; I hope my edit explains it better. – user3191569

+0

@Wen The question you mention splits the tuples into two completely separate columns; in my case I want to use a MultiIndex. – user3191569

Answers

1

You can use a list comprehension with the DataFrame constructor, converting each column to a numpy array with values + tolist, and then join the pieces with concat:

cols = ['A', 'B', 'C']
L = [pd.DataFrame(df[x].values.tolist(), columns=['one', 'two']) for x in cols]
df = pd.concat(L, axis=1, keys=cols)
print(df)

     A       B       C
   one two one two one two
0    1   2   3   4   5   6
1    7   8   9  10  11  12
2   13  14  15  16  17  18
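The output above has three rows and no NaN, so it seems to come from a sample in which every column holds three tuples, not from the ragged data in the question. Here is a self-contained sketch of the same idea. The `sample` frame is reconstructed from that printed output and is an assumption, as are the names `parts`, `out` and `out_ragged`. For the ragged case the padded NaN entries are not tuples, so one option (again only a sketch) is to drop them per column and let concat realign on the index:

```python
import pandas as pd

# Hypothetical sample, reconstructed from the printed output above
sample = pd.DataFrame({
    'A': [(1, 2), (7, 8), (13, 14)],
    'B': [(3, 4), (9, 10), (15, 16)],
    'C': [(5, 6), (11, 12), (17, 18)],
})
cols = ['A', 'B', 'C']

# one two-column frame per original column, then glue them side by side;
# keys= becomes the top level of the column MultiIndex
parts = [pd.DataFrame(sample[x].tolist(), columns=['one', 'two']) for x in cols]
out = pd.concat(parts, axis=1, keys=cols)
print(out)

# Ragged data from the question: shorter columns are padded with NaN
data = {'A': [(1, 2), (3, 4), (5, 6), (7, 8), (8, 9)],
        'B': [(3, 4), (4, 5), (5, 6), (6, 7)],
        'C': [(10, 11), (12, 13)]}
ragged = pd.DataFrame({k: pd.Series(v) for k, v in data.items()})

parts = []
for x in cols:
    col = ragged[x].dropna()  # keep only the real tuples
    parts.append(pd.DataFrame(col.tolist(), columns=['one', 'two'], index=col.index))

# concat aligns on the index, so the missing rows come back as NaN
out_ragged = pd.concat(parts, axis=1, keys=cols)
print(out_ragged)
```

Passing `index=col.index` keeps each piece lined up with its original rows, which matters if the missing values are not all at the bottom.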

Edit:

A similar solution with a dict comprehension. The integer values in the shorter columns are converted to floats, because NaN is itself of type float.

data = {'A':[(1,2),(3,4),(5,6),(7,8),(8,9)], 
     'B':[(3,4),(4,5),(5,6),(6,7)], 
     'C':[(10,11),(12,13)]} 

cols = ['A','B','C'] 
d = {k: pd.DataFrame(v, columns=['one','two']) for k,v in data.items()} 
df = pd.concat(d, axis=1) 
print (df) 
    A        B          C
  one two  one  two   one   two
0   1   2  3.0  4.0  10.0  11.0
1   3   4  4.0  5.0  12.0  13.0
2   5   6  5.0  6.0   NaN   NaN
3   7   8  6.0  7.0   NaN   NaN
4   8   9  NaN  NaN   NaN   NaN
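If the float conversion is unwanted, one possible workaround is the nullable integer dtype. This is only a sketch and not part of the solution above: it assumes a pandas version that supports `'Int64'` (1.0 or later should be fine), and `df_int` is an illustrative name.

```python
import pandas as pd

data = {'A': [(1, 2), (3, 4), (5, 6), (7, 8), (8, 9)],
        'B': [(3, 4), (4, 5), (5, 6), (6, 7)],
        'C': [(10, 11), (12, 13)]}

d = {k: pd.DataFrame(v, columns=['one', 'two']) for k, v in data.items()}
df = pd.concat(d, axis=1)

# the shorter groups were padded with NaN, so their columns are float64
print(df.dtypes)

# Assumption: nullable integers are available; missing values stay missing
# while the present values are stored as integers again
df_int = df.astype('Int64')
print(df_int)
```

The cast works here because every non-missing value is a whole number; it would raise for values such as 3.5.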

Edit:

To multiply by a single column you can use slicers:

s = df[('A', 'one')] 
print (s) 
0 1 
1 3 
2 5 
3 7 
4 8 
Name: (A, one), dtype: int64 

df.loc(axis=1)[:, 'one'] = df.loc(axis=1)[:, 'one'].mul(s, axis=0) 
print (df) 
      A        B          C
    one two   one  two   one   two
0   1.0   2   3.0  4.0  10.0  11.0
1   9.0   4  12.0  5.0  36.0  13.0
2  25.0   6  25.0  6.0   NaN   NaN
3  49.0   8  42.0  7.0   NaN   NaN
4  64.0   9   NaN  NaN   NaN   NaN

Another solution:

idx = pd.IndexSlice 
df.loc[:, idx[:, 'one']] = df.loc[:, idx[:, 'one']].mul(s, axis=0) 
print (df) 
      A        B          C
    one two   one  two   one   two
0   1.0   2   3.0  4.0  10.0  11.0
1   9.0   4  12.0  5.0  36.0  13.0
2  25.0   6  25.0  6.0   NaN   NaN
3  49.0   8  42.0  7.0   NaN   NaN
4  64.0   9   NaN  NaN   NaN   NaN
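Both versions above write the result back into df, so running them twice multiplies twice. If you only need the scaled values, a non-mutating sketch with `DataFrame.xs` may be enough. The frame is rebuilt from the question's data so the block runs on its own, and `ones` and `scaled` are illustrative names:

```python
import pandas as pd

data = {'A': [(1, 2), (3, 4), (5, 6), (7, 8), (8, 9)],
        'B': [(3, 4), (4, 5), (5, 6), (6, 7)],
        'C': [(10, 11), (12, 13)]}
df = pd.concat({k: pd.DataFrame(v, columns=['one', 'two']) for k, v in data.items()},
               axis=1)

# the column to multiply by
s = df[('A', 'one')]

# pick the 'one' sub-column out of every top-level group;
# the result has plain columns A, B, C
ones = df.xs('one', axis=1, level=1)

# axis=0 aligns s with the rows, so every column is multiplied row by row
scaled = ones.mul(s, axis=0)
print(scaled)
```

Here df itself is left untouched, and `scaled` loses the second column level because `xs` drops it by default.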
+0

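Thank you very much. I was wondering whether there is a way to access specific columns, i.e. the 'one' columns of the DataFrame, and broadcast a calculation over all of them, e.g. multiply them by the values in the first 'one' column. – user3191569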

+0

Give me some time. – jezrael

+0

Do you mean multiplying by `df[('A', 'one')]`? – jezrael