2017-03-03 321 views
1

Appending to a column in a pandas DataFrame

I'm stuck and need some help. I have the following DataFrame:

+-----+---+---+
|     | A | B |
+-----+---+---+
| 288 | 1 | 4 |
+-----+---+---+
| 245 | 2 | 3 |
+-----+---+---+
| 543 | 3 | 6 |
+-----+---+---+
| 867 | 1 | 9 |
+-----+---+---+
| 345 | 2 | 7 |
+-----+---+---+
| 122 | 3 | 8 |
+-----+---+---+
| 233 | 1 | 1 |
+-----+---+---+
| 346 | 2 | 6 |
+-----+---+---+
| 765 | 3 | 3 |
+-----+---+---+

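Column A contains repeated values, as shown. For each row, I want a new column C that holds the value of column B from the next row with the same value in column A, as shown below: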

+-----+---+---+-----+
|     | A | B | C   |
+-----+---+---+-----+
| 288 | 1 | 4 | 9   |
+-----+---+---+-----+
| 245 | 2 | 3 | 7   |
+-----+---+---+-----+
| 543 | 3 | 6 | 8   |
+-----+---+---+-----+
| 867 | 1 | 9 | 1   |
+-----+---+---+-----+
| 345 | 2 | 7 | 6   |
+-----+---+---+-----+
| 122 | 3 | 8 | 3   |
+-----+---+---+-----+
| 233 | 1 | 1 | NaN |
+-----+---+---+-----+
| 346 | 2 | 6 | NaN |
+-----+---+---+-----+
| 765 | 3 | 3 | NaN |
+-----+---+---+-----+

Thanks.

+0

What have you tried so far? – blacksite

+0

Sounds like your best bet is to work with df.groupby('A') – BallpointBen

Answers

0

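Assuming val is one of the repeated values,

shifted = df.loc[df.A == val, 'B'].shift(-1)

creates a Series of the B values for that value of A, each moved up to the index label of the previous row with the same A.

Because the slices for different values of A share no index labels, you can use pandas.concat to stitch them together without worrying about losing data. Then just assign the result as a new column: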

df['C'] = pd.concat([df.loc[df['A'] == x, 'B'].shift(-1) for x in [1, 2, 3]]) 

When the column is assigned, the index values make everything line up:

   A  B    C
0  1  4  9.0
1  2  3  7.0
2  3  6  8.0
3  1  9  1.0
4  2  7  6.0
5  3  8  3.0
6  1  1  NaN
7  2  6  NaN
8  3  3  NaN
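
The hard-coded list [1, 2, 3] can probably be replaced by the distinct values actually present in column A. A minimal sketch, assuming the example frame from the question (rebuilt here with its original index labels); the use of df["A"].unique() and the name pieces are illustrative additions, not part of the answer's code:

```python
import pandas as pd

# Example frame from the question (values and index labels copied from the table).
df = pd.DataFrame(
    {"A": [1, 2, 3, 1, 2, 3, 1, 2, 3],
     "B": [4, 3, 6, 9, 7, 8, 1, 6, 3]},
    index=[288, 245, 543, 867, 345, 122, 233, 346, 765],
)

# For each distinct value in A, take the matching B values and shift them up
# by one, so each row receives the B value of the next row with the same A.
pieces = [df.loc[df["A"] == x, "B"].shift(-1) for x in df["A"].unique()]

# The pieces cover disjoint index labels, so concatenating them loses nothing,
# and the column assignment aligns the result on the index.
df["C"] = pd.concat(pieces)
print(df)
```

This relies on the index labels being unique; with duplicate labels the assignment could not align unambiguously.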

+0

Thanks. This works. – magicsword

0

Reverse the DataFrame's row order, group by column A and transform with the shift function, then reverse it back:

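df = df[::-1].copy()  # reverse the row order; copy so the assignment below does not write to a view
df['C'] = df.groupby('A')['B'].transform('shift')  # within each A group, take B from the row before (the next row in the original order)
df = df[::-1]  # restore the original row order
df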

   A  B    C
0  1  4  9.0
1  2  3  7.0
2  3  6  8.0
3  1  9  1.0
4  2  7  6.0
5  3  8  3.0
6  1  1  NaN
7  2  6  NaN
8  3  3  NaN
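
The two reversals can likely be avoided by shifting in the opposite direction inside each group. A minimal sketch, assuming the example frame from the question; it swaps the reverse/transform steps above for a grouped shift(-1), so it is a variation on this answer rather than its original code:

```python
import pandas as pd

# Example frame from the question (values and index labels copied from the table).
df = pd.DataFrame(
    {"A": [1, 2, 3, 1, 2, 3, 1, 2, 3],
     "B": [4, 3, 6, 9, 7, 8, 1, 6, 3]},
    index=[288, 245, 543, 867, 345, 122, 233, 346, 765],
)

# Within each group of equal A values, pull the next B value up one row.
# The last row of each group has no successor and gets NaN.
df["C"] = df.groupby("A")["B"].shift(-1)
print(df)
```

The result keeps the original row order and index, so no re-sorting is needed afterwards.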