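Pandas: reshaping repeated rows

I want to reshape a DataFrame that contains repeated rows. The data comes from a CSV file in which blocks of data are repeated.

For example: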

     Name 1st 2nd
0  Value1  a1  b1
1  Value2  a2  b2
2  Value3  a3  b3
3  Value1  a4  b4
4  Value2  a5  b5
5  Value3  a6  b6

should be reshaped into:

Name    1st  2nd  3rd  4th
Value1  a1   b1   a4   b4
Value2  a2   b2   a5   b5
Value3  a3   b3   a6   b6

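Do you have any suggestions for how to do this? I have already looked at this thread, but I can't see how to translate that approach to my problem, where there is more than one column to the right of the column that the groupby works on.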
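For anyone who wants to try the answers, the example frame can be rebuilt directly instead of reading the CSV. This is only a sketch of the sample data shown in the question; the variable name `df` is the one the answers assume.

```python
import pandas as pd

# Sample data mirroring the question: one block of three names, repeated twice.
df = pd.DataFrame({
    "Name": ["Value1", "Value2", "Value3", "Value1", "Value2", "Value3"],
    "1st": ["a1", "a2", "a3", "a4", "a5", "a6"],
    "2nd": ["b1", "b2", "b3", "b4", "b5", "b6"],
})
print(df)
```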

2017-04-24 Johannes

You can use set_index and stack to combine your two columns into one, cumcount to get the new column labels, and pivot to do the reshaping:

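# Stack the 1st and 2nd columns into one column, indexed by Name.
df = df.set_index('Name').stack().reset_index(level=1, drop=True).to_frame()
# Number the values within each Name (0, 1, 2, ...) to get the new column labels.
df['new_col'] = df.groupby(level='Name').cumcount()

# Perform a pivot to get the desired shape.
df = df.pivot(columns='new_col', values=0)

# Formatting: restore Name as a column and drop the 'new_col' axis label.
# (axis must be passed by keyword in current pandas versions.)
df = df.reset_index().rename_axis(None, axis=1)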

The resulting output:

     Name   0   1   2   3
0  Value1  a1  b1  a4  b4
1  Value2  a2  b2  a5  b5
2  Value3  a3  b3  a6  b6
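The columns above come out labelled 0 to 3 rather than 1st to 4th as in the question. A minimal sketch of the same stack / cumcount / pivot idea with ordinal labels follows. The `ordinal` helper and the `value` and `position` column names are hypothetical, introduced only for this illustration.

```python
import pandas as pd

def ordinal(n):
    """Hypothetical helper: 1 -> '1st', 2 -> '2nd', 3 -> '3rd', 4 -> '4th', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"

# Sample data from the question.
df = pd.DataFrame({
    "Name": ["Value1", "Value2", "Value3", "Value1", "Value2", "Value3"],
    "1st": ["a1", "a2", "a3", "a4", "a5", "a6"],
    "2nd": ["b1", "b2", "b3", "b4", "b5", "b6"],
})

# Stack '1st' and '2nd' into one long column, keeping Name as the index.
long = df.set_index("Name").stack().reset_index(level=1, drop=True).to_frame("value")

# Number the values within each Name, starting at 1.
long["position"] = long.groupby(level="Name").cumcount() + 1

# Pivot the positions into columns, then relabel them as ordinals.
wide = long.reset_index().pivot(index="Name", columns="position", values="value")
wide.columns = [ordinal(n) for n in wide.columns]
wide = wide.reset_index()
print(wide)
```

Because the labels are generated from the position, the same sketch should also cover files with more than two repeated blocks.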

2017-04-24 18:47:01 root

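Group by Name, build a second DataFrame from the repeated (second) row of each group, and merge it with the de-duplicated original. This takes only the second occurrence of each name, so it fits data with exactly two blocks.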

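# Take the second row of each Name group (note the list of columns in double brackets).
df1 = df.groupby('Name')[['1st', '2nd']].apply(lambda x: x.iloc[1]).reset_index()
df1.columns = ['Name', '3rd', '4th']
# Keep the first row per Name and attach the second row's values as new columns.
df = df.drop_duplicates(subset=['Name']).merge(df1, on='Name')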

You get

     Name 1st 2nd 3rd 4th
0  Value1  a1  b1  a4  b4
1  Value2  a2  b2  a5  b5
2  Value3  a3  b3  a6  b6
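If a name can occur more than twice, one possible variation (not part of the answer above) is to number each occurrence with cumcount and unstack on that number. The `block` variable and the `1st_0`-style column names are assumptions made for this sketch.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Name": ["Value1", "Value2", "Value3", "Value1", "Value2", "Value3"],
    "1st": ["a1", "a2", "a3", "a4", "a5", "a6"],
    "2nd": ["b1", "b2", "b3", "b4", "b5", "b6"],
})

# Number each occurrence of a Name: 0 for the first block, 1 for the second, ...
block = df.groupby("Name").cumcount()

# One row per Name; columns become (original column, block number) pairs.
wide = df.set_index(["Name", block]).unstack()

# Order the columns block by block. Within a block the original columns are
# sorted by label, which for this sample matches '1st', '2nd'.
wide = wide.swaplevel(axis=1).sort_index(axis=1)

# Flatten the two-level column labels, e.g. (0, '1st') -> '1st_0'.
wide.columns = [f"{col}_{i}" for i, col in wide.columns]
wide = wide.reset_index()
print(wide)
```

Names that appear in fewer blocks than others would get missing values in the extra columns rather than raising an error.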

2017-04-24 19:23:34 Vaishali
