2016-05-28 159 views

Merging rows of a DataFrame

Part of a CSV file ('data.csv') that I have to process looks like this:

parent_id,parent_name,Type,Companyname,Custsupid,Streetaddress 
3,Customer,,,C0010, 
3,Customer,A,,, 
3,Customer,,ACE SYSTEMS,, 
3,Customer,,,,Straat 10 
7,Customer,,,Q8484, 
7,Customer,B,,, 
7,Customer,,XYZ AUTOMAT,, 
7,Customer,,,,Laan 99 

To import this file into a DataFrame I do:

df = pd.read_csv('data.csv').fillna('')

This results in:

------------------------------------------------------------------
| |parent_id|parent_name|Type|Companyname|Custsupid|Streetaddress|
------------------------------------------------------------------
|0|3        |Customer   |    |           |C0010    |             |
|1|3        |Customer   |A   |           |         |             |
|2|3        |Customer   |    |ACE SYSTEMS|         |             |
|3|3        |Customer   |    |           |         |Straat 10    |
|4|7        |Customer   |    |           |Q8484    |             |
|5|7        |Customer   |B   |           |         |             |
|6|7        |Customer   |    |XYZ AUTOMAT|         |             |
|7|7        |Customer   |    |           |         |Laan 99      |
------------------------------------------------------------------

However, what I want to end up with is a DataFrame that looks like this:

------------------------------------------------------------------
| |parent_id|parent_name|Type|Companyname|Custsupid|Streetaddress|
------------------------------------------------------------------
|0|3        |Customer   |A   |ACE SYSTEMS|C0010    |Straat 10    |
|1|7        |Customer   |B   |XYZ AUTOMAT|Q8484    |Laan 99      |
------------------------------------------------------------------

I have already tried df.groupby and similar approaches, but I could not produce the desired result.

Is there a way to achieve this with a pandas DataFrame?

Answer

In [37]: df.groupby(['parent_id', 'parent_name']).sum()
Out[37]:
                      Type  Companyname Custsupid Streetaddress
parent_id parent_name
3         Customer       A  ACE SYSTEMS     C0010     Straat 10
7         Customer       B  XYZ AUTOMAT     Q8484       Laan 99

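sum concatenates the strings within each group, so this relies on the fact that adding an empty string to a non-empty string returns the non-empty string.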
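
For anyone who wants to try this without the original file, here is a minimal sketch that inlines the sample rows from the question. The names CSV_TEXT, KEYS, merged_sum and merged_first are made up for illustration. Passing as_index=False is an addition to the answer's call, assumed here so that the keys stay as ordinary columns, as in the desired output. The second variant, based on GroupBy.first(), is an alternative that is not part of the answer above: it leaves the gaps as NaN and picks the first non-null value per column instead of concatenating.

```python
import io

import pandas as pd

# Sample rows copied from the question, inlined so the sketch runs without data.csv.
CSV_TEXT = """\
parent_id,parent_name,Type,Companyname,Custsupid,Streetaddress
3,Customer,,,C0010,
3,Customer,A,,,
3,Customer,,ACE SYSTEMS,,
3,Customer,,,,Straat 10
7,Customer,,,Q8484,
7,Customer,B,,,
7,Customer,,XYZ AUTOMAT,,
7,Customer,,,,Laan 99
"""

# Hypothetical name for the grouping columns used by both variants.
KEYS = ['parent_id', 'parent_name']

# Approach from the answer: fill the gaps with '' and let sum() concatenate strings.
# as_index=False keeps the keys as regular columns instead of a MultiIndex.
df = pd.read_csv(io.StringIO(CSV_TEXT)).fillna('')
merged_sum = df.groupby(KEYS, as_index=False).sum()

# Alternative (not from the answer): keep the gaps as NaN and take the first
# non-null value in each column, so nothing depends on string concatenation.
raw = pd.read_csv(io.StringIO(CSV_TEXT))
merged_first = raw.groupby(KEYS, as_index=False).first()

print(merged_sum)
print(merged_first)
```

On this sample both variants give the same two rows. They differ when a group has more than one non-empty value in the same column: sum() would concatenate those values, while first() would keep only the earliest one.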