2016-06-17

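Replacing values in a Pandas DataFrame using a Series lookup table

I want to replace a column of values in a DataFrame with a more accurate/complete set of values generated from a lookup table, in the form of a Series, that I have already prepared.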

I thought I could do it like this, but the result is not what I expected.

Here is the DataFrame I want to fix:

In [6]: df_normalised.head(10) 
Out[6]: 
    code           name 
0 8        Human development 
1 11            
2 1       Economic management 
3 6   Social protection and risk management 
4 5       Trade and integration 
5 2      Public sector governance 
6 11 Environment and natural resources management 
7 6   Social protection and risk management 
8 7     Social dev/gender/inclusion 
9 7     Social dev/gender/inclusion 

(Note that the second row, index 1, is missing its name.)

Here is the lookup table I created to do the fixing:

In [20]: names 
Out[20]: 
1        Economic management 
10        Rural development 
11 Environment and natural resources management 
2       Public sector governance 
3          Rule of law 
4   Financial and private sector development 
5       Trade and integration 
6   Social protection and risk management 
7      Social dev/gender/inclusion 
8        Human development 
9        Urban development 
dtype: object 

This is what I thought would do it:

In [21]: names[df_normalised.head(10).code] 
Out[21]: 
code 
8        Human development 
11 Environment and natural resources management 
1        Economic management 
6   Social protection and risk management 
5       Trade and integration 
2       Public sector governance 
11 Environment and natural resources management 
6   Social protection and risk management 
7      Social dev/gender/inclusion 
7      Social dev/gender/inclusion 
dtype: object 

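However, I expected the resulting Series to have the same index as df_normalised (i.e. 0, 1, 2, 3, ...), not an index based on the code values.

So I can't work out how to replace the original values in the 'name' column of df_normalised with these Series values, because the indexes don't match.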

Incidentally, how is it possible to have an index with duplicate values, like the one above?
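On the duplicate-index point: a pandas index is not required to be unique, and selecting by a list of labels returns one entry per requested label, in the order requested. That is why the result above is indexed by code and repeats 11, 6 and 7. A minimal sketch, using a shortened lookup table and assuming the codes are stored as strings (the sample data here is made up for illustration):

```python
import pandas as pd

# Shortened, illustrative lookup table; codes are assumed to be strings.
names = pd.Series({
    "8": "Human development",
    "11": "Environment and natural resources management",
})

# Hypothetical codes, with "11" requested twice.
requested = ["8", "11", "11"]

# Label-based selection: one result row per requested label.
picked = names.loc[requested]

print(picked.index.tolist())   # the requested labels, duplicates included
print(picked.index.is_unique)  # False: an index may contain repeated labels
```

The result carries the lookup labels as its index, so it no longer lines up with the 0, 1, 2, ... index of the DataFrame.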

Answers

1

You can use the map() function:

In [38]: df_normalised['name'] = df_normalised['code'].map(names) 

In [39]: df_normalised 
Out[39]: 
    code           name 
0  8        Human development 
1 11 Environment and natural resources management 
2  1       Economic management 
3  6   Social protection and risk management 
4  5       Trade and integration 
5  2      Public sector governance 
6 11 Environment and natural resources management 
7  6   Social protection and risk management 
8  7     Social dev/gender/inclusion 
9  7     Social dev/gender/inclusion 
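As a self-contained sketch of the same idea: when map() is given a Series, it looks each value up in that Series' index and returns a result that keeps the caller's index, so it can be assigned straight back to the column. The sample data below is made up, the codes are assumed to be strings, and the extra code "99" is a hypothetical value added only to show what happens when a code is missing from the lookup table.

```python
import pandas as pd

# Illustrative lookup table (codes assumed to be strings).
names = pd.Series({
    "1": "Economic management",
    "8": "Human development",
    "11": "Environment and natural resources management",
})

# Illustrative frame: the second row has no name, and "99" is a
# hypothetical code that does not appear in the lookup table.
df = pd.DataFrame({
    "code": ["8", "11", "1", "99"],
    "name": ["Human development", "", "Economic management", "Unknown theme"],
})

# map() keeps df's own index (0, 1, 2, 3); unmatched codes become NaN.
mapped = df["code"].map(names)

# Optionally fall back to the existing name where the lookup found nothing.
df["name"] = mapped.fillna(df["name"])

print(df)
```

Without the fillna() step, rows whose code is absent from the lookup table would end up with NaN in the name column.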
+0

Excellent. Thanks! I had looked at map, but thought it was only for applying functions. – Bill

0

This works. However, I'm pretty sure there must be a simpler way to do it.

In [50]: df_normalised.name = pd.Series(names[df_normalised.code].values) 

In [51]: df_normalised.head(10) 
Out[51]: 
    code           name 
0 8        Human development 
1 11 Environment and natural resources management 
2 1       Economic management 
3 6   Social protection and risk management 
4 5       Trade and integration 
5 2      Public sector governance 
6 11 Environment and natural resources management 
7 6   Social protection and risk management 
8 7     Social dev/gender/inclusion 
9 7     Social dev/gender/inclusion 
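One caveat with this approach, sketched below with made-up data: taking .values drops the code-based index, which is what makes the assignment positional. Wrapping the array in a new pd.Series gives it a fresh 0, 1, 2, ... index, so the assignment only lines up if the DataFrame also has that default index. The non-default index [10, 20] here is a hypothetical case chosen to show the difference, and the codes are assumed to be strings.

```python
import pandas as pd

# Illustrative lookup table (codes assumed to be strings).
names = pd.Series({
    "8": "Human development",
    "11": "Environment and natural resources management",
})

# Hypothetical frame with a non-default index.
df = pd.DataFrame({"code": ["8", "11"], "name": ["Human development", ""]},
                  index=[10, 20])

# Lookup result is indexed by code, not by df's index.
looked_up = names.loc[df["code"].tolist()]

# Wrapping in a new Series yields index 0, 1, which does not match 10, 20,
# so index alignment fills the column with NaN.
misaligned = df.assign(name=pd.Series(looked_up.to_numpy()))

# Assigning the bare array is purely positional and works for any index.
df["name"] = looked_up.to_numpy()

print(misaligned)
print(df)
```

With the default index used in the question the two forms give the same result, which is why the version above works there.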