How to fill data into a DataFrame while keeping the existing values

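I have a script that fills values from a file (df4) into an existing DataFrame (df3). However, df3 contains columns that are already filled with values, and the script below sets those existing values to NaN: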

df5 = df4.pivot_table(index='source', columns='plasmidgene', values='identity').reindex(index=df3.index, columns=df3.columns)

How can I avoid overwriting my existing values? Thanks.

For example, I have df1:

    a   b    c    d    e    f
1   1  30  NaN  NaN  NaN  NaN
2   2   3  NaN  NaN  NaN  NaN
3  16   1  NaN  NaN  NaN  NaN

df2 (column names as used in the pivot_table call):

   source plasmidgene  identity
1       1           d        80
2       2           e        90
3       3           c        60

And I want to create this:

    a   b    c    d    e    f
1   1  30  NaN   80  NaN  NaN
2   2   3  NaN  NaN   90  NaN
3  16   1   60  NaN  NaN  NaN

Source

2017-04-06 Gravel

Can you add a data sample and the desired output? – jezrael

See: [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) – languitar

I think you can use combine_first:

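# Parentheses replace the backslash continuations, which break
# if any whitespace follows the backslash.
df = (df2.pivot_table(index='source', columns='plasmidgene', values='identity')
         .reindex(index=df1.index, columns=df1.columns)
         .combine_first(df1))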

print(df)
      a     b     c     d     e   f
1   1.0  30.0   NaN  80.0   NaN NaN
2   2.0   3.0   NaN   NaN  90.0 NaN
3  16.0   1.0  60.0   NaN   NaN NaN

print (df.dtypes) 
a float64 
b float64 
c float64 
d float64 
e float64 
f float64 
dtype: object
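For reference, here is a minimal self-contained sketch of the combine_first approach. The two frames are rebuilt by hand from the sample in the question, and the df2 column names (source, plasmidgene, identity) are assumed from the pivot_table call rather than shown in the original data.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (an assumption about its exact shape).
df1 = pd.DataFrame(
    {"a": [1, 2, 16], "b": [30, 3, 1],
     "c": np.nan, "d": np.nan, "e": np.nan, "f": np.nan},
    index=[1, 2, 3],
)
df2 = pd.DataFrame(
    {"source": [1, 2, 3],
     "plasmidgene": ["d", "e", "c"],
     "identity": [80, 90, 60]}
)

# Pivot df2 so that each plasmidgene becomes a column, indexed by source.
pivoted = df2.pivot_table(index="source", columns="plasmidgene", values="identity")

result = (
    pivoted
    # Align to df1's shape; cells that df2 does not cover become NaN.
    .reindex(index=df1.index, columns=df1.columns)
    # Fill those NaN cells from df1, so existing values in a and b survive.
    .combine_first(df1)
)
print(result)
```

Values coming from df2 take priority here, because combine_first only falls back to df1 where the pivoted frame is NaN.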

fillna is problematic here: it does not convert the dtypes to float64 (they stay object), so I would not use it. This looks like a bug:

# Same pipeline with fillna instead of combine_first, shown for comparison.
df = (df2.pivot_table(index='source', columns='plasmidgene', values='identity')
         .reindex(index=df1.index, columns=df1.columns)
         .fillna(df1))

print(df)
    a   b    c    d    e    f
1   1  30  NaN   80  NaN  NaN
2   2   3  NaN  NaN   90  NaN
3  16   1   60  NaN  NaN  NaN

print (df.dtypes) 
a object 
b object 
c object 
d object 
e object 
f object 
dtype: object
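If a result does end up with object dtype like the one above, one possible way to get numeric columns back is pd.to_numeric applied column by column. This is only a sketch: the frame `mixed` is a hypothetical stand-in for the fillna result, and it assumes every cell is either a number or NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical frame imitating the object-dtype result shown above.
mixed = pd.DataFrame(
    {"a": [1, 2, 16], "d": [80, np.nan, np.nan]},
    index=[1, 2, 3],
    dtype=object,
)
print(mixed.dtypes)  # every column is object

# Convert each column to a numeric dtype. With the default errors='raise',
# this fails loudly if a column holds something that is not a number.
numeric = mixed.apply(pd.to_numeric)
print(numeric.dtypes)
```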

Source

2017-04-06 08:39:48 jezrael

Yes, the last option works well! Thank you very much! – Gravel

In my opinion, using 'combine_first' is better, because mixed types are problematic: some pandas functions are buggy with them. – jezrael

If I use combine_first, I get the following error [AttributeError: 'DataFrame' object has no attribute 'dtype'] – Gravel
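The thread does not say what caused this error. One possible cause, offered here only as an assumption, is a duplicated column label: selecting such a label returns a DataFrame instead of a Series, and a DataFrame has .dtypes but no .dtype. The sketch below uses a hypothetical frame `dup` to show how to check for that and drop the repeats.

```python
import pandas as pd

# Hypothetical frame with a repeated column label "a".
dup = pd.DataFrame([[1, 2, 3]], columns=["a", "a", "b"])

# False means at least one column label occurs more than once.
print(dup.columns.is_unique)

# The labels that repeat.
print(dup.columns[dup.columns.duplicated()])

# Selecting a duplicated label gives a DataFrame, not a Series.
print(type(dup["a"]))

# One option: keep only the first occurrence of each label.
deduped = dup.loc[:, ~dup.columns.duplicated()]
print(deduped.columns.is_unique)
```

Whether keeping the first occurrence is right depends on the data; it is worth inspecting the duplicated columns before discarding any of them.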
