2017-01-04 64 views

Replace column values in Python

This is hopefully a simple question for someone out there.

I have a DataFrame that looks like this:

import pandas as pd 
names_raw = { 
    'device_id': [ '1d28d33a-c98e-4986-a7bb-5881d222c9a8','54322099-e76d-4986-afd2-0861e2113a16','ec3a9f9d-8e4d-4986-bea8-c17c361366e9','cc8e247d-4e2e-4986-b783-e516d03a358c','ca2d8769-ccf5-4986-8aed-741ca68e94cd','12178e22-6d64-4986-966a-374326fdaf3d','50ba7a2e-a1aa-4986-86a7-08e0605dc702','f427c8e9-65d4-46de-b986-8f8e79242842','cee68e2b-135f-45b0-be4b-7c23009866ba','e785988e-2693-47ad-9899-0049860ccaa7','a1986866-13f8-4dbe-b661-8c9f78eac745','a9998ecd-9fe9-4932-870d-29c6b5df1214','9b88e362-b06d-4317-96f5-f266c986a8d6','a04498ef-fd7c-4aa4-bffc-9158ccbad3a1'], 
    'pod_id': ['B00001','B00011','B00013','B00016','B00021','B00023','B00024','B00026','B00027','B00028','B00030','B00032','B00034','B00039'], 
    'native_id': ['zim_pod_0001','zim_pod_0002', 'zim_pod_0003', 'zim_pod_0004', 'zim_pod_0005', 'zim_pod_0006', 'zim_pod_0007', 'zim_pod_0008', 'zim_pod_0009', 'zim_pod_0010', 'zim_pod_0011', 'zim_pod_0012', 'zim_pod_0013','zim_pod_0014'] 
    } 
names = pd.DataFrame(names_raw, columns = ['device_id', 'pod_id', 'native_id']) 

And another DataFrame that looks like this:

>>> df 
          device_id  day month year rain 
0 1d28d33a-c98e-4986-a7bb-5881d222c9a8 31  12 2016 0.0 
1 54322099-e76d-4986-afd2-0861e2113a16 31  12 2016 0.0 
2 ec3a9f9d-8e4d-4986-bea8-c17c361366e9 31  12 2016 0.0 
3 cc8e247d-4e2e-4986-b783-e516d03a358c 31  12 2016 1.2 
4 ca2d8769-ccf5-4986-8aed-741ca68e94cd 31  12 2016 2.2 
5 12178e22-6d64-4986-966a-374326fdaf3d 31  12 2016 0.2 
6 9b88e362-b06d-4317-96f5-f266c986a8d6 31  12 2016 0.0 

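I want to replace the device_id column with the native_id column. How can I do this with the fewest lines of code?

The final DataFrame should look like this: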

>>> df 
          native_id  day month year rain 
0       zim_pod_0001 31  12 2016 0.0 
1       zim_pod_0002 31  12 2016 0.0 
2       zim_pod_0003 31  12 2016 0.0 

And so on...

Answers

0

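Use the merge() method built into pandas. It is essentially a join and is very simple to use. Specify device_id as the join key, then select the columns you want, like this: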
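A minimal sketch of such a merge, assuming the `df` and `names` DataFrames from the question (rebuilt here with only the first three rows so the block runs on its own). The exact call is not shown above, so the `how='left'` argument and the column selection are assumptions rather than the answerer's own code:

```python
import pandas as pd

# Shortened copies of the question's two DataFrames (first three rows only).
ids = [
    '1d28d33a-c98e-4986-a7bb-5881d222c9a8',
    '54322099-e76d-4986-afd2-0861e2113a16',
    'ec3a9f9d-8e4d-4986-bea8-c17c361366e9',
]
names = pd.DataFrame({
    'device_id': ids,
    'pod_id': ['B00001', 'B00011', 'B00013'],
    'native_id': ['zim_pod_0001', 'zim_pod_0002', 'zim_pod_0003'],
})
df = pd.DataFrame({
    'device_id': ids,
    'day': [31, 31, 31],
    'month': [12, 12, 12],
    'year': [2016, 2016, 2016],
    'rain': [0.0, 0.0, 0.0],
})

# Join on device_id; how='left' (an assumption) keeps every row of df
# in its original order, even if an ID were missing from names.
merged = df.merge(names, on='device_id', how='left')

# Keep native_id in place of device_id, followed by the original columns.
result = merged[['native_id', 'day', 'month', 'year', 'rain']]
print(result)
```

With a left join, a device_id that has no match in `names` would get NaN in native_id instead of being dropped.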


Result:

 native_id day month year rain 
0 zim_pod_0001 31  12 2016 0.0 
1 zim_pod_0002 31  12 2016 0.0 
2 zim_pod_0003 31  12 2016 0.0 
3 zim_pod_0004 31  12 2016 1.2 
4 zim_pod_0005 31  12 2016 2.2 
5 zim_pod_0006 31  12 2016 0.2 
6 zim_pod_0013 31  12 2016 0.0 

This works too. Thanks!! – JAG2024

1

Try this:

df['native_id'] = df.device_id.map(names.set_index('device_id')['native_id']) 

Or, if you don't want to keep the device_id column in df:

In [210]: df['native_id'] = df.pop('device_id').map(names.set_index('device_id')['native_id']) 

In [211]: df 
Out[211]: 
    day month year rain  native_id 
0 31  12 2016 0.0 zim_pod_0001 
1 31  12 2016 0.0 zim_pod_0002 
2 31  12 2016 0.0 zim_pod_0003 
3 31  12 2016 1.2 zim_pod_0004 
4 31  12 2016 2.2 zim_pod_0005 
5 31  12 2016 0.2 zim_pod_0006 
6 31  12 2016 0.0 zim_pod_0013 
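The output above puts native_id last, while the question asks for it as the first column. One possible way to get that order, sketched here on a three-row copy of the question's data, is to pass the mapped values to `DataFrame.insert` at position 0. This variant is an assumption about what the asker wants, not part of the answer itself:

```python
import pandas as pd

# Shortened copies of the question's two DataFrames (three rows only).
names = pd.DataFrame({
    'device_id': [
        '1d28d33a-c98e-4986-a7bb-5881d222c9a8',
        'cc8e247d-4e2e-4986-b783-e516d03a358c',
        '9b88e362-b06d-4317-96f5-f266c986a8d6',
    ],
    'native_id': ['zim_pod_0001', 'zim_pod_0004', 'zim_pod_0013'],
})
df = pd.DataFrame({
    'device_id': [
        '1d28d33a-c98e-4986-a7bb-5881d222c9a8',
        'cc8e247d-4e2e-4986-b783-e516d03a358c',
        '9b88e362-b06d-4317-96f5-f266c986a8d6',
    ],
    'day': [31, 31, 31],
    'month': [12, 12, 12],
    'year': [2016, 2016, 2016],
    'rain': [0.0, 1.2, 0.0],
})

# Series indexed by device_id, used as the lookup table for map().
lookup = names.set_index('device_id')['native_id']

# pop() removes device_id and returns it; the mapped values are then
# inserted as the first column (position 0) under the name native_id.
df.insert(0, 'native_id', df.pop('device_id').map(lookup))
print(df)
```

Any device_id that is missing from `names` would map to NaN.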

Great, thanks! – JAG2024

@JAG2024, glad I could help :) – MaxU