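Concatenate pandas DataFrames, keeping only rows with matching values in a column?
I want to "merge-concatenate" two pandas DataFrames. Basically, I want to stack the two DataFrames, but keep only the rows from each DataFrame whose value in the class column also appears in the other DataFrame. So, for example: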
data1:
+---+------------+-----------+-------+
| | first_name | last_name | class |
+---+------------+-----------+-------+
| 0 | Alex | Anderson | 1 |
| 1 | Amy | Ackerman | 2 |
| 2 | Allen | Ali | 3 |
| 3 | Alice | Aoni | 4 |
| 4 | Andrew | Andrews | 4 |
| 5 | Ayoung | Atiches | 5 |
+---+------------+-----------+-------+
data2:
+---+------------+-----------+-------+
| | first_name | last_name | class |
+---+------------+-----------+-------+
| 0 | Billy | Bonder | 4 |
| 1 | Brian | Black | 5 |
| 2 | Bran | Balwner | 6 |
| 3 | Bryce | Brice | 7 |
| 4 | Betty | Btisan | 8 |
| 5 | Bruce | Bronson | 8 |
+---+------------+-----------+-------+
Then the DataFrame produced by performing this operation on data1 and data2 should look like this:
result:
+---+------------+-----------+-------+
| | first_name | last_name | class |
+---+------------+-----------+-------+
| 3 | Alice | Aoni | 4 |
| 4 | Andrew | Andrews | 4 |
| 5 | Ayoung | Atiches | 5 |
| 0 | Billy | Bonder | 4 |
| 1 | Brian | Black | 5 |
+---+------------+-----------+-------+
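Essentially, I am trying to merge the two datasets and then stack the columns. I can think of a few ways to do this, but they are all hacky. I could merge data1 and data2 and then stack the columns, or use maps, like: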
import pandas as pd

# Flag the rows whose class value also appears in the other DataFrame
map1 = data1['class'].map(lambda x: x in list(data2['class']))
map2 = data2['class'].map(lambda x: x in list(data1['class']))
pd.concat([data1[map1], data2[map2]])
But is there a more elegant solution?
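For reference, here is a self-contained sketch that reproduces the tables above and the result I am after. It assumes class is the column to match on, and it swaps the map/lambda filters for Series.isin, which I believe does the same membership test. The helper name concat_matching is only made up for this illustration; it is not a pandas function.

```python
import pandas as pd

# Sample data copied from the tables in the question
data1 = pd.DataFrame({
    "first_name": ["Alex", "Amy", "Allen", "Alice", "Andrew", "Ayoung"],
    "last_name": ["Anderson", "Ackerman", "Ali", "Aoni", "Andrews", "Atiches"],
    "class": [1, 2, 3, 4, 4, 5],
})
data2 = pd.DataFrame({
    "first_name": ["Billy", "Brian", "Bran", "Bryce", "Betty", "Bruce"],
    "last_name": ["Bonder", "Black", "Balwner", "Brice", "Btisan", "Bronson"],
    "class": [4, 5, 6, 7, 8, 8],
})


def concat_matching(left, right, key):
    """Stack two frames, keeping only rows whose `key` value occurs in both.

    Hypothetical helper for illustration only.
    """
    keep_left = left[key].isin(right[key])    # boolean mask over left rows
    keep_right = right[key].isin(left[key])   # boolean mask over right rows
    # Original row labels are preserved, as in the expected result table
    return pd.concat([left[keep_left], right[keep_right]])


result = concat_matching(data1, data2, "class")
print(result)
```

This keeps the original index labels (3, 4, 5, 0, 1). Passing ignore_index=True to pd.concat would renumber the rows instead, if that is preferred.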