2016-06-24 54 views
3

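Finding the difference between merged (outer - inner) pandas DataFrames

I am trying to find the difference between the outer merge and the inner merge of two DataFrames, but without simply dropping every row that contains NaN, because I have some such rows that I want to keep. Is there a way to do this using the difference method, or preferably without having to create FrameA and FrameB?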

import pandas as pd 

DataA = pd.DataFrame([{"a": 1, "b": 4}, {"a": 6, "b": 2}, {"a": 2, "b": 5}, {"a": 3, "b": 6}, {"a": 7, "b": 2}]) 
DataB = pd.DataFrame([{"a": 2, "d": 7}, {"a": 7, "d": 8}, {"a": 3, "d": 8}]) 

DataA

a b 
0 1 4 
1 6 2 
2 2 5 
3 3 6 
4 7 2 

DataB

a d 
0 2 7 
1 7 8 
2 3 8 

...

FrameA = pd.merge(DataA, DataB, on="a", how="inner")
FrameB = pd.merge(DataA, DataB, on="a", how="outer")

FrameA

a b d 
0 2 5 7 
1 3 6 8 
2 7 2 8 

FrameB

a b d 
0 1 4 NaN 
1 6 2 NaN 
2 2 5 7 
3 3 6 8 
4 7 2 8 

Trying to find the difference between the DataFrames...

list(FrameB.index.difference(FrameA.index)) 

Perhaps you have a better solution that produces this desired output:

a b d 
0 1 4 NaN 
1 6 2 NaN 

Answers

4

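You are looking for the symmetric_difference of the two key sets: the values of a that appear in exactly one of the two frames.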

a = DataA.set_index('a') 
b = DataB.set_index('a') 

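# select rows from the outer join using the symmetric difference of the two indexes
# (in recent pandas versions the ^ operator on an Index is no longer a set operation)
a.join(b, how='outer').loc[a.index.symmetric_difference(b.index)].reset_index()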
+1

Yes, that's it! – MaxU
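
For comparison, here is a sketch of another way to get the same rows. It is not part of the answer above. It assumes the `indicator=True` option of `pd.merge`, which adds a `_merge` column marking each row as `left_only`, `right_only` or `both`. The name `result` is only illustrative.

```python
import pandas as pd

# Same data as in the question
DataA = pd.DataFrame({"a": [1, 6, 2, 3, 7], "b": [4, 2, 5, 6, 2]})
DataB = pd.DataFrame({"a": [2, 7, 3], "d": [7, 8, 8]})

# indicator=True adds a "_merge" column: "left_only", "right_only" or "both"
merged = pd.merge(DataA, DataB, on="a", how="outer", indicator=True)

# Keep only the rows whose key appears in exactly one of the two frames,
# then drop the helper column and renumber the rows
result = (
    merged[merged["_merge"] != "both"]
    .drop(columns="_merge")
    .reset_index(drop=True)
)
print(result)
```

This filters on where each key came from rather than on NaN values, so rows that contain NaN for other reasons but exist in both frames would not be selected.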