2016-07-30 33 views
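I have two DataFrames, and I want to check whether the keys of the second one appear in the first and flag the result. The first: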

import pandas as pd

data = {
    'year': ['11:23:19', '11:23:19', '11:24:19', '11:25:19', '11:25:19', '11:23:19', '11:23:19', '11:23:19', '11:23:19', '11:23:19'], 
    'store_number': ['1944', '1945', '1946', '1948', '1948', '1949', '1947', '1948', '1949', '1947'], 
    'retailer_name': ['Walmart', 'Walmart', 'CRV', 'CRV', 'CRV', 'Walmart', 'Walmart', 'CRV', 'CRV', 'CRV'], 
    'amount': [5, 5, 8, 6, 1, 5, 10, 6, 12, 11], 
    'id': [10, 10, 11, 11, 11, 10, 10, 11, 11, 10] 
} 

df1 = pd.DataFrame(data, columns = ['retailer_name', 'store_number', 'year', 'amount', 'id']) 
df1.set_index(['retailer_name', 'store_number', 'year'], inplace = True) 

retailer_name store_number year      amount  id
Walmart       1944         11:23:19       5  10
              1945         11:23:19       5  10
CRV           1946         11:24:19       8  11
              1948         11:25:19       6  11
                           11:25:19       1  11
Walmart       1949         11:23:19       5  10
              1947         11:23:19      10  10
CRV           1948         11:23:19       6  11
              1949         11:23:19      12  11
              1947         11:23:19      11  10

And the second:

data2 = { 
    'year': ['11:23:19', '11:23:19', '13:23:19'], 
    'store_number': [1944, 1947, 1978], 
    'retailer_name': ['Walmart', 'CRV', 'CRV12'], 
    'amount': [5, 11, 11] 
} 

df2 = pd.DataFrame(data2, columns = ['retailer_name', 'store_number', 'year', 'amount']) 
df2.set_index(['retailer_name', 'store_number', 'year'], inplace = True) 

retailer_name store_number year      amount
Walmart       1944         11:23:19       5
CRV           1947         11:23:19      11
CRV12         1978         13:23:19      11

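How can I check whether the keys of df2 appear in df1, flagging each row of df2 with 1 if its key appears and 0 if it does not? Expected result: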

retailer_name store_number year      amount  flag
Walmart       1944         11:23:19       5     1
CRV           1947         11:23:19      11     1
CRV12         1978         13:23:19      11     0

Answer

1

You can use the MultiIndex.intersection() method, provided you make sure that both MultiIndexes have the same dtypes:

In [74]: df2['flag'] = 0 

In [75]: df2.loc[df2.index.intersection(df1.index), 'flag'] = 1
c:\envs\py35\lib\site-packages\IPython\terminal\ipapp.py:344: PerformanceWarning: indexing past lexsort depth may impact performance. 
    self.shell.mainloop() 

In [76]: df2 
Out[76]: 
                                     amount  flag
retailer_name store_number year
Walmart       1944         11:23:19       5     1
CRV           1947         11:23:19      11     1
CRV12         1978         13:23:19      11     0

Note: this will not work with the sample DataFrames as posted, because the store_number column has different dtypes: strings in df1 and ints in df2.
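To make the dtype point concrete, here is a minimal sketch that casts store_number to integers before building the index, then flags the matches with intersection(). The data is a shortened version of the question's sample, and casting df1 (rather than df2) is just one possible choice, not something the original answer prescribes.

```python
import pandas as pd

keys = ['retailer_name', 'store_number', 'year']

# Shortened sample data; store_number is a string column here, as in the question.
df1 = pd.DataFrame({
    'retailer_name': ['Walmart', 'Walmart', 'CRV', 'CRV'],
    'store_number': ['1944', '1945', '1946', '1947'],
    'year': ['11:23:19', '11:23:19', '11:24:19', '11:23:19'],
    'amount': [5, 5, 8, 11],
})

# In the second frame store_number is an integer column.
df2 = pd.DataFrame({
    'retailer_name': ['Walmart', 'CRV', 'CRV12'],
    'store_number': [1944, 1947, 1978],
    'year': ['11:23:19', '11:23:19', '13:23:19'],
    'amount': [5, 11, 11],
})

# Align the dtypes first (assumption: every store number is a valid integer).
df1['store_number'] = df1['store_number'].astype('int64')

df1 = df1.set_index(keys)
df2 = df2.set_index(keys)

# Default every row to 0, then set 1 for the keys present in both indexes.
df2['flag'] = 0
common = df2.index.intersection(df1.index)
df2.loc[common, 'flag'] = 1

print(df2)
```

Without the astype() call the intersection would be empty, because the string '1944' and the integer 1944 are different keys.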


@NightWalker You can also use 'df2.index.isin(df1.index)' instead of 'intersection'. – ptrj
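The isin() variant from the comment might look like the sketch below. It assumes the dtypes of the two indexes already match (store_number is an integer in both frames here), and it uses a shortened version of the sample data.

```python
import pandas as pd

keys = ['retailer_name', 'store_number', 'year']

# Shortened sample data with matching dtypes in both frames.
df1 = pd.DataFrame({
    'retailer_name': ['Walmart', 'Walmart', 'CRV', 'CRV'],
    'store_number': [1944, 1945, 1946, 1947],
    'year': ['11:23:19', '11:23:19', '11:24:19', '11:23:19'],
    'amount': [5, 5, 8, 11],
}).set_index(keys)

df2 = pd.DataFrame({
    'retailer_name': ['Walmart', 'CRV', 'CRV12'],
    'store_number': [1944, 1947, 1978],
    'year': ['11:23:19', '11:23:19', '13:23:19'],
    'amount': [5, 11, 11],
}).set_index(keys)

# isin() returns a boolean array aligned with df2's rows; cast it to 0/1.
df2['flag'] = df2.index.isin(df1.index).astype(int)

print(df2)
```

This builds the flag in one step, so there is no need to initialise the column to 0 first.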
