Remove rows with uncommon column values from two DataFrames

Active:

Customer_ID | product_No| Rating 
7   | 111  | 3.0 
7   | 222  | 1.0 
7   | 333  | 5.0 
7   | 444  | 3.0

User:

Customer_ID | product_No| Rating 
9   | 111  | 2.0 
9   | 222  | 5.0 
9   | 666  | 5.0 
9   | 555  | 3.0

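I want to find the products that both users have rated (e.g. 111, 222) and remove any products that are not common to both (e.g. 444, 333, 555, 666). The new DataFrames should look like this:

Active: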

Customer_ID | product_No| Rating 
7   | 111  | 3.0 
7   | 222  | 1.0

User:

Customer_ID | product_No| Rating 
9   | 111  | 2.0 
9   | 222  | 5.0

I don't know how to do this without writing a loop. Could you help me, please?

Here is my code so far:

import pandas as pd 
# names= supplies the column labels used in the filters below.
ratings = pd.read_csv("ratings.csv", names=['Customer_ID','product_No','Rating']) 
active = ratings[ratings['Customer_ID'] == 7] 
user = ratings[ratings['Customer_ID'] == 9]

Source

2017-04-16 fsfr23

You can first get the common product_No values with a set intersection, then filter the original DataFrames with the isin method:

common_product = set(active.product_No).intersection(user.product_No) 

common_product 
# {111, 222} 

active[active.product_No.isin(common_product)] 

#Customer_ID product_No Rating 
#0   7   111  3.0 
#1   7   222  1.0 

user[user.product_No.isin(common_product)] 

#Customer_ID product_No Rating 
#0   9   111  2.0 
#1   9   222  5.0
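For anyone who wants to run the approach above end to end, here is a minimal self-contained sketch. The sample frames are copied from the question; the helper name `keep_common` and the `reset_index(drop=True)` call are my own additions for illustration, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
active = pd.DataFrame({
    "Customer_ID": [7, 7, 7, 7],
    "product_No": [111, 222, 333, 444],
    "Rating": [3.0, 1.0, 5.0, 3.0],
})
user = pd.DataFrame({
    "Customer_ID": [9, 9, 9, 9],
    "product_No": [111, 222, 666, 555],
    "Rating": [2.0, 5.0, 5.0, 3.0],
})


def keep_common(left, right, key="product_No"):
    """Hypothetical helper: keep only rows whose key appears in both frames."""
    common = set(left[key]).intersection(right[key])
    # isin builds a boolean mask, so no explicit Python loop is needed.
    left_common = left[left[key].isin(common)].reset_index(drop=True)
    right_common = right[right[key].isin(common)].reset_index(drop=True)
    return left_common, right_common


active_common, user_common = keep_common(active, user)
print(active_common)
print(user_common)
```

Unlike a merge, this keeps every original row whose product is shared, even if a product appears more than once in either frame.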

Source

2017-04-16 00:51:27 Psidom

I tried this using an INNER JOIN, as follows:

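import pandas as pd 

df1 = pd.read_csv('a.csv') 
df2 = pd.read_csv('b.csv') 
print(df1) 
print(df2) 

# Inner join on product_No keeps only products present in both frames.
df_ij = pd.merge(df1, df2, on='product_No', how='inner') 
print(df_ij) 

# Split the joined frame back into one frame per customer, using the
# suffixes pandas added to the overlapping columns.
df_list = [] 
for suffx in ['_x', '_y']: 
    df_e = df_ij[['Customer_ID'+suffx, 'product_No', 'Rating'+suffx]].copy() 
    df_e.columns = list(df1) 
    df_list.append(df_e) 

print(df_list[0]) 
print(df_list[1])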

It gives the following output:

# print df1 
    Customer_ID product_No Rating 
0   7   111  3 
1   7   222  1 
2   7   333  5 
3   7   444  3 

# print df2 
    Customer_ID product_No Rating 
0   9   111  2 
1   9   222  5 
2   9   777  5 
3   9   555  3 

# print the INNER JOINed df 
    Customer_ID_x product_No Rating_x Customer_ID_y Rating_y 
0    7   111   3    9   2 
1    7   222   1    9   5 

# print the first df you want, with common 'product_No' 
    Customer_ID product_No Rating 
0   7   111  3 
1   7   222  1 

# print the second df you want, with common 'product_No' 
    Customer_ID product_No Rating 
0   9   111  2 
1   9   222  5

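The inner join selects the rows whose product_No is common to both DataFrames. Because the two frames share column names, pandas adds a suffix to every overlapping column that is not used as the join key, so the columns can be told apart. You can then extract the columns with the appropriate suffix to get the desired final result.

There is a good example of an INNER JOIN here.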
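The suffix mechanism described above can be made explicit with the `suffixes` parameter of `pd.merge`. The following is only a sketch: the sample data comes from the question, and the suffix names `_active` / `_user` and the helper `side` are hypothetical names chosen for illustration.

```python
import pandas as pd

# Sample data copied from the question.
active = pd.DataFrame({
    "Customer_ID": [7, 7, 7, 7],
    "product_No": [111, 222, 333, 444],
    "Rating": [3.0, 1.0, 5.0, 3.0],
})
user = pd.DataFrame({
    "Customer_ID": [9, 9, 9, 9],
    "product_No": [111, 222, 666, 555],
    "Rating": [2.0, 5.0, 5.0, 3.0],
})

# Explicit suffixes replace the default _x / _y on overlapping columns.
joined = pd.merge(active, user, on="product_No", how="inner",
                  suffixes=("_active", "_user"))


def side(joined, suffix, columns, key="product_No"):
    """Hypothetical helper: pull one side out of the join and restore its column names."""
    cols = [c if c == key else c + suffix for c in columns]
    out = joined[cols].copy()
    out.columns = columns
    return out


active_common = side(joined, "_active", list(active.columns))
user_common = side(joined, "_user", list(user.columns))
print(active_common)
print(user_common)
```

This assumes each product_No appears at most once per frame. With repeated keys an inner join produces one row per matching pair, so the extracted frames would contain duplicated rows.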

Source

2017-04-16 01:01:58

Use query to reference the other DataFrame:

Active.query('product_No in @User.product_No') 

    Customer_ID product_No Rating 
0   7   111  3.0 
1   7   222  1.0 

User.query('product_No in @Active.product_No') 

    Customer_ID product_No Rating 
0   9   111  2.0 
1   9   222  5.0
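Here is a runnable sketch of the same idea, assuming the sample data from the question. The intermediate variables `user_products` and `active_products` are my own; they just keep the `@` references simple. The `not in` line is an extra showing how to inspect the rows that get dropped.

```python
import pandas as pd

# Sample data copied from the question.
active = pd.DataFrame({
    "Customer_ID": [7, 7, 7, 7],
    "product_No": [111, 222, 333, 444],
    "Rating": [3.0, 1.0, 5.0, 3.0],
})
user = pd.DataFrame({
    "Customer_ID": [9, 9, 9, 9],
    "product_No": [111, 222, 666, 555],
    "Rating": [2.0, 5.0, 5.0, 3.0],
})

# Inside a query string, @name refers to a Python variable in the calling scope.
user_products = user["product_No"]
active_products = active["product_No"]

active_common = active.query("product_No in @user_products")
user_common = user.query("product_No in @active_products")

# The complement: rows of `active` whose product the other user never rated.
active_dropped = active.query("product_No not in @user_products")

print(active_common)
print(user_common)
print(active_dropped)
```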

Source

2017-04-16 02:23:08 piRSquared

The answer to your question is:

import pandas as pd 
dict1={"Customer_id":[7,7,7,7], 
     "Product_No":[111,222,333,444], 
     "rating":[3.0,1.0,5.0,3.0]} 
active=pd.DataFrame(dict1) 
dict2={"Customer_id":[9,9,9,9], 
     "Product_No":[111,222,666,555], 
     "rating":[2.0,5.0,5.0,3.0]} 
user=pd.DataFrame(dict2) 
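# Inner join keeps only the Product_No values rated by both customers.
df3=pd.merge(active,user,on="Product_No",how="inner") 
print(df3) 
# Overlapping columns get the _x (active) and _y (user) suffixes.
active=df3[["Customer_id_x","Product_No","rating_x"]] 
print(active) 
user=df3[["Customer_id_y","Product_No","rating_y"]] 
print(user)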
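One possible variation on the merge above, if you would rather keep the original column names: merge each frame against just the shared key column, so no columns overlap and no `_x` / `_y` suffixes appear. This is a sketch using the same sample data; the name `common_keys` is my own.

```python
import pandas as pd

active = pd.DataFrame({"Customer_id": [7, 7, 7, 7],
                       "Product_No": [111, 222, 333, 444],
                       "rating": [3.0, 1.0, 5.0, 3.0]})
user = pd.DataFrame({"Customer_id": [9, 9, 9, 9],
                     "Product_No": [111, 222, 666, 555],
                     "rating": [2.0, 5.0, 5.0, 3.0]})

# One-column frame holding the Product_No values present in both frames.
common_keys = (active[["Product_No"]]
               .merge(user[["Product_No"]], on="Product_No")
               .drop_duplicates())

# Only the key column is shared here, so the other column names stay untouched.
active_common = active.merge(common_keys, on="Product_No", how="inner")
user_common = user.merge(common_keys, on="Product_No", how="inner")

print(active_common)
print(user_common)
```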

Source

2017-04-16 08:32:02
