2017-03-17 179 views
0

Merging R data frames

I have an R data frame, df_big:

Candidate Status 
A   1 
B   10 
C   12 
D   15 
E   25 

I have a second data frame, df_small:

Candidate_1 Candidate_2 
A    C 
B    E  
C    D 

I want to merge df_small and df_big to get df_final, which should look like this:

Candidate_1 Candidate_2  Status_1  Status_2 
A    C     1   12 
B    E     10   25 
C    D     12   15 

I tried something along these lines:

df_small_1 = merge(x=df_small,y = df_big,by.x = "Candidate_1",by.y="Candidate") 

df_small_2 = merge(x=df_small,y = df_big,by.x = "Candidate_2",by.y="Candidate") 

but I don't know how to combine df_small_1, df_small_2 and df_small.

+0

Something like 'df_final = merge(x = merge(x = df_small, y = df_big, by.x = "Candidate_2", by.y = "Candidate"), y = df_big, by.x = "Candidate_1", by.y = "Candidate")' – HubertL

+0

It is easier to just reshape to long form: 'library(tidyverse); df_small %>% gather(var, Candidate) %>% left_join(df_big)' – alistaire

Answers

1

You need to join twice, once for each of the two candidates' statuses:

df_result <- merge(x=df_small, y=df_big, by.x="Candidate_1", by.y="Candidate") 
df_result <- merge(x=df_result, y=df_big, by.x="Candidate_2", by.y="Candidate") 
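The two-step merge can be sketched end to end as follows. This is only an illustration: the sample data is rebuilt from the tables in the question, and the intermediate names `step1` and `step2`, the renaming of `Status`, and the final reordering are assumptions added here to reach the `df_final` layout the question asks for (`merge()` sorts by the key column and puts that column first).

```r
# Sample data rebuilt from the question (characters, not factors).
df_big <- data.frame(Candidate = c("A", "B", "C", "D", "E"),
                     Status = c(1, 10, 12, 15, 25),
                     stringsAsFactors = FALSE)
df_small <- data.frame(Candidate_1 = c("A", "B", "C"),
                       Candidate_2 = c("C", "E", "D"),
                       stringsAsFactors = FALSE)

# First join: status of Candidate_1, renamed so the second join cannot clash.
step1 <- merge(df_small, df_big, by.x = "Candidate_1", by.y = "Candidate")
names(step1)[names(step1) == "Status"] <- "Status_1"

# Second join: status of Candidate_2.
step2 <- merge(step1, df_big, by.x = "Candidate_2", by.y = "Candidate")
names(step2)[names(step2) == "Status"] <- "Status_2"

# Restore the requested row and column order.
df_final <- step2[order(step2$Candidate_1),
                  c("Candidate_1", "Candidate_2", "Status_1", "Status_2")]
rownames(df_final) <- NULL
print(df_final)
```

Renaming after each join avoids the default `Status.x` / `Status.y` suffixes that `merge()` adds when both inputs carry a `Status` column.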
0

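Merging is an expensive operation. You can do this more cheaply without merge by looking up the matching rows in df_big and indexing them directly. I benchmarked the merge and non-merge solutions below. This approach also returns the columns in the requested order.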

doit <- function(df_small, df_big) 
{ 

    # Row positions in df_big for each candidate, in df_small's row order.
    # match() keeps that order; a logical %in% mask would follow df_big's
    # row order and could pair a candidate with the wrong status.
    indx1 <- match(df_small[["Candidate_1"]], df_big[["Candidate"]]) 

    indx2 <- match(df_small[["Candidate_2"]], df_big[["Candidate"]]) 

    # Copy them 
    data.frame(Candidate_1 = df_small[["Candidate_1"]], Candidate_2 = df_small[["Candidate_2"]], 
          Status_1 = df_big[indx1, "Status"], Status_2 = df_big[indx2, "Status"]) 

} 

# Merge twice, once per candidate column 
doit_merge <- function(df_small, df_big) 
{ 
    df_result <- merge(x=df_small, y=df_big, by.x="Candidate_1", by.y="Candidate") 
    # Return the second merge directly so the result is visible to the caller 
    merge(x=df_result, y=df_big, by.x="Candidate_2", by.y="Candidate") 
} 

library(microbenchmark) 

# benchmark results 
microbenchmark(
    doit(df_small, df_big) , 
    doit_merge(df_small, df_big) 
) 

Results

Unit: microseconds 
                         expr      min       lq     mean    median        uq      max neval cld 
       doit(df_small, df_big)  676.570  758.472 1077.203  834.0115  978.9315 4591.473   100   a 
 doit_merge(df_small, df_big) 1329.327 1449.205 1986.995 1612.3940 2021.9070 5966.780   100   b
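
The lookup-and-index idea in this answer can also be sketched with a named vector, which may be easier to read for small tables. This is an illustrative variant, not the benchmarked code: the sample data is rebuilt from the question, the name `status_by_candidate` is hypothetical, and it assumes each candidate appears only once in df_big.

```r
# Sample data rebuilt from the question (characters, not factors).
df_big <- data.frame(Candidate = c("A", "B", "C", "D", "E"),
                     Status = c(1, 10, 12, 15, 25),
                     stringsAsFactors = FALSE)
df_small <- data.frame(Candidate_1 = c("A", "B", "C"),
                       Candidate_2 = c("C", "E", "D"),
                       stringsAsFactors = FALSE)

# Named vector acting as a lookup table: candidate name -> status.
status_by_candidate <- setNames(df_big$Status, df_big$Candidate)

# Index the lookup by name; as.character() guards against factor columns,
# which would otherwise be used as integer positions.
df_final <- data.frame(
    Candidate_1 = df_small$Candidate_1,
    Candidate_2 = df_small$Candidate_2,
    Status_1 = unname(status_by_candidate[as.character(df_small$Candidate_1)]),
    Status_2 = unname(status_by_candidate[as.character(df_small$Candidate_2)]),
    stringsAsFactors = FALSE)
print(df_final)
```

A candidate missing from df_big yields `NA` in the corresponding status column, whereas `merge()` with its default settings would drop that row.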