2015-11-02 74 views

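I would like to compare rows within a table by order number and append a column that states the differences. For example, I would like this R data frame, where the differences are computed and added to a column,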

  order  color type    shape            alert
1     1   blue    a   circle             type
2     1   blue    b   circle
3     2  green    a   circle            color
4     2   blue    a   circle color type shape
5     2 yellow    b triangle             type
6     2 yellow    c triangle
7     3 orange    c triangle

to look like this:

  order  color type    shape            alert
1     1   blue    a   circle             type
2     1   blue    b   circle
3     2  green    a   circle color type shape
4     2   blue    a   circle
5     2 yellow    b triangle
6     2 yellow    c triangle
7     3 orange    c triangle

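My code only compares two rows that are next to each other. How can I efficiently compare all rows that share the same order number? Can I avoid a loop? Here is my code: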

order = c(0001, 0001, 0002, 0002, 0002, 0002, 0003) 
color = c("blue", "blue", "green", "blue", "yellow", "yellow", "orange") 
type = c("a", "b", "a", "a", "b", "c", "c") 
shape = c("circle", "circle", "circle", "circle", "triangle", "triangle", "triangle") 
df = data.frame(order, color, type, shape) 

df$alert <- "" 

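# seq_len(nrow(df) - 1) avoids the precedence trap in 1:nrow(df)-1,
# which evaluates as (1:nrow(df)) - 1 and therefore starts at index 0
for (i in seq_len(nrow(df) - 1)) {
    # only compare a row with the next one when both belong to the same order
    if (identical(df$order[i + 1], df$order[i])) {
        if (!identical(df$color[i + 1], df$color[i])) {
            df$alert[i] <- paste(df$alert[i], "color")
        }
        if (!identical(df$type[i + 1], df$type[i])) {
            df$alert[i] <- paste(df$alert[i], "type")
        }
        if (!identical(df$shape[i + 1], df$shape[i])) {
            df$alert[i] <- paste(df$alert[i], "shape")
        }
    }
}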

Answer


Here is a dplyr-based solution (gather() comes from the tidyr package):

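library(dplyr)
library(tidyr)  # gather() lives in tidyr, not dplyr

# dat1 is assumed to be the question's data frame without the alert column
dat1 %>% gather(measure, val, -order, factor_key = TRUE) %>%  # long format; factor_key keeps the column order
     group_by(order, measure) %>%
     summarise(alerts = length(unique(val))) %>%              # distinct values per order and column
     filter(alerts > 1) %>%                                   # keep columns that vary within an order
     summarise(alerts = paste0(measure, collapse = " ")) %>%  # one alert string per order
     left_join(dat1, ., by = "order")                         # attach it to every row of that order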

  order  color type    shape           alerts
1     1   blue    a   circle             type
2     1   blue    b   circle             type
3     2  green    a   circle color type shape
4     2   blue    a   circle color type shape
5     2 yellow    b triangle color type shape
6     2 yellow    c triangle color type shape
7     3 orange    c triangle             <NA>
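
Note that the output above repeats the alert on every row of an order, while the question asks for it only on the first row of each order, with empty strings elsewhere. A minimal base R sketch of that variant, without an explicit loop, might look like the following. The names cols, group_alert and first_row are made up for this illustration, and the data frame is rebuilt with stringsAsFactors = FALSE as an assumption so the columns are plain character vectors.

```r
# Rebuild the question's data frame (assumption: plain character columns).
df <- data.frame(
  order = c(1, 1, 2, 2, 2, 2, 3),
  color = c("blue", "blue", "green", "blue", "yellow", "yellow", "orange"),
  type  = c("a", "b", "a", "a", "b", "c", "c"),
  shape = c("circle", "circle", "circle", "circle", "triangle", "triangle", "triangle"),
  stringsAsFactors = FALSE
)

cols <- c("color", "type", "shape")  # columns to compare within each order

# One alert string per order: the names of the columns that hold
# more than one distinct value among the rows of that order.
group_alert <- sapply(split(df[cols], df$order), function(g) {
  differs <- vapply(g, function(v) length(unique(v)) > 1, logical(1))
  paste(names(g)[differs], collapse = " ")
})

# Write the string on the first row of each order and "" on the others.
first_row <- !duplicated(df$order)
df$alert <- ifelse(first_row, group_alert[as.character(df$order)], "")

df$alert
# "type" "" "color type shape" "" "" "" ""
```

Because split() groups by the value of order rather than by position, this also works when rows with the same order number are not adjacent.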