2014-11-21

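R: removing duplicated rows

I have a data frame and I want to remove every row that has a duplicate, so that no copy of a repeated value is kept. For example, my data frame looks like this: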

> df <- data.frame(A = c("Happy", "Happy", "Sad", "Confused", "Mad", "Mad"), B = c(1, 2, 3, 4, 5, 6))
> df
         A B
1    Happy 1
2    Happy 2
3      Sad 3
4 Confused 4
5      Mad 5
6      Mad 6

I only want to keep the rows whose entry in A is unique, to get:

         A B
1      Sad 3
2 Confused 4

Answers

4

You could try duplicated:

df[!(duplicated(df$A) | duplicated(df$A, fromLast = TRUE)), ]
#         A B
#3      Sad 3
#4 Confused 4
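To see why both calls to duplicated are needed, it can help to look at the intermediate logical vectors. The following is a minimal sketch that rebuilds the df from the question; the names dup_forward, dup_backward and keep are only illustrative and do not appear in the original answer.

```r
df <- data.frame(A = c("Happy", "Happy", "Sad", "Confused", "Mad", "Mad"),
                 B = c(1, 2, 3, 4, 5, 6))

# duplicated() flags every occurrence of a value after its first one
dup_forward <- duplicated(df$A)
# FALSE TRUE FALSE FALSE FALSE TRUE

# with fromLast = TRUE the scan starts at the end, so every
# occurrence before the last one is flagged instead
dup_backward <- duplicated(df$A, fromLast = TRUE)
# TRUE FALSE FALSE FALSE TRUE FALSE

# a value is unique only if neither scan flags it
keep <- !(dup_forward | dup_backward)
df[keep, ]
```

Using only one of the two scans would still keep one copy of "Happy" and one of "Mad", which is what unique() or a plain !duplicated() filter does.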

df[df$A %in% with(as.data.frame(table(df$A)), Var1[Freq == 1]), ]
#         A B
#3      Sad 3
#4 Confused 4
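The table-based one-liner can be unpacked into separate steps, which may make it easier to read. This is a sketch of the same idea rather than the answerer's exact code; counts and unique_vals are names introduced here for illustration.

```r
df <- data.frame(A = c("Happy", "Happy", "Sad", "Confused", "Mad", "Mad"),
                 B = c(1, 2, 3, 4, 5, 6))

# frequency of each distinct value in column A
counts <- table(df$A)

# values that occur exactly once
unique_vals <- names(counts)[counts == 1]

# keep only the rows whose A value is in that set
df[df$A %in% unique_vals, ]
```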

df[colSums(sapply(df$A, `==`, df$A)) == 1, ]
#         A B
#3      Sad 3
#4 Confused 4

df[colSums(Vectorize(function(x) x==df$A)(df$A))==1,] 

Or use data.table (similar to @beginner's use of ave):

library(data.table)
setDT(df)[, .SD[.N == 1], by = A]
#          A B
#1:      Sad 3
#2: Confused 4
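The filter above relies on .N, which data.table sets to the number of rows in the current by group, and on .SD, which holds that group's rows. A small sketch, assuming the data.table package is installed; dt, counts and result are illustrative names, and as.data.table is used here so that the original df is not modified by reference.

```r
library(data.table)

df <- data.frame(A = c("Happy", "Happy", "Sad", "Confused", "Mad", "Mad"),
                 B = c(1, 2, 3, 4, 5, 6))
dt <- as.data.table(df)

# number of rows per distinct value of A
counts <- dt[, .N, by = A]

# within each group, keep the group's rows (.SD) only when the
# group has exactly one row; other groups contribute zero rows
result <- dt[, .SD[.N == 1], by = A]
```

Groups are returned in order of first appearance, so counts lists Happy, Sad, Confused, Mad.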

setDT(df)[df[, .I[.N == 1], by = A]$V1]
#          A B
#1:      Sad 3
#2: Confused 4

3

akrun seems to be collecting all the different methods, so here's another one in base R:

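# seq_along(df$A) is only a numeric placeholder for ave(); unlike as.numeric(df$A)
# it gives no coercion warning when A is a character column rather than a factor
df[ave(seq_along(df$A), df$A, FUN = length) == 1, ]
#         A B
#3      Sad 3
#4 Confused 4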
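The ave call returns, for every row, the size of the group that row belongs to, which is why comparing it with 1 picks out the unique entries. A minimal sketch that shows the intermediate vector; n_per_value is an illustrative name, and seq_along(df$A) is used as the numeric input on the assumption that A may be a character column (the default in R 4.0 and later) rather than a factor.

```r
df <- data.frame(A = c("Happy", "Happy", "Sad", "Confused", "Mad", "Mad"),
                 B = c(1, 2, 3, 4, 5, 6))

# for each row, the number of rows sharing its A value
n_per_value <- ave(seq_along(df$A), df$A, FUN = length)
# 2 2 1 1 2 2

# rows whose A value occurs exactly once
df[n_per_value == 1, ]
```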

(I guess the one with duplicated would be the most commonly used approach.)

Or using dplyr:

library(dplyr)
group_by(df, A) %>% filter(n() == 1)
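Note that the result of group_by followed by filter is still a grouped tibble. A sketch of the same pipeline with an explicit ungroup() at the end, assuming the dplyr package is installed; result is an illustrative name.

```r
library(dplyr)

df <- data.frame(A = c("Happy", "Happy", "Sad", "Confused", "Mad", "Mad"),
                 B = c(1, 2, 3, 4, 5, 6))

result <- df %>%
  group_by(A) %>%        # one group per distinct value of A
  filter(n() == 1) %>%   # n() is the size of the current group
  ungroup()              # drop the grouping so later steps see a plain tibble
```

Dropping the grouping is optional here, but it avoids surprises if further summarise or mutate steps are chained onto the result.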
+0

I was waiting for the 'dplyr' answer. What took you so long? ;-) – 2014-11-21 18:42:02

+0

Thanks for your comment :) – 2014-11-22 12:08:41