Select the row with the maximum value per group

I have been trying to do this by following other posts, but I keep getting error messages. My data, new, looks like this:

id year name gdp 
1 1980 Jamie 45 
1 1981 Jamie 60 
1 1982 Jamie 70 
2 1990 Kate 40 
2 1991 Kate 25 
2 1992 Kate 67 
3 1994 Joe  35 
3 1995 Joe  78 
3 1996 Joe  90

For each id, I want to select the row with the highest year value. So the desired output is:

id year name gdp 
1 1982 Jamie 70 
2 1992 Kate 67 
3 1996 Joe  90

Following "Selecting Rows which contain daily max value in R", I tried the following, but it did not work:

ddply(new,~id,function(x){x[which.max(new$year),]})

I also tried

tapply(new$year, new$id, max)

but this does not give me the desired output.

Any suggestions would really help!


2014-04-04 song0089

Just use split:

df <- do.call(rbind, lapply(split(df, df$id), 
    function(subdf) subdf[which.max(subdf$year)[1], ]))

For example,

df <- data.frame(id = rep(1:10, each = 3), year = round(runif(30,0,10)) + 1980, gdp = round(runif(30, 40, 70))) 
print(head(df)) 
# id year gdp 
# 1 1 1990 49 
# 2 1 1981 47 
# 3 1 1987 69 
# 4 2 1985 57 
# 5 2 1989 41 
# 6 2 1988 54 

df <- do.call(rbind, lapply(split(df, df$id), function(subdf) subdf[which.max(subdf$year)[1], ])) 
print(head(df)) 
# id year gdp 
# 1 1 1990 49 
# 2 2 1989 41 
# 3 3 1989 55 
# 4 4 1988 62 
# 5 5 1989 48 
# 6 6 1990 41


2014-04-04 02:14:26

Honestly, this seems overly complicated for this task. You are basically using 'split' + 'lapply' to recreate 'by'. – thelatemail
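To make the comment above concrete, here is a minimal sketch of the same idea written with by instead of split + lapply. It assumes the data frame from the question is called new; the name res is only for illustration.

```r
# Data from the question
new <- read.table(text = "id year name gdp
1 1980 Jamie 45
1 1981 Jamie 60
1 1982 Jamie 70
2 1990 Kate 40
2 1991 Kate 25
2 1992 Kate 67
3 1994 Joe 35
3 1995 Joe 78
3 1996 Joe 90", header = TRUE)

# by() splits `new` on id and applies the function to each piece;
# do.call(rbind, ...) stacks the one-row results back into a data frame
res <- do.call(rbind, by(new, new$id, function(x) x[which.max(x$year), ]))
print(res)
```

Like the split version, which.max keeps only the first row when several rows in a group share the maximum year.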

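Your ddply works fine for me, except that inside the callback function you refer to the original data set instead of the per-group subset.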

library(plyr)

# original: indexes with the full data frame `new` rather than the group `x`
ddply(new,~id,function(x){x[which.max(new$year),]}) 
# should be 
ddply(new,.(id),function(x){x[which.max(x$year),]})
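Why the original call fails can be seen without plyr. The sketch below mimics the per-group call in base R with split and lapply (it is not plyr's internals); the names idx_global, groups, wrong and right are just for illustration.

```r
# Data from the question
new <- read.table(text = "id year name gdp
1 1980 Jamie 45
1 1981 Jamie 60
1 1982 Jamie 70
2 1990 Kate 40
2 1991 Kate 25
2 1992 Kate 67
3 1994 Joe 35
3 1995 Joe 78
3 1996 Joe 90", header = TRUE)

# Position of the maximum year in the FULL data set (row 9)
idx_global <- which.max(new$year)

groups <- split(new, new$id)

# Using the global position inside a 3-row group points past its end,
# so every group yields a row of NAs
wrong <- lapply(groups, function(x) x[idx_global, ])

# Using the group's own year column gives a valid position within the group
right <- lapply(groups, function(x) x[which.max(x$year), ])

print(wrong[["1"]])
print(right[["1"]])
```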


2014-04-04 02:33:57

It seems this should be the accepted answer. –

You can use duplicated:

# your data 
df <- read.table(text="id year name gdp 
1 1980 Jamie 45 
1 1981 Jamie 60 
1 1982 Jamie 70 
2 1990 Kate 40 
2 1991 Kate 25 
2 1992 Kate 67 
3 1994 Joe  35 
3 1995 Joe  78 
3 1996 Joe  90" , header=TRUE) 

# Sort by id and year (latest year is last for each id) 
df <- df[order(df$id , df$year), ] 

# Select the last row by id 
df <- df[!duplicated(df$id, fromLast=TRUE), ]
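As a small illustration of why the last step works, the sketch below runs duplicated with fromLast=TRUE on a bare vector of ids (the vector ids and the name keep are made up for this example). Scanning from the end, every occurrence except the last one in each group is flagged as a duplicate, so negating the result keeps exactly the last row per id.

```r
# ids as they appear after sorting by id and year
ids <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)

# TRUE for every element that has an identical element later in the vector
dup_from_last <- duplicated(ids, fromLast = TRUE)

# Negate to keep only the last occurrence of each id
keep <- !dup_from_last
print(which(keep))
```

This is also why the sort by year has to come first: the approach keeps whatever row happens to be last within each id.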


2014-04-04 02:34:52 user20650

Another option, which scales well to large tables, is to use data.table.

DT <- read.table(text = "id year name gdp 
          1 1980 Jamie 45 
          1 1981 Jamie 60 
          1 1982 Jamie 70 
          2 1990 Kate 40 
          2 1991 Kate 25 
          2 1992 Kate 67 
          3 1994 Joe  35 
          3 1995 Joe  78 
          3 1996 Joe  90", 
       header = TRUE) 

require("data.table") 
DT <- as.data.table(DT) 

setkey(DT,id,year) 
# latest year per id (the question asks for the maximum year, not the maximum gdp)
res = DT[,j=list(year=max(year)),by=id] 
res 

setkey(res,id,year) 
DT[res] 
# id year name gdp 
# 1: 1 1982 Jamie 70 
# 2: 2 1992 Kate 67 
# 3: 3 1996 Joe 90


2014-04-04 02:55:09 marbel

ave works here as well, and it also handles the case where several rows share the maximum year.

new[with(new, year == ave(year,id,FUN=max)),] 

# id year name gdp 
#3 1 1982 Jamie 70 
#6 2 1992 Kate 67 
#9 3 1996 Joe 90
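To show the difference on ties, here is a small sketch with made-up data (the data frame tied is hypothetical, not from the question) in which id 1 has two rows for its latest year. The ave comparison keeps both of them, while a which.max based approach keeps only the first.

```r
# Hypothetical data: id 1 has two rows for its maximum year, 1982
tied <- data.frame(id   = c(1, 1, 1, 2, 2),
                   year = c(1981, 1982, 1982, 1990, 1991),
                   gdp  = c(60, 70, 71, 40, 25))

# ave() returns the group maximum repeated for every row of the group,
# so the comparison is TRUE for every row that ties for the maximum
all_max <- tied[with(tied, year == ave(year, id, FUN = max)), ]

# which.max() returns only the first position of the maximum in each group
first_max <- do.call(rbind, lapply(split(tied, tied$id),
                                   function(x) x[which.max(x$year), ]))

print(all_max)
print(first_max)
```

Which behaviour is preferable depends on whether tied rows should all be returned or whether exactly one row per id is needed.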


2014-04-04 03:20:22 thelatemail
