R: compare the current row with the next row (in the same column)

I have something like:

ISBN Date Quantity 
3457 2004 10 
3457 2004 6 
3457 2004 10 
3457 2005 7 
3457 2005 12 
9885 2013 10 
9885 2013 6 
9855 2013 10 
9885 2014 7 
9885 2014 12

And I want:

ISBN Date Quantity Year 
3457 2004 10  1st Year 
3457 2004 6  1st Year 
3457 2004 10  1st Year 
3457 2005 7  2nd Year 
3457 2005 12  2nd Year 
9885 2013 10  1st Year 
9885 2013 6  1st Year 
9855 2013 10  1st Year 
9885 2014 7  2nd Year 
9885 2014 12  2nd Year

And I have this code:

df<-df %>% group_by(ISBN) %>% mutate(Year = ifelse(DateYear > DateYear,"1st Year","2nd Year"))

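But I get "2nd Year" everywhere, so I suspect the comparison inside ifelse is not actually comparing rows within the Date column. I thought I would have to use a for loop, but I believe there is another way to do this in R. How can I get the result I need?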

Source

2016-10-21 adlisval

I don't understand your criteria for '1st Year' and '2nd Year'. Why does 2013 go back to 1st Year and 2014 to 2nd Year? – Phil

@Phil Please see here: https://stackoverflow.com/questions/40159194/r-plot-months-for-the-first-2-years – adlisval

As mentioned in the comments, in case you have more than two years per ISBN, you can do this:

library(dplyr) 
library(toOrdinal) 

df %>% 
    group_by(ISBN) %>% 
    mutate(Year = paste(sapply(cumsum(Date != lag(Date, default = 0)), toOrdinal), "Year"))

For this example data:

# ISBN Date Quantity 
#1 3457 2004  10 
#2 3457 2004  6 
#3 3457 2005  10 
#4 3457 2006  7 
#5 3457 2007  12 
#6 9885 2013  10 
#7 9885 2014  6 
#8 9855 2015  10 
#9 9885 2015  7 
#10 9885 2016  12

this gives:

#Source: local data frame [10 x 4] 
#Groups: ISBN [3] 
# 
# ISBN Date Quantity  Year 
# <int> <int> <int> <chr> 
#1 3457 2004  10 1st Year 
#2 3457 2004  6 1st Year 
#3 3457 2005  10 2nd Year 
#4 3457 2006  7 3rd Year 
#5 3457 2007  12 4th Year 
#6 9885 2013  10 1st Year 
#7 9885 2014  6 2nd Year 
#8 9855 2015  10 1st Year 
#9 9885 2015  7 3rd Year 
#10 9885 2016  12 4th Year
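To see why the `cumsum(Date != lag(...))` expression produces a running year index, here is a minimal sketch of the same idea on a single ISBN's dates. It uses only base R, with `c(0, head(Date, -1))` as an assumed stand-in for `lag(Date, default = 0)`, and the vector `Date` is made up for illustration.

```r
# One ISBN's dates, assumed to be already in chronological order
Date <- c(2004, 2004, 2005, 2006, 2007)

# Base-R stand-in for dplyr::lag(Date, default = 0): shift down by one row
prev <- c(0, head(Date, -1))

# TRUE wherever the year differs from the row above
changed <- Date != prev

# Running count of year changes, i.e. 1 for the first year, 2 for the second, ...
year_index <- cumsum(changed)

print(changed)
print(year_index)
```

The trick relies on rows of the same year being adjacent within each ISBN; if they are not, sorting by Date first would be needed.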

Source

2016-10-21 11:26:17

Using windowing logic:

library(dplyr) 
library(readr) 

df_foo = read.table(textConnection("ISBN Date Quantity 
3457 2004 10 
3457 2004 6 
3457 2004 10 
3457 2005 7 
3457 2005 12 
9885 2013 10 
9885 2013 6 
9855 2013 10 
9885 2014 7 
9885 2014 12"), header = TRUE, stringsAsFactors = FALSE) 


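df_foo %>% 
    group_by(ISBN) %>% 
    arrange(Date) %>% 
    mutate(
        # count year changes within each ISBN; any change so far means "2nd Year"
        # (this assumes at most two distinct years per ISBN)
        Year = ifelse(
            cumsum(Date != lag(Date, default = first(Date))) > 0, 
            "2nd Year", "1st Year" 
        ) 
    )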

Source

2016-10-21 10:40:16 tchakravarty

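It's close, but it gives: – adlisval

Sorry. It's close, but it gives: 'ISBN Date Quantity Year 3457 2004 10 1st Year 3457 2004 6 1st Year 3457 2004 10 1st Year 3457 2005 7 1st Year 3457 2005 12 1st Year 9885 2013 10 1st Year 9885 2013 6 1st Year 9855 2013 10 1st Year 9885 2014 7 2nd Year 9885 2014 12 1st Year' So after giving the correct result the first time, 2004 > 2003 = "2nd Year", it goes on to 2004 > 2004 = "1st Year" – adlisval

@adlisval Are you sure there are only two years within each ISBN? – tchakravarty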

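Just for completeness, and because I personally prefer solutions like this, here is one that uses only base R, relying on split and lapply. Effectively, it loops over the distinct values of ISBN.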

# examples data (note possible error on line 8, ISBN==9855) 
dat0 <- read.table(text="ISBN Date Quantity 
3457 2004 10 
3457 2004 6 
3457 2004 10 
3457 2005 7 
3457 2005 12 
9885 2013 10 
9885 2013 6 
9855 2013 10 
9885 2014 7 
9885 2014 12", header=T) 

# treat each ISBN separately (loop using 'lapply') 
datlist <- split(dat0, dat0$ISBN) 
# within each ISBN, number the distinct years in ascending order 
datlist <- lapply(datlist, 
    function(x) within(x, Year <- as.numeric(as.factor(Date)))) 

# bind together 
dat <- do.call(rbind, datlist) 
rownames(dat) <- NULL

Output:

# ISBN Date Quantity Year 
# 1 3457 2004  10 1 
# 2 3457 2004  6 1 
# 3 3457 2004  10 1 
# 4 3457 2005  7 2 
# 5 3457 2005  12 2 
# 6 9855 2013  10 1 
# 7 9885 2013  10 1 
# 8 9885 2013  6 1 
# 9 9885 2014  7 2 
# 10 9885 2014  12 2

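Note that this method reorders the data so that the rows end up sorted by ISBN. Also, I didn't bother encoding the Year column as 1st Year, 2nd Year, ... etc., because I don't really see the value over a simpler format like 1, 2, ....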
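If the original row order matters, one possible variation (a sketch, not part of the answer above) is to use base R's `ave`, which applies a function per group and returns the results in the original row order. The helper `ordinal_label` is a hypothetical name introduced here only to show how the "1st Year", "2nd Year" labels could be built without an extra package.

```r
dat <- read.table(text = "ISBN Date Quantity
3457 2004 10
3457 2004 6
3457 2004 10
3457 2005 7
3457 2005 12
9885 2013 10
9885 2013 6
9855 2013 10
9885 2014 7
9885 2014 12", header = TRUE)

# Position of each Date among the sorted distinct dates of its ISBN;
# ave() keeps the rows in their original order
dat$Year <- ave(dat$Date, dat$ISBN,
                FUN = function(d) match(d, sort(unique(d))))

# Hypothetical helper: turn 1, 2, 3, ... into "1st Year", "2nd Year", ...
ordinal_label <- function(n) {
  suffix <- ifelse((n %% 100) %in% 11:13, "th",
            ifelse(n %% 10 == 1, "st",
            ifelse(n %% 10 == 2, "nd",
            ifelse(n %% 10 == 3, "rd", "th"))))
  paste0(n, suffix, " Year")
}

dat$YearLabel <- ordinal_label(dat$Year)
print(dat)
```

Because the row with ISBN 9855 forms its own group, it is labelled as the first year of that ISBN, the same as in the split/lapply output.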

Source

2016-10-21 11:51:41 SimonG
