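R: recode the n observations before/after a value

I have a data frame of 0/1 dummy variables. Each dummy variable takes the value 1 exactly once. For each column, I would like to replace the n observations preceding and the n observations following the 1 with a given value (say, 1).

So for a single vector, with n = 1: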

c(0, 0, 1, 0, 0)

I would want

c(0, 1, 1, 1, 0)

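What would be a good general approach that works across n columns and allows a different number of preceding and following observations to be replaced (e.g. n before and n-1 after)?

Thanks for your help!

Source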

2015-11-19 Maarölli

Something like 'as.numeric(filter(x, rep(1, 3), circular = TRUE))'.
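The moving-sum idea in this comment can be generalised to a window of n observations on each side. The following is only a minimal sketch: the helper name `widen_filter` is hypothetical, and it pads the vector with zeros rather than using `circular = TRUE`, because the circular option wraps the window around the ends of the series (a 1 in the first position would also mark the last position).

```r
# Hypothetical helper: mark the n observations on each side of every 1.
widen_filter <- function(x, n = 1) {
  # Pad with n zeros on both ends so the centred window never leaves the vector
  padded <- c(rep(0, n), x, rep(0, n))
  # Centred moving sum over a window of 2n + 1 observations
  s <- stats::filter(padded, rep(1, 2 * n + 1), sides = 2)
  # Drop the padding and turn any positive sum back into 1
  as.numeric(s[(n + 1):(n + length(x))] > 0)
}

widen_filter(c(0, 0, 1, 0, 0), n = 1)
# [1] 0 1 1 1 0
```

Because the window is symmetric, this sketch does not cover the case of different counts before and after the 1.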

Another option:

f <- function(x, pre, post) { 
    idx <- which.max(x) 
    x[max(1, (idx-pre)):min(length(x), (idx+post))] <- 1 
    x 
}

Sample data:

df <- data.frame(x = c(0, 0, 1, 0, 0), y = c(0, 1, 0, 0, 0))

Apply it to every column:

df[] <- lapply(df, f, pre=2, post=1) 
#df 
# x y 
#1 1 1 
#2 1 1 
#3 1 1 
#4 1 0 
#5 0 0
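The question also asks for window sizes that differ from column to column. One way to sketch that, assuming the same function `f` as in this answer and two hypothetical vectors `pre` and `post` holding one value per column, is to walk over the columns and the window sizes in parallel with `Map`:

```r
# Same function as in the answer: widen the single 1 in x by pre/post positions
f <- function(x, pre, post) {
  idx <- which.max(x)
  x[max(1, idx - pre):min(length(x), idx + post)] <- 1
  x
}

df <- data.frame(x = c(0, 0, 1, 0, 0), y = c(0, 1, 0, 0, 0))

# Hypothetical per-column window sizes, one entry per column of df
pre  <- c(x = 1, y = 0)
post <- c(x = 0, y = 2)

# Map pairs each column with its own pre and post value
df[] <- Map(f, df, pre, post)
df
#   x y
# 1 0 0
# 2 1 1
# 3 1 1
# 4 0 1
# 5 0 0
```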

Source

2015-11-19 12:38:57

What you can do is:

vec <- c(0, 0, 1, 0, 0)

# For each position, sum the values in the window [i - 1, i + 1], clamped to the vector
sapply(seq_along(vec), function(i) {
    minval <- max(1, i - 1)
    maxval <- min(i + 1, length(vec))
    sum(vec[minval:maxval])
})
# [1] 0 1 1 1 0

Or wrap it in a function (same code, just a bit more compact):

f <- function(vec) {
    sapply(seq_along(vec), function(i)
        sum(vec[max(1, i - 1):min(i + 1, length(vec))]))
}

f(vec)
# [1] 0 1 1 1 0

SPEEDTEST

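To compare the different solutions, I quickly ran a benchmark with microbenchmark, and the winner is: clearly @Shenglin's code.... It is always nice to see simple solutions (and to see how complicated (my) solutions can be). One caveat on the numbers: the benchmark call passes 'fShenglin' without calling it on 'vect', so that row does not time the function itself (hence the 0 ns entries and the warning) and is not directly comparable with the other two.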

# Window-sum approach
fDavid <- function(vec) {
    sapply(seq_along(vec), function(i)
        sum(vec[max(1, i - 1):min(i + 1, length(vec))]))
}

# Index-offset approach
fHeroka <- function(vec) {
    res <- vec
    test <- which(vec == 1)

    # create indices to be replaced
    n <- 1 # variable n
    replace_indices <- c(test + (1:n), test - (1:n))
    # filter out indices outside the vector (may happen near the ends or with larger n)
    replace_indices <- replace_indices[replace_indices > 0 & replace_indices <= length(res)]
    # replace items in 'res' that need to be replaced with 1
    res[replace_indices] <- 1
    res
}

# Range-assignment approach; assumes exactly one 1 that is not at either end
fShenglin <- function(vec) {
    ind <- which(vec == 1)
    vec[(ind - 1):(ind + 1)] <- 1
    vec
}

vect <- sample(0:1, size = 1000, replace = TRUE)

library(microbenchmark)
# fShenglin is passed uncalled here, exactly as in the run reported below
microbenchmark(fHeroka(vect), fDavid(vect), fShenglin)
# Unit: nanoseconds
#           expr     min      lq       mean  median        uq     max neval cld
#  fHeroka(vect)   38929   42999   54422.57   49546   61755.5  145451   100  a
#   fDavid(vect) 2463805 2577935 2875024.99 2696844 2849548.5 5994596   100   b
#      fShenglin       0       0     138.63       1     355.0    1063   100  a
# Warning message:
# In microbenchmark(fHeroka(vect), fDavid(vect), fShenglin) :
# Could not measure a positive execution time for 30 evaluations.
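Before timing alternatives, it helps to confirm that they return the same result on an input that respects the question's assumption of exactly one 1 per vector. The following is only a sketch: `by_sum` and `by_index` are hypothetical stand-ins for the window-sum and index-based approaches in this thread, each with an added `n` argument.

```r
# Window-sum approach: a position becomes 1 if the single 1 lies within n of it
by_sum <- function(vec, n = 1) {
  sapply(seq_along(vec), function(i)
    sum(vec[max(1, i - n):min(i + n, length(vec))]))
}

# Index approach: locate the 1 and overwrite the clamped range around it
by_index <- function(vec, n = 1) {
  ind <- which(vec == 1)
  vec[max(1, ind - n):min(ind + n, length(vec))] <- 1
  vec
}

# Test input with exactly one 1, as the question assumes
vect <- numeric(1000)
vect[500] <- 1

# Stops with an error if the two approaches disagree
stopifnot(identical(by_sum(vect), by_index(vect)))
```

Once the results agree, each expression handed to `microbenchmark()` should be a complete call such as `by_index(vect)`, so that the function body is what gets timed.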

Source

2015-11-19 12:25:14 David

This could be a start:

myv <- c(0, 0, 1, 0, 0)

# make a copy
res <- myv

# check where the ones are
test <- which(myv == 1)

# create indices to be replaced
n <- 1 # variable n
replace_indices <- c(test + (1:n), test - (1:n))
# filter out indices outside the vector (may happen near the ends or with larger n)
replace_indices <- replace_indices[replace_indices > 0 & replace_indices <= length(res)]
# replace items in 'res' that need to be replaced with 1
res[replace_indices] <- 1
res
# [1] 0 1 1 1 0

Source

2015-11-19 12:25:23 Heroka

x <- c(0, 0, 1, 0, 0)
ind <- which(x == 1)
x[(ind - 1):(ind + 1)] <- 1

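Source

2015-11-19 12:32:49

A very simple and fast solution; however, a check is missing. Try the code on a vector such as 'x <- c(1, 0, 1, 0, 0, 1)': you need to make sure the range stays above 0 and does not exceed 'length(x)'. – David

For a single 1, you can do that with this line: 'x[max(1, ind - 1):min(ind + 1, length(x))] <- 1' – David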
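The one-line fix above clamps a single range, so it only suits a vector with one 1. For a vector with several 1s, such as the one in the comment, a minimal sketch could build the offsets for every 1 and discard those that fall outside the vector. The helper name `widen_all` and its `pre`/`post` arguments are hypothetical.

```r
# Hypothetical helper: set `pre` positions before and `post` positions after every 1
widen_all <- function(x, pre = 1, post = 1) {
  ind <- which(x == 1)
  # Add every offset from -pre to +post to every position that holds a 1
  idx <- as.vector(outer(ind, (-pre):post, "+"))
  # Keep only indices that fall inside the vector
  idx <- idx[idx >= 1 & idx <= length(x)]
  x[idx] <- 1
  x
}

widen_all(c(1, 0, 0, 0, 0, 0, 1))
# [1] 1 1 0 0 0 1 1
widen_all(c(1, 0, 1, 0, 0, 1))
# [1] 1 1 1 1 1 1
```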

This might be a solution:

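dat <- data.frame(x = c(0, 0, 1, 0, 0, 0), y = c(0, 0, 0, 1, 0, 0), z = c(0, 1, 0, 0, 0, 0))
# one row per column of dat: how many observations to replace before (prev) and after (foll) the 1
which_to_change <- data.frame(prev = c(2, 2, 1), foll = c(1, 1, 3))
for (i in seq_len(nrow(which_to_change))) {
    pos <- which(dat[, i] == 1)
    dat[(pos - which_to_change[i, 1]):(pos + which_to_change[i, 2]), i] <- 1
}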

Source

2015-11-19 12:40:00 Maju116
