2015-03-31 36 views
3

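I have a data frame (data) with 639 observations and 6 columns. Each cell holds a time in seconds. I have calculated thresholds for each column, and I want to write a function with a loop that flags the values that cross those thresholds.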

So far I have calculated the thresholds for each column. The thresholds for the 6 columns are:

threshold1 
[1] 16 22 31 6 11 13 

threshold2 
[1] 200.0 275.0 387.5 75.0 137.5 162.5 

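These thresholds are the minimum and maximum number of seconds for each column. What I want (for all columns) is this: in column 1, flag every cell with a value below 16 seconds and every cell with a value above 200 seconds.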

This is what I have done:

column1<-ifelse(data$column1<threshold1[1],"speeder",  
     ifelse(data$column1>threshold2[1], "slower",1)) 


column2<-ifelse(data$column2<threshold1[2],"speeder",  
     ifelse(data$column2>threshold2[2], "slower",1)) 

column3<-ifelse(data$column3<threshold1[3],"speeder",  
     ifelse(data$column3>threshold2[3], "slower",1)) 

and so on for all 6 columns.

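Now I would like to write this as a loop, so that I don't have to type out ifelse by hand each time, because I have different datasets with different numbers of columns.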

Answers

1

First, generate some data, called dat:

dat <- data.frame(
    column1 = runif(n = 638, min=0, max=220), 
    column2 = runif(n = 638, min=0, max=300), 
    column3 = runif(n = 638, min=0, max=400), 
    column4 = runif(n = 638, min=0, max=100), 
    column5 = runif(n = 638, min=0, max=150), 
    column6 = runif(n = 638, min=0, max=200)) 

# define thresholds  
threshold1 <- c(16, 22, 31, 6, 11, 13) 
threshold2 <- c(200.0, 275.0, 387.5, 75.0, 137.5, 162.5) 

Using a loop

# Declare a list that will contain the results 
results <- list() 

# Loop over the columns 
for(i in seq_len(ncol(dat))) { 
    results[[colnames(dat)[i]]] <- ifelse(dat[,i] < threshold1[i], 
              yes = "speeder", 
              no = ifelse(dat[,i] > threshold2[i], 
                 yes = "slower", no = 1)) 
} 
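Since the datasets have different numbers of columns, the loop above could be wrapped in a function. This is only a sketch under some assumptions: the name flag_columns and its arguments lower and upper are made up for illustration, and the tiny data frame is invented. It also uses the string "1" rather than the number 1 for unflagged values, because with no = 1 the result is numeric instead of character whenever a column happens to contain no flagged value at all.

```r
# Hypothetical helper (name and arguments are illustrative, not from the question):
# flag values outside per-column thresholds.
flag_columns <- function(dat, lower, upper) {
  # Assumes exactly one lower and one upper threshold per column
  stopifnot(length(lower) == ncol(dat), length(upper) == ncol(dat))

  results <- vector("list", ncol(dat))
  names(results) <- colnames(dat)

  for (i in seq_len(ncol(dat))) {
    # "1" (a string) keeps the result a character vector in every case
    results[[i]] <- ifelse(dat[[i]] < lower[i], "speeder",
                           ifelse(dat[[i]] > upper[i], "slower", "1"))
  }
  results
}

# Made-up example data with two columns
toy <- data.frame(a = c(5, 50, 500), b = c(30, 10, 300))
flagged <- flag_columns(toy, lower = c(16, 22), upper = c(200, 275))
print(flagged)
```

The same call should then work for a dataset with any number of columns, as long as the two threshold vectors have matching lengths.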

Using lapply

You can also use lapply instead of a loop, like this:

results <- lapply(1:ncol(dat), function(x) { 
    ifelse(dat[,x] < threshold1[x], 
      yes = "speeder", 
      no = ifelse(dat[,x] > threshold2[x], 
         yes = "slower", no = 1)) 
}) 

names(results) <- colnames(dat) 
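A possible variation on the lapply version is Map, which walks over the columns and both threshold vectors in parallel, so no index is needed and the column names are carried over automatically. This is a sketch on made-up data; dat_small, lower, upper and results_map are illustrative names, not part of the code above.

```r
# Made-up example data and thresholds (illustrative only)
dat_small <- data.frame(column1 = c(10, 100, 250),
                        column2 = c(20, 280, 150))
lower <- c(16, 22)
upper <- c(200, 275)

# Map() passes the i-th column together with the i-th lower and upper threshold
results_map <- Map(function(col, lo, hi) {
  ifelse(col < lo, "speeder",
         ifelse(col > hi, "slower", "1"))
}, dat_small, lower, upper)

print(results_map)
```

The result is a named list, so results_map$column1 can be used without a separate names() call.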

Results

You can access the results as results[[1]] through results[[6]], or as results$column1 through results$column6:

> head(results$column1, 100) 

    [1] "1"  "1"  "1"  "1"  "1"  "1"  "slower" 
    [8] "1"  "slower" "1"  "1"  "1"  "speeder" "1"  
[15] "1"  "1"  "1"  "1"  "1"  "1"  "1"  
[22] "slower" "1"  "1"  "1"  "1"  "1"  "1"  
[29] "1"  "1"  "1"  "slower" "1"  "slower" "slower" 
[36] "1"  "1"  "1"  "1"  "speeder" "1"  "1"  
[43] "1"  "1"  "speeder" "speeder" "1"  "1"  "slower" 
[50] "1"  "1"  "slower" "1"  "1"  "1"  "1"  
[57] "1"  "1"  "1"  "1"  "1"  "1"  "1"  
[64] "1"  "1"  "1"  "1"  "slower" "1"  "1"  
[71] "slower" "1"  "1"  "1"  "speeder" "1"  "1"  
[78] "1"  "1"  "1"  "1"  "slower" "1"  "1"  
[85] "1"  "1"  "1"  "1"  "1"  "1"  "1"  
[92] "1"  "1"  "1"  "1"  "1"  "1"  "1"  
[99] "speeder" "1" 
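If you only need to know how many values were flagged in each column, the list can be summarised with table. A small sketch, assuming a results list shaped like the one above; results_toy and levels_used are made-up names, and the values are invented so the example runs on its own.

```r
# Made-up stand-in for the results list above
results_toy <- list(
  column1 = c("1", "slower", "speeder", "1"),
  column2 = c("1", "1", "slower", "1")
)

# Fixing the factor levels gives every column the same three rows,
# even when a category does not occur in that column
levels_used <- c("speeder", "1", "slower")
counts <- sapply(results_toy, function(x) {
  table(factor(x, levels = levels_used))
})

print(counts)
```

Because every table has the same length, sapply returns a matrix with one row per category and one column per data column.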
+0

Works great. Thanks for your help and the quick reply. – Miha 2015-03-31 08:15:12

+0

You're welcome! Don't forget to upvote/accept (green check mark) the answer that helped you in the end! :) – 2015-03-31 08:18:29

0

You can try lapply as well.. it should be faster than a loop..

dat <- data.frame(
    column1 = runif(n = 638, min=0, max=220), 
    column2 = runif(n = 638, min=0, max=300), 
    column3 = runif(n = 638, min=0, max=400), 
    column4 = runif(n = 638, min=0, max=100), 
    column5 = runif(n = 638, min=0, max=150), 
    column6 = runif(n = 638, min=0, max=200)) 

# define thresholds  
threshold1 <- c(16, 22, 31, 6, 11, 13) 
threshold2 <- c(200.0, 275.0, 387.5, 75.0, 137.5, 162.5) 

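# Use ncol(dat) rather than a hard-coded 6 so this works for any number of columns
result = matrix(unlist(lapply(seq_len(ncol(dat)), function(i){ 
    ifelse(dat[,i] < threshold1[i], 
     yes = "speeder", 
     no = ifelse(dat[,i] > threshold2[i], 
        yes = "slower", no = 1)) 
})), ncol = ncol(dat), byrow = FALSE) 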

head(result) 
    [,1]  [,2] [,3] [,4]  [,5] [,6] 
[1,] "speeder" "1" "1" "slower" "1" "1" 
[2,] "1"  "1" "1" "1"  "1" "1" 
[3,] "1"  "1" "1" "1"  "1" "1" 
[4,] "1"  "1" "1" "slower" "1" "1" 
[5,] "1"  "1" "1" "1"  "1" "1" 
[6,] "1"  "1" "1" "slower" "1" "1" 
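As a side note, sapply can build this kind of matrix directly, without the matrix(unlist(...)) step, because it simplifies equal-length results into a matrix. A sketch on made-up data; dat_small, lower, upper and result_mat are illustrative names, and it assumes the data frame has more than one row (with a single row sapply returns a plain vector).

```r
# Made-up example data and thresholds (illustrative only)
dat_small <- data.frame(column1 = c(10, 100, 250),
                        column2 = c(20, 280, 150))
lower <- c(16, 22)
upper <- c(200, 275)

# Each call returns a character vector of length nrow(dat_small);
# sapply() binds them column-wise into a character matrix
result_mat <- sapply(seq_len(ncol(dat_small)), function(i) {
  ifelse(dat_small[[i]] < lower[i], "speeder",
         ifelse(dat_small[[i]] > upper[i], "slower", "1"))
})

# Carry over the column names for readability
colnames(result_mat) <- colnames(dat_small)
print(result_mat)
```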