I am trying to write a loop that counts multiple patterns in each row of a data frame and reports the number of occurrences in a new data frame.
Here is my input:
input <- data.frame(V1 = LETTERS[1:4],
V2 = c("ABCDEF", "AAABBBCCA", "CCAABBCC", "ACCCCCCA"),
stringsAsFactors = FALSE)
The list of patterns I want to search for:
list<-c("ABC", "AA", "CC", "CCCC", "A")
And the expected output:
structure(list(V1 = structure(1:4, .Label = c("A", "B", "C",
"D"), class = "factor"), V2 = structure(c(2L, 1L, 4L, 3L), .Label = c("AAABBBCCA",
"ABCDEF", "ACCCCCCA", "CCAABBCC"), class = "factor"), ABC = c(1L, 0L, 0L, 0L), AA = c(0L, 1L, 1L, 0L), CC = 0:3, CCCC = c(0L, 0L, 0L, 1L), A = c(1L, 4L, 2L, 1L), ABC_length = c(1L, 0L, 0L, 0L), AA_length = c(0L, 1L, 1L, 0L), CC_length = structure(1:4, .Label = c("0", "1", "1,1", "2"), class = "factor"), CCCC_length = c(0L, 0L, 0L, 1L), A_length = structure(c(1L, 4L, 3L, 2L), .Label = c("1", "1,1", "2", "3,1"), class = "factor")), .Names = c("V1", "V2", "ABC", "AA", "CC", "CCCC", "A", "ABC_length", "AA_length", "CC_length", "CCCC_length", "A_length"), class = "data.frame", row.names = c(NA, -4L))
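One solution could use str_count or str_locate_all, as in the example below for a single pattern. What I actually want is to search with the whole list of patterns given above.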
library(stringr)
input$ABC <- str_count(input$ABC, "ABC")
input$ABC_length <- lapply(str_locate_all(input$ABC_length, "ABC"), function(x) {
paste(x[, 2] - x[, 1] + 1, collapse = ",")
})
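Just to be clear, my example includes a solution for finding a single pattern, "ABC", but the question is about searching for multiple patterns. – user2904120
Your example does not actually find the "ABC" pattern, because it searches the columns you are trying to create (input$ABC, input$ABC_length) instead of the source column. – lebelinoz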
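Following up on the comment above, here is a minimal sketch of how the single-pattern example might be extended to the whole pattern list. It assumes the strings to search are in input$V2 and that the patterns are meant as literal strings, hence fixed(). The names patterns and result are made up for this illustration (patterns replaces list so that base::list is not masked). Note that str_count counts non-overlapping matches, so this is not guaranteed to reproduce every value in the expected output shown in the question, whose counting rule is not fully spelled out.

```r
library(stringr)

input <- data.frame(V1 = LETTERS[1:4],
                    V2 = c("ABCDEF", "AAABBBCCA", "CCAABBCC", "ACCCCCCA"),
                    stringsAsFactors = FALSE)

# Hypothetical name; the question calls this vector `list`
patterns <- c("ABC", "AA", "CC", "CCCC", "A")

result <- input

# First pass: one count column per pattern, always searching the source column V2
for (p in patterns) {
  result[[p]] <- str_count(input$V2, fixed(p))
}

# Second pass: one "<pattern>_length" column per pattern, holding the width of
# each match as a comma-separated string (empty string when there is no match)
for (p in patterns) {
  result[[paste0(p, "_length")]] <- vapply(
    str_locate_all(input$V2, fixed(p)),
    function(m) paste(m[, "end"] - m[, "start"] + 1, collapse = ","),
    character(1)
  )
}

print(result)
```

vapply is used instead of lapply so that each new column is a plain character vector rather than a list column. Two separate loops are used only to keep the count columns ahead of the length columns, as in the expected output.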