How to split a data frame into a list of data frames based on column names in R?

Suppose I have the following data frame:

df <- data.frame(BR.a=rnorm(10), BR.b=rnorm(10), BR.c=rnorm(10), 
USA.a=rnorm(10), USA.b = rnorm(10), FRA.a=rnorm(10), FRA.b=rnorm(10))

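I want to create a list of data frames, split by the first part of the column names: the columns starting with "BR" would be one element of the list, the columns starting with "USA" would be another, and so on.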

I can get the column names and split them with strsplit. However, I don't know the best way to iterate over the result and separate the data frame.

strsplit(names(df), "\\.")

This gives me a list with one top-level element per column name; each element holds the pieces of that name split on ".".

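How can I iterate over this list to get the index numbers of the columns that start with the same substring, and group those columns as elements of another list?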

2012-02-14 João Daniel

Dason beat me to it, but here is a different flavor of the same conceptual approach:

library(plyr) 

# Use regex to get the prefixes 
# Pulls any letters, digits or underscores ("\\w*") from the beginning of the string ("^") 
# to the first period ("\\.") into a group, then matches all the remaining 
# characters (".*"). Then replaces with the first group ("\\1" = "(\\w*)"). 
# In other words, it matches the whole string but replaces with only the prefix. 

prefixes <- unique(gsub(pattern = "^(\\w*)\\..*", 
         replace = "\\1", 
         x = names(df))) 

# Subset to the variables that match the prefix 
# Iterates over the prefixes and subsets based on the variable names that 
# match that prefix 
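llply(prefixes, .fun = function(x){ 
    # Match the prefix plus the following period, so that "BR" cannot 
    # also pick up a column such as "BRA.a" 
    subset(df, select = names(df)[grep(names(df), 
              pattern = paste("^", x, "\\.", sep = ""))]) 
})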
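
For readers who would rather not load plyr, the same idea can be sketched in base R. This is only an illustration under stated assumptions: it rebuilds the `df` from the question, `groups` is a name introduced here, and naming the list elements by prefix is an addition that the answer above does not do.

```r
# Rebuild the example data frame from the question
set.seed(1)  # only so the sketch is reproducible
df <- data.frame(BR.a = rnorm(10), BR.b = rnorm(10), BR.c = rnorm(10),
                 USA.a = rnorm(10), USA.b = rnorm(10),
                 FRA.a = rnorm(10), FRA.b = rnorm(10))

# Prefix = everything before the first period
prefixes <- unique(sub("^(\\w*)\\..*", "\\1", names(df)))

# One data frame per prefix; drop = FALSE keeps a single-column
# group as a data frame instead of collapsing it to a vector
groups <- lapply(prefixes, function(p) {
  df[, grepl(paste0("^", p, "\\."), names(df)), drop = FALSE]
})
names(groups) <- prefixes  # hypothetical convenience: look up groups by prefix

names(groups)      # "BR" "USA" "FRA"
names(groups$USA)  # "USA.a" "USA.b"
```

With named elements, a group can be pulled out as `groups$BR` or `groups[["FRA"]]` rather than by position.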

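I believe these regular expressions should still give you the correct result even if there are further periods later in the variable name: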

unique(gsub(pattern = "^(\\w*)\\..*", 
      replace = "\\1", 
      x = c(names(df), "FRA.c.blahblah")))

Or if a prefix appears later in a variable name:

# Add a USA variable with "FRA" in it 
df2 <- data.frame(df, USA.FRANKLINS = rnorm(10)) 

prefixes2 <- unique(gsub(pattern = "^(\\w*)\\..*", 
         replace = "\\1", 
         x = names(df2))) 

llply(prefixes2, .fun = function(x){ 
    # Anchored at the start and followed by a period, so "FRA" does not 
    # match "USA.FRANKLINS" 
    subset(df2, select = names(df2)[grep(names(df2), 
              pattern = paste("^", x, "\\.", sep = ""))]) 
})
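
What protects the `USA.FRANKLINS` case is the `^` anchor in the pattern. A small check of that, using a cut-down `df2` and the names `unanchored` and `anchored`, all of which are assumptions made for illustration:

```r
# A reduced version of df2: one column per prefix plus the tricky one
set.seed(1)  # only so the sketch is reproducible
df2 <- data.frame(BR.a = rnorm(10), USA.a = rnorm(10), FRA.a = rnorm(10),
                  USA.FRANKLINS = rnorm(10))

# Without the anchor, "FRA" is found anywhere in the name
unanchored <- grep("FRA", names(df2), value = TRUE)
unanchored  # "FRA.a" "USA.FRANKLINS"

# With "^", only names that begin with "FRA" match
anchored <- grep("^FRA", names(df2), value = TRUE)
anchored    # "FRA.a"
```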

2012-02-14 17:13:39

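This will only work if the column names always have the form you show (split on "."), and you want to group by the identifier that comes before the first ".".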

df <- data.frame(BR.a=rnorm(10), BR.b=rnorm(10), BR.c=rnorm(10), 
USA.a=rnorm(10), USA.b = rnorm(10), FRA.a=rnorm(10), FRA.b=rnorm(10)) 

## Grab the component of the names we want 
nm <- do.call(rbind, strsplit(colnames(df), "\\."))[,1] 
## Create list with custom function using lapply 
## drop = FALSE keeps a group with a single column as a data frame 
datlist <- lapply(unique(nm), function(x){df[, nm == x, drop = FALSE]})
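
A more compact variant, offered as a sketch and not as part of the answer above, swaps the `lapply` loop for base R's `split.default`, which splits a data frame by columns when given one grouping value per column. The use of `sub` to take the prefix and the `factor(..., levels = unique(nm))` call, which keeps the groups in their original order, are choices made here for illustration.

```r
# Rebuild the example data frame from the question
set.seed(1)  # only so the sketch is reproducible
df <- data.frame(BR.a = rnorm(10), BR.b = rnorm(10), BR.c = rnorm(10),
                 USA.a = rnorm(10), USA.b = rnorm(10),
                 FRA.a = rnorm(10), FRA.b = rnorm(10))

# Prefix of each column: drop the first period and everything after it
nm <- sub("\\..*$", "", colnames(df))

# split.default treats the data frame as a list of columns and returns
# one data frame per level, named by the prefix
datlist <- split.default(df, factor(nm, levels = unique(nm)))

names(datlist)      # "BR" "USA" "FRA"
names(datlist$FRA)  # "FRA.a" "FRA.b"
```

Calling plain `split(df, ...)` would dispatch to the data frame method and split by rows, which is why `split.default` is named explicitly.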

2012-02-14 17:12:00 Dason
