Converting variables stored in a list into a list of character vectors in R

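I have a subset of data from a very large dataset. I split this subset into a list of data frames so that each case/id is a separate element of the list. Each element is named with its case/id. I then removed every variable from each data frame element except one, called "state". It is currently a factor with 7 levels.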

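I am trying to turn this list of "state" elements into a list of character vectors. The element below is the first element of the list, and it includes the row numbers (which come from the larger original dataset).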

[[1]] 
     state 
104246 active 
104247 rest 
104248 active 
104249 active 
. 
. 
. 
104315 active 
104316 active 
104317 rest 
104318 rest

I am trying to turn this into a simple character vector that should look like this:

[1] "active" "rest" "active" "active" ........... "active" "active" "rest" "rest"

This seems simple enough. I have tried doing something like this (where temp is the name of the list):

as.vector(as.matrix(temp))

This returns the following:

  [,1] 
    id1 List,1 
    id2 List,1 
    id3 List,1 
    id4 List,1

When I look at each element of this result, they are essentially still lists.

Alternatively, I tried converting directly to character:

as.vector(as.character(temp))

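But this comes back in an undesirable format (although I suppose I could hack it by converting the factor level numbers back into words). Note that in the large dataset the factor "state" has 7 levels: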

[1] "list(state = c(1, 4, 1, 1, 1, 1, 1, 4, 4, 4, 1, 1, 1, 1, 1, 1, 1, 1, 1, 4, 4, 1, 6, 1, 4, 4, 1, 1, 1, 4,  1, 1, 1, 6, 4, 1, 1, 1, 1, 1, 4, 4, 1, 4, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 4, 4, 4, 4, 1, 1, 1, 1, 4, 4, 1, 1, 1, 1,  1, 1, 1, 4, 4))"

I also tried converting the variable "state", which is a factor, to a character variable beforehand, but that didn't help.

Here is a reproducible example of the data. In this example the list temp contains only two elements:

temp<-list(structure(list(state = structure(c(1L, 4L, 1L, 1L, 1L, 1L, 
              1L, 4L, 4L, 4L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 4L, 4L, 1L, 
              6L, 1L, 4L, 4L, 1L, 1L, 1L, 4L, 1L, 1L, 1L, 6L, 4L, 1L, 1L, 1L, 
              1L, 1L, 4L, 4L, 1L, 4L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
              4L, 4L, 4L, 4L, 1L, 1L, 1L, 1L, 4L, 4L, 1L, 1L, 1L, 1L, 1L, 1L, 
              1L, 4L, 4L), .Label = c("active", "active2", "active3", "rest", "rest2", 
                    "stop", "stop2"), class = "factor")), .Names = "state", row.names = 104246:104318, class = "data.frame"), 
     structure(list(state = structure(c(1L, 4L, 4L, 4L, 1L, 1L, 
              1L, 4L, 4L, 4L, 4L, 1L, 4L, 4L, 4L, 1L, 1L, 6L, 4L, 1L, 4L, 
              4L, 4L, 1L, 4L, 1L, 1L, 1L), .Label = c("active", "active2", 
                        "active3", "rest", "rest2", "stop", "stop2"), class = "factor")), .Names = "state", row.names = 950:977, class = "data.frame")) 



str(temp)

Source

2014-07-16 jalapic

L = lapply(temp, function(x) as.character(unlist(x)))

Then just use L[[1]] or L[[2]] for the vectors.
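As a rough illustration of the line above, here is a minimal sketch on a small, made-up list. The name demo, the ids id1 and id2, and the values are assumptions for the example and are not taken from the question's data. It suggests that the list names (the case ids) carry over to the result.

```r
# Hypothetical two-element list standing in for `temp`; ids and values are made up
demo <- list(
  id1 = data.frame(state = factor(c("active", "rest", "active"),
                                  levels = c("active", "rest", "stop"))),
  id2 = data.frame(state = factor(c("rest", "stop"),
                                  levels = c("active", "rest", "stop")))
)

# unlist() on a one-column data frame returns the factor;
# as.character() then maps the integer codes to their labels
L <- lapply(demo, function(x) as.character(unlist(x)))

L[["id1"]]
# [1] "active" "rest"   "active"
names(L)
# [1] "id1" "id2"
```

Because lapply keeps the names of its input, each vector can be looked up by case id as well as by position.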

Source

2014-07-16 03:49:42 Vlo

Try this code:

as.vector(unlist(temp[[1]]))
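The call above converts only the first element. The sketch below, on a made-up one-column data frame (df and its values are assumptions for illustration), shows the intermediate step and one possible way to extend the same idea to every element with lapply.

```r
# Hypothetical one-column data frame; the values are made up for illustration
df <- data.frame(state = factor(c("active", "rest", "active"),
                                levels = c("active", "rest", "stop")))

u <- unlist(df)      # still a factor, with names state1, state2, state3
is.factor(u)         # TRUE

v <- as.vector(u)    # character labels; names and levels are dropped
v
# [1] "active" "rest"   "active"

# The same idea applied to every element of a list (here a list holding df twice)
all_v <- lapply(list(df, df), function(d) as.vector(unlist(d)))
```

For the question's data, the last line would presumably be written with temp in place of list(df, df).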

Source

2014-07-16 02:50:50

This might be a good opportunity to make use of rapply:

x <- rapply(temp, as.character, how = "replace") 
str(x) 
# List of 2 
# $ :List of 1 
# ..$ state: chr [1:73] "active" "rest" "active" "active" ... 
# $ :List of 1 
# ..$ state: chr [1:28] "active" "rest" "rest" "rest" ...

If you want to flatten further, you can then use unlist(..., recursive = FALSE).

str(unlist(rapply(temp, as.character, how = "replace"), recursive=FALSE)) 
# List of 2 
# $ state: chr [1:73] "active" "rest" "active" "active" ... 
# $ state: chr [1:28] "active" "rest" "rest" "rest" ...
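The flattened result is named by column ("state") rather than by case id. If the ids matter, one option is to drop the names while flattening and then reattach the outer list's names. This is only a sketch: demo, id1 and id2 are made-up names, and it assumes every data frame has exactly one column, so that the flattened list lines up one-to-one with the original.

```r
# Hypothetical named list standing in for `temp`; ids and values are made up
demo <- list(
  id1 = data.frame(state = factor(c("active", "rest"),
                                  levels = c("active", "rest", "stop"))),
  id2 = data.frame(state = factor(c("stop", "rest", "rest"),
                                  levels = c("active", "rest", "stop")))
)

# Convert every factor leaf to character, then flatten one level without names
flat <- unlist(rapply(demo, as.character, how = "replace"),
               recursive = FALSE, use.names = FALSE)

# Reattach the case ids; valid only because each data frame has a single column
names(flat) <- names(demo)

str(flat)
```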

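This second approach gives you the same result as @Vlo's approach, but it should be more efficient because it calls unlist just once. To see how much of a difference it can make, here is a benchmark on a larger list: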

x <- replicate(1000, temp) ## A larger list 

## Vlo's approach 
fun1 <- function() { 
    lapply(x, function(y) as.character(unlist(y, use.names = FALSE))) 
} 

## My approach 
fun2 <- function() { 
    unlist(rapply(x, as.character, how = "replace"), 
     recursive=FALSE, use.names=FALSE) 
} 

## Benchmarking 
library(microbenchmark) 
microbenchmark(fun1(), fun2(), times = 50) 
# Unit: milliseconds
#    expr       min        lq    median        uq       max neval
#  fun1() 435.84992 475.17146 497.63325 533.68488 1570.6814    50
#  fun2()  50.90449  55.79023  63.85908  70.78956  111.0357    50

## Comparison of results 
all.equal(fun1(), fun2(), check.attributes=FALSE) 
# [1] TRUE

Source

2014-07-16 04:23:38 A5C1D2H2I1M1N2O1R2T1
