How to create a hashed data frame in R
Given the following data (myinput.txt):
A q,y,h
B y,f,g
C n,r,q
### more rows
How can I convert it into a data structure like this in R?
$A
[1] "q" "y" "h"
$B
[1] "y" "f" "g"
$C
[1] "n" "r" "q"
I assume this is your data:
dat <- read.table(text="q,y,h
y,f,g
n,r,q", header=FALSE, sep=",", row.names=c("A", "B", "C"))
If you want an automatic approach:
as.list(as.data.frame((t(dat)), stringsAsFactors=FALSE))
## $A
## [1] "q" "y" "h"
##
## $B
## [1] "y" "f" "g"
##
## $C
## [1] "n" "r" "q"
A couple of other approaches that also work:
lapply(apply(dat, 1, list), "[[", 1)
unlist(apply(dat, 1, list), recursive=FALSE)
Using a bit of readLines, strsplit, and a regular expression to split the name off the start of each line:
dat <- readLines(textConnection("A q,y,h
B y,f,g
C n,r,q"))
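# Split on the whitespace after the name or on commas, then drop the name (first element)
result <- lapply(strsplit(dat, "\\s+|,"), function(x) x[-1])
# The name is the run of non-whitespace characters at the start of each line
names(result) <- gsub("^(\\S+)\\s+.+$", "\\1", dat)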
> result
$A
[1] "q" "y" "h"
$B
[1] "y" "f" "g"
$C
[1] "n" "r" "q"
Or with less regex and more steps:
result <- strsplit(dat, "\\s+|,")
names(result) <- sapply(result, "[", 1)  # first element of each split is the name
result <- lapply(result, function(x) x[-1])  # keep everything after the name
> result
$A
[1] "q" "y" "h"
$B
[1] "y" "f" "g"
$C
[1] "n" "r" "q"
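The readLines/strsplit idea above could be wrapped in a small helper so it can be reused on a file. This is only a sketch: parse_keyed_lines is a hypothetical name (not a base R function), and it assumes each line is a name, a run of whitespace, then comma-separated values with no spaces inside them.

```r
# parse_keyed_lines() is a hypothetical helper, not part of base R.
# Assumes each line looks like: <name><whitespace><v1>,<v2>,...
parse_keyed_lines <- function(lines) {
  lines <- trimws(lines)                      # strip leading/trailing whitespace
  lines <- lines[nzchar(lines)]               # drop blank lines
  keys  <- sub("\\s+.*$", "", lines)          # text before the first whitespace run
  vals  <- sub("^\\S+\\s+", "", lines)        # text after the first whitespace run
  setNames(strsplit(vals, ",", fixed = TRUE), keys)
}

# For a file on disk this would be parse_keyed_lines(readLines("myinput.txt"))
result <- parse_keyed_lines(c("A q,y,h", "B y,f,g", "C n,r,q"))
result[["B"]]  # look up one entry by name
```

Because the values are split separately from the name, rows do not all need the same number of values.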
@Sebastian-C: Thanks a lot. Is there a way to make 'dat' recognize the row.names automatically, i.e. without assigning them? – neversaint 2013-02-15 04:41:09
@neversaint I only did that to recreate your data. I should have used 'row.names = 1', so an example would be: 'read.csv("dat.csv", row.names = 1)'. You may also want to add 'colClasses = "character"' or 'stringsAsFactors = FALSE' to 'read.table'. – 2013-02-15 04:45:56
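A minimal sketch of the 'row.names = 1' suggestion, for illustration only. It assumes a hypothetical comma-separated version of the data in which the name is the first field (the original myinput.txt separates the name with whitespace, so it would need the readLines approach instead); csv_text is a made-up stand-in for a file such as dat.csv.

```r
# Hypothetical CSV content: the name is the first comma-separated field
csv_text <- "A,q,y,h
B,y,f,g
C,n,r,q"

# row.names = 1 takes the first column as row names;
# colClasses = "character" keeps values such as "T" or "F" as strings
dat <- read.csv(text = csv_text, header = FALSE, row.names = 1,
                colClasses = "character")

# Transpose so each original row becomes a column, then turn columns into list entries
result <- as.list(as.data.frame(t(dat), stringsAsFactors = FALSE))
result$A
```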