2011-10-14 49 views
6

Merging two list components

I have a large list, but a small example looks like this:

A <- c("A", "a", "A", "a", "A") 
B <- c("A", "A", "a", "a", "a") 
C <- c(1, 2, 3, 1, 4) 
mylist <- list(A=A, B=B, C= C) 

The expected output pastes A and B together element by element, so each component looks like AB:

AA, aA, Aa, aa, Aa 

Better still, order each pair so that the uppercase letter always comes first:

AA, Aa, Aa, aa, Aa 

So the new list or matrix should have two columns or rows:

AA, Aa, Aa, aa, Aa 
1, 2, 3, 1, 4 

Now I want to compute the mean of C for each class: "AA", "Aa", and "aa".

It looks simple, but I could not figure it out easily.

Answers

2
> (ab <- paste(A, B, sep="")) # the joining step 
[1] "AA" "aA" "Aa" "aa" "Aa" 
> (ab <- sub("([a-z])([A-Z])", "\\2\\1", ab)) # swap lowercase uppercase 
[1] "AA" "Aa" "Aa" "aa" "Aa" 

> rbind(ab, C)     # matrix 
    [,1] [,2] [,3] [,4] [,5] 
ab "AA" "Aa" "Aa" "aa" "Aa" 
C "1" "2" "3" "1" "4" 
> data.frame(alleles=ab, count=C) # dataframes are lists 
    alleles count 
1  AA  1 
2  Aa  2 
3  Aa  3 
4  aa  1 
5  Aa  4 
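
The question also asks for the mean of C within each class, which the steps above stop short of. A minimal sketch in base R, assuming the normalised labels from the `sub()` step are used as the grouping vector (the name `class_means` is only illustrative):

```r
A <- c("A", "a", "A", "a", "A")
B <- c("A", "A", "a", "a", "a")
C <- c(1, 2, 3, 1, 4)

# Join the pairs and move the uppercase letter to the front ("aA" -> "Aa")
ab <- sub("([a-z])([A-Z])", "\\2\\1", paste(A, B, sep = ""))

# Mean of C within each class; tapply() returns a vector named by class
class_means <- tapply(C, ab, mean)

# Index by name to get the classes in the order the question lists them
class_means[c("AA", "Aa", "aa")]
```

With the example data this gives 1 for "AA", 3 for "Aa" (the mean of 2, 3 and 4) and 1 for "aa".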
2

Here is how I would do it with the plyr package, with your data in a data frame:

> A <- c("A", "a", "A", "a", "A") 
> B <- c("A", "A", "a", "a", "a") 
> C <- c(1, 2, 3, 1, 4) 
> (groups <- paste(A, B, sep=""))  # no sort() here: groups must stay aligned with the rows
[1] "AA" "aA" "Aa" "aa" "Aa" 
> my.df <- data.frame(A=A, B=B, C=C, group=groups) 

> require(plyr) 
> result <- ddply(my.df, "group", transform, group.means=mean(C)) 
> result[order(result$group, decreasing=TRUE),] 
    A B C group group.means 
5 A A 1 AA   1.0 
3 A a 3 Aa   3.5 
4 A a 4 Aa   3.5 
2 a A 2 aA   2.0 
1 a a 1 aa   1.0 
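
Note that the result above keeps "aA" and "Aa" as separate groups, whereas the question treats them as one class. A possible variant, sketched here in base R so it runs without plyr, normalises the labels first and then attaches the group mean to every row. Using `ave()` in place of `ddply(..., transform, ...)` is a substitution made for this sketch, not part of the original answer:

```r
A <- c("A", "a", "A", "a", "A")
B <- c("A", "A", "a", "a", "a")
C <- c(1, 2, 3, 1, 4)

# Normalise the labels so that "aA" is counted together with "Aa"
group <- sub("([a-z])([A-Z])", "\\2\\1", paste(A, B, sep = ""))
my.df <- data.frame(A = A, B = B, C = C, group = group,
                    stringsAsFactors = FALSE)

# ave() computes the mean per group and repeats it on each row of that group
my.df$group.means <- ave(my.df$C, my.df$group, FUN = mean)
my.df
```

Every "Aa" row then carries a group mean of 3 rather than 3.5 and 2.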
1

With your data arranged in a data.frame:

A <- c("A", "a", "A", "a", "A") 
B <- c("A", "A", "a", "a", "a") 
C <- c(1, 2, 3, 1, 4) 

I define a data.frame that uses the combination of A and B as the key column:

AB <- paste(A, B, sep='') 
df <- data.frame(id=AB, C=C) 

> df 
    id C 
1 AA 1 
2 aA 2 
3 Aa 3 
4 aa 1 
5 Aa 4 

If you need to order this data.frame before aggregating, then:

df <- df[order(AB, decreasing=TRUE),] 

> df 
    id C 
1 AA 1 
3 Aa 3 
5 Aa 4 
2 aA 2 
4 aa 1 

Then, with aggregate, you compute the mean for each id:

meanDF <- aggregate(C~id, data=df, mean) 

> meanDF 

    id C 
1 aa 1.0 
2 aA 2.0 
3 Aa 3.5 
4 AA 1.0 

But if you want to order after aggregating, then:

df <- data.frame(id=AB, C=C) 
meanDF <- aggregate(C~id, data=df, mean) 
meanDF <- meanDF[order(meanDF$id, decreasing=TRUE),]
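
The row order that `order()` produces for character data depends on the session's locale, so the listings above may come out differently on another machine. A minimal sketch of one way to pin the order down, assuming the three classes named in the question are the only ones present, is to make `id` a factor with explicit levels (and to merge "aA" into "Aa" first):

```r
A <- c("A", "a", "A", "a", "A")
B <- c("A", "A", "a", "a", "a")
C <- c(1, 2, 3, 1, 4)

# Put the uppercase letter first so "aA" and "Aa" share one id
AB <- sub("([a-z])([A-Z])", "\\2\\1", paste(A, B, sep = ""))

# Explicit levels fix the order regardless of locale (assumed class set)
df <- data.frame(id = factor(AB, levels = c("AA", "Aa", "aa")), C = C)

# aggregate() returns one row per id, in factor-level order
meanDF <- aggregate(C ~ id, data = df, FUN = mean)
meanDF
```

Because the grouping column is a factor, no separate `order()` call is needed after aggregating.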