2013-09-01 80 views
2

Apply a function to every pair of columns in R

I have a dataset that looks like this:

> data
             a        b         c         d
[1,] 0.5943590 2.195610 0.5332164 1.3004142
[2,] 0.7635876 1.917823 0.9714945 1.3251010
[3,] 0.9942722 2.350122 1.2048159 1.1675700
[4,] 0.3736785 1.876318 0.9109197 0.8520509

I then want to apply a function to every pair of columns, for example:

F2 <- function(x, y) sum((x - y)^2)  # sum of squared differences
# data prints as a matrix, so columns are selected by name with [, ] (the $ operator only works on data frames)
F2(data[, "a"], data[, "b"])  # first and second columns
F2(data[, "a"], data[, "c"])  # first and third columns
F2(data[, "b"], data[, "c"])  # second and third columns
.................. 

How can I do this with the apply family of functions? Any help is greatly appreciated.

Answers

5

This is a job for combn:

#some data 
set.seed(42) 
m <- matrix(rnorm(16),4) 

F2<- function(x,y) (sum((x - y)^2)) 

res <- matrix(NA, ncol(m), ncol(m)) 

res[lower.tri(res)] <- combn(ncol(m), 2, 
          FUN=function(ind) F2(m[,ind[1]], m[,ind[2]])) 

print(res) 

#          [,1]     [,2]     [,3] [,4]
# [1,]       NA       NA       NA   NA
# [2,] 2.992875       NA       NA   NA
# [3,] 4.293073 8.320698       NA   NA
# [4,] 7.944818 6.484424 16.44946   NA

#for nicer printing 
as.dist(res) 

#          1        2         3
# 2 2.992875
# 3 4.293073 8.320698
# 4 7.944818 6.484424 16.449463
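
Since the question asks specifically about the apply family, here is a minimal sketch using nested sapply calls on the same m and F2 as above. The names idx and full are only illustrative. Unlike the combn version, it evaluates every pair twice and also the diagonal, so it is likely to be slower on wide matrices.

```r
set.seed(42)
m <- matrix(rnorm(16), 4)

F2 <- function(x, y) sum((x - y)^2)

# Column indices to pair up (illustrative helper name)
idx <- seq_len(ncol(m))

# Outer sapply walks over columns j, inner sapply over columns i,
# giving a full ncol(m) x ncol(m) matrix of F2 values
full <- sapply(idx, function(j) sapply(idx, function(i) F2(m[, i], m[, j])))

# The matrix is symmetric with a zero diagonal;
# as.dist() keeps only the lower triangle for printing
as.dist(full)
```

If only the distinct pairs are needed, the combn approach above avoids the redundant evaluations.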

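Of course, for this particular function you are better off using dist, which is optimized for this kind of distance calculation. dist computes Euclidean distances between rows, so the matrix is transposed to compare columns and the result is squared to get the sum of squared differences: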

dist(t(m))^2 

#          1        2         3
# 2 2.992875
# 3 4.293073 8.320698
# 4 7.944818 6.484424 16.449463
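
As a rough illustration of how this carries over to the data from the question, the sketch below rebuilds that matrix by hand (typed in from the printed values, so treat it as an assumption about the real object), runs combn over the column names, and checks the result against dist. The names pairs and vals are only illustrative.

```r
# The question's data, re-entered column by column from the printed output
data <- matrix(c(0.5943590, 0.7635876, 0.9942722, 0.3736785,
                 2.195610,  1.917823,  2.350122,  1.876318,
                 0.5332164, 0.9714945, 1.2048159, 0.9109197,
                 1.3004142, 1.3251010, 1.1675700, 0.8520509),
               ncol = 4, dimnames = list(NULL, c("a", "b", "c", "d")))

F2 <- function(x, y) sum((x - y)^2)

# All pairs of column names, one pair per column of the result
pairs <- combn(colnames(data), 2)

# F2 for each pair, selecting columns by name
vals <- combn(colnames(data), 2,
              FUN = function(nm) F2(data[, nm[1]], data[, nm[2]]))
names(vals) <- paste(pairs[1, ], pairs[2, ], sep = "-")

# combn and dist enumerate the pairs in the same order,
# so the two results can be compared element by element
stopifnot(isTRUE(all.equal(unname(vals), as.vector(dist(t(data))^2))))

vals
```

Labelling each value with its column pair makes the output easier to read than a bare vector when there are many columns.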