2013-06-20 61 views

Randomize or permute values in a data.frame

I have a data.frame that looks like this (my real data frame is much bigger):

df <- data.frame(A=c("a","b","c","d","e","f","g","h","i"), 
       B=c("1","1","1","2","2","2","3","3","3"), 
       C=c(0.1,0.2,0.4,0.1,0.5,0.7,0.1,0.2,0.5)) 

> df 
    A B C 
1 a 1 0.1 
2 b 1 0.2 
3 c 1 0.4 
4 d 2 0.1 
5 e 2 0.5 
6 f 2 0.7 
7 g 3 0.1 
8 h 3 0.2 
9 i 3 0.5 

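I would like to add n columns (something like permutations), where column D holds random values from df$C, but each value should be picked only from the rows that share the same value of df$B. An example of the desired output would be: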

df <- data.frame(A=c("a","b","c","d","e","f","g","h","i"), 
       B=c("1","1","1","2","2","2","3","3","3"), 
       C=c(0.1,0.2,0.4,0.1,0.5,0.7,0.1,0.2,0.5), 
       D=c(0.2,0.2,0.1,0.5,0.7,0.1,0.5,0.5,0.2)) 

> df 
    A B C D 
1 a 1 0.1 0.2 
2 b 1 0.2 0.2 
3 c 1 0.4 0.1 
4 d 2 0.1 0.5 
5 e 2 0.5 0.7 
6 f 2 0.7 0.1 
7 g 3 0.1 0.5 
8 h 3 0.2 0.5 
9 i 3 0.5 0.2 

I tried the plyr package, but my approach does not work:

ddply(df, levels(.(B)), transform, D=sample(C)) 

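I also thought about splitting the data frame on df$B and then using lapply with a function that adds the column to each piece; however, I don't know how to select by the levels of df$B.

Thanks a lot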

Answers

No need for plyr; 'ave' will do the trick.

transform(df, D=ave(C, B, FUN=function(b) sample(b, replace=TRUE))) 
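A quick way to convince yourself that the values stay inside their B group is sketched below. The seed and the check are additions for illustration and were not part of the original answer. Note that `replace=TRUE` samples with replacement, so a value can repeat within a group (as in the desired output above); passing `sample` on its own gives a within-group permutation instead.

```r
# Example data from the question
df <- data.frame(A = letters[1:9],
                 B = rep(c("1", "2", "3"), each = 3),
                 C = c(0.1, 0.2, 0.4, 0.1, 0.5, 0.7, 0.1, 0.2, 0.5))

set.seed(1)  # arbitrary seed, only so the run is reproducible

# Sample with replacement inside each level of B
res <- transform(df, D = ave(C, B, FUN = function(b) sample(b, replace = TRUE)))

# Check: every D value comes from the C values of the same B group
ok <- mapply(function(d, g) d %in% df$C[df$B == g], res$D, res$B)
all(ok)

# Variant: without replacement, each group is a permutation of its own C values
perm <- transform(df, D = ave(C, B, FUN = sample))
```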

Thanks @Matthew Plourde, it works! Could you please tell me whether the 'ave' function randomly picks each value from within the levels in C? – user2380782

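Another question @Matthew Plourde: any idea how to add n columns to my data frame? For each row I want to get 1000 values (randomly selected) taking the levels of B into account. Maybe a for loop? Thanks for your help – user2380782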

'ave' applies the 'sample' function to each group. 'sample' is what does the random selection. –
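For the n-column question in the comments, one possible approach (a sketch, not something the answerer posted) is to repeat the grouped sampling with `replicate` instead of a for loop. The column names D1..Dn and the seed are assumptions made up for this example.

```r
# Example data from the question
df <- data.frame(A = letters[1:9],
                 B = rep(c("1", "2", "3"), each = 3),
                 C = c(0.1, 0.2, 0.4, 0.1, 0.5, 0.7, 0.1, 0.2, 0.5))

n <- 1000     # number of extra columns, as asked in the comment
set.seed(42)  # arbitrary seed, only so the run is reproducible

# Each replicate() call draws one grouped sample of length nrow(df);
# the n results are bound column-wise into a 9 x n numeric matrix
draws <- replicate(n, ave(df$C, df$B, FUN = function(b) sample(b, replace = TRUE)))
colnames(draws) <- paste0("D", seq_len(n))  # hypothetical names D1..Dn

# Attach the sampled columns to the original data frame
res <- cbind(df, draws)
dim(res)  # 9 rows, 3 + n columns
```

Keeping the draws in a matrix first avoids growing the data frame one column at a time, which tends to be slower for large n.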