2017-03-03 41 views
0

Select elements with the same characteristics from another data frame?

I have a data frame in R that I subset into two:

p<-c(3.14,3.56,7.45,8.33,5.44,3.12,3.78,7.62,9.12,4.34,6.78,8.65,6.99) 
n<-c("mQTL","mQTL","null","null","null","null","null","null","null","null","null","null","null") 
s<-c(2,2,1,2,1,1,2,2,2,1,2,1,2) 
g<-c("female","male","female","male","female","female","male","female","female","male","female","female","female") 
df<-data.frame(n,g,s,p) 
df 


mQTL<-subset(df,df$n=='mQTL') 

mQTL

     n      g s    p
1 mQTL female 2 3.14
2 mQTL   male 2 3.56


null<-subset(df,df$n=="null") 

      n      g s    p
3  null female 1 7.45
4  null   male 2 8.33
5  null female 1 5.44
6  null female 1 3.12
7  null   male 2 3.78
8  null female 2 7.62
9  null female 2 9.12
10 null   male 1 4.34
11 null female 2 6.78
12 null female 1 8.65
13 null female 2 6.99

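I want to randomly draw two elements from null, each of which matches one of the two mQTL rows on gender (df$g) and number (df$s).

For example, I would like something like this for the first random draw: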

   n      g s    p
null female 2 7.62
null   male 2 3.78

Second random draw:

   n      g s    p
null female 2 9.12
null   male 2 3.78

I want to draw like this 5 times, for example, to get 5 different combinations.

I tried

null[which((mQTL$g==null$g)& (mQTL$s==null$s)),] 

but it gives me a data frame with all of the matching rows rather than a combination of two elements:

      n      g s    p
4  null   male 2 8.33
9  null female 2 9.12
11 null female 2 6.78
13 null female 2 6.99
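
A short sketch of why the attempt above returns those particular rows: `==` compares vectors position by position and recycles the shorter one, so the two mQTL values are repeated along the null column instead of being tested for membership. The toy vectors below are made up for illustration (they are not the question's data), and the names `mq_g`, `null_g`, `by_position` and `by_membership` are hypothetical.

```r
# Toy vectors (illustrative only, shorter than the question's data)
mq_g   <- c("female", "male")
null_g <- c("female", "male", "female", "female", "male")

# `==` recycles mq_g to female, male, female, male, female and compares
# by position; R also warns that the longer length is not a multiple
# of the shorter one, which is silenced here.
by_position <- suppressWarnings(mq_g == null_g)

# %in% asks instead whether each null value occurs anywhere in mq_g
by_membership <- null_g %in% mq_g

print(by_position)
print(by_membership)
```

With recycling, a row only counts as a match when it happens to line up with the right mQTL value, which is why membership tests such as `%in%` are the usual choice for this kind of filtering.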
+1

I don't understand. Why would 8.33 be used for the male row? – Crt

+0

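I made up some data, so you don't need to interpret the actual values. My real data frame is much bigger than this. In fact, I have 4000 mQTL rows to sample for from null (10000 rows). I want each of them to have the same features based on 'gender' and 'number' (the s column). But I want to pick the 4000 from null at random; they only need to have the same features (criteria)! – dizue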

+1

You may want to read http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/28481250#28481250. Reproducing your example is a bit of a pain as you have it here. – Frank

Answers

0
mQTL = subset(df,df$n=='mQTL') 
null = subset(df,df$n=='null') 

# Check if the combination of null$g and null$s matches with that of mQTL$g and mQTL$s 
null$match = paste(null$g, null$s) %in% paste(mQTL$g, mQTL$s) 

# Random sample of two of the matched rows 
null[sample(which(null$match), 2),] 

# > null[sample(which(null$match), 2),]
#       n      g s    p match
# 13 null female 2 6.99  TRUE
# 4  null   male 2 8.33  TRUE

To draw 5 times, run a for loop and store the draws in a list:

draws = list() 
for(ii in 1:5){ 
    draws[[ii]] = null[sample(which(null$match), 2),] 
} 

# > draws
# [[1]]
#       n      g s    p match
# 4  null   male 2 8.33  TRUE
# 13 null female 2 6.99  TRUE
#
# [[2]]
#       n      g s    p match
# 11 null female 2 6.78  TRUE
# 9  null female 2 9.12  TRUE
#
# [[3]]
#      n      g s    p match
# 9 null female 2 9.12  TRUE
# 8 null female 2 7.62  TRUE
#
# [[4]]
#       n      g s    p match
# 13 null female 2 6.99  TRUE
# 4  null   male 2 8.33  TRUE
#
# [[5]]
#      n      g s    p match
# 7 null   male 2 3.78  TRUE
# 8 null female 2 7.62  TRUE
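
Note that the loop above samples any two matched rows, so a draw can contain two female rows (as draws 2 and 3 do). If each draw should instead contain one null row per mQTL row, matched on `g` and `s`, a minimal sketch could look like the following. The helper name `draw_matched` is hypothetical, the seed is only there to make the sketch repeatable, and the same null row may be picked for two mQTL rows that share `g` and `s`, which may or may not be acceptable for the real 4000-row data.

```r
set.seed(1)  # assumption: fixed seed only so the sketch is repeatable

# Rebuild the question's data with character (not factor) columns
df <- data.frame(
  n = c("mQTL", "mQTL", rep("null", 11)),
  g = c("female", "male", "female", "male", "female", "female", "male",
        "female", "female", "male", "female", "female", "female"),
  s = c(2, 2, 1, 2, 1, 1, 2, 2, 2, 1, 2, 1, 2),
  p = c(3.14, 3.56, 7.45, 8.33, 5.44, 3.12, 3.78, 7.62, 9.12, 4.34,
        6.78, 8.65, 6.99),
  stringsAsFactors = FALSE
)
mQTL <- df[df$n == "mQTL", ]
null <- df[df$n == "null", ]

# Hypothetical helper: for each mQTL row, pick one null row at random
# among those with the same g and s.
draw_matched <- function(mQTL, null) {
  picks <- vapply(seq_len(nrow(mQTL)), function(i) {
    candidates <- which(null$g == mQTL$g[i] & null$s == mQTL$s[i])
    if (length(candidates) == 0L) {
      stop("no matching null row for mQTL row ", i)
    }
    # sample.int on the length avoids sample()'s special case when
    # there is exactly one candidate
    candidates[sample.int(length(candidates), 1L)]
  }, integer(1))
  null[picks, ]
}

# Five independent draws, stored in a list as in the loop above
draws <- replicate(5, draw_matched(mQTL, null), simplify = FALSE)
draws[[1]]
```

Each element of `draws` then has the same `g` and `s` pattern as `mQTL`, row for row.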
+0

Thank you very much! This is exactly what I needed! – dizue

+0

@dizue If you think this answers your question, please accept it so that others can see. – useR

+0

Thank you very much! – dizue

0

Try using the merge() function:

merge(mQTL, null, by.x = c("g","s"), by.y = c("g","s"))

But you may want to rename the columns to make things clearer.
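
One way to handle the renaming is the `suffixes` argument of `merge()`, which labels the non-key columns that share a name in both inputs. The sketch below uses a cut-down, made-up version of the two subsets; the name `pairs` and the suffix strings are illustrative choices, not part of the original answer.

```r
# Cut-down illustrative subsets (not the full data from the question)
mQTL <- data.frame(n = "mQTL", g = c("female", "male"),
                   s = c(2, 2), p = c(3.14, 3.56))
null <- data.frame(n = "null", g = c("female", "male", "female"),
                   s = c(1, 2, 2), p = c(7.45, 8.33, 7.62))

# Join on g and s; the clashing columns n and p get a suffix per input
pairs <- merge(mQTL, null, by = c("g", "s"),
               suffixes = c(".mQTL", ".null"))
names(pairs)
```

The result lists every mQTL/null pairing that shares `g` and `s`, so a random draw would still need a sampling step on top of it.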

+0

Sorry, I updated my question. In fact, I need to do a permutation rather than a merge. – dizue
