R - Sampling from paired data

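I want to randomly sample a variable in paired data. idmen is my couple identifier, idind is my person identifier, and jour is the variable that needs to be randomly subset. jour must be the same for both members of an idmen. So, for example, for idmen == 2 we need to keep either dimanche or vendredi.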

Here is the data

idmen idind     jour actpr1
    1     1    lundi    111
    1     2    lundi    111
    2     1 dimanche    111
    2     2 dimanche    111
    2     1 vendredi    111
    2     2 vendredi    111
    3     1 dimanche    113
    3     2 dimanche    121
    3     1    lundi    111
    3     2    lundi    111

Here is the desired output (it can vary, of course, because the day has to be chosen at random).

I need to sample one day for each idmen.

idmen idind     jour actpr1
    1     1    lundi    111
    1     2    lundi    111
    2     1 dimanche    111
    2     2 dimanche    111
    3     1 dimanche    113
    3     2 dimanche    121

I thought of something like

library(dplyr) 
dta %>% group_by(idmen, jour) %>% sample_n(2)

but I don't understand why this doesn't work.

Any clue?

structure(list(idmen = c(1, 1, 2, 2, 2, 2, 3, 3, 3, 3), idind = c(1, 
2, 1, 2, 1, 2, 1, 2, 1, 2), jour = structure(c(3L, 3L, 1L, 1L, 
7L, 7L, 1L, 1L, 3L, 3L), .Label = c("dimanche", "jeudi ", "lundi ", 
"mardi ", "mercredi", "samedi ", "vendredi"), class = "factor"), 
actpr1 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 3L, 4L, 1L, 
1L), .Label = c("111", "112", "113", "121", "122", "123", 
"131", "132", "141", "143", "144", "145", "146", "151", "211", 
"212", "213", "223", "231", "233", "241", "261", "262", "271", 
"272", "311", "312", "313", "324", "331", "332", "334", "335", 
"341", "342", "343", "351", "372", "373", "374", "381", "382", 
"384", "385", "399", "411", "412", "413", "414", "419", "422", 
"423", "429", "431", "433", "510", "511", "512", "513", "514", 
"521", "522", "523", "524", "531", "532", "533", "541", "542", 
"613", "614", "616", "621", "622", "623", "627", "631", "632", 
"633", "634", "635", "636", "637", "638", "641", "651", "653", 
"655", "658", "661", "662", "663", "665", "667", "668", "669", 
"671", "672", "673", "674", "678", "810", "811", "812", "813", 
"819", "911", "999"), class = "factor")), .Names = c("idmen", 
"idind", "jour", "actpr1"), row.names = c(NA, -10L), class = "data.frame")

Source

2015-11-20 giacomo

Maybe try this:

> dta %>% group_by(idmen) %>% filter(jour == jour[sample(length(jour),1)]) 
Source: local data frame [6 x 4] 
Groups: idmen [3] 

  idmen idind     jour actpr1
  (dbl) (dbl)   (fctr) (fctr)
1     1     1    lundi    111
2     1     2    lundi    111
3     2     1 vendredi    111
4     2     2 vendredi    111
5     3     1    lundi    111
6     3     2    lundi    111

...although it would be kind of neat if dplyr had a built-in "sample whole groups" function.
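For comparison, the attempt in the question groups by both idmen and jour, so every group already has exactly two rows and sample_n(2) simply returns all of them. One way to approximate a "sample whole groups" step is sketched below: draw one day per idmen from the distinct (idmen, jour) combinations, then keep the matching rows. This is only a sketch; it assumes dplyr 1.0.0 or later (for slice_sample()), uses a hand-typed stand-in for the question's data with jour stored as character, and the names picked and result are purely illustrative.

```r
library(dplyr)

# Stand-in for the question's data (same columns; jour kept as character
# rather than the factor from the dput output)
dta <- data.frame(
  idmen  = c(1, 1, 2, 2, 2, 2, 3, 3, 3, 3),
  idind  = c(1, 2, 1, 2, 1, 2, 1, 2, 1, 2),
  jour   = c("lundi", "lundi", "dimanche", "dimanche", "vendredi", "vendredi",
             "dimanche", "dimanche", "lundi", "lundi"),
  actpr1 = c("111", "111", "111", "111", "111", "111", "113", "121", "111", "111"),
  stringsAsFactors = FALSE
)

set.seed(1)  # only to make the draw reproducible

# One row per (idmen, jour) combination, then draw a single day within each idmen
picked <- dta %>%
  distinct(idmen, jour) %>%
  group_by(idmen) %>%
  slice_sample(n = 1) %>%
  ungroup()

# Keep every row of the original data whose (idmen, jour) pair was drawn
result <- semi_join(dta, picked, by = c("idmen", "jour"))
```

Because the draw is made over distinct days, each day of a couple is equally likely even if some days have more rows than others; the filter() version above draws a row instead, which gives the same probabilities only when every day has the same number of rows.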

Source

2015-11-20 17:26:30 joran

Here is a base R solution:

dta[unlist(sample(as.data.frame(matrix(1:nrow(dta), nrow = 2)), 10, replace = TRUE)), ]

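This takes advantage of the fact that a data frame is a list. When you call sample() on a list it draws whole elements, which for a data frame means whole columns; here each column of the index matrix holds the row numbers of one pair. Calling unlist() on the result then gives row indices in which the two rows of a pair are always sampled together. Note that this relies on the two rows of each pair being adjacent in dta. As written it draws 10 pairs with replacement, but of course that can be changed.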
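The column-sampling trick described above can be seen in isolation with a small sketch. It uses a toy index table for 10 rows rather than dta itself, and the names idx, picked and rows are illustrative only.

```r
# Toy index table for 10 rows stored as 5 adjacent pairs
# (the adjacency of pair rows is an assumption about how the data are ordered)
idx <- as.data.frame(matrix(1:10, nrow = 2))  # columns hold (1,2), (3,4), ...

set.seed(42)  # only to make the draw reproducible

# sample() on a data frame draws whole columns, i.e. whole pairs of row numbers
picked <- sample(idx, 3)

# Flatten to a plain vector of row indices; the two rows of a pair stay together
rows <- unname(unlist(picked))
```

Indexing with dta[rows, ] would then pull out the sampled pairs. As far as I can tell, this draws pairs across the whole table, so it does not by itself guarantee exactly one day per idmen.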

Source

2015-11-20 17:34:33
