Match the distribution of a modeled dataset to that of an observed dataset?
I have two datasets. Let's assume they look as simple as this:
observed <- data.frame(name = c("Jenny", "Mark", "James", "Amber", "Jamie"),
                       height = c(68, 69, 72, 63, 77),
                       mood = c("content", "content", "melancholy", "happy", "melancholy"))
modeled <- data.frame(name = c("Alex", "Jimmy", "Sal", "Evelyn", "Maria", "George", "Hilary", "Donny", "Jose", "Luke", "Leia"),
                      height = c(74, 71, 68, 66, 80, 59, 67, 67, 69, 65, 72),
                      mood = c("content", "content", "melancholy", "happy", "melancholy", "content", "content", "melancholy", "happy", "melancholy", "happy"))
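I want to select rows from modeled such that the distribution of modeled$height is as close as possible to the distribution of observed$height. I need to keep the rows intact, rather than simply matching the distribution of the height values. Any insight would be greatly appreciated.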
What do you mean by *as close as possible*? If you filter `modeled` based on `modeled$height %in% observed$height`, then you will get an exact match. Is that what you want? – coffeinjunky
These datasets are a poor fit for this problem because they are too small. I want the density distribution of the height column to match. –
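One possible way to approach the density-matching goal described above is rejection sampling on kernel density estimates. What follows is a minimal sketch, not a definitive answer. It assumes larger data sets than the toy example, so the two data frames here are simulated purely for illustration, and the names `id`, `w`, `keep`, and `selected` are hypothetical. Each modeled row is kept with probability proportional to the ratio of the observed density to the modeled density at that row's height, so whole rows are retained or dropped together.

```r
set.seed(42)

# Simulated stand-ins for the real data (assumption: the real sets are large).
observed <- data.frame(height = rnorm(500, mean = 68, sd = 3))
modeled <- data.frame(
  id     = seq_len(5000),
  height = rnorm(5000, mean = 70, sd = 5),
  mood   = sample(c("content", "melancholy", "happy"), 5000, replace = TRUE)
)

# Kernel density estimates of both height columns on a shared range.
rng   <- range(c(observed$height, modeled$height))
d_obs <- density(observed$height, from = rng[1], to = rng[2], n = 512)
d_mod <- density(modeled$height,  from = rng[1], to = rng[2], n = 512)

# Evaluate both densities at every modeled height by linear interpolation.
f_obs <- approx(d_obs$x, d_obs$y, xout = modeled$height, rule = 2)$y
f_mod <- approx(d_mod$x, d_mod$y, xout = modeled$height, rule = 2)$y

# Density ratio: large where modeled heights are under-represented
# relative to the observed distribution.
w <- f_obs / f_mod

# Rejection step: accept each row with probability w / max(w).
keep     <- runif(nrow(modeled)) < w / max(w)
selected <- modeled[keep, ]

# Compare the height distributions before and after selection.
summary(modeled$height)
summary(selected$height)
summary(observed$height)
```

The number of rows kept is random and depends on how far apart the two distributions are. If the modeled data barely cover part of the observed range, `max(w)` becomes large and few rows survive, so it may be worth checking the result with `ks.test(selected$height, observed$height)` or by overlaying density plots.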