1

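Plot confidence intervals on a histogram using ggplot2

The code below plots the sampling distribution of the mean and computes 20 batches of 95% confidence intervals. How can I draw the confidence intervals on the histogram, as shown in the Photoshopped image below?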

# plot sampling distribution of mean ----------------------------------------------------------- 
set.seed(1) 

population <- rnorm(10000, 3, 3) 

population_mean <- mean(population) 

my_sample <- sample(population, 100, replace = FALSE) 

standard_error <- sqrt(var(my_sample)/length(my_sample)) 

sampling_distribution_of_mean <- rnorm(10000, mean = population_mean, sd = standard_error) 

library(ggplot2) 
ggplot(data.frame(x = sampling_distribution_of_mean), aes(x)) + geom_histogram() + geom_vline(xintercept = population_mean, color = "red") 


# calculate 20 lots of 95% confidence intervals ----------------------------------------------------------- 

my_confidence_intervals <- function(){ 

    my_sample <- sample(population, 100, replace = FALSE) 

    sample_mean <- mean(my_sample) 

    standard_error <- sqrt(var(my_sample)/length(my_sample)) 

    margin_of_error <- 1.96*standard_error 

    mean_minus_margin_of_error <- sample_mean - margin_of_error 
    mean_plus_margin_of_error <- sample_mean + margin_of_error 

    c(mean_minus_margin_of_error, mean_plus_margin_of_error) 

} 

library(plyr) 
llply(1:20, function(x) my_confidence_intervals()) 

enter image description here

My question isn't related to the task, but what exactly are you trying to show? – Dason

The sampling distribution of the mean, with 95% confidence intervals of sample means. – luciano

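Maybe you'd want to include some vertical lines to indicate the middle 95% of the sampling distribution as well. That way it's easier to see that sample means falling outside those bounds lead to confidence intervals that fail to capture the mean. – Dason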
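Dason's suggestion could be sketched roughly as follows. This is only an illustration: it reuses the question's definitions, the name `middle_95` is introduced here, and `bins = 30` is an arbitrary choice. It also assumes the normal approximation with the single standard error estimated from `my_sample`, so the bounds only approximate where each individual interval starts missing the mean.

```r
library(ggplot2)

# Same setup as in the question
set.seed(1)
population <- rnorm(10000, 3, 3)
population_mean <- mean(population)
my_sample <- sample(population, 100, replace = FALSE)
standard_error <- sqrt(var(my_sample) / length(my_sample))
sampling_distribution_of_mean <- rnorm(10000, mean = population_mean, sd = standard_error)

# Bounds of the middle 95% of the (approximately normal) sampling distribution.
# `middle_95` is a hypothetical name, not part of the original code.
middle_95 <- population_mean + c(-1, 1) * qnorm(0.975) * standard_error

# Histogram with the population mean (red) and the middle-95% bounds (dashed)
p <- ggplot(data.frame(x = sampling_distribution_of_mean), aes(x)) +
  geom_histogram(bins = 30) +
  geom_vline(xintercept = population_mean, color = "red") +
  geom_vline(xintercept = middle_95, linetype = "dashed")
print(p)
```

A sample mean that lands outside the dashed lines is more than about 1.96 standard errors from the population mean, which is the situation in which a 1.96 * SE interval around it would not reach the red line.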

Answer

7

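You'll want to build a data.frame containing the intervals and then add a horizontal error bar layer to draw them. First, I'll turn your ranges into a data.frame: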

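# Collect the 20 intervals (needs plyr and my_confidence_intervals() from the question)
xx <- llply(1:20, function(x) my_confidence_intervals())
# One row per interval: columns x.1 (lower) and x.2 (upper); y spaces the bars 50 counts apart
xx <- data.frame(y = 1:20 * 50, x = do.call(rbind, xx))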

Now I'll add them to the plot:

ggplot(data.frame(x = sampling_distribution_of_mean), aes(x)) + 
    geom_histogram() + 
    geom_vline(xintercept = population_mean, color = "red") + 
    geom_errorbarh(aes(y=y, x=x.1, xmin=x.1, xmax=x.2), data=xx, col="#0094EA", size=1.2) 

This gives:

enter image description here

Notice that I explicitly set the y-value for each range when creating the data.frame.
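As an optional extension that is not part of the original answer, one could also record which of the 20 intervals actually capture the population mean. The sketch below uses base R only; `draw_interval`, `intervals` and `captures_mean` are hypothetical names introduced for illustration, and the sampling logic mirrors the question's `my_confidence_intervals()`.

```r
# Same population as in the question
set.seed(1)
population <- rnorm(10000, 3, 3)
population_mean <- mean(population)

# Hypothetical helper: one 95% interval from a fresh sample of size n
draw_interval <- function(n = 100) {
  s <- sample(population, n, replace = FALSE)
  se <- sqrt(var(s) / n)
  mean(s) + c(lower = -1.96, upper = 1.96) * se
}

# 20 intervals, one per row, with columns `lower` and `upper`
intervals <- as.data.frame(t(replicate(20, draw_interval())))

# TRUE where the interval contains the population mean
intervals$captures_mean <- intervals$lower <= population_mean &
  population_mean <= intervals$upper

# Share of the 20 intervals that capture the mean
mean(intervals$captures_mean)
```

If a column like `captures_mean` were added to the data.frame used for the error bars, it could presumably be mapped to `colour` inside `aes()` (dropping the fixed `col` value) so that the intervals missing the mean stand out.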

The blue you chose is clearly a very close match to the blue I used in my question. How did you find such a close match? – luciano

I copied your image into Paint.NET and used the color picker. – MrFlick