Generating a histogram and density plot from binned data

I have binned some data and currently have a data frame made up of two columns, one giving the bin range and the other giving the frequency, like this:

> head(data) 
     binRange Frequency 
1 (0,0.025]  88 
2 (0.025,0.05]  72 
3 (0.05,0.075]  92 
4 (0.075,0.1]  38 
5 (0.1,0.125]  20 
6 (0.125,0.15]  16

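I would like to use this to draw a histogram and a density plot, but I can't seem to find a way to do it without having to generate new bins and so on. Using this solution here, I tried the following: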

p <- ggplot(data, aes(x= binRange, y=Frequency)) + geom_histogram(stat="identity")

But it crashes. Does anyone know how to deal with this?

Thanks

Source

2015-04-29 user2062207

Take a look at this [post](http://stackoverflow.com/questions/18219704/histogram-of-binned-data-frame-in-r).

Thanks, I've just updated my post. I tried it on my data by running `p <- ggplot(data, aes(x = binRange, y = Frequency)) + geom_histogram(stat = "identity")`, but it just crashes – user2062207

What error message do you get?

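The problem is that ggplot doesn't understand the data in the form you are feeding it. You need to reshape it, for example like this (I'm not a regex expert, so there is surely a better way to do it):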

df <- read.table(header = TRUE, text = " 
       binRange Frequency 
1 (0,0.025]  88 
2 (0.025,0.05]  72 
3 (0.05,0.075]  92 
4 (0.075,0.1]  38 
5 (0.1,0.125]  20 
6 (0.125,0.15]  16") 

library(stringr) 
library(splitstackshape) 
library(ggplot2) 
# extract the "start,end" part of each label, e.g. "(0,0.025]" -> "0,0.025"
df$binRange <- str_extract(df$binRange, "[0-9].*[0-9]+") 

# split the data at the "," into two columns: 
# binRange_1 for the start point and binRange_2 for the end point 
df <- cSplit(df, "binRange") 

# plot it; you don't actually need the second column, since every bin is 0.025 wide 
# shift x by half a bin so each bar covers its bin instead of straddling the start point 
ggplot(df, aes(x = binRange_1 + 0.025 / 2, y = Frequency)) + 
    geom_bar(stat = "identity", width = 0.025)
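
The same reshaping can probably be done without the two extra packages. The following is only a sketch in base R, not part of the original answer: the column names `lower`, `upper` and `mid` are made up for illustration, and the data are the six rows shown in the question. The plotting step is skipped if ggplot2 is not installed.

```r
# The six bins from the question, with labels in the "(a,b]" form produced by cut()
df <- data.frame(
  binRange  = c("(0,0.025]", "(0.025,0.05]", "(0.05,0.075]",
                "(0.075,0.1]", "(0.1,0.125]", "(0.125,0.15]"),
  Frequency = c(88, 72, 92, 38, 20, 16),
  stringsAsFactors = FALSE
)

# Strip the leading "(" and trailing "]", then split each label at the comma
bounds <- strsplit(gsub("^\\(|\\]$", "", df$binRange), ",", fixed = TRUE)

# Illustrative column names (assumptions, not from the original answer)
df$lower <- as.numeric(vapply(bounds, `[`, character(1), 1))
df$upper <- as.numeric(vapply(bounds, `[`, character(1), 2))
df$mid   <- (df$lower + df$upper) / 2

# Draw each bin as a rectangle spanning exactly [lower, upper]
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df) +
    geom_rect(aes(xmin = lower, xmax = upper, ymin = 0, ymax = Frequency)) +
    labs(x = "binRange", y = "Frequency")
}
```

Using `geom_rect` with explicit bin edges avoids hard-coding the bin width, so it should also cope with bins of unequal width.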

Or, if you don't want the data to be interpreted numerically, you can simply do the following:

df <- read.table(header = TRUE, text = " 
       binRange Frequency 
1 (0,0.025]  88 
2 (0.025,0.05]  72 
3 (0.05,0.075]  92 
4 (0.075,0.1]  38 
5 (0.1,0.125]  20 
6 (0.125,0.15]  16") 

library(ggplot2) 
ggplot(df, aes(x = binRange, y = Frequency)) + geom_bar(stat = "identity")
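
One caveat with treating the labels as categories: ggplot2 places a character x variable in sorted (alphabetical) order. The six labels above happen to sort correctly, but that is not guaranteed in general. The sketch below uses made-up labels (not from the question) to show the problem and one way to pin the order, by converting the column to a factor whose levels follow the row order.

```r
# Hypothetical bin labels, chosen only to show the ordering problem
bins <- c("(0,5]", "(5,10]", "(10,15]")

# Sorting the labels as text puts "(10,15]" before "(5,10]"
sorted_bins <- sort(bins)

# Fixing the factor levels to the order of appearance keeps the bins in sequence
bins_factor <- factor(bins, levels = unique(bins))
levels(bins_factor)
```

Applied to the data frame above, this would be `df$binRange <- factor(df$binRange, levels = unique(df$binRange))` before the call to `ggplot`, assuming the rows are already in bin order.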

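You won't be able to draw a density plot with your data, because it is not continuous but categorical, which is why I prefer the second way of displaying it.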
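
If an approximate density is still wanted, one possible workaround (not part of the original answer) is to go back to the numeric bin edges and rescale the counts so that the bars have a total area of 1. A smoothed curve can then be sketched with base R's `density()` by weighting the bin midpoints. Everything here is an assumption for illustration: the column names, the choice of the bin width as bandwidth, and the use of only the six rows shown in the question. The result is an approximation, since the raw observations inside each bin are unknown.

```r
# Bin edges and counts for the six rows shown in the question
df <- data.frame(
  lower     = c(0, 0.025, 0.05, 0.075, 0.1, 0.125),
  upper     = c(0.025, 0.05, 0.075, 0.1, 0.125, 0.15),
  Frequency = c(88, 72, 92, 38, 20, 16)
)
df$width <- df$upper - df$lower
df$mid   <- (df$lower + df$upper) / 2

# Histogram on the density scale: bar height = count / (total count * bin width)
df$density <- df$Frequency / (sum(df$Frequency) * df$width)
total_area <- sum(df$density * df$width)  # the bars now have total area 1

# Rough kernel density: treat each bin as a point mass at its midpoint.
# bw = 0.025 (one bin width) is an arbitrary illustrative choice.
kde <- density(df$mid, weights = df$Frequency / sum(df$Frequency), bw = 0.025)

# Overlay the rescaled bars and the smoothed curve using base graphics
plot(kde, ylim = c(0, max(df$density, kde$y)), main = "Approximate density")
rect(df$lower, 0, df$upper, df$density, border = "grey40")
```

The bars are exact given the binned counts; the curve depends heavily on the bandwidth, so it is best read as a visual aid only.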

Source

2015-04-29 17:22:52 grrgrrbla

Thanks, that works great! – user2062207
