Change the order of bins in geom_histogram?

2014-10-29

I am using ggplot2 and trying to change the order of the bins. I am working with data from New York's Stop and Frisk program: http://www.nyclu.org/content/stop-and-frisk-data

Times are given as integers (for example: 5 = 12:05 AM, 355 = 3:55 AM, 2100 = 9 PM).
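Because these integers are clock readings rather than minute counts, the encoding can be unpacked with integer division. A minimal sketch in base R, assuming the values follow the HHMM pattern described above (the variable names are illustrative and not taken from the dataset):

```r
# Example values taken from the description above
timestop <- c(5, 355, 2100)

# Hours are the leading digits, minutes are the last two digits
hours   <- timestop %/% 100
minutes <- timestop %% 100

# Minutes elapsed since midnight, a genuinely continuous time scale
minutes_since_midnight <- hours * 60 + minutes

# Zero-padded 24-hour clock strings
clock <- sprintf("%02d:%02d", as.integer(hours), as.integer(minutes))
clock
# "00:05" "03:55" "21:00"
```

This matters for binning: a bin of width 300 on the raw integers covers exactly three clock hours only when its edges fall on multiples of 100.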

I create a histogram of the stop times with the following:

myplot <- ggplot(Stop.and.Frisk.2011) + geom_histogram(aes(x=timestop),binwidth=300) 

This gives me a fairly good plot of the times, with bins running from midnight to 3 AM, 3 AM to 6 AM, 6 AM to 9 AM, and so on.

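However, I would like to move the first two bins (midnight to 3 AM and 3 AM to 6 AM) to the end, so that the plot resembles a more normal working day.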

Is there a simple way to change the order of the bins? I tried using the breaks argument, but could not find a way to make it wrap around.

Essentially, I want the bins arranged in this order: 600-900, 900-1200, 1200-1500, 1500-1800, 1800-2100, 2100-2400, 0-300, 300-600.

Thanks in advance!

Answers

0

One approach is to bin the data before calling ggplot. Here is an example that uses the cut function to create 3-hour intervals:

# Load ggplot2 for plotting 
library(ggplot2) 

# Read in the data 
df <- read.csv('SQF 2012.csv', header = TRUE) 

# Create left-closed intervals every 3 hours based
# on the `timestop` variable: [0,300), [300,600), ...
# (right = FALSE keeps a stop at exactly midnight, 0, in the first bin)
df$intervals <- cut(df$timestop,
        breaks = seq(0, 2400, by = 300),
        right = FALSE)

# Re-order the sequence prior to plotting:
# interval codes 3..8 (6AM-12AM) become 1..6,
# interval codes 1..2 (12AM-6AM) become 7..8
df$sequence <- (as.integer(df$intervals) - 3) %% 8 + 1

# Create the plot 
ggplot(df, aes(x = sequence)) + 
    geom_histogram(binwidth = 0.5) + 
    scale_x_continuous(breaks = c(1, 2, 3, 4, 5, 6, 7, 8), 
        labels = c('6AM-9AM', '9AM-12PM', '12PM-3PM', '3PM-6PM', 
           '6PM-9PM', '9PM-12AM', '12AM-3AM', '3AM-6AM')) + 
    xlab('Time') + 
    ylab('Number\n') + 
    theme(axis.text = element_text(size = rel(1.1))) + 
    theme(axis.text.x = element_text(angle = 45, hjust = 1)) + 
    theme(axis.title = element_text(size = rel(1.1), face = 'bold')) 

Output
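An alternative sketch of the same idea keeps the bins as a factor and reorders its levels, so no numeric sequence column is needed. The sample vector and the label names below are made up for illustration; `cut`, `factor`, and `table` are base R.

```r
# Made-up sample of stop times in HHMM integer form
timestop <- c(5, 355, 615, 1230, 1745, 2100, 2359)

labels <- c("12AM-3AM", "3AM-6AM", "6AM-9AM", "9AM-12PM",
            "12PM-3PM", "3PM-6PM", "6PM-9PM", "9PM-12AM")

# Left-closed bins [0,300), [300,600), ..., [2100,2400)
bins <- cut(timestop, breaks = seq(0, 2400, by = 300),
            labels = labels, right = FALSE)

# Move the two overnight bins to the end of the level order
bins <- factor(bins, levels = c(labels[3:8], labels[1:2]))

# Counts now come out in workday order, starting at 6AM-9AM
table(bins)
```

A discrete x aesthetic in ggplot2 follows the factor's level order, so mapping such a column to x with geom_bar should draw the bars in this sequence without any manual axis relabeling.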

+0

Thank you so much, this is exactly what I was looking for! – 2014-10-30 16:11:57

0

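Here is one way. I added 2400 to every timestop value between 0 and 599. This moves the desired time range to the end (that is, the right-hand side) of the plot. When drawing the graph, I relabeled the x-axis to match.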

library(data.table) 
library(dplyr) 
library(ggplot2) 

# Read the file 
foo <- fread("SQF 2012.csv", header = TRUE, na.strings="NA", colClasses="character") 

# Change timestop values 
ana <- setDF(foo) %>% 
     select(datestop,timestop) %>% 
     mutate(timestop = as.numeric(timestop), 
       timestop = ifelse(timestop >= 0 & timestop < 600, 2400 + timestop, timestop)) 

# Draw the graph 
ggplot(data = ana, aes(x = timestop)) + 
    geom_histogram(binwidth = 300, boundary = 600) + 
    scale_x_continuous(limits = c(600, 3000), 
         breaks = c(600, 900, 1200, 1500, 
            1800, 2100, 2400, 2700, 3000), 
         labels = c("6:00", "9:00", "12:00", "15:00", 
            "18:00", "21:00", "24:00", "3:00", "6:00")) + 
    xlab("Time") 

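To see what the shift does in isolation, here is a minimal base R sketch on a few made-up values (the vector is hypothetical and not taken from the dataset):

```r
# Made-up stop times in HHMM integer form
timestop <- c(5, 355, 600, 1230, 2359)

# Push everything before 6 AM past the end of the day
shifted <- ifelse(timestop < 600, timestop + 2400, timestop)
shifted
# 2405 2755  600 1230 2359

# The original clock reading can be recovered with the modulo operator
shifted %% 2400
# 5 355 600 1230 2359
```

Because the shifted values stay numeric, the histogram still treats time as continuous; only the axis labels need to be rewritten so that 2700 reads as 3:00 and 3000 as 6:00.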

+0

Thank you very much for your help! – 2014-10-30 16:12:29