2013-02-11 115 views
2

Increasing the number of legend items in ggplot2

I have this data frame:

head(x)

 Date Company Region Units 
1 1/1/2012 Gateway America  0 
2 1/1/2012 Gateway Europe  0 
3 1/1/2012 Gateway America  0 
4 1/1/2012 Gateway Americas  0 
5 1/1/2012 Gateway Europe  0 
6 1/1/2012 Gateway Pacific  0 

dput(x)

structure(list(Date = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("1/1/2012", 
"1/12/2012", "1/2/2012"), class = "factor"), Company = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L), .Label = c("Gateway", "HP", "IBM"), class = "factor"), 
    Region = structure(c(1L, 3L, 1L, 2L, 3L, 4L, 2L, 1L, 3L, 
    1L, 2L, 3L, 4L, 2L, 1L, 3L, 1L, 2L, 3L, 4L, 2L, 1L, 3L, 1L, 
    2L, 3L, 4L, 2L, 1L, 3L, 1L, 2L, 3L, 4L, 2L, 1L, 3L, 1L, 2L, 
    3L, 4L, 2L, 1L, 3L, 1L, 2L, 3L, 4L, 2L, 1L, 3L, 1L, 2L, 3L, 
    4L, 2L, 1L, 3L, 1L, 2L, 3L, 4L, 2L, 1L, 3L, 1L, 2L, 3L, 4L, 
    2L, 1L, 3L, 1L, 2L, 3L, 4L, 2L, 1L, 3L, 1L, 2L, 3L, 4L, 2L 
    ), .Label = c("America", "Americas", "Europe", "Pacific"), class = "factor"), 
    Units = c(1L, 3L, 1L, 6L, 20L, 2L, 2L, 10L, 2L, 1L, 2L, 4L, 
    6L, 30L, 2L, 15L, 10L, 3L, 4L, 7L, 9L, 12L, 34L, 50L, 3L, 
    2L, 4L, 3L, 1L, 3L, 3L, 1L, 4L, 0L, 1L, 0L, 0L, 1L, 0L, 4L, 
    0L, 0L, 0L, 0L, 5L, 0L, 8L, 0L, 0L, 0L, 0L, 0L, 9L, 0L, 56L, 
    10L, 0L, 0L, 5L, 7L, 0L, 0L, 8L, 0L, 2L, 0L, 4L, 0L, 5L, 
    7L, 0L, 0L, 8L, 10L, 0L, 6L, 0L, 4L, 4L, 0L, 2L, 0L, 5L, 
    0L)), .Names = c("Date", "Company", "Region", "Units"), class = "data.frame", row.names = c(NA, 
-84L)) 

I want to build a heat map:

ggplot(x, aes(Date, Company, fill=Units)) + geom_tile(aes(fill=Units)) + facet_grid(~Region) + scale_fill_gradient(low="white", high="red") 

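This command works, but I need to be able to use colors other than white and red, and to increase the number of breaks in the legend. Right now there are 5 legend entries by default; I would like to increase that to 10. 0 would be white, and the other values should be clearly different from white so that users notice them.

How can I increase the number of legend values in ggplot and assign a different color to each legend entry?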

+0

Almost all of the values are zero. I don't think any number of colors will make a difference with this data set. Or is this a test data set? – Arun 2013-02-11 15:14:22

+0

@Arun, I just updated the dput. It is a portion of the actual data. – user1471980 2013-02-11 15:17:08

+0

Yes, but `table(x$Units)` still gives `c(75, 4, 1, 4)` for the 0s, 2s, 4s and 10s. That is a lot of 0s. – Arun 2013-02-11 15:26:44

Answer

3

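I find it very informative to use quantiles when plotting heatmaps, as done here in this blog. This helps to generate a skewed set of colors (as shown in the blog). With data like yours (a fairly high share of 0s), computing appropriate quantiles lets us create a skewed color map with suitable labels that is both visually appealing and informative. I have adapted the code from the linked blog post for this question and added more explanation. The blog post deserves all the credit for the idea and the implementation.

Before getting to the code, we have to analyze your data a little with quantiles to see which quantiles to use. Doing this: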

quantile(x$Units, seq(0, 1, length.out = 25))

#  0% 4.166667% 8.333333%  12.5% 16.66667% 20.83333%  25% 29.16667% 33.33333% 
# 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 
# 37.5% 41.66667% 45.83333%  50% 54.16667% 58.33333%  62.5% 66.66667% 70.83333% 
# 1.00000 1.00000 2.00000 2.00000 3.00000 3.00000 4.00000 4.00000 5.00000 
#  75% 79.16667% 83.33333%  87.5% 91.66667% 95.83333%  100% 
# 6.00000 7.00000 8.00000 9.62500 10.16667 25.41667 56.00000 

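You can see that the 0% quantile corresponds to Units = 0, and that this stays true up to 33% (33.33%, to be precise). So perhaps we choose 38% as the next quantile. Then, say, 60%, 75%, 90%, and finally 100%. Now we have enough levels, and they sit at values that are meaningful for your data.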
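The same reasoning can be scripted instead of being read off the printed quantiles. The following is only a sketch under assumptions that are not part of the approach above: `units` is a made-up stand-in for `x$Units`, `n_classes` is a hypothetical parameter, zeros get a class of their own, and the remaining breaks are quantiles of the positive values only (computed with `type = 1` so that every break is an observed value).

```r
# Made-up stand-in for x$Units: many zeros and a long right tail (assumption)
units <- c(rep(0L, 29), 1L, 1L, 2L, 2L, 3L, 4L, 5L, 6L, 8L, 10L, 15L, 20L, 34L, 56L)

n_classes <- 5                # hypothetical: number of fill classes wanted
p_zero <- mean(units == 0)    # share of exact zeros, shown for reference only
pos <- units[units > 0]       # the non-zero observations

# Zero is the first break; the other breaks are quantiles of the positive values.
# type = 1 avoids interpolation, so each break is a value that occurs in the data.
qtiles <- c(0, quantile(pos, probs = seq(0, 1, length.out = n_classes), type = 1))
qtiles <- unique(unname(qtiles))  # drop tied breaks so the intervals stay distinct

# Class 1 holds only the zeros; positive values land in class 2 and up
q_units <- findInterval(units, qtiles, all.inside = TRUE)
```

If several quantiles tie, `unique()` leaves fewer than `n_classes` classes, so the palette and labels should be sized from `length(qtiles) - 1` rather than from `n_classes`.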

We will need the zoo package for this. Now let's build the data:

require(zoo) # for rollapply 
# the quantiles we just decided to categorise the data into classes. 
qtiles <- quantile(x$Units, probs = c(0, 38, 60, 75, 90, 100)/100) 
# a color palette 
c_pal  <- colorRampPalette(c("#3794bf", "#FFFFFF", 
         "#df8640"))(length(qtiles)-1) 
# since we are using quantile classes for fill levels, 
# we'll have to generate the appropriate labels 
labels <- rollapply(round(qtiles, 2), width = 2, by = 1, 
         FUN = function(i) paste(i, collapse = " : ")) 
# added the quantile interval in which the data falls, 
# which will be used for fill 
x$q.units <- findInterval(x$Units, qtiles, all.inside = TRUE) 

# Now plot 
library(ggplot2) 
p <- ggplot(data = x, aes(x = Date, y = Company, fill = factor(q.units))) 
p <- p + geom_tile(color = "black") 
p <- p + scale_fill_manual(values = c_pal, name = "", labels = labels) 
p <- p + facet_grid(~ Region) 
p <- p + theme(axis.text.x = element_text(angle = 90, hjust = 1)) 
p 

You get something like this: ggplot2_heatmap_skewed
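The `rollapply` call in the code pairs each break with the next one to form the legend labels. Assuming that is all it is needed for, the same labels can be sketched in base R without zoo. The break values below are hypothetical, and converting the class index to a factor with explicit levels is an addition of this sketch, meant to keep each label attached to its class even when a class has no rows.

```r
# Hypothetical break points standing in for the computed qtiles
qtiles <- c(0, 1, 3, 6, 10, 56)

# Pair each break with its successor: "lower : upper"
labels <- paste(head(qtiles, -1), tail(qtiles, -1), sep = " : ")

# Made-up values standing in for x$Units
units <- c(0, 0, 2, 7, 56)
q_units <- findInterval(units, qtiles, all.inside = TRUE)

# Explicit levels keep unused classes, so labels cannot shift onto the wrong class
q_class <- factor(q_units, levels = seq_along(labels), labels = labels)
```

With a labelled factor such as `q_class` as the fill variable, `scale_fill_manual(values = c_pal, drop = FALSE)` would keep unused classes in the legend, and the `labels` argument would no longer be needed.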

Hope this helps.

Edit: You can also visit colorbrewer2.org to find nice palettes and set the colors yourself. For example:

# try out these colors: 
c_pal  <- c("#EDF8FB", "#B3CDE3", "#8C96C6", "#8856A7", "#810F7C") 
c_pal  <- c("#FFFFB2", "#FECC5C", "#FD8D3C", "#F03B20", "#BD0026") 
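If more classes are wanted, for example the ten legend entries mentioned in the question with white reserved for the lowest class, a palette of any length can be interpolated with base R's `colorRampPalette`. This is only a sketch: the anchor colors are borrowed from the second palette above, and `n_classes` is a hypothetical name.

```r
n_classes <- 10  # hypothetical: number of fill classes, i.e. length(qtiles) - 1

# White first, so that the lowest class (the zeros) is drawn in white
c_pal <- colorRampPalette(c("#FFFFFF", "#FECC5C", "#F03B20", "#BD0026"))(n_classes)
```

`scale_fill_manual` needs one color per class, so ten colors only make sense together with eleven break points.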

Also, try setting alpha in the code, e.g. geom_tile(color = "black", alpha = 0.5).

+0

Wow, this is really great. Thank you very much. – user1471980 2013-02-11 16:38:47

+0

You may want to use `x$Date <- factor(x$Date, levels = c("1/1/2012", "1/2/2012", "1/12/2012"), ordered = TRUE)` so that the dates appear in the correct order (or convert the dates to actual dates). – Arun 2013-02-11 16:42:36

+0

This is great. I have one question. If you were doing this from an R script, how would you choose the quantiles? – user1471980 2013-02-11 17:20:45