2011-08-01 112 views
9

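R ggplot: ordering bars in a "barplot-like" plot

Using the data set below (CSV format), and thanks to input and help from Stack Overflow, I managed to draw a bar plot with horizontal bars colored by "group" using ggplot. This is my first time using ggplot.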

GO Biological Process,regulation of lipid metabolic process,1.87E-35 
GO Biological Process,acute inflammatory response,3.21E-37 
GO Biological Process,response to insulin stimulus,1.05E-38 
GO Biological Process,steroid metabolic process,4.19E-39 
GO Biological Process,cholesterol metabolic process,1.19E-40 
GO Biological Process,cellular response to chemical stimulus,5.87E-42 
GO Biological Process,alcohol metabolic process,5.27E-43 
GO Biological Process,sterol metabolic process,2.61E-43 
GO Biological Process,lipid homeostasis,1.12E-44 
GO Biological Process,response to peptide hormone stimulus,1.29E-45 
GO Biological Process,monocarboxylic acid metabolic process,2.33E-54 
GO Biological Process,cellular ketone metabolic process,5.46E-74 
GO Biological Process,carboxylic acid metabolic process,2.41E-76 
GO Biological Process,organic acid metabolic process,5.30E-79 
Pathway Commons,FOXA transcription factor networks,7.40E-61 
Pathway Commons,FOXA2 and FOXA3 transcription factor networks,1.39E-64 
Transcription Factor Targets,"Targets of HNF6, identified by ChIP-chip in hepatocytes",1.77E-32 
Transcription Factor Targets,"Targets of HNF1alpha, identified by ChIP-chip in hepatocytes",3.87E-65 
Transcription Factor Targets,"Targets of HNF4alpha, identified by ChIP-chip in hepatocytes",1.38E-131 

This is my code:

ggplot(tmp, aes(x=tmp$V2, y=-log10(tmp$V3), fill=tmp$V1)) + 
geom_bar(stat="identity") + 
coord_flip() 

[image: the current plot]

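Now I would like to create the same plot as above, but with the values sorted within each "group". It should look something like this:

[image: the desired plot, with bars sorted within each group]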

I am new to ggplot, so any help would be much appreciated. Thanks.

Answers

9

You can control the order of a variable by converting it to a factor whose levels are listed in the order you want.

> head(d)
                     V1                                     V2       V3
1 GO Biological Process  regulation of lipid metabolic process 1.87e-35
2 GO Biological Process            acute inflammatory response 3.21e-37
3 GO Biological Process           response to insulin stimulus 1.05e-38
4 GO Biological Process              steroid metabolic process 4.19e-39
5 GO Biological Process          cholesterol metabolic process 1.19e-40
6 GO Biological Process cellular response to chemical stimulus 5.87e-42

> d$V4 <- factor(d$V2, levels=d$V2) # convert V2 into factor 
> head(d)
                     V1                                     V2       V3                                     V4
1 GO Biological Process  regulation of lipid metabolic process 1.87e-35  regulation of lipid metabolic process
2 GO Biological Process            acute inflammatory response 3.21e-37            acute inflammatory response
3 GO Biological Process           response to insulin stimulus 1.05e-38           response to insulin stimulus
4 GO Biological Process              steroid metabolic process 4.19e-39              steroid metabolic process
5 GO Biological Process          cholesterol metabolic process 1.19e-40          cholesterol metabolic process
6 GO Biological Process cellular response to chemical stimulus 5.87e-42 cellular response to chemical stimulus

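> # plot (stat="identity" uses the y values as given; current ggplot2 rejects a y aesthetic with the default stat)
> ggplot(d, aes(V4, -log10(V3), fill=V1)) + geom_bar(stat="identity") + coord_flip()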

Further information is here: http://kohske.wordpress.com/2010/12/29/faq-how-to-order-the-factor-variables-in-ggplot2/
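
To see why the factor conversion matters, here is a minimal sketch on a made-up three-row data frame (the names `toy`, `term`, and `p` are hypothetical, not from the question). A factor stores its levels in a fixed order, and ggplot2 lays out a discrete axis in level order rather than row order.

```r
# Hypothetical toy data, only for illustration.
toy <- data.frame(term = c("b", "c", "a"),
                  p = c(1e-5, 1e-9, 1e-7),
                  stringsAsFactors = FALSE)

# Default factor(): levels are the sorted unique values (alphabetical here).
default_levels <- levels(factor(toy$term))

# Explicit levels: keep the order in which the rows appear in the data frame.
toy$term_f <- factor(toy$term, levels = toy$term)
row_order_levels <- levels(toy$term_f)

print(default_levels)    # "a" "b" "c"
print(row_order_levels)  # "b" "c" "a"
```

Note that `levels=` must not contain duplicates, so this only works directly when each label occurs once, as in the question's data.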

9
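ggplot(df, aes(reorder(x, y), y)) + geom_bar(stat="identity")

The part you are looking for is reorder(x, y). If you could show us your current ggplot() call we could be more specific, because reorder() is not the only way to do this.

For this type of ordering you may need to use relevel(), but it depends on your data.

You could also add another column to your data.frame() that serves, manually or automatically, as the sort variable, and base the ordering on that column.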
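
A rough sketch of the three options just mentioned, using only base R on a hypothetical toy data frame (`toy`, `x`, `y`, and `rank` are made-up names, so treat this as an illustration and not as the asker's data):

```r
# Hypothetical toy data, only for illustration.
toy <- data.frame(x = c("a", "b", "c"), y = c(3, 1, 2),
                  stringsAsFactors = FALSE)

# reorder(): levels sorted by increasing y (summarised per level, mean by default).
by_value <- reorder(toy$x, toy$y)
print(levels(by_value))  # "b" "c" "a"

# relevel(): move one level to the front and leave the others as they were.
c_first <- relevel(factor(toy$x), ref = "c")
print(levels(c_first))   # "c" "a" "b"

# Extra sort column: build the levels from an explicit ordering variable.
toy$rank <- c(2, 3, 1)
manual <- factor(toy$x, levels = toy$x[order(toy$rank)])
print(levels(manual))    # "c" "a" "b"
```

Any of the resulting factors can then be mapped to the x aesthetic in place of the raw column.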

+0

I think you are missing the closing ')' of the ggplot function. I would edit it myself, but the change is fewer than 6 characters. – Kevin

+0

@Kevin Good catch! Sneaky syntax, I always get it wrong the first time. –

4

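Assuming the data provided by Ben is in a CSV file named data.csv:

library(ggplot2)
d <- read.csv('data.csv', header = FALSE)
d$V2 <- factor(d$V2, levels = d[order(d$V1, -d$V3), ]$V2)  # reorder by group, then by value
ggplot(d, aes(x = V2, y = -log10(V3), fill = V1)) + geom_bar(stat = "identity") + coord_flip()

This approach is a bit more general than kohske's answer and does not require the CSV to be sorted: changing the order of the rows in the CSV file still reproduces the correct plot.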
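
The claim that row order does not matter can be checked on a small example. The sketch below uses a hypothetical helper `order_within_group()` and made-up columns `grp`, `term`, and `p`; it applies the same `order(group, -value)` idea and compares the factor levels before and after shuffling the rows. The plotting part only runs if ggplot2 is installed and assumes a version that provides `geom_col()` (equivalent to `geom_bar(stat = "identity")`).

```r
# Hypothetical helper: order the labels by group, then by decreasing p-value.
order_within_group <- function(df) {
  ord <- order(df$grp, -df$p)
  df$term <- factor(as.character(df$term),
                    levels = as.character(df$term)[ord])
  df
}

# Hypothetical toy data, only for illustration.
toy <- data.frame(
  grp  = c("B", "A", "B", "A"),
  term = c("t1", "t2", "t3", "t4"),
  p    = c(1e-10, 1e-30, 1e-20, 1e-5),
  stringsAsFactors = FALSE
)

sorted   <- order_within_group(toy)
shuffled <- order_within_group(toy[c(3, 1, 4, 2), ])  # same rows, different order

# The levels depend only on the values, not on the row order.
same_order <- identical(levels(sorted$term), levels(shuffled$term))
print(levels(sorted$term))
print(same_order)

# Optional plot, skipped when ggplot2 is not available.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  plt <- ggplot(sorted, aes(x = term, y = -log10(p), fill = grp)) +
    geom_col() +
    coord_flip()
  print(plt)
}
```

After `coord_flip()` the first level is drawn at the bottom, so within each group the bars grow longer from bottom to top; dropping the minus sign in `order()` reverses that.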