2011-02-15 32 views
2

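I want to do something incredibly simple: create a single boxplot for a complete data frame. However, searching for "combined boxplot" and related terms did not turn up anything. If I have overlooked an obvious way, please tell me. How do I create a combined boxplot from a data frame?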

I have the following data:

> theData 
    X20.7 X21.7 X22.7 X23.7 X24.7 X25.7 X26.7 X27.7 X28.7 X29.7 X30.7 X31.7 X32.7 X33.7 X34.7 X35.7 
1 99.64920 99.49319 99.49319 99.49319 99.49319 99.49319 99.80837 99.29348 99.29348 99.29348 99.29348 99.29348 99.29348 99.46376 99.46376 99.51554 
2 98.76469 98.60867 98.60867 98.60867 98.60867 98.60867 99.41553 98.40896 98.40896 98.40896 98.40896 98.40896 98.40896 98.74975 98.74975 98.54527 
3 98.37824 98.22222 98.22222 98.22222 98.22222 98.22222 98.70900 98.13767 98.13767 98.13767 98.13767 98.13767 98.13767 98.47846 98.47846 98.01791 
4 98.11356 97.95754 97.95754 97.95754 97.95754 97.95754 97.82447 97.93003 97.93003 97.93003 97.93003 97.93003 97.93003 98.27083 98.27083 97.81027 
5 97.80027 97.64424 97.64424 97.64424 97.64424 97.48632 97.43801 97.40158 97.40158 97.40158 97.40158 97.40158 97.40158 97.74239 97.74239 97.28181 
6 97.47825 97.32222 97.32222 97.32222 97.43795 97.12131 97.17333 97.03658 97.10158 97.10158 97.10158 97.10158 97.10158 97.44239 97.44239 96.98180 
> dput(theData) 
structure(list(X20.7 = c(99.6492, 98.7646913866934, 98.3782376564915, 
98.1135635544627, 97.8002672890352, 97.4782549804011), X21.7 = c(99.4931928571429, 
98.6086741582754, 98.2222160140822, 97.9575388921788, 97.6442390541023, 
97.3222230681959), X22.7 = c(99.4931928571429, 98.6086741582754, 
98.2222160140822, 97.9575388921788, 97.6442390541023, 97.3222230681959 
), X23.7 = c(99.4931928571429, 98.6086741582754, 98.2222160140822, 
97.9575388921788, 97.6442390541023, 97.3222230681959), X24.7 = c(99.4931928571429, 
98.6086741582754, 98.2222160140822, 97.9575388921788, 97.6442390541023, 
97.437947563131), X25.7 = c(99.4931928571429, 98.6086741582754, 
98.2222160140822, 97.9575388921788, 97.4863155584865, 97.121313307238 
), X26.7 = c(99.8083714285714, 99.415530164398, 98.7090041774867, 
97.8244717838903, 97.4380076185552, 97.173326388931), X27.7 = c(99.2934828571429, 
98.4089615689001, 98.1376722694449, 97.9300324124538, 97.401583100132, 
97.03657716757), X28.7 = c(99.2934828571429, 98.4089615689001, 
98.1376722694449, 97.9300324124538, 97.401583100132, 97.1015782240536 
), X29.7 = c(99.2934828571429, 98.4089615689001, 98.1376722694449, 
97.9300324124538, 97.401583100132, 97.1015782240536), X30.7 = c(99.2934828571429, 
98.4089615689001, 98.1376722694449, 97.9300324124538, 97.401583100132, 
97.1015782240536), X31.7 = c(99.2934828571429, 98.4089615689001, 
98.1376722694449, 97.9300324124538, 97.401583100132, 97.1015782240536 
), X32.7 = c(99.2934828571429, 98.4089615689001, 98.1376722694449, 
97.9300324124538, 97.401583100132, 97.1015782240536), X33.7 = c(99.4637585714286, 
98.7497473555799, 98.478463763926, 98.2708282766442, 97.7423900760775, 
97.4423915096353), X34.7 = c(99.4637585714286, 98.7497473555799, 
98.478463763926, 98.2708282766442, 97.7423900760775, 97.4423915096353 
), X35.7 = c(99.5155421428571, 98.5452656069643, 98.0179127183643, 
97.81026932055, 97.2818110000344, 96.9818010094329)), .Names = c("X20.7", 
"X21.7", "X22.7", "X23.7", "X24.7", "X25.7", "X26.7", "X27.7", 
"X28.7", "X29.7", "X30.7", "X31.7", "X32.7", "X33.7", "X34.7", 
"X35.7"), row.names = c(NA, 6L), class = "data.frame") 

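I want to summarize all of the data in one boxplot. However, when I plot it (i.e. boxplot(theData)), R automatically creates one group per column name. I also tried putting the complete data frame into a vector, but because my (full) dataset also contains NA values, I did not succeed. So far I have the following loop, which tries to turn the data frame into a vector so that it can be plotted as one boxplot: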

for(i in 1:ncol(allTheData)) {
    tmpData <- allTheData[,i]
    for(j in 1:length(tmpData)){
        if(!is.na(j)){
            tmpVector <- c(tmpVector, j)
        }
    }
}

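However, I feel I am overcomplicating this, and I doubt that a loop construction like this is good for performance in R.

How can I make a boxplot of a complete data frame that consists of a single box? That is, I do not want a boxplot made up of X20.7 to X35.7, but one "Overall" boxplot.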
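
As a side note on the loop above: as written it appends the index j rather than the value at that position, tests the index for NA, and never initialises tmpVector. The sketch below shows one way such a loop could collect the non-NA values, next to a vectorised equivalent. The small allTheData frame is a hypothetical stand-in (the full dataset is not shown in the question), and all columns are assumed to be numeric.

```r
# Hypothetical stand-in for the full dataset, with one NA
allTheData <- data.frame(a = c(1, NA, 3), b = c(4, 5, 6))

# Loop version: initialise the result, then append each non-NA value
tmpVector <- numeric(0)
for (i in seq_len(ncol(allTheData))) {
  tmpData <- allTheData[, i]
  for (value in tmpData) {          # iterate over the values, not the indices
    if (!is.na(value)) {
      tmpVector <- c(tmpVector, value)
    }
  }
}

# Vectorised equivalent: flatten all columns, then drop the NAs
flat <- unlist(allTheData, use.names = FALSE)
flat <- flat[!is.na(flat)]
```

Both tmpVector and flat can then be passed to boxplot() directly.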

Answers

5

Try something like this:

boxplot(unlist(theData)) 
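
To illustrate why this also copes with NA values, here is a minimal sketch on a made-up data frame (demo and its values are hypothetical, not the asker's data). unlist() flattens every column into one vector, and boxplot() ignores missing values when forming the box; boxplot.stats() is used only to inspect how many values were actually counted.

```r
# Hypothetical data frame with one missing value
demo <- data.frame(a = c(1, 2, NA), b = c(4, 5, 6))

# Flatten all columns into a single numeric vector (the NA is still in it)
allValues <- unlist(demo)

# The statistics behind the plot; n counts only the non-missing values
stats <- boxplot.stats(allValues)

# One "Overall" box for the whole data frame
boxplot(allValues, main = "Overall")
```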
2

朱拉

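How about using the melt function from reshape to convert your data to "long" format, and then calling boxplot on that? Assuming your data is in an object named df: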

> library(reshape) 
> df.m <- melt(df) 
Using as id variables 
> head(df.m) 
    variable value 
1 X20.7 99.64920 
2 X20.7 98.76469 
3 X20.7 98.37824 
4 X20.7 98.11356 
5 X20.7 97.80027 
6 X20.7 97.47825 
> 
> boxplot(df.m$value) 
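
If the reshape package is not installed, base R's stack() gives a similar long format. This is a swapped-in alternative to melt(), not what the answer above uses, and the df below is a hypothetical example. Note that stack() names its columns values and ind rather than value and variable.

```r
# Hypothetical data frame standing in for df
df <- data.frame(a = c(1, 2, NA), b = c(4, 5, 6))

# Long format: one row per cell, with the source column recorded in `ind`
long <- stack(df)

# A single overall box, ignoring which column each value came from
boxplot(long$values, main = "Overall")

# The per-column grouping is still available from the same long data
boxplot(values ~ ind, data = long)
```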
+0

@Joris - I think the OP is after an "aggregate" boxplot that is not separated by column name, unless I have misunderstood the question? – Chase 2011-02-15 14:12:32

+0

Just noticed. It would not make sense otherwise, since one could simply do boxplot(data). – 2011-02-15 14:13:30