2015-10-16 83 views
2

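How do I draw a layered scatterplot in R?

I'm learning R and want to draw a scatterplot of a large data frame (~55,000 rows). I'm using scatterplot from the car package: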

library(car)
d <- read.csv("patches.csv", header=TRUE)
scatterplot(energy ~ homogenity | label, data=d,
    ylab="energy", xlab="homogenity",
    main="Scatter Plot",
    labels=row.names(d))

where patches.csv contains the data frame (a sample is given below).

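I want to show the two label sets differently. With this much data the plot is very dense, so I get the result shown on the right (mostly red data visible). The image takes a while to render, so I can see the black-labelled data (below, left) before it is hidden in the final plot.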

Figure

Can I tell R to plot the red data first, or is there a better way to achieve what I want?

Here is a sample of my data:

label,channel,x,y,contrast,energy,entropy,homogenity 
1,21,460,76,0.991667,0.640399,0.421422,0.939831 
1,22,460,76,0.0833333,0.62375,0.364379,0.969445 
1,23,460,76,0.129167,0.422908,0.589938,0.935417 
1,24,460,76,0,1,0,1 
1,25,460,76,0,1,0,1 
1,26,460,76,0.0875,0.789627,0.253649,0.967361 
1,27,460,76,2.4,0.528516,0.700859,0.845558 
1,28,460,76,0.120833,0.562066,0.392998,0.945139 
1,29,460,76,0.0125,0.975234,0.0329461,0.99375 
1,30,460,76,0,1,0,1 
1,31,460,76,0.1625,0.384662,0.5859,0.929861 
0,0,483,82,0.404167,0.309505,0.61573,0.947222 
0,1,483,82,0.0166667,0.728559,0.221967,0.991667 
0,2,483,82,0,1,0,1 
0,3,483,82,0.416667,0.327083,0.644057,0.940972 
0,4,483,82,0.0208333,0.919054,0.0940364,0.989583 
0,5,483,82,0.416667,0.327083,0.644057,0.940972 
0,6,483,82,0,1,0,1 
0,7,483,82,0.0333333,0.794479,0.192471,0.983333 
0,8,483,82,0,1,0,1 
0,9,483,82,0,1,0,1 
0,10,483,82,0.0208333,0.958984,0.0502502,0.989583 
+0

Have you tried semi-transparent colours? That is a common remedy for _overplotting_: I think the argument for 'car::scatterplot' would be 'col = adjustcolor(palette()[1:2], .5)'. – lukeA

+1

Try 'ggplot'; have a look at 'geom_point(..., alpha = 0.3)', and maybe 'facet_grid()'. – zx8754
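
The semi-transparency idea from the first comment can be tried without any extra packages. The following is only a minimal sketch in base graphics: the data frame demo is simulated (it is not patches.csv), and the 0.5 opacity is an arbitrary starting value that would need tuning for ~55,000 rows.

```r
# Simulated stand-in for the real data (assumption: two labels, 0 and 1)
set.seed(42)
n <- 5000
demo <- data.frame(
  label      = rep(c(0, 1), each = n / 2),
  homogenity = runif(n, 0.8, 1),
  energy     = runif(n)
)

# Take the first two palette colours and make them half transparent
cols <- adjustcolor(palette()[1:2], alpha.f = 0.5)

# label 0 -> cols[1], label 1 -> cols[2]
plot(demo$homogenity, demo$energy,
     col = cols[demo$label + 1], pch = 16, cex = 0.6,
     xlab = "homogenity", ylab = "energy",
     main = "Semi-transparent points")
```

With transparent points, regions where both groups overlap show a blend of the two colours instead of only the group drawn last.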

Answers

1

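If you want to change the order of the colours, pass the argument col=2:1 to scatterplot; red is then plotted before black. You can also use the alpha function from the scales package to make your points semi-transparent (it takes a vector of colours and a vector of alpha values, so each colour can have a different opacity).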

## More data (simulated)
x <- rnorm(10000, 0.85, sd=0.15)
lab <- rep(1:2, length.out=10000)   # alternate the two labels
d <- data.frame(homogeneity=x,
       label=factor(lab),
       energy=rnorm(10000, lab^1.8*x^2-lab, sd=x))

library(car)
library(scales)   # for alpha
opacity <- c(0.3, 0.1) # opacity for each color
col <- 1:2    # black then red
scatterplot(energy ~ homogeneity | label, data=d,
      ylab="energy", xlab="homogeneity",
      main=paste0(palette()[col], "(", opacity, ")", collapse=","),
      col=alpha(col, opacity),
      labels=row.names(d))

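
If the goal is to control which group ends up on top regardless of colour, one option is to order the rows yourself and plot with base graphics, which draws points in the order they appear in the vectors. This is a sketch under assumptions, not something taken from the car documentation: d_sim is simulated data, and top is a hypothetical variable naming the label to draw last.

```r
# Simulated stand-in: label 0 is dense, label 1 is sparse (assumption)
set.seed(1)
d_sim <- data.frame(
  label      = rep(c(0, 1), times = c(9000, 1000)),
  homogenity = c(rnorm(9000, 0.90, 0.05), rnorm(1000, 0.85, 0.10)),
  energy     = c(rnorm(9000, 0.60, 0.20), rnorm(1000, 0.80, 0.10))
)

top <- 1  # hypothetical: the label that should be drawn last (on top)

# order() puts FALSE before TRUE, so rows with label == top move to the end
d_sorted <- d_sim[order(d_sim$label == top), ]

# Points are drawn in row order, so the 'top' group overplots the rest
plot(d_sorted$homogenity, d_sorted$energy,
     col = ifelse(d_sorted$label == top, "black", "red"),
     pch = 16, cex = 0.5,
     xlab = "homogenity", ylab = "energy",
     main = "Sparse group drawn last")
legend("topleft", legend = c("other labels", paste("label", top)),
       col = c("red", "black"), pch = 16)
```

Changing top to the other label flips the layering without touching the colour assignment.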

+0

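I want to plot the data in a different order, so that the sparse data is drawn on top of the dense data. The colours themselves don't matter at this point. – cdmh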

+0

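@cdmh You only need to change the 'col' variable; for example, 'col = 2:1' should plot red first and then black. See above. Just keep the colour that corresponds to the sparse data last in the 'col' vector. – jenesaisquoi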

0

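Along the same lines as the alpha suggestions above:

If you have a lot of points, identifying individual points is no longer meaningful. Instead, you probably want a representation of density. For that, use smoothScatter(x, y) and overlay the points you want to highlight with the usual points(morex, morey). You evidently already know how to use points (it takes the same arguments as plot), so this should be easy to implement with very little extra knowledge.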
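
A minimal sketch of that idea, assuming the dense group is shown as a density and the sparse group as individual points. The vectors below are simulated and their names are made up for illustration; smoothScatter relies on the KernSmooth package, which ships with standard R installations.

```r
# Simulated stand-ins for the dense and sparse groups (assumption)
set.seed(1)
dense_x  <- rnorm(9000, 0.90, 0.05)
dense_y  <- rnorm(9000, 0.60, 0.20)
sparse_x <- rnorm(1000, 0.85, 0.10)
sparse_y <- rnorm(1000, 0.80, 0.10)

# Axis limits that cover both groups, so no highlighted point is clipped
xlim <- range(dense_x, sparse_x)
ylim <- range(dense_y, sparse_y)

# Smoothed density of the dense group
smoothScatter(dense_x, dense_y, xlim = xlim, ylim = ylim,
              xlab = "homogenity", ylab = "energy",
              main = "Density with highlighted points")

# Individual points of the sparse group on top
points(sparse_x, sparse_y, col = "red", pch = 16, cex = 0.5)
```

Because the density is drawn first and points() adds to the existing plot, the highlighted group always ends up on top.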