2015-05-18 50 views

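Why do my geom_lines not break at the right colours?

I have run into a small problem with the geom_line() function. My data consist of frame-by-frame manual video assessments of certain behaviours by trained observers, which results in several thousand data points per observer. Essentially, each observer contributes a vector of 0s and 1s, where 1 marks the wanted behaviour and 0 the unwanted behaviour.

While playing around, I came up with the following: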

# a dataset from a manual videoanalysis with frame by frame behaviour assessment in binary. 0 = no, 1 = yes. 
data1<-read.csv("ObserversBehaviour.csv", ",", header=T) 

# my solution of giving each observer his own line, without having to transform the entire set 
Obsy0 <- rep(0,4528) 
Obsy1 <- rep(1,4528) 
Obsy2 <- rep(2,4528) 
Obsy3 <- rep(3,4528) 
Obsy4 <- rep(4,4528) 
Obsy5 <- rep(5,4528) 
Obsy6 <- rep(6,4528) 
Obsy7 <- rep(7,4528) 
Obsy8 <- rep(8,4528) 
Obsy9 <- rep(9,4528) 
Obsy10 <- rep(10,4528) 

ObsData <- data.frame(data1,Obsy0,Obsy1,Obsy2,Obsy3,Obsy4,Obsy5,Obsy6,Obsy7,Obsy8,Obsy9,Obsy10) 

#vector giving each observer a number 
Obsall <- c(0:10) 

#The list of individual frames of video M01 (4528 in total) 
Framerange <- ObsData[["Frames.M01"]] 

ylabels <- c("Observer0","Observer1","Observer2","Observer3","Observer4","Observer5","Observer6","Observer7","Observer8","Observer9","Observer10") 

#Ob<n>value is the 1 or 0 assessment 
#had to use as.factor() because for some reason my 0s and 1s are seen as continuous 
GraphObserve <- ggplot(ObsData, ylim=range(Obsall), xlim=max(Framerange), aes(x=Framerange)) +
geom_point(aes(x=Frames.M01, y = Obsy0, colour = as.factor(Ob0value), size=as.factor(Ob0value)), shape=15) + 
geom_point(aes(x=Frames.M01, y = Obsy1, colour = as.factor(Ob1value), size=as.factor(Ob1value)), shape=15) + 
geom_point(aes(x=Frames.M01, y = Obsy2, colour = as.factor(Ob2value), size=as.factor(Ob2value)), shape=15) + 
geom_point(aes(x=Frames.M01, y = Obsy3, colour = as.factor(Ob3value), size=as.factor(Ob3value)), shape=15) + 
geom_point(aes(x=Frames.M01, y = Obsy4, colour = as.factor(Ob4freeze.0.no.1.yes), size=as.factor(Ob4freeze.0.no.1.yes)), shape=15) + 
geom_point(aes(x=Frames.M01, y = Obsy5, colour = as.factor(Ob5value), size=as.factor(Ob5value)), shape=15) + 
geom_point(aes(x=Frames.M01, y = Obsy6, colour = as.factor(Ob6value), size=as.factor(Ob6value)), shape=15) + 
geom_point(aes(x=Frames.M01, y = Obsy7, colour = as.factor(Ob7value), size=as.factor(Ob7value)), shape=15) + 
geom_point(aes(x=Frames.M01, y = Obsy8, colour = as.factor(Ob8value), size=as.factor(Ob8value)), shape=15) + 
geom_point(aes(x=Frames.M01, y = Obsy9, colour = as.factor(Ob9value), size=as.factor(Ob9value)), shape=15) + 
geom_point(aes(x=Frames.M01, y = Obsy10, colour = as.factor(Ob10value), size=as.factor(Ob10value)), shape=15) + 

scale_colour_manual(breaks = c(0, 1), 
    labels = c("No","Yes"), 
    values = c("green4","red"), 
    name="Assessment")+ 
#needed to let the wanted behaviour stand out, so I changed pointsize 
scale_size_manual(breaks = c(0, 1), values=c(1,2), guide="none")+ 
scale_y_discrete(limit=Obsall, labels=ylabels, expand=c(0,0))+ 
scale_x_continuous(expand=c(0,0),breaks = round(seq(min(0), max(Framerange), by = 200),5000))+ 
expand_limits(y=c(1,-.5)) 

update_labels(GraphObserve,list(x="Frames (M01)",y ="Observers")) 

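This gives me a decent graph, with a nicely coloured dot for every data point, but because the points overlap and are still rather small, this is not the way I want to go. Instead of geom_point() I turned to geom_line(). The graph does show every colour break I want.

So next I changed every geom_point() line to geom_line() and left the rest unchanged. (The scale_size_manual() call became rather redundant.)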

geom_line(aes(x=Framerange, y=Obsy0, colour=as.factor(Ob0value)),size=14) + 
geom_line(aes(x=Framerange, y=Obsy1, colour=as.factor(Ob1value)),size=14) + 
geom_line(aes(x=Framerange, y=Obsy2, colour=as.factor(Ob2value)),size=14) + 
geom_line(aes(x=Framerange, y=Obsy3, colour=as.factor(Ob3value)),size=14) + 
geom_line(aes(x=Framerange, y=Obsy4, colour=as.factor(Ob4value)),size=14) + 
geom_line(aes(x=Framerange, y=Obsy5, colour=as.factor(Ob5value)),size=14) + 
geom_line(aes(x=Framerange, y=Obsy6, colour=as.factor(Ob6value)),size=14) + 
geom_line(aes(x=Framerange, y=Obsy7, colour=as.factor(Ob7value)),size=14) + 
geom_line(aes(x=Framerange, y=Obsy8, colour=as.factor(Ob8value)),size=14) + 
geom_line(aes(x=Framerange, y=Obsy9, colour=as.factor(Ob9value)),size=14) + 
geom_line(aes(x=Framerange, y=Obsy10, colour=as.factor(Ob10value)),size=14) + 

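I thought this would work out nicely, but it did not.

Instead of switching colour at every 0 and 1 in the file, the colour appears to switch only at the first and last occurrence in the dataset.

Graphs from the scripts above: http://imgur.com/2baseCa,bJa2Ab7#0

I cannot seem to find the mistake in my code, nor have I been able to find a solution online. Can anyone help me solve this?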

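Update

For a clearer overview, I have put the links to the resulting graphs below the scripts that produced them.

Following the suggestion to put my data in "long" format, I used the following script: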

data1<-read.csv("ObserversBehaviour.csv", ",", header=T) 

Frames<-data1[["Frames.M01"]] 
Obs<-paste0("Observer",0:10) 
Obsy <- sort(rep(0:10,4528),decreasing=F) 
Obsvalue <- stack(data1[,c(Obs)]) 
ObsData2 <- expand.grid(Frames=data1[["Frames.M01"]],Obs=paste0("Observer",0:10)) 
ObsData2$Observer = Obsy 
ObsData2$Assessment = Obsvalue$values 

ggplot(ObsData2, aes(Frames, Observer, colour=Assessment)) + 
    geom_line(show_guide=T) + 
    scale_y_discrete(limit=0:10, labels=Obs, expand=c(0,0))+ 
    scale_x_continuous(expand=c(0,0),breaks = round(seq(min(0), max(Frames), by = 200),5000))+ 
    expand_limits(y=c(1,.5)) + 
    #The manual colorcoding actually failed, since it keeps returning this error "Continuous value supplied to discrete scale". 
    scale_color_manual(breaks = c(0,1), 
       labels = c("No","Yes"), 
       values = c("green4","red"), 
       name="Assessment") 

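Although the colour now actually changes according to the behaviour assessment value, a new problem has appeared.

The values of Observer5-10 have all been replaced by the values of Observer10.

By changing several parameters, I found that changing the line size brings the values back to normal. However, the values of Observer10 then disappear completely.

Graphs from the new script: http://imgur.com/AiKeXLc,kPgIKKZ#1 (the second image is the first graph)

Given that I cannot change the colours manually (although I tried wrapping my values in as.factor() and as.discrete()), and combined with the problems above, I do not know what else to try.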
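The "Continuous value supplied to discrete scale" message is what ggplot2 typically reports when a numeric column is mapped to an aesthetic that is controlled by a manual scale. The following is only a minimal sketch under that assumption, using made-up data: `toy` is a hypothetical stand-in for `ObsData2`, not the original CSV. It converts `Assessment` with `factor()` inside `aes()` and sets `group = Obs` so that each observer is drawn as its own line.

```r
library(ggplot2)

# Toy stand-in for ObsData2 (hypothetical data, not the original CSV):
# 3 observers, 20 frames each, with a fixed 0/1 pattern.
toy <- expand.grid(Frames = 1:20, Obs = paste0("Observer", 0:2))
toy$Observer   <- as.integer(toy$Obs) - 1L   # numeric y position per observer
toy$Assessment <- rep(c(0, 0, 1, 1, 0), length.out = nrow(toy))

p <- ggplot(toy, aes(Frames, Observer,
                     colour = factor(Assessment),  # discrete, so a manual scale applies
                     group  = Obs)) +              # one line per observer
  geom_line() +
  scale_colour_manual(values = c("0" = "green4", "1" = "red"),
                      labels = c("No", "Yes"),
                      name   = "Assessment")
```

Printing `p` draws the plot. Without `group = Obs`, the groups are derived from the discrete variables in the plot, which here would be the two assessment levels rather than the observers.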

I am probably missing something very obvious here, being a beginner with R.

Update: the output of dput(head(ObsData2))

## structure(list(Frames = 1:6, Obs = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = c("Observer0", "Observer1", "Observer2", "Observer3", 
## "Observer4", "Observer5", "Observer6", "Observer7", "Observer8", 
## "Observer9", "Observer10"), class = "factor"), Observer = c(0L, 
## 0L, 0L, 0L, 0L, 0L), Assessment = c(0L, 0L, 0L, 0L, 0L, 0L)), .Names = c("Frames", 
## "Obs", "Observer", "Assessment"), out.attrs = structure(list(
##  dim = structure(c(4528, 11), .Names = c("Frames", "Obs")), 
##  dimnames = structure(list(Frames = c("Frames= 1", "Frames= 2", 
##  "Frames= 3", "Frames= 4", "Frames= 5", "Frames= 6", 
##  "Frames= 7", "Frames= 8", "Frames= 9", "Frames= 10", 
##  "Frames= 11", "Frames= 12", "Frames= 13", "Frames= 14", 
##  "Frames= 15", "Frames= 16", "Frames= 17", "Frames= 18", 
##  "Frames= 19", "Frames= 20", "Frames= 21", "Frames= 22", 
##  "Frames= 23", "Frames= 24", "Frames= 25", "Frames= 26", 
##  "Frames= 27", "Frames= 28", "Frames= 29", "Frames= 30", 
##  "Frames= 31", "Frames= 32", "Frames= 33", "Frames= 34", 
##  "Frames= 35", "Frames= 36", "Frames= 37", "Frames= 38", 
##  "Frames= 39", "Frames= 40", "Frames= 41", "Frames= 42", 
##  "Frames= 43", "Frames= 44", "Frames= 45", "Frames= 46", 
##  "Frames= 47", "Frames= 48", "Frames= 49", "Frames= 50", 
##  "Frames= 51", "Frames= 52", "Frames= 53", "Frames= 54", 
##  "Frames= 55", "Frames= 56", "Frames= 57", "Frames= 58", 
##  "Frames= 59", "Frames= 60", "Frames= 61", "Frames= 62", 
##  "Frames= 63", "Frames= 64", "Frames= 65", "Frames= 66", 
##  "Frames= 67", "Frames= 68", "Frames= 69", "Frames= 70", 
##  "Frames= 71", "Frames= 72", "Frames= 73", "Frames= 74", 
# Long patch of "Frames= <75-4502>" omitted due to space saving 
##  "Frames=4503", "Frames=4504", "Frames=4505", "Frames=4506", 
##  "Frames=4507", "Frames=4508", "Frames=4509", "Frames=4510", 
##  "Frames=4511", "Frames=4512", "Frames=4513", "Frames=4514", 
##  "Frames=4515", "Frames=4516", "Frames=4517", "Frames=4518", 
##  "Frames=4519", "Frames=4520", "Frames=4521", "Frames=4522", 
##  "Frames=4523", "Frames=4524", "Frames=4525", "Frames=4526", 
##  "Frames=4527", "Frames=4528"), Obs = c("Obs=Observer0", "Obs=Observer1", 
##  "Obs=Observer2", "Obs=Observer3", "Obs=Observer4", "Obs=Observer5", 
##  "Obs=Observer6", "Obs=Observer7", "Obs=Observer8", "Obs=Observer9", 
##  "Obs=Observer10")), .Names = c("Frames", "Obs"))), .Names = c("dim", 
## "dimnames")), row.names = c(NA, 6L), class = "data.frame") 

Could you post a sample of 'ObsData2'? That would make it easier to help you. Paste the output of 'dput(head(ObsData2))' into your question. – eipi10

Answers

0

Problem circumvented

With the help of a colleague, and by using geom_tile() instead of geom_line(), the graph is now exactly how I want it.

require("ggplot2") 

data1<-read.csv("ObserversBehaviour.csv", ",", header=T) 

Frames<-data1[["Frames.M01"]] 
Obs.lab<-paste0("Observer",0:10) 
Obsy <- sort(rep(1:11,4528),decreasing=F) 
Obsvalue <- stack(data1[,c(Obs.lab)]) 

ObsData2 <- expand.grid(Frames=data1[["Frames.M01"]],Obs.lab=paste0("Observer",0:10)) 
ObsData2$Observer = Obsy 
ObsData2$Assessment = Obsvalue$values 

GraphObserve <- ggplot(ObsData2, aes(Frames, Observer, height=.9)) + 
    geom_tile(aes(fill = factor(Assessment)))+ 
    scale_fill_manual(values=c("0"="green4", "1"="red"), labels= c("No", "Yes"))+ 
    scale_y_discrete(expand=c(0,0), limit=1:11, labels=Obs.lab)+ 
    scale_x_continuous(expand=c(0,0), breaks = round(seq(min(0), max(Frames), by = 200),5000)) 
update_labels(GraphObserve,list(x="Frames (M01)",y ="Observers"))

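The colour breaks occur exactly where they should, there is no overlap, and all observers are listed in the graph.

Although this does not actually solve the problems in my earlier scripts, it does give a better result.

Final graph: http://i.imgur.com/pW8Qh0I.png

Thank you, eipi10, for showing me how to condense my script.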
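One possible explanation for the original geom_line() behaviour, offered as an assumption and not confirmed anywhere in this thread: when a factor is mapped to colour, ggplot2 also uses it by default to split the data into groups. All 0 frames then form one line and all 1 frames another, each running from its first to its last frame at the same height, so the two lines overlap. The sketch below uses made-up data (`one_obs` is hypothetical) and compares the default grouping with `group = 1`.

```r
library(ggplot2)

# Hypothetical single observer: 12 frames with a 0/1 assessment.
one_obs <- data.frame(Frames = 1:12,
                      y      = 0,
                      value  = c(0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1))

# Default: the factor mapped to colour also defines the groups,
# giving two overlapping horizontal lines (one per level).
p_default <- ggplot(one_obs, aes(Frames, y, colour = factor(value))) +
  geom_line()

# group = 1 keeps every frame in a single line, so the colour can
# change from segment to segment along that line.
p_single <- ggplot(one_obs, aes(Frames, y, colour = factor(value), group = 1)) +
  geom_line()

# Number of groups ggplot2 builds for each plot
n_default <- length(unique(layer_data(p_default)$group))
n_single  <- length(unique(layer_data(p_single)$group))
```

Even with a single group, each segment between two frames is drawn in one colour, so the breaks are only approximate. That may be one reason geom_tile(), which draws one cell per frame, gave the cleaner result here.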

3

This would be much easier if you put your data in "long" format. Here is an example with fake data:

## Create fake data in long format 
ObsData = expand.grid(Frames=1:4258, Obs=paste0("Observer",0:10)) 

# Add y values 
set.seed(10) 
ObsData$y = cumsum(rnorm(4258*11)) 

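In a long-format data frame, all observers are "stacked" into a single factor variable (Obs) with 11 categories, one for each observer. You can now use it as the grouping variable for the colour aesthetic in ggplot.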

## Plot with a different color for each observer 
ggplot(ObsData, aes(Frames, y, colour=Obs)) + 
     geom_line() 

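Here is what the graph looks like with the default colours, but you can change them by adding scale_colour_manual() to your plot and setting whatever colours you like.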

[Image: line plot of the fake data with the default colours]
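The fake data above are created directly in long format. If the data start out wide, with one column per observer as in the question, base R's stack() is one way to reshape them. This is a small sketch with hypothetical column names and values (`wide`, `Observer0` to `Observer2`), not the original file.

```r
# Hypothetical wide data: one row per frame, one 0/1 column per observer.
wide <- data.frame(Frames    = 1:5,
                   Observer0 = c(0, 0, 1, 1, 0),
                   Observer1 = c(0, 1, 1, 0, 0),
                   Observer2 = c(1, 1, 0, 0, 0))

obs_cols <- paste0("Observer", 0:2)

# stack() puts all values in one column (`values`) and records the
# source column of each value in a factor (`ind`).
stacked <- stack(wide[, obs_cols])

# Long format: the frame numbers are repeated once per observer.
long <- data.frame(Frames     = rep(wide$Frames, times = length(obs_cols)),
                   Obs        = stacked$ind,
                   Assessment = stacked$values)
```

The resulting `long` has one row per frame and observer, with `Obs` as a factor that can be mapped to colour or group.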


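Thanks for the quick reply and the excellent suggestion. Converting the data to "long" format did help a lot, but it has not solved the whole problem yet. See my updated question for more details. –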
