Plotting data from multiple files with ggplot

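I have a time-series data file containing the concentrations of four metabolites (A, B, AE and E) over time. I have many files of this type (about 100). I want to plot the time series of all four metabolites from all files in a single plot, with each metabolite assigned its own colour.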

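I put together the code below, but it only plots the data from one file (the last one). I think this is because every call to ggplot() creates a new plot. I tried creating the plot outside the for loop, but that did not work.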

p = NULL 

for(i in 1:length(filesToProcess)){ 
    fileName = filesToProcess[i] 

    fileContent = read.csv(fileName) 
    #fileContent$Time <- NULL 

    p <- ggplot()+ 
    geom_line(data = fileContent, aes(x = Time, y = A, color = "A"), size =0.8) + 
    geom_line(data = fileContent, aes(x = Time, y = B, color = "B"), size =0.8) + 
    geom_line(data = fileContent, aes(x = Time, y = AE, color = "AE"), size =0.8) + 
    geom_line(data = fileContent, aes(x = Time, y = E, color = "E"), size =0.8) + 
    xlab('Time') + 
    ylab('Metabolite Concentration')+ 
    ggtitle('Step Scan') + 
    labs(color="Metabolites") 

} 
plot(p)

Below is the resulting plot.

Sample files can be found here.


2016-07-31 SriniShine

I usually take the following approach (untested, for lack of a reproducible example):

read_one <- function(f, ...){ 
    w <- read.csv(f, ...) 
    m <- reshape2::melt(w, id = c("Time")) 
    m$source <- tools::file_path_sans_ext(f) # keep track of filename 
    m 
} 

plot_one <- function(d){ 
    ggplot(d, aes(x=Time, y=value)) + 
    geom_line(aes(colour=variable), size = 0.8) + 
    ggtitle('Step Scan') + 
    labs(x = 'Time', y = 'Metabolite Concentration', color="Metabolites") 
} 

## strategy 1 (multiple independent plots) 

ml <- lapply(filesToProcess, read_one) 
pl <- lapply(ml, plot_one) 

gridExtra::grid.arrange(grobs = pl) 

## strategy 2: facetting 

m <- plyr::ldply(filesToProcess, read_one) 
ggplot(m, aes(x=Time, y=value)) + 
    facet_wrap(~source) + 
    geom_line(aes(colour=variable), size = 0.8) + 
    ggtitle('Step Scan') + 
    labs(x = 'Time', y = 'Metabolite Concentration', color="Metabolites")


2016-07-31 12:01:15 baptiste

Thank you for the answer. I am trying to wrap my head around your solution; it looks a bit complicated to me. I have also added some sample files. – SriniShine

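~~Since plot(p) is outside the loop, it only draws the last plot that was generated. Move plot(p) inside the loop.~~

~~Note: the question is a little ambiguous, but I am assuming you want one plot per input file.~~

Edit: to put all the data in one plot, assuming all of your files have the same columns in the same order: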

all_data <- lapply(filesToProcess, read.csv) 
fileContent <- do.call(rbind, all_data)

Then you can run the ggplot code as above (without the loop).
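One possible pitfall with a single combined data frame is that geom_line() may join rows from different files into one line, because nothing tells it where one file ends and the next begins. The sketch below is one way to handle that, under stated assumptions: the demo CSV files, the `File` column, and the `long` data frame are hypothetical names introduced only for illustration, and the column names Time, A, B, AE and E are taken from the question's code.

```r
library(ggplot2)

# Hypothetical demo data: two small CSV files in a temporary directory,
# standing in for the real files listed in filesToProcess.
demo_dir <- tempfile("stepscan_")
dir.create(demo_dir)
for (k in 1:2) {
  demo <- data.frame(Time = 1:5, A = k * (1:5), B = k * (5:1),
                     AE = rep(k, 5), E = rep(2 * k, 5))
  write.csv(demo, file.path(demo_dir, paste0("run", k, ".csv")),
            row.names = FALSE)
}
filesToProcess <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Read each file and record which file every row came from.
all_data <- lapply(filesToProcess, function(f) {
  d <- read.csv(f)
  d$File <- basename(f)
  d
})
fileContent <- do.call(rbind, all_data)

# Long format with base R: one row per (File, Time, Metabolite).
metabolites <- c("A", "B", "AE", "E")
long <- data.frame(
  File = rep(fileContent$File, times = length(metabolites)),
  Time = rep(fileContent$Time, times = length(metabolites)),
  Metabolite = rep(metabolites, each = nrow(fileContent)),
  Concentration = unlist(fileContent[metabolites], use.names = FALSE)
)

# group = interaction(File, Metabolite) draws one line per file and metabolite,
# while the colour still depends on the metabolite alone.
p <- ggplot(long, aes(x = Time, y = Concentration, colour = Metabolite,
                      group = interaction(File, Metabolite))) +
  geom_line() +
  ggtitle("Step Scan") +
  labs(x = "Time", y = "Metabolite Concentration", colour = "Metabolites")
```

Calling print(p) displays the plot. With the real data, only the block that creates the demo files would be dropped; the rest assumes every file shares the same column names.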


2016-07-31 14:06:27

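@Marchand I need a plot of the data from all the files. – SriniShine

@Marchand Thank you for the suggestion. Yes, all files have the same columns in the same order (Time, A, A | E, B, E). However, I tried your approach and the plot does not look the way it should. I have also added some sample files. – SriniShine

I think I have solved the problem. I admit the answer is a bit crude, but initializing the "p" variable outside the for loop solves it.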

filesToProcess = readLines("FilesToProcess.txt") 

#initializing the variable with ggplot() object 
p <- ggplot() 

for(i in 1:length(filesToProcess)){ 
    fileName = filesToProcess[i] 
    fileContent = read.csv(fileName) 

    p <- p + 
    geom_line(data = fileContent, aes(x = Time, y = A, color = "A"), size =0.8) + 
    geom_line(data = fileContent, aes(x = Time, y = B, color = "B"), size =0.8) + 
    geom_line(data = fileContent, aes(x = Time, y = AE, color = "AE"), size =0.8) + 
    geom_line(data = fileContent, aes(x = Time, y = E, color = "E"), size =0.8) 

} 

p <- p + theme_bw() + scale_x_continuous(breaks=1:20) + 
    xlab('Time') + 
    ylab('Metabolite Concentration')+ 
    ggtitle('Step Scan') + 
    labs(color="Legend text") 
plot(p)
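The same accumulation can also be written without reassigning p inside a loop, because ggplot2 accepts a list of layers on the right-hand side of `+`. This is only a sketch: the demo CSV files and the helper `layers_for_file()` are hypothetical names introduced for illustration, and the columns Time, A, B, AE and E are assumed from the code above.

```r
library(ggplot2)

# Hypothetical demo data with the same columns as the real files.
demo_dir <- tempfile("stepscan_")
dir.create(demo_dir)
for (k in 1:2) {
  demo <- data.frame(Time = 1:5, A = k * (1:5), B = k * (5:1),
                     AE = rep(k, 5), E = rep(2 * k, 5))
  write.csv(demo, file.path(demo_dir, paste0("run", k, ".csv")),
            row.names = FALSE)
}
filesToProcess <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Hypothetical helper: build the four metabolite layers for one file.
layers_for_file <- function(f) {
  d <- read.csv(f)
  list(
    geom_line(data = d, aes(x = Time, y = A,  colour = "A")),
    geom_line(data = d, aes(x = Time, y = B,  colour = "B")),
    geom_line(data = d, aes(x = Time, y = AE, colour = "AE")),
    geom_line(data = d, aes(x = Time, y = E,  colour = "E"))
  )
}

# Flatten the per-file lists into one list: four layers per file.
all_layers <- do.call(c, lapply(filesToProcess, layers_for_file))

# Adding the list to an empty plot adds every layer in order.
p <- ggplot() + all_layers +
  theme_bw() +
  ggtitle("Step Scan") +
  labs(x = "Time", y = "Metabolite Concentration", colour = "Legend text")
```

Either way the plot ends up with four layers per file, so about 100 files means roughly 400 layers; reshaping the data to long format and using a single geom_line() is usually lighter.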


2016-08-01 11:05:31 SriniShine
