2017-07-19

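Putting the correlation output of a function into a data frame

I am running a lot of correlations of populations over time. I have split them up and am running them through a function with lapply. I want to put the output of each correlation into a data frame (i.e. each row would hold the information for one correlation, with columns: name of the correlation, p-value, t statistic, df, CIs, corcoeff).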

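I have two problems:

  1. I don't know how to extract the name of the correlation from within the split.
  2. I can get my function to run the correlations over the split (600+ correlations), but I can't get it to print into a data frame. Clarification: when I run the function without the loop, it performs all 600 correlations, one for each group. However, when I add the loop, it produces NULL for every group in the split.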

Here is what I have so far:

> head(Birds) #Shortened for this Post 
    Location   Species Year Longitude Latitude Section Total  Percent Family
1 Chiswell A Kittiwake 1976 -149.5847 59.59559 Central   310 16.78397   Gull

BigSplit<-split(Birds,list(Birds$Family, Birds$Location, 
Birds$Section,Birds$Species), drop=T) #A list of Dataframes 

#Make empty data frame 
resultcor <- data.frame(Name = character(), 
         tvalue = character(), 
         degreeF = character(), 
         pvalue = character(), 
         CIs = character(), 
         corcoeff = character(),stringsAsFactors = F) 

WorkFunc <- function(dataset) { 
    data.name = substitute(dataset) #Use "dataset" as substitute for actual dataset name 

    #Correlation between Year and population Percent 
    try({ 
      correlation <- cor.test(dataset$Year, dataset$Percent, method = "pearson")  
    }, silent = TRUE) 

    for (i in 1:nrow(resultcor)) { 
      resultcor$Name[i] <- ??? #These ??? are not in the code, just highlighting Issue 1 
      resultcor$tvalue[i] <- correlation$dataset$statistic 
      resultcor$degreeF[i] <- correlation$dataset$parameter 
      resultcor$pvalue[i] <- correlation$dataset$p.value 
      resultcor$CIs[i] <- correlation$dataset$conf.int 
      resultcor$corcoeff[i] <- correlation$dataset$estimate 
    } 
} 

lapply(BigSplit, WorkFunc) 

Any help would be appreciated, thanks!

Check out the package `broom`; it does all of this for you. – sinQueso
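As a rough illustration of that suggestion, and assuming the broom package is installed, `tidy()` converts the result of `cor.test()` into a one-row data frame. The list `toy_split` below is a hypothetical stand-in for `BigSplit`, built from simulated values, so this is only a sketch of the idea rather than a drop-in solution.

```r
library(broom)  # assumption: installed via install.packages("broom")

set.seed(1)
# Hypothetical stand-in for BigSplit: a named list of data frames
toy_split <- list(
  groupA = data.frame(Year = 2000:2009, Percent = rnorm(10)),
  groupB = data.frame(Year = 2000:2009, Percent = rnorm(10))
)

# tidy() turns each htest object into a one-row table of results
tidy_list <- lapply(toy_split, function(d) {
  as.data.frame(tidy(cor.test(d$Year, d$Percent, method = "pearson")))
})

# Stack the rows and keep the group name as a column
result <- do.call(rbind, tidy_list)
result$Name <- names(tidy_list)
print(result)
```

The resulting columns include `estimate`, `statistic`, `p.value`, `parameter`, `conf.low` and `conf.high`, which correspond to the fields asked for in the question.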

Where is the split? Please show that code. *I can't get it to print into a data frame*... please explain what happens. What is *BigSplit*, a list of data frames? – Parfait

@Parfait I have edited for clarity. _BigSplit_ is a list of data frames. Thanks – LearningTheMacros

Answer

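Consider using Map (a variant of mapply), where you pass in both the elements and the names of BigSplit. Map will output a list of data frames, which you can then bind together at the end with do.call(). The code below assumes BigSplit is a named list.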

WorkFunc <- function(dataset, dataname) {
  tryCatch({
    # Correlation between Year and population Percent
    correlation <- cor.test(dataset$Year, dataset$Percent, method = "pearson")
    CIs <- correlation$conf.int  # NULL when no confidence interval is computed

    # One-row data frame holding the results for this group
    data.frame(
      Name      = dataname,
      tvalue    = correlation$statistic,
      degreeF   = correlation$parameter,
      pvalue    = correlation$p.value,
      CI_lower  = if (is.null(CIs)) NA else CIs[[1]],
      CI_higher = if (is.null(CIs)) NA else CIs[[2]],
      corcoeff  = correlation$estimate
    )
  }, error = function(e) {
    # cor.test() failed (e.g. too few observations): return an empty data frame
    # with the same columns so the final rbind() still works
    data.frame(
      Name      = character(0),
      tvalue    = numeric(0),
      degreeF   = numeric(0),
      pvalue    = numeric(0),
      CI_lower  = numeric(0),
      CI_higher = numeric(0),
      corcoeff  = numeric(0)
    )
  })
}

# BUILD CORRELATION DATAFRAMES INTO LIST 
cor_df_list <- Map(WorkFunc, BigSplit, names(BigSplit)) 
cor_df_list <- mapply(WorkFunc, BigSplit, names(BigSplit), SIMPLIFY=FALSE) # EQUIVALENT 

# ROW BIND ALL DATAFRAMES TO FINAL LARGE DATAFRAME 
finaldf <- do.call(rbind, cor_df_list) 
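
To make the pattern easier to try in isolation, here is a minimal sketch on simulated data. The data frame `toy_birds`, the site names and the helper `cor_row` are hypothetical and only mimic the columns used in the question; error handling and confidence intervals are left out to keep it short.

```r
set.seed(42)
# Hypothetical toy data with the same column names as in the question
toy_birds <- data.frame(
  Family   = "Gull",
  Location = rep(c("SiteA", "SiteB"), each = 10),
  Section  = "Central",
  Species  = "Kittiwake",
  Year     = rep(1976:1985, times = 2),
  Percent  = runif(20, 0, 100),
  stringsAsFactors = FALSE
)

# Named list of data frames, one per group
toy_split <- split(
  toy_birds,
  list(toy_birds$Family, toy_birds$Location, toy_birds$Section, toy_birds$Species),
  drop = TRUE
)

# Simplified worker: returns a one-row data frame for one group
cor_row <- function(dataset, dataname) {
  ct <- cor.test(dataset$Year, dataset$Percent, method = "pearson")
  data.frame(
    Name     = dataname,
    tvalue   = unname(ct$statistic),
    degreeF  = unname(ct$parameter),
    pvalue   = ct$p.value,
    corcoeff = unname(ct$estimate),
    stringsAsFactors = FALSE
  )
}

# Pass each data frame together with its name, then stack the rows
rows <- Map(cor_row, toy_split, names(toy_split))
toy_result <- do.call(rbind, rows)
rownames(toy_result) <- NULL
print(toy_result)
```

The key design choice is that the worker returns its row instead of writing into a data frame defined outside the function. A function whose last expression is a `for` loop returns NULL, and assignments made to an outer data frame inside the function only change a local copy, which is consistent with the NULL output described in the question.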

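I followed this code, but I seem to get this error:
Error in data.frame(Name = dataname, tvalue = correlation$statistic, degreeF = correlation$parameter, :
  arguments imply differing number of rows: 1, 0
In addition: Warning messages:
1: In data.frame(Name = dataname, tvalue = correlation$statistic, degreeF = correlation$parameter, :
  row names were found from a short variable and have been discarded
2: In data.frame(Name = dataname, tvalue = correlation$statistic, degreeF = correlation$parameter, :
  row names were found from a short variable and have been discarded
– LearningTheMacros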

Apologies. I forgot to add the commas in `data.frame()` to separate the values. – Parfait

Oh no, I got that error too, but I understood it and added the commas. This error occurs after the commas were added. – LearningTheMacros