2017-01-17 92 views
2

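R rbind - numbers of columns of arguments do not match: how can I skip an entry if certain column names are missing?

I am building a data frame from a list of streamed weather data, but I think some key weather fields are missing from some entries, so rbind fails with the following error: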

Error in rbind(deparse.level, ...) : 
    numbers of columns of arguments do not match 

My code:

weatherDf <- data.frame() 

for(i in weatherData) { 
    # Get the airport code. 
    airport <- i$airport 

    # Get the date. 
    date <- as.POSIXct(as.numeric(as.character(i$timestamp))/1000, origin="1970-01-01", tz="UTC-1") 

    # Get the data in dailysummary only. 
    dailySummary <- i$dailysummary 

    weatherDf <- rbind(weatherDf, ldply(
     list(dailySummary), 
     function(x) c(airport, format(as.Date(date), "%Y-%m-%d"), x[["meanwindspdi"]], x[["meanwdird"]], x[["meantempm"]], x[["humidity"]]) 
    )) 
} 

So how can I make sure that the data for the following keys exists:

meanwindspdi 
meanwdird 
meantempm 
humidity 

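If any one of them does not exist, then skip that whole entry. Is that possible?

EDIT:

The content of weatherData is in a jsfiddle (I can't post it here because it is too long, and I don't know where the best place to publicly share R data is...)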

EDIT 2:

I get an error when I try to export the data to a txt file:

> write.table(weatherData,"/home/teelou/Desktop/data/data.txt",sep="\t",row.names=FALSE) 
Error in data.frame(date = list(pretty = "January 1, 1970", year = "1970", : 
    arguments imply differing number of rows: 1, 0 

What does this mean? There seem to be some errors in the data...

EDIT 3:

I have exported my entire data set as an .RData file to my Google Drive:

https://drive.google.com/file/d/0B_w5RSQMxtRSbjdQYWJMX3pfWXM/view?usp=sharing

If you use RStudio, you can import the data.

EDIT 4:

target_names <- c("meanwindspdi", "meanwdird", "meantempm", "humidity") 

# If it has data then loop it. 
if (!is.null(weatherData)) { 
    # Initialize a data frame. 
    weatherDf <- data.frame() 

    for(i in weatherData) { 
     if (!all(target_names %in% names(i))) 
      next 

     # Get the airport code. 
     airport <- i$airport 

     # Get the date. 
     date <- as.POSIXct(as.numeric(as.character(i$timestamp))/1000, origin="1970-01-01", tz="UTC-1") 

     # Get the data in dailysummary only. 
     dailySummary <- i$dailysummary 

     weatherDf <- rbind(weatherDf, ldply(
      list(dailySummary), 
      function(x) c(airport, format(as.Date(date), "%Y-%m-%d"), x[["meanwindspdi"]], x[["meanwdird"]], x[["meantempm"]], x[["humidity"]]) 
     )) 
    } 

    # Rename column names. 
    colnames(weatherDf) <- c("airport", "key_date", "ws", "wd", "tempi", 'humidity') 

    # Convert selected columns of weatherDf to numeric.
    columns <-c("ws", "wd", "tempi", "humidity") 
    weatherDf[, columns] <- lapply(columns, function(x) as.numeric(weatherDf[[x]])) 
} 

Checking weatherDf:

> View(weatherDf) 
Error in .subset2(x, i, exact = exact) : subscript out of bounds 
+2

Try 'dput(head(weatherData, 50))' to share your data. – nrussell

+0

@nrussell I have shared all my data in a jsfiddle. Please see my edit above. Thanks. – laukok

+0

When I run your code, I don't get any errors? – G5W

Answers

1

You can use next to skip the current iteration of the loop and move on to the next one:

target_names <- c("meanwindspdi", "meanwdird", "meantempm", "humidity") 

for(i in weatherData) { 
    # Skip this entry unless every target name is present. 
    if (!all(target_names %in% names(i))) 
        next 
    # continue with loop... 
} 

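A minimal runnable sketch of this approach, assuming (as in the question's code) that the four keys are stored inside each entry's dailysummary element rather than at the top level of the entry. The toy weather_data list, its values, and the names rows and weather_df are made up for illustration; the real data may be structured differently. The time zone is set to "UTC" here as a simplifying assumption.

```r
# Toy stand-in for weatherData (hypothetical values; the real structure may differ).
weather_data <- list(
  list(airport = "AAA", timestamp = "1484611200000",
       dailysummary = list(meanwindspdi = "5", meanwdird = "180",
                           meantempm = "12", humidity = "80")),
  list(airport = "BBB", timestamp = "1484697600000",
       dailysummary = list(meanwindspdi = "7", meantempm = "9"))  # two keys missing
)

target_names <- c("meanwindspdi", "meanwdird", "meantempm", "humidity")

rows <- list()
for (entry in weather_data) {
  summary_i <- entry$dailysummary

  # Skip the entry unless every required key is present in dailysummary.
  if (!all(target_names %in% names(summary_i))) next

  # Timestamps are assumed to be milliseconds since the Unix epoch.
  date <- as.POSIXct(as.numeric(entry$timestamp) / 1000,
                     origin = "1970-01-01", tz = "UTC")

  rows[[length(rows) + 1]] <- data.frame(
    airport  = entry$airport,
    key_date = format(date, "%Y-%m-%d"),
    ws       = as.numeric(summary_i[["meanwindspdi"]]),
    wd       = as.numeric(summary_i[["meanwdird"]]),
    tempi    = as.numeric(summary_i[["meantempm"]]),
    humidity = as.numeric(summary_i[["humidity"]]),
    stringsAsFactors = FALSE
  )
}

# Bind all kept rows in one call; every row has the same six columns.
weather_df <- do.call(rbind, rows)
```

Collecting the rows in a list and calling rbind once at the end avoids growing the data frame inside the loop, and building each row with data.frame() gives every row the same named columns.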
+0

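Thanks for your answer. But now it no longer gives me any data... Please see my edit 4. I have exported all my data in edit 3. – laukok

+1

@teelou The target names are not in any of the data frames in "weatherData". Therefore, every iteration is skipped. –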
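The point about skipped iterations can be checked on a single entry. The sketch below uses a hypothetical entry (values invented, and dailysummary assumed to be a named list) to show that names() on an entry only returns its top-level names, and why a missing key produces a shorter row: [[ ]] returns NULL for a name that is not in the list, and c() silently drops NULL.

```r
# Hypothetical entry; "meanwdird" and "humidity" are deliberately absent.
entry <- list(
  airport = "AAA",
  dailysummary = list(meanwindspdi = "5", meantempm = "12")
)

target_names <- c("meanwindspdi", "meanwdird", "meantempm", "humidity")

# Top-level names are "airport" and "dailysummary", so this check is FALSE
# for every entry and a loop that uses it skips everything.
top_level_ok <- all(target_names %in% names(entry))

# Checking one level down shows which keys are actually missing.
present <- target_names %in% names(entry$dailysummary)
missing_keys <- target_names[!present]

# A missing key yields NULL, and c() drops NULL, so the row has 2 elements
# instead of 3. Rows of different lengths are what make rbind fail.
row <- c(entry$airport,
         entry$dailysummary[["meanwindspdi"]],
         entry$dailysummary[["humidity"]])
row_length <- length(row)
```

Running names(weatherData[[1]]) and names(weatherData[[1]]$dailysummary) on the real data is a quick way to confirm where the keys live before writing the check.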