Why does a logical test on a data frame column produce a row of NAs?

I have a data frame of the following form:

> df
  stationid  station      gear sample     lat    lon       date depth
1     25679          CORBOX150    UE4 53.9015 7.8617 15.07.1987    19
2     25681 UE9 Kern CORCRB050    UE9 54.0167 7.3982 15.07.1987    33
3        NA                           54.0167 7.3982 15.07.1987    33

A logical test on stationid gives me, besides the correct first row, an annoying row made up entirely of NAs:

> df[df$stationid=="25679",]
   stationid station      gear sample     lat    lon       date depth
1      25679         CORBOX150    UE4 53.9015 7.8617 15.07.1987    19
NA        NA    <NA>      <NA>   <NA>      NA     NA       <NA>    NA

Why is that?

My guess is that something in row 3 of df is causing the trouble.

Here is the data:

df<-structure(list(stationid = c(25679L, 25681L, NA), station = structure(c(2L, 
3L, 1L), .Label = c("", " ", "UE9 Kern"), class = "factor"), 
gear = structure(c(2L, 3L, 1L), .Label = c("", "CORBOX150", 
"CORCRB050"), class = "factor"), sample = structure(c(2L, 
3L, 1L), .Label = c("", "UE4", "UE9"), class = "factor"), 
lat = c(53.9015, 54.0167, 54.0167), lon = c(7.8617, 7.3982, 
7.3982), date = structure(c(1L, 1L, 1L), .Label = "15.07.1987", class = "factor"), 
depth = c(19L, 33L, 33L)), .Names = c("stationid", "station", 
"gear", "sample", "lat", "lon", "date", "depth"), class = "data.frame", row.names = c(NA, 
-3L))

Source

2012-08-24 Janhoo

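This is because you have an 'NA' in the 'stationid' column; use 'which'. This 'df[which(df$stationid == "25679"), ]' should work – dickoa

Any comparison with NA yields NA (see http://cran.r-project.org/doc/manuals/R-intro.html#Missing-values), and an NA in a logical row index returns a row filled with NAs rather than dropping that row. You can use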

df[df$stationid==25679 & !is.na(df$stationid),]

or (as suggested in the comment above)

df[which(df$stationid==25679),]

or

subset(df,stationid==25679)

(subset has side effects that are sometimes unwanted, but in this case it does exactly what you want)
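To make the mechanism concrete, here is a minimal sketch on a cut-down copy of the question's data. Keeping only the stationid and depth columns is a simplification for brevity, and the names keep, by_mask, by_which and by_subset are purely illustrative.

```r
# Cut-down copy of the question's data: two of the eight columns.
df <- data.frame(stationid = c(25679L, 25681L, NA),
                 depth     = c(19L, 33L, 33L))

# Comparing with NA yields NA, not FALSE.
keep <- df$stationid == 25679
print(keep)        # TRUE FALSE NA

# An NA in a logical row index produces an all-NA row at that position,
# so this prints the matching row plus one row of NAs.
print(df[keep, ])

# Three ways to get rid of the NA in the index.
by_mask   <- df[keep & !is.na(df$stationid), ]  # NA & FALSE is FALSE
by_which  <- df[which(keep), ]                  # which() ignores NA
by_subset <- subset(df, stationid == 25679)     # subset() treats NA as FALSE

print(by_mask)
print(by_which)
print(by_subset)
```

All three calls should return only the row for station 25679.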

Source

2012-08-24 15:24:28

Thanks Ben, that CRAN link explains why I get a row full of NAs instead of the other row, row 3 of df! Cheers – Janhoo

Another solution is df[df$stationid==25679 & !is.na(df$stationid),]. It is longer, but more explicit.
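One case where the explicit form may pay off is when the test has to be negated. The sketch below is only an illustration on a cut-down copy of the question's data; missing_id is a hypothetical station id that does not occur in it, and the other variable names are likewise made up for the example.

```r
# Cut-down copy of the question's data: two of the eight columns.
df <- data.frame(stationid = c(25679L, 25681L, NA),
                 depth     = c(19L, 33L, 33L))

# Hypothetical station id that is not present in df.
missing_id <- 99999L

# Explicit mask: TRUE only where stationid is known and equals the id.
is_match <- df$stationid == missing_id & !is.na(df$stationid)

# Negating the mask keeps every row, including the one with NA.
rest_mask <- df[!is_match, ]

# which() returns integer(0) here, and df[-integer(0), ] selects no rows.
rest_which <- df[-which(df$stationid == missing_id), ]

print(nrow(rest_mask))   # 3
print(nrow(rest_which))  # 0
```

So the logical mask behaves the same whether or not anything matches, while negating the result of which() silently drops everything when there is no match.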

Source

2012-08-24 15:24:06 Alan
