2015-05-01 18 views

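I want to join a non-spatial object (Merged_Census2011) to a shapefile of polygons (LDN_wards) by "ward.name". It seemed to work until I looked at the newly created object and saw that all the data had turned into NAs. Below is how I proceeded. Why do I get NA values when joining a non-spatial object to geometry data/polygons?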

#Join Merged_Census2011 data to LDN_wards shapefile
library(rgdal) # provides readOGR
library(dplyr) # provides left_join
LDN_wards <- readOGR(dsn = "data", layer = "LDN_wards")
head(LDN_wards@data)
#Explore the object 
plot(LDN_wards) 
summary(LDN_wards) 
names(Merged_Census2011) 
names(LDN_wards) 
names(LDN_wards) <- c("Code", "ward.name") #rename the LDN_wards name column to ward.name so it can be matched later

#Join datasets 
LDN_wards@data <- left_join(LDN_wards@data, Merged_Census2011)
head(LDN_wards@data)

I also get:

> LDN_wards@data <- left_join(LDN_wards@data, Merged_Census2011)
Joining by: "ward.name" 
Warning message: 
In left_join_impl(x, y, by$x, by$y) : 
joining factors with different levels, coercing to character vector 
> head(LDN_wards@data)
    Code ward.name ward.code.x electorate votescast ward.code.y per.owner per.white per.noquals per.degree per.couple 
1 E05000001 Aldersgate  <NA>   NA  NA  <NA>  NA  NA   NA   NA   NA 
2 E05000002  Aldgate  <NA>   NA  NA  <NA>  NA  NA   NA   NA  

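My hunch is that this happens because the two datasets have different numbers of rows. Could that be the problem? Is it impossible to join datasets with different numbers of rows (so that the missing data simply stays unmatched in the corresponding observations)? I compared the two datasets as follows: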

#Compare the two datasets 
nrow(LDN_wards) 
nrow(Merged_Census2011) 
LDN_wards$ward.name %in% Merged_Census2011$ward.name 
summary(LDN_wards$ward.name %in% Merged_Census2011$ward.name)
> nrow(LDN_wards) 
[1] 787 
> nrow(Merged_Census2011) 
[1] 668 
> LDN_wards$ward.name %in% Merged_Census2011$ward.name 
 [1] FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[21] FALSE FALSE FALSE FALSE FALSE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[41] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE ETC...
> summary(LDN_wards$ward.name %in% Merged_Census2011$ward.name) 
    Mode FALSE TRUE NA's 
logical  24  763  0 

Could it be because FALSE = 24? If so, how do I remove these unmatched rows?
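Before removing anything, it may help to see which names fail to match. A minimal sketch on made-up data, assuming dplyr is loaded; the `wards` and `census` data frames here are hypothetical stand-ins for LDN_wards@data and Merged_Census2011, not the real datasets:

```r
library(dplyr)

# Hypothetical stand-ins for LDN_wards@data and Merged_Census2011
wards <- data.frame(
  ward.name = c("Aldersgate", "Aldgate", "Abbey"),
  stringsAsFactors = FALSE
)
census <- data.frame(
  ward.name = c("Abbey", "Alibon"),
  per.owner = c(32.7, 45.1),
  stringsAsFactors = FALSE
)

# Rows of wards whose ward.name has no counterpart in census
unmatched <- anti_join(wards, census, by = "ward.name")
unmatched$ward.name

# The same names via base R
setdiff(wards$ward.name, census$ward.name)
```

Looking at the unmatched names often shows whether the cause is spelling, whitespace, or wards that are simply absent from the other table.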

Apologies if this sounds obvious, I'm fairly new to this :)

Thanks for your help!


Try LDN_wards@data[complete.cases(LDN_wards@data), ]. My hunch is that the first 24 rows of your LDN_wards@data have no match, so when you call head you only see NA results.
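To illustrate the comment above, here is a small sketch on made-up data (the `wards` and `census` data frames and their values are hypothetical). It shows that left_join keeps every row of the left table and fills unmatched rows with NA, so head can show nothing but NAs even though later rows joined correctly:

```r
library(dplyr)

# Hypothetical stand-ins: the first two wards have no census record
wards <- data.frame(
  Code = c("W1", "W2", "W3", "W4"),
  ward.name = c("Aldersgate", "Aldgate", "Abbey", "Alibon"),
  stringsAsFactors = FALSE
)
census <- data.frame(
  ward.name = c("Abbey", "Alibon"),
  per.owner = c(32.7, 45.1),
  stringsAsFactors = FALSE
)

joined <- left_join(wards, census, by = "ward.name")
head(joined, n = 2)   # only the unmatched rows, so per.owner is NA

# Keep only rows without any NA, for inspection
matched <- joined[complete.cases(joined), ]
matched
```

Note that the filtered result is best kept in a separate data frame for inspection: the data slot of a SpatialPolygonsDataFrame is expected to hold one row per polygon, so dropping rows from LDN_wards@data directly would break that correspondence.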


Indeed, I just checked and the rest of the data is there. Thank you very much for your help :)

Answers


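I just tried the (newly discovered) inner_join function and it seems to work. If I understand correctly, inner_join only merges the rows that match... so I think it is a better fit. Indeed, I no longer get NA values. Strangely, though, I now get duplicated observations... If anyone has a better suggestion, please share. See the session log below.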

#Join datasets
LDN_wards@data <- inner_join(LDN_wards@data, Merged_Census2011)
head(LDN_wards@data, n=10)

> #Join datasets
> LDN_wards@data <- inner_join(LDN_wards@data, Merged_Census2011)
Joining by: c("ward.name", "ward.code.x", "electorate", "votescast","ward.code.y", "per.owner", "per.white", "per.noquals", "per.degree", "per.couple", "per.higher.managerial", "per.christian", "per.no.car", "per.limill", "per.goodhealth", "per.males", "per.aged60plus") 
Warning message: 
In inner_join_impl(x, y, by$x, by$y) : 
joining character vector and factor, coercing into character vector 
> head(LDN_wards@data, n=10)
    Code  ward.name ward.code.x electorate votescast ward.code.y per.owner per.white per.noquals per.degree per.couple 
1 E05000007   Bridge E05000497  8677  5654 E05000497  69.8  71.9  19.9  29.9  55.3 
2 E05000026   Abbey E05000026  8110  4712 E05000026  32.7  28.1  16.4  34.5  47.2 
3 E05000026   Abbey E05000026  8110  4712 E05000455  48.5  73.4  10.1  55.4  52.4 
4 E05000026   Abbey E05000455  7250  4808 E05000026  32.7  28.1  16.4  34.5  47.2 
5 E05000026   Abbey E05000455  7250  4808 E05000455  48.5  73.4  10.1  55.4  52.4 
6 E05000027   Alibon E05000027  6971  4127 E05000027  45.1  70.1  31.2  16.7  49.2 
7 E05000028  Becontree E05000028  7535  4538 E05000028  46.7  58.8  28.0  20.6 
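One possible explanation for the duplicates, offered as a guess from the output above: "Abbey" appears with two different ward codes, so joining on the name pairs every "Abbey" on one side with every "Abbey" on the other. The sketch below reproduces this on made-up data and then joins on the ward code instead. The `wards` and `census` data frames and the `ward.code` column name are assumptions for illustration; in the real data the matching column may be named differently.

```r
library(dplyr)

# Hypothetical stand-ins: two different wards share the name "Abbey"
wards <- data.frame(
  Code = c("E05000026", "E05000455"),
  ward.name = c("Abbey", "Abbey"),
  stringsAsFactors = FALSE
)
census <- data.frame(
  ward.code = c("E05000026", "E05000455"),
  ward.name = c("Abbey", "Abbey"),
  per.owner = c(32.7, 48.5),
  stringsAsFactors = FALSE
)

# Joining on a non-unique name yields every combination (2 x 2 rows).
# Recent dplyr versions warn about this many-to-many match, hence suppressWarnings.
by_name <- suppressWarnings(inner_join(wards, census, by = "ward.name"))
nrow(by_name)   # 4

# Joining on the unique code keeps one row per ward;
# ward.name is dropped from census first to avoid a duplicated column
by_code <- left_join(wards, select(census, -ward.name),
                     by = c("Code" = "ward.code"))
nrow(by_code)   # 2
```

A left_join on a unique key also keeps the rows in the same order and number as the polygons, which inner_join does not guarantee once unmatched rows are dropped.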