如何使用KML多邊形過濾地理座標的數據框？

我有一個跨越整個地球的數據點的CSV（the US Geological Service's earthquake feed），我只想篩選美國境內的地震。如何使用KML多邊形過濾地理座標的數據框？

KML文件我已經從美國人口普查局拉：

https://www.census.gov/geo/maps-data/data/kml/kml_nation.html

在R，該rgdal庫可以加載KML文件：

library(rgdal) 
kml = readOGR("kmls/cb_2014_us_nation_20m.kml", 'cb_2014_us_nation_20m')

如何使用dplyr/plyr /等。僅針對落在由KML指定的邊界內的行來過濾data.frame（地理數據的列爲latitude和longitude）？

編輯，回答後：

這是我從@hrbrmstr's answer用來得到一個快速的可視化：

library(sp) 
library(rgdal) 
# download earthquakes 
url <- "http://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_week.csv" 
fil <- "all_week.csv" 
if (!file.exists(fil)) download.file(url, fil) 
quakes <- read.csv("all_week.csv", stringsAsFactors=FALSE) 
# create spatial object 
sp::coordinates(quakes) <- ~longitude+latitude 

# download nation KML 
url <- "http://www2.census.gov/geo/tiger/GENZ2014/kml/cb_2014_us_nation_20m.zip" 
fil <- "uskml.zip" 
if (!file.exists(fil)) download.file(url, fil) 
unzip(fil, exdir="uskml") 
# read KML file 
us <- rgdal::readOGR("./uskml/cb_2014_us_nation_20m.kml", "cb_2014_us_nation_20m") 
sp::proj4string(quakes) <- sp::proj4string(us) 

length(quakes) 
# 1514 

usquakes = quakes[us,] 
length(usquakes) 
# 1260 

### visualize 
plot(us) 
# plot all quakes 
points(quakes$longitude, quakes$latitude) 
# plot just US 
points(usquakes$longitude, usquakes$latitude)

產生的圖像：

感謝@ hrbrmstr！

來源

2015-08-17 Dan Nguyen

美國本土或美國所有？ – hrbrmstr

@hrbrmstr：我會解決這個問題，但我猜你在問，因爲在美國不連續的情況下，多邊形數據結構是不同的？讓我們假裝我只關心大陸，儘管我現在使用的文件是全國人口普查的界限：https://www.census.gov/geo/maps-data/data/kml/kml_nation.html –

這會做一個CPL方式：

library(sp) 
library(maptools) 
library(rgeos) # not entirely necessary 
library(rgdal) # not entirely necessary 

url <- "http://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_week.csv" 
fil <- "all_week.csv" 
if (!file.exists(fil)) download.file(url, fil) 

quakes <- read.csv("all_week.csv", stringsAsFactors=FALSE) 
coordinates(quakes) <- ~longitude+latitude 

url <- "http://www2.census.gov/geo/tiger/GENZ2014/shp/cb_2014_us_nation_20m.zip" 
fil <- "us.zip" 
if (!file.exists(fil)) download.file(url, fil) 
unzip(fil, exdir="us") 
us <- readShapePoly("us/cb_2014_us_nation_20m.shp") 

# alternatively 
# us <- rgdal::readOGR("us/cbcb_2014_us_nation_20m.shp", "cb_2014_us_nation_20m") 

# TRUE if in 
in_us_rgeos <- rgeos::gIntersects(quakes, us, byid=TRUE) 

# <NA> if in 
in_us_over <- quakes %over% us

gIntersects需要更長的時間。 rgdal和rgeos可以是一個在某些系統上工作的熊。因人而異

使用美國KML需要（大部分）rgdal：

# you wanted KML shapefile tho 
url <- "http://www2.census.gov/geo/tiger/GENZ2014/kml/cb_2014_us_nation_20m.zip" 
fil <- "uskml.zip" 
if (!file.exists(fil)) download.file(url, fil) 
unzip(fil, exdir="uskml") 

us <- rgdal::readOGR("uskml/cb_2014_us_nation_20m.kml", "cb_2014_us_nation_20m") 
proj4string(quakes) <- proj4string(us) 
rgeos::gIntersects(quakes, us, byid=TRUE) 
head(quakes %over% us)

來源

2015-08-17 00:52:01 hrbrmstr

謝謝！...很快的問題......爲什麼'over'似乎比'gIntersects'快得多？ –

'rgeos'是Java代碼的C++端口（可能意味着它攜帶了一些Java代碼包），但'gIntersects'也更靈活（它可以處理所有'Spatial ...'類型）。對於特殊情況，'over'（對於'SpatialPolygons'中的'SpatialPoints'的情況）將調用_super_高效的C代碼（函數是'R_point_in_polygon_sp'）。 – hrbrmstr

如何使用KML多邊形過濾地理座標的數據框？

回答

相關問題