How to attach a simple data.frame to a SpatialPolygonsDataFrame in R?

Developer https://www.devze.com 2023-01-14 19:29 Source: web
I have (again) a problem with combining data frames in R. But this time, one is a SpatialPolygonsDataFrame (SPDF) and the other is an ordinary data.frame (DF). The SPDF has around 1000 rows, the DF only 400. Both have a common column, QDGC.

Now, I tried

oo <- merge(SPDF, DF, by = "QDGC", all = TRUE)

but this only results in a normal data.frame, not a spatial polygons data frame any more. I read somewhere else that this does not work, but I did not understand what to do in such a case (it has something to do with the ID columns that merge uses).

oooh such a hard question, I guess...

Thanks! Jens


Let df = the data frame, sp = the spatial polygons object and by = the name or column number of the common column. You can then merge the data frame into the sp object using the following line of code:

sp@data = data.frame(sp@data, df[match(sp@data[,by], df[,by]),])

Here is how the code works. The match function returns, for each row of sp@data, the position of the row in df with the same key (or NA if there is none). Indexing df with that vector reorders its rows to follow sp@data, so when the two are combined the polygon order is preserved. A quick check that the code has worked is to inspect the two columns corresponding to the common column and see if they are identical (the common column gets duplicated; it is easy to remove the copy, but I keep it as it is a good check).
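To make the alignment concrete, here is a minimal sketch in base R. A plain data frame (sp_data) stands in for sp@data so that it runs without any spatial package; the key values and the area and count columns are made up for illustration.

```r
# Stand-in for sp@data: attribute table in polygon order (hypothetical values)
sp_data <- data.frame(QDGC = c("A", "B", "C", "D"), area = c(10, 20, 30, 40),
                      stringsAsFactors = FALSE)
# Table to attach: fewer rows, in a different order
df <- data.frame(QDGC = c("C", "A"), count = c(7, 3), stringsAsFactors = FALSE)

by <- "QDGC"
# For each row of sp_data, the position of the matching row in df (NA if none)
idx <- match(sp_data[, by], df[, by])
# Same pattern as the one-liner above, applied to the stand-in table
merged <- data.frame(sp_data, df[idx, ])

print(idx)
print(merged)
```

Rows of sp_data without a partner in df get NA in the attached columns, and the duplicated key shows up as a second column (QDGC.1) that can be compared with the first.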


It is as easy as this:

require(sp) # the trick is that this package must be loaded!

oo <- merge(SPDF,DF, by="QDGC")

I've tested it myself. But it only works if you use merge from the sp package. This is the default once the sp package is loaded: the merge function is then overloaded, and sp::merge is used if the first argument is a spatial object.
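A small self-contained sketch of this, assuming the sp package is installed. The three unit squares, the make_square helper and the count column are invented for illustration; only the QDGC column name comes from the question.

```r
library(sp)  # assumes the sp package is installed

# Hypothetical helper: unit square with lower-left corner at (x0, 0),
# coordinates given clockwise and explicitly marked as not a hole
make_square <- function(x0, id) {
  coords <- cbind(c(x0, x0, x0 + 1, x0 + 1, x0), c(0, 1, 1, 0, 0))
  Polygons(list(Polygon(coords, hole = FALSE)), ID = id)
}

polys <- SpatialPolygons(list(make_square(0, "a"),
                              make_square(1, "b"),
                              make_square(2, "c")))
# Row names must match the polygon IDs
attrs <- data.frame(QDGC = c("A", "B", "C"), row.names = c("a", "b", "c"),
                    stringsAsFactors = FALSE)
SPDF <- SpatialPolygonsDataFrame(polys, attrs)

# Ordinary data frame with fewer rows than the SPDF
DF <- data.frame(QDGC = c("C", "A"), count = c(7, 3), stringsAsFactors = FALSE)

oo <- merge(SPDF, DF, by = "QDGC")
print(class(oo))
print(oo@data)
```

In this sketch the polygon without a matching row in DF is kept and gets NA for count, so look values up by key rather than by row position when checking the result.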


merge can produce a data frame with more rows than the originals if there is not a simple 1-1 mapping between the two data frames. In that case it would have to copy the geometry and create multiple polygons, which is probably not a good thing.
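The row growth is easy to see with two plain data frames (hypothetical values; base R only). A quick duplicated() check on the key column tells you in advance whether the mapping is 1-1.

```r
left  <- data.frame(QDGC = c("A", "B"), area = c(10, 20))
# Key "A" appears twice, so the mapping is not 1-1
right <- data.frame(QDGC = c("A", "A", "B"), species = c("x", "y", "z"))

out <- merge(left, right, by = "QDGC")

print(nrow(left))                   # rows before the merge
print(nrow(out))                    # rows after the merge
print(any(duplicated(right$QDGC)))  # TRUE signals a non 1-1 mapping
```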

If you have a data frame with the same number of rows as a SpatialPointsDataFrame, then you can just directly replace the @data slot.

library(sp)
example(overlay) # to get the srdf object
srdf@data
spplot(srdf)
srdf@data=data.frame(x=runif(3),xx=rep(0,3))
spplot(srdf)

If you get the number of rows wrong:

srdf@data=data.frame(x=runif(2),xx=rep(0,2))
spplot(srdf)
Error in data.frame(..., check.names = FALSE) : 
  arguments imply differing number of rows: 3, 2


Maybe the function joinCountryData2Map in the rworldmap package can give inspiration. (But I may be wrong, as I was last time.)


One more solution is to use the append_data function from the tmaptools package. It is called with these arguments:

append_data(shp, data, key.shp = NULL, key.data = NULL,
  ignore.duplicates = FALSE, ignore.na = FALSE,
  fixed.order = is.null(key.data) && is.null(key.shp))

It's a bit unfortunate that it's called append, since I'd understand append more in the sense of rbind, and here we want something like join or merge.

Ignoring that, the function is really useful for making sure you got your joins right and for seeing whether some rows are present on only one side of the join. From the docs:

Under coverage (shape items that do not correspond to data records), over coverage (data records that do not correspond to shape items respectively) as well as the existence of duplicated key values are automatically checked and reported via console messages. With under_coverage and over_coverage the under and over coverage key values from the last append_data call can be retrieved.
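If you just want the same kind of sanity check without tmaptools, a rough base-R equivalent can be sketched with setdiff and duplicated. This is not the tmaptools implementation, only an illustration of what the three checks mean; the key vectors are made up.

```r
shp_keys  <- c("A", "B", "C", "D")  # keys in the shape's attribute table
data_keys <- c("C", "A", "E", "A")  # keys in the data frame to attach

# Under coverage: shape items with no matching data record
under_cov <- setdiff(shp_keys, data_keys)
# Over coverage: data records with no matching shape item
over_cov <- setdiff(data_keys, shp_keys)
# Key values that occur more than once in the data
dup_keys <- unique(data_keys[duplicated(data_keys)])

print(under_cov)
print(over_cov)
print(dup_keys)
```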


If two shapefiles need to be merged into a single object, just use rbind().

When using rbind(), make sure that both arguments are spatial data frames of the same class. You can check this with class(). If you are working with the sf package and an object is not an sf object yet, use st_as_sf() to convert it before you rbind them.

Note: You can also use this to append to NULL, which is handy when you are collecting results from a loop and want to accumulate them.
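The accumulate-from-NULL pattern looks like this with plain data frames (base R only, made-up columns). Whether it carries over unchanged to a particular spatial class depends on that class's rbind method, so treat it as a sketch.

```r
acc <- NULL  # start with nothing
for (i in 1:3) {
  # Each iteration produces one piece with the same columns
  piece <- data.frame(id = i, value = i^2)
  # rbind() ignores the NULL on the first pass and stacks rows afterwards
  acc <- rbind(acc, piece)
}

print(acc)
```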
