# How to deal with spatial data in R and ggplot (Shapefiles)

During my first project that involved manipulating big files containing spatial data (shapefiles, to be more precise), I couldn’t find a good tutorial that helped me understand how to handle the structure of the data. It was overwhelming and frustrating. That is why I’m writing this tutorial explaining shapefiles and how to work with them in R using ggplot. Hopefully I can help others who are in the situation I was in.

Let’s start by reading the data. When you have a shapefile (or, more precisely, a group of files inside a folder that represent spatial points or polygons), the easiest way to read it is with `rgdal`.

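```
library(rgdal)
#The dsn and layer values below are placeholders, point them at your own shapefile folder and layer name
Map <- readOGR(dsn = "path/to/shapefile_folder", layer = "layer_name")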
```

When you are dealing with a big file this might take a while, which is obviously frustrating. Fortunately, R can save the `Map` object in a format that makes loading the data much faster next time.

```
#Save the data as an R object
saveRDS(Map,'Map.RDS')

#Now you can load the shapefile using this line
Map <- readRDS('Map.RDS')
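
#Illustrative sketch (not from the original post): the same save/load round trip
#on a small stand-in object, so it runs without a shapefile. `toy_map`,
#`toy_path` and `toy_restored` are hypothetical names.
toy_map <- data.frame(id = 1:3, Population = c(120, 450, 80))
toy_path <- tempfile(fileext = ".RDS")
saveRDS(toy_map, toy_path)
toy_restored <- readRDS(toy_path)
#Compare the restored object with the one that was saved
identical(toy_map, toy_restored)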
```

If you want to reproduce the maps shown in this tutorial you can download the data here.

Once the data is loaded you might want to take a look at `Map`. Since its class is `SpatialPolygonsDataFrame`, the object has five slots: `data`, `polygons`, `bbox`, `plotOrder` and `proj4string`. To access any of these slots you have to use `@` instead of `$`. Here is an example:

```
#Check the coordinate system used
Map@proj4string
#Check data associated to each spatial element
head(Map@data)
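
#Illustrative sketch (not from the original post): a toy S4 class showing why
#`@` is needed. `ToyMap`, `toy` and the two slots are hypothetical stand-ins
#for the real SpatialPolygonsDataFrame.
library(methods)
setClass("ToyMap", representation(data = "data.frame", bbox = "matrix"))
toy <- new("ToyMap",
           data = data.frame(id = 0:1, Population = c(100, 250)),
           bbox = matrix(c(0, 0, 1, 1), nrow = 2))
slotNames(toy)       #lists the slots of the object
toy@data$Population  #`@` reaches the slot, `$` then reaches a column inside it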

```

## Transform data

Now that the data is loaded, we can manipulate it so that we can create a map using `ggplot2`.

```
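#Transform the data into the desired coordinate system (WGS84 longitude/latitude is assumed here)
Map <- spTransform(Map, CRS("+proj=longlat +datum=WGS84"))
#Convert the spatial object into a data frame that ggplot can draw (assumed step, the original line was lost)
Map_draw <- ggplot2::fortify(Map)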
```

The first step is only required if you want to combine the map with other spatial layers such as Google Maps; otherwise you can omit it. The second step transforms the spatial data of the object into a data frame that `ggplot` can use, although it drops any additional information that might be valuable for the analysis.

Let’s join the lost data back in. Most of the time this is the data you are actually interested in: it holds the metrics that will give real value to your map.

```
#Add Map@data to the Map_draw data frame
Map_draw$id
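
#Illustrative sketch (not from the original post) of one way to do this join.
#It assumes the `id` column of the drawing data frame matches the row order of
#the attribute table, counting from 0; check this on your own data. All `toy_`
#names are hypothetical.
toy_draw <- data.frame(long = c(0, 1, 1, 5, 6, 6),
                       lat = c(0, 0, 1, 5, 5, 6),
                       order = 1:6,
                       id = c("0", "0", "0", "1", "1", "1"))
toy_attr <- data.frame(Population = c(100, 250))
toy_attr$id <- as.character(seq_len(nrow(toy_attr)) - 1)
toy_joined <- merge(toy_draw, toy_attr, by = "id")
#merge() can reorder rows, so restore the drawing order of the vertices
toy_joined <- toy_joined[order(toy_joined$order), ]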
```

## Plot data

Now the data is ready to let ggplot do its magic; `geom_polygon` will bring to life the layer we want to draw.

```
ggplot() +
  geom_polygon(data = Map_draw,
               aes(long, lat, group = group),
               color = 'blue', fill = 'white') +
  coord_map()

```

So far the map doesn’t represent any data other than the shape. To add real value, let’s use the variable `Population` to fill each polygon, with `scale_fill_gradient` determining the colour scale.

```
ggplot() +
  geom_polygon(data = Map_draw,
               aes(long, lat, group = group, fill = Population),
               color = 'gray') +
  coord_map() +
  #Add the scale of colour you want
  scale_fill_gradient(low = 'light blue', high = 'dark blue')

```

## Final details

Now the map shows where the most crowded areas are. To finish, let’s add some final details: we can add and remove a few elements to make the map much cleaner.

• Remove the axes. In a map context the axis labels are irrelevant.
```
ditch_the_axes
```
• Download the map terrain to replace those ugly gray rectangles in the background.
```
#Zoom should change depending on the range of your spatial data
library(ggmap)
Background
```
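
As a rough sketch of the first bullet, an axis-removing theme object could look like the following. The elements that the original `ditch_the_axes` blanked out are not shown in this post, so the selection here is an assumption; `toy_square` and `toy_plot` are hypothetical names used only to show how the object is added to a plot.

```r
library(ggplot2)

# Assumed definition: blank out everything related to the axes
ditch_the_axes <- theme(
  axis.text = element_blank(),
  axis.ticks = element_blank(),
  axis.title = element_blank(),
  axis.line = element_blank()
)

# Toy polygon (hypothetical data) standing in for Map_draw
toy_square <- data.frame(long = c(0, 1, 1, 0),
                         lat = c(0, 0, 1, 1),
                         group = 1)

# The theme object is added to a plot like any other layer
toy_plot <- ggplot() +
  geom_polygon(data = toy_square,
               aes(long, lat, group = group),
               color = 'blue', fill = 'white') +
  ditch_the_axes
```

Keeping the theme in its own object means the same clean-up can be reused on every map in a project.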

Now everything is ready to create a nice plot combining all the previous elements.

```
Plot
```

When we add the background the plotting area changes, but we can fix that by controlling the axis expansion to get our final plot.

```
Plot + scale_x_continuous(expand = c(-.2,0)) +
scale_y_continuous(expand = c(-.24,0))
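
#Illustrative sketch (not from the original post) of what a negative `expand`
#does, assuming the usual rule for continuous scales: each side of the range
#moves outwards by mult * width + add. `expand_sketch` is a hypothetical
#helper written for this example, not a ggplot2 function.
expand_sketch <- function(limits, mult, add = 0) {
  width <- diff(limits)
  c(limits[1] - mult * width - add, limits[2] + mult * width + add)
}
#With a negative mult the range from 0 to 10 is trimmed on both sides
expand_sketch(c(0, 10), mult = -0.2)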

```

I hope this tutorial was helpful. If you have any questions, leave them in the comments section.

This post first appeared on StaRtistician.
