How to deal with spatial data in R and ggplot (Shapefiles)

During my first project that involved manipulating big files of spatial data (shapefiles, to be precise), I couldn’t find a good tutorial that helped me understand how the data is structured. It was overwhelming and frustrating. That is why I’m writing this tutorial, which explains shapefiles and how to work with them in R using ggplot. Hopefully it helps others who are in the situation I was in.

Read data

Let’s start by reading the data. When you have a shapefile (or, more precisely, a group of files inside a folder that represent spatial points or polygons), the easiest way to read it is with rgdal.

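  library(rgdal)
  library(ggplot2)
  #dsn and layer are placeholders: use your own folder and layer (file name without extension)
  Map <- readOGR(dsn = 'path/to/shapefile_folder', layer = 'layer_name')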

With a big file this might take a while, which is obviously frustrating. Fortunately, R can save the Map object in a format that makes loading the data much faster next time.

 #Save the data as an R object
  saveRDS(Map,'Map.RDS')

 #Now you can load the shapefile using this line
  Map <- readRDS('Map.RDS')
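The save-then-reload pattern can be wrapped so that the slow read only runs when no cached copy exists. This is a minimal sketch in base R: `load_cached` and `slow_reader` are hypothetical names, and a small data frame stands in for the shapefile.

```r
# Hypothetical helper: return the cached object if the RDS file exists,
# otherwise run the slow reader once and cache its result.
load_cached <- function(cache_file, slow_reader) {
  if (file.exists(cache_file)) {
    return(readRDS(cache_file))
  }
  obj <- slow_reader()
  saveRDS(obj, cache_file)
  obj
}

# Stand-in for an expensive call such as reading a large shapefile
slow_reader <- function() data.frame(id = 1:3, Population = c(10, 20, 30))

cache_file <- tempfile(fileext = ".RDS")
first  <- load_cached(cache_file, slow_reader)  # runs slow_reader and writes the cache
second <- load_cached(cache_file, slow_reader)  # reads the cached copy instead
```

Remember to delete the cache file whenever the underlying shapefile changes, otherwise the stale copy keeps being loaded.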

If you want to reproduce the maps shown in this tutorial you can download the data here.

Once the data is loaded you might want to take a look at Map. Since its class is SpatialPolygonsDataFrame, the object has 5 slots (data, polygons, bbox, plotOrder and proj4string).

To access any of these slots you have to use @ instead of $. Here is an example:

 #Check the coordinate system used
 Map@proj4string

 #Check data associated to each spatial element
 head(Map@data)
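If you do not have a shapefile at hand, the slot structure can be explored on a hand-built object. The sketch below assumes the sp package is installed and uses two made-up squares with invented Population values. It is not the tutorial's data.

```r
library(sp)

# Two unit squares side by side, rings given clockwise and closed (toy data)
sq1 <- Polygon(cbind(c(0, 0, 1, 1, 0), c(0, 1, 1, 0, 0)))
sq2 <- Polygon(cbind(c(1, 1, 2, 2, 1), c(0, 1, 1, 0, 0)))
polys <- SpatialPolygons(list(Polygons(list(sq1), ID = "a"),
                              Polygons(list(sq2), ID = "b")))

# Row names of the attribute table must match the polygon IDs
attrs <- data.frame(Population = c(100, 250), row.names = c("a", "b"))
toy <- SpatialPolygonsDataFrame(polys, attrs)

slotNames(toy)       # the names of the slots
toy@data             # attribute table, one row per polygon
toy@bbox             # bounding box of all polygons
toy@data$Population  # $ works again once you are inside the data slot
```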
 

Transform data

Now that the data is loaded we can manipulate it so we can create a map using ggplot2.

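 #Transform the data into the desired coordinate system (WGS84 longitude/latitude is assumed here)
   Map <- spTransform(Map, CRS('+proj=longlat +datum=WGS84'))
 #Convert the spatial object into a data frame that ggplot can draw
   Map_draw <- fortify(Map)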

The first step is required if you want to combine the data with other spatial elements such as Google Maps; otherwise you can omit it. The second step transforms the spatial data of the object into a data frame so that ggplot can use it, although it drops any additional information that might be valuable for the analysis.
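To see what the conversion to a data frame involves, and why the attributes get lost, here is a hand-rolled sketch of the flattening. It assumes the sp package and a made-up two-square object. `flatten` and `ring` are hypothetical helpers written for illustration, not the function the tutorial uses.

```r
library(sp)

# Toy object: two unit squares with clockwise rings (not the tutorial's data)
ring <- function(x0) cbind(c(x0, x0, x0 + 1, x0 + 1, x0), c(0, 1, 1, 0, 0))
toy <- SpatialPolygonsDataFrame(
  SpatialPolygons(list(Polygons(list(Polygon(ring(0))), ID = "a"),
                       Polygons(list(Polygon(ring(1))), ID = "b"))),
  data.frame(Population = c(100, 250), row.names = c("a", "b"))
)

# Flatten every ring into rows of (long, lat, id, group, order)
flatten <- function(spdf) {
  rows <- lapply(spdf@polygons, function(p) {
    parts <- lapply(seq_along(p@Polygons), function(j) {
      xy <- p@Polygons[[j]]@coords
      data.frame(long = xy[, 1], lat = xy[, 2],
                 id = p@ID, group = paste(p@ID, j, sep = "."),
                 stringsAsFactors = FALSE)
    })
    do.call(rbind, parts)
  })
  out <- do.call(rbind, rows)
  out$order <- seq_len(nrow(out))  # drawing order of the vertices
  rownames(out) <- NULL
  out
}

toy_draw <- flatten(toy)
head(toy_draw)
names(toy_draw)  # only geometry columns: Population is not carried over
```

Only the polygons slot is read, which is why anything stored in the data slot has to be joined back afterwards.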

Let’s join the lost data back. Most of the time this is the data you are interested in: it holds the metrics that give real value to your map.

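   #Add the Map@data to the Map_draw data frame
   #(assumes the ids in Map_draw are the row names of Map@data)
   Map_draw$id <- as.character(Map_draw$id)
   Map_draw <- cbind(Map_draw, Map@data[match(Map_draw$id, rownames(Map@data)), , drop = FALSE])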
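The join itself can be illustrated with plain data frames. This is a minimal base R sketch with made-up values: `coords` stands in for the flattened polygons and `attrs` for the attribute table. It uses match() so the vertex order, which matters for drawing, is left untouched.

```r
# Toy stand-ins: flattened coordinates and an attribute table keyed by row name
coords <- data.frame(long = c(0, 0, 1, 1, 2, 2),
                     lat  = c(0, 1, 1, 0, 0, 1),
                     id   = c("a", "a", "a", "b", "b", "b"),
                     stringsAsFactors = FALSE)
attrs <- data.frame(Population = c(100, 250), row.names = c("a", "b"))

# match() finds the attribute row for each vertex without reordering the vertices
idx <- match(coords$id, rownames(attrs))
coords$Population <- attrs$Population[idx]
coords
```

A join that reorders rows, such as merge() with its default sorting, can scramble the polygon outlines unless the data is sorted back by drawing order afterwards.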

Plot data

Now the data is ready to let ggplot do its magic: geom_polygon will bring to life the layer we want to draw.

   ggplot() +
     geom_polygon(data = Map_draw,
                  aes(long, lat, group = group), color = 'blue',
                  fill = 'white') +
     coord_map()

So far the map doesn’t show anything other than the shapes. To add real value, let’s use the variable Population to fill each polygon, with scale_fill_gradient setting the colour scale.

   ggplot() +
     geom_polygon(data = Map_draw,
                  aes(long, lat, group = group, fill = Population),
                  color = 'gray') +
     coord_map() +
     #Add the scale of colour you want
     scale_fill_gradient(low = 'light blue', high = 'dark blue')
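For readers without the data, here is a self-contained toy version of this filled map. It assumes ggplot2 is installed and uses two made-up squares with invented Population values. coord_equal() replaces coord_map() only so that the sketch does not need the mapproj package.

```r
library(ggplot2)

# Made-up polygons: two unit squares, one row per vertex
toy_draw <- data.frame(
  long = c(0, 0, 1, 1, 1, 1, 2, 2),
  lat  = c(0, 1, 1, 0, 0, 1, 1, 0),
  group = rep(c("a.1", "b.1"), each = 4),
  Population = rep(c(100, 250), each = 4)
)

p <- ggplot() +
  geom_polygon(data = toy_draw,
               aes(long, lat, group = group, fill = Population),
               color = 'gray') +
  coord_equal() +
  # Low values are drawn light, high values dark
  scale_fill_gradient(low = 'lightblue', high = 'darkblue')
p
```

Every vertex of a polygon carries the same Population value, so each shape is filled with a single colour.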

Final details

Now the map shows where the most crowded areas are. To finish, let’s add some final details: we can add and remove a few elements to make the map much cleaner.

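  • Delete the axes; on a map the axis labels are irrelevant.
       ditch_the_axes <- theme(axis.text = element_blank(), axis.ticks = element_blank(),
                               axis.title = element_blank())  #reconstructed; the original elements are unknown
    
  • Download the map terrain and use it as the background instead of those ugly gray rectangles.
    
       #Zoom should change depending on the range of your spatial data (10 is a placeholder)
       #Google tiles need an API key in recent ggmap versions, see register_google()
       library(ggmap)
       Background <- get_map(location = c(lon = mean(range(Map_draw$long)),
                                          lat = mean(range(Map_draw$lat))), zoom = 10)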
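Before requesting a background it helps to know the extent of your own data. The base R sketch below computes a bounding box and its center from a flattened data frame. The coordinates are made up, and `bbox` and `center` are illustrative names for values you could then pass to whichever map-download function you use.

```r
# Made-up vertices standing in for the long/lat columns of the flattened data
toy_draw <- data.frame(long = c(-74.3, -73.7, -73.9),
                       lat  = c(40.5, 40.9, 40.7))

# Bounding box of the data
bbox <- c(left   = min(toy_draw$long),
          bottom = min(toy_draw$lat),
          right  = max(toy_draw$long),
          top    = max(toy_draw$lat))

# Center of the bounding box, a natural location for the background map
center <- c(lon = mean(range(toy_draw$long)),
            lat = mean(range(toy_draw$lat)))

bbox
center
```

A wider bounding box calls for a lower zoom level, so checking these numbers first saves a few trial downloads.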
    

Now everything is ready to create a nice plot combining all the previous elements.

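  Plot <- ggmap(Background) + geom_polygon(data = Map_draw, aes(long, lat, group = group, fill = Population), color = 'gray') +
    scale_fill_gradient(low = 'light blue', high = 'dark blue') + ditch_the_axes  #reconstructed from the previous steps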

Adding the background changes the plotting area, but we can fix that by controlling the axes to get our final plot.

  Plot + scale_x_continuous(expand = c(-.2,0)) +
    scale_y_continuous(expand = c(-.24,0))

I hope this tutorial was helpful. If you have any questions, leave them in the comments section.




This post first appeared on StaRtistician, please read the original post: here
