Mapping in Python: Folium


Tutorial Information


  • Module 7 in Spatial Data Science.

Methods


Spatial Analysis, Mapping

Tools


Python

In this workshop, we’ll be using Python’s folium package to produce interactive maps of ‘public’ tree densities in Cambridge and Somerville using data drawn from the former’s Public Works department and the latter’s Urban Forestry Division. Folium provides a neat and full-featured wrapper for the well-known and widely loved Leaflet.js web mapping library. With it, we can draw interactive, feature-rich Leaflet maps using straightforward Python without ever leaving the comfort of a familiar Pandas-based workflow. Note that if you’re more of an R afficianado (n.b., the author is just such a creature), you can find an almost identical tutorial written for R’s Leaflet wrapper earlier in this series.

Create a New Virtual Environment

We begin by creating a new virtual environment for our Python spatial work. As a reminder, a virtual environment is a self-contained Python environment, generally built to accommodate a specific project. We use these so that we’re not endlessly installing pacakges into a single bloated environment; instead, we work only with those packages we need.

First, we want to navigate to the folder to which we downloaded the exercise data. In a terminal (macOS/Linux) or command prompt (Windows) window type…

cd path/to/your/data/

Once you’ve changed your working directory, you’ll want to create a new virtual environment using the venv utility, which is a module (hence, -m) of Python.

# Tells python 3 to run the venv module...
# ...and create a new virtual environment...
# ...in our current directory.
python3 -m venv venv/

# On Windows...
python -m venv venv\

We can activate this virtual environment by running the following at the terminal:

# On macOS or Linux...
. venv/bin/activate
# On Windows...
venv\Scripts\activate.bat
# The next line of your prompt should begin (venv)

Once we’ve activated our virtual environment, any packages we install will be specific to this environment—this is very useful for managing packages on a project-by-project basis. We’re going folium, as well as the GeoPandas package, which extends the ever-popular Pandas package. We’ll also make sure that matplotlib is installed (these packages help us plot, in general, and plot polygons, specifically). We’ll also install jenkspy, which gives us a very simple utility function that produces Jenks Natural Breaks.

pip install folium geopandas matplotlib jenkspy

Draw a Map

We begin by importing the folium package and creating a new map object. We create the map using only a few properties: a latitude-longitude pair on which to center it, and a zoom level at which to render it. Zoom levels are standardized for webmaps, and range from 0-20, with 0 being the ‘largest scale’, and 20 being the ‘smallest scale’ (speaking in terms of cartographic scale). As a default, a new folium map will include an OpenStreetMap tile layer. While a full discussion of map tiles is outside the scope of this workshop, I would recommend reading the map tile introduction linked above—it’s a simple enough concept to grasp!

import folium

# Create new map.
m = folium.Map(
    location=[42.38172, -71.111408],
    zoom_start=13
)
# Draw map.
m

We can use a different tileset by providing a string argument to the tiles parameter. This can be either…

  1. a string denoting a specific tileset included by the Folium developers (the documentation tells us this can be ‘OpenStreetMap’, ‘Mapbox Bright’, ‘Mapbox Control Room’, ‘Stamen Terrain’, ‘Stamen Toner’, ‘Stamen Watercolor’, ‘Cloudmade’, ‘Mapbox’, and ‘CartoDB positron’ or ‘CartoDB dark_matter’) or…
  2. a custom tile server, e.g. http://{s}.yourtiles.com/{z}/{x}/{y}.png

Let’s first try adding a custom tile server! I’m a big fan of the Atlascope project from the Leventhal Map & Education Center at the Boston Public Library. Their collection of georeferenced and mosaic-ed atlases is available as a collection of hosted tilesets! (If you want to locate others, simply find an atlas you’re interested in and click ‘About This Map’ under the dropdown menu in the footer.) We can add the 1894 Bromley Atlas of the City of Cambridge like so. Note that we need to pass an attribution to the attr parameter when loading a custom tilset. This is interpreted as HTML, we can pass a hyperlink.

# Create new map with provider tileset.
m = folium.Map(
    location=[42.38172, -71.111408],
    tiles='https://s3.us-east-2.wasabisys.com/urbanatlases/39999059015576/tiles/{z}/{x}/{-y}.png',
    attr='<a href="https://www.leventhalmap.org/">Leventhal Map & Education Center at the Boston Public Library</a>',
    zoom_start=13
)
# Draw map.
m

But, given that we might want our basemap to be slightly less interesting, we can add Stamen Toner instead. Note that we don’t require an attribution here—the attribution is stored alongside the tilset identifier and is automatically included in the bottom-right corner.

# Create new map with provider tileset.
m = folium.Map(
    location=[42.38172, -71.111408],
    tiles='Stamen Toner',
    zoom_start=13
)
# Draw map.
m

Drawing Spatial Data

Basemaps are well and good, but we’re in this to do exploratory data visualization! As such, we’ll need a way to add spatial data to our project, whether it is stored on disk or in a Pandas object. As a case study, let’s read a GeoJSON file from disk. This is the dataset at which we arrived at the conclusion of last week’s session—it contains ‘public’ tree counts and densities for the cities of Cambridge and Somerville.

…from a GeoJSON

Recall that Folium is built on top of Leaflet. Leaflet, as a JavaScript library, expects JSON (JavaScript Object Notation); it therefore provides a helpful GeoJson object that will read a geojson file from disk and add it to our map. This is remarkably simple. Assuming that we have a map m, we can simply create a new GeoJson object, passing the file path, and add the geometries to the map. Note that the geometry type (POINT, LINE, POLYGON, etc.) is automatically determined, as this information is included in the GeoJSON geometries.

folium.GeoJson(
    'data/trees_per_area.geojson'
).add_to(m)

m

…from a GeoPandas Object

However, in many cases, you will want to render spatial data that is stored in a non-GeoJSON format, or that is the result of spatial analysis in GeoPandas. Why introduce the intermediate step of writing the JSON and reading if you don’t have to?

As an example, let’s read the same data from a Shapefile instead of a GeoJSON, storing it first in a GeoDataFrame.

import geopandas as gpd
trees = gpd.read_file('data/trees_per_area.shp')

We can’t pass this object directly to folium, but we also don’t want to introduce the complexity of writing and re-reading a geojson file. Instead, we can simply convert the trees GeoDataFrame to JSON using the .to_json() method.

For example, let’s convert the first five rows of the dataframe using the .head() method…

from pprint import pprint
pprint(trees.head().to_json())
('{"type": "FeatureCollection", "features": [{"id": "0", "type": "Feature", '
 '"properties": {"area": 1073506.546135969, "area_sqmi": 0.038504538957531, '
 '"id": 0, "land_area": 0.038504, "tree_count": null, "trees_p_sq": null, '
 '"water_area": 0.0}, "geometry": {"type": "Polygon", "coordinates": '
 '[[[-71.15932825278648, 42.419756678264065], [-71.15727140671973, '
 '42.418868638184634], [-71.15728097173223, 42.41710479634876], '
 '[-71.159347267742, 42.41622899425637], [-71.16140405707002, '
 '42.41711699781159], [-71.16139460712705, 42.4188808399828], '
 '[-71.15932825278648, 42.419756678264065]]]}}, {"id": "1", "type": "Feature", '
 '"properties": {"area": 1073506.546135969, "area_sqmi": 0.038504538957531, '
 '"id": 1, "land_area": 0.038504, "tree_count": null, "trees_p_sq": null, '
 '"water_area": 0.0}, "geometry": {"type": "Polygon", "coordinates": '
 '[[[-71.15520499564046, 42.41974440245647], [-71.15314820791414, '
 '42.41885628870503], [-71.15315788799609, 42.41709244720855], '
 '[-71.15522424073497, 42.416216719123526], [-71.15728097173223, '
 '42.41710479634876], [-71.15727140671973, 42.418868638184634], '
 '[-71.15520499564046, 42.41974440245647]]]}}, {"id": "2", "type": "Feature", '
 '"properties": {"area": 1073506.546135969, "area_sqmi": 0.038504538957531, '
 '"id": 2, "land_area": 0.038504, "tree_count": null, "trees_p_sq": null, '
 '"water_area": 0.0}, "geometry": {"type": "Polygon", "coordinates": '
 '[[[-71.1510817401059, 42.41973197896548], [-71.14902501072956, '
 '42.41884379154408], [-71.14903480588082, 42.417079950391056], '
 '[-71.15110121533912, 42.416204296315385], [-71.15315788799609, '
 '42.41709244720855], [-71.15314820791414, 42.41885628870503], '
 '[-71.1510817401059, 42.41973197896548]]]}}, {"id": "3", "type": "Feature", '
 '"properties": {"area": 1073506.546135969, "area_sqmi": 0.038504538957531, '
 '"id": 3, "land_area": 0.038504, "tree_count": null, "trees_p_sq": null, '
 '"water_area": 0.0}, "geometry": {"type": "Polygon", "coordinates": '
 '[[[-71.14695848620202, 42.41971940779114], [-71.14490181518528, '
 '42.41883114670187], [-71.14491172540569, 42.41706730589637], '
 '[-71.14697819157371, 42.41619172583203], [-71.14903480588082, '
 '42.417079950391056], [-71.14902501072956, 42.41884379154408], '
 '[-71.14695848620202, 42.41971940779114]]]}}, {"id": "4", "type": "Feature", '
 '"properties": {"area": 1073506.546135969, "area_sqmi": 0.038504538957531, '
 '"id": 4, "land_area": 0.034519, "tree_count": null, "trees_p_sq": null, '
 '"water_area": 111090.88590224039}, "geometry": {"type": "Polygon", '
 '"coordinates": [[[-71.14283523394816, 42.419706688933566], '
 '[-71.14077862130056, 42.41881835417849], [-71.14078864658997, '
 '42.417054513724565], [-71.14285516945803, 42.416179007673534], '
 '[-71.14491172540569, 42.41706730589637], [-71.14490181518528, '
 '42.41883114670187], [-71.14283523394816, 42.419706688933566]]]}}]}')

.to_json() creates a GeoJSON representation from a GeoDataFrame! We can pass the results to our GeoJson object constructor in the same was as a GeoJSON file read from disk.

m = folium.Map(
    location=[42.38172, -71.111408],
    tiles='Stamen Toner',
    zoom_start=13
)

folium.GeoJson(
    trees.to_json()
).add_to(m)

m

Drawing a Choropleth Map

But of course, geometries aren’t the matter of interest here. Hexagons are nice shapes, but they’re useful here primarily as symbols for our measured tree densities! We want to create a choropleth map, visualizing the density observed in each hexagon using a color ramp. This is, of course, an exceedingly common map type, and for that reason, folium provides a built-in object that simplifies choropleth mapping.

It looks like this: assuming that we have a map m we create a new Choropleth object using the following parameters.

  1. geo_data = trees.to_json() This simply passes our JSON-ified spatial data to the choropleth constructor.
  2. data = trees This passes our un-JSONified data frame that includes measured tree densities.
  3. columns = ['id', 'trees_p_sqmi'] This identifies columns of interest in the data. Namely, the first is a unique identifier that will join the dataframe to our JSON-ified spatial data, and the second is the quantity to be visualized.
  4. key_on = 'feature.properties.id' This identifies the property of our JSON which will be joined the first attribute passed to the columns parameter. We’re joining feature.properties.id to id: this makes sense! Look at the JSON representation—feature.properties.id is simply the id stored in the properties list for each feature.
  5. fill_color = 'YlGn' This identifies one of Cindy Brewer’s ColorBrewer color ramps.
  6. bins = 5 The data will be classed into five equal-interval bins.

The other parameters are simply stylistic, setting the line properties and fill properties.

m = folium.Map(
    location=[42.38172, -71.111408],
    tiles='Stamen Toner',
    zoom_start=13
)

folium.Choropleth(
    geo_data = trees.to_json(),
    data = trees,
    columns = ['id', 'trees_p_sq'],
    key_on = 'feature.properties.id',
    fill_color = 'YlGn',
    fill_opacity = 0.7,
    line_color = 'White',
    line_opaciy = 0.1,
    legend_name = 'Public Trees per Square Mile',
    bins = 5
).add_to(m)

m