Batch Geocoding Example App

You can see this app running online at: Batch Geocoding App Online

The Batch Geocoding App determines the latitude and longitude for a set of addresses entered by the user. The results are displayed on a map using KML, and the user can download the KML file. The app's inputs and outputs are very simple, which belies the complexity of the problem it solves. The app uses the MapQuest web service to perform the geocoding, and serves as an example of how a third-party web service can be integrated within a Tropofy app.

To run this app locally:

$ source tropofy_env/bin/activate
$ tropofy quickstart tropofy_batch_geocoding
$ cd tropofy_batch_geocoding
$ pip install -e .
$ nano config.py  # add the keys for this app
$ tropofy app -c config.py

Imported Modules

First we import the SQLAlchemy and Tropofy modules required by our app. We use the requests and json modules to communicate with the MapQuest web service, simplekml to generate the KML, and pkg_resources to locate the app's static content.

import requests
import json
from simplekml import Kml, Style, IconStyle, Icon
from sqlalchemy.schema import Column
from sqlalchemy.types import Text, Float
import pkg_resources

from tropofy.app import AppWithDataSets, Step, StepGroup
from tropofy.widgets import ExecuteFunction, KMLMap, SimpleGrid
from tropofy.database.tropofy_orm import DataSetMixin

SQLAlchemy Classes

We then define the two classes that we are going to need, InputAddress and OutputGeocodedLocation.

InputAddress Class

  • Note the base class of all SQLAlchemy derived classes must be either DataSetMixin if your app is going to support multiple data sets for your users, or ORMBase if it is not
  • Every column is declared with nullable=False, but street is the only one without a default. The other columns default to an empty string, so street is the only value that must be supplied
  • The __init__ method allows InputAddress objects to be created with the address either broken into its constituents or simply defined as a single string
class InputAddress(DataSetMixin):
    street = Column(Text, nullable=False)
    suburb = Column(Text, nullable=False, default='')
    city = Column(Text, nullable=False, default='')
    state = Column(Text, nullable=False, default='')
    post_code = Column(Text, nullable=False, default='')
    country = Column(Text, nullable=False, default='')

    def __init__(self, street, suburb='', city='', state='', post_code='', country=''):
        self.street = street
        self.suburb = suburb
        self.city = city
        self.state = state
        self.post_code = post_code
        self.country = country
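
The two ways of constructing an address can be tried without a database. The following is a minimal sketch that uses a plain Python stand-in class, AddressSketch (a hypothetical name, not part of Tropofy), with the same __init__ signature as InputAddress.

```python
class AddressSketch(object):
    """Hypothetical stand-in for InputAddress with no database behind it."""

    def __init__(self, street, suburb='', city='', state='', post_code='', country=''):
        self.street = street
        self.suburb = suburb
        self.city = city
        self.state = state
        self.post_code = post_code
        self.country = country


# Whole address as one string: everything lands in 'street'.
single = AddressSketch("6 Luke St, Wavell Heights, Brisbane, 4012")

# Address broken into its constituents, passed by keyword for clarity.
split = AddressSketch(street="Worth St", city="New York City",
                      state="New York", country="USA")
```

In the first form the remaining attributes keep their empty-string defaults, which is why the columns can be declared nullable=False without forcing the user to fill them in.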

OutputGeocodedLocation Class

  • We include columns in the OutputGeocodedLocation class corresponding to the output of the MapQuest web service.
  • We allow most of the columns to be empty, using nullable=True to ensure we are very forgiving of any outputs produced by the MapQuest web service.
class OutputGeocodedLocation(DataSetMixin):
    address = Column(Text, nullable=False)
    latitude = Column(Float, nullable=True)
    longitude = Column(Float, nullable=True)
    city = Column(Text, nullable=True)
    state = Column(Text, nullable=True)
    country = Column(Text, nullable=True)
    post_code = Column(Text, nullable=True)
    geocode_quality = Column(Text, nullable=True)

    def __init__(self, address, latitude=None, longitude=None, state=None, city=None, country=None, post_code=None, geocode_quality=None):
        self.address = address
        self.latitude = latitude
        self.longitude = longitude
        self.city = city
        self.state = state
        self.country = country
        self.post_code = post_code
        self.geocode_quality = geocode_quality

Widgets

In addition to the tropofy.widgets.SimpleGrid we are going to use the tropofy.widgets.ExecuteFunction and tropofy.widgets.KMLMap widgets.

The ExecuteFunction Widget

We use this widget to wrap the call to the MapQuest web service. Users who have signed up but not paid for a subscription are only previewing the app, which can be detected with tropofy.app.AppDataSet.user_is_previewing_app(), and for them we enforce a limit of 20 input addresses. The listing here keeps things simple and applies the 20-address limit to every run. We report progress to the user with send_progress_message(), called here on app_session.task_manager. Most of the logic of interfacing with the MapQuest web service has been moved out into functions in the rest of our Python module.

class GeocodeAddresses(ExecuteFunction):

    def get_button_text(self, app_session):
        return "Geocode your Addresses"

    def execute_function(self, app_session):

        if len(app_session.data_set.query(InputAddress).all()) > 20:
            app_session.task_manager.send_progress_message("You can only geocode up to 20 locations using the free version of this app")
        else:
            app_session.task_manager.send_progress_message("Deleting old results")
            app_session.data_set.query(OutputGeocodedLocation).delete()

            app_session.task_manager.send_progress_message("Geocoding addresses")
            result_text = call_geocoding_api(app_session)

            if result_text:
                app_session.task_manager.send_progress_message("Writing results to DB")
                results = json.loads(remove_non_ascii(result_text))
                gc = [construct_gl_from_result(mq_provided_location_as_string(result['providedLocation']), result['locations'][0] if result['locations'] else None) for result in results['results']]
                app_session.data_set.add_all(gc)

            app_session.task_manager.send_progress_message("Finished")
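
The list comprehension in execute_function is dense, so here is a minimal sketch of the same idea unpacked into steps: decode the JSON, then take the first candidate location for each result, or None when nothing matched. The sample_text payload is hypothetical. It is shaped the way the code above expects (results, providedLocation, locations, displayLatLng), but the values are made up and it is not real MapQuest output. The helper names first_location_or_none and summarise are also illustrative only.

```python
import json

# Hypothetical response text, shaped the way execute_function expects it.
# The values are made up for illustration, not real MapQuest output.
sample_text = json.dumps({
    "results": [
        {
            "providedLocation": {"street": "6 Luke St", "city": "Brisbane"},
            "locations": [{"displayLatLng": {"lat": -27.4, "lng": 153.0}}],
        },
        {
            "providedLocation": {"street": "Nowhere Rd"},
            "locations": [],  # nothing matched this address
        },
    ]
})


def first_location_or_none(result):
    """Return the first candidate location, or None when there are none."""
    locations = result["locations"]
    return locations[0] if locations else None


def summarise(text):
    """Pair each provided street with its best (lat, lng), or None."""
    rows = []
    for result in json.loads(text)["results"]:
        best = first_location_or_none(result)
        lat_lng = None
        if best is not None:
            lat_lng = (best["displayLatLng"]["lat"], best["displayLatLng"]["lng"])
        rows.append((result["providedLocation"]["street"], lat_lng))
    return rows


rows = summarise(sample_text)
```

An address with no match still produces a row, which mirrors how the app stores an OutputGeocodedLocation with only the address filled in.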

The KMLMapOutput Widget

This example illustrates the simplest possible interface to the KMLMap widget, generating the KML with the simplekml package imported above. Note that the coords of each point are given longitude first.

class KMLMapOutput(KMLMap):

    def get_kml(self, app_session):
        kml = Kml()
        mystyle = Style(iconstyle=IconStyle(scale=0.8, icon=Icon(href='https://maps.google.com/mapfiles/kml/paddle/blu-circle-lv.png')))
        for gc in app_session.data_set.query(OutputGeocodedLocation).all():
            if gc.longitude is not None and gc.latitude is not None:  # 0.0 is a valid coordinate
                point = kml.newpoint(name=gc.address, coords=[(gc.longitude, gc.latitude)])
                point.style = mystyle
        return kml.kml()
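
To see what kind of document get_kml returns, here is a minimal sketch that builds a one-point KML file using only the standard library instead of simplekml. The function name point_placemark_kml is hypothetical, and the output is a bare-bones approximation rather than the exact text simplekml produces (no styles, for example). The point to notice is that KML stores coordinates as longitude,latitude.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"
ET.register_namespace("", KML_NS)  # serialise without an ns0: prefix


def point_placemark_kml(name, latitude, longitude):
    """Build a one-point KML document as a string (longitude comes first)."""
    kml = ET.Element("{%s}kml" % KML_NS)
    document = ET.SubElement(kml, "{%s}Document" % KML_NS)
    placemark = ET.SubElement(document, "{%s}Placemark" % KML_NS)
    ET.SubElement(placemark, "{%s}name" % KML_NS).text = name
    point = ET.SubElement(placemark, "{%s}Point" % KML_NS)
    # KML expects "longitude,latitude", the reverse of the usual spoken order.
    ET.SubElement(point, "{%s}coordinates" % KML_NS).text = "%s,%s" % (longitude, latitude)
    return ET.tostring(kml, encoding="unicode")


kml_text = point_placemark_kml("6 Luke St", latitude=-27.4, longitude=153.0)
```

Swapping the two values is a common mistake and places the point in the wrong hemisphere, which is why the widget passes (gc.longitude, gc.latitude) in that order.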

The App Itself

The key features of our app are the GUI we define with the tropofy.app.AppWithDataSets.get_gui() function and the example data we provide with the tropofy.app.AppWithDataSets.get_examples() function.

  • We want users to be able to store different data sets within our app so we derive our app from the tropofy.app.AppWithDataSets class
  • Note there must be one and only one class that derives from tropofy.app.AppWithDataSets in your Python file so that Tropofy can create an instance of your app
  • Example data sets are defined by a name and a function which will load the example data into the database. Here we provide some example data for both the USA and Australia
  • Note that the functions that load the example data sets take a single parameter, through which they reach the tropofy.app.data_set.AppDataSet. It wraps SQLAlchemy’s session object, provides an interface to the database, and lets us add the example data via tropofy.app.AppDataSet.add_all()
  • For simplicity we hard code the example data in our Python code using two @staticmethod functions. Generally embedding data in code should be avoided.
  • You can use the function read_write_xl.create_example_data_set_from_excel to load your example data from Excel files. Other tutorials do exactly this.
class MyBulkGeocoderApp(AppWithDataSets):

    def get_name(self):
        return "Batch Geocoding"

    def get_examples(self):
        return {"Demo Brisbane Addresses": self.load_example_data_for_brisbane,
                "Demo New York Addresses": self.load_example_data_for_new_york}

    def get_static_content_path(self, app_session):
        return pkg_resources.resource_filename('te_batch_geocoding', 'static')

    def get_gui(self):
        step_group1 = StepGroup(name='Enter your data')
        step_group1.add_step(Step(
            name='Enter addresses data',
            widgets=[SimpleGrid(InputAddress)],
            help_text="""
                You can enter the entire address into the 'street' column, or if you have the address information broken up you can
                enter the components of each address into the right columns. This will generally give you higher quality answers"""
        ))

        step_group2 = StepGroup(name='Geocode')
        step_group2.add_step(Step(name='Geocode your Addresses', widgets=[GeocodeAddresses()]))

        step_group3 = StepGroup(name='View Geocodes')
        step_group3.add_step(Step(
            name='View Geocodes',
            widgets=[SimpleGrid(OutputGeocodedLocation), KMLMapOutput()],
            help_text='''
                The grid lists the latitude and longitude of each address that was geocoded.
                The map displays a preview of the KML (Google Earth) file that you can download below'''
        ))

        return [step_group1, step_group2, step_group3]

    @staticmethod
    def load_example_data_for_brisbane(app_session):
        addresses = []
        addresses.append(InputAddress("6 Luke St, Wavell Heights, Brisbane, 4012"))
        addresses.append(InputAddress("244 Edinburgh Castle Rd, Wavell Heights, Brisbane, 4012"))
        addresses.append(InputAddress("4 Aloe St, Wavell Heights, Brisbane, 4012"))
        addresses.append(InputAddress("6 Sylvan Ave, Wavell Heights, Brisbane, 4012"))
        app_session.data_set.add_all(addresses)

    @staticmethod
    def load_example_data_for_new_york(app_session):
        addresses = []
        addresses.append(InputAddress("Worth st", "Lower Manhattan", "New York City", "New York", "", "USA"))
        addresses.append(InputAddress("Fulton st", "Lower Manhattan", "New York City", "New York", "", "USA"))
        addresses.append(InputAddress("Washington St", "MeatPacking District", "New York City", "New York", "", "USA"))
        app_session.data_set.add_all(addresses)

    def get_icon_url(self):
        return "/{}/static/{}/geocoding.png".format(
            self.url_name,
            self.get_app_version(),
        )
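
The dict returned by get_examples maps a display name to a loader function. The sketch below shows that pattern on its own, with a hypothetical in-memory FakeDataSet standing in for the real data set and plain strings standing in for InputAddress rows. How Tropofy actually invokes the loaders is not shown in this tutorial, so the final two lines are an assumption about the general idea rather than a description of the framework.

```python
class FakeDataSet(object):
    """Hypothetical in-memory stand-in for a data set; only add_all is modelled."""

    def __init__(self):
        self.rows = []

    def add_all(self, items):
        self.rows.extend(items)


def load_brisbane(data_set):
    data_set.add_all(["6 Luke St, Wavell Heights, Brisbane, 4012"])


def load_new_york(data_set):
    data_set.add_all(["Worth St, New York City", "Fulton St, New York City"])


# Same shape as the dict returned by get_examples: display name -> loader.
examples = {
    "Demo Brisbane Addresses": load_brisbane,
    "Demo New York Addresses": load_new_york,
}

# Assumed usage: look the loader up by name and hand it an empty data set.
chosen = FakeDataSet()
examples["Demo New York Addresses"](chosen)
```

Because the values are ordinary callables, a new example data set only needs one more entry in the dict and one more loader function.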

Full code

"""
Author:      www.tropofy.com

Copyright 2015 Tropofy Pty Ltd, all rights reserved.

This source file is part of Tropofy and governed by the Tropofy terms of service
available at: http://www.tropofy.com/terms_of_service.html

This source file is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
or FITNESS FOR A PARTICULAR PURPOSE. See the license files for details.
"""

import requests
import json
from simplekml import Kml, Style, IconStyle, Icon
from sqlalchemy.schema import Column
from sqlalchemy.types import Text, Float
import pkg_resources

from tropofy.app import AppWithDataSets, Step, StepGroup
from tropofy.widgets import ExecuteFunction, KMLMap, SimpleGrid
from tropofy.database.tropofy_orm import DataSetMixin


class InputAddress(DataSetMixin):
    street = Column(Text, nullable=False)
    suburb = Column(Text, nullable=False, default='')
    city = Column(Text, nullable=False, default='')
    state = Column(Text, nullable=False, default='')
    post_code = Column(Text, nullable=False, default='')
    country = Column(Text, nullable=False, default='')

    def __init__(self, street, suburb='', city='', state='', post_code='', country=''):
        self.street = street
        self.suburb = suburb
        self.city = city
        self.state = state
        self.post_code = post_code
        self.country = country


class OutputGeocodedLocation(DataSetMixin):
    address = Column(Text, nullable=False)
    latitude = Column(Float, nullable=True)
    longitude = Column(Float, nullable=True)
    city = Column(Text, nullable=True)
    state = Column(Text, nullable=True)
    country = Column(Text, nullable=True)
    post_code = Column(Text, nullable=True)
    geocode_quality = Column(Text, nullable=True)

    def __init__(self, address, latitude=None, longitude=None, state=None, city=None, country=None, post_code=None, geocode_quality=None):
        self.address = address
        self.latitude = latitude
        self.longitude = longitude
        self.city = city
        self.state = state
        self.country = country
        self.post_code = post_code
        self.geocode_quality = geocode_quality


class GeocodeAddresses(ExecuteFunction):

    def get_button_text(self, app_session):
        return "Geocode your Addresses"

    def execute_function(self, app_session):

        if len(app_session.data_set.query(InputAddress).all()) > 20:
            app_session.task_manager.send_progress_message("You can only geocode up to 20 locations using the free version of this app")
        else:
            app_session.task_manager.send_progress_message("Deleting old results")
            app_session.data_set.query(OutputGeocodedLocation).delete()

            app_session.task_manager.send_progress_message("Geocoding addresses")
            result_text = call_geocoding_api(app_session)

            if result_text:
                app_session.task_manager.send_progress_message("Writing results to DB")
                results = json.loads(remove_non_ascii(result_text))
                gc = [construct_gl_from_result(mq_provided_location_as_string(result['providedLocation']), result['locations'][0] if result['locations'] else None) for result in results['results']]
                app_session.data_set.add_all(gc)

            app_session.task_manager.send_progress_message("Finished")


class KMLMapOutput(KMLMap):

    def get_kml(self, app_session):
        kml = Kml()
        mystyle = Style(iconstyle=IconStyle(scale=0.8, icon=Icon(href='https://maps.google.com/mapfiles/kml/paddle/blu-circle-lv.png')))
        for gc in app_session.data_set.query(OutputGeocodedLocation).all():
            if gc.longitude is not None and gc.latitude is not None:  # 0.0 is a valid coordinate
                point = kml.newpoint(name=gc.address, coords=[(gc.longitude, gc.latitude)])
                point.style = mystyle
        return kml.kml()


class MyBulkGeocoderApp(AppWithDataSets):

    def get_name(self):
        return "Batch Geocoding"

    def get_examples(self):
        return {"Demo Brisbane Addresses": self.load_example_data_for_brisbane,
                "Demo New York Addresses": self.load_example_data_for_new_york}

    def get_static_content_path(self, app_session):
        return pkg_resources.resource_filename('te_batch_geocoding', 'static')

    def get_gui(self):
        step_group1 = StepGroup(name='Enter your data')
        step_group1.add_step(Step(
            name='Enter addresses data',
            widgets=[SimpleGrid(InputAddress)],
            help_text="""
                You can enter the entire address into the 'street' column, or if you have the address information broken up you can
                enter the components of each address into the right columns. This will generally give you higher quality answers"""
        ))

        step_group2 = StepGroup(name='Geocode')
        step_group2.add_step(Step(name='Geocode your Addresses', widgets=[GeocodeAddresses()]))

        step_group3 = StepGroup(name='View Geocodes')
        step_group3.add_step(Step(
            name='View Geocodes',
            widgets=[SimpleGrid(OutputGeocodedLocation), KMLMapOutput()],
            help_text='''
                The grid lists the latitude and longitude of each address that was geocoded.
                The map displays a preview of the KML (Google Earth) file that you can download below'''
        ))

        return [step_group1, step_group2, step_group3]

    @staticmethod
    def load_example_data_for_brisbane(app_session):
        addresses = []
        addresses.append(InputAddress("6 Luke St, Wavell Heights, Brisbane, 4012"))
        addresses.append(InputAddress("244 Edinburgh Castle Rd, Wavell Heights, Brisbane, 4012"))
        addresses.append(InputAddress("4 Aloe St, Wavell Heights, Brisbane, 4012"))
        addresses.append(InputAddress("6 Sylvan Ave, Wavell Heights, Brisbane, 4012"))
        app_session.data_set.add_all(addresses)

    @staticmethod
    def load_example_data_for_new_york(app_session):
        addresses = []
        addresses.append(InputAddress("Worth st", "Lower Manhattan", "New York City", "New York", "", "USA"))
        addresses.append(InputAddress("Fulton st", "Lower Manhattan", "New York City", "New York", "", "USA"))
        addresses.append(InputAddress("Washington St", "MeatPacking District", "New York City", "New York", "", "USA"))
        app_session.data_set.add_all(addresses)

    def get_icon_url(self):
        return "/{}/static/{}/geocoding.png".format(
            self.url_name,
            self.get_app_version(),
        )


def call_geocoding_api(app_session):
    '''
    For some mapquest development resources see...
    http://developer.mapquest.com/web/info/terms-of-use
    http://www.mapquestapi.com/geocoding/#batch
    http://developer.mapquest.com/web/products/open/geocoding-service

    You will need to get a valid MapQuest key to run this yourself. Put the key in the .ini file you're using to run this app under 'custom.mapquest_key'.
    '''
    addresses_to_geocode = {'locations': [
        {'street': b.street, 'city': b.city, 'state': b.state, 'postalCode': b.post_code, 'country': b.country} for b in app_session.data_set.query(InputAddress).all()]
    }

    key = app_session.server_settings.get('custom.mapquest_key')
    if not key:
        raise Exception('No MapQuest key found.')
    url = 'http://open.mapquestapi.com/geocoding/v1/batch?key=%s' % (key)

    try:
        response = requests.post(url, data=json.dumps(addresses_to_geocode))  # Send json in POST body
        if not response.ok:
            if response.status_code == 404:
                raise Exception("An error occurred connecting to mapquestapi.")
            elif response.status_code == 500:
                raise Exception("The server couldn't fulfill the request.")
            elif response.status_code == 403:
                raise Exception("Unauthorised MapQuest key.")
            else:
                raise Exception('An error occurred')
    except requests.exceptions.ConnectionError:
        raise Exception("An error occurred connecting to mapquestapi.")
    return response.text  # decoded text, so remove_non_ascii iterates over characters


def remove_non_ascii(s):
    return "".join(i for i in s if ord(i) < 128)


def mq_provided_location_as_string(data):
    return ", ".join(value for key, value in data.items() if value)


def interpret_mq_geocode_quality(qual):
    if qual[:2] in ['P1', 'L1', 'I1', 'B1']:  # B2 and B3 fall through to "Fair"
        return "Excellent"
    elif qual[:2] in ['B2', 'B3']:
        return "Fair"
    else:
        return "Poor"


def construct_gl_from_result(address, result):
    if result is not None:
        return OutputGeocodedLocation(
            address=address,
            latitude=result['displayLatLng']['lat'],
            longitude=result['displayLatLng']['lng'],
            state=result['adminArea3'],
            city=result['adminArea5'],
            country=result['adminArea1'],
            post_code=result['postalCode'],
            geocode_quality=interpret_mq_geocode_quality(result['geocodeQualityCode'])
        )
    return OutputGeocodedLocation(address)
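
The request body assembled in call_geocoding_api can be examined without the framework or a MapQuest key. The following is a minimal sketch, assuming the addresses are supplied as plain dicts keyed by the InputAddress column names. The function name build_batch_payload is hypothetical. Like the original, it renames post_code to postalCode and does not send the suburb column.

```python
import json


def build_batch_payload(addresses):
    """Serialise address dicts into the JSON body used for the batch request.

    `addresses` is a list of dicts with the InputAddress column names as keys.
    """
    locations = [
        {
            'street': a['street'],
            'city': a.get('city', ''),
            'state': a.get('state', ''),
            'postalCode': a.get('post_code', ''),  # renamed for the web service
            'country': a.get('country', ''),
        }
        for a in addresses
    ]
    return json.dumps({'locations': locations})


body = build_batch_payload([
    {'street': 'Worth St', 'city': 'New York City', 'state': 'New York', 'country': 'USA'},
])
```

Keeping the payload construction in a small pure function like this makes it easy to check the field names before any network call is made.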