Description and context

The R package GPE (Georeferenced Program Evaluation) includes functions for studying various aspects of consumer or beneficiary behavior. They allow the user to:

  • Georeference postal addresses (converting street name and number to latitude and longitude points)

  • Obtain base maps of any city’s urban grid and visualize overlaid projections of the distance traveled (between participants’ homes and points of service accessed, for example)

  • Estimate a matrix of distances between sites accessed and consumer/beneficiary origin

  • Calculate metrics of frequency, distribution by group, and distance travelled by consumers/beneficiaries to points of sale/access

  • Create visualizations that explore differences in frequency and type of consumption by consumer/beneficiary attribute (depending on the information available: socioeconomic status, length of program participation, type of program participation, etc.)

User guide

Basic usage

For complete documentation of this R package, please see the documentation website. Examples of the functions can be found below and in the doc folder of this repository.

You’ll need tabular data with addresses, or with “lon” and “lat” columns containing WGS84 longitude and latitude coordinates. For example, this could be a table with the addresses of schools that program beneficiaries attend, a street or centroid identifying their neighborhoods, or their home addresses if using personal data is allowed.
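As a rough illustration of the two accepted input shapes, the sketch below builds one table with an address column and one with “lon” and “lat” columns. The data frame names, the values, and the helper has_coords() are all hypothetical and are not part of GPE.

```r
# Hypothetical input tables; names and values are made up for illustration.

# Option 1: addresses that still need geocoding
schools_by_address <- data.frame(
  name    = c("School A", "School B"),
  address = c("Av. Corrientes 1234, Buenos Aires, Argentina",
              "Av. Rivadavia 5678, Buenos Aires, Argentina"),
  stringsAsFactors = FALSE
)

# Option 2: coordinates already available as longitude/latitude
schools_by_coords <- data.frame(
  name = c("School A", "School B"),
  lon  = c(-58.3850, -58.4370),
  lat  = c(-34.6040, -34.6190)
)

# Hypothetical helper: TRUE when both coordinate columns are present
has_coords <- function(df) all(c("lon", "lat") %in% names(df))

has_coords(schools_by_address)  # FALSE
has_coords(schools_by_coords)   # TRUE
```

A table of the first kind would go through geocoding first; a table of the second kind already has the columns the mapping functions expect.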

Geocoding

You can geocode your data (obtain latitude and longitude from addresses) using GPE_geocode(), a function that queries the Google Maps Platform to translate addresses into latitude/longitude coordinates.

Using the Google Maps Platform requires a registered API key. To obtain it, follow instructions from https://developers.google.com/maps/gmp-get-started. Make sure you enable the Geocoding API.

A valid API key is a string of letters and numbers that looks like AIjaSyBR76W62lloYPh_c01LYGhCOZuKU6RVW9 (this is not a real key, just an example). Once you have your key, you can take a data frame with an address column and add latitude and longitude columns using GPE_geocode(), specifying the name of the address column and your Google Maps Platform API key:

key <- "AIjaSyBR76W62lloYPh_c01LYGhCOZuKU6RVW9" # This is not a real API key, you must provide yours

GPE_geocode(schools, address, key)

Columns with latitude and longitude coordinates are added to the original data frame.
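Since an API key is a credential, one common pattern is to keep it out of scripts and read it from an environment variable instead. The sketch below assumes a variable named GOOGLE_MAPS_API_KEY and a helper called get_api_key(); both names are hypothetical and GPE does not require either.

```r
# Hypothetical helper: read the key from an environment variable and
# stop early with a clear message if it is missing.
get_api_key <- function(var = "GOOGLE_MAPS_API_KEY") {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    stop("Environment variable ", var, " is not set", call. = FALSE)
  }
  key
}

# Usage, assuming the variable was set in ~/.Renviron or in the shell:
# key <- get_api_key()
# GPE_geocode(schools, address, key)
```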

Mapping geocoded data

To map the geocoded data, you’ll need a data frame containing participant data. GPE includes “participants”, an example data frame with fictional public program beneficiaries. The geographic position of participants and program locations can be plotted, over a basemap, using:

GPE_plot_map(participants)

In order for GPE_plot_map() to work, the input data frame must include columns named “lat” and “lon” representing WGS84 coordinates. As previously shown, latitude and longitude columns can be obtained from addresses using GPE_geocode().
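Before plotting, it can help to confirm that the “lat” and “lon” columns hold plausible values, since swapped or missing coordinates are easy to overlook. The check below is a minimal sketch in base R; invalid_coord_rows() is a hypothetical helper, not a GPE function, and the points are made up.

```r
# Hypothetical pre-check, not part of GPE: returns the row numbers whose
# coordinates are missing or outside the valid latitude/longitude ranges.
invalid_coord_rows <- function(df) {
  stopifnot(all(c("lat", "lon") %in% names(df)))
  bad <- is.na(df$lat) | is.na(df$lon) |
    abs(df$lat) > 90 | abs(df$lon) > 180
  which(bad)
}

# Made-up points: row 2 has an out-of-range latitude, row 3 is missing one
pts <- data.frame(
  lat = c(-34.60, -158.38, NA),
  lon = c(-58.38, -34.60, -58.40)
)

invalid_coord_rows(pts)  # 2 3
```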

In addition, a data frame containing locations (places that participants visit in order to interact with the program, such as training centers, day care providers, etc.) can also be mapped. GPE includes “locations”, an example data frame with fictional public program sites. This dummy data can be used as follows to generate a more detailed map:

GPE_plot_map(participants, locations)

You can also visualize participant attributes by indicating the name of the column that should be used:

GPE_plot_map(participants, locations, participant_attribute = "group")

The same can be done for location attributes…

GPE_plot_map(participants, locations, location_attribute = "type")

… or both participant and location attributes:

GPE_plot_map(participants, locations, participant_attribute = "group", location_attribute = "type")

Estimating travel distance and time

Using records of interaction between people and places (representing trips by consumers/beneficiaries to points of sale/access), GPE can estimate time and distance travelled with the Google Maps Platform. As is the case with geocoding, a valid API key is needed to access this service.

GPE provides an example “visits” data frame:

visits

Trip distance and duration can be obtained using GPE_travel_time_dist(). As inputs, the function uses visits data, and the locations and participants data frames to provide the origin and destination coordinates. The mode of transport can be chosen from “transit” (default), “driving”, “walking”, or “bicycling”. Keep in mind that transit routing information is not available for all cities; “driving” and “walking” routing is usually available everywhere.

GPE_travel_time_dist(visits, participants, locations, key)
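Routed distances can be sanity-checked against straight-line distances, which should never be larger than the routed ones. The sketch below uses the haversine formula for this; it is a swapped-in technique for comparison only, not how GPE computes distances (GPE queries the Google Maps Platform), and haversine_km() is a hypothetical helper.

```r
# Great-circle (haversine) distance in km between longitude/latitude points.
# Not GPE's method: this is only a straight-line lower bound for comparison.
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin() guards against rounding pushing sqrt(a) slightly above 1
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# One degree of latitude is roughly 111 km
haversine_km(0, 0, 0, 1)
```

A routed distance that comes back shorter than the straight-line value for the same origin and destination would suggest a problem with the coordinates or with the join between tables.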

Summaries

GPE includes a simple summary function that takes a “visits” data frame, as described above, and returns basic descriptive statistics for:

  • frequency of visits, by participant (person)

  • frequency of visits, by location (site)

If the input data frame includes time_minutes and distance_km columns (e.g. as a result of using GPE_travel_time_dist()), the summary will also include basic descriptive statistics for:

  • travel time, in minutes

  • travel distance, in km

For example, given a data frame like:

… the result is:

GPE_summary(visits_timedist)
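To give a sense of the kind of statistics involved, the sketch below computes visit frequencies and trip summaries with base R. It is not the implementation of GPE_summary(), and its output format will differ. The table is made up, and id_participant and id_location are hypothetical column names; time_minutes and distance_km follow the names mentioned above.

```r
# Made-up visits table; id_participant and id_location are hypothetical names
visits_example <- data.frame(
  id_participant = c(1, 1, 2, 3, 3, 3),
  id_location    = c("a", "b", "a", "a", "a", "b"),
  time_minutes   = c(10, 25, 15, 30, 30, 20),
  distance_km    = c(1.2, 4.0, 2.1, 5.5, 5.5, 3.0)
)

# Frequency of visits by participant and by location
visits_per_participant <- table(visits_example$id_participant)
visits_per_location    <- table(visits_example$id_location)

# Basic descriptive statistics of those frequencies and of the trips
summary(as.vector(visits_per_participant))
summary(as.vector(visits_per_location))
summary(visits_example$time_minutes)
summary(visits_example$distance_km)
```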

Installation guide

install.packages("devtools")
devtools::install_github("EL-BID/GPE")

Dependencies

GPE depends on the following packages:

ggmap

Some functions in GPE will also require a Google API key. A key can be created by following the instructions at https://developers.google.com/maps/gmp-get-started. For more information, use the R command ?ggmap::register_google.

How to contribute

For all contributions to this project, this repository may be forked directly.

Authors

GPE was developed by H. Antonio Vazquez Brust

License

The Documentation of Support and Use of the software is licensed under Creative Commons IGO 3.0 Attribution-NonCommercial-NoDerivative (CC-IGO 3.0 BY-NC-ND)

The codebase of this repo uses AM-331-A3 Software License.

Limitation of responsibilities

The IDB is not responsible, under any circumstance, for damage or compensation, moral or patrimonial; direct or indirect; accessory or special; or by way of consequence, foreseen or unforeseen, that could arise:

i. Under any concept of intellectual property, negligence or detriment of another part theory; and

ii. Following the use of the Digital Tool, including, but not limited to defects in the Digital Tool, or the loss or inaccuracy of data of any kind. The foregoing includes expenses or damages associated with communication failures and / or malfunctions of computers, linked to the use of the Digital Tool.