Data Science Programming Library

In the course of their work, MEASURE Evaluation staff repeatedly use standard datasets and rely on a core set of methods to analyze them. Staff frequently reuse the same core code across many project activities. The outputs of this code inform the project’s analyses, and the findings are presented in reports.

This page presents examples of the core R code for kriging and mapping that the project has used to support its geospatial analysis of big data for global health. The examples show how to work with Demographic and Health Surveys (DHS) and Twitter data programmatically and how to analyze them.

The code is provided in three sets (two Jupyter notebooks and an R Markdown document) to make it easier to use. More information about Jupyter notebooks is provided at the end of this document. R Markdown files can be opened directly in R or in RStudio.

1)  Spatial regression and kriging using DHS data

  • Spatial regression and kriging are two powerful techniques for examining relationships in data from a spatial perspective. This document gives an overview of the basics of spatial regression and then demonstrates how to conduct this type of analysis using DHS data.
  • HTML: Kriging.html
  • R Markdown: Kriging.Rmd
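
To give a sense of what kriging does, the following is a minimal sketch of ordinary kriging in plain Python. It is not the project's R code from Kriging.Rmd. The coordinates, indicator values, and variogram parameters (`sill`, `range_`, zero nugget) are all made-up assumptions; in practice the variogram would be fitted to the survey data rather than fixed by hand.

```python
import math

def variogram(h, sill=1.0, range_=2.0):
    """Exponential semivariogram with zero nugget (illustrative parameters)."""
    return sill * (1.0 - math.exp(-h / range_))

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def ordinary_kriging(coords, values, target):
    """Predict the value at target as a weighted sum of the observed values."""
    n = len(coords)
    # Semivariances between observations, bordered by the constraint
    # that the weights sum to one (the Lagrange multiplier row and column).
    a = [[variogram(math.dist(coords[i], coords[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    # Semivariances between each observation and the prediction location.
    b = [variogram(math.dist(c, target)) for c in coords] + [1.0]
    weights = solve(a, b)[:n]
    prediction = sum(w * v for w, v in zip(weights, values))
    return prediction, weights

# Hypothetical survey cluster locations (arbitrary units) and indicator values.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 0.5)]
values = [12.0, 15.0, 11.0, 18.0, 20.0]

prediction, weights = ordinary_kriging(coords, values, target=(0.5, 0.5))
print(round(prediction, 2), [round(w, 3) for w in weights])
```

With a zero nugget, the prediction at an observed location reproduces the observed value, and the weights always sum to one. Both properties are quick sanity checks when adapting a sketch like this.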

2)  Mapping Twitter data

3)  K-Means of Twitter data

  • K-Means is a basic clustering method that groups similar records into clusters. This approach can be useful with social media data for examining patterns in factors such as number of followers (for example, to identify high influencers) and trends in hashtags and likes.
  • HTML: Twitter Kmeans.HTML
  • Jupyter notebook: Twitter Kmeans.ipynb
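
The clustering idea described above can be sketched in a few lines of plain Python. This is an illustration only, not the code in the project's notebook. The account records (follower count, average likes) are invented, and the helper names `nearest` and `kmeans` are hypothetical.

```python
import random

def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    distances = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centroids]
    return distances.index(min(distances))

def kmeans(points, k, max_iterations=100, seed=0):
    """Basic K-Means: alternate between assigning records and updating centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct records
    for _ in range(max_iterations):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                # New centroid is the column-wise mean of the cluster's records.
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                # Keep the old centroid if no record was assigned to it.
                updated.append(centroids[i])
        if updated == centroids:
            break  # assignments are stable
        centroids = updated
    labels = [nearest(p, centroids) for p in points]
    return labels, centroids

# Made-up accounts: (number of followers, average likes per tweet).
accounts = [(120, 3), (200, 5), (150, 4), (98000, 410), (105000, 390), (99000, 400)]

labels, centroids = kmeans(accounts, k=2)
for account, label in zip(accounts, labels):
    print(account, "-> cluster", label)
```

Here the three low-follower accounts end up in one cluster and the three high-follower accounts in the other. Because follower counts are much larger than like counts, they dominate the distance; with real data, features on different scales would usually be standardized or log-transformed first.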