Tools for Researchers

Explore and download the latest open-source tools for data analysis developed as part of PDEL's work. 

SMSSurvey: An Open-Source SMS-based Messaging Platform for Data Intensive Research


The widespread expansion of mobile phones into even remote parts of the developing world has opened up a flexible and cost-effective conduit through which researchers can interact with previously marginalized populations. Building on the success of mobile-based research projects run from UC San Diego, we have constructed a customizable open-source software platform that fundamentally changes the ability of researchers to gather data from, and disseminate data to, low-income households worldwide. The platform supports the creation of SMS data loops that will enable new forms of sophisticated mobile phone-based information interventions and research. It will give researchers and graduate students the last-mile capability to create systems that solicit and provide information directly to the “bottom billion.”

Our SMS messaging platform addresses a fundamental barrier to reaching remote populations in developing countries: cost. A typical study in rural Africa might cost $100 per survey to conduct, with much of that money eaten up in fuel and vehicle rentals. An SMS message typically costs 5 cents or less to send (and nothing to receive) within a developing country, and the mobile network is the most ubiquitous and robust infrastructure to have been deployed in rural Africa and South Asia.

There are currently many services that allow a large number of text messages to be sent relatively cheaply almost anywhere on earth, but these systems are not typically built to handle responses: the information flow is one-way. In our SMS messaging platform, incoming and outgoing messages are handled by an in-country server, so that communication is very low-cost for both researchers and respondents.

The software has been field tested in Uganda.

System Features

The system has the following features:

  1. Ability to send and receive SMS messages. SMS messages are sent at the convenience of the mobile provider, and can often arrive in an order different from the one in which they were sent. An important part of building a sophisticated SMS system, then, is establishing protocols for the ways in which responses are sent that will allow them to be sequenced upon receipt.
  2. The freedom to choose a local number belonging to any carrier in any nation.
  3. Support for multiple carriers at the same time for the same survey. The system can dynamically select an originating number on a particular carrier, based on the carrier of the destination number. This has implications for mobile airtime transfers.
  4. Backend data infrastructure to collect, clean, and analyze the incoming data.
  5. Support for low-cost single-board computers such as the Raspberry Pi (a credit-card-sized computer) as well as higher-end servers. The software runs on MySQL and PHP, both of which are readily available.
  6. Integration with SMS-based mobile airtime transfers.
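
Feature 1 notes that responses may arrive out of order and must be sequenced on receipt. The following is a minimal sketch of one way to do that, assuming a hypothetical reply format in which each response begins with its question number (for example "2 maize"). The format, the pattern, and the function names are illustrative assumptions and are not taken from SMSSurvey.

```python
import re

# Hypothetical reply format: "<question number> <answer>", e.g. "2 maize".
REPLY_PATTERN = re.compile(r"^\s*(\d+)\s+(.*\S)\s*$")


def parse_reply(text):
    """Return (question_number, answer), or None if the text does not match."""
    match = REPLY_PATTERN.match(text)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


def sequence_replies(texts):
    """Order replies by question number, regardless of arrival order.

    Messages that do not match the assumed format are returned separately
    so that they can be reviewed by hand.
    """
    parsed, rejected = [], []
    for text in texts:
        result = parse_reply(text)
        if result is None:
            rejected.append(text)
        else:
            parsed.append(result)
    parsed.sort(key=lambda pair: pair[0])
    return parsed, rejected
```

Carrying the sequence key inside the message body means the ordering can be recovered without relying on network delivery times.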

Our core architecture is based on using AT commands to communicate with cell phones or GSM modems to send and receive text messages. The setup consists of servers or Raspberry Pis connected to cell phones via Bluetooth or USB. A single server can connect to up to 64 cell phones. Each phone has a SIM card that can belong to any carrier of choice. Fig. 1 shows the block schematic.
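
To illustrate what sending a message with AT commands involves, here is a minimal sketch that builds the standard GSM text-mode command sequence (`AT+CMGF=1` to select text mode, then `AT+CMGS` followed by the message body and Ctrl-Z). The function name and the phone number used below are hypothetical, and the serial I/O itself is omitted. This is not SMSSurvey's own code, which is written in PHP.

```python
CTRL_Z = b"\x1a"  # terminates the message body in text mode


def build_send_sms_commands(number, text):
    """Return the byte strings to write to the modem, in order, to send one SMS.

    A real driver must wait for the modem's reply ("OK", then the "> "
    prompt) between writes; that handshake is left out of this sketch.
    Encoding the body as ASCII is a simplification of the GSM alphabet.
    """
    if '"' in number or "\r" in number or "\n" in number:
        raise ValueError("invalid character in phone number")
    return [
        b"AT+CMGF=1\r",                                    # select text mode
        f'AT+CMGS="{number}"\r'.encode("ascii"),           # start a message
        text.encode("ascii", errors="replace") + CTRL_Z,   # body, then Ctrl-Z
    ]
```

For example, `build_send_sms_commands("+256700000000", "Hello")` yields three writes, the last of which is the message text followed by the Ctrl-Z byte.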

The SMS gateway engine is responsible for sending and receiving SMS via the cell phones. The duties of the engine include:

  • Reading SMS messages from the outgoing message queue and sending them. If necessary, the engine can send via the phone of a specific carrier.
  • Reading incoming SMS messages and triggering the appropriate callbacks to initiate the necessary action.
  • Periodically retrying messages that were not successfully delivered.
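
The outgoing side of these duties can be sketched as follows. This is an in-memory illustration only: the prefix-to-carrier table, the carrier names, and the `send` callback are hypothetical, and a real engine would read from a database queue and wait between retries rather than retrying immediately.

```python
from collections import deque

# Hypothetical mapping from number prefix to carrier; real prefixes vary by country.
PREFIX_TO_CARRIER = {"+25677": "carrier_a", "+25675": "carrier_b"}


def carrier_for(number, default="carrier_a"):
    """Guess the destination carrier from the number prefix."""
    for prefix, carrier in PREFIX_TO_CARRIER.items():
        if number.startswith(prefix):
            return carrier
    return default


def process_queue(messages, send, max_attempts=3):
    """Send each (number, text) pair, retrying failures up to max_attempts.

    `send(carrier, number, text)` is supplied by the caller and returns True
    on success. Returns the messages that still failed after all attempts.
    """
    pending = deque((number, text, 1) for number, text in messages)
    failed = []
    while pending:
        number, text, attempt = pending.popleft()
        if send(carrier_for(number), number, text):
            continue
        if attempt < max_attempts:
            pending.append((number, text, attempt + 1))  # queue for a retry
        else:
            failed.append((number, text))
    return failed
```

Choosing the originating carrier to match the destination carrier, as in `carrier_for`, mirrors the multi-carrier feature described earlier.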


SMSSurvey is open-source and is now accessible to the research community through GitHub.



PowerCalculator: Designing Experiments in the Presence of Interference


The possibility of interference between individuals has traditionally been seen as the Achilles' heel of randomized experiments, because contamination of the control group by spillover effects generates impact estimates that are internally invalid. Research designs and randomized controlled trials that fail to account for spillovers can produce biased estimates of intention-to-treat effects, while finding meaningful treatment effects but failing to observe deleterious spillovers can lead to misconstrued policy conclusions. In many contexts, a full understanding of the policy environment requires us to measure spillover and threshold effects that are not captured by (or, worse, are sources of bias in) standard experimental designs.

This software allows a researcher to explore the statistical power of experiments to identify estimands of treatment and spillover effects when there is interference between units. We focus on settings with partial interference, in which individuals are split into mutually exclusive clusters, such as villages or schools, and interference occurs between individuals within a cluster but not across clusters. We consider experiments in which treatment is allocated using a randomized saturation (RS) design, a two-stage randomization procedure. In the first stage, the share of individuals assigned to treatment within each cluster (the saturation) is randomized. In the second stage, the individuals within each cluster are randomly assigned to treatment according to the realized cluster-level saturation from the first stage. RS designs can be used to identify a rich set of estimands, including the treatment effect and spillover effect on untreated individuals at specific saturations, slope effects measuring how spillover effects change with respect to treatment saturation, and pooled effects across multiple saturations.
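
The two-stage procedure can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the PowerCalculator code: the function name is hypothetical, each cluster's saturation is drawn uniformly from a user-supplied list, and the number treated in a cluster is the saturation times the cluster size, rounded to an integer.

```python
import random


def assign_randomized_saturation(clusters, saturations, seed=None):
    """Two-stage randomized saturation assignment.

    `clusters` maps cluster id -> list of individual ids.
    `saturations` lists the candidate treatment shares, e.g. [0.0, 0.5, 1.0].
    Returns (cluster_saturation, treated), where `treated` maps each
    individual id to True or False.
    """
    rng = random.Random(seed)
    cluster_saturation = {}
    treated = {}
    for cluster_id, members in clusters.items():
        # Stage 1: draw the cluster's saturation.
        share = rng.choice(saturations)
        cluster_saturation[cluster_id] = share
        # Stage 2: treat the realized share of members, chosen at random.
        n_treated = round(share * len(members))
        chosen = set(rng.sample(members, n_treated))
        for person in members:
            treated[person] = person in chosen
    return cluster_saturation, treated
```

Untreated individuals in clusters with positive saturation are the ones on whom spillover effects can be measured, while clusters drawn at saturation zero act as pure controls.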

For a given RS design, our software allows a user to calculate the minimum detectable effect (MDE) of these estimands, which is the smallest value of an estimand that can be distinguished from zero. The software also calculates the optimal RS design for different researcher objectives. Given a set of estimands and a set of weights specified by the researcher, the software calculates the RS design that minimizes the weighted sum of the MDEs for the specified set of estimands. Our paper establishes that introducing variation into the treatment saturation of clusters affects the power to detect different estimands. These optimal design calculations allow the researcher to precisely characterize this power trade-off.
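
For intuition about what an MDE calculation looks like, the sketch below implements the generic textbook formula for a simple cluster-randomized design in which whole clusters are treated or not, using a normal approximation. It is not the randomized saturation formula derived in the paper or used by the software, and the function and parameter names are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def clustered_mde(n_clusters, cluster_size, share_treated, icc,
                  sigma=1.0, alpha=0.05, power=0.80):
    """Textbook MDE for a cluster-randomized design (normal approximation).

    MDE = (z_{1-alpha/2} + z_{power}) * sigma
          * sqrt(1 / (P * (1 - P) * J)) * sqrt(icc + (1 - icc) / n)

    where J is the number of clusters, n the cluster size, P the share of
    clusters treated, and icc the intra-cluster correlation.
    """
    z = NormalDist()
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    design = sqrt(1.0 / (share_treated * (1 - share_treated) * n_clusters))
    clustering = sqrt(icc + (1 - icc) / cluster_size)
    return multiplier * sigma * design * clustering
```

In this simple setting, adding clusters lowers the MDE, while a higher intra-cluster correlation raises it. The randomized saturation case adds the further trade-off across estimands described above.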


We have provided a GUI for ease of use. The video provides a tutorial on how to use it. We have also supplied Python, R, and MATLAB code. 

Citing the Software

A. Bohren, P. Staples, S. Baird, C. McIntosh, and B. Özler (2016). Power Calculation Software for Randomized Saturation Experiments, Version 1.0. Available from

Video Tutorial for the Application

Theoretical Underpinnings

Download the working paper