Creating Technology for Social Change

Dispatches from #NICAR14: Mohammed Haddad and Robert Benincasa on harnessing the power of the crowd

I’m currently in Baltimore for the 2014 conference of NICAR (the National Institute for Computer-Assisted Reporting). In this series, I’ll be liveblogging the various talks and workshops I attend. Keep in mind that this is by no means exhaustive coverage of all the cool stuff going on at the conference. For more, check out Chrys Wu’s index of slides, tutorials, links, and tools or follow #nicar14 on Twitter. Read on for my summary of Mohammed Haddad and Robert Benincasa’s presentation on the crowdsourced story.

Mohammed Haddad is a data journalist at Al Jazeera English. His work revolves around taking data sets and transforming them into data visualizations for interactive storytelling. Prior to joining Al Jazeera, Haddad was involved in crowdsourcing research and applications in South Africa.

Robert Benincasa is the producer for computer-assisted reporting in NPR News’ investigations unit, based in Washington. His recent work includes an examination and social-media-driven database of playground access for children with disabilities, and an analysis of air pollution discharges in Louisiana. Benincasa also serves on the faculty of Georgetown University’s master’s degree program in journalism.


Mohammed, a newcomer to NICAR, kicks off the talk. “My job is to take data and tell stories from that data… if you give individuals the ability to create data and tell their stories, there is something incredibly powerful about what they’re able to give you.” He uses the visual metaphor of a megaphone: we shouldn’t focus on the aggregate voice of the crowd, but on the individual stories that go into the megaphone.

Mohammed points us to Jeff Howe’s definition of crowdsourcing: “The act of taking a job traditionally performed by a designated agent and outsourcing it to an undefined, generally large group of people in the form of an open call.” You shouldn’t only think about the open call as being people on the internet, he says.

There is some pushback against the idea of crowdsourcing: critics say crowds can never be more reliable than experts. Others, like James Surowiecki, are believers in the “theory of wise crowds” and see the crowd as more than the sum of its parts. When Mohammed joined Al Jazeera in 2011, the Arab Spring was in full force, with the crowd (the public) disrupting their governments from the ground up.

When using crowdsourcing to investigate the Arab Spring, there were many issues to keep in mind. In the Arab world in particular, there is a gap between who’s using social media and who’s not, which biases any sample drawn from it. He brings up the example of Somalia Speaks, a crowdsourced project that seeks to visualize citizen sentiments. SMS was chosen as the platform because of its greater degree of penetration. Somalia Speaks became the basis for numerous other “Speaks” projects, including Gaza, Uganda, and the Balkans.

Technologically, these projects weren’t too sophisticated; they used standard Ushahidi instances. But for Libya and Mali, the team began to explore different visualization techniques based on emergent trends in the responses. In Kenya, Mohammed’s team used the map to draw relationships between these sentiments and elections, as well as violent incidents. Most recently, the Nelson Mandela visualization, which was not based on geographic data, used a mosaic of pictures arranged in the image of Mandela’s face.

Mohammed gives some tips for crowdsourcing projects. “Internally, it’s up to your team to make sure you’re asking the right questions.” He also emphasizes the importance of teaming up with local partners to ensure robust sampling, rather than relying on a convenience sample of, say, the capital city. Over time, you can develop a repeatable workflow and modify parts of it as needed. Finally, Mohammed mentions technology. He’s left it until the end because you should only make decisions about implementation after you’ve done your research and found your story.

It’s key to understand that data is an abstraction of somebody’s real-life story. Mohammed emphasizes, “It’s not just a data point, it’s a real story that you shouldn’t just dilute for the aggregate voice of the crowd.” In particular, you should look for gaps in this aggregate voice, because they represent places where people’s stories aren’t being told. Big data doesn’t mean diversity. “If you’re sampling an entire country, don’t only look at the capital city… always look for the gaps.”
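One way to make “look for the gaps” concrete is to compare each region’s share of crowdsourced reports against its share of the population. The sketch below is my own illustration, not something shown in the talk: the function name `coverage_gaps`, the `threshold` parameter, the region names, and all the numbers are made up for the example.

```python
from collections import Counter


def coverage_gaps(reports, population, threshold=0.5):
    """Return regions whose share of reports falls below
    `threshold` times their share of the population.

    reports:    list of dicts, each with a "region" key (hypothetical schema)
    population: dict mapping region name -> population count
    """
    counts = Counter(r["region"] for r in reports)
    total_reports = sum(counts.values())
    total_pop = sum(population.values())

    gaps = []
    for region, pop in population.items():
        pop_share = pop / total_pop
        report_share = counts.get(region, 0) / total_reports if total_reports else 0.0
        if report_share < threshold * pop_share:
            gaps.append((region, report_share, pop_share))

    # Most under-represented regions first (lowest reports-to-population ratio).
    return sorted(gaps, key=lambda g: g[1] / g[2])


# Made-up example: most messages come from the capital.
reports = [{"region": "Capital"}] * 8 + [{"region": "North"}] * 2
population = {"Capital": 100, "North": 100, "South": 200}

for region, report_share, pop_share in coverage_gaps(reports, population):
    print(f"{region}: {report_share:.0%} of reports vs {pop_share:.0%} of population")
```

A flagged region is not a finding in itself. In the spirit of the talk, it is a prompt to go and find the stories that are missing there, for example through local partners.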

Mohammed concludes by saying that crowdsourced data points should be the starting point for storytelling, not the final destination. “If you found that an individual has written about something, it’s a starting point for a new piece. That’s not the end of the story.”

Now Robert takes the stand to share some insights from his experience with NPR News Investigations, in particular his work building a database of accessible public playgrounds. The goals for the database were simple: users should be able to contribute and expand it. To this end, it should work on mobile devices, be location-aware, and include robust GIS capabilities.
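To give a rough sense of what “location-aware” means in practice, here is a minimal sketch that sorts playgrounds by great-circle distance from a user’s position using the haversine formula. This is an assumption on my part about how such a lookup could work, not a description of NPR’s implementation; the record fields (`name`, `lat`, `lon`), the function names, and the sample entries are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest(playgrounds, lat, lon, limit=3):
    """Return up to `limit` playgrounds, closest to (lat, lon) first."""
    return sorted(
        playgrounds,
        key=lambda p: haversine_km(lat, lon, p["lat"], p["lon"]),
    )[:limit]


# Hypothetical records; the coordinates are illustrative only.
playgrounds = [
    {"name": "Playground A", "lat": 39.30, "lon": -76.60},
    {"name": "Playground B", "lat": 38.90, "lon": -77.00},
    {"name": "Playground C", "lat": 39.28, "lon": -76.61},
]

for p in nearest(playgrounds, lat=39.29, lon=-76.61, limit=2):
    print(p["name"])
```

Sorting the full list is fine for a few thousand records. A larger database would more likely rely on a spatial index in the GIS layer.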

“The first thing you should do when something has a federal regulatory angle is go to regulations.gov and look for a docket file,” Robert advises. When starting this project, he first read the regulations and, more importantly, the comment files relevant to public playgrounds. Next, you should find out if someone else has ever wanted to make this database, and if they’re still working on it. Robert found an advocate who had been compiling these data points on her own, but she was not a data professional, so her collection was idiosyncratic and subjective. Nonetheless, she had a good list of playgrounds.

Robert then called all the interested parties. He contacted major accessible playground builders and asked them for a list. He went to the National Recreation and Park Association, and had them do a survey of their members. Finally, cold calls and FOIAs to major jurisdictions helped round out the rest of the database.

When building the database, you need to make conscious decisions about what you want to include and exclude. This can be based on a number of factors, including how long you want to work on it or editorial decisions about the story itself. In the case of accessible public playgrounds, Robert’s team focused on elements in the regulations, interviews with experts and advocates, and what they wanted the crowd to tell them.

Based on these decisions, the NPR News Apps team created a responsive app for adding and editing playgrounds — one of their mottos is “if it doesn’t work on mobile, it doesn’t work.” The app launched with 1,300 accessible playgrounds; since then, nearly 600 have been added by users. In addition, users have edited individual site data and helped refine locations.