Understanding media coverage: seven summer-long experiments with Media Cloud

At the Center for Civic Media, we use Media Cloud, our system that collects and analyzes stories from over 50,000 media sources, to study protest movements and understand why some disasters generate more news coverage than others. Our colleagues at Harvard have used Media Cloud to understand how internet-based activists defeated SOPA/PIPA and to examine the structure of online debates about net neutrality.

We’ve been curious what other researchers would do with our tools. Thanks to the Ford Foundation, this summer, we had a great chance to find out. Our friends at Ford sponsored a contest in which we invited teams of researchers to pitch us on research projects using Media Cloud. We hoped to receive ten applications and fund three projects – instead, we had almost fifty applications and ended up working with ten teams on their projects. Applicants and award winners included academics, activists and individuals with interesting questions about media attention where we felt Media Cloud could advance their research agendas.

On Monday, seven of the teams visited the Center for Civic Media to show off their work in progress. Their talks offer a great overview of what’s possible with the Media Cloud tools, and they illustrate a wide range of techniques and approaches to quantitative and qualitative research on media attention and issues of social change.

Many of these research projects are heading towards academic papers, while others will likely become articles in the popular press. With permission from their authors, here’s a brief sneak peek at the teams’ questions, approaches and findings.


Julia Wejchert and Katherine Ida used Media Cloud to analyze the visual reality of abortion news coverage. They downloaded thousands of stories about the abortion debate using the Media Cloud tool, then hand-coded the images that appeared in each story, discovering that news articles about abortion rarely show the people most likely to be having abortions. Instead, the visuals of these articles illustrate abortion as an issue about politics, not about patients.

Wejchert and Ida analyzed two sets of stories, one set randomly selected from the Media Cloud corpus, the other sorted to detect stories that were frequently shared on social media. (Media Cloud uses bit.ly data to determine how often stories were shared online.) Only 8% of the most shared abortion stories featured a potential abortion patient – 22% showed activists or protesters, while 24% showed politicians. When women appear in these images as potential patients, only 27% of images feature women of color, while 64% of abortion patients are women of color. The images frequently portray visibly pregnant women to illustrate these stories, though most abortions occur much earlier in the pregnancy.


Media Cloud includes sets of media sources that have been hand-coded for their political leanings, sorted into liberal, conservative, centrist and libertarian sets. Progressive media most frequently showed images of protesters – usually anti-abortion protesters – in stories about abortion. Mainstream media most often showed legislative photos, and conservative media most often showed a fetus or a live infant. This language of imagery is a conscious strategy, Ida and Wejchert report, on the part of anti-abortion activists, who want to shape a narrative about defenseless infants rather than about women’s choice. But their analysis of imagery suggests that there’s no conscious narrative on the pro-choice side countering this visual narrative.


Kate Mays and Karin Seth from the BU Emerging Media Studies Program used Media Cloud to examine the framing of dialog around same sex marriage before and after Obergefell v. Hodges was decided in the Supreme Court on June 26, 2015. They retrieved 8,000 stories from Media Cloud and 42,000 tweets using DMI-TCAT, tracking hashtags like #gaymarriage, #marriageequality and #samesexmarriage. They then hand-coded the top 600 stories (ranked by social media shares and by inlinks) to identify a dozen different ways these issues are framed in online media.

Traditionally, the equal marriage debate has been framed in terms of morality (identifying same sex marriage as immoral or sinful) and in terms of equality (gays and lesbians should have the same rights as other citizens). Mays and Seth find that two narratives ended up dominating the debate after the Supreme Court decision. Those who favored the decision saw it as a civil rights victory, while those who did not invoked the First Amendment’s protections of religious freedom to assert a right not to recognize these marriages. They also found extensive evidence that the US decision was influential in an international context, invoked in discussions in Australia and other countries making judicial and legislative decisions around equal marriage.


Daniel Preotiuc-Pietro and Jordan Carpenter at the University of Pennsylvania used Media Cloud to test Jonathan Haidt’s “moral foundations theory”, the idea that ideas of harm, fairness, authority, loyalty and purity/sanctity underlie society’s moral institutions and debates about matters of morality. Haidt suggests that liberals argue primarily from two moral bases – harm and fairness – while conservatives argue from all five. Haidt and colleagues have created a lexicon that identifies words associated with positive and negative invocations of each of these foundations – for instance, “unclean” might be a word associated with a negative invocation of purity and sanctity, while “patriot” might be associated with a positive invocation of authority and loyalty.


Carpenter and Preotiuc-Pietro took stories from 17 “controversies” – collections of Media Cloud stories on a specific controversial topic – and analyzed the words used in thousands of stories on each controversy to determine which moral foundations were invoked. They were able to find similar moral framings for related stories – stories on teen pregnancy and on Hobby Lobby’s decision not to pay employee medical costs associated with contraception invoked similar moral foundations. Other related stories – Trayvon Martin, Freddie Gray, Ferguson – did not show the same pattern of shared foundations. And while Carpenter and Preotiuc-Pietro found frequent invocation of harm, loyalty and, in one case, authority, they saw no evidence of appeals to fairness or purity in these controversies.
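To make the idea of lexicon-based analysis concrete, here is a minimal sketch in Python of how one might count foundation-related words across a set of stories. It is an illustration only, not the team's actual pipeline: the tiny word lists, the function names and the sample sentence are all made up for this example, and the real lexicon from Haidt and colleagues is much larger and distinguishes positive from negative invocations.

```python
import re
from collections import Counter

# Hypothetical toy lexicon for illustration only. This is NOT the lexicon
# created by Haidt and colleagues; the word lists are assumptions.
TOY_LEXICON = {
    "harm": {"hurt", "harm", "violence", "protect"},
    "fairness": {"fair", "equal", "justice", "rights"},
    "loyalty": {"patriot", "loyal", "betray"},
    "authority": {"obey", "law", "order"},
    "purity": {"unclean", "pure", "sacred"},
}


def foundation_counts(text, lexicon=TOY_LEXICON):
    """Count how many words in one story match each foundation's word list."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for foundation, vocab in lexicon.items():
        counts[foundation] = sum(1 for word in words if word in vocab)
    return counts


def foundation_shares(texts, lexicon=TOY_LEXICON):
    """Share of all lexicon matches in a collection that fall to each foundation."""
    total = Counter()
    for text in texts:
        total.update(foundation_counts(text, lexicon))
    matched = sum(total.values())
    return {f: (total[f] / matched if matched else 0.0) for f in lexicon}


sample = "The law must protect equal rights; violence will hurt us."
print(foundation_counts(sample))
print(foundation_shares([sample]))
```

Comparing these shares across collections is one simple way to ask whether two controversies lean on similar foundations. A plain word match like this ignores context and negation, which is part of why better lexicons matter.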

There are many possible next steps to Carpenter and Preotiuc-Pietro’s research. They’d like to improve the lexicons they’re working with, so they do a better job identifying the foundations invoked. But they also have questions about whether political and moral arguments really do invoke these five foundations, or whether questions of harm – who’s hurt by a decision – end up dominating most media discussions.


Marie Lamensch and Nikolai Pogadl from the Montreal Institute for Genocide and Human Rights Studies are deeply engaged with the task of monitoring media coverage, especially for sub-Saharan African nations. Lamensch notes that close monitoring of Rwandan media would have provided early warning of genocidal violence. Understanding US media is particularly important as it can often predict US and European response to African issues – when US media talked about Rwanda in terms of the failed US engagement in Somalia, it was an indicator that the US would not intervene in Rwanda.

Lamensch and Pogadl used Media Cloud to conduct automated monitoring of media stories about Cameroon, Central African Republic, Chad, Democratic Republic of Congo, Kenya, Niger, Nigeria, Uganda and South Sudan. They were interested in seeing how campaigns like #BringBackOurGirls (urging the Nigerian government to find girls kidnapped from Chibok, Nigeria) and #148notjustanumber (calling attention to the 148 students and teachers killed in Garissa, Kenya) influenced global media. What they found instead is that African presence in US and European media is linked heavily to football, and that coverage of the Africa Cup of Nations and the Women’s World Cup greatly outweighed coverage of political and human rights issues on the continent.

One of the most interesting findings was that international coverage of African issues is deeply influential to local debates, sometimes in damaging ways. US coverage of Ebola was deeply influential within Liberia, they found. And because US coverage tended to focus on Ebola as deadly rather than survivable, those who were exposed to US media stories became more fatalistic about Ebola and less likely to support efforts to control the spread of the disease.


Miranda Bogen from the Fletcher School at Tufts University used Media Cloud to examine the phenomenon of internet companies making “foreign policy” decisions. She examined two cases where Google’s corporate policies were examined as if the company were making diplomatic judgements: the decision to rename Google Palestinian Territories to Google Palestine, and decisions to block access to the Innocence of Muslims movie trailer in Egypt and Libya, despite no legal requirements to do so. In both cases, Bogen used Media Cloud to create detailed timelines of media coverage of these decisions, looking at how language to describe Google’s behavior changed over time.

While headlines like “Google ‘Recognizes’ Palestine” caught a good deal of public attention, analysis of the controversy shows that Google’s actions followed a UN decision to upgrade Palestine’s status to a non-member observer, triggering a change in ISO designation, which is what Google cited in making their change. While news organizations made much of the “symbolic importance” of Google’s decision, the company described the change as a technical change in international naming conventions, and the quick decay of the story suggests that media organizations took the company’s explanation at face value.

Google’s decision to take down the “Innocence of Muslims” trailer was much more complicated. Early in the discussion, news outlets referred to this clip as “causing violent riots” throughout the Middle East. Over time, that assertion dropped to a conjecture, with outlets saying that the clip “might have caused riots” or “was said to have caused riots”. Google acknowledged that they made an unusual decision to block the content, and Bogen sees evidence that media framing influenced the decision to make the block. In this case, it may be less that Google is making foreign policy than that media coverage is making Google policy.


Eric Enrique Borja and colleagues from UT Austin used Media Cloud to study media coverage of protests associated with #BlackLivesMatter in Ferguson and Baltimore. He used Media Cloud to create collections of stories about protests in Ferguson after Michael Brown’s death and again after a jury failed to indict Darren Wilson, and protests in Baltimore after the death of Freddie Gray. Noting that social movements “live and die by mainstream media public opinion”, he sorted stories into frames with positive and negative valences, offering the real CNN headline “Rioters set fire to looted drug store” as an example of negative framing.


In both waves of Ferguson protest, Borja sees comparable levels of positive and negative framing. Many stories invoke rioting and looting, but there is also discussion of activists, civil rights, uprisings, protests and demonstrations. By the second wave of Ferguson protests, the negative frame is increasing in power. In Baltimore, there’s a massive disparity between positive and negative frames: there is virtually no media coverage of the events after Freddie Gray’s death that refers to protest, and massive coverage of riots and violence. Borja points to a story of an “angry Baltimore mother” physically restraining her son from entering the protests as part of a media framing of the Baltimore events as chaotic and lawless, rather than as legitimate outrage at police abuse.

Understanding the disparities in coverage between Ferguson and Baltimore is critical, Borja argues, because he sees evidence that black politics in general is often portrayed as a disruption to peace and order. The more balanced coverage, especially of the first wave of Ferguson protests, gives room for protest and dissent as legitimate expression, while the Baltimore framing makes that discussion impossible.


Brandi Collins of Color of Change took on a project called “Deconstructing ‘Thug’”, a cultural history of a fraught and loaded term. President Obama and Baltimore Mayor Stephanie Rawlings-Blake both described demonstrators and protestors in Baltimore as “thugs” (the mayor subsequently apologized, while the President did not), leading to discussions that “thug” had emerged as a proxy for race, or as a new, socially acceptable “N-word”. Collins was interested in starting a campaign called “Ban the T-word”, when she discovered that people’s feelings about the term “thug” are complex and multi-layered – even her mother wasn’t convinced that “thug” was a word that should disappear from the vocabulary. So Collins took a deep dive into the origins of the term, and its recent rise in media discourse.

“Thug” is traceable back to the 14th century, used as a pejorative term to describe Indian worshipers of Kali. These “thugees” were described as robbing and strangling innocents, but it’s unclear that these attacks actually happened. In the mid-1800s, British colonial administrators embraced the myth of the thugee to justify mass incarceration in India. The term became popular in the US in association with labor protests, tied to union organizers in the Haymarket uprisings, and then associated with the Italian mafia and union organizers. The term took on another layer of meaning in the 1990s when Tupac Shakur’s “Thug Life” tattoo tied the term to corners of hiphop culture.

In contemporary usage, Collins sees “thug” attached to marginalized people – African Americans in the US, Muslims in the UK, as well as to organized labor. Tracking the word in the Media Cloud corpus from 2011 to 2015, she saw a spike in usage of “thug” connected to union organizers opposing Scott Walker in Wisconsin. Usage of the term appears to be increasing in US media, though slowing in mainstream, centrist media. It’s had slow, steady growth in conservative media, and Collins was surprised to find “thug” appearing widely in liberal media, often invoked to identify and fight conservative framings. At the end of the day, “thug” does appear to be code for race – 70% of the peaks in usage of “thug” coincided with stories about race and racial justice.
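For readers curious how a figure like "share of peaks that coincide with a topic" could be computed, here is a rough sketch in Python. It is an assumption-laden illustration, not Collins’s method: the peak rule (a local maximum above the mean plus one standard deviation), the function names, the weekly counts and the set of flagged weeks are all invented for the example.

```python
from statistics import mean, pstdev


def find_peaks(counts, k=1.0):
    """Return indices that are local maxima and exceed mean + k * std.

    This threshold rule is an assumption chosen for illustration.
    """
    threshold = mean(counts) + k * pstdev(counts)
    peaks = []
    for i, count in enumerate(counts):
        left = counts[i - 1] if i > 0 else float("-inf")
        right = counts[i + 1] if i < len(counts) - 1 else float("-inf")
        if count > threshold and count >= left and count >= right:
            peaks.append(i)
    return peaks


def share_of_peaks_matching(peaks, flagged):
    """Fraction of peak periods that also appear in the flagged set."""
    if not peaks:
        return 0.0
    return sum(1 for i in peaks if i in flagged) / len(peaks)


# Made-up weekly mention counts and a made-up set of weeks that a human
# coder marked as dominated by stories on some topic of interest.
weekly_mentions = [2, 3, 20, 4, 3, 18, 2, 3]
flagged_weeks = {2}

peaks = find_peaks(weekly_mentions)
print(peaks)
print(share_of_peaks_matching(peaks, flagged_weeks))
```

The hard part in practice is the flagging step, which in a project like this depends on people reading the stories behind each peak rather than on code.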


It was an incredible experience to see these teams work with our tools in such creative and disparate ways. We built Media Cloud to solve some of our own research questions, but also to be useful to anyone with questions about how ideas, frames and terms spread through media. If you’re interested in asking these sorts of questions about media, please sign up for a free account and try it out. We would love to hear what you discover.