What role can field experiments and other causal research play in efforts toward social justice in social computing? Aren’t experiments tools for reductionistic, top-down paternalism? How could causal inference ever support grassroots approaches to social justice?
This question is a central struggle in my effort to decide on a dissertation topic. The idea of participatory experimentation motivated me to work on the cornhole experiment. It was also at the back of my mind in my talk on discrimination and other social problems online at the Platform Cooperativism conference (start at the 1:00:30 mark). In this post, I outline my current thinking on this question. I would love to hear your thoughts in the comments.
“A Peaceable Kingdom with Quakers Bearing Banners 1829-30” by Edward Hicks. I love Edward Hicks’s paintings, which often show the cracks in ideas of utopia that differentiated between rights and roles for different people.
Because software systems shape human affairs, they necessarily advance or detract from the realization of human dignity. For example, information systems have enabled oppression under apartheid, welfare management systems have undermined the dignity of the poor, government systems have removed people of color’s right to vote, and advertising systems have carried out systematic discrimination, to name a few. Social justice research studies the forces detracting from human dignity and advances the wider realization of people’s experience of that common dignity.
The Role of Causal Inference In Social Justice Work
From Betty Friedan’s counts of women’s presence in 1960s magazines and studies of race and gender bias by juries to economic models of discrimination, quantitative methods have expanded our awareness and understanding of systematic patterns of social injustice. Yet researchers using these correlational methods struggle when data is limited and where the causes of injustices are unclear.
This post considers two kinds of research that ask these causal questions: field experiments and natural experiments. Both methods ask a counterfactual question of events outside the lab: what would have happened if things had been different? Causal methods ask this question by comparing measured outcomes between groups. Field experiments randomly assign subjects to comparison groups [21,29]. Natural experiments construct post hoc comparison groups from observational data.
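The logic of a field experiment described above can be sketched in a few lines of Python. This is a minimal illustration, not the method of any study cited here: the function names, the ten subject IDs, and the simulated outcomes (a baseline of 5 with 2 added for treated subjects) are all assumptions made up for the example.

```python
import random
from statistics import mean


def assign_groups(subject_ids, seed=0):
    """Randomly split subjects into treatment and control groups of near-equal size."""
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return set(shuffled[:half]), set(shuffled[half:])


def difference_in_means(outcomes, treatment, control):
    """Estimate the average effect as mean(treated outcomes) - mean(control outcomes)."""
    treated = [outcomes[s] for s in treatment]
    untreated = [outcomes[s] for s in control]
    return mean(treated) - mean(untreated)


# Hypothetical subjects and simulated outcomes, invented for illustration only.
subjects = list(range(10))
treatment, control = assign_groups(subjects, seed=42)
outcomes = {s: 5.0 + (2.0 if s in treatment else 0.0) for s in subjects}

effect = difference_in_means(outcomes, treatment, control)
print(effect)
```

Because chance alone decides who lands in each group, the control group's average stands in for the counterfactual: what the treated group would likely have looked like without the intervention. Real outcomes are noisy, so an actual analysis would also report uncertainty around the estimate.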
Field experiments are a primary method for investigating systematic discrimination when data is limited or differences in outcomes might be related to differences in selection. For example, employment discrimination can be hard to study observationally if a particular group is mostly absent from a sector suspected of discrimination. Aside from the lack of data, people in the absent group may simply be selecting other sectors. In those cases, audit studies use experimental methods to estimate differences in job and loan application outcomes for identical people who differ only by race or gender. These studies can estimate the magnitude of discrimination on average, identifying just how much discrimination affects a person’s chance to get a loan or job interview.
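As a rough sketch of how an audit study might turn application outcomes into an average estimate of discrimination, the Python below compares callback rates between two groups of otherwise identical applications and uses a permutation test to ask how often a gap that large would appear by chance. The group labels "a" and "b", the callback counts, and the function names are invented for illustration and do not come from any study cited in this post. The test treats applications as independent, which is a simplifying assumption; audits that send matched pairs to the same employer would call for a paired analysis.

```python
import random


def callback_gap(records):
    """Return the callback rate of group 'a' minus the callback rate of group 'b'."""
    rates = {}
    for group in ("a", "b"):
        results = [r["callback"] for r in records if r["group"] == group]
        rates[group] = sum(results) / len(results)
    return rates["a"] - rates["b"]


def permutation_p_value(records, n_permutations=2000, seed=0):
    """Two-sided permutation test: share of label shuffles with a gap at least as large as observed."""
    rng = random.Random(seed)
    observed = abs(callback_gap(records))
    labels = [r["group"] for r in records]
    callbacks = [r["callback"] for r in records]
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(labels)
        shuffled = [{"group": g, "callback": c} for g, c in zip(labels, callbacks)]
        if abs(callback_gap(shuffled)) >= observed:
            extreme += 1
    return extreme / n_permutations


# Invented data: 200 applications per group; 30 callbacks for 'a', 10 for 'b'.
records = (
    [{"group": "a", "callback": 1}] * 30 + [{"group": "a", "callback": 0}] * 170
    + [{"group": "b", "callback": 1}] * 10 + [{"group": "b", "callback": 0}] * 190
)

gap = callback_gap(records)
p_value = permutation_p_value(records)
print(f"callback gap: {gap:.2f}, permutation p-value: {p_value:.4f}")
```

The gap is the average difference in the chance of a callback, which is the kind of magnitude estimate audit studies report. It says nothing on its own about why the gap exists.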
In HCI, field experiments ask traditional questions of discrimination alongside emerging issues unique to social computing. Audit studies have documented discrimination against dark-skinned users on classified ad platforms and on Airbnb [12, 14]. Audit studies of HR systems have shown systematic discrimination against certain kinds of surnames.
Among questions unique to social computing, field experiments on videogames have shown that in some gaming communities, women receive more negative and hateful comments than men. Other studies use experiments to investigate discrimination by machine learning systems. Natural experiments are also used to detect social injustices online, including one work-in-progress study estimating the effect of mass surveillance on civic participation online.
While audit studies help identify social injustices, they rarely reveal the underlying explanations. The work of advancing social justice requires theories of the causes of social problems so that the causes can be addressed. Social psychologists have long used lab experiments to test theories on the mechanisms behind behaviors like gender discrimination, methods that social computing researchers have adopted for field experimentation online. Natural experiments can also test theories on the causes of social injustice; one work-in-progress study uses natural language processing in an observational study of DonorsChoose to estimate the role of gender stereotypes in the effects of platform design on gender discrimination.
Evaluating Social Justice Interventions
As a stance with normative goals, social justice research is fundamentally concerned with the outcomes of interventions to advance human dignity and justice. For example, social justice researchers do not stop at defining and understanding a problem like prejudice; they also evaluate the practical outcomes of prejudice reduction efforts.
Research on the social justice effects of technology interventions includes research on the effects of police-worn body cameras on police violence [2,33], the effect of digital, student-designed anti-conflict campaigns on school conflict, and the effect of peer pressure on government compliance with citizen appeals. My own in-progress research is testing the effect of a self-tracking system on gender discrimination on social media.
Social justice interventions can also backfire or have side effects. For example, in some cases, efforts to defend a community from antisocial behavior can make that behavior worse. In another case, research has linked the labor of reviewing and responding to violent materials online with secondary trauma [17,39]. Future causal research could help estimate the human cost of interventions, supporting decision-makers to limit or avoid these new problems introduced by social justice efforts.
Risks from Causal Inference Methods
Researchers who focus on social justice often have well-theorized skepticism towards causal methods, grounded in the limitations of quantitative research and the inequalities of power that often come with experimental research.
The difficulty of measuring meaningful outcomes is a fundamental weakness of all quantitative social justice research. For example, studies of discrimination often rely on classifications of race and gender that are theoretically weak, incorporating structural injustices into the research and making those injustices invisible to researchers [7,10,35]. More broadly, social computing and HCI research have a deficit of reliable dependent variables on issues of public interest compared to measures of productivity.
Participation and Deliberation
Quantitative researchers have a history of paternalism that ignores people’s voices in favor of surveilling their behavior and forcing interventions into their lives. For that reason, deliberative democracy theorists identify experimentation as a major risk to citizen participation and agency. Rhetoric from experimental results can override citizen deliberation, and paternalistic policies could nudge away citizen agency. In contrast with these paternalistic approaches, social justice HCI research has emphasized participatory methods that include participants as co-creators of the goals, design, and evaluation of social justice efforts.
Field experiments are also at the heart of an ongoing debate over the ethics of HCI research. In particular, research on issues of inequality and injustice often involves vulnerable populations and offers differential risks and benefits to participants. While some scholars advocate for an obligation to experiment in cases of public interest, the work of maintaining and studying large online platforms raises ethics and accountability challenges for experimenters.
Participatory Field Experiments
I believe that some risks of causal inference in social justice work can be addressed by incorporating lessons from participatory and emancipatory action research in the design of experiments.
First, quantitative research on social justice can employ qualitative research and participatory design. Methods of “experimental ethnography” structure qualitative research through an experimental design. My own research on social movements at Microsoft this summer explored “participatory hypothesis testing,” mixed-methods, participatory approaches to sampling, modeling, and interpreting quantitative research on social platforms.
Second, participatory methods can be used within the interventions that experimenters evaluate. In one field experiment, students were offered training and support to develop their own digital media campaigns; this intervention reduced conflict reports in schools while also testing theories related to the position of students within their social networks.
Third, marginalized groups are already using socio-technical systems to develop situated knowledge on issues of labor rights, street harassment, and online governance. A social justice approach to causal research might also expand access to experimental capacities, supporting marginalized groups to develop their own situated knowledge on causal questions.
Finally, experimental results need not override marginalized voices in policy debates. They just offer one more piece of evidence to deliberation. As I show participants in the cornhole experiment, even the cleanest experiment leaves plenty of questions for debate. In fact, research on discrimination in meetings may even expand participation and fairness in deliberation [26,27]. Other experiments could even test ways to expand citizen power in the face of paternalism. In one work-in-progress study, researchers conducted bottom-up experiments to optimize government compliance with citizen requests.
In this post, I have outlined ways that causal research is used for monitoring social injustices, understanding the causes of those injustices, and evaluating interventions to expand the realization of human dignity. While causal methods do introduce risks of reductionist paternalism, I have tried to sketch out possible directions for “participatory field experiments.” By experimenting from the standpoint of citizens, we may be able to work through some of those risks.
What are your thoughts? I would love to hear your reactions in the comments.
- 1. Joshua Angrist and Jorn-Steffen Pischke. 2010. The Credibility Revolution in Empirical Economics: How Better Research Design is Taking the Con out of Econometrics. Working Paper 15794. National Bureau of Economic Research. http://www.nber.org/papers/w15794
- 2. Barak Ariel, William A. Farrar, and Alex Sutherland. 2014. The Effect of Police Body-Worn Cameras on Use of Force and Citizens’ Complaints Against the Police: A Randomized Controlled Trial. J Quant Criminol 31, 3 (Nov. 2014), 509-535. DOI: http://dx.doi.org/10.1007/s10940-014-9236-3
- 3. David C. Baldus, Charles Pulaski, and George Woodworth. 1983. Comparative review of death sentences: An empirical study of the Georgia experience. Journal of Criminal Law and Criminology (1983), 661-753. http://www.jstor.org/stable/1143133
- 4. Francois Bar, Melissa Brough, Sasha Costanza-Chock, Carmen Gonzalez, C. J. Wallis, and Amanda Garces. 2009. Mobile voices: A mobile, open source, popular communication platform for first-generation immigrants in Los Angeles. In Pre-conference workshop at the (ICA) Conference Chicago, Illinois. http://lirneasia.net/wp-content/uploads/2009/05/final-paper_bar_et_al.pdf
- 5. Shaowen Bardzell. 2010. Feminist HCI: taking stock and outlining an agenda for design. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 1301-1310. http://dl.acm.org/citation.cfm?id=1753521
- 6. Marianne Bertrand and Sendhil Mullainathan. 2003. Are Emily and Greg more employable than Lakisha and Jamal? A field experiment on labor market discrimination. Technical Report. National Bureau of Economic Research. http://www.nber.org/papers/w9873
- 7. Geoffrey C. Bowker and Susan Leigh Star. 1999. The case of race classification and reclassification under apartheid. Sorting things out: Classification and its consequences (1999), 195-225.
- 8. danah boyd. 2016. Untangling research and practice: What Facebook’s “emotional contagion” study teaches us. Research Ethics 12, 1 (2016), 4-13. DOI: http://dx.doi.org/10.1177/1747016115583379
- 9. Justin Cheng, Cristian Danescu-Niculescu-Mizil, and Jure Leskovec. 2015. Antisocial Behavior in Online Discussion Communities. arXiv preprint arXiv:1504.00680 (2015). http://arxiv.org/abs/1504.00680
- 10. Sophie Chou. 2015. Race and the Machine. (Sept. 2015). http://sophiechou.com/papers/chou_racepaper.pdf
- 11. Jill P. Dimond. 2012. Feminist HCI for real: Designing technology in support of a social movement. Ph.D. Dissertation. Georgia Institute of Technology. http://jilldimond.com/wp-content/uploads/2012/08/dimond-dissertation.pdf
- 12. Jennifer L. Doleac and Luke CD Stein. 2013. The visible hand: Race and online market outcomes. The Economic Journal 123, 572 (2013), F469-F492. http://onlinelibrary.wiley.com/doi/10.1111/ecoj.12082/full
- 13. Paul Dourish. 2010. HCI and environmental sustainability: the politics of design and the design of politics. In Proceedings of the 8th ACM Conference on Designing Interactive Systems. ACM, 1-10. http://dl.acm.org/citation.cfm?id=1858173
- 14. Benjamin G. Edelman, Michael Luca, and Dan Svirsky. 2016. Racial Discrimination in the Sharing Economy: Evidence from a Field Experiment. SSRN Scholarly Paper ID 2701902. Social Science Research Network, Rochester, NY. http://papers.ssrn.com/abstract=2701902
- 15. Virginia Eubanks. 2011. Digital dead end: Fighting for social justice in the information age. MIT Press.
- 16. Hanming Fang and Andrea Moro. 2010. Theories of statistical discrimination and affirmative action: A survey. Technical Report. National Bureau of Economic Research. http://www.nber.org/papers/w15860.pdf
- 17. Anthony Feinstein, Blair Audet, and Elizabeth Waknine. 2014. Witnessing images of extreme violence: a psychological study of journalists in the newsroom. JRSM open 5, 8 (2014), 2054270414533323. http://shr.sagepub.com/content/5/8/2054270414533323.short
- 18. Betty Friedan. 2010. The feminine mystique. WW Norton & Company.
- 19. R. Stuart Geiger. 2014. Successor Systems: The Role of Reflexive Algorithms in Enacting Ideological Critique. Selected Papers of Internet Research 4 (2014). http://spir.aoir.org/index.php/spir/article/view/942
- 20. Andrew Gelman. 2011. Causality and statistical learning. Amer. J. Sociology 117, 3 (2011), 955-966. http://www.jstor.org/stable/10.1086/662659
- 21. Alan S. Gerber and Donald P. Green. 2012. Field experiments: Design, analysis, and interpretation. WW Norton.
- 22. James Grimmelmann. 2015. The Law and Ethics of Social Media Experiments. (2015). http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2604168
- 23. Donna Haraway. 1988. Situated knowledges: The science question in feminism and the privilege of partial perspective. Feminist studies (1988), 575-599. http://www.jstor.org/stable/3178066
- 24. Eszter Hargittai. 2008. The digital reproduction of inequality. Social stratification (2008), 936-944. http://webuse.org/p/c11/
- 25. Gillian R. Hayes. 2011. The relationship of action research to human-computer interaction. ACM Transactions on Computer-Human Interaction (TOCHI) 18, 3 (2011), 15. http://dl.acm.org/citation.cfm?id=1993065
- 26. Peter John, Graham Smith, and Gerry Stoker. 2009. Nudge nudge, think think: two strategies for changing civic behaviour. The Political Quarterly 80, 3 (2009), 361-370. http://onlinelibrary.wiley.com/doi/10.1111/j.1467-923X.2009.02001.x/full
- 27. Christopher F. Karpowitz and Tali Mendelberg. 2014. The silent sex: Gender, deliberation, and institutions. Princeton University Press.
- 28. Brian C. Keegan and J. Nathan Matias. 2015. Actually, It’s About Ethics in Computational Social Science: A Multi-party Risk-Benefit Framework for Online Community Research. arXiv:1511.06578 [cs] (Nov. 2015). http://arxiv.org/abs/1511.06578
- 29. Ron Kohavi, Roger Longbotham, Dan Sommerfield, and Randal M. Henne. 2009. Controlled experiments on the web: survey and practical guide. Data mining and knowledge discovery 18, 1 (2009), 140-181. http://link.springer.com/article/10.1007/s10618-008-0114-1
- 30. Kenneth L. Kraemer, Jason Dedrick, and Prakul Sharma. 2009. One Laptop Per Child: Vision vs. Reality. Commun. ACM 52, 6 (June 2009), 66-73. DOI: http://dx.doi.org/10.1145/1516046.1516063
- 31. Robert E. Kraut, Paul Resnick, Sara Kiesler, Moira Burke, Yan Chen, Niki Kittur, Joseph Konstan, Yuqing Ren, and John Riedl. 2012. Building successful online communities: Evidence-based social design. MIT Press.
- 32. Jeffrey H. Kuznekoff and Lindsey M. Rose. 2013. Communication in multiplayer gaming: Examining player responses to gender cues. New Media & Society 15, 4 (June 2013), 541-556. DOI: http://dx.doi.org/10.1177/1461444812458271
- 33. Alexandra Claudia Mateescu, Alex Rosenblat, and others. 2015. Police Body-Worn Cameras. (2015). http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2569481
- 34. J. Nathan Matias. 2013. Can Gender Metrics Make the News More Diverse? (Nov. 2013). https://www.youtube.com/watch?v=Rt9JLLMcjmE
- 35. J. Nathan Matias. 2014. How to Identify Gender in Datasets at Large Scales, Ethically and Responsibly. (Oct. 2014). https://civic.mit.edu/blog/natematias/best-practices-for-ethical-gender-research-at-very-large-scales
- 36. J. Nathan Matias. 2015. 30 Years of Research on Gender, Equality and Diversity at the MIT Media Lab. Medium (Nov. 2015).
- 37. J. Nathan Matias. 2016. Going Dark: Social Factors in Collective Action Against Platform Operators in the Reddit Blackout. (forthcoming) In Proceedings of the 2016 SIGCHI Conference on Human Factors in Computing Systems. ACM.
- 38. J. Nathan Matias and Stuart Geiger. 2014. Defining, Designing, and Evaluating Civic Values in Human Computation and Collective Action Systems. In HCOMP 2014. http://www.aaai.org/ocs/index.php/HCOMP/HCOMP14/paper/view/9268
- 39. J. Nathan Matias, Amy Johnson, Whitney Erin Boesel, Brian Keegan, Jaclyn Friedman, and Charlie DeTar. 2015. Reporting, Reviewing, and Responding to Harassment on Twitter. arXiv:1505.03359 [cs] (May 2015). http://arxiv.org/abs/1505.03359
- 40. Michelle N. Meyer. 2015. Two Cheers for Corporate Experimentation: The A/B Illusion and the Virtues of Data-Driven Innovation. (2015). http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2605132
- 41. Martha C. Nussbaum. 1998. Sex and Social Justice. Oxford University Press.
- 42. Devah Pager. 2007. The use of field experiments for studies of employment discrimination: Contributions, critiques, and directions for the future. The Annals of the American Academy of Political and Social Science 609, 1 (2007), 104-133. http://ann.sagepub.com/content/609/1/104.short
- 43. Greg Palast. 2014. Jim Crow Returns: Millions of minority voters threatened by electoral purge. Al Jazeera (Nov. 2014). http://projects.aljazeera.com/2014/double-voters/
- 44. Elizabeth Levy Paluck. 2010. The promising integration of field experimentation and qualitative methods. The ANNALS of the American Academy of Political and Social Science 628 (2010), 59-71.
- 45. Elizabeth Levy Paluck and Donald P. Green. 2009. Prejudice reduction: What works? A review and assessment of research and practice. Annual review of psychology 60 (2009), 339-367. http://www.annualreviews.org/doi/abs/10.1146/annurev.psych.60.110707.163607
- 46. Elizabeth Levy Paluck, Hana Shepherd, and Peter M. Aronow. 2016. Changing climates of conflict: A social network experiment in 56 schools. Proceedings of the National Academy of Sciences (2016), 201514483. http://www.pnas.org/content/early/2016/01/02/1514483113.short
- 47. Jonathan W. Penney. 2015. Chilling Effects: Online Surveillance and Wikipedia Use. Technical Report. http://www.icore-huji.com/sites/default/files/JPenney
- 48. Jason Radford. 2014. Architectures of Virtual Decision-Making: The Emergence of Gender Discrimination on a Crowdfunding Website. arXiv preprint arXiv:1406.7550 (2014). http://arxiv.org/abs/1406.7550
- 49. Christian Sandvig, Kevin Hamilton, Karrie Karahalios, and Cedric Langbort. 2014. Auditing algorithms: Research methods for detecting discrimination on internet platforms. Data and Discrimination: Converting Critical Concerns into Productive Inquiry (2014). http://www-personal.umich.edu/ csandvig/research/Auditing
- 50. Latanya Sweeney. 2013. Discrimination in online ad delivery. ACM Queue 11, 3 (2013), 10. http://dl.acm.org/citation.cfm?id=2460278
- 51. Richard H. Thaler and Cass R. Sunstein. 2003. Libertarian paternalism. American Economic Review (2003), 175-179. http://www.jstor.org/stable/3132220
- 52. Hanna Wallach. 2014. Learning in the Sunshine: Analysis of Local Government Email Corpora. (Sept. 2014). http://sites.duke.edu/researchcomputing/2014/09/16/learning-in-the-sunshine-analysis-of-local-government-email-corpora-hanna-wallach-umass/