Data for Black Lives: Automating (In)Justice | MIT Center for Civic Media


Automating (In)Justice: Policing and Sentencing in the Algorithmic Age

Data for Black Lives (D4BL) is "a group of activists, organizers, and mathematicians committed to the mission of using data science to create concrete and measurable change in the lives of Black people." This is a liveblog from the Automating (In)Justice panel at the D4BL 2017 Inaugural Conference. Liveblogging contributed by Rahul Bhargava – apologies for any errors or omissions.

Adam Foss starts by talking about how criminal justice reform has been a hot-button issue. In Boston we incarcerated a generation of black men, and now we are feeling the impact of that "smart on crime" approach. Right now, all along the continuum, people are trying to use data to solve this historical problem of mass incarceration. There's good to that, and bad to that.


  • Adam Foss
  • Charmaine Arthur
  • Samuel Sinyangwe
  • Kim Foxx
  • Julia Angwin

Charmaine Arthur

Arthur is the Director of Community Programs at Freedom House (in Roxbury, Boston). Its founders were at the forefront of the Boston busing crisis. They started a school for children of color to fight for equitable education. They work with high school and college students to create success and opportunities through coaching, college-level opportunities, and other community work and civic engagement.

Data helps them in a number of ways. It helps them do their work better.  It gives them context. It helps them identify who they serve. They measure things like race, sex, grade, graduation, attendance, family base, economics, and more. They use SalesForce for a lot of this. Data allows for some accountability.

This data is a shell.  Until they meet a person they don’t see the life. And they let the students use their own data and be advocates.

Data can also give a false sense of progress and hope. It takes time to work against this. Freedom House survives through funding from foundations, and often the foundations dictate how to do the work. The corporatization of non-profits is happening – they're using the same language as Wall Street. How do you feel about the "return on my investment" in this work? Absolutely not. We don't talk that way about our young people.

Samuel Sinyangwe

Sinyangwe’s work began with the death of Michael Brown in 2014.  Just afterwards communities that had been experiencing police violence were able to say that. Others attempted to shut this down by saying they didn’t have the data, as if your lived experience needed a study to justify it.

They built the most comprehensive database of people killed by police in the US. They showed that police killed 323 black people the year Michael Brown was killed.  Then they began to use data as a tool for accountability.

Then they could have a conversation about why the numbers were the way they were. Why are 1 in 6 homicides in Oklahoma City committed by police? 1 in 3 people killed by strangers in the US are killed by police officers. Over 1,200 people a year for the last five years. How do we make this apparent and accessible to people? Visualization has been critical to help people understand what is going on, and move to some kind of action.

They have national data, and also deeper data about the top 100 departments in the US (through public records requests and other means). In Orlando, FL they met with police leadership, and the data showed that Orlando is second highest for people killed by police. When they presented all this, the leadership explained it away: Orlando is a heavy tourist destination, and there are lots of folks on the Orange Ave corridor; clearly Orlando is unique and can't be compared. So Sinyangwe pulled the New Orleans Bourbon St. data, which shut down that conversation.

The people in this room can download the dataset and use it.

Kim Foxx

Foxx is the state's attorney of Cook County, Chicago. Her office releases this data to the public in a very accessible way. There is a sense around mass incarceration that things are "anecdotal." 86% of the people in Cook County jail are black or brown. 94% of people in the juvenile system are black and brown. In the prosecutor's office we don't know how this happens, because the systems are black boxes.

For Foxx it was important to have the public know what she was doing, and how she makes decisions. How do you measure if you are better than your predecessor? She ran on the issue of people in jail being stuck there only because they are not able to afford their bail. Sharing information gives them a benchmark.  Foxx insists that “you can’t fix what you can’t measure.” 

Sharing budget, agenda, and more lets the public know. People can run the datasets themselves. They've hired a Chief Data Officer for the prosecutor's office and released the last 6 years of data, precisely because they wanted it to be continuing and accountable.

In 2016 their second highest felony offense (after gun possession) was retail theft (shoplifting). They didn't know that until they dove into the data. Illinois' felony threshold for retail theft is $300. Indiana's is $750. Wisconsin's is $2,500. When you think about the impact of a felony conviction, you can ask a question about what we are doing. Her office decided not to charge retail theft under $1,000 as a felony (they can do that at their discretion). The data helped them see that, and led to that decision. Next year they'll be able to look at the impact of that on prison and jail populations, and more.

Chicago has an issue of violence, and Foxx has limited resources. If we are about public safety, we must look at it on a continuum. Violence is connected to education, arrests in schools, and more. The highest incidence of violence is in places with under-resourced schools, the places where people returning from convictions live, and more. You can't arrest your way out of violence. The justice system can't just be reactive. We can't put the wrong people in jail.

Julia Angwin

Julia was destined for computer science, but took a turn towards journalism.  She covered technology for 15 years. She started writing about criminal justice because she was writing about the data being collected and was wondering about how it was being used.

The highest-stakes algorithm judging people is the software used across the country to create "risk assessment scores". At ProPublica she wrote about this. It is used at pre-trial, parole, and sentencing. San Francisco, most of New York, and lots of other places use it. As someone math- and data-literate, she looked for studies to justify it. No one was doing these studies. In fact, Eric Holder asked the sentencing commission to have these studied. The only studies were by the companies that created it. New York State purchased this in 2001 and released numbers in 2012, but they didn't look at race. She did a FOIA request in Florida to get data, and succeeded in getting two years' worth of scores.

Angwin looked at the scores and found that black defendants got longer sentences across the board. For white people, almost no one was getting longer sentences. These algorithms are totally biased. The rigorous statistical analysis, after 6 months of work, backed this up.

The computer science community has validated all this work, but the criminal justice community has totally rejected it. It becomes a debate about the definition of fairness. Bringing numbers to the table helped this debate happen.

“You gotta bring numbers to the fight”



Adam shares that in Chicago the shootings aren’t that outside the average, but you just can’t get to a hospital in 45 minutes or less so the homicide numbers are worse. In Boston, they say homicide is way down, while shooting rate is through the roof (because you are at a trauma hospital in 4 minutes).

Samuel shares that in the absence of data you just have assumptions. When you talk about addressing police violence you run into an old script. It says that anything that restricts how police use force endangers police or the community. There is no data to support any of those claims. These are assumptions that are taken as fact. They couldn't be challenged well because the data wasn't there.

They've tested this with the data we have now, and find those claims are lies. They looked at use of force policies and how restrictive they were. They tested whether there was an increased risk in the departments that are most restrictive. In fact, those departments were the safest for civilians and police officers. You share that finding in a room with the police union and they have nothing to respond with. This can move those conversations forward.

Foxx asks: what makes you a good prosecutor? How do you measure the outcomes of what you do? If you say that you want to keep communities safe, give someone a harsh punishment, and then see the person over and over, are you successful? Is this harsh sentencing aiding public safety? We have to look at the aggregate impacts on community, otherwise we'll continue to do the same thing.

We haven't defined what "tough on crime" or "smart on crime" means. If you don't have to own that "tough on crime" means lots of people in prison and decimated neighborhoods, then the data doesn't matter. The narrative of "personal responsibility" has dominated prosecutorial offices for years. This narrative lets you not care about the impact, and absolves you from the conversation. We cannot afford to do that. In what other place can you invest $500 million on crime and have a 55% recidivism rate?

Adam shares that Foxx was elected on a wave of anti-incumbent prosecutorial elections. Next year there are 1000 DA elections across the country. This is an opportunity. Foxx is a leading example of what can happen when we change.

Foxx shares that 80% of elected prosecutors are white men. Less than 1% are women of color. This is important, because we need people in these positions to push back on this. She is from public housing, a single mother-family, all the risk factors that make her high risk from an algorithmic sentencing point of view. These un-connected people don’t know the impact of the policies, and that’s a problem.

Arthur shares that there are lots of egos at the table.  Yes you have to bring numbers, but what happens when you are worn out fighting with the numbers, because those numbers are lives. Understanding “why” matters.  There has to be action with the communication. It takes time, and we have to keep chipping away. But the funders say here, have 3 years to fix it. We just can’t do it. Quality programs are proactive and find youth before they fall off the wall.

Adam asks the panelists – what do you need to do your work better? How is data going to help us?

Arthur shares the story of her kids, who have had different experiences of racism – from shootings to support failures and more. The single story told about 18-year-old black males is dangerous for individuals. The information Sam has is information Freedom House can use. They can give youth the tools to advocate for themselves. We need to advocate for our own.

Sinyangwe argues that the field of stopping police violence is new. The data is out there for you. The policy information is out there. Help produce knowledge that communities can use for change. Look at civilian review boards – there is no data to tell you which structure is the most useful. Make this stuff accessible.

Foxx wants to amplify this. We don’t validate why things are happening; we don’t understand them. We have to be cognizant of the nuances in spaces, otherwise we’ll just adopt things because other folks have. They need people in the data/analytical space to come to the criminal justice system. Advocacy from outside is good, but we need help inside it too to figure out what questions to ask. Foxx wants people to work with prosecutors to help.

Angwin has a team of two programmers that she works with. Every industry needs more tech literacy. The most shocking thing about the criminal justice scores was the amount of forgiveness applied to white defendants. Her analysis of car insurance rates was the same chart – with higher risk, the rates declined in white neighborhoods. They use the word "bias", but the algorithms have allocated "forgiveness". This is an important re-framing. Can we build in forgiveness for more than one group of people?



How do we help advocates use data better?

Angwin shares that people over-collect data before they have a question. You need a targeted, smart question before you start collecting data. Otherwise your data is putting people at risk. You have to think about when your data is lost, because it will happen.

Surveillance is a real risk, reminds Sinyangwe. You have to take steps to protect yourself. The framing of your statistics matters, especially for folks who aren't data literate; they're the ones who need to take these numbers and use them.

What do you say to black communities that don’t feel safe and want more policing and surveillance?

There is not a magic answer to that, says Arthur. We used to depend on our neighbors. We’ve lost trust. We used to have a shared understanding of what the village looked like. Community policing works for some people, but it doesn’t work for everyone. It can build trust.

Those numbers are real people that live and breathe. We need to really remember that. We need the police. Arthur was going to be a cop, but her mama said that her calling was to work with young people.

Foxx hears this question a lot. She goes into neighborhoods and talks with people in forums. The ACLU folks were talking about stop and frisk. A woman stood up and shared that she was scared to go to the bus stop. She didn’t want an open-air drug market next to the bus stop. Another woman asked about getting rid of the unlicensed snow cone seller. Foxx didn’t understand, because she didn’t live there, that the problem was around the loitering around the snow cone person and the drug sales and more that happened there. People have a deep fear of the police, and a deep fear of the person causing harm.

People want policing that isn't dangerous to them. This narrative can't be lost. Law enforcement has to contend with bad tactics and bad policies in the communities that need to trust the law the most (because they are suffering the most).

Is the decision whether to keep this data a problem of resources, or a deliberate effort to not collect it?

Sinyangwe says it is a combination. The political system responds to crises. The Department of Justice only opened investigations into Ferguson, Baltimore, and Chicago when something big happened. Patrick Sharkey found in a recent study that the crime decline of the past few decades was driven by non-profit organizations. For every 10 NGOs working on this, there was a 6% drop in violent crime and a 10% drop in homicide. The only place resourced to respond when you need safety help is the police department. That was a choice; they defunded other alternatives. Other studies show that mass incarceration had zero percent impact on the decline in crime; but that is where all the money goes. Same result for spending on police – very little impact on crime (0% to 5%). We have to shift to community-based responses. Those are the evidence-based responses to this problem.

Anfwin attributes this to benign malice (if that exists). Journalism is the last watchdog – people respond. And journalism is in crisis.  All the money is going from them to Google and Facebook. Journalism needs our support to bring attention to this.