data | MIT Center for Civic Media

UN World Data Forum Summary

I'm just back from the first UN World Data Forum in Cape Town, South Africa. I presented there on my creative, hands-on approach to building empowerment through data literacy.

I liveblogged a number of sessions while I was there. Read some of these to get a sense of how non-profits, official statisticians, and journalists are thinking about data. There was a particular focus on supporting the Sustainable Development Goals, but many of the comments and case studies shared have implications for anyone working with data-driven decision making.

How to Identify Gender in Datasets at Large Scales, Ethically and Responsibly

A practical guide to methods and ethics of gender identification

For the past three years, I've been using methods to identify gender in large datasets to support research, design, and data journalism, supported by the Knight Foundation, with an amazing group of collaborators. In my Master's thesis, I used these techniques to support the inclusion of women in citizen journalism, the news, and collective action online. Last February, I was invited to give a talk about my work at the MIT Symposium on Gender and Technology, hosted by the MIT Program in Women's and Gender Studies. I have finally written the first part of the talk, a practical guide to methods and ethics of gender identification approaches.

Data Therapy Workshop Followup

As part of our Data Therapy project, we just ran two workshops to help build capacity within small community organizations to understand and creatively present their data. We were able to accommodate about 25 people in our training space, and hope to help more of those who were on the waitlist in the future. These workshops were run in collaboration with the Regional Center for Healthy Communities, whose training needs assessment spurred these workshops and where I've run them for the last few years. This blog post includes some follow-up information for attendees of those workshops, but may be interesting for other people too!

For those of you who did attend, here are some references for the tools and ideas that we talked about.

We're totally PDF'd: Open state-level datasets still fail to inspire

(Edited to add Max Ogden's recommendation of ScraperWiki to help deal with PDF datasets.)

FACEPALM

Courtesy of a recommendation by John Wonderlich at the Sunlight Foundation, here's a faceted browser/catalog of state- and other-level datasets to explore: http://datos.fundacionctic.org/sandbox/catalog/faceted/

And while the tool itself is indispensable, it highlights the bane of our data-loving existence: tons of state-level data have been posted to data.gov-style sites only as PDFs.

Want to know how the Alabama liquor control board has been spending its money? You'll have to read through forty-five 20+ page PDF'd spreadsheets: http://open.alabama.gov/frmsReport/ReportList.aspx?AppID=GFS&AgencyID=00...

UN Global Pulse Camp 1.0


(Photo credit: Christopher Fabian of UNICEF & Global Pulse)

Just got back from the UN "Pulse Camp 1.0".

Global Pulse is a new and quite ambitious UN initiative "to improve evidence-based decision-making and close the information gap between the onset of a global crisis and the availability of actionable information to protect the vulnerable" (Full overview at http://www.unglobalpulse.org/about).

At PBS IdeaLab: "Sourcemap Makes Data Visualizations Transparent"

The latest C4FCM post from the Idea Lab blog:

While pitched as a way to create and visualize "open supply chains," Sourcemap's real virtue is that the data itself is fully sourced. Like the links at the bottom of a Wikipedia article and the accompanying edit history, you know exactly who added the data and where that data came from. You can take that data and make counter-visualizations if you feel the data isn't correctly represented. Sourcemap's very structure acknowledges that visualization is an editorial process and gives others a chance to work with the original data. For example, here's a Sourcemap for an Ikea bed:

Read the rest at PBS MediaShift Idea Lab: "Sourcemap Makes Data Visualizations Transparent"