On Thursday, December 5th, Seamus Kraft from the OpenGov Foundation gave a lunch talk at the MIT Center for Civic Media. This is a liveblog of the talk authored by Alexis Hope, Heather Craig, & @schock.

Lorrie introduces the talk: Seamus studied classical political theory, went into government, discovered that government is complicated and should be open, and then founded OpenGov Foundation. His talk today will focus on how data and people are coming together.

Seamus begins with a quote from this Guardian article about MIT’s history as a center of innovation: “The Massachusetts Institute of Technology has led the world into the future for 150 years with scientific innovations. Its brainwaves keep the US a superpower. But what makes the university such a fertile ground for brilliant ideas?”

The problems we face as citizens, journalists, and researchers in accessing data are equal to and opposite from the problems faced by people in government who have to disseminate this information.

He describes the OpenGov Foundation as a “scrappy, non-partisan non-profit.”

You can find all of their code at github.com/opengovfoundation.

It began when Kraft was working for Congressman Darrell Issa on the Oversight Committee to develop a tool called Madison for crowdsourced legislation. This became the genesis of the OpenGov Foundation.

One of the challenges they want to address: more than 75% of young Americans are disengaged from politics.

Seamus says that an approach that brings together left and right, and science and civics, is needed to solve problems. He shows the example of Aaron Swartz, and mentions that the problems we face are real and impact people’s lives.

In the eighteenth century, the best form of access people had to information and legal code was horses and parchment. Or, they could find their representative and talk to them in person.

The story that he will talk about today is getting information that is written down (he holds up a little book with the Constitution) and putting it in a format that is accessible to people (he holds up a mobile phone).

If you don’t have access to government, you can’t be a citizen. You can’t participate if you don’t have access to the words and people who form government.

Our founders would be horrified if they compared the possibilities of technology, for example the “whiz bang” stuff right here in this building at MIT, with the reality of most people’s lack of access to the government, law, and data.

Seamus asks us “Raise your hand if you like Paper, PDFs, and copyright restrictions.” Those are the problems we are facing with the law.

He says that the law is the most important dataset in any community. He asks people in the audience if they work with civic data.

“The Cambridge budget is a big PDF file, for example. The spreadsheets everyone wants are inaccessible unless we file a request.”

Tech, Process, People – he’ll talk today about tech and process, because people are too hard to change: “unfortunately, we can’t fork the human soul.”

Seamus asks the community what we would do if we wanted to find out what the leash laws are in our area if we were a citizen. “You’d probably google it right? If it doesn’t show up on the first page, it’s not there.” If you’re a developer, you won’t have access to APIs or bulk downloads. If you are a lawyer or legal support person, you won’t be able to send your clients a link to information that is available because it is not there.

As a state lawmaker, your first step is to learn what lawmakers in other states are doing, but it’s hard to find that information.

If you’re an academic researcher, you can’t find what you need online. Same thing with journalists. Same thing with business owners and entrepreneurs — they are hit with far more law than most of us are.

All the problems we have been discussing involve discoverability problems, jargon/expertise, cost and time and money, design problems, and data formats — it seems daunting. But, all the skills necessary to solve these problems exist in people within the community.

How is this stuff actually produced by people in state and municipal governments?

(Shows the text of a gambling ordinance)

But the law is actually not this, it is just a snapshot of what the law happens to be right now. Every sentence is a mosaic – the sum of all the changes that have been made since the law was first written. A law is a total of many sub revisions. It is the source code.
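The "source code" analogy can be made concrete with a small sketch. The following is only an illustration of the idea that today's text is the original plus every amendment replayed in order; the `Amendment` class, the `current_text` function, and the ordinance wording are all invented for this example and do not come from the talk or from any real legal code.

```python
from dataclasses import dataclass


@dataclass
class Amendment:
    """One hypothetical change to a section of law: replace `old` with `new`."""
    year: int
    old: str
    new: str


def current_text(original: str, amendments: list[Amendment]) -> str:
    """Replay amendments in chronological order to get today's snapshot."""
    text = original
    for amendment in sorted(amendments, key=lambda a: a.year):
        if amendment.old not in text:
            raise ValueError(f"{amendment.year}: text to amend not found")
        # Apply each change once, the way a single revision edits a file.
        text = text.replace(amendment.old, amendment.new, 1)
    return text


# Invented example text, not taken from any real ordinance.
ORIGINAL = "No person shall operate a gaming table within city limits."
AMENDMENTS = [
    Amendment(1994, "within city limits", "within 500 feet of a school"),
    Amendment(1972, "a gaming table", "a gaming table or slot machine"),
]

snapshot = current_text(ORIGINAL, AMENDMENTS)
print(snapshot)
```

The snapshot is what a reader sees on the page; the list of amendments is the history that a PDF of the current code throws away.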

How is it made? It starts with a problem. Let’s talk about Gambling, for example.

Example problem: gambling

First challenge: where is the law?

Shows a search page from GeneralLaw to identify all the places where gambling is referred to in Massachusetts state law. Seamus tells us that the people making the laws have to use the same crappy tools that we might use, and they need a lot of help.

After the legislators discuss, it goes back to the people who are writing the bill. They are compiling all the little changes to the format of the law, and that is called codification. Remember, they are doing this in the same crappy tools that we have: Word, and PDFs.

Fixing the problem requires “sympathy for the devil.” Government sucks, the law sucks online, the ability to access it sucks. You have to have sympathy for the people who do it: they want to do their job better too. They are doing it on paper, with an army of lawyers. It is expensive to do this work: states and cities are paying hundreds of thousands of dollars to keep laws up to date.

That’s where OpenGov Foundation wants to help.

He shows the State Decoded project. It is a way to put your city code online and share it with others. It is an open source project that started in Virginia, where its creator was frustrated by his inability to access the laws of his own state. Now the OpenGov Foundation is taking it to states and cities across the country.

How does it start? We start with the state code, in a PDF, RTF, or text file. Parse that into XML. BOOM: decoded law. Now you can send a link, search, plug into an API, enable bulk downloads. All the things we know we need to be full citizens, given the tools we have; that’s what this is about. It’s spreading to other cities.
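As a rough illustration of the "text file in, XML out, now you can search it" step, here is a minimal sketch using only the Python standard library. It is not the State Decoded's actual parser: the sample text, the `§ number. title.` heading pattern, the element names, and the `parse_code` and `search` functions are all assumptions made for this example, and real legal codes are far messier.

```python
import re
import xml.etree.ElementTree as ET

# Invented sample text; real codes vary widely in layout.
RAW_CODE = """\
§ 1-101. Leash required.
Dogs must be on a leash in public parks.
§ 1-102. Gambling prohibited.
No person shall operate a gaming table.
"""

# Assumed heading pattern: "§ <number>. <catch line>."
HEADING = re.compile(r"^§ (?P<number>[\d-]+)\. (?P<title>.+?)\.?$")


def parse_code(raw: str) -> ET.Element:
    """Split plain text into <section> elements under a <code> root."""
    root = ET.Element("code")
    body = None
    for line in raw.splitlines():
        match = HEADING.match(line.strip())
        if match:
            section = ET.SubElement(root, "section", number=match["number"])
            ET.SubElement(section, "title").text = match["title"]
            body = ET.SubElement(section, "text")
            body.text = ""
        elif body is not None and line.strip():
            # Append body lines to the most recent section.
            body.text = (body.text + " " + line.strip()).strip()
    return root


def search(root: ET.Element, term: str) -> list[str]:
    """Return the numbers of sections whose title or text mentions `term`."""
    term = term.lower()
    return [
        section.get("number")
        for section in root.iter("section")
        if term in "".join(section.itertext()).lower()
    ]


law = parse_code(RAW_CODE)
print(ET.tostring(law, encoding="unicode"))
print(search(law, "leash"))
```

Once each section has a stable identifier, the things Seamus lists (a link per section, search, an API, bulk downloads) become ordinary engineering on top of structured data.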

Open law is spreading like (candle) fire; major cities are adopting it, but there are about 15,000 cities and states left to go.

Sometimes governments like San Francisco hire us to do this, sometimes they just give us the data and ask for anything we can do to publish it, and sometimes it’s scraped, with all of the legal implications of that.

Q&A

Ian Condry: I’m a little uncomfortable with the language that “law is the source code of society.” Transparency doesn’t necessarily equal access. Not everyone has equal ability to take advantage of the law. For example, rich music industries sue college students, and even though the student is right, they don’t have 250k to fight the suit in court. Maybe it’s beyond your purview, but to what extent do you think about the distance between law and community action? Starting the casino example with the problem of not being able to access the casino law sounds a little naive in terms of what a casino does to a community. So how do we actually enable people who aren’t expensive law firms and legislators to be able to make use of this stuff? Some laws will be helpful, but many aren’t about our daily life. Is there a social API? How do you connect law to what society actually needs? Part of the frustration with government isn’t just that it’s not searchable, but that it may not be tackling the problems around us. Are you working with groups to do that?

Seamus: Next time I will use a better example. I was trying to give an example of how a law starts.

Ian: Before there were laws, people knew it wasn’t great to kill people. Legislators are the last ones to get it. Culture changes, then the law. Copyright is a great example of that.

Seamus: To your question about how we get this data to people who need it, that’s how we spend half of our time. We come up with use cases for people and why they might need particular pieces of data. Our workflow is responsive. We get it out to the hands of users, and then see what comes back and build around that. The first users are the people who need volume, they need technology to make their jobs more efficient. They need to serve more people or serve the clients they have more efficiently.

Yu: How are you going to present and visualize the laws? Have you thought about the annotation framework? Or social interactions?

Seamus: Visualization is super powerful. It’s built in, in a very small sense: we have word clouds. As we get more and more legal codes up, the possibilities for visualizations and comparisons become greater. For example, what is the state next door doing? And how did that work? We’re not there yet. Madison is going to sit on top of this and allow you to draft legislation or edit legislation on top of the legal code. Madison 2.0 is under development right now, and you can check it out on our GitHub. We are doing the state of Maryland in January and the Federal government in February. It is bare-bones right now, but you can check it out.

Andrew Whitacre: Is this mainly aimed at legislators or those who work for them? At the federal level, with the ACA or Dodd-Frank, I can’t imagine someone outside of gov engaging just because they’re plain text or searchable. Maybe yes with a leash law. But at the top level, who will make the most use of this? Who is your audience? If you would user test, who would you test?

Seamus: The most use of Madison or the State Decoded? Broadly we define our user testing groups as a group on the inside of government (like staff assistants, etc.), and a group on the outside. We built Madison so we could get folks like you to tell us what is wrong with SOPA, etc. To bring constituent input into the bill writing process. I didn’t talk about our user research: it is not just UI or UX research, but how do you introduce this to people? What do they actually want? They feel like if they send an email, no one reads it. Or if they call, no one is listening. Some of those problems we can bring technical solutions to. For example, if you are a legislator you could send a read receipt to your constituent. But some of these technical issues run up against people issues.

Sasha: The goal is to open up the data to as many people as possible; the method is to open up data and APIs. But there are challenges as developers implement closed services on top of open data. For example, as cities open transit data, entrepreneurs develop both paid and free apps around these data. Oftentimes as data becomes more open, private, paid services emerge and therefore the process is classed. Those who already had the most access get even more access, and those with fewer resources are still closed out. Are you having this debate? Are there social, technical, legal constraints we might want to place on what’s done with open gov services? For example, see the debates over the AGPL in the free software community.

Seamus: That’s a fantastic question. Are we having that debate? I don’t think we are. (Shows his disclaimer on the SF Decoded website). That means do whatever you want with it, paid or unpaid. That’s why there’s an API on top of this data. The data is useless without an application layer. How does that get better? Part of it is what we’re doing, part of it is what you’re doing — and part of it is paid services. But that’s how innovation happens in a lot of cases. I don’t think you can ensure that there’s nobody walling off access somewhere down the pipeline.

Sasha: I’m not going to argue against paid services, but if you did want to ensure the broadest possible access, there are legal strategies you could use. For example, you could require that services built on top of yours are free/libre and open. It’s a debate worth having.

Seamus: It is a debate that is worth having, and very much underway right now.

Rodrigo: One narrative for the rise of open gov is pressure from social groups, or the rise of technologies. I’ve seen another narrative — openness as the result of decline. In one UK municipal government (Barnet, in London), the government projected that they would not be able to provide services beyond basic ones by 2015. Solihull in the west midlands of England recently said it’s switching to a ‘social council’ model in part to deal with scarcity. In other words, because we have no money, and government is in decline, that’s why we should be open. Do you see this in the US?

Seamus: You are seeing this in the US. It is the world government lives in today. When I was talking about sympathy for the devil, I swear the folks who are doing this had a little more money to do it better. They are having to do more with less. They don’t know what more to do with less. Governments have to take the first step of saying that they don’t know what to do, and people will be willing to sit at the table to help.

Felix: I’m with the W3C. Wouldn’t it be useful to link it to actual cases? A situation where you know whether the law is applicable or not. For example, in the Dog Leash case, I would have asked someone who has a dog, since I’d assume he already knows what’s right or wrong. That’s two clicks away from where I want to be. You also want to enhance the law, as a citizen. Was it made a hundred years ago, does it apply? When you elect people, you want to make them enhance the law. But you can’t choose the right people if you don’t know the law.

Q2: You’re taking a next step. You’re saying how can people find all the things this legal information is linked to. I think getting it up there in the first place is valuable. Does this extend to common law, cases that interpret the law? That may be in the future. But you’re also asking beyond the letter of the law.

Felix: I like the idea of the index. If people could add their current case to what’s written in the law, then you could see how articles are linked to each other, see the emerging landscape, notice what’s really useful to people. An index would be the first thing.

Seamus: (Points to court decisions on vacode.org.) We’re facing the issue of scale right now, and getting the actual data often takes weeks or months. The beauty of the framework is if you get the data and get it into the right format, you are able to get the data subsequent times (like for court decisions). And then you can hand it over to government for continued use because they will realize they don’t have to spend millions of dollars on a system.

We are starting with comments at the bottom because it is more familiar than annotating the text directly. He shares an example of people finding in SF code that you can’t store a bike in a regular parking spot. People thought that was dumb and once it was identified, it could be addressed. I want to start a “dumb law cleanup.” When the code is accessible, that can be done. It is a very small step for people to be able to flag a dumb law with a comment. The technology part is not that hard in this case — the process part is really hard and sticky.

Saul: Let me tell you about a sticky issue in MA: building an ethanol train depot. MA doesn’t have a law about trains because it is superseded by federal law. For the train depot to be built, the company needed a maritime wetlands permit for ethanol trains. They weren’t just trying to pass a law; it was late in the session and they attached it as a rider to the budget bill. If you had access to the environmentalists working on it, they could walk you through it, and tell you where to look to follow it. But without that access, what do you do? You can never know to google “ordinances about ethanol trains and wetlands in the budget” if you’re following this issue. The second complication was the governor’s stance, regarding the veto. So there’s a discovery problem, “this is what to watch,” as well as a secondary problem: how a bill actually moves into law.

Seamus: There are lots of sticky wickets there. But, throughout what you just described, there are examples of technology and process components that can be improved. The advocacy part is the “people” side of it. But, how can we arm people, legislators, reporters like you, to talk about whether it meets the state’s needs, instead of arguing about how stupid it is that the gov can find something in a bill I can’t, or “you need an army of lawyers to even see what’s in there.”

Waldo has a great phrase about the State Decoded: It’s got all the niceties of web design, with the powerful tools that lawyers use.

The Madison side of it is gathering all the feedback in a useful format. It’s super useful if groups can point to something specific in the bill: “we like this, we don’t want this in here.”

What you are describing there, I would have liked to have on Obamacare. About 70% of it both parties could agree on. But there was no way to break that out and pass it as a bipartisan bill.

Q: What has the reception of this been like on both sides of the spectrum? On one end, you have lefties who say “now underprivileged people can access the law and take action!” On the other hand, hardcore libertarian software engineers say “now we have the data we can refactor the law and eliminate the IRS!” What happens when you put these people in the same conversation?

Seamus: It’s all over the place. There is a lot of opposition. There is a little bit of a sense of a lack of utility: “I don’t know what this does.” But for us, the “dog laws” get hundreds or thousands of references, compared to just a few in Lexis Nexis, or none in PDFs.

There are entrenched people who are against this in principle — they don’t want to make it easy for you to discover the law. They make a lot of money from you not being able to access the law. That isn’t stopping us though. We do get opposition all the time.

Nadeem Mazen, Cambridge City Councilor-Elect: Maybe we could talk about using this locally. I can see how this could be a collaborative document, like Etherpad or Google Docs, although the code for those is messy. Two questions:

1. I spend lots of time parsing large XML text documents for textbook manufacturers. I’m not convinced XML can be the solution. Is there something at the nexus of what you’re doing, what Google does, and what Wolfram Alpha is doing? CouchDB comes to mind, among alternatives to XML. (Data in OpenGov is stored in MySQL.) APIs are fine, but at the lowest level the way the data is stored makes a difference to how APIs are made and how the programmer thinks.

2. The end conclusion is we want a Google for Saul’s question: how does the line item veto work, for example. We want everything related to the law: a larger system of organizing municipal data.

Seamus: XML is not the answer, you’re right. No one has sat down and said “we actually have the ability to have the laws all on the same database in the same data format. How do we do this?” We would love to work with Google, etc., but their best minds aren’t working on this; they are working on Google Glass and talking computers.

Nadeem: How do you get a new city in?

Seamus: Sometimes we go to them and say “hey, look what we did.” Sometimes a city comes to us. Sometimes we just do it and hope no one gets upset — like in Chicago for example. Doing it one by one is not scalable for us.