
Original sin

Is there a prayer that says, "Oh, Lord, protect us from giddy optimists?" If not, maybe there should be.

Granted, optimists, as the veteran screenwriter Earl Pomerantz wrote recently, are responsible for much of the progress in the world. (Pessimists are too busy coming up with the reasons why it - whatever it is - will not work.) But *giddy* optimism is the kind of thing that gets people imagining that if we can get this bit of new technology to work it will be a great force for democracy, fairness, and social justice.

Last week, as one of the Open Data Institute's Friday lunchtime series, Alan Patrick poured a few helpful dark side thoughts onto open data.

Patrick has company. In Seeing like a geek, Tom Slee writes that open data enables the replacement of small gatekeepers with fewer, larger ones, devalues "informal knowledge", and widens existing inequalities by empowering the already empowered. Both Patrick and Slee cite the same study showing that They Work For You, the much-praised MySociety site, is disproportionately used by men, the college-educated, and the over-54s, the very people who find it easiest to contact their MPs in other ways. O'Reilly Radar has a rebuttal by Mike Loukides, which argues that while open data is not an unalloyed good, private data is a public bad. The problem, as I wrote last year in response to a talk given by Bill Thompson, is that there's no way to open data only to the good guys with pure motives without making it closed.

The good thing about these nay-sayers is that they're speaking up so early in the history of open data. Patrick himself compared the current state of open data to the early days of the Internet itself. At the beginning, he said, there was an assumption that the "bad guys" would always be on the outside and that the Internet would only be used for good. Patrick has a sufficiently long history at places like BT and the BBC to be able to lay reasonable claim to remembering the Internet's origins, but I'm still not sure he's right. When the pioneers talk, such as at last summer's Internet history event, the stories they tell are about just trying to get stuff to work.

Patrick is right, however, that at the moment the most common complaint about open data is that there isn't enough of it opening up fast enough. He cited, among other things, a 2012 study from the University of Albany, NY that advised taking time to think about consequences and sustainability. You can see their point about the latter: at CDPD a few weeks ago, Meg Ambrose outlined her study showing that only 10 to 15 percent of Web content lasts online as long as a year. We call that vanishing content "link rot" or "bit rot"; DuckDuckGo founder Gabriel Weinberg has talked about "API rot"; next will be "dataset rot".

Consequences are of necessity harder to guess at. One of the points Patrick cited from the Albany study was the need to understand the practices that created a given dataset, especially since most were not created with public reuse in mind. Worst, of course, is the case he cited from the UK, in which the Secretary of State for Health, Jeremy Hunt, conflated open data with personal data in proposing to offer patient data to private companies for research. We've written enough here about the privacy aspects of this sort of thing; Patrick's larger point was that this textbook case of how not to approach open data is asymmetric. That is, the people who bear the risks and will suffer any damage that is caused are not the beneficiaries. "The benefits will be private, the losses public," Patrick said.

And won't hackers have fun matching black market data with the new open stuff? Here Patrick made a point I've long thought about: "This is a read/write game." In other words, what we have to fear is not just that criminals will be enabled to mount far more sophisticated spear phishing attacks but that they can literally poison the information supply. In one case Patrick cited, crime reporting, some areas stopped reporting crimes because opening the data made them fear their property values would sink.

When Patrick started talking about collateral damage, I began wondering about an analogy Michael Froomkin was mulling over a year or two back between privacy and environmental protection. As in the case of pollution, any damage caused by open data is likely to be cumulative and take place far downstream, possibly many years in the future. There's no obvious way to plan for this, any more than the guy making the first refrigerator could have foreseen the hole in the ozone layer over Australia.

Patrick is certainly right that the time to think about these issues is now, at the beginning. But there's a downside to the downside: there is a lot of data we really do want opened. Sometimes the most important thing at the beginning is to make a start before the pessimists talk you out of the whole thing.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. Stories about the border wars between cyberspace and real life are posted occasionally during the week at the net.wars Pinboard - or follow on Twitter.

