There are many ways for a computer system to fail. This week's disclosure that Her Majesty's Revenue and Customs has played lost-in-the-post with two CDs holding the nation's Child Benefit data is one of the stranger ones. The Child Benefit database includes names, addresses, identifying numbers, and often bank details, on all the UK's 25 million families with a child under 16. The National Audit Office requested a subset for its routine audit; the HMRC sent the entire database off by TNT post.
There are so many things wrong with this picture that it would take a village of late-night talk show hosts to make fun of them all. But the bottom line is this: when the system was developed no one included privacy or security in the specification or thought about the fundamental change in the nature of information when paper-based records are transmogrified into electronic data. The access limitations inherent in physical storage media must be painstakingly recreated in computer systems or they do not exist. The problem with security is it tends to be inconvenient.
With paper records, the more data you provide the more expensive and time-consuming it is. With computer records, the more data you provide the cheaper and quicker it is. The NAO's file of email relating to the incident (PDF) makes this clear. What the NAO wanted (so it could check that the right people got the right benefit payments): national insurance numbers, names, and benefit numbers. What it got: everything. If the discs hadn't gotten lost, we would never have known.
Ironically enough, this week in London also saw at least three conferences on various aspects of managing digital identity: Digital Identity Forum, A Fine Balance, and Identity Matters. All these events featured the kinds of experts the UK government has been ignoring in its mad rush to create and collect more and more data. The workshop on road pricing and transport systems at the second of them, however, was particularly instructive. Led by science advisor Brian Collins, the most notable thing about this workshop is that the 15 or 20 participants couldn't agree on a single aspect of such a system.
Would it run on GPS or GSM/GPRS? Who or what is charged, the car or the driver? Do all roads cost the same or do we use differential pricing to push traffic onto less crowded routes? Most important, is the goal to raise revenue, reduce congestion, protect the environment, or rebalance the cost of motoring so the people who drive the most pay the most? The more purposes the system is intended to serve, the more complicated and expensive it will become, and the less likely it is to answer any of those goals successfully. This point has of course also been made about the National ID card by the same sort of people who have warned about the security issues inherent in large databases such as the Child Benefit database. But it's clearer when you start talking about something as limited as road charging.
For example: if you want to tag the car you would probably choose a dashboard-top box that uses GPS data to track the car's location. It will have to store and communicate location data to some kind of central server, which will use it to create a bill. The data will have to be stored for at least a few billing cycles in case of disputes. Security services and insurers alike would love to have copies. On the other hand, if you want to tag the driver it might be simpler just to tie the whole thing to a mobile phone. The phone networks are already set up to do hand-off between nodes, and tracking the driver might also let you charge passengers, or might let you give full cars a discount.
The problem is that the discussion is coming from the wrong angle. We should not be saying, "Here is a clever technological idea. Oh, look, it makes data! What shall we do with it?" We should be defining the problem and considering alternative solutions. The people who drive most already pay most via the fuel pump. If we want people to drive less, maybe we should improve public transport instead. If we're trying to reduce congestion, getting employers to be more flexible about working hours and telecommuting would be cheaper, provide greater returns, and, crucially for this discussion, not create a large database system that can be used to track the population's movements.
(Besides, said one of the workshop's participants: "We live with the congestion and are hugely productive. So why tamper with it?")
It is characteristic of our age that the favored solution is the one that creates the most data and the biggest privacy risk. No one in the cluster of organisations opposing the ID card - No2ID, Privacy International, Foundation for Information Policy Research, or Open Rights Group - wanted an incident like this week's to happen. But it is exactly what they have been warning about: large data stores carry large risks that are poorly understood, and it is not enough for politicians to wave their hands and say we can trust them. Information may want to be free, but data want to leak.
Wendy M. Grossman’s Web site has an extensive archive of her books, articles, and music, and an archive of all the earlier columns in this series. Readers are welcome to post here, at net.wars home, at her personal blog, or by email to firstname.lastname@example.org (but please turn off HTML).