
July 25, 2008

Who?

A certain amount of government and practical policy is being made these days based on the idea that you can take large amounts of data and anonymize it so researchers and others can analyze it without invading anyone's privacy. Of particular sensitivity is the idea of giving medical researchers access to such anonymized data in the interests of helping along the search for cures and better treatments. It's hard to argue with that as a goal - just like it's hard to argue with the goal of controlling an epidemic - but both those public health interests collide with the principle of medical confidentiality.

The work of Latanya Sweeney was I think the first hint that anonymizing data might not be so straightforward; I've written before about her work. This week, at the Privacy Enhancing Technologies Symposium in Leuven, Belgium (which I regrettably missed) researchers Arvind Narayanan and Vitaly Shmatikov from the University of Texas at Austin won an award sponsored by Microsoft for taking reidentifying supposedly anonymized data a step further.

The pair took a database released by the online DVD rental company Netflix last year as part of the $1 million Netflix Prize, a project to improve upon the accuracy of the system's predictions. You know the kind of thing, since it's built into everything from Amazon to Tivos - you give the system an idea of your likes and dislikes by rating the movies you've rented and the system makes recommendations for movies you'll like based on those expressed preferences. To enable researchers to work on the problem of improving these recommendations, Netflix released a dataset containing more than 100 million movie ratings contributed by nearly 500,000 subscribers between December 1999 and December 2005 with, as the service stated in its FAQ, all customer identifying information removed.

Maybe in a world where researchers only had one source of information that would be a valid claim. But just as Sweeney showed in 1997 that it takes very little in the way of public records to re-identify a load of medical data supplied to researchers in the state of Massachusetts, Narayanan and Shmatikov's work reminds us that we don't live in a world like that. For one thing, people tend disproportionately to rate their unusual, quirky favorites. Rating movies takes time; why spend it on giving The Lord of the Rings another bump when what people really need is to know about the wonders of King of Hearts, All That Jazz, and The Tall Blond Man with One Black Shoe? The consequence is that the Netflix dataset is what they call "sparse" - that is, few subscribers have very similar records.

So: how much does someone need to know about you to identify your record in the database? It turns out, not much. The key is the public ratings at the Internet Movie Database, which include dates and real names. Narayanan and Shmatikov concluded that 99 percent of records could be uniquely identified from only eight matching ratings (of which two could be wrong); for 68 percent of the records you only need two (and reidentifying the rest becomes easier). And of course, if you know a little bit about the particular person whose record you want to identify things get a lot easier - the three movies I've just listed would probably identify me and a few of my friends.
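To make the matching idea concrete, here is a minimal sketch in Python. It is not Narayanan and Shmatikov's actual algorithm, which is more sophisticated; it simply counts how many known (movie, rating, approximate date) facts fit each "anonymized" record and reports a match only when exactly one record fits. The dataset, the user ids, the 14-day date tolerance, and the function names are all invented for illustration.

```python
from datetime import date

# Toy "anonymized" dataset: subscriber id -> {movie: (rating, date rated)}.
# Every id, rating, and date here is made up for illustration.
ANONYMIZED = {
    "user_001": {
        "The Lord of the Rings": (5, date(2005, 1, 10)),
        "King of Hearts": (5, date(2005, 2, 3)),
        "All That Jazz": (4, date(2005, 2, 20)),
    },
    "user_002": {
        "The Lord of the Rings": (5, date(2005, 1, 12)),
        "Titanic": (3, date(2005, 3, 1)),
    },
    "user_003": {
        "The Lord of the Rings": (4, date(2005, 4, 2)),
        "All That Jazz": (2, date(2005, 6, 15)),
    },
}


def matches(record, known, max_days=14):
    """Count how many known (movie, rating, date) facts fit this record."""
    hits = 0
    for movie, (rating, when) in known.items():
        if movie in record:
            rec_rating, rec_when = record[movie]
            # Same rating, and rated within max_days of the known date.
            if rec_rating == rating and abs((rec_when - when).days) <= max_days:
                hits += 1
    return hits


def reidentify(dataset, known, allowed_misses=0):
    """Return the one record id consistent with `known`, or None if
    no record fits or more than one does."""
    needed = len(known) - allowed_misses
    candidates = [uid for uid, rec in dataset.items()
                  if matches(rec, known) >= needed]
    return candidates[0] if len(candidates) == 1 else None


# What an outsider might learn from public ratings (hypothetical values).
popular = {"The Lord of the Rings": (5, date(2005, 1, 11))}
quirky = {
    "King of Hearts": (5, date(2005, 2, 1)),
    "All That Jazz": (4, date(2005, 2, 22)),
}

print(reidentify(ANONYMIZED, popular))  # None: two records fit
print(reidentify(ANONYMIZED, quirky))   # user_001
```

The popular title narrows nothing down, while two quirky ratings single out one record, which is the practical meaning of "sparse". The `allowed_misses` parameter is a crude stand-in for the finding that some of the matching ratings can be wrong.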

Even if you don't care if your tastes in movies are private - and both US law and the American Library Association's take on library loan records would protect you more than you yourself would - there are a couple of notable things here. First of all, the compromise last week whereby Google agreed to hand Viacom anonymized data on YouTube users isn't as good a deal for users as they might think. A really dedicated searcher might well think it worth the effort to come up with a way to re-identify the data - and so far rightsholders have shown themselves to be very dedicated indeed.

Second of all, the Thomas-Walport review on data-sharing actually recommends requiring NHS patients to agree to sharing data with medical researchers. There is a blithe assumption running through all the government policies in this area that data can be anonymized, and that as long as they say our privacy is protected it will be. It's a perfect example of what someone this week called "policy-based evidence-making".

Third of all, most such policy in this area assumes it's the past that matters. What may be of greater significance, as Narayanan and Shmatikov point out, is the future: forward privacy. Once a virtual identity has been linked to a real-world identity, that linkage is permanent. Yes, you can create a new virtual identity, but any slip that links it to either your previous virtual or your real-world identity blows your cover.

The point is not that we should all rush to hide our movie ratings. The point is that we make optimistic assumptions every day that the information we post and create has little value and won't come back to bite us on the ass. We do not know what connections will be possible in the future.

Wendy M. Grossman's Web site has an extensive archive of her books, articles, and music, and an archive of all the earlier columns in this series. Readers are welcome to post here, at net.wars home, at her personal blog, or by email to netwars@skeptic.demon.co.uk (but please turn off HTML).

July 11, 2008

Voters for sale

It must be hard to be the Direct Marketing Association. All individuals in the DMA must know that they themselves hate getting marketing calls during dinner, weeding the real post out from the junk mail, and constantly having to unsubscribe from email lists that they're only on because they had the misfortune to buy something from the sender. Collectively, the DMA remains firmly convinced that people want advertising really, it just has to be targeted right (at which point people no longer call it advertising). It must be very hard for everyone involved to maintain this level of cognitive dissonance.

And it leads them to do things as an organization that probably each individual would oppose if they were working for someone else. Today the DMA is opposing the withdrawal of the edited electoral register, a recommendation appearing in the Data-Sharing Review, published by the Ministry of Justice and written by Information Commissioner Richard Thomas and Dr Mark Walport. There's a lot of interesting stuff to digest; the electoral register issue is one of the simpler bits.

To recap: historically the UK, like the US, treated the electoral rolls as public information. In the UK every household gets sent a canvassing form once a year that comes with a stern warning that you are legally required to register.

Starting in the 1830s, the British electoral rolls have been available for public inspection and sale; what a godsend for direct marketers as their industry grew up. As of 2001, electoral registration officers are required to sell a copy of the register at a specified price to anyone who wants it under Regulation 48 of the Representation of the People (England and Wales) Regulations. Almost immediately there were objections on privacy grounds, most notably a complaint by Pontefract-based Brian Robertson, a retired accountant, against Wakefield City Council because there was no provision for him to prevent the sale of his information for commercial use. He refused to register, took them to court - and won.

The regulations were promptly amended to require councils to maintain two registers: the full public register and an edited version that could be sold to commercial organizations and others and to which voters would be added automatically - but with the right to opt out. The first edited registers appeared in 2002.

And there was a lot of confusion. The canvassing forms that first year didn't make it very clear what the edited register was, and it was easy to make the mistake of thinking that if you opted out you would not be able to vote. Subsequent years saw amended forms that made it more clear just what you were opting out of. And the results really shouldn't surprise anyone: in the latest rolls 40 percent of voters opted out, double the percentage in the first years. Given that, it's not entirely clear why the government needs to withdraw the register. If they just wait a few more years everyone of any value to marketers will have opted out, and the edited rolls will become useful again as a list of all the people who aren't worth marketing to. Anyone left presumably either didn't understand the form, is so lonely they enjoy the attention, or is so mentally afflicted that someone else filled out the form for them.

The full register is available - at least in theory - only to a select group of people and organizations: political parties for electoral purposes, credit reference agencies to check names and addresses when people apply for credit, and law enforcement. The main purchasers of the edited register, the Thomas-Walport report notes, are direct marketing companies and companies compiling directories.

Thomas and Walport disapprove of its existence on these grounds: "It sends a particularly poor message to the public that personal information collected for something as vital as participation in the democratic process can be sold to 'anyone for any purpose'."

A key data protection principle is that a change of use of personal information requires the consent of the individual. If ever there were a more significant change of use than selling information collected to enable people to vote to third party companies for general marketing purposes, I don't know what it would be.

The DMA's objection to its withdrawal is that its members won't be able to clean their lists and keep them accurate and up-to-date. And it happily sees the direct mail envelope as more than half full: "Some householders have opted out, but around 60 percent have chosen to remain on the edited register." They don't believe the forms are all confusing. And the DMA plays the environmental card: targeting reduces the amount of waste paper the industry produces.

One issue neither group tackles is whether the register represents a significant source of income for councils. How much are we willing to pay for privacy? This warrants more research; a quick glance turns up figures from Bath and North East Somerset Council. In 2005-2006, the council netted £1,553 and £380.50 for the sales of the full and edited registers respectively; in 2006-2007 those figures were £1,558.50 and £681. If that's indicative of national trends, we can afford it, especially given the savings on administering the opt-out process.

"The edited register does serve a purpose," the DMA concludes, "and so should not be abolished." A purpose, yes. Just not our purpose.


June 27, 2008

Mistakes were made

This week, with the release of the Poynter Review (PDF), we got the detail on what went wrong at Her Majesty's Revenue and Customs that led to the loss last year of those two CDs full of the personal details of 25 million British households. We also got a hint of how and whether the future might be different with the publication yesterday of Data Handling: Procedures in Government (PDF), written by Sir Gus O'Donnell and commissioned by the Prime Minister after the HMRC loss. The most obvious message of both reports: government needs to secure data better.

The nicest thing the Poynter review said was that HMRC has already made changes in response to its criticisms. Otherwise, it was pretty much a surgical demonstration of "institutional deficiencies".

The chief points:


- Security was not HMRC's top priority.

- HMRC in fact had the technical ability to send only the selection of data that NAO actually needed, but the staff involved didn't know it.

- There was no designated single point of contact between HMRC and NAO.

- HMRC used insecure methods for data storage and transfer.

- The decision to send the CDs to the NAO was taken by junior staff without consulting senior managers - which under HMRC's own rules they should have done.

- The reason HMRC's junior staff did not consult managers was that they believed (wrongly) that NAO had absolute authority to access any and all information HMRC had.

- The HMRC staffer who dispatched the discs incorrectly believed the TNT Post service was secure and traceable, as required by HMRC policy. A different TNT service that met those requirements was in fact available.

- HMRC policies regarding information security and the release of data were not communicated sufficiently through the organization and were not sufficiently detailed.

- HMRC failed on accountability, governance, information security...you name it.

The real problem, though, isn't any single one of these things. If junior staff had consulted senior staff, it might not have mattered that they didn't know what the policies were. If HMRC had used proper information security and secure methods for data storage (that is, encryption rather than simple password protection), they wouldn't have had access to send the discs. If they'd understood TNT's services correctly, the discs wouldn't have gotten lost - or would at least have been traceable if they had.

The real problem was the interlocking effect of all these factors. That, as Nassim Nicholas Taleb might say, was the black swan.

For those who haven't read Taleb's The Black Swan: The Impact of the Highly Improbable, the black swan stands for the event that is completely unpredictable - because, like black swans until one was spotted in Australia, no such thing has ever been seen - until it happens. Of course, data loss is pretty much a white swan; we've seen lots of data breaches. The black swan, really, is the perfectly secure system that is still sufficiently open for the people who need to use it.

That challenge is what O'Donnell's report on data handling is about and, as he notes, it's going to get harder rather than easier. He recommends a complete rearrangement of how departments manage information as well as improving the systems within individual departments. He also recommends greater openness about how the government secures data.

"No organisation can guarantee it will never lose data," he writes, "and the Government is no exception." O'Donnell goes on to consider how data should be protected and managed, not whether it should be collected or shared in the first place. That job is being left for yet another report in progress, due soon.

It's good to read that some good is coming out of the HMRC data loss: all departments are, according to the O'Donnell report, reviewing their data practices and beginning the process of cultural change. That can only be a good thing.

But the underlying problem is outside the scope of these reports, and it's this government's fondness for creating giant databases: the National Identity Register, ContactPoint, the DNA database, and so on. If the government really accepted the principle that it is impossible to guarantee complete data security, what would they do? Logically, they ought to start by cancelling the data behemoths on the understanding that it's a bad idea to base public policy on the idea that you can will a black swan into existence.

It would make more sense to create a design for government use of data that assumes there will be data breaches and attempts to limit the adverse consequences for the individuals whose data is lost. If my privacy is compromised alongside 50 million other people's and I am the victim of identity theft, does it help me that the government department that lost the data knows which staff member to blame?

As Agatha Christie said long ago in one of her 80-plus books, "I know to err is human, but human error is nothing compared to what a computer can do if it tries." The man-machine combination is even worse. We should stop trying to breed black swans and instead devise systems that don't create so many white ones.


May 30, 2008

Ten

It's easy to found an organization; it's hard to keep one alive even for as long as ten years. This week, the Foundation for Information Policy Research celebrated its tenth birthday. Ten years is a long time in Internet terms, and even longer when you're trying to get government to pay attention to expertise in a subject as difficult as technology policy.

My notes from the launch contain this quote from FIPR's first director, Caspar Bowden, which shows you just how difficult FIPR's role was going to be: "An educational charity has a responsibility to speak the truth, whether it's pleasant or unpleasant." FIPR was intended to avoid the narrow product focus of corporate laboratory research and retain the traditional freedoms of an academic lab.

My notes also show the following list of topics FIPR intended to research: the regulation of electronic commerce; consumer protection; data protection and privacy; copyright; law enforcement; evidence and archiving; electronic interaction between government, businesses, and individuals; the risks of computer and communications systems; and the extent to which information technologies discriminate against the less advantaged in society. Its first concern was intended to be researching the underpinnings of electronic commerce, including the then recent directive launched for public consultation by the European Commission.

In fact, the biggest issue of FIPR's early years was the crypto wars leading up to and culminating in the passage of the Regulation of Investigatory Powers Act (2000). It's safe to say that RIPA would have been a lot worse without the time and energy Bowden spent listening to Parliamentary debates, decoding consultation papers, and explaining what it all meant to journalists, politicians, civil servants, and anyone else who would listen.

Not that RIPA is a fountain of democratic behavior even as things are. In the last couple of weeks we've seen the perfect example of the kind of creeping functionalism that FIPR and Privacy International warned about at the time: the Poole council using the access rules in RIPA to spy on families to determine whether or not they really lived in the right catchment area for the schools their children attend.

That use of the RIPA rules, Bowden said at FIPR's half-day anniversary conference last Wednesday, sets a precedent for accessing traffic data for much lower level purposes than the government originally claimed it was collecting the data for. He went on to call the recent suggestion that the government may be considering a giant database, updated in real time, of the nation's communications data "a truly Orwellian nightmare of data mining, all in one place."

Ross Anderson, FIPR's founding and current chair and a well-known security engineer at Cambridge, noted that the same risks adhere to the NHS database. A clinic that owns its own data will tell police asking for the names of all its patients under 16 to go away. "If," said Anderson, "it had all been in the NHS database and they'd gone in to see the manager of BT, would he have been told to go and jump in the river? The mistake engineers make too much is to think only technology matters."

That point was part of a larger one that Anderson made: that hopes that the giant databases under construction will collapse under their own weight are forlorn. Think of developing Hulk-Hogan databases and the algorithms for mining them as an arms race, just like spam and anti-spam. The same principle that holds that today's cryptography, no matter how strong, will eventually be routinely crackable means that today's overload of data will eventually, long after we can remember anything we actually said or did ourselves, be manageable.

The most interesting question is: what of the next ten years? Nigel Hickson, now with the Department of Business, Enterprise, and Regulatory Reform, gave some hints. On the European and international agenda, he listed the returning dominance of the large telephone companies on the excuse that they need to invest in fiber. We will be hearing about quality of service and network neutrality. Watch Brussels on spectrum rights. Watch for large debates on the liability of ISPs. Digital signatures, another battle of the late 1990s, are also back on the agenda, with draft EU proposals to mandate them for the public sector and other services. RFID, the "Internet for things" and the ubiquitous Internet will spark a new round of privacy arguments.

Most fundamentally, said Anderson, we need to think about what it means to live in a world that is ever more connected through evolving socio-technological systems. Government can help when markets fail; though governments themselves seem to fail most notoriously with large projects.

FIPR started by getting engineers, later engineers and economists, to talk through problems. "The next growth point may be engineers and psychologists," he said. "We have to progressively involve more and more people from more and more backgrounds and discussions."

Probably few people feel that their single vote in any given election really makes a difference. Groups like FIPR, PI, No2ID, and ARCH remind us that even a small number of people can have a significant effect. Happy birthday.




May 23, 2008

The haystack conundrum

Early this week the news broke that the Home Office wants to create a giant database in which will be stored details of all communications sent in Britain. In other words, instead of data retention, in which ISPs, telephone companies, and other service providers would hang onto communications data for a year or seven in case the Home Office wanted it, everything would stream to a Home Office data center in real time. We'll call it data swallowing.

Those with long memories - who seem few and far between in the national media covering this sort of subject - will remember that in about 1999 or 2000 there was a similar rumor. In the resulting outraged media coverage it was more or less thoroughly denied and nothing had been heard of it since, though privacy advocates continued to suspect that somewhere in the back of a drawer the scheme lurked, dormant, like one of those just-add-water Martians you find in the old Bugs Bunny cartoons. And now here it is again in another leak that the suspicious veteran watcher of Yes, Minister might think was an attempt to test public opinion. The fact that it's been mooted before makes it seem so much more likely that they're actually serious.

This proposal is not only expensive, complicated, slow, and controversial/courageous (Yes, Minister's Fab Four deterrents), but risk-laden, badly conceived, disproportionate, and foolish. Such a database will not catch terrorists, because given the volume of data involved trying to use it to spot any one would-be evil-doer will be the rough equivalent of searching for an iron filing in a haystack the size of a planet. It will, however, make it possible for anyone trawling the database to make any given individual's life thoroughly miserable. That's so disproportionate it's a divide-by-zero error.

The risks ought to be obvious: this is a government that can't keep track of the personal details of 25 million households, which fit on a couple of CDs. Devise all the rules and processes you want, the bigger the database the harder it will be to secure. Besides personal information, the giant communications database would include businesses' communication information, much of it likely to be commercially sensitive. It's pretty good going to come up with a proposal that equally offends civil liberties activists and businesses.

In a short summary of the proposed legislation, we find this justification: "Unless the legislation is updated to reflect these changes, the ability of public authorities to carry out their crime prevention and public safety duties and to counter these threats will be undermined."

Sound familiar? It should. It's the exact same justification we heard in the late 1990s for requiring key escrow as part of the nascent Regulation of Investigatory Powers Act. The idea there was that if the use of strong cryptography to protect communications became widespread law enforcement and security services would be unable to read the content of the messages and phone calls they intercepted. This argument was fiercely rejected at the time, and key escrow was eventually dropped in favor of requiring the subjects of investigation to hand over their keys under specified circumstances.

There is much, much less logic to claiming that police can't do their jobs without real-time copies of all communications. Here we have a real analogy: postal mail, which has been with us since 1660. Do we require copies of all letters that pass through the post office to be deposited with the security services? Do we require the Royal Mail's automated sorting equipment to log all address data?

Sanity has never intervened in this government's plans to create more and more tools for surveillance. Take CCTV. Recent studies show that despite the millions of pounds spent on deploying thousands of cameras all over the UK, they don't cut crime, and, more important, the images help solve crime in only 3 percent of cases. But you know the response to this news will not be to remove the cameras or stop adding to their number. No, the thinking will be like the scheme I once heard for selling harmless but ineffective alternative medical treatments, in which the answer to all outcomes is more treatment. (Patient gets better - treatment did it. Patient stays the same - treatment has halted the downward course of the disease. Patient gets worse - treatment came too late.)

This week at Computers, Freedom, and Privacy, I heard about the Electronic Privacy Information Center's work on fusion centers, relatively new US government efforts to mine many commercial and public sources of data. EPIC is trying to establish the role of federal agencies in funding and controlling these centers, but it's hard going.

What do these governments imagine they're going to be able to do with all this data? Is the fantasy that agents will be able to sit in a control room somewhere and survey it all on some kind of giant map on which criminals will pop up in red, ready to be caught? They had data before 9/11 and failed to collate and interpret it.

Iron filing; haystack; lack of a really good magnet.


May 9, 2008

Swings and roundabouts

There was a wonderful cartoon that cycled frequently around computer science departments in the pre-Internet 1970s - I still have my paper copy - that graphically illustrated the process by which IT systems get specified, designed, and built, and showed precisely why and how far they failed the user's inner image of what it was going to be. There is a scan here. The senior analyst wanted to make sure no one could possibly get hurt; the sponsor wanted a pretty design; the programmers, confused by contradictory input, wrote something that didn't work; and the installation was hideously broken.

Translate this into the UK's national ID card. Consumers, Sir James Crosby wrote in March (PDF), want identity assurance. That is, they - or rather, we - want to know that we're dealing with our real bank rather than a fraud. We want to know that the thief rooting through our garbage can't use any details he finds on discarded utility bills to impersonate us, change our address with our bank, clean out our accounts, and take out 23 new credit cards in our name before embarking on a wild spending spree leaving us to foot the bill. And we want to know that if all that ghastliness happens to us we will have an accessible and manageable way to fix it.

We want to swing lazily on the old tire and enjoy the view.

We are the users with the seemingly simple but in reality unobtainable fantasy.

The government, however - the project sponsor - wants the three-tiered design that barely works because of all the additional elements in the design but looks incredibly impressive. ("Be the envy of other major governments," I feel sure the project brochure says.) In the government's view, they are the users and we are the database objects.

Crosby nails this gap when he draws the distinction between ID assurance and ID management:

The expression 'ID management' suggests data sharing and database consolidation, concepts which principally serve the interests of the owner of the database, for example, the Government or the banks. Whereas we think of "ID assurance" as a consumer-led concept, a process that meets an important consumer need without necessarily providing any spin-off benefits to the owner of any database.

This distinction is fundamental. An ID system built primarily to deliver high levels of assurance for consumers and to command their trust has little in common with one inspired mainly by the ambitions of its owner. In the case of the former, consumers will extend use both across the population and in terms of applications such as travel and banking. While almost inevitably the opposite is true for systems principally designed to save costs and to transfer or share data.

As writer and software engineer Ellen Ullman wrote in her book Close to the Machine, databases infect their owners, who may start with good intentions but are ineluctably drawn to surveillance.

So far, the government pushing the ID card seems to believe that it can impose anything it likes and if it means the tree collapses with the user on the swing, well, that's something that can be ironed out later. Crosby, however, points out that for the scheme to achieve any of the government's national security goals it must get mass take-up. "Thus," he writes, "even the achievement of security objectives relies on consumers' active participation."

This week, a similarly damning assessment of the scheme was released by the Independent Scheme Assurance Panel (PDF) (you may find it easier to read this clean translation - scroll down to policywatcher's May 8 posting). The gist: the government is completely incompetent at handling data, and creating massive databases will, as a result, destroy public trust in it and all its systems.

Of course, the government is in a position to compel registration, as it's begun doing with groups who can't argue back, like foreigners, and proposes doing for employees in "sensitive roles or locations, such as airports". But one of the key indicators of how little its scheme has to do with the actual needs and desires of the public is the list of questions it's asking in the current consultation on ID cards, which focus almost entirely on how to get people to love, or at least apply for, the card. To be sure, the consultation document pays lip service to accepting comments on any ID card-related topic, but the consultation is specifically about the "delivery scheme".

This is the kind of consultation where we're really damned if we do and damned if we don't. Submit comments on, for example, how best to "encourage" young people to sign up ("Views are invited particularly from young people on the best way of rolling out identity cards to them") without saying how little you like the government asking how best to market its unloved policy to vulnerable groups and when the responses are eventually released the government can say there are now no objectors to the scheme. Submit comments to the effect that the whole National Identity scheme is poorly conceived and inappropriate, and anything else you say is likely to be ignored on the grounds that they've heard all that and it's irrelevant to the present consultation. Comments are due by June 30.


Wendy M. Grossman's Web site has an extensive archive of her books, articles, and music, and an archive of all the earlier columns in this series. Readers are welcome to post here, at net.wars home, at her personal blog, or by email to netwars@skeptic.demon.co.uk (but please turn off HTML).

May 2, 2008

Bet and sue

Most net.wars are not new. Today's debates about free speech and censorship, copyright and control, nationality and disappearing borders were all presaged by the same discussions in the 1980s even as the Internet protocols were being invented. The rare exception: online gambling. Certainly, there were debates about whether states should regulate gambling, but a quick Usenet search does not seem to throw up any discussions about the impact the Internet was going to have on this particular pastime. Just sex, drugs, and rock 'n' roll.

The story started in March, when the French Tennis Federation (FFT - Fédération Française de Tennis) filed suit in Belgium against Betfair, Bwin, and Ladbrokes to prevent them from accepting bets on matches played at the upcoming French Open tennis championships, which start on May 25. The FFT's arguments are rather peculiar: that online betting stains the French Open's reputation; that only the FFT has the right to exploit the French Open; that the online betting companies are parasites using the French Open to make money; and that online betting corrupts the sport. Bwin countersued for slander.

On Tuesday of this week, the Liège court ruled comprehensively against the FFT and awarded the betting companies costs.

The FFT will still, of course, control the things it can: fans will be banned from using laptops and mobile phones in the stands. The convergence of wireless telephony, smart phones, and online sites means that in the second or two between the end of a point and the electronic scoreboard updating, there's a tiny window in which people could bet on a sure thing. Why this slightly improbable scenario concerns the FFT isn't clear; that's a problem for the betting companies. What should concern the FFT is ensuring a lack of corruption within the sport. That means the players and their entourages.

The latter issue has been a touchy subject in the tennis world ever since last August, when Russian player Nikolay Davydenko, currently fourth in the world rankings, retired in the third and final set of a match in Poland against 87th ranked Martin Vassallo Arguello, citing a foot injury. Davydenko was accused of match-fixing; the investigation still drags on. In the resulting publicity, several other players admitted being approached to fix matches. As part of subsequent rule-tightening by the Association of Tennis Professionals, the governing body of men's professional tennis, three Italian players were suspended briefly late last year for betting on other players' matches.

Probably the most surprising thing is that tennis, along with soccer and horse racing, is actually among the most popular sports for betting. A minority sport like tennis? Yet according to USA Today, the 2007 Paris Masters event saw $750 million to $1.5 billion in bets. I can only assume that the inverted pyramid of matches every week involving individual players fits well with what bettors like to do.

Fixing matches seems even more unlikely. The best payouts come from correctly picking upsets, the bigger the better. But top players are highly unlikely to throw matches to order. Most of them play a relatively modest number of events (Davydenko is admittedly the exception) and need all the match wins and points from those events to sustain their rankings. Plus, they're just too damn rich.

In 2007, Roger Federer, the ultra-dominant number one player since the end of 2003, earned upwards of $10 million in prize money alone; Davydenko picked up over $2 million (and has already won another $1 million in 2008). All of the top 12 earned over $1 million. Add in endorsements, and even after you subtract agents' fees, tax, and travel costs for self and entourage, you're still looking at wealthy guys. They might tank matches at events where they're being paid appearance fees (which are legal on the men's tour at all but the top 14 events), but proving they've done so is exceptionally difficult. Fixing matches, which could cost them in lost endorsements on top of the tour's own sanctions, surely can't be worth it.

There are several ironies about the FFT's action. First of all (something most of the journalists covering this story don't mention, probably because they don't spend a lot of time watching tennis on TV), Bwin has been an important advertiser sponsoring tennis on Eurosport. It's absolutely typical of the counter-productive and intricately incestuous politics that characterize the tennis world that one part of the sport would sue someone who pays money into another part of the sport.

Second of all, as Betfair and Bwin pointed out, all three of these companies are highly regulated European licensed operations. Ruling them out of action would mean shifting online betting to less well regulated offshore companies. They also pointed out the absurdity of the parasites claim: how could they accept bets on an event without using its name? Betfair in particular documented its careful agreements with tennis's many governing bodies.

Third of all, the only reason match-fixing is an issue in the tennis world right now is that Betfair spotted some unusual betting patterns during that Polish Davydenko match, cancelled all the bets, and went public with the news. Without that, Davydenko would have avoided the fight over his family's phone records. Come to think of it, making the issue public probably explains the FFT's behavior: it's revenge.


April 11, 2008

My IP address, my self

Some years back when I was writing about the data protection directive, Simon Davies, director of Privacy International, predicted a trade war between the US and Europe over privacy laws. It didn't happen, or at least it hasn't happened yet.

The key element to this prediction was the rule in the EU's data protection laws that prohibited sending data on for processing to countries whose legal regimes aren't as protective as those of the EU. Of course, since then we've seen the EU sell out on supplying airline passenger data to the US. Even so, this week the Article 29 Data Protection Working Party made recommendations about how search engines save and process personal data that could drive another wedge between the US and Europe.

The Article 29 group is one of those arcane EU phenomena that you probably don't know much about unless you're a privacy advocate or paid to find out. The short version: it's a sort of think tank of data protection commissioners from all over Europe. The UK's Information Commissioner, Richard Thomas, is a member, as are his equivalents in countries from France to Lithuania.

The Working Party (as it calls itself) advises and recommends policies based on the data protection principles enshrined in the EU Data Protection Directive. It cannot make law, but both its advice to the European Commission and the Commission's action (or lack thereof) are publicly reported. It's arguable that in a country like the UK, where the Information Commissioner operates with few legal teeth to bite with, the existence of such a group may help strengthen the Commissioner's hand.

(Few legal teeth, at least in respect of government activities: the Information Commissioner has issued an opinion about Phorm indicating that the service must be opt-in only. As Phorm and the ISPs involved are private companies, if they persisted with a service that contravened data protection law, the Information Commissioner could issue legal sanctions. But while the Information Commissioner can, for example, rule that for an ISP to retain users' traffic data for seven years is disproportionate, if the government passes a law saying the ISP must do so then within the UK's legal system the Information Commissioner can do nothing about it. Similarly, the Information Commissioner can say, as he has, that he is "concerned" about the extent of the information the government proposes to collect and keep on every British resident, but he can't actually stop the system from being built.)

The group's key recommendation: search engines should not keep personally identifiable search histories for longer than six months, and it specifically includes search engines whose headquarters are based outside the EU. The group does not say which search engines it studied, but it was reported to be studying Google as long ago as last May. The report doesn't look at requirements to keep traffic data under the Data Retention Directive, as it does not apply to search engines.

Google's shortening the life of its cookies and anonymizing its search history logs after 18 months turns out to have a significance I didn't appreciate when, at the time, I dismissed it as insultingly trivial (which it was): it showed the Article 29 working group that the company doesn't really need to keep all that data for so long. In

One of the key items the Article 29 group had to decide in writing its report on data protection issues related to search engines (PDF) is this: are IP addresses personal information? It sounds like one of those bits of medieval sophistry, like asking how many angels can dance on the head of a pin. In the dial-up days, it might not have mattered, at least in Britain, where local phone charges forced limited usage, so users were assigned a different IP address every time they logged in. But in the world of broadband, even the supposedly dynamic IP addresses issued by cable suppliers may remain with a single subscriber for years on end. Being able to track your IP address's activities is increasingly like being able to track your library card, your credit card, and your mobile phone all at the same time. Fortunately, the average ISP doesn't have the time to be that interested in most of its users.

The fact is that any single piece of information that identifies your activities over a long period and can be mapped to your real-life identity has to be considered personal information or the data protection laws make no sense. The libertarian view, of course, would be that there are other search engines. You do not actually have to use Google, Gmail, or even YouTube. But if all search engines adopted Google's habits the choice would be more apparent than real. Time was when the US was the world's policeman. With respect to data, it seems that the EU has taken on this role. It will be interesting to see whether this decision has any impact on Google's business model and practices. If it does, that trade war could finally be upon us. If not, then Google was building up a vast data store just because we can.

March 28, 2008

Leaving Las Vegas

Las Vegas shouldn't exist. Who drops a sprawling display of electric lights with huge fountains and luxury hotels into the best desert scenery on the planet during an energy crisis? Indoors, it's Britain in mid-winter; outdoors you're standing in a giant exhaust fan. The out-of-proportion scale means that everything is four times as far away as you think, including the jackpot you're not going to win at one of its casinos. It's a great place to visit if you enjoy wallowing in self-righteous disapproval.

This all makes it the stuff of song, story, and legend and explains why Jeff Jonas's presentation at etech was packed.

The way Jonas tells it in his blog and at his presentation, he got into the gaming industry by driving through Las Vegas in 1989 idly wondering what was going on behind the scenes at the casinos. A year later he got the tiny beginnings of an answer when he picked up a used couch he'd found in the newspaper classified ads (boy, that dates it, doesn't it?) and found that its former owner played blackjack "for a living". Jonas began consulting to the gaming industry in 1991, helping to open Treasure Island, Bellagio, and Wynn.

"Possibly half the casinos in the world use technology we created," he said at etech.

Gaming revenues are now less than half of total revenues, he said, and despite the apparent financial win they might represent, problem gamblers are in fact bad for business. The goal is for people to have fun. And because of that, he said, a place like the Bellagio is "optimized for consumer experience over interference. They don't want to spend money on surveillance."

Jonas began with a slide listing some common ideas about how Las Vegas works, culled from movies like Ocean's 11 and the TV show Las Vegas. Does the Bellagio have a vault? (No.) Do casinos perform background checks on guests based on public records? (No.) Is there a gaming industry watch list you can put yourself on but not take yourself off? (Yes, for people who know they have a gambling addiction.) Do casinos deliberately hire ex-felons? (Yes, to rehabilitate them.) Do they really send private jets for high rollers? (Cue story.)

There was, he said, a casino high roller who had won some $18 million. A win like that is going to show up in a casino's quarterly earnings. So, yes, they sent a private jet to his town and parked a limo in front of his house for the weekend. If you've got the bug, we're here for you, that kind of thing. He took the bait, and lost $22 million.

Do they help you create cover stories? (Yes.) "What happens in Vegas stays in Vegas" is an important part of ensuring that people can have fun that does not come back to bite them when they go home. The casinos' problem is with identity, not disguises, because they are required by anti-money laundering rules to report it any time someone crosses the $10,000 threshold for cash transactions. So if you play at several different tables, then go upstairs and change disguises, and come back and play some more, they have to be able to track you through all that. ID, therefore, is extremely important. Disguises are welcome; fake ID is not.
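The tracking problem described above comes down to adding up cash by identity rather than by appearance. A minimal sketch, assuming a hypothetical record layout of (patron ID, table, amount) and assuming "crosses" means the running total exceeds $10,000; the real reporting rules are more detailed than this:

```python
from collections import defaultdict

REPORTING_THRESHOLD = 10_000  # dollars; the cash threshold mentioned in the text


def patrons_to_report(transactions, threshold=REPORTING_THRESHOLD):
    """Return the patron IDs whose combined cash total exceeds the threshold.

    `transactions` is an iterable of (patron_id, table, amount) tuples.
    The layout and the names are hypothetical, for illustration only.
    """
    totals = defaultdict(int)
    for patron_id, _table, amount in transactions:
        # Key on the verified ID: the table (or the wig) is irrelevant.
        totals[patron_id] += amount
    return {pid for pid, total in totals.items() if total > threshold}


day = [
    ("ID-1", "blackjack-3", 4_000),
    ("ID-1", "roulette-1", 3_500),
    ("ID-2", "craps-2", 2_000),
    ("ID-1", "poker-7", 3_000),  # same ID, back downstairs in a new disguise
]
print(patrons_to_report(day))
```

No single transaction here is reportable on its own; only the sum keyed to one ID is, which is why a fake ID breaks the system and a disguise does not.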

Do they use facial recognition to monitor the doors to spot cheaters on arrival? (Well...)

Of course technology-that-is-indistinguishable-from-magic-because-it-actually-is-magic appears on every crime-solving TV show these days. You know, the stuff where Our Heroes start with a fuzzy CCTV image and they punch in on a tiny piece of it and blow it up. And then someone says, "Can you enhance that?" and someone else says, "Oh, yes, we have new software," and a second later a line goes down the picture filling in detail. And a second after that you can read the brand on the face of a wrist watch (Numb3rs) or the manufacturer's coding on a couple of pills (Las Vegas). Or they have a perfect matching system that can take a partial fingerprint lifted off a strand of hair or something and bang! the database can find not only the person's identity but their current home address and phone number (Bones). And who can ever forget the first episode of 24, when Jack Bauer, alarmed at the disappearance of his daughter, tosses his phone number to an underling and barks, "Find me all the Internet passwords associated with this phone number."

And yet...a surprising number of what ought to be the technically best-educated audience on the planet thought facial recognition was in operation to catch cheaters. Folks, it doesn't work in airports, either.

Which is the most interesting thing Jonas said: he now works for IBM (which bought his company) on privacy and civil liberties issues, including work on software to help the US government spot terrorists without invading privacy. It's an interesting concept, partly because security at airports and other locations is now so invasive. But also because if Las Vegas can find a way to deploy surveillance such that only the egregious problems are caught and everyone else just has a good time...why can't governments?

March 14, 2008

Uninformed consent

Apparently the US Congress is now being scripted by Jon Stewart of the Daily Show. In a twist of perfect irony, the House of Representatives has decided to hold its first closed session in 25 years to debate - surveillance.

But it's obvious why they want closed doors: they want to talk about the AT&T case. To recap: AT&T is being sued for its complicity in the Bush administration's warrantless surveillance of US citizens after its technician Mark Klein blew the whistle by taking documents to the Electronic Frontier Foundation (which a couple of weeks ago gave him a Pioneer Award for his trouble).

Bush has, of course, resisted any effort to peer into the innards of his surveillance program by claiming it's all a state secret, and that's part of the point of this Congressional move: the Democrats have fielded a bill that would give the whole program some more oversight and, significantly, reject the idea of giving telecommunications companies - that is, AT&T - immunity from prosecution for breaking the law by participating in warrantless wiretapping. 'Snot fair that they should deprive us of the fun of watching the horse-trading. It can't, surely, be that they think we'll be upset by watching them slag each other off. In an election year?

But it's been a week for irony, as Wikipedia founder Jimmy Wales has had his sex life exposed when he dumped his girlfriend and been accused of - let's call it sloppiness - in his expense accounts. Worse, he stands accused of trading favorable page edits for cash. There's always been a strong element of Schadenpedia around, but the edit-for-cash thing really goes to the heart of what Wikipedia is supposed to be about.

I suspect that nonetheless Wikipedia will survive it: if the foundation has the sense it seems to have, it will display zero tolerance. But the incident has raised valid questions about how Wikipedia can possibly sustain itself financially going forward. The site is big and has enviable masses of traffic; but it sells no advertising, choosing instead to live on hand-outs and the work of volunteers. The idea, I suppose, is that accepting advertising might taint the site's neutral viewpoint, but donations can do the same thing if they're not properly walled off: just ask the US Congress. It seems to me that an automated advertising system they did not control would be, if anything, safer. And then maybe they could pay some of those volunteers, even though it would be a pity to lose some of the site's best entertainment.

With respect to advertising, it's worth noting that Phorm is under increasing pressure. Earlier this week, we had an opportunity to talk to Kent Ertugrul, CEO of Phorm, who continues to maintain that Phorm's system, because it does not store data, is more protective of privacy than today's cookie-driven Web. This may in fact be true.

Less certain is Ertugrul's belief that the system does not contravene the Regulation of Investigatory Powers Act, which lays down rules about interception. Ertugrul has some support from an informal letter from the Home Office whose reasoning seems to be that if users have consented and have been told how they can opt out, it's legal. Well, we'll see; there's a lot of debate going on about this claim and it will be interesting to hear the Information Commissioner's view. If the Home Office's interpretation is correct, it could open a lot of scope for abusive behavior that could be imposed upon users simply by adding it to the terms of service to which they theoretically consent when they sign up, and a UK equivalent of AT&T wanting to assist the government with wholesale warrantless wiretapping would have only to add it to the terms of service.

The real problem is that no one really knows how Phorm's system works. Phorm doesn't retain your IP address, but the ad servers surely have to know it when they're sending you ads. If you opt out but can still opt back in (as Ertugrul said you can), doesn't that mean you still have a cookie on your system and that your data is still passed to Phorm's system, which discards it instead of sending you ads? If that's the case, doesn't that mean you cannot opt out of having your data shared? If that isn't how it works, then how does it work? I thought I understood it after talking to Ertugrul, I really did - and then someone asked me to explain how Phorm's cookie's usefulness persisted between sessions, and I wasn't sure any more. I think the Open Rights Group is right: Phorm should publish details of how its system works for experts to scrutinize. Until Phorm does that the misinformation Ertugrul is so upset about will continue. (More disclosure: I am on ORG's Advisory Council.)

But maybe the Home Office is on to something. Bush could solve his whole problem by getting everyone to give consent to being surveilled at the moment they take US citizenship. Surely a newborn baby's footprint is sufficient agreement?

February 22, 2008

Strikeout

There is a certain kind of mentality that is actually proud of not understanding computers, as if there were something honorable about saying grandly, "Oh, I leave all that to my children."

Outside of computing, only television gets so many people boasting of their ignorance. Do we boast how few books we read? Do we trumpet our ignorance of other practical skills, like balancing a cheque book, cooking, or choosing wine? When someone suggests we get dressed in the morning do we say proudly, "I don't know how"?

There is so much insanity coming out of the British government on the Internet/computing front at the moment that the only possible conclusion is that the government is made up entirely of people who are engaged in a sort of reverse pissing contest with each other: I can compute less than you can, and see? here's a really dumb proposal to prove it.

How else can we explain yesterday's news that the government is determined to proceed with Contactpoint even though the report it commissioned and paid for from Deloitte warns that the risk of storing the personal details of every British child under 16 can only be managed, not eliminated? Lately, it seems that there's news of a major data breach every week. But the present government is like a batch of 20-year-olds who think that mortality can't happen to them.

Or today's news that the Department of Culture, Media, and Sport has launched its proposals for "Creative Britain", and among them is a very clear diktat to ISPs: deal with file-sharing voluntarily or we'll make you do it. By April 2009. This bit of extortion nestles in the middle of a bunch of other stuff about educating schoolchildren about the value of intellectual property. Dare we say: if there were one thing you could possibly do to ensure that kids sneer at IP, it would be to teach them about it in school.

The proposals are vague in the extreme about what kind of regulation the DCMS would accept as sufficient. Despite the leaks of last week, culture secretary Andy Burnham has told the Financial Times that the "three strikes" idea was never in the paper. As outlined by Open Rights Group executive director Becky Hogge in New Statesman, "three strikes" would mean that all Internet users would be tracked by IP address and warned by letter if they are caught uploading copyrighted content. After three letters, they would be disconnected. As Hogge says (disclosure: I am on the ORG advisory board), the punishment will fall equally on innocent bystanders who happen to share the same house. Worse, it turns ISPs into a squad of private police for a historically rapacious industry.
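The objection about innocent bystanders follows directly from what the counter is keyed on. A rough sketch of "three strikes" as described above, assuming (since the proposals do not say) that disconnection happens when the third report arrives; the names and the report format are hypothetical:

```python
from collections import Counter

STRIKE_LIMIT = 3  # warning letters before disconnection, as the idea was reported


def disconnected_addresses(reports, strike_limit=STRIKE_LIMIT):
    """Replay upload reports and return the IP addresses that get cut off.

    `reports` is an iterable of (ip_address, actual_uploader) pairs. The ISP
    only ever sees the IP address; `actual_uploader` is included purely to
    show what the scheme cannot know.
    """
    strikes = Counter()
    disconnected = set()
    for ip_address, _actual_uploader in reports:
        if ip_address in disconnected:
            continue  # already cut off; nothing further to count
        strikes[ip_address] += 1  # one more letter to the account holder
        if strikes[ip_address] >= strike_limit:
            disconnected.add(ip_address)
    return disconnected


reports = [
    ("203.0.113.7", "teenager"),
    ("203.0.113.7", "teenager"),
    ("203.0.113.7", "houseguest"),
    ("203.0.113.9", "neighbour"),
]
print(disconnected_addresses(reports))
```

The household at 203.0.113.7 loses its connection even though no individual in it was reported three times, and everyone else under that roof goes offline with them.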

Charles Arthur, writing in yesterday's Guardian, presented the British Phonographic Industry's case about why the three strikes idea isn't necessarily completely awful: it's better than being sued. (These are our choices?) ISPs, of course, hate the idea: this is an industry with nanoscale margins. Who bears the liability if someone is disconnected and starts to complain? What if they sue?

We'll say it again: if the entertainment industries really want to stop file-sharing, they need to negotiate changed business models and create a legitimate market. Many people would be willing to pay a reasonable price to download TV shows and music if they could get in return reliable, fast, advertising-free, DRM-free downloads at or soon after the time of the initial release. The longer the present situation continues the more entrenched the habit of unauthorized file-sharing will become and the harder it will be to divert people to the legitimate market that eventually must be established.

But the key damning bit in Arthur's article (disclosure: he is my editor at the paper) is the BPI's admission that they cannot actually say that ending file-sharing would make sales grow. The best the BPI spokesman could come up with is, "It would send out the message that copyright is to be respected, that creative industries are to be respected and paid for."

Actually, what would really do that is a more balanced copyright law. Right now, the law is so far from what most people expect it to be - or rationally think it should be - that it is breeding contempt for itself. And it is about to get worse: term extension is back on the agenda. The 2006 Gowers Review recommended against it, but on February 14, Irish EU Commissioner Charlie McCreevy (previously: champion of software patents) announced his intention to propose extending performers' copyright in sound recordings from the current 50-year term to 95 years. The plan seems to go something like this: whisk it past the Commission in the next two months. Then the French presidency starts and whee! new law! The UK can then say its hands are tied.

That change makes no difference to British ISPs, however, who are now under the gun to come up with some scheme to keep the government from clomping all over them. Or to the kids who are going to be tracked from cradle to alcopop by unique identity number. Maybe the first target of the government computing literacy programs should be...the government.


December 7, 2007

Data hogs

If a data point falls in the forest and there's no database to pick it up, is it still private?

There is a general view that people do not care about privacy, particularly younger people. They blog the names of all their favorite bands and best friends, post their drunken photographs on Facebook, and tell all of MySpace who they slept with last night. No one, the argument goes – actually 22 percent – reads the privacy policies Web sites pay their lawyers to draw up so unreadably.

And yet the perception is wrong. People do, clearly, care about privacy – when the issues are made visible to them. Unfortunately, the privacy-invasiveness of a service, policy, or Web site usually only becomes visible after the horse has escaped and is comfortably grazing in the field of three-leaf clover.

A lot of this is, as Charles Arthur blogged recently in relation to the loss of the HMRC discs holding the Child Benefit database, an education issue: if we taught kids important principles of computer science, like security, privacy, and the value of data, instead of boring things like how to format an Excel spreadsheet, some of the most casual data violations wouldn't happen.

A lot of recent privacy failures seem to have happened in just this same unconscious way. Google's various privacy invasions, for example, seem to be a peculiarly geeky failure to connect with the general public's view of things. You can just imagine the techies at the Googleplex saying, "Oh, cool! Look, you can see right into the windows of those houses!" and utterly failing at simple empathy.

The continuing Facebook privacy meltdown seems to include the worst aspects of both the HMRC incident and Google's blind spot. If you haven't been following it, the story in brief is that Facebook created a new advertising program it calls Beacon, which collects tracking data from a variety of partner sites such as Blockbuster.com. Beacon then uses the data to display your latest purchases so your friends can see them.

The blind spot is, of course, the utter surprise with which the company greeted the discovery that people have all sorts of reasons why they don't want their purchase history displayed to their friends. They might be gifts for said friends. The friends, as so often on Facebook and the other social networks, may not be really friends but acquaintances chosen to make you look well-connected, or relatives you assiduously avoid in real life. And even your closest real friends may prefer not to know too much about the porn DVDs you rent. American librarians are militant about protecting the reading lists of library patrons; but Facebook would gleefully expose the books you buy. Are you kidding me? Facebook CEO Mark Zuckerberg can apologize all he wants, but his apparent surprise at the size of the fuss suggests that he's as inexperienced at shopping as those women in front of you in the grocery checkout who seem not to know they'll need to pay until after everything's been bagged up.

What Facebook shares with HMRC, though, is the underlying principle that it's cheaper to send the full set of data and let the recipients delete what they don't want than to be selective. And so, as the story has developed, it turns out that all sorts of data is being sent to Facebook, some of it even relating to non-users. They just delete what they don't want, so they say.

Facebook was briefly defensive, then allowed users to opt out, and then finally allowed users to delete the thing entirely. But the whole thing highlights one of the very real problems with social network sites that net.wars first wrote about in connection with (the now more responsibly designed) Plaxo: they grow by getting people to invade their own and their friends' privacy. The Australian computer scientist and privacy advocate Roger Clarke, whose paper Very Black "Little Black Books" is the seminal work in this area, predicted in 2003 that the social networks' business models would force them to become extremely invasive. And so it has proved.

How do we make privacy a choice? We know people care about privacy when they can see its loss: the reactions to the Facebook and HMRC incidents have made this plain. Recent research by Lorrie Cranor at Carnegie-Mellon (PDF) suggests, for example, that people's purchasing habits will change if you give them an easy-to-understand graphical representation of how well an ecommerce site's practices match their privacy preferences.

But visibility to users, helpful though it would be, is not the root of the problem. What privacy advocates need going forward is a way to persuade companies and governments to make privacy choices easy and visible when their mindset is to collect and keep all data, all the time. These organisations do not perceive giving users control over their privacy as being in their own best interests. Maybe plummeting stock prices and forced resignations, however brief, will get through to them. But to keep their attention focused on building better systems that put the user in control, we need to make the consequences of getting it wrong constantly visible and easily interpretable to the data hogs themselves.


Wendy M. Grossman’s Web site has an extensive archive of her books, articles, and music, and an archive of all the earlier columns in this series. Readers are welcome to post here, at net.wars home, at her personal blog, or by email to netwars@skeptic.demon.co.uk (but please turn off HTML).

November 23, 2007

Road block

There are many ways for a computer system to fail. This week's disclosure that Her Majesty's Revenue and Customs has played lost-in-the-post with two CDs holding the nation's Child Benefit data is one of the stranger ones. The Child Benefit database includes names, addresses, identifying numbers, and often bank details for 25 million people: every UK family with a child under 16. The National Audit Office requested a subset for its routine audit; HMRC sent the entire database off by TNT post.

There are so many things wrong with this picture that it would take a village of late-night talk show hosts to make fun of them all. But the bottom line is this: when the system was developed no one included privacy or security in the specification or thought about the fundamental change in the nature of information when paper-based records are transmogrified into electronic data. The access limitations inherent in physical storage media must be painstakingly recreated in computer systems or they do not exist. The problem with security is it tends to be inconvenient.

With paper records, the more data you provide the more expensive and time-consuming it is. With computer records, the more data you provide the cheaper and quicker it is. The NAO's file of email relating to the incident (PDF) makes this clear. What the NAO wanted (so it could check that the right people got the right benefit payments): national insurance numbers, names, and benefit numbers. What it got: everything. If the discs hadn't gotten lost, we would never have known.

Ironically enough, this week in London also saw at least three conferences on various aspects of managing digital identity: Digital Identity Forum, A Fine Balance, and Identity Matters. All these events featured the kinds of experts the UK government has been ignoring in its mad rush to create and collect more and more data. The workshop on road pricing and transport systems at the second of them, however, was particularly instructive. It was led by science advisor Brian Collins, and the most notable thing about it was that the 15 or 20 participants couldn't agree on a single aspect of such a system.

Would it run on GPS or GSM/GPRS? Who or what is charged, the car or the driver? Do all roads cost the same or do we use differential pricing to push traffic onto less crowded routes? Most important, is the goal to raise revenue, reduce congestion, protect the environment, or rebalance the cost of motoring so the people who drive the most pay the most? The more purposes the system is intended to serve, the more complicated and expensive it will become, and the less likely it is to answer any of those goals successfully. This point has of course also been made about the National ID card by the same sort of people who have warned about the security issues inherent in large databases such as the Child Benefit database. But it's clearer when you start talking about something as limited as road charging.

For example: if you want to tag the car you would probably choose a dashboard-top box that uses GPS data to track the car's location. It will have to store and communicate location data to some kind of central server, which will use it to create a bill. The data will have to be stored for at least a few billing cycles in case of disputes. Security services and insurers alike would love to have copies. On the other hand, if you want to tag the driver it might be simpler just to tie the whole thing to a mobile phone. The phone networks are already set up to do hand-off between nodes, and tracking the driver might also let you charge passengers, or might let you give full cars a discount.

The problem is that the discussion is coming from the wrong angle. We should not be saying, "Here is a clever technological idea. Oh, look, it makes data! What shall we do with it?" We should be defining the problem and considering alternative solutions. The people who drive most already pay most via the fuel pump. If we want people to drive less, maybe we should improve public transport instead. If we're trying to reduce congestion, getting employers to be more flexible about working hours and telecommuting would be cheaper, provide greater returns, and, crucially for this discussion, not create a large database system that can be used to track the population's movements.

(Besides, said one of the workshop's participants: "We live with the congestion and are hugely productive. So why tamper with it?")

It is characteristic of our age that the favored solution is the one that creates the most data and the biggest privacy risk. No one in the cluster of organisations opposing the ID card - No2ID, Privacy International, Foundation for Information Policy Research, or Open Rights Group - wanted an incident like this week's to happen. But it is exactly what they have been warning about: large data stores carry large risks that are poorly understood, and it is not enough for politicians to wave their hands and say we can trust them. Information may want to be free, but data want to leak.


November 9, 2007

Watching you watching me

A few months ago, a neighbour phoned me and asked if I'd be willing to position a camera on my windowsill. I live at the end of a small dead-end street (or cul-de-sac), that ends in a wall about shoulder height. The railway runs along the far side of the wall, and parallel to it and further away is a long street with a row of houses facing the railway. The owners of those houses get upset because graffiti keeps appearing alongside the railway where they can see it and covers flat surfaces such as the side wall of my house. The theory is that kids jump over the wall at the end of my street, just below my office window, either to access the railway and spray paint or to escape after having done so. Therefore, the camera: point it at the wall and watch to see what happens.

The often-quoted number of times the average Londoner is caught on camera per day is scary: 200. (And that was a few years ago; it's probably gone up.) My street is actually one of those few that doesn't have cameras on it. I don't really care about the graffiti; I do, however, prefer to be on good terms with neighbours, even if they're all the way across the tracks. I also do see that it makes sense at least to try to establish whether the wall downstairs is being used as a hurdle in the getaway process. What is the right, privacy-conscious response to make?

I was reminded of this a few days ago when I was handed a copy of Privacy in Camera Networks: A Technical Perspective, a paper published at the end of July. (We at net.wars are nothing if not up-to-date.)

Given the amount of money being spent on CCTV systems, it's absurd how little research there is covering their efficacy, their social impact, or the privacy issues they raise. In this paper, the quartet of authors – Marci Lenore Meingast (UC Berkeley), Sameer Pai (Cornell), Stephen Wicker (Cornell), and Shankar Sastry (UC Berkeley) – are primarily concerned with privacy. They ask a question every democratic government deploying these things should have asked in the first place: how can the camera networks be designed to preserve privacy? For the purposes of preventing crime or terrorism, you don't need to know the identity of the person in the picture. All you want to know is whether that person is pulling out a gun or planting a bomb. For solving crimes after the fact, of course, you want to be able to identify people – but most people would vastly prefer that crimes were prevented, not solved.

The paper cites model legislation (PDF) drawn up by the Constitution Project. Reading it is depressing: so many of the principles in it are such logical, even obvious, derivatives of the principles that democratic governments are supposed to espouse. And yet I can't remember any public discussion of the idea that, for example, all CCTV systems should be accompanied by identification of and contact information for the owner. "These premises are protected by CCTV" signs are everywhere; but they are all anonymous.

Even more depressing is the suggestion that the proposals for all public video surveillance systems should specify what legitimate law enforcement purpose they are intended to achieve and provide a privacy impact assessment. I can't ever remember seeing any of those either. In my own local area, installing CCTV is something politicians boast about when they're seeking (re)election. Look! More cameras! The assumption is that more cameras equals more safety, but evidence to support this presumption is never provided and no one, neither opposing politicians nor local journalists, ever mounts a challenge. I guess we're supposed to think that they care about us because they're spending the money.

The main intention of Meingast, Pai, et al, however, is to look at the technical ways such networks can be built to preserve privacy. They suggest, for example, collecting public input via the Internet (using codes to identify the respondents on whom the cameras will have the greatest impact). They propose an auditing system whereby these systems and their usage are reviewed. As the video streams become digital, they suggest using layers of abstraction of the resulting data to limit what can be identified in a given image. "Information not pertinent to the task in hand," they write hopefully, "can be abstracted out leaving only the necessary information in the image." They go on into more detail about this, along with a lengthy discussion of facial recognition.

The most depressing thing of all: none of this will ever happen, and for two reasons. First, no government seems to have the slightest qualm of conscience about installing surveillance systems. Second, the mass populace don't seem to care enough to demand these sorts of protections. If these protections are to be put in place at all, it must be done by technologists. They must design these systems so that it's easier to use them in privacy-protecting ways than to use them in privacy-invasive ways. What are the odds?

As for the camera on my windowsill, I told my neighbour after some thought that they could have it there for a maximum of a couple of weeks to establish whether the end of my street was actually being used as an escape route. She said something about getting back to me when something or other happened. Never heard any more about it. As far as I am aware, my street is still unsurveilled.


October 12, 2007

The permission-based society

It was Edward Hasbrouck who drew my attention to a bit of rulemaking being proposed by the Transportation Security Administration. Under current rules, if you want to travel on a plane out of, around, into, or over the US you buy a ticket and show up at the airport, where the airline compares your name and other corroborative details to the no-fly list the TSA maintains. Assuming you're allowed onto the flight, unbeknownst to you, all this information has to be sent to the TSA within 15 minutes of takeoff (before, if it's a US flight, after if it's an international flight heading for the US).

Under the new rules, the information will have to arrive at the TSA 72 hours before the flight takes off – after all, most people have finalised their travel plans by that time, and only 7 to 10 percent of itineraries change after that – and the TSA has to send back an OK to the airline before you can be issued a boarding pass.

There's a whole lot more detail in the Notice of Proposed Rulemaking, but that's the gist. (They'll be accepting comments until October 22, if you would like to say anything about these proposals before they're finalised.)

There are lots of negative things to say about these proposals – the logistical difficulties for the travel industry, the inadequacy of the mathematical model behind this (which at the public hearing the ACLU's Barry Steinhardt compared to trying to find a needle in a haystack by pouring more hay on the stack), and the privacy invasiveness inherent in having the airlines collect the many pieces of data the government wants and, not unnaturally, retaining copies while forwarding it on to the TSA. But let's concentrate on one: the profound alteration such a scheme will make to American society at large. The default answer to the question of whether you had the right to travel anywhere, certainly within the confines of the US, has always been "Yes". These rules will change it to "No".

(The right to travel overseas has, at times, been more fraught. The folk scene, for example, can cite several examples of musicians who were denied passports by the US State Department in the 1950s and early 1960s because of their left-wing political beliefs. It's not really clear to me why the US wanted to keep people whose views it disapproved of within its borders but some rather hasty marriages took place in order to solve some of these immigration problems, though everyone's friends again now and it's fresh passports all round.)

Hasbrouck, Steinhardt, and EFF founder John Gilmore, who sued the government over the right to travel anonymously within the US, have all argued that the key issue here is the right to assemble guaranteed in the First Amendment. If you can't travel, you can't assemble. And if you have to ask permission to travel, your right of assembly is subject to disruption at any time. The secrecy with which the TSA surrounds its decision-making doesn't help.

Nor does the amount of personal data the TSA is collecting from airline passenger name records. The Identity Project's recent report on the subject highlights that these records may include considerable detail: what books the passenger is carrying, what answers the passenger gives when asked where they've been or are going, names and phone numbers given as emergency contacts, and so on. Despite the data protection laws, it isn't always easy to find out what information is being stored; when I made such a request of US Airways last year, the company refused to show me my PNR from a recent flight and gave as the reason: "Security." Civilisation as we know it is at risk if I find out what they think they know about me? We really are in trouble.

In Britain, the chief objections to the ID card and, more important, the underlying database, have of course been legion, but they have generally focused on the logistical problems of implementing it (huge cost, complex IT project, bound to fail) and its general privacy-invasiveness. But another thing the ID card – especially the high-tech, biometric, all-singing, all-dancing kind – will do is create a framework that could support a permission-based society in which the ID card's interaction with systems is what determines what you're allowed to do, where you're allowed to go, and what purchases you're allowed to make. There was a novel that depicted a society like this: Ira Levin's This Perfect Day, in which these functions were all controlled by scanner bracelets and scanners everywhere that lit up green to allow or red to deny permission. The inhabitants of that society were kept drugged, so they wouldn't protest the ubiquitous controls. We seem to be accepting the beginnings of this kind of life stone-cold sober.

American children play a schoolyard game called "Mother, May I?" It's one of those games suitable for any number of kids, and it involves a ritual of asking permission before executing a command. It's a fine game, but surely it isn't how we want to live.



September 14, 2007

Nothing to hide, no one to trust

The actor David Hyde Pierce is widely reported to have once responded to a TV interviewer inquiring as to whether he was gay, "My life is an open book, but that doesn't mean I'm going to read it to you." (Or something very like that.)

This seems to me a both witty and intelligent response to the seemingly ever-present mantra these days, "If you have nothing to hide, you have nothing to fear," invoked every time someone wants to institute some new, egregious privacy-invasive surveillance practice. And there are a lot of these.

Last week, a British judge came up with a brilliant scheme for eliminating the racial bias of the 3 million-entry DNA database: collect samples from everyone, even visitors. I may have nothing to fear from this – after all, DNA testing has, in the US, been used to exonerate the innocent on Death Row – but it invokes in me what British politicos sometimes call the "yuck factor". Normally, this is reserved for such science-related ethical dilemmas as human cloning, but for me at least it applies much more strongly here. I loved the movie Gattaca, but I don't want to live there.

In fact, there are considerable risks in DNA-printing the entire population (aside from killing tourism). For one thing, we do not know how we will be able to use or interpret DNA in the future as sequencing plummets in price (as it's expected to do). Say the UK had considered compiling a nationwide fingerprint database back in 1970 (there would have been riots, but leaving that aside). No one would have foreseen then the widespread availability of cheap fingerprint scanners and online communications that could turn that database into a central authority.

We can surmise that the DNA database will contain sufficient information to allow anyone who can gain access to it to impersonate anyone at any time. Conversely, as we get better and better at understanding what individual genes mean and sequencing drops precipitously in price, the DNA database will grant those who have access to it unprecedented amounts of information about each person's biological or medical prospects and those of their immediate relatives. While there are many diseases that do not have markers in our genes, there are plenty more that do. Does anyone really want the government to be the first to know that they carry the gene for cystic fibrosis or breast cancer?

I don't believe for a second that it was a serious proposal. This is the kind of thing someone says and then everyone holds their breath to gauge the reaction. Has the country been softened up enough to accept such a thing yet? But the fact that someone could say it at all shows how far we have moved away from the presumption of innocence on which both the UK and the US governments were founded.

Witty answers on talk shows aren't, however, quite enough to make a case to a government that what it wants to do is a bad idea. Daniel Solove, author of The Digital Person and a law professor at

In it, he compares the loss of privacy to environmental damage: not the single horror story implied by "nothing to hide, nothing to fear", but the result of the accumulating damage caused by a series of "small acts by different actors". The broader structural damage that happens in breaches of confidentiality (such as companies violating their own privacy policies by selling data to third parties) is a loss of trust.

I am not a supporter of open gun ownership, but the US Second Amendment has some merit in principle: the basic idea is to balance the power of the individual against the State. The EU's data protection laws do – or would, if the EU didn't ignore them as it has in the case of passenger data – a reasonable job of balancing the power of the individual against commercial companies. But the data protection laws can be upended, it seems, whenever a national government wants to do so. All it has to do is pass a law making it legal or mandatory to supply the data it wants to collect, transfer, share, or sell. But the fact that such policies are possible doesn't make them a good idea, even with the best intentions of improving security or personal safety.

The San Francisco computer security expert Russell Brand once asked me, in the casual way he poses philosophical questions, "If you knew they would never be used against you, would you st