October 31, 2014

Disclosure

Back in April, headlines screamed that Her Majesty's Revenue and Customs was going to sell the nation's collected tax data to third parties for reuse. This was shortly after the care.data fiasco, so public freakage happened on a grand scale. HMRC said no, no: sharing, and maybe recovering costs, not selling, and opened a consultation.

This week, the Open Rights Group sent me (an advisory council member) to a roundtable discussion of these issues. I had two main points of interest. First, data-sharing should not become extortion, given that we are legally obliged to give HMRC our financial data. Second, the open policy effort, presaged by a 2013 tea camp meeting focused on exactly this. Other groups represented at the meeting included the Open Data User Group, the British Bankers' Association, 38 Degrees, the Federation of Small Businesses, HMRC itself, the Chartered Institute of Taxation, which hosted the meeting, and the devil in this particular set of details, Experian.

The questions on the table: whether, how, with whom, under what circumstances, how much, in what form, and at whose expense data should be shared. Two types of data were up for discussion: the VAT registration database and taxpayer information. Both are contentious. The latter, because even the people who post other intimate details online do not post their financial statements. The former, because Britain's 3.7 million sole traders and 434,000 partnerships often register using their principals' home addresses. CIOT president Anne Fairpo suggested instead an online checker for the ownership and validity of VAT numbers.

There was little discussion of *whether*: sharing data for public benefit sounds benign. While most documents talk of eliminating fraud, Edward Troup, the second permanent secretary and tax assurance commissioner for HMRC, noted that current legal restrictions forced HMRC to turn down a proposed NHS study of factors underlying excess winter mortality. This changes the discussion, at least for me: the benefits are no longer solely economic. Trust, Troup said, is key to allowing HMRC to pursue its mandated function.

The case for sharing VAT registration data seems to come down to the goal of improving access to credit. Paul Malyon, for Experian, suggested it would give more businesses "score access". How significant the benefit would be, no one was sure: it's just one among many factors in determining risk. Is this a real enough problem to need a 100 percent solution? As Fairpo asked, if the benefit of releasing this data only accrues to 25 percent of businesses, why should 100 percent of those registered have their data released? How much this policy is up for discussion is unclear, given that the Small Business, Enterprise and Employment Bill has already been introduced into parliament with provision for such disclosure included (section 6).

One question is whether to restrict the sharing of VAT registration data to a selected band of "most favored nations" (such as Experian) or to open it up more broadly to see what innovation and economic benefits might accrue. Having been a (voluntarily) registered sole trader myself, I grasp the invasiveness of simply publishing the dataset openly. But there are also issues around closing this public asset, in a discriminatory fashion, to all but a small group of already entrenched companies with data-driven business models. Is this the power structure we want for the future? It's certainly against the spirit of what open data is supposed to be about.

Taxpayer data is much trickier. You could see the HMRC folks wince at the memory of 2008's lost HMRC data CDs, but that experience has left HMRC with an acute awareness of what can go wrong. They have procedures, there are criminal sanctions for disclosing data, and so on. Under those conditions, HMRC already shares such data with chosen researchers, who are allowed to access it only in a controlled room on standalone machines. In the three years this Datalab has been running, said Cindy Bell, HMRC's head of information policy and disclosure, there have been no accidental disclosures. The HMRC folks also want to avoid reenacting care.data's missteps. So far, so good.

But that still leaves many concerns. Taxpayer data was not created with sharing in mind. Aggregated data may still reveal details about individuals if it's sliced too thinly (some industries are small and local). Anonymized data...isn't, a hard message to fully get across. Finding the right balance between privacy and the public interest in this area will be particularly fraught. And there are three additional risks that need to be considered. One: as several speakers said, individuals whose data is compromised need remedies and options for redress. Two: we need to plan how to recover from failures if and when they happen. Three: as Jim Killock, ORG's executive director, has said, the fact that several different departments are considering these issues in different ways is not a joined-up approach (even if the goal is joined-up government). What's needed is a broader effort to develop first principles around data sharing. Otherwise, the risk is that incompatible data sharing regimes will become yet another set of barriers.
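To make the "sliced too thinly" worry concrete, here is a minimal sketch of small-cell suppression, one common way of limiting what aggregates reveal. Everything in it is an assumption for illustration: the records are made up, the threshold of 10 is arbitrary, and `aggregate_with_suppression` is a hypothetical helper, not anything HMRC or the Datalab is known to use.

```python
from collections import Counter

# Hypothetical threshold: cells with fewer records than this are withheld.
# Real disclosure-control rules differ between organisations and datasets.
MIN_CELL_SIZE = 10


def aggregate_with_suppression(records, keys, min_cell_size=MIN_CELL_SIZE):
    """Count records per combination of `keys`; return None for small cells."""
    counts = Counter(tuple(record[key] for key in keys) for record in records)
    return {
        cell: (count if count >= min_cell_size else None)
        for cell, count in counts.items()
    }


# Made-up records: one industry is tiny in one (fictional) town.
records = (
    [{"industry": "retail", "town": "Leeds"}] * 40
    + [{"industry": "retail", "town": "Ambridge"}] * 12
    + [{"industry": "falconry", "town": "Leeds"}] * 11
    + [{"industry": "falconry", "town": "Ambridge"}] * 1
)

# Coarse slice: every cell is large enough to publish.
by_industry = aggregate_with_suppression(records, ["industry"])

# Finer slice: the lone Ambridge falconer ends up in a cell of one,
# so that cell is withheld (None) instead of being published.
by_industry_and_town = aggregate_with_suppression(records, ["industry", "town"])
```

Even in this toy case the suppression is leaky: anyone holding both tables can subtract the published Leeds falconry count from the industry total and recover the withheld figure. That is the "anonymized data...isn't" problem in miniature.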

Finally, I believe individuals want to see and control what is being shared about them in a fine-grained, ongoing way. A move to personal data stores could up-end this whole discussion.


Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. Stories about the border wars between cyberspace and real life are posted occasionally during the week at the net.wars Pinboard - or follow on Twitter.