net.wars: October 2012 Archives

October 26, 2012

Lie to me

I thought her head was going to explode.

The discussion that kicked off this week's Parliament and Internet conference revolved around cybersecurity and trust online, harmlessly at first. Then Helen Goodman (Labour - Bishop Auckland), the shadow minister for Culture, Media, and Sport, raised a question: what was Nominet doing to get rid of anonymity online? Simon McCalla, Nominet's CTO, had some answers: primarily, they're constantly trying to improve the accuracy and reliability of the Whois database, but it's only a very small criminal element that engages in false domain name registration. Like that.

A few minutes later, Andy Smith, PSTSA Security Manager, Cabinet Office, in answer to a question about why the government was joining the Open Identity Exchange (as part of the Identity Assurance Programme) advised those assembled to protect themselves online by lying. Don't give your real name, date of birth, and other information that can be used to perpetrate identity theft.

Like I say, bang! Goodman was horrified. I was sitting near enough to feel the splat.

It's the way of now that the comment was immediately tweeted, picked up by the BBC reporter in the room, published as a story, retweeted, Slashdotted, tweeted some more, and finally boomeranged back to be recontextualized from the podium. Given a reporter with a cellphone and multiple daily newspaper editions, George Osborne's contretemps in first class would still have reached the public eye the same day 15 years ago. This bit of flashback couldn't have happened even five years ago.

For the record, I think it's clear that Smith gave good security advice, and that the headline - the greater source of concern - ought to be that Goodman, an MP apparently frequently contacted by constituents complaining about anonymous cyberbullying, doesn't quite grasp that this is a nuanced issue with multiple trade-offs. (Or, possibly, how often the cyberbully is actually someone you know.) Dates of birth, mothers' maiden names, the names of first pets...these are all things that real-life friends and old schoolmates may well know, and lying about the answers is a perfectly sensible precaution given that there is often no choice about giving the real answers for more sensitive purposes, like interacting with government, medical, and financial services. It is not illegal to fake or refuse to disclose these things, and while Facebook has a real names policy it's enforced with so little rigor that it has a roster of fake accounts the size of Egypt.

Although: the Earl of Erroll might be a bit busy today changing the fake birth date - April 1, 1900 - he cheerfully told us and Radio 4 he uses throughout; one can only hope that he doesn't use his real mother's maiden name, since that, as Tom Scott pointed out later, is in Erroll's Wikipedia entry. Since my real birth date is also in *my* Wikipedia entry and who knows what I've said where, I routinely give false answers to standardized security questions. What's the alternative? Giving potentially thousands of people the answers that will unlock your bank account? On social networking sites it's not enough for you to be taciturn; your birth date may be easily outed by well-meaning friends writing on your wall. None of this is - or should be - illegal.

It turns out that it's still pretty difficult to explain to some people how the Internet works or why. Nominet can work as hard as it likes on verifying its own Whois database, but it is powerless over the many UK citizens and businesses that choose to register under .com, .net, and other gTLDs and country codes. Making a law to enjoin British residents and companies from registering domains outside of .uk...well, how on earth would you enforce that? And then there's the whole problem of trying to check, say, registrations in Chinese characters. Computers can't read Chinese? Well, no, not really, no matter what Google Translate might lead you to believe.

Anonymity on the Net has been under fire for a long, long time. Twenty years ago, the main source of complaints was AOL, whose million-CD marketing program made it easy for anyone to get a throwaway email address for 24 hours or so until the system locked you out for providing an invalid credit card number. Then came Hotmail, and you didn't even need that. Then, as now, there are good and bad reasons for being anonymous. For every nasty troll who uses the cloak to hide there are many whistleblowers and people in private pain who need its protection.

Smith's advice only sounds outrageous if, like Goodman, you think there's a valid comparison between Nominet's registration activity and the function of the Driver and Vehicle Licensing Agency (and if you think the domain name system is the answer to ensuring a traceable online identity). And therein lies the theme of the day: the 200-odd Parliamentarians, consultants, analysts, government, and company representatives assembled repeatedly wanted incompatible things in conflicting ways. The morning speakers wanted better security, stronger online identities, and the resources to fight cybercrime; the afternoon folks were all into education and getting kids to hack and explore so they learn to build things and understand things and maybe have jobs someday, to their own benefit and that of the rest of the country. Paul Bernal has a good summary.

Wendy M. Grossman's Web site has an extensive archive of her books, articles, and music, and an archive of all the earlier columns in this series.

October 19, 2012

Finding the gorilla

"A really smart machine will think like an animal," predicted Temple Grandin at last weekend's Singularity Summit. To an animal, she argued, a human on a horse often looks like a very different category of object than a human walking. That seems true; and yet animals also live in a sensory-driven world entirely unlike that of machines.

A day later, Melanie Mitchell, a professor of computer science at Portland State University, argued that analogies are key to human intelligence, producing landmark insights like comparing a brain to a computer (von Neumann) or evolutionary competition to economic competition (Darwin). This is true, although that initial analogy is often insufficient and may even be entirely wrong. A really significant change in our understanding of the human brain came with research by psychologists like Elizabeth Loftus showing that where computers retain data exactly as it was (barring mechanical corruption), humans improve, embellish, forget, modify, and partially lose stored memories; our memories are malleable and unreliable in the extreme. (For a worked example, see The Good Wife, season 1, episode 6.)

Yet Mitchell is obviously right when she says that much of our humor is based on analogies. It's a staple of modern comedy, for example, for a character to respond on a subject *as if* it were another subject (chocolate as if it were sex, a pencil dropping on Earth as if it were sex, and so on). Especially incongruous analogies: when Watson asks - in the video clip she showed - for the category "Chicks dig me" it's funny because we know that as a machine a) Watson doesn't really understand what it's saying, and b) Watson is pretty much the polar opposite of the kind of thing that "chicks" are generally imagined to "dig".

"You are going to need my kind of mind on some of these Singularity projects," said Grandin, meaning visual thinkers, rather than the mathematical and verbal thinkers who "have taken over". She went on to contend that visual thinkers are better able to see details and relate them to each other. Her example: the emergency generators at Fukushima located below the level of a plaque 30 feet up on the seawall warning that flood water could rise that high. When she talks - passionately - about installing mechanical overrides in the artificial general intelligences Singularitarians hope will be built one day soonish, she seems to be channelling Peter G. Neumann, who talks often about the computer industry's penchant for repeating the security mistakes of decades past.

An interesting sideline about the date of the Singularity: Oxford's Stuart Armstrong has studied these date predictions and concluded pretty much that, in the famed words of William Goldman, no one knows anything. Based on his study of 257 predictions collected by the Singularity Institute and published on its Web site, he concluded that most theories about these predictions are wrong. The dates chosen typically do not correlate with the age or expertise of the predictor or the date of the prediction. I find this fascinating: there's something like an 80 percent consensus that the Singularity will happen in five to 100 years.

Grandin's discussion of visual thinkers made me wonder whether they would be better or worse at spotting the famed invisible gorilla than most people. Spoiler alert: if you're not familiar with this psychology test, go now and watch the clip before proceeding. You want to say better - after all, spotting visual detail is what visual thinkers excel at - but what if the demands of counting passes are more all-consuming for them than for other types of thinkers? The psychologist Daniel Kahneman, participating by video link, talked about other kinds of bias but not this one. Would visual thinkers be more or less likely to engage in the common human pastime of believing we know something based on too little data and then ignoring new data?

This is, of course, the opposite of today's Bayesian systems, which make a guess and then refine it as more data arrives, quite unlike the humans Kahneman describes. So many of the developments we're seeing now rely on crunching masses of data (often characterized as "big" but often not *really* all that big) to find subtle patterns that humans never spot. Linda Avey, founder of the personal genome profiling service 23andMe, and John Wilbanks are both trying to provide services that will allow individuals to take control of and understand their personal medical data. Avey in particular seems poised to link in somehow to the data generated by seekers in the several-year-old self-quantified movement.

This approach is so far yielding some impressive results. Peter Norvig, the director of research at Google, recounted both the company's work on recognizing cats and its work on building Google Translate. The latter's patchy quality seems more understandable when you learn that it was built by matching documents issued in multiple languages against each other and building up statistical probabilities. The former seems more like magic, although Slate points out that the computers did not necessarily pick out the same patterns humans would.

Well, why should they? Do I pick out the patterns they're interested in? The story continues...

October 12, 2012

My identity, my self

Last week, the media were full of the story that the UK government was going to start accepting Facebook logons for authentication. This week, in several presentations at the RSA Conference, representatives of the Government Digital Service begged to differ: the list of companies that have applied to become identity providers (IDPs) will be published at the end of this month and until then they are not confirming the presence or absence of any particular company. According to several of the spokesfolks manning the stall and giving presentations, the press just assumed that when they saw social media companies among the categories of organization that might potentially want to offer identity authentication, that meant Facebook. We won't know for another few weeks who has actually applied.

So I can mercifully skip the rant that hooking a Facebook account to the authentication system you use for government services is a horrible idea in both directions. What they're actually saying is, what if you could choose among identification services offered by the Post Office, your bank, your mobile network operator (especially for the younger generation), your ISP, and personal data store services like Mydex or small, local businesses whose owners are known to you personally? All of these sounded possible based on this week's presentations.

The key, of course, is what standards the government chooses to create for IDPs and which organizations decide they can meet those criteria and offer a service. Those are the details the devil is in: during the 1990s battles about deploying strong cryptography, the government wanted copies of everyone's cryptography keys to be held in escrow by a Trusted Third Party. At the time, the frontrunners were banks: the government certainly trusted those, and imagined that we did, too. The strength of the disquiet over that proposal took them by surprise. Then came 2008. Those discussions are still relevant, however; someone with a long memory raised the specter of Part I of the Electronic Communications Act 2000, modified in 2005, as relevant here.

It was this historical memory that made some of us so dubious in 2010, when the US came out with proposals rather similar to the UK's present ones, the National Strategy for Trusted Identities in Cyberspace (NSTIC). Ross Anderson saw it as a sort of horror-movie sequel. On Wednesday, however, Jeremy Grant, the senior executive advisor for identity management at the US National Institute for Standards and Technology (NIST), the agency charged with overseeing the development of NSTIC, sounded a lot more reassuring.

Between then and now came both US and UK attempts to establish some form of national ID card. In the US, it was "Real ID", focused on the state authorities that issue driver's licenses. In the UK, it was the national ID card and accompanying database. In both countries the proposals got howled down. In the UK especially, the combination of an escalating budget, a poor record with large government IT projects, a change of government, and a desperate need to save money killed it in 2010.

Hence the new approach in both countries. From what the GDS representatives - David Rennie (head of proposition at the Cabinet Office), Steven Dunn (lead architect of the Identity Assurance Programme; Twitter: @cuica), Mike Pegman (security architect at the Department for Work and Pensions, expected to be the first user service; Twitter: @mikepegman), and others manning the GDS stall - said, the plan is much more like the structure that privacy advocates and cryptographers have been pushing for 20 years: systems that give users choice about who they trust to authenticate them for a given role and that share no more data than necessary. The notion that this might actually happen is shocking - but welcome.

None of which means we shouldn't be asking questions. We need to understand clearly the various envisioned levels of authentication. In practice, will those asking for identity assurance ask for the minimum they need or always go for the maximum they could get? For example, a bar only needs relatively low-level assurance that you are old enough to drink; but will bars prefer to ask for full identification? What will be the costs; who pays them and under what circumstances?

Especially, we need to know the detail of the standards organizations must meet to be accepted as IDPs and, in particular, what kinds of organization they exclude. The GDS as presently constituted - composed, as William Heath commented last year, of all the smart, digitally experienced people you *would* hire to reinvent government services for the digital world if you had the choice - seems to have its heart in the right place. Their proposals as outlined - conforming, as Pegman explained happily, to Kim Cameron's seven laws of identity - pay considerable homage to the idea that no one party should have all the details of any given transaction. But the surveillance-happy type of government that legislates for data retention and CCDP might also at some point think, hey, shouldn't we be requiring IDPs to retain all data (requests for authentication, and so on) so we can inspect it should we deem it necessary? We certainly want to be very careful not to build a system that could support such intimate secret surveillance - the fundamental objection all along to key escrow.

October 5, 2012

The doors of probability

Mike Lynch has long been the most interesting UK technology entrepreneur. In 2000, he became Britain's first software billionaire. In 2011 he sold his company, Autonomy, to Hewlett-Packard for $10 billion. A few months ago, Hewlett-Packard let him escape back into the wild of Cambridge. We've been waiting ever since for hints of what he'll do next; on Monday, he showed up at NESTA to talk about his adventures with Wired UK editor David Rowan.

Lynch made his name and his company by understanding that the rule formulated in 1750 by the English vicar and mathematician Thomas Bayes could be applied to getting machines to understand unstructured data. These days, Bayes is an accepted part of the field of statistics, but for a couple of centuries anyone who embraced his ideas would have been unwise to admit it. That began to change in the 1980s, when people began to realize the value of his ideas.

"The work [Bayes] did offered a bridge between two worlds," Lynch said on Monday: the post-Renaissance world of science, and the subjective reality of our daily lives. "It leads to some very strange ideas about the world and what meaning is."

As Sharon Bertsch McGrayne explains in The Theory That Would Not Die, Bayes was offering a solution to the inverse probability problem. You have a pile of encrypted code, or a crashed airplane, or a search query: all of these are effects; your problem is to find the most likely cause. (Yes, I know: to us the search query is the cause and the page of search results is the effect; but consider it from the computer's point of view.) Bayes' idea was to start with a 50/50 random guess and refine it as more data changes the probabilities in one direction or another. When you type "turkey" into a search engine it can't distinguish between the country and the bird; when you add "recipe" you increase the probability that the right answer is instructions on how to cook one.
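For readers who like to see the arithmetic, here is a minimal sketch of that refinement in Python. It illustrates Bayes' rule in general, not how any real search engine works: the function name `update` and every likelihood figure below are made up for the example.

```python
def update(prior, likelihood_if_true, likelihood_if_false):
    """Return P(hypothesis | evidence) using Bayes' rule."""
    numerator = likelihood_if_true * prior
    evidence = numerator + likelihood_if_false * (1.0 - prior)
    return numerator / evidence


# Hypothesis: the query "turkey" means the bird, not the country.
belief = 0.5  # the 50/50 starting guess

# Assumed, invented figures: how often each extra word turns up
# alongside "turkey" under each meaning.
p_recipe_if_bird, p_recipe_if_country = 0.30, 0.02
p_roast_if_bird, p_roast_if_country = 0.20, 0.01

# Each new word of evidence shifts the belief; the result of one
# update becomes the prior for the next.
belief_after_recipe = update(belief, p_recipe_if_bird, p_recipe_if_country)
belief_after_roast = update(belief_after_recipe, p_roast_if_bird, p_roast_if_country)

print(belief_after_recipe)
print(belief_after_roast)
```

With these invented numbers, adding "recipe" moves the belief that the bird is meant from 0.5 to 0.9375, and a second word pushes it higher still. The point is only the shape of the process: guess, observe, revise, repeat.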

Note, however, that search engines work on structured data: tags, text content, keywords, and metadata all going into building an index they can run over to find the hits. What Lynch is talking about is the stuff that humans can understand - raw emails, instant messages, video, audio - that until now has stymied the smartest computers.

Most of us don't really like to think in probabilities. We assume every night that the sun will rise in the morning; we call a mug a mug and not "a round display of light and shadow with a hole in it" in case it's really a doughnut. We also don't go into much detail in making most decisions, no matter how much we justify them afterwards with reasoned explanations. Even decisions that are in fact probabilistic - such as those of the electronic line-calling device Hawk-Eye used in tennis and cricket - we prefer to display as though they were infallible. We could, as Cardiff professor Harry Collins argued, take the opportunity to educate people about probability: the on-screen virtual reality animation could include an estimate of the margin for error, or the probability that the system is right (much the way IBM did in displaying Watson's winning Jeopardy answers). But apparently it's more entertaining - and sparks fewer arguments from the players - to pretend there is no fuzz in the answer.

Lynch believes we are just at the beginning of the next phase of computing, in which extracting meaning from all this unstructured data will bring about profound change.

"We're into understanding analog," he said. "Fitting computers to us instead of us to them." In addition, like a lot of the papers and books on algorithms I've been reading recently, he believes we're moving away from the scientific tradition of understanding a process to get an outcome and into taking huge amounts of data about outcomes and from it extracting valid answers. In medicine, for example, that would mean changing from the doctor who examines a patient, asks questions, and tries to understand the cause of what's wrong with them in the interests of suggesting a cure. Instead, why not a black box that says, "Do these things" if the outcome means a cured patient? "Many people think it's heresy, but if the treatment makes the patient better..."

At the beginning, Lynch said, the Autonomy founders thought the company could be worth £2 to £3 million. "That was our idea of massive back then."

Now, with his old Autonomy team, he is looking to invest in new technology companies. The goal, he said, is to find new companies built on fundamental technology whose founders are hungry and strongly believe that they are right - but are still able to listen and learn. The business must scale, requiring little or no human effort to service increased sales. With that recipe he hopes to find the germs of truly large companies - not the "put in £10 million, sell out at £80 million" strategy he sees as most common, but multi-billion pound companies. The key is finding that fundamental technology, something where it's possible to pick a winner.
