June 16, 2017

The ghost in the machine

rotated-patrickball-2017.jpgHumans are a problem in decision-making. We have prejudices based on limited experience, received wisdom, weird personal irrationality, and cognitive biases psychologists have documented. Unrecognized emotional mechanisms shield us from seeing our mistakes.

Cue machine learning as the solution du jour. Many have claimed that crunching enough data will deliver unbiased judgements. These days, this notion is being debunked: the data the machines train on and analyze arrives pre-infected, as we created it in the first place, a problem Cathy O'Neil does a fine job of explaining in Weapons of Math Destruction. See also Data & Society and Fairness, Accountability, and Transparency in Machine Learning.

Patrick Ball, founding director of the Human Rights Database Analysis Group, argues, however, that there are underlying worse problems. HRDAG "applies rigorous science to the analysis of human rights violations around the world". It uses machine learning - currently, to locate mass graves in Mexico - but a key element of its work is "multiple systems estimation" to identify overlaps and gaps.

"Every kind of classification system - human or machine - has several kinds of errors it might make," he says. "To frame that in a machine learning context, what kind of error do we want the machine to make?" HRDAG's work on predictive policing shows that "predictive policing" finds patterns in police records, not patterns in occurrence of crime.

Media reports love to rate machine learning's "accuracy", typically implying the percentage of decisions where the machine's "yes" represents a true positive and its "no" means a true negative. Ball argues this is meaningless. In his example, a search engine that scans billions of web pages for "Wendy Grossman" can be accurate to .99999 because the vast supply of pages that don't mention me (true negatives) will swamp the results. The same is true of any machine system trying to find something rare in a giant pile of data - and it gets worse as the pile of data gets bigger, a problem net.wars has often called searching for a needle in a haystack by building bigger haystacks in relation to data retention.

For any automated decision system, you can draw a 2x2 confusion matrix, like this:
"There are lots of ways to understand that confusion matrix, but the least meaningful of those ways is to look at true positives plus true negatives divided by the total number of cases and say that's accuracy," Ball says, "because in most classification problems there's an asymmetry of yes/no answers" - as above. A "94% accurate" model "isn't accurate at all, and you haven't found any true positives because these classifications are so asymmetric." This fact does make life easy for marketers, though: you can improve your "accuracy" just by throwing more irrelevant data at the model. "To lay people, accuracy sounds good, but it actually isn't the measure we need to know."

Unfortunately, there isn't a single measure: "We need to know at least two, and probably four. What we have to ask is, what kind of mistakes are we willing to tolerate?"

In web searches, we can tolerate a few seconds to scan 100 results and ignore the false positives. False negatives - pages missing that we wanted to see - are less acceptable. Machine learning uses "recall" for the fraction of true positives in the set of results, and "precision" for that of true positives in the entire set being searched. The various ways the classifier can be set can be drawn as a curve. Human beings understand a single number better than tradeoffs; reporting accuracy then means picking a spot on the curve as the point to set the classifier. "But it's always going to be ridiculously optimistic because it will include an ocean of true negatives." This is true whether you're looking for 2,000 fraudulent financial transactions in a sea of billions daily, or finding a handful of terrorists in the general population. Recent attackers, from 9/11 to London Bridge 2017, have already been objects of suspicion, but forces rarely have the capacity to examine every such person, and before an attack there may be nothing to find. Retaining all that irrelevant data may, however, help forensic investigation.

Where there are genuine distinguishing variables, the model will find the matches even given extreme asymmetry in the data. "If we're going to report in any serious way, we will come up with lay language around, 'we were trying to identify 100 people in a population of 20,00 and we found 90 of them." Even then, care is needed to be sure you're finding what you think. The classic example here is the the US Army's trial using neural networks to find camouflaged tanks. The classifier fell victim to the coincidence that all the pictures with tanks in them had been taken on sunny days and all the pictures of empty forest on cloudy days. "That's the way bias works," Ball says.

Cathy_O'Neil_at_Google_Cambridge.jpgThe crucial problem is that we can't see the bias. In her book, O'Neil favors creating feedback loops to expose these problems. But these can be expensive and often can't be created - that's why the model was needed.

"A feedback loop may help, but biased predictions are not always wrong - but they're wrong any time you wander into the space of the bias," Ball says. In his example: say you're predicting people's weight given their height. You use one half of a data set to train a model, then plot heights and weights, draw a line, and use its slope and intercept to predict the other half. It works. "And Wired would write the story." Investigating when the model makes errors on new data shows the training data all came from Hong Kong schoolchildren who opted in, a bias we don't spot because getting better data is expensive, and the right answer is unknown.

"So it's dangerous when the system is trained on biased data. It's really, really hard to know when you're wrong." The upshot, Ball says, is that "You can create fair algorithms that nonetheless reproduce unfair social systems because the algorithm is fair only with respect to the training data. It's not fair with respect to the world."

Illustrations: Patrick Ball; confusion matrix (Jackverr); Cathy O'Neil (GRuban).

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. Stories about the border wars between cyberspace and real life are posted occasionally during the week at the net.wars Pinboard - or follow on Twitter.

October 5, 2012

The doors of probability

Mike Lynch has long been the most interesting UK technology entrepreneur. In 2000, he became Britain's first software billionaire. In 2011 he sold his company, Autonomy, to Hewlett-Packard for $10 billion. A few months ago, Hewlett-Packard let him escape back into the wild of Cambridge. We've been waiting ever since for hints of what he'll do next; on Monday, he showed up at NESTA to talk about his adventures with Wired UK editor David Rowan.

Lynch made his name and his company by understanding that the rule formulated in 1750 by the English vicar and mathematician Thomas Bayes could be applied to getting machines to understand unstructured data. These days, Bayes is an accepted part of the field of statistics, but for a couple of centuries anyone who embraced his ideas would have been unwise to admit it. That began to change in the 1980s, when people began to realize the value of his ideas.

"The work [Bayes] did offered a bridge between two worlds," Lynch said on Monday: the post-Renaissance world of science, and the subjective reality of our daily lives. "It leads to some very strange ideas about the world and what meaning is."

As Sharon Bertsch McGrayne explains in The Theory That Would Not Die, Bayes was offering a solution to the inverse probability problem. You have a pile of encrypted code, or a crashed airplane, or a search query: all of these are effects; your problem is to find the most likely cause. (Yes, I know: to us the search query is the cause and the page of search results if the effect; but consider it from the computer's point of view.) Bayes' idea was to start with a 50/50 random guess and refine it as more data changes the probabilities in one direction or another. When you type "turkey" into a search engine it can't distinguish between the country and the bird; when you add "recipe" you increase the probability that the right answer is instructions on how to cook one.

Note, however, that search engines work on structured data: tags, text content, keywords, and metadata all going into building an index they can run over to find the hits. What Lynch is talking about is the stuff that humans can understand - raw emails, instant messages, video, audio - that until now has stymied the smartest computers.

Most of us don't really like to think in probabilities. We assume every night that the sun will rise in the morning; we call a mug a mug and not "a round display of light and shadow with a hole in it" in case it's really a doughnut. We also don't go into much detail in making most decisions, no matter how much we justify them afterwards with reasoned explanations. Even decisions that are in fact probabilistic - such as those of the electronic line-calling device Hawk-Eye used in tennis and cricket - we prefer to display as though they were infallible. We could, as Cardiff professor Harry Collins argued, take the opportunity to educate people about probability: the on-screen virtual reality animation could include an estimate of the margin for error, or the probability that the system is right (much the way IBM did in displaying Watson's winning Jeopardy answers). But apparently it's more entertaining - and sparks fewer arguments from the players - to pretend there is no fuzz in the answer.

Lynch believes we are just at the beginning of the next phase of computing, in which extracting meaning from all this unstructured data will bring about profound change.

"We're into understanding analog," he said. "Fitting computers to use instead of us to them." In addition, like a lot of the papers and books on algorithms I've been reading recently, he believes we're moving away from the scientific tradition of understanding a process to get an outcome and into taking huge amounts of data about outcomes and from it extracting valid answers. In medicine, for example, that would mean changing from the doctor who examines a patient, asks questions, and tries to understand the cause of what's wrong with them in the interests of suggesting a cure. Instead, why not a black box that says, "Do these things" if the outcome means a cured patient? "Many people think it's heresy, but if the treatment makes the patient better..."

At the beginning, Lynch said, the Autonomy founders thought the company could be worth £2 to £3 million. "That was our idea of massive back then."

Now, with his old Autonomy team, he is looking to invest in new technology companies. The goal, he said, is to find new companies built on fundamental technology whose founders are hungry and strongly believe that they are right - but are still able to listen and learn. The business must scale, requiring little or no human effort to service increased sales. With that recipe he hopes to find the germs of truly large companies - not the put in £10 million sell out at £80 million strategy he sees as most common, but multi-billion pound companies. The key is finding that fundamental technology, something where it's possible to pick a winner.

Wendy M. Grossman's Web site has an extensive archive of her books, articles, and music, and an archive of all the earlier columns in this series.

February 18, 2011

What is hyperbole?

This seems to have been a week for over-excitement. IBM gets an onslaught of wonderful publicity because it built a very large computer that won at the archetypal American TV game, Jeopardy. And Eben Moglen proposes the Freedom box, a more-or-less pocket ("wall wart") computer you can plug in and that will come up, configure itself, and be your Web server/blog host/social network/whatever and will put you and your data beyond the reach of, well, everyone. "You get no spying for free!" he said in his talk outlining the idea for the New York Internet Society.

Now I don't mean to suggest that these are not both exciting ideas and that making them work is/would be an impressive and fine achievement. But seriously? Is "Jeopardy champion" what you thought artificial intelligence would look like? Is a small "wall wart" box what you thought freedom would look like?

To begin with Watson and its artificial buzzer thumb. The reactions display everything that makes us human. The New York Times seems to think AI is solved, although its editors focus, on our ability to anthropomorphize an electronic screen with a smooth, synthesized voice and a swirling logo. (Like HAL, R2D2, and Eliza Doolittle, its status is defined by the reactions of the surrounding humans.)

The Atlantic and Forbes come across as defensive. The LA Times asks: how scared should we be? The San Francisco Chronicle congratulates IBM for suddenly becoming a cool place for the kids to work.

If, that is, they're not busy hacking up Freedom boxes. You could, if you wanted, see the past twenty years of net.wars as a recurring struggle between centralization and distribution. The Long Tail finds value in selling obscure products to meet the eccentric needs of previously ignored niche markets; eBay's value is in aggregating all those buyers and sellers so they can find each other. The Web's usefulness depends on the diversity of its sources and content; search engines aggregate it and us so we can be matched to the stuff we actually want. Web boards distributed us according to niche topics; social networks aggregated us. And so on. As Moglen correctly says, we pay for those aggregators - and for the convenience of closed, mobile gadgets - by allowing them to spy on us.

An early, largely forgotten net.skirmish came around 1991 over the asymmetric broadband design that today is everywhere: a paved highway going to people's homes and a dirt track coming back out. The objection that this design assumed that consumers would not also be creators and producers was largely overcome by the advent of Web hosting farms. But imagine instead that symmetric connections were the norm and everyone hosted their sites and email on their own machines with complete control over who saw what.

This is Moglen's proposal: to recreate the Internet as a decentralized peer-to-peer system. And I thought immediately how much it sounded like...Usenet.

For those who missed the 1990s: invented and implemented in 1979 by three students, Tom Truscott, Jim Ellis, and Steve Bellovin, the whole point of Usenet was that it was a low-cost, decentralized way of distributing news. Once the Internet was established, it became the medium of transmission, but in the beginning computers phoned each other and transferred news files. In the early 1990s, it was the biggest game in town: it was where the Linus Torvalds and Tim Berners-Lee announced their inventions of Linux and the World Wide Web.

It always seemed to me that if "they" - whoever they were going to be - seized control of the Internet we could always start over by rebuilding Usenet as a town square. And this is to some extent what Moglen is proposing: to rebuild the Net as a decentralized network of equal peers. Not really Usenet; instead a decentralized Web like the one we gave up when we all (or almost all) put our Web sites on hosting farms whose owners could be DMCA'd into taking our sites down or subpoena'd into turning over their logs. Freedom boxes are Moglen's response to "free spying with everything".

I don't think there's much doubt that the box he has in mind can be built. The Pogoplug, which offers a personal cloud and a sort of hardware social network, is most of the way there already. And Moglen's argument has merit: that if you control your Web server and the nexus of your social network law enforcement can't just make a secret phone call, they'll need a search warrant to search your home if they want to inspect your data. (On the other hand, seizing your data is as simple as impounding or smashing your wall wart.)

I can see Freedom boxes being a good solution for some situations, but like many things before it they won't scale well to the mass market because they will (like Usenet) attract abuse. In cleaning out old papers this week, I found a 1994 copy of Esther Dyson's Release 1.0 in which she demands a return to the "paradise" of the "accountable Net"; 'twill be ever thus. The problem Watson is up against is similar: it will function well, even engagingly, within the domain it was designed for. Getting it to scale will be a whole 'nother, much more complex problem.

Wendy M. Grossman's Web site has an extensive archive of her books, articles, and music, and an archive of all the earlier columns in this series.

September 1, 2006

The elephant in the dark

Yesterday, August 31, was the actual 50th anniversary of the first artificial intelligence conference, held at Dartmouth in 1956 and recently celebrated with a kind of rerun. John McCarthy, who convened the original conference, spent yesterday giving a talk to a crowd of students at Imperial College, London, on challenges for machine learning, specifically recounting a bit of recent progress working with Stephen Muggleton and Ramon Otero on a puzzle he proposed in 1999.
Here is the puzzle, which expresses the problem of determining an underlying reality from an outward appearance. Most machine learning research, he noted, has concerned the classification of appearance. But this isn't enough for a robot – or a human – to function in the real world. "Robots will have to infer relations between reality and appearance."

One of his examples was John Dalton's work discovering atoms. "Computers need to be able to propose theories," he said – and later modify them according to new information. (Though I note that there are plenty of humans who are unable to do this and who will, despite all evidence and common sense to the opposite, cling desperately to their theory.)

Human common sense reasons in terms of the realities. Some research suggests, for example, that babies are born with some understanding of the permanence of objects – that is, that when an object is hidden by a screen and reappears it is the same object.

Take, as McCarthy did, the simple (for a human) problem of identifying objects without being able to see them; his example was reaching into your pocket and correctly identifying and pulling out your Swiss Army knife (assuming you live in a country where it's legal to carry one). Or identifying the coin you want from a collection of similar coins. You have some idea of what the knife looks and feels like, and you choose the item by its texture and what you can feel of the shape. McCarthy also cited an informal experiment in which people were asked to draw a statuette hidden in a paper bag – they could reach into the paper bag to feel the statue. People can actually do this with little difference than if they can see the object.

But, he said, "You never form an image of the contents of the pocket as a whole. You might form a list." He has, he said, been trying to get Stanford to make a robotic pickpocket.

You can, of course, have a long argument about whether there is such a thing as any kind of objective reality. I've been reading a lot of Philip K. Dick lately, and he had robots that were indistinguishable from humans, even to themselves; yet in Dick's work reality is a fluid, subjective concept that can be disrupted and turned back on itself at any time. You can't trust reality.

But even if you – or philosophers in general – reject the notion of "reality" as a fundamental concept, "You may still accept the notion of relative reality for the design and debugging of robots." Seems a practical approach.
But the more important aspect may be the amount of pre-existing knowledge. "The common view," he said, "is that a computer should solve everything from scratch." His own view is that it's best to provide computers with "suitably formalized" common sense concepts – and that formalizing context is a necessary step.

For example: when you reach into your pocket you have some idea of the contents are likely to be. Partly, of course, because you put them there. But you could make a reasonable guess even about other people's pockets because you have some idea of the usual size of pockets and the kinds of things people are likely to put in them. We often call that "common sense", but a lot of common sense is experience. Other concepts have been built into human and most animal infants through evolution.

Although McCarthy never mentioned it, that puzzle and these other examples all remind me of the story of the elephant and the blind men, which I first came across in the writings of Idries Shah, who attributed it to the Persian poet Rumi. Depending which piece of the elephant a blind man got hold of, he diagnosed the object as a fan (ear), pillar (leg), hose (trunk), or throne (back). It seems to me a useful analogy to explain why, 50 years on, human-level artificial intelligence still seems so far off. Computers don't have our physical advantages in interacting with the world.

An amusing sidelight that seemed to reinforce that point. After the talk, there was some discussion of building the three-dimensional reality behind McCarthy's puzzle. The longer it went on, the more confused I got about what the others thought they were building; they insisted there was no difficulty in getting around the construction problem I had, which was how to make the underlying arcs turn one and only one stop in each direction. How do you make it stop? I asked. Turns out: they were building it mentally with Meccano. I was using cardboard circles with a hole and a fastener in the middle, and marking pens. When I was a kid, girls didn't have Meccano. Though, I tell you, I'm going to get some *now*.

Wendy M. Grossman’s Web site has an extensive archive of her books, articles, and music, and an archive of all the earlier columns in this series. Readers are welcome to post here, at net.wars home, at her , or by email to (but please turn off HTML).