
Think horses, not zebras

These two articles made a good pairing: Oscar Schwartz's critique of AI hype in the Guardian, and Jennings Brown's takedown of IBM's Watson in real-world contexts. Brown's tl;dr: "This product is a piece of shit," a Florida doctor reportedly told IBM in the leaked memos on which Gizmodo's story is based. "We can't use it for most cases."

Watson has had a rough ride lately: in August 2017 Brown catalogued mounting criticisms of the company and its technology; that June, MIT Technology Review did, too. All three agree: IBM's marketing has outstripped Watson's technical capability.

That's what Schwartz is complaining about: even when scientists make modest claims, media and marketing hype them to the hilt. As a result, instead of focusing on design and control issues such as how to encode social fairness into algorithms, we're reading Nick Bostrom's suggestion that an uncontrolled superintelligent AI would kill humanity in the interests of making paper clips, or the EU's deliberation about whether robots should have rights. These are not urgent issues, and focusing on them benefits only vendors who hope we don't look too closely at what they're actually doing.

Schwartz's own first example is the Facebook chat bots that were intended to simulate negotiation-like conversations. Just a couple of days ago someone referred to this as bots making up their own language and cited it as an example of how close AI is to the Singularity. In fact, because they lacked the right constraints, they just made strange sentences out of normal English words. The same pattern is visible with respect to self-driving cars.

You can see why: wild speculation drives clicks - excuse me, monetized eyeballs - but understanding what's wrong with how most of us think about accuracy in machine learning is *mathy*. Yet understanding the technology's very real limits is crucial to making good decisions about it.
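For a taste of the mathy part, here is a minimal sketch of the base-rate problem, which is also where this column's title comes from. The figures are illustrative assumptions, not numbers from any real diagnostic system, and the function name is made up for the example. It asks: if a test that is right 99% of the time flags a patient for a condition only 1% of patients have, how likely is it that the patient really has it?

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability that a patient flagged as positive actually has the condition."""
    # Share of all patients who have the condition and are correctly flagged
    true_positives = prevalence * sensitivity
    # Share of all patients who are healthy but flagged anyway
    false_positives = (1 - prevalence) * (1 - specificity)
    return true_positives / (true_positives + false_positives)


# Hypothetical numbers: a rare condition (the zebra) and a "99% accurate" test
ppv = positive_predictive_value(prevalence=0.01, sensitivity=0.99, specificity=0.99)
print(f"Chance a flagged patient really has the condition: {ppv:.0%}")
```

Under these assumptions, a positive result is right only about half the time, because the healthy majority generates as many false alarms as the sick minority generates true ones. "99% accurate" by itself says very little about how much to trust any single result.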

With medicine, we're all particularly vulnerable to wishful thinking, since sooner or later we all rely on it for our own survival (something machines will never understand). The UK in particular is hoping AI will supply significant improvements because of the vast amount of patient, that is, training, data the NHS has to throw at these systems. To date, however, medicine has struggled to use information technology effectively.

Attendees at We Robot have often discussed what happens when the accuracy of AI diagnostics outstrips that of human doctors. At what point does defying the AI's decision become malpractice? At this year's conference, Michael Froomkin presented a paper studying the unwanted safety consequences of this approach (PDF).

The presumption is that the AI system's ability to call on the world's medical literature on top of generations of patient data will make it more accurate. But there's an underlying problem that's rarely mentioned: the reliability of the medical literature these systems are built on. The true extent of this issue began to emerge in 2005, when John Ioannidis published a series of papers estimating that 90% of medical research is flawed. In 2016, Ioannidis told Retraction Watch that systematic reviews and meta-analyses are also being gamed because of the rewards and incentives involved.

The upshot is that it's more likely to be unclear, when doctors and AI disagree, where to point the skepticism. Is the AI genuinely seeing patterns and spotting things the doctor can't? (In some cases, such as radiology, apparently yes. But clinical trials and peer review are needed.) Does common humanity mean the doctor finds clues in the patient's behavior and presentation that an AI can't? (Almost certainly.) Is the AI neutral in ways that biased doctors may not be? Stories of doctors not listening to patients, particularly women, are legion. Yet the most likely scenario is that the doctor will be the person entering data - which means the machine will rely on the doctor's interpretation of what the patient says. In all these conflicts, what balance do we tell the AI to set?

Much sooner than Watson will cure cancer we will have to grapple with which AIs have access to which research. In 2015, the team that had drafted Liberia's Ebola recovery plan the previous year wrote a justifiably angry op-ed in the New York Times. They had discovered that thousands of Liberians could have been spared Ebola had a 1982 paper in Annals of Virology been affordable for them to read; it warned that Liberia needed to be included in the Ebola virus endemic zone. Discussions of medical AI to date appear to handwave this sort of issue, yet cost structures, business models, and use of medical research are crucial. Is the future open access, licensing and royalties, or all-you-can-eat subscriptions?

The best selling point for AI is that its internal corpus of medical research can be updated a lot faster than doctors' brains can be. As David Epstein wrote at ProPublica in 2017, many procedures and practices become entrenched, and doctors are difficult to dissuade from prescribing them even when they've been found useless. In the US, he added, the 21st Century Cures Act, passed in December 2016, threatens to make all this worse by lowering standards of evidence.

All of these are pressing problems no medical AI can solve. The problem, as usual, is us.

Illustrations: Watson wins at Jeopardy (via Wikimedia)

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. Stories about the border wars between cyberspace and real life are posted occasionally during the week at the net.wars Pinboard - or follow on Twitter.

