" /> net.wars: June 2021 Archives

« May 2021 | Main | July 2021 »

June 25, 2021

Them

Thumbnail image for IBM-watson-jeopardy.png"It." "It." "It."

In the first two minutes of a recent episode of the BBC program Panorama, "Are You Scared Yet, Human?", the word that kept popping out was "it". The program is largely about the AI race between the US and China, an obviously important topic - see Amy Webb's recent book, The Big Nine. But what I wanted to scream at the show's producers was: "AI is not *it*. AI is *they*." The program itself proved this point by seguing from commercial products to public surveillance systems to military dreams of accurate targeting and ensuring an edge over the other country.

The original rantish complaint I thought I was going to write was about gendering AI-powered voice assistants and, especially, robots. Even though Siri has a female voice it's not a "she". Even if Alexa has a male voice it's not a "he". Yes, there's a long tradition of dubbing ships, countries, and even fiddles "she", but that bothers me less than applying the term to a compliant machine. Yolande Strengers and Jenny Kennedy made this point quite well in their book The Smart Wife, in which they trace much of today's thinking about domestic robots to the role model of Rosie, in the 1960s outer space animated TV sitcom The Jetsons. Strengers and Kennedy want to "queer" domestic robots so they no longer perpetuate heteronormative gender stereotypes.

The it-it-it of Panorama raised a new annoyance. Calling AI "it" - especially when the speaker is, as here, Jeff Bezos or Elon Musk - makes it sound like a monolithic force of technology that can't be stopped or altered, rather than what it is: am umbrella term for a bunch of technologies, many of them experimental and unfinished, and all of which are being developed and/or exploited by large companies and military agencies for their own purposes, not ours. "It" hides the unrepresentative workforce defining AI's present manifestation, machine learning. *This* AI is "systems", not a *thing*, and their impact varies depending on the application.

Last week, Pew Research released the results of a survey it conducted in 2020, in which two-thirds of the experts they consulted predicted that ethics would not be embedded in AI by 2030. Many pointed out that societies and contexts differ; that who gets to define "ethics" is crucial, and that there will always be bad actors who ignore whatever values the rest of us agree on. The report quotes me saying it's not AI that needs ethics, it's the *owners*.

I made a stab at trying to categorize the AI systems we encounter every day. The first that spring to mind are scoring applications whose impact on most people's lives appears to be in refusing access to things we need - asylum, probation in the criminal justice system, welfare in the benefits system, credit in the financial system - and assistance systems that answer questions and offer help, such as recommendation algorithms, search engines, voice assistants, and so on. I forgot about systems playing games, and since then a fourth type has accelerated into public use, in the form of identification systems, almost all of them deeply flawed but being deployed anyway: automated facial recognition, emotion recognition, smile detection, and fancy lie detectors.

I also forgot about medical applications, but despite many genuine breakthroughs - such as today's story that machine learning has helped develop a blood test to detect 50 types of early-stage cancer - many highly touted efforts have been failures.

"It"ifying AI makes many machine learning systems sound more successful than they are. Today's facial recognition is biased and inaccurate . Even in the pandemic, Benedict Dellot told a recent Westminster Health Forum seminar on AI in health care, the big wins in the pandemic have come from conventional data analysis underpinned by new data sharing arrangements. As examples, he cited sharing lists of shielding patients with local authorities to ensure they got the support they needed, linking databases to help local authorities identify vulnerable people, and repurposing existing technologies. But shove "AI" in the name and it sounds more exciting; see also "nano" before this and "e-" before that.

Maybe - *maybe* - one day we will say "AI" and mean a conscious, superhuman brain as originally imagined by science fiction writers and Alan Turing. Machine learning is certainly not that. as Kate Crawford writes in her recent Atlas of AI. Instead, we're talking about a bunch of computers calculating statistics from historical data, forever facing backward. And, as authors such as Sarah T. Roberts and Mary L. Gray and Siddharth Suri have documented, very often today's AI is humans all the way down. Direct your attention to the poorly-paid worker behind the curtain.

Crawford's book reminded me of Arthur C. Clarke's famous line, "Any sufficiently advanced technology is indistinguishable from magic." After reading her structural analysis of machine-learning-AI, it morphed into: "Any technology that looks like magic is hiding something." For Crawford, what AI is hiding is its essential nature as an extractive industry. Let's not grant these systems any more power than we have to. Breaking "it" apart into "them" allows us to pick and choose the applications we want.

Illustrations: IBM's Watson winning at Jeopardy; its later adventures in health care were less successful.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. Stories about the border wars between cyberspace and real life are posted occasionally during the week at the net.wars Pinboard - or follow on Twitter.

June 18, 2021

Libera me

irc-networks-netsplit.de-top10_2021u.pngA man walks into his bar and finds...no one there.

OK, so the "man" was me, and the "bar" was a Reddit-descended IRC channel devoted to tennis...but the shock of emptiness was the same. Because tennis is a global sport, this channel hosts people from Syracuse NY, Britain, Indonesia, the Netherlands. There is always someone commenting, checking the weather wherever tennis is playing, checking scores, or shooting (or befriending) the channel's frequent flying ducks.

Not now: blank, empty void, like John Oliver's background for the last, no-audience year. Those eight listed users are there in nickname only.

A year ago at this time, this channel's users were comparing pandemic restrictions. In our lockdowns, I liked knowing there was always someone in another time zone to type to in real time. So: slight panic. Where *are* they?

IRC dates to the old cooperative Internet. It's a protocol, not a service, so anyone can run an IRC server, and many people do, even though the mainstream, especially the younger mainstream, long since moved on through instant messaging and on to Twitter, WhatsApp groups, Telegram channels, Slack, and Discord. All of these undoubtedly look prettier and are easier to use, but the base functionality hasn't changed all that much.

IRC's enduring appeal is that it's all plain text and therefore bandwidth-light, it can host any size of conversation from a two-person secret channel to a public channel of thousands, multiple clients are available on every platform, and it's free. Genuinely free, not pay-with-data free - no ads! Accordingly, it's still widely used in the open source community. Individual channels largely set their own standards and community norms...and their own games. Circa 2003, I played silly trivia quizzes on a TV-related channel. On this one...ducks. A sample:

゜゜・。 ​ 。・゜゜\_o​< FLAP​ FLAP!

However, the fact that anyone *can* run their own server doesn't mean that everyone *does*, and like other Internet services (see also: open web, email), IRC gravitated towards larger networks that enable discovery. If you host your own server, strangers can only find it if you let them; on a large network users can search for channels, find topics they're interested in, and connect to the nearest server. While many IRC networks still survive, in recent years by far the biggest, according to Netsplit, is Freenode, largely because of its importance in providing connections and support for the open source community. Freenode is also where the missing tennis channel was hosted until about Tuesday, three days before I noticed it was silent. As you'll see in the Netsplit image above, that was when Freenode traffic plummeted, countered by a near-vertical rise in traffic on Libera Chat. That is where my channel turned out to be restored to its usual bustling self.

What happened is both complicated and pretty simple: ownership changed hands without anyone's quite realizing what it was going to mean. To say that IRC is free to use does not mean there are no costs: besides computers and bandwidth, the owners of IRC servers must defend their networks against attacks. Freenode, Wikipedia explains, began as a Linux support channel on another network run by four people, who went on to set up their own network, which eventually became the largest support network for the open source community. A series of ownership changes led from a California charity through a couple of steps to today's owner, the UK-based private company Freenode Ltd, which is owned by Andrew Lee, a technology entrepreneur and founder of the Private Internet Access VPN. No one appears to have thought much about this until last month, when 20 to 30 of the volunteers who run Freenode ("staff") resigned accusing Lee of executing a hostile takeover. Some of them promptly set up Libera as an alternative.

What makes this story about a somewhat arcane piece of the old Internet interesting - aside from the book that demands to be written about IRC's rich history, culture, and significance - is that this is the second time in the last 18 months that a significant piece of the non-profit infrastructure has been targeted for private ownership. The other was the .org top-level domain. These underpinnings need better protection.

On the day traffic plummeted, Lee made deciding to move really easy: as part of changing the network's underlying software, he decided to remove the entire database of registered names and channels - committing suicide, some called it. Because, really: if you're going to have to reregister and reconstruct everything anyway, the barrier to moving to that identical new network over there with all the familiar staff and none of the new owner mishegoss is gone. Hence the mass exodus.

This is why IRC never spawned a technology giant: no lock-in. Normally when you move a conversation it dies. In this case, the entire channel, with its scripts and games and familiar interface, could be recreated at speed and resume as if nothing had happened. All they had to do was tell people. Five minutes after I posted a plaintive query on Reddit, someone came to retrieve me.

So, now: a woman logs into an IRC channel and finds all the old regulars. A duck flaps past. I have forgotten the ".bang" command. I type ".bef" instead. The duck is saved.

Illustrations: Netsplit's graph of IRC network traffic from June 2021.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. Stories about the border wars between cyberspace and real life are posted occasionally during the week at the net.wars Pinboard - or follow on Twitter.

June 11, 2021

The fragility of strangers

Colonial_Pipeline_System.pngThis week, someone you've never met changed the configuration settings on their individual account with a company you've never heard of and knocked out 85% of that company's network. Dumb stuff like this probably happens all the time without attracting attention, but in this case the company, Fastly. is a cloud provider that also runs an intermediary content delivery network intended to speed up Internet connections. Result: people all over the world were unable to reach myriad major Internet sites such as Amazon, Twitter, Reddit, and the Guardian for about an hour.

The proximate cause of these outages, Fastly has now told the world, was a bug that was introduced (note lack of agency) into its software code in mid-May, which laid dormant until someone did something completely normal to trigger it.

In the early days, we all assumed that as more companies came onstream and admins built experience and expertise, this sort of thing would happen less and less. But as the mad complexity of our computer systems and networks continues to increase - Internet of Things! AI! - now it's more likely that stuff like this will also increase, will be harder to debug, and will cause far more ancillary damage - and that damage will not be limited to the virtual world. A single random human, accidentally or intentionally, is now capable of creating physical-world damage at scale.

Ransomware attacks earlier this month illustrate this. Attackers' use of a single leaked password linked to a disused VPN account in the systems that run the Colonial Pipeline compromised gasoline supplies down a large swathe of the US east coast. Near-simultaneously, a ransomware attack on the world's largest meatpacker, JBS, briefly halted production, threatening food security in North America and Australia. In December, an attack on network management software supplied by the previously little-known SolarWinds compromised more than 18,000 companies and government agencies. In all these cases, random strangers reached out across the world and affected millions of personal lives by leveraging a vulnerability inside a company that is not widely known but that provides crucial services to companies we do know and use every day.

An ordinary person just trying to live their life has no defense except to have backups of everything - not just data, but service providers and suppliers. Most people either can't afford that or don't have access to alternatives, which means that precarious lives are made even more so by hidden vulnerabilities they can't assess.

An earlier example: in 2012, journalist Matt Honan's data was entirely wiped out through an attack that leveraged quirks of two unrelated services - Apple and Amazon - against each other to seize control of his email address and delete all his data. Moral: data "in the cloud" is not a backup, even if the hosting company says they keep backups. Second moral: if there is a vulnerability, someone will find it, sometimes for motives you would never guess.

If memory serves, Akamai, founded in 1998, was the first CDN. The idea was that even though the Internet means the death of distance, physics matters. Michael Lewis captured this principle in detail in his book Flash Boys, in which a handful of Wall Street types pay extraordinary amounts to shave a few split-seconds off the time it takes to make a trade by using a ruler and map to send fiber topic cables along the shortest possible route between exchanges. Just so, CDNs cache frequently accessed content on mirror servers around the world. When you call up one of those pages, it, or frequently-used parts of it in the case of dynamically assembled pages, is served up from the nearest of those servers, rather than from the distant originator. By now, there are dozens of these networks and what they do has vastly increased in sophistication, just as the web itself has. A really major outlet like Amazon will have contracts with more than one, but apparently switching from one to the other isn't always easy, and because so many outages are very short it's often easier to wait it out. Not in this case!

At The Conversation, criminology professor David Wall also sees this outage as a sign of the future for the same reason I do: centralization and consolidation have shrunk, and continue to shrink, the number of single points of widespread failure. Yes, the Internet was built to withstand a bomb outage is true - but as we have been writing for 20 years now, this Internet is not that Internet. The path to today's Internet has led from the decentralized era of Usenet, IRC, and own-your-own mail server to web hosting farms to the walled gardens of Facebook, Google, and Apple, and the AI-dominating Big Nine. In 2013, Edward Snowden's revelations made plain how well that suits surveillance-hungry governments, and it's only gotten worse since, as companies seek to insert themselves into every aspect of our lives - intermediaries that bring us a raft of new insecurities that we have no time or ability to audit.

Increasing complexity, hidden intermediation, increasing numbers of interferers, and increasing scale all add up to a brittle and fragile Internet, onto which we continue to pile all our most critical services and activities. What could possibly go wrong?


Illustrations: Map of the Colonial Pipeline.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. Stories about the border wars between cyberspace and real life are posted occasionally during the week at the net.wars Pinboard - or follow on Twitter.

June 4, 2021

Data serfs

Asklepios_-_Epidauros.jpgIt is shameful that the UK government has apparently refused to learn anything over decades of these discussions, and is now ordering GPs in England to send their patient data to NHSx beginning on July 1 and continuing daily thereafter. GPs are unhappy about this. Patients - that is, the English population - have until June 23 to opt out. Government information has been so absent that if it were not for medConfidential we might not even know it was happening. The opt-out process is a dark pattern; here's how.

The pandemic has taught us a lot about both upsides and downsides of sharing information. The downside is the spread of covid conspiracy theories, refusal to accept public health measures, and death threats to public health experts.

But there's so much more upside. The unprecedented speed with which we got safe and effective vaccinations was enormously boosted by the Internet. The original ("ancestral") virus was genome-sequenced and shared across the world within days, enabling everyone to get cracking. While the heavy reliance on preprint servers meant some errors have propagated, rapid publication and direct access to experts has done far more good than harm overall.

Crowdsourcing is also proving its worth: by collecting voluntary symptom and test/vaccination status reports from 4.6 million people around the UK, the Covid Symptom Study, to which I've contributed daily for more than a year, has identified additional symptoms, offered early warning of developing outbreaks, and assessed the likelihood of post-vaccination breakthrough covid infections. The project is based on an app built by the startup Joinzoe in collaboration with 15 charities and academic research organizations. From the beginning it has seemed an obviously valuable effort worth the daily five seconds it takes to report - and worth giving up a modest amount of data privacy for - because the society-wide benefit is so obvious. The key points: the data they collect is specific, they show their work and how my contribution fits in, I can review what I've sent them, and I can stop at any time. In the blog, the project publishes ongoing findings, many of which have generated journal papers for peer review.

The government plans meet none of these criteria. The data grab is comprehensive, no feedback loop is proposed, and the subject access rights enshrined in data protection law are not available. How could it be more wrong?

Established in 2019, NHSx is the "digital arm" of the National Health Service. It's the branch that commissioned last year's failed data-collecting contact tracing app ("failed", as in many people correctly warned that their centralized design was risky and wouldn't work,). NHSx is all data and contracts. It has no direct relationship with patients, and many people don't know it exists. This is the organization that is demanding the patient records of 56 million people, a policy Ross Anderson dates to 1992.

If Britain has a national religion it's the NHS. Yes, it's not perfect, and yes, there are complaints - but it's a lot like democracy: the alternatives are worse. The US, the only developed country that has refused a national health system, is near-universally pitied by those outside it. For those reasons, no politician is ever going to admit to privatizing the NHS, and most citizens are suspicious, particularly of conservatives, that this is what they secretly want to do.

Brexit has heightened these fears, especially among those of us who remember 2014, when NHS England announced care.data, a plan to collect and potentially sell NHS patient data to private companies. Reconstructing the UK's economy post-EU membership has always been seen as involving a trade deal with the US, which is likely to demand free data flows and, most people believe, access to the NHS for its private medical companies. Already, more than 50 GPs' practices (1%) are managed by Operose, a subsidiary of US health insurer Centene. The care.data plan was rapidly canceled with a promise to retreat and rethink.

Seven years later, the new plan is the old plan, dusted off, renamed, and expanded. The story here is the same: it's not that people aren't willing to share data; it's that we're not willing to hand over full control. The Joinzoe app has worked because every day each contributor remakes the decision to participate and because the researchers provide a direct feedback loop that shows how the data is being used and the results. NHSx isn't offering any of that. It is assuming the right to put our most sensitive personal data into a black box it owns and controls and keep doing so without granting us any feedback or recourse. This is worse than advertisers pretending that we make free choices to accept tracking. No one in this country has asked for their relationship with their doctor to be intermediated by a bunch of unknown data managers, however well-meaning. If their case for the medical and economic benefits is so strong (and really, it is, *when done right*), why not be transparent and open about it?

The pandemic has made the case for the value of pooling medical data. But it has also been a perfect demonstration of what happens when trust seeps out of a health system - as it does when governments feudally treat citizens as data serfs. *Both* lessons should be learned.


Illustrations: Asklepios, Greek god of medicine.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. Stories about the border wars between cyberspace and real life are posted occasionally during the week at the net.wars Pinboard - or follow on Twitter.