net.wars: August 2022 Archives

August 26, 2022

Good-enough

A couple of months on from Amazon's synthesized personal voices, it was intriguing to read this week, in the Financial Times ($) (thanks to Charles Arthur's The Overspill), that several AI startups are threatening voice actors' employment prospects. Actors Equity is campaigning to extend legal protection to the material computers synthesize from actors' voices and likenesses that, as Equity puts it, "reproduces performances without generating a 'recording' or a 'copy'." The union's survey found that 65% of performance artists and 93% of audio artists thought AI voices pose a threat to their livelihood.

Voices gives a breakdown of their assignments. Fortuitously, most jobs seek "real person" acting - exactly where voice synthesizers fail. For many situations, though - railway announcements, customer service, marketing campaigns - "real person" is overkill. Plus, AI voices, the FT notes, "can be made to say anything at the push of a button". No moral qualms need apply.

We have seen this movie before. This is a more personalized version of appropriating our data in order to develop automated systems - think Google's language translation, developed from billions of human-translated web pages, or the co-option of images posted on Flickr to build facial recognition systems later used to identify deportees. More immediately pertinent are the stories of Susan Bennett, the actress whose voice was Siri in 2011, and Jen Taylor, the voice of Microsoft's Cortana. Bennett reportedly had no idea that the phrases and sentences she'd spent so many hours recording were in use until a friend emailed. Shouldn't she have the right to object - or to royalties?

Freelance writers have been here: the 1990s saw an industry-wide shift from first-rights contracts under which we controlled our work and licensed one-time use to all-rights contracts that awarded ownership in perpetuity to a shrinking number of conglomerating publishers. Photographers have been here, watching as the ecosystem of small, dedicated agencies that cared about them got merged into Corbis and Getty while their work opportunities shrank under the confluence of digital cameras, smartphones, and social media. Translators, especially, have been here: while the most complex jobs require humans, for many uses machine translation is good enough. It's actors' "good-enough" ground that is threatened.

Like so many technologies, personalized voice synthesis started with noble intentions - to help people who'd lost their own voices to injury or illness. The new crop of companies the FT identifies are profit-focused; as so often, it's not the technology itself, but the rapidly decreasing cost that's making trouble.

First historical anecdote: Steve Williams, animation director for the 1991 film Terminator 2, warned the London Film Festival that it would soon be impossible to distinguish virtual reality from physical reality. Dead presidents would appear live on the news and Cary Grant would make new movies. Obvious result: just as musicians compete against the entire back catalogue of recorded music, might actors now be up against long-dead stars when auditioning for a role?

Second historical anecdote: in 1993, Silicon Graphics, then leading the field of computer graphics, in collaboration with sensor specialist SimGraphics, presented VActor, a system that captured measurements of body movements from live actors and turned them into computer simulations. Creating a few minutes of the liquid metal man (Robert Patrick) in Terminator 2, although a similar process, took 50 animators a year. VActor was faster and much cheaper at producing a reusable library of "good-enough" expressions and body movements. At the time, the company envisioned the system's use for presentations at exhibitions and trade shows and even talk shows. Prior art: Max Headroom, 1987-1988. In 2022, SimGraphics is still offering "real-time interactive characters" - these days, for the metaverse. Its website says VActor, now "AI-VActor", is successfully animating Mario.

Third historical anecdote: in 1997, Fred Astaire, despite being dead at the time, appeared in ads performing some of his most memorable dance moves with a Dirt Devil vacuum cleaner. The ad used CGI to replace two of his dance partners - a mop, a hat rack. If old Cary Grant did have career prospects, they were now lost: the public *hated* the ad. Among the objectors was Astaire's daughter, who returned one of the company's vacuum cleaners with a letter that said, in part, "Yes, he did dance with a mop but he wasn't selling that mop and it was his own idea." The public at large agreed: Astaire's extraordinary artistry deserved better than an afterlife as a shill.

Today, voice actors really could find themselves competing for work against synthesized versions of themselves. Equity's approach seems to be to push to extend copyright so that performers will get royalties for future reuse. Actors might, however, be better served by the personality rights granted in some jurisdictions (not the UK). This right helped Cheers actors George Wendt and John Ratzenberger win when they sued a company that created robots that looked like them, and it is the one Bette Midler used when the singer in an ad fooled people into thinking she herself was singing.

The bottom line: a tough profession looks like getting even tougher. As Michael (Dustin Hoffman) says in Tootsie (written by Murray Schisgal and Larry Gelbart), "I don't believe in Hell. I believe in unemployment, but I don't believe in Hell."

Illustrations: The Big Bang Theory's Rajesh (Kunal Nayyar) tries to date Siri (Becky O'Donohue).

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. Stories about the border wars between cyberspace and real life are posted occasionally during the week at the net.wars Pinboard - or follow on Twitter.

Posted by Wendy M. Grossman at 12:55 PM | Permalink | Comments (0) | TrackBacks (0)

Zero day

Years ago, an alarmist book about cybersecurity threats concluded with the suggestion that attackers' expertise at planting backdoors could result in a "zero day" when, at an attacker-specified time, all the world's computers could be shut down simultaneously.

That never seemed likely.

But if you *do* want to take down all of the computers in an area, the easiest way is to cut off the electricity supply. Which, if the worst predictions for this year's winter in Britain come true, is what could happen, no attacker required. All you need is a government that insists, despite expert warnings, that there will be plenty of very expensive energy to go round for those who can afford it - even while the BBC reports that in some areas of West London the power grid is so stretched by data centers' insatiable power demands that new homes can't be built.

Lack of electrical power is something even those rich enough not to have to choose between eating and heating can't ignore - particularly because they're also most likely to be dependent on broadband for remote working. But besides that, no power means no Internet: no way for kids to do their schoolwork or adults to access government sites to apply for whatever grants become available. Exponentially increasing energy prices already threaten small businesses, charities, care homes, child care centers, schools, food banks, hospitals, and libraries, as well as households. It won't be much consolation if we all wind up "saving" money because there's no power available to pay for.
***
In an earlier, analog, era, parents taking innocent nude photos of their kids were sometimes prosecuted when they tried to have them developed at the local photo shop. In the 2021 equivalent, Kashmir Hill reports at the New York Times, Google flagged pictures two fathers took of their young sons' genitalia in order to help doctors diagnose an infection, labeled them child sexual abuse material, ordered them deleted, suspended the fathers' accounts, and reported them to the police.

It's not surprising that Google has automated content moderation systems dedicated to identifying abuse images, which are illegal almost everywhere. What *has* taken people aback, however, is these fathers' complete inability to obtain redress, even after the police exonerated them. Most of us would expect Google to have a "human in the loop" review process through which someone who's been wrongfully accused can appeal.

In reality, though, the result is more likely to be like what happened in the so-called Twitter joke trial. In that case, a frustrated would-be airline passenger trying to visit his girlfriend posted on Twitter that he might blow up the airport if he still couldn't get a flight. Everyone who saw the tweet, from the airport's security staff to police, agreed he was harmless - and yet no one was willing to be the person who took the risk of signing off on it, just in case. With suspected child abuse, the same applies: no one wants to risk being the person who wrongly signs off on dropping the accusations. Far easier to trust the machine, and if it sets off a cascade of referrals that cost an innocent parent their child (as well as all their back Gmail, contacts list, and personal data), well...it's not your fault. This goes double for a company like Google, whose bottom line depends on providing as little customer service as possible.
***
Even though all around us are stories about the risks of trusting computers not to fail, last week saw a Twitter request for the loan of a child. For the purpose of: having it run in front of a Tesla operating on Full Self-Driving to prove the car would stop. At the Guardian, Arwa Mahdawi writes that said poster did find a volunteer, albeit with this caveat: "They just have to convince their wife." Apparently several wives were duly persuaded, and the children got to experience life as crash test dummies - er, beta testers. Fortunately, none were harmed.

Reportedly, Google/YouTube is acting promptly to get the resulting videos taken down, though it is not reporting the parents, who, as a friend quipped, are apparently unaware that the Darwin Award isn't meant to be aspirational.
***
The last five years of building pattern recognition systems - facial recognition, social scoring, and so on - have seen a lot of evidence-based pushback against claims that these systems are fairer because they eliminate human bias. In fact they codify it because they are trained on data with the historical effects of those biases already baked in.

This week saw a disturbing watershed: bias has become a selling point. An SFGate story by Joshua Bote (spotted at BoingBoing) highlights Sanas, a Bay Area startup that offers software intended to "whiten" call center workers' voices by altering their accents into "standard American English". Having them adopt obviously fake English pseudonyms apparently wasn't enough.

Such a system, as Bote points out, will reinforce existing biases. If it works, it's perfectly designed to expand prejudice and entitlement along the lines of "Why should I have to deal with anyone whose voice or demeanor I don't like?" It's worse than virtual reality, which is at least openly a fictional simulation; it puts a layer of fake over the real world and makes us all less tolerant. This idea needs to fail.

Illustrations: One of the Tesla crashes investigated in New York Times Presents, discussed here in June.

Posted by Wendy M. Grossman at 12:55 PM | Permalink | Comments (0) | TrackBacks (0)

August 19, 2022

Open connections

It's easy to say the future is hybrid. Much harder to say how to make it - and for "it" read "conferences and events" - hybrid and yet still a good experience for all involved.

I should preface this by stating the obvious: writers don't go to events the same way as other people. For one thing, our work is writing about what we find. So where a "native" (say, a lawyer at a conference on robots, policy, and law) will be looking to connect their work to the work others at the conference are doing, the writer is always thinking, "Is that worth writing about?" or "Why is everyone excited about this paper?" You're also always looking around: who would be interesting to schmooze over the next lunch break?

For writers, then - or at least, *this* writer - attending remotely can be unsatisfying, more like reviewing a TV show. After one remote event last year, I approached a fellow attendee on Twitter and suggested a Zoom lunch to hash over what we'd just witnessed. She thought it was weird. In person, wandering up to join her lunchtime conversation would have been unremarkable. The need to ask makes people self-conscious.

And yet, there is a big advantage in being able to access many more events than you could ever afford to fly to. So I want hybrid events to *work*.

In a recent editorial note, a group of academic researchers set out guidelines and considerations for hybrid conferences, the result of discussions held in July 2021 at the Dagstuhl seminar on Climate Friendly Internet Research. They divide hybrid conferences into four types: passive (in which the in-person conference is broadcast to the remote audience, who cannot present or comment); semi-passive (in which remote participants can ask questions but not present or act as panelists); true (in which both local and remote participants have full access and capabilities); and distributed (in which local groups form clusters or nodes, which link together to form the main event).

I have encountered the first three of these (although I think the fourth holds a lot of promise). My general rule: the more I can participate as a remote attendee the better I like it and the more I feel like the conference must be joined in real time. A particular bugaboo is organizers who disable the chat window. At one in-person-only event this year, several panels were composed solely of remote speakers, who needed a technician's help to get audience feedback.

As the Dagstuhl authors write, hybrid events are not new. One of the organizations I'm involved with has enabled remote participation in council meetings for more than 15 years. At pre-pandemic meetings a telephone hookup and conference speaker provided dial-in access. Alongside, two of us typed into a live chat channel updates that both became the meeting's minutes and helped clarify what was being said and who was speaking. Those two also monitored the chat for remote participants who needed help being heard.

Folk music gatherings have developed practices that might be more broadly useful. For one thing, they set up many more breakout "rooms" than seems needed at first glance. One becomes the "parking lot" - a room where participants can leave their computer logged in, mic and camera off, so they can resume the session at any time without having to log in again. There's usually a "kitchen" or some such where people can chat with new and old friends. Every music session has both a music host and a technical assistant who keeps things running smoothly. And there is always an empty period following each session, so people can linger and the next session has ample set-up time. A lobby is continuously staffed by a host who helps incomers find the sessions they want and provides a point of contact if something is going wrong.

As both these examples suggest, enabling remote attendees to be full participants requires a lot of on-site support. In a discussion about this on Twitter, Jon Crowcroft, one of the note's authors, said, for example, that each in-person participant should also have a Zoom (or whatever) login so they could interact fully with remote participants, including accessing the chat window. I would second this. At a multi-track workshop earlier this year, some of the event's tracks were inaccessible because the room's only camera and microphone were poorly placed, making it impossible to see or understand commenters. At the end of each session the conference split in two; those of us on Zoom chatted to each other, while the in-person attendees wandered off to the room where the refreshments were. Crowcroft's recommendation would have helped a lot.

It's a lot of effort, but there is a big reason to do it, which the Dagstuhl authors also discuss: embracing diversity. The last two years have enabled all of us to gain contact with people who could never muster the funding or logistics to travel to distant events. Treating remote participants as an add-on sends the message that we're back to exclusionary business as previous normal. In locking us down, the pandemic also opened up much more of the world to participation. It would be wrong to close it back down again.

Illustrations: The second shot of the final episode of Better Call Saul (because I couldn't think of anything).

Posted by Wendy M. Grossman at 1:37 PM | Permalink | Comments (0) | TrackBacks (0)

August 12, 2022

Nebraska story

This week saw the arrest of a Nebraska teenager and her mother, who are charged with multiple felonies for terminating the 17-year-old's pregnancy at 28 weeks and burying (and, apparently, trying to burn) the fetus. Allegedly, this was a home-based medication abortion...and the reason the authorities found out is that following a tip-off the police got a search warrant for the pair's Facebook accounts. There, the investigators found messages suggesting the mother had bought the pills and instructed her daughter how to use them.

Cue kneejerk reactions. "Abortion" is a hot button. Facebook privacy is a hot button. Result: in reporting these gruesome events most media have chosen to blame this horror story on Facebook for turning over the data.

As much as I love a good reason to bash Facebook, this isn't the right take.

Meta - Facebook's parent - has responded to the stories with a "correction" that says the company turned over the women's data in response to valid legal warrants issued by the Nebraska court *before* the Supreme Court ruling. The company adds, "The warrants did not mention abortion at all."

What the PR folks have elided is that both the Supreme Court's Dobbs decision, which overturned Roe v. Wade, and the wording of the warrants are entirely irrelevant. It doesn't *matter* that this case was about an abortion. Meta/Facebook will *always* turn over user data in compliance with a valid legal warrant issued by a court, especially in the US, its home country. So will every other major technology company.

You may dispute the justice of Nebraska's 2019 Pain-Capable Unborn Child Act, under which abortion is illegal after 20 weeks from fertilization (22 weeks in normal medical parlance). But that's not Meta's concern. What Meta cares about is legal compliance and the technical validity of the warrant. Meta is a business, not a social justice organization, and while many want Mark Zuckerberg to use his personal judgment and clout to refuse to do business with oppressive regimes (by which they usually mean China, or Myanmar), do you really want him and his company to obey only laws they agree with?

There will be many much worse cases to come, because states will enact and enforce the vastly more restrictive abortion laws that Dobbs enables, and there will be many valid legal warrants that force technology companies to hand data to police bent on prosecuting people in excruciating pregnancy-related situations - and in many more countries. Even in the UK, where (except for Northern Ireland) abortion has been mostly non-contentious for decades, lurking behind the 1967 law which legalized abortion until 24 weeks is an 1861 statute under which abortion is criminal. That law, as Shanti Das recently wrote at the Guardian, has been used to prosecute dozens of women and a few men in the last decade. (See also Skeptical Inquirer.)

So if you're going to be mad at Facebook, be mad that the platform hadn't turned on end-to-end encryption for its messaging. That, as security engineer Alec Muffett has been pointing out on Twitter, would have protected the messages against access by both the system itself and by law enforcement. At the Guardian, Johana Bhuiyan reports the company is now testing turning on end-to-end encryption by default. Doubtless, soon to be followed by law enforcement and governments demanding special access.

Others advocate switching to other encrypted messaging platforms that, like Signal, provide a setting that allows you to ensure that messages automatically vape themselves after a specified number of days. Such systems retain no data that can be turned over.

It's good advice, up to a point. For one thing, it ignores most people's preference for using the familiar services their friends use. Adopting a second service just for, say, medical contacts adds complications; getting everyone you know to switch is almost impossible.

Second, it's also important to remember the power of metadata - data about data, which includes everything from email headers to search histories. "We kill people based on metadata," former NSA head Michael Hayden said in 2014 in a debate on the constitutionality of the NSA. (But not, he hastened to add, metadata collected from *Americans*.)

Logs of who has connected to whom and how frequently are often more revealing than the content of the messages sent back and forth. For example: the message content may be essentially meaningless to an outsider ("I can make it on Monday at two") until the system logs tell you that the sender is a woman of childbearing age and the recipient is an abortion clinic. This is why so many governments have favored retaining Internet connection data. Governments cite the usual use cases - organized crime, drug dealers, child abusers, and terrorists - when pushing for data retention, and they are helped by the fact that most people instinctively quail at the thought of others reading the *content* of their messages but overlook metadata's significance. That failure to intuitively grasp the importance of metadata - as in system logs and connection records - has helped enable mass Internet surveillance.
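
To make the point about connection logs concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the log entries, the names, and the directory that labels one recipient as a clinic are all hypothetical, and no real service's log format is implied. It simply shows that a pattern can be read off who contacted whom without a single message body.

```python
from collections import Counter

# Hypothetical connection log: (sender, recipient, date). No message content at all.
CONNECTION_LOG = [
    ("user_a", "clinic.example", "2022-08-01"),
    ("user_a", "clinic.example", "2022-08-03"),
    ("user_a", "friend_1", "2022-08-03"),
    ("user_b", "friend_1", "2022-08-04"),
]

# Hypothetical directory that labels some recipients by what kind of place they are.
RECIPIENT_CATEGORY = {"clinic.example": "medical clinic"}


def contact_profile(log, directory):
    """Count, per sender, how often each labeled category of recipient was contacted."""
    profile = {}
    for sender, recipient, _date in log:
        category = directory.get(recipient)
        if category is None:
            continue  # unlabeled recipients tell this sketch nothing
        profile.setdefault(sender, Counter())[category] += 1
    return profile


profile = contact_profile(CONNECTION_LOG, RECIPIENT_CATEGORY)
for sender, counts in profile.items():
    print(sender, dict(counts))
```

In this toy data, only one sender ends up with a profile, and it is built entirely from the "to" field and a lookup table. Encrypting the messages themselves would change nothing about what the log reveals.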

The net result of all this is to make surveillance capitalism-driven technology services dangerous for the 65.5 million women of childbearing age in the US (2020). That's a fair chunk of their most profitable users, a direct economic casualty of Dobbs.

Illustrations: Facebook.

Posted by Wendy M. Grossman at 12:36 PM | Permalink | Comments (0) | TrackBacks (0)

August 5, 2022

Painting by numbers

My camera can see better than I can. I don't mean that it can take better pictures than I can because of its automated settings, although this is also true. I mean it can capture things I can't *see*. The heron above, captured on a grey day along the Thames towpath, was pretty much invisible to me. I was walking with a friend. Friend pointed and said, "Look. A heron." I pointed the camera more or less where she indicated, pushed zoom to maximum, hit the button, and when I got home there it was.

If the picture were a world-famous original, there might be a squabble about who owned the copyright. I pointed the camera and pushed the button, so in our world the copyright belongs to me. But my friend could stake a reasonable claim: without her, I wouldn't have known where or when to point the camera. The camera company (Sony) could argue, quite reasonably, that the camera and its embedded software, which took years to design and build, did all the work, while my entire contribution took but a second.

I imagine, however, that at the beginning of photography artists who made their living painting landscapes and portraits might have seen reason to be pretty scathing about the notion that photography deserved copyright at all. Instead of working for months to capture the right light and nuances...you just push a button? Where's the creative contribution in that?

This thought was inspired by a recent conversation on Twitter between two copyright experts - Lilian Edwards and Andres Guadamuz - who have been thinking for years about the allocation of intellectual property rights when an AI system creates or helps to create a new work. The proximate cause was Guadamuz's stunning experiments generating images using Midjourney.

If you try out Midjourney's image-maker via the bot on its Discord server, you quickly find that each detail you add to your prompt adds detail and complexity to the resulting image; an expert at "prompt-craft" can come extraordinarily close to painting with the generation system. Writing prompts to control these generation systems and shape their output is becoming an art in itself, an expertise that will become highly valuable. Guadamuz calls it "AI whispering".

Guadamuz touches on this in a June 2022 blog posting, in which he asks about the societal impact of being able to produce sophisticated essays, artworks, melodies, or software code based on a few prompts. The best human creators will still be the crucial element - I don't care how good you are at writing prompts, unless you're the human known as Vince Gilligan you+generator are not going to produce Breaking Bad or Better Call Saul. However, generation systems *might*, as Guadamuz says, produce material that's good-enough for many contexts, given that it's free (ish).

More recently, Guadamuz considers the subject he and Edwards were mulling on Twitter: the ownership of copyright in generated images. Guadamuz had been reading the generators' terms and conditions. OpenAI, owner of DALL-E, specifies that users assign the copyright in all "Generations" its system produces, which it then places in the public domain while granting users a permanent license to do whatever they want with the Generations their prompts inspire. Midjourney takes the opposite approach: the user owns the generated image, and licenses it back to Midjourney.

What Guadamuz found notable was the trend toward assuming that generated images are subject to copyright, even though lawyers have argued that they can't be and fall into the public domain. Earlier this year, the US Copyright Office rejected a request to allow an AI to copyright a work. The UK is an outlier, awarding copyright in computer-generated works to the "person by whom the arrangements necessary for the creation of the work are undertaken". This is ambiguous: is that person the user who wrote the prompt or the programmers who trained the model and wrote the code?

Much of the discussion revolved around how that copyright might be divided up. Should it be shared between the user and the company that owns the generating tool? We don't assign copyright in the words we write to our pens or word processors; but as Edwards suggested, the generator tool is more like an artist for hire than a pen. Of course, if you hire a human artist to create an image for you, contract terms specify who owns the copyright. If it's a work made for hire, the artist retains no further interest.

So whatever copyright lawyers say, the companies who produce and own these systems are setting the norms as part of choosing their business model. The business of selling today's most sophisticated cameras derives from an industry that grew up selling physical objects. In a more recent age, they might have grown up selling software add-on tools on physical media. Today, they may sell subscriptions and tiers of functionality. Nonetheless, if a company's leaders come to believe there is potential for a low-cost revenue stream of royalties for reusing generated images, it will go for it. Corbis and Getty have already pioneered automated copyright enforcement.

For now, these terms and conditions aren't about developing legal theory; the companies just don't want to get sued. These are cover-your-ass exercises, like privacy policies.

Illustrations: Grey heron hanging out by the Thames in spring 2021.

Posted by Wendy M. Grossman at 10:05 AM | Permalink | Comments (0) | TrackBacks (0)