" /> net.wars: March 2018 Archives


March 30, 2018

Conventional wisdom

One of the problems the internet was always likely to face as a global medium was the conflict over who gets to make the rules and whose rules get to matter. So far, it's been possible to kick the can down the road for Future Government to figure out while each country makes its own rules. It's clear, though, that this is not a workable long-term strategy, if only because the longer we go on without equitable ways of solving conflicts, the more entrenched the ad hoc workarounds and because-we-can approaches will become. We've been fighting the same battles for nearly 30 years now.

I didn't realize how much I longed for a change of battleground until last week's Internet Law Works-in-Progress paper workshop, when for the first time I heard an approach that sounded like it might move the conversation beyond the crypto wars, the censorship battles, and the what-did-Facebook-do-to-our-democracy anguish. The paper was presented by Asaf Lubin, a Yale JSD candidate whose background includes a fellowship at Privacy International. In it, he suggested that while each of the many cases of international legal clash has been considered separately by the courts, the reality is that together they all form a pattern.

The cases Lubin is talking about include the obvious ones, such as United States v. Microsoft, currently under consideration in the US Supreme Court, and Apple v. FBI. But they also include the prehistoric cases that created the legal environment we've lived with for the last 25 years: 1996's US v. Thomas, the first jurisdictional dispute, which pitted the community standards of California against those of Tennessee (making it a toss-up whether the US would export the First Amendment or Puritanism); 1995's Stratton Oakmont v. Prodigy, which established that online services could be held liable for the content their users posted; and 1991's Cubby v. CompuServe, which ruled that CompuServe was a distributor, not a publisher, and could not be held liable for user-posted content. The difference in those last two cases: Prodigy exercised some editorial control over postings; CompuServe did not. In the UK, notice-and-takedown rules were codified after Godfrey v. Demon Internet extended defamation law to the internet.

Both access to data - whether encrypted or not - and online content were always likely to repeatedly hit jurisdictional boundaries, and so it's proved. Google is arguing with France over whether right-to-be-forgotten delistings should apply worldwide or only in France or the EU. The UK is still planning to require age verification for pornography sites serving UK residents later this year, and is pondering what sort of regulation should be applied to internet platforms in the wake of the last two weeks of Facebook/Cambridge Analytica scandals.

The biggest jurisdictional case, United States v. Microsoft, may have been rendered moot in the last couple of weeks by the highly divisive Clarifying Lawful Overseas Use of Data (CLOUD) Act. Divisive because: the technology companies seem to like it; EFF and CDT argue that it's an erosion of privacy laws because it lowers the standard of review for issuing warrants; and Peter Swire and Jennifer Daskal think it will improve privacy by setting up a mechanism by which the US can review what foreign governments do with the data they're given. They also believe it will serve us all better than if the Supreme Court rules in favor of the Department of Justice (which they consider likely).

Looking at this landscape, "They're being argued in a siloed approach," Lubin said, going on to imagine the thought process of the litigants involved. "I'm only interested in speech...or I'm a Mutual Legal Assistance person and only interested in law enforcement getting data. There are no conversations across fields and no recognition that the problems are the same." In conversation at conferences, he's catalogued reasons for this. Most cases are brought against companies that are too small to engage in complex litigation and that fear antagonizing the judge. Larger companies are strategic about which cases they argue and in front of whom; they seek to avoid having "sticky precedents" issued by judges who don't understand the conflicts or the unanticipated consequences. Courts, he said, may not even be the right forums for debating these issues.

The result, he went on to say, is that these debates conflate first-order rules, such as the right balance between privacy and freedom of expression, with second-order rules, such as the right procedures to follow when there's a conflict of laws. To solve the first-order rules, we'd need something like a Geneva Convention, which Lubin thought unlikely to happen.

To reach agreement on the second-order rules, however, he proposes a Hague Convention, which he described as one of the "private international law treaties" that could address the problem of agreeing the rules to follow when laws conflict. To me, as neither a lawyer nor a policy wonk, the idea sounded plausible and interesting: these are not debates that should be solved by either "Our lawyers are bigger and more expensive than your lawyers" or "We have bigger bombs." (Cue Tom Lehrer: "But might makes right...") I have no idea if such an idea can work or be made to happen. But it's the first constructive new suggestion I've heard anyone make for changing the conversation in a long, long time.


Illustrations: The Hague's Grote Markt (via Wikimedia); Asaf Lubin.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. Stories about the border wars between cyberspace and real life are posted occasionally during the week at the net.wars Pinboard - or follow on Twitter.

March 23, 2018

Aspirational intelligence

2001-hal.png"All commandments are ideals," he said. He - Steven Croft, the Bishop of Oxford - had just finished reading out to the attendees of Westminster Forum's seminar (PDF) his proposed ten commandments for artificial intelligence. He's been thinking about this on our behalf: Croft malware writers not to adopt AI enhancements. Hence the reply.

The first problem is: what counts as AI? Anders Sandberg has quipped that it's only called AI until it starts working, and then it's called automation. Right now, though, to many people "AI" seems to mean "any technology I don't understand".

Croft's commandment number nine seems particularly ironic: this week saw the first pedestrian killed by a self-driving car. Early guesses are that the likely weakest links were the underemployed human backup driver and the vehicle's faulty LIDAR interpretation of a person walking a bicycle. Whatever the jaywalking laws are in Arizona, most of us instinctively believe that in a cage match between a two-ton automobile and an unprotected pedestrian the car is always the one at fault.

Thinking locally, self-driving cars ought to be the most ethics-dominated use of AI, if only because people don't like being killed by machines. Globally, however, you could argue that AI might be better turned to finding the best ways to phase out cars entirely.

We may have better luck persuading criminal justice systems either to require transparency, fairness, and accountability in the machine learning systems that predict recidivism and decide who can be helped, or to drop those systems entirely.

The less-tractable issues with AI are on display in the still-developing Facebook and Cambridge Analytica scandals. You may argue that Facebook is not AI, but the platform certainly uses AI for fraud detection, to determine what we see, and to decide which parts of our data to use on behalf of advertisers. All on its own, Facebook is a perfect exemplar of all the problems Australian privacy advocate Roger Clarke foresaw in 2004 after examining the first social networks. In 2012, Clarke wrote, "From its beginnings and onward throughout its life, Facebook and its founder have demonstrated privacy-insensitivity and downright privacy-hostility." The same could be said of other actors throughout the tech industry.

Yonatan Zunger is undoubtedly right when he argues in the Boston Globe that computer science has an ethics crisis. However, just fixing computer scientists isn't enough if we don't fix the business and regulatory environment built on "ask forgiveness, not permission". Matt Stoller writes in the Atlantic about the decline since the 1970s of American political interest in supporting small, independent players and limiting monopoly power. The tech giants have widely exported this approach; now, the only other government big enough to counter it is the EU.

The meetings I've attended of academic researchers considering ethics issues with respect to big data have demonstrated all the careful thoughtfulness you could wish for. The November 2017 meeting of the Research Institute in Science of Cyber Security provided numerous worked examples in talks from Kat Hadjimatheou at the University of Warwick, C Marc Taylor from the UK Research Integrity Office, and Paul Iganski of the Centre for Research and Evidence on Security Threats (CREST). Their explanations of the decisions they've had to make about the practical applications and cases that have come their way are particularly valuable.

On the industry side, the problem is not just that Facebook has piles of data on all of us but that the feedback loop from us to the company is indirect. Since the Cambridge Analytica scandal broke, some commenters have indicated that being able to do without Facebook is a luxury many can't afford and that in some countries Facebook *is* the internet. That in itself is a global problem.

Croft's is one of at least a dozen efforts to come up with an ethics code for AI. The Open Data Institute has its Data Ethics Canvas framework to help people working with open data identify ethical issues. The IEEE has published some proposed standards (PDF) that focus on various aspects of inclusion - language, cultures, non-Western principles. Before all that, in 2011, Danah Boyd and Kate Crawford penned Six Provocations for Big Data, which included a discussion of the need for transparency, accountability, and consent. The World Economic Forum published its top ten ethical issues in AI in 2016. Also in 2016, a Stanford University group published a report that tried to fend off regulation by saying it was impossible.

If the industry proves to be right and regulation really is impossible, it won't be because of the technology itself but because of the ecosystem that nourishes amoral owners. "Ethics of AI", as badly as we need it, will be meaningless if the necessary large piles of data to train it are all owned by just a few very large organizations and well-financed criminals; it's equivalent to talking about "ethics of agriculture" when all the seeds and land are owned by a child's handful of global players. A pre-emptive antitrust movement in 2018 would find a way to separate ownership of data from ownership of the AI, algorithms, and machine learning systems that work on them.


Illustrations: HAL.


March 16, 2018

Homeland insecurity

"To the young people," a security practitioner said at a recent meeting, speaking of a group he'd been working with, "it's life lived on their phone."

He was referring to the tendency for adults to talk to kids about fake news, or sexting, or sexual abuse and recruitment, and so on as "online" dangers the adults want to protect them from. But, as this practitioner was trying to explain (and we have said here before), "online" isn't separate to them. Instead, all these issues are part of the context of pressures, relationships, economics, and competition that makes up their lives. This will become increasingly true as widely deployed sensors and hybrid cyber-physical systems and tracking become the norm.

This is a real generation gap. Older adults have taken on board each of these phenomena as we've added it into our existing understanding of the world. Watching each arrive singly over time allows the luxury of consideration and the mental space in which to plot a strategy. If you're 12, all of these things are arriving at once as pieces that are coalescing into your picture of the world. Even if you only just finally got your parents to let you have your own phone, you've been watching videos on YouTube, FaceTiming your friends, and playing online games all your life.

An important part of "life lived on the phone" is the UK's data protection bill, which implements the General Data Protection Regulation and is now going through Parliament. The bill carves out some very broad exemptions. Most notably, opposed by the Open Rights Group and the3million, the bill would remove a person's rights as a data subject in the interests of "effective immigration control". In other words, under this exemption the Home Office could make decisions about where and whether you were allowed to live but never have to tell you the basis for its decisions. Having just had *another* long argument with a different company about whether or not I've ever lived in Iowa, I understand the problem of being unable to authenticate yourself because of poor-quality data.

It's easy for people to overlook laws that "only" affect immigrants, but as Gracie Mae Bradley, an advocacy and policy officer, made clear at this week's The State of Data 2018 event, hosted by Jen Persson, one of the consequences is to move the border from Britain's ports into its hospitals, schools, and banks, which are now supposed to check once a quarter that their 70 million account holders are legitimate. NHS Digital is turning over confidential patient information to help the Home Office locate and deport undocumented individuals. Britain's schools are being pushed to collect nationality. And, as Persson noted, remarkably few parents even know the National Pupil Database exists, and yet it catalogues highly detailed records of every schoolchild.

"It's obviously not limited to immigrants," Bradley said of the GDPR exemption. "There is no limit on the processes that might apply this exemption". It used to be clear when you were approaching a national border; under these circumstances the border is effectively gummed to your shoe.

The data protection bill also has the usual broad exemptions for law enforcement and national security.

Both this discussion (implicitly) and the security conversation we began with (explicitly) converged on security as a felt, emotional state. Even a British citizen living in their native country in conditions of relative safety - a rich country with good health care, stable governance, relatively little violence, mostly reasonable weather - may feel insecure if they're constantly being required to prove the legitimacy of their existence. Conversely, people may live in objectively more dangerous conditions and yet feel more secure because they know the local government is not eying them suspiciously with a view to telling them to repatriate post-haste.

Put all these things together with other trends, and you have the potential for a very high level of social insecurity that extends far outwards from the enemy class du jour, "illegal immigrants". This in itself is a damaging outcome.

And the potential for social control is enormous. Transport for London is progressively eliminating both cash and its Oyster payment cards in favor of direct payment via credit or debit card. What happens to people who one quarter fail the bank's inspection? How do they pay the bus or tube fare to get to work?

Like gender, immigration status is not the straightforward state many people think. My mother, brought to the US when she was four, often talked about the horror of discovering in her 20s that she was stateless: marrying my American father hadn't, as she imagined, automatically made her an American, and Switzerland had revoked her citizenship because she had married a foreigner. In the 1930s, she was naturalized without question. Now...?

Trying to balance conflicting securities is not new. The data protection bill-in-progress offers the opportunity to redress a serious imbalance, which Persson called, rightly, a "disconnect between policy, legislation, technological change, and people". It is, as she and others said, crucial that the balance of power that data protection represents not be determined by a relatively small, relatively homogeneous group.


Illustrations: 2008 map of nationalities of UK residents (via Wikipedia).


March 9, 2018

Signaling intelligence

Last month, the British Home Office announced that it had a tool that can automatically detect 94% of Daesh propaganda with 99.995% accuracy. Sophos summarizes the press release to say that only 50 out of 1 million videos would require human review.

"It works by spotting subtle patterns in the extremist videso that distinguish them from normal content..." Mark Werner, CEO of London-based ASI Data Science, the company that developed the classifier, told Buzzfeed.

Yesterday, ASI, which numbers Skype co-founder Jaan Tallinn among its investors, held its latest demo day in front of a packed house. Most of the lightning presentations focused on various projects its Fellows have led using its tools in collaboration with outside organizations such as Rolls Royce and the Financial Conduct Authority. Warner gave a short presentation of the Home Office extremism project that included little more detail than the press reports of a month ago. My first reaction: it sounds impossible.

That reaction is partly due to the many problems with AI, machine learning, and big data that have surfaced over the last couple of years. Either there are hidden biases, or the media reports are badly flawed, or the system appears to be telling us only things we already know.

Plus, it's so easy - and so much fun! - to mock the flawed technology. This week, for example, neural network trainer Janelle Shane showed off the results of some of her pranks. After confusing image classifiers with sheep that don't exist, goats in trees (birds! or giraffes!) and sheep painted orange (flowers!), she concludes, "...even top-notch algorithms are relying on probability and luck." Even more than humans, it appears that automated classifiers decide what they see based on what they expect to see and apply probability. If a human is holding it, it's probably a cat or dog; if it's in a tree it's not going to be a goat. And so on. The experience leads Shane to surmise that surrealism might be the way to sneak something past a neural net.

Some of this is probably what ASI's classifier does, too (we were shown no details). As Sophos suggests, a lot of the signals ASI's algorithm is likely to use have nothing to do with the computer "seeing" or "interpreting" the images. Instead, it likely looks for known elements such as logos and facial images matched against known terrorism photos or videos. In addition, it can assess the cluster of friends surrounding the account that posted the video and look for profile information showing that the source has posted such material in the past. And some will be based on analyzing the language used in the video. From what ASI was saying, the claim the company is making is fairly specific: the algorithm is supposed to detect (specifically) Daesh videos, with a false positive rate of 0.005% and a true positive rate of 94%.
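Purely as a back-of-the-envelope illustration, here is a short Python sketch of what those two published rates would mean over a pool of a million videos. The prevalence figure - how much of the pool actually is Daesh material - is an assumption of mine for illustration only, not anything ASI has stated, and the outcome is quite sensitive to it.

```python
# Back-of-the-envelope check of the published rates.
total_videos = 1_000_000       # size of the hypothetical pool Sophos cites
false_positive_rate = 0.00005  # 0.005% of benign videos wrongly flagged
true_positive_rate = 0.94      # 94% of actual Daesh videos caught
assumed_prevalence = 0.001     # ASSUMPTION: 1 in 1,000 videos is Daesh material

daesh = total_videos * assumed_prevalence
benign = total_videos - daesh

caught = daesh * true_positive_rate              # true positives
wrongly_flagged = benign * false_positive_rate   # false positives (~50, the press figure)
missed = daesh - caught                          # false negatives

print(f"Flagged for review: {caught + wrongly_flagged:.0f} "
      f"({wrongly_flagged:.0f} of them benign)")
print(f"Missed Daesh videos: {missed:.0f}")
print(f"Precision: {caught / (caught + wrongly_flagged):.1%}")
```

With this assumed prevalence the classifier looks impressively precise; make Daesh material ten times rarer and roughly a third of everything flagged would be benign. The published rates alone don't tell us which world we're in.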

These numbers - assuming they're not artifacts of computerish misunderstanding about what it's looking for - of course represent tradeoffs, as Patrick Ball explained to us last year. Do we want the algorithm to block all possible Daesh videos? Or are we willing to allow some through in the interests of honoring the value of freedom of expression and not blocking masses of perfectly legal and innocent material? That policy decision is not ASI's job.

What was more confusing in the original reports is that the training dataset was said to have been "over 1,000 videos". That seems an incredibly small sample for testing a classifier that's going to be turned loose on a dataset of millions. At the demonstration, Warner's one new piece of information was that because that training set was indeed small, the project developed "synthetic data" to enlarge the training set to sufficient size. As gaming-the-system as that sounds, creating synthetic data to augment training data is a known technique. Without knowing more about the techniques ASI used to create its synthetic data, it's hard to assess that work.
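For readers unfamiliar with the general idea, here is a minimal, generic Python sketch of the simplest form of augmentation: making label-preserving perturbed copies of existing examples. It is illustrative only; ASI has said nothing about how it generated its synthetic data, and this makes no claim about its actual method.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment_frame(frame: np.ndarray, n_copies: int = 5) -> list[np.ndarray]:
    """Produce perturbed copies of one video frame (H x W x 3, values in [0, 1])."""
    copies = []
    for _ in range(n_copies):
        out = np.fliplr(frame) if rng.random() < 0.5 else frame.copy()  # random horizontal flip
        out = np.clip(out + rng.normal(0, 0.02, out.shape), 0, 1)       # small pixel noise
        shift = rng.integers(-4, 5)                                     # small horizontal shift
        out = np.roll(out, shift, axis=1)
        copies.append(out)
    return copies

# A random array stands in for a real video frame here.
frame = rng.random((64, 64, 3))
augmented = augment_frame(frame)
print(len(augmented), augmented[0].shape)
```

More elaborate approaches exist (generative models, for instance), but even this simple version shows the catch: the enlarged training set is only as varied as the perturbations you thought to apply, so the quality of the synthetic data matters as much as its quantity.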

We would feel a lot more certain of all of these claims if the classifier had been through an independent peer review. The sensitivity of the material involved makes this tricky, and if there has been an outside review, we haven't been told about it.

But beyond that, the project to remove this material rests on certain assumptions. As speakers noted at the first conference run by VOX-Pol, an academic research network studying violent online political extremism, the "lone wolf" theory posits that individuals can be radicalized at home by viewing material on the internet. The assumption that this is true underpins the UK's censorship efforts. Yet this theory is contested: humans are highly social animals. Radicalization seems unlikely to take place in a vacuum. What - if any - is the pathway from viewing Daesh videos to becoming a terrorist attacker?

All these questions are beyond ASI's purview to answer. They'd probably be the first to say: they're only a hill of technology beans being asked to solve a mountain of social problems.

Illustrations: Slides from the demonstration (Sam Smith).


March 2, 2018

In sync

Until Wednesday, I was not familiar with the use of "sync" to stand for a music synchronization license - that is, a license to use a piece of music in a visual setting such as a movie, video game, or commercial. The negotiations involved can be Byzantine and very, very slow, in part because the music's metadata is so often wrong or missing. In one such case, described at Music 4.5's seminar on developing new deals and business models for sync (Flash), it took ten years to get the wrong answer from a label to the apparently simple question: who owns the rights to this track on this compilation album?

The surprise: this portion of the music business is just as frustrated as activists with the state of online copyright enforcement. They don't love the Digital Millennium Copyright Act (1998) any more than we do. We worry about unfair takedowns of non-infringing material and bans on circumvention tools; they hate that the Act's Safe Harbor grants YouTube and Facebook protection from liability as long as they remove content when told it's infringing. Google's automated infringement detection software, ContentID, I heard Wednesday, enables the "value gap", which the music industry has been fretting about for several years now because the sites have no motivation to create licensing systems. There is some logic there.

However, where activists want to loosen copyright, enable fair use, and restore the public domain, they want to dump Safe Harbor, whether by developing a technological bypass, by changing the law, or by getting FaceTube to devise a fairer, more transparent revenue split. "Instagram," said one, "has never paid the music industry but is infringing copyright every day."

To most of us, "online music" means subscription-based streaming services like Spotify or download services like Amazon and iTunes. For many younger people, especially Americans though, YouTube is their jukebox. Pex estimates that 84% of YouTube videos contain at least ten seconds of music. Google says ContentID matches 99.5% of those, and then they are either removed or monetized. But, Pex argues, 65% of those videos remain unclaimed and therefore provide no revenue. Worse, as streaming grows, downloads are crashing. There's a detectable attitude that if they can fix licensing on YouTube they will have cracked it for all sites hosting "creator-generated content".

It's a fair complaint that ContentID was built to protect YouTube from liability, not to enable revenues to flow to rights holders. We can also all agree that the present system means millions of small-time creators are locked out of using most commercial music. The dancing baby case took eight years to decide that the background existence of a Prince song in a 29-second home video of a toddler dancing was fair use. But sync, too, was designed for businesses negotiating with businesses. Most creators might indeed be willing to pay to legally use commercial music if licensing were quick, simple, and cheap.

There is also a question of whether today's ad revenues are sustainable; a graphic I can't find showed that the payout per view is shrinking. Bloomberg finds that, increasingly, winning YouTubers take all, with little left for the very long tail.

The twist in the tale is this. MP3 players unbundled albums into songs as separate marketable items. Many artists were frustrated by the loss of control inherent in enabling mix tapes at scale. Wednesday's discussion heralded the next step: unbundling the music itself, breaking it apart into individual beats, phrases and bars, each licensable.

One speaker suggested scenarios. The "content" you want to enjoy is 42 minutes long but your commute is only 38 minutes. You might trim some "unnecessary dialogue" and rearrange the rest so now it fits! My reaction: try saying "unnecessary dialogue" to Aaron Sorkin and let's see how that goes.

I have other doubts. I bet "rearranging" will take longer than watching the four minutes. Speeding up the player slightly achieves the same result, and you can do that *now* for free (try it). More useful was the suggestion that hearing-impaired people could benefit from being able to tweak the mix to fade the background noise and music in a pub scene to make the actors easier to understand. But there, too, we actually already have closed captions. It's clear, however, that the scenarios may be wrong, but the unbundling probably isn't.

In this world, we won't be talking about music, but "music objects". Many will be very low-value...but the value of the total catalogue might rise. The BBC has an experiment up already: The Mermaid's Tears, an "object-based radio drama" in which you can choose to follow any one of the three characters to experience the story.

Smash these things together, and you see a very odd world coming at us. It's hard to see how fair use survives a system that aims to license "music objects" rather than "music". In 1990, Pamela Samuelson warned about copyright maximalism. That agenda does not appear to have gone away.


Illustrations: King David dancing before the Ark of the Covenant, 'Maciejowski Bible', Paris ca. 1240 (via Discarding Images).
