net.wars: June 2017 Archives

Who's the boss?

Even for Google, €2.4 billion is a lot of money. It makes the £130 million the company agreed to pay in back taxes to the UK in 2016 into a micropayment. The former is, of course, the fine that the European Commission has applied to Google over its shopping ad placements.

A fine like that is guaranteed to produce widespread press coverage (and remind a company that governments still have power), and so it has. At Distilled, Will Critchlow finds the ruling wrong but notes that Google could nonetheless face more huge payouts in other vertical areas it operates in, a point also made at Bloomberg. At Reuters, Foo Yun Chee and Eric Auchard think the fine will trouble Google less than the regulatory burden that attends its new definition as a monopoly. Search Engine Land thinks the changes regulators are forcing upon Google will damage the consumer experience. This is Google's own argument: consumers don't want to be directed to some other site to search again; they want results they can click on *right now*.

Google is not wrong about this. The mass of useless comparison sites littering Google's results was one reason I stopped using it back in 2010. I find it more productive to use eBay or Amazon to find niche retailers. So the sites complaining here are mostly ones I regard as spam, but include a few that even I find useful, like Yelp and Trip Advisor, which combine price comparisons with community recommendations and reviews. These two, plus the much more obscure (and currently suspended) UK site Foundem, all filed complaints alongside many other organizations listed by the lobbying group Fairsearch, and being pushed down the results list is a definite disadvantage. This ought to be fixable with a reorganized results page.

There's no doubt that a search engine with the dominance that Google has in the European search market is well-placed to make life extremely difficult for any company that competes with one or more aspects of its business. Here, Windows provides a pretty good analogy: just as the operating system is your gateway to the rest of the computer's functions, search engines are most people's gateway to the rest of the internet. When Microsoft began subsuming utilities like compression, internet browsing, and media play into its operating system, in the process it undermined the business model of competitors that had sprung up to fill those gaps. Google's moves into flight, hotel, and restaurant search threaten independents in exactly the same way. It is a classic principle of antitrust that a company should not be able to leverage its dominance in one market to damage competition in another. Ironically, the one area in which regulators have been proactive in ensuring the health of Google's competitors is file-sharing: would The Pirate Bay be as long-lived as it is if Google promoted its own torrent search?

It's just that the whole thing seems so terribly dated. Outside of a few sectors, like financial services and utilities, comparison sites seem to belong to the mid-1990s, when everyone wanted to have a portal or a shopping mall. Basically, everyone wanted to be Yahoo (Yahoo!) until Google came along and wiped the front page clean. This backdated sensation is fairly typical of antitrust actions in the internet sphere, if only because they take so long to mount. The EU went after Microsoft for its media player in 2004 and Internet Explorer in 2009; each of those actions was arguably at least five years late.

One reason Google has reached the size and power it has is its long list of acquisitions, which includes Zagat (2011), Waze (2013), ITA Software (2011), and DoubleClick (2007). The acquisition of the web-wide advertising business DoubleClick was opposed by many privacy advocates at the time; the Department of Justice made it a condition of the ITA Software buyout that Google had to license the software to other websites for five years. ITA's product is the airfare search and pricing engine that powers not only Google's airline search but that of Kayak, Bing Travel, Orbitz, and a number of airlines. Most people probably have no idea it exists, but you see the point: why bother going to Kayak if you're already on Google and the results are right there?

In 2013, the US Federal Trade Commission declined to prosecute Google for biasing its search results in its own favour, a decision Ed Felten, seconded to the FTC at the time, alludes to at Freedom-to-Tinker in his discussion of the EU decision. At the time, net.wars argued that the regulators examining the case should actually have been thinking about Google's abrupt aggregation of everything it knew about you and its ability to force Android users to provide their personal information. You can just about use an Android phone without signing up for a Gmail account or using the Play Store, but doing so severely limits the apps you can add. Both these threats have done nothing but grow since then. Today's action ought to be about data.

Illustrations: European flags; Google.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. Stories about the border wars between cyberspace and real life are posted occasionally during the week at the net.wars Pinboard - or follow on Twitter.

Posted by Wendy M. Grossman at 6:51 PM

Dead weight

"How much does it cost?" I asked a US friend about her supply of a medication I'm familiar with.

"I don't know," she said. After reading former New York Times reporter Elisabeth Rosenthal's new book An American Sickness, I understand better: no one connected with the American medical system knows how much anything costs. Even the uninsured, who are paying out of pocket, don't know the cost of anything. They only know what they're charged, a very different thing. Spoiler: they pay the most because they have no purchasing power.

In American health care economics, everything works backwards to the way you think it works. "Prices," Rosenthal writes in the tenth of her 11 laws of US health care economics, "rise to what the market will bear." Among the others: "More treatment is always better" and "a lifetime of treatment is preferable to a cure". In addition: treatments and medications can get more expensive as they age; economies of scale give providers the market power to charge more; more competition can send prices up as well as down. Rosenthal proceeds to document how all these rules work out in practice in eye-popping detail. We all know Americans pay more for health care, but Satan lurks in Rosenthal's thousands of details.

Rosenthal hopes to start a loud conversation to foster change. Such proposals are up against more than just the post Citizens United influx of corporate money into politics. For one thing, we tend to believe that more expensive must mean "better". Telling Americans their health care is the most expensive in the world fails as a strategy because many then think that means it must be the best, even if not everyone can access it. And there's confirmation bias: every infant mortality or life expectancy figure can be countered with a cancer survival rate or waiting list. We can't make them love the NHS.

Tying health insurance to employment as an aspirational benefit - a stroke of evil genius - kept health separate from human rights. Viewing health insurance as something you earn out of merit helps create and extend the social divisions we see every day. In another part of our brains, we know that sick people are rarely in a position to act as responsible consumers. You can't shop around for a cheaper surgeon in the middle of a heart attack or ask detailed security questions about your pacemaker's design before installation. Instead, you wrangle with your insurance company after those decisions have been made. The bit of good news there is that the insurance company can't actually repossess your cancer treatment. The bad news is that more than 50% of American bankruptcies are among people who *have* health insurance, and it's distressingly easy for some people to reach the maximum lifetime payout. In Girl in Glass, Deanna Fei writes about the medical meaning of "catastrophic": unexpected and unpredictable, rather than devastating, as it applied to her terrifyingly premature daughter, one of the two "distressed babies" the AOL CEO said cost $1 million each. Even for insured Americans, the burdens of health crises are often financially "catastrophic" in both that sense and the more usual "disaster" sense as well.

Even Rosenthal, however, does not fully recognize the cost of health care to Americans. Employers' involvement in providing health insurance for their employees helps them justify drug testing, and has consequences for medical privacy. People like Fei's husband find their most sensitive health information open to their employers, especially when these self-insure, hiring a third-party insurer just to handle administration. In recent years, Health Privacy Summit speakers have deplored employers' access to staff health data through "Wellness" programs, which aren't protected by the 1996 Health Insurance Portability and Accountability Act (HIPAA). Patients lose privacy under other regimes, too; but the *consequences* of health privacy breaches are less severe when employers aren't financially motivated to fire (or not hire) their most vulnerable employees.

There are also broader costs. In 1987, when Peter Skrabanek first warned of the risks that screening mammograms would increase over-treatment and patient distress, the general response was that the saved lives would be worth the cost. Today, doctors are concluding that Skrabanek was right.

In his recent book, Move Fast and Break Things, Jonathan Taplin recounts the story of The Band member Levon Helm, who was forced to keep touring while seriously ill with esophageal cancer because he needed the money for medical bills. A European might blame the lack of nationalized health care. Taplin focuses on internet companies, which he blames for the music industry's post-Napster drop in revenues. Later in the book, Taplin argues that a future in which robots and AI have taken most jobs will require a universal basic income and free health care - but still doesn't connect it to Helm's plight. For Taplin, the solution is to restore Helm's royalty stream.

The global dominance of American technology companies means we all feel the consequences of the demographics of US start-up culture - which is predominantly a youth culture in part because hardly anyone old enough to contemplate having children can afford to risk their medical insurance. Finally, the expanding cost is a drag on the overall economy. Meanwhile, the cost is measured in human lives.

Illustrations: Hospital pictogram; Elisabeth Rosenthal; An American Sickness.

Posted by Wendy M. Grossman at 6:14 PM

The ghost in the machine

Humans are a problem in decision-making. We have prejudices based on limited experience, received wisdom, weird personal irrationality, and cognitive biases psychologists have documented. Unrecognized emotional mechanisms shield us from seeing our mistakes.

Cue machine learning as the solution du jour. Many have claimed that crunching enough data will deliver unbiased judgements. These days, this notion is being debunked: the data the machines train on and analyze arrives pre-infected, as we created it in the first place, a problem Cathy O'Neil does a fine job of explaining in Weapons of Math Destruction. See also Data & Society and Fairness, Accountability, and Transparency in Machine Learning.

Patrick Ball, founding director of the Human Rights Database Analysis Group, argues, however, that there are underlying worse problems. HRDAG "applies rigorous science to the analysis of human rights violations around the world". It uses machine learning - currently, to locate mass graves in Mexico - but a key element of its work is "multiple systems estimation" to identify overlaps and gaps.

"Every kind of classification system - human or machine - has several kinds of errors it might make," he says. "To frame that in a machine learning context, what kind of error do we want the machine to make?" HRDAG's work on predictive policing shows that "predictive policing" finds patterns in police records, not patterns in occurrence of crime.

Media reports love to rate machine learning's "accuracy", typically implying the percentage of decisions where the machine's "yes" represents a true positive and its "no" means a true negative. Ball argues this is meaningless. In his example, a search engine that scans billions of web pages for "Wendy Grossman" can be accurate to .99999 because the vast supply of pages that don't mention me (true negatives) will swamp the results. The same is true of any machine system trying to find something rare in a giant pile of data - and it gets worse as the pile of data gets bigger, a problem net.wars has often described, in relation to data retention, as searching for a needle in a haystack by building bigger haystacks.
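
To see how the true negatives swamp the number, here is a minimal sketch in Python. The page counts are invented for illustration, and the "classifier" is a hypothetical one that simply answers "no" to every page, so it never finds a single match.

```python
def accuracy(tp, fp, fn, tn):
    """Share of all decisions that were right: (TP + TN) / total."""
    return (tp + tn) / (tp + fp + fn + tn)

# Invented number: 10,000 pages really do mention the name.
relevant_pages = 10_000

results = {}
for total_pages in (1_000_000, 1_000_000_000):
    # A hypothetical classifier that answers "no" to every single page.
    tp, fp = 0, 0
    fn = relevant_pages               # every real match is missed
    tn = total_pages - relevant_pages
    results[total_pages] = accuracy(tp, fp, fn, tn)
    print(f"{total_pages:>13,} pages: accuracy {results[total_pages]:.5f}, matches found {tp}")
```

Under these assumptions the same useless classifier scores .99 on a million pages and .99999 on a billion: the bigger the haystack, the better it looks, while still finding nothing.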

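For any automated decision system, you can draw a 2x2 confusion matrix: one axis is the real answer (yes or no), the other is the system's answer, and the four cells count the true positives, false positives, false negatives, and true negatives.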

"There are lots of ways to understand that confusion matrix, but the least meaningful of those ways is to look at true positives plus true negatives divided by the total number of cases and say that's accuracy," Ball says, "because in most classification problems there's an asymmetry of yes/no answers" - as above. A "94% accurate" model "isn't accurate at all, and you haven't found any true positives because these classifications are so asymmetric." This fact does make life easy for marketers, though: you can improve your "accuracy" just by throwing more irrelevant data at the model. "To lay people, accuracy sounds good, but it actually isn't the measure we need to know."

Unfortunately, there isn't a single measure: "We need to know at least two, and probably four. What we have to ask is, what kind of mistakes are we willing to tolerate?"

In web searches, we can tolerate a few seconds to scan 100 results and ignore the false positives. False negatives - pages missing that we wanted to see - are less acceptable. Machine learning uses "precision" for the fraction of the returned results that are true positives, and "recall" for the fraction of all the real matches in the entire set being searched that the results include. The various ways the classifier can be set can be drawn as a curve. Human beings understand a single number better than tradeoffs; reporting accuracy then means picking a spot on the curve as the point to set the classifier. "But it's always going to be ridiculously optimistic because it will include an ocean of true negatives." This is true whether you're looking for 2,000 fraudulent financial transactions in a sea of billions daily, or finding a handful of terrorists in the general population. Recent attackers, from 9/11 to London Bridge 2017, have already been objects of suspicion, but forces rarely have the capacity to examine every such person, and before an attack there may be nothing to find. Retaining all that irrelevant data may, however, help forensic investigation.
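
The tradeoff can be sketched in a few lines of Python, using the standard definitions: precision is TP / (TP + FP) and recall is TP / (TP + FN). The scores and labels below are invented, and the function names are just illustrative; the point is only that moving the classifier's threshold trades one measure against the other.

```python
def confusion_counts(scores, labels, threshold):
    """Count (tp, fp, fn, tn) when every item scoring >= threshold is flagged."""
    tp = fp = fn = tn = 0
    for score, is_match in zip(scores, labels):
        flagged = score >= threshold
        if flagged and is_match:
            tp += 1
        elif flagged:
            fp += 1
        elif is_match:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def precision(tp, fp):
    """Of the items flagged, what fraction were real matches?"""
    return tp / (tp + fp) if tp + fp else 0.0

def recall(tp, fn):
    """Of the real matches, what fraction were flagged?"""
    return tp / (tp + fn) if tp + fn else 0.0

# Invented scores for ten items; True marks the three real matches.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10, 0.05]
labels = [True, True, False, True, False, False, False, False, False, False]

curve = []
for threshold in (0.85, 0.65, 0.0):
    tp, fp, fn, tn = confusion_counts(scores, labels, threshold)
    curve.append((threshold, precision(tp, fp), recall(tp, fn)))
    print(f"threshold {threshold:.2f}: precision {precision(tp, fp):.2f}, recall {recall(tp, fn):.2f}")
```

With this toy data, a strict threshold flags only real matches but misses one of the three; loosening it catches all three at the price of more false alarms. Each threshold is one point on the curve.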

Where there are genuine distinguishing variables, the model will find the matches even given extreme asymmetry in the data. "If we're going to report in any serious way, we will come up with lay language around, 'we were trying to identify 100 people in a population of 20,000 and we found 90 of them.'" Even then, care is needed to be sure you're finding what you think. The classic example here is the US Army's trial using neural networks to find camouflaged tanks. The classifier fell victim to the coincidence that all the pictures with tanks in them had been taken on sunny days and all the pictures of empty forest on cloudy days. "That's the way bias works," Ball says.

The crucial problem is that we can't see the bias. In her book, O'Neil favors creating feedback loops to expose these problems. But these can be expensive and often can't be created - that's why the model was needed.

"A feedback loop may help, but biased predictions are not always wrong - but they're wrong any time you wander into the space of the bias," Ball says. In his example: say you're predicting people's weight given their height. You use one half of a data set to train a model, then plot heights and weights, draw a line, and use its slope and intercept to predict the other half. It works. "And Wired would write the story." Investigating when the model makes errors on new data shows the training data all came from Hong Kong schoolchildren who opted in, a bias we don't spot because getting better data is expensive, and the right answer is unknown.

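Ball's height-and-weight example can be sketched as follows. All the data here is made up: "children" stands in for the narrow group that opted in, "adults" for the rest of the world the model never saw, and the straight-line formulas used to generate both are assumptions chosen only to make the effect visible.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mean_abs_error(points, slope, intercept):
    """Average size of the prediction error, in kg."""
    return sum(abs(slope * h + intercept - w) for h, w in points) / len(points)

def wobble(i):
    """Small deterministic stand-in for measurement noise, between -2 and 2 kg."""
    return (i * 7) % 5 - 2

# Invented (height cm, weight kg) pairs, for illustration only.
children = [(110 + 2 * i, 0.45 * (110 + 2 * i) - 30 + wobble(i)) for i in range(20)]
adults = [(150 + 2 * i, 0.90 * (150 + 2 * i) - 85 + wobble(i)) for i in range(20)]

# Train on one half of the biased sample, test on the other half.
train, held_out = children[::2], children[1::2]
slope, intercept = fit_line([h for h, _ in train], [w for _, w in train])

held_out_error = mean_abs_error(held_out, slope, intercept)
adult_error = mean_abs_error(adults, slope, intercept)
print(f"error on held-out children: {held_out_error:.1f} kg")
print(f"error on adults:            {adult_error:.1f} kg")
```

On this invented data the model looks excellent when tested against the other half of the same sample, and is off by many kilograms as soon as it wanders outside it. Nothing in the held-out test reveals the problem.
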
"So it's dangerous when the system is trained on biased data. It's really, really hard to know when you're wrong." The upshot, Ball says, is that "You can create fair algorithms that nonetheless reproduce unfair social systems because the algorithm is fair only with respect to the training data. It's not fair with respect to the world."

Illustrations: Patrick Ball; confusion matrix (Jackverr); Cathy O'Neil (GRuban).

Posted by Wendy M. Grossman at 5:00 PM

Foobar

Anyone who flies with any regularity watched with apprehension the catastrophic failure of British Airways' computer systems over the end-of-May holiday weekend, which saw an estimated 75,000 travelers stranded at Heathrow and many others stuck, probably less miserably, in various other places in the world. The most likely outcome of the eventual investigation into what went wrong is that there were many factors, mostly not flattering to the airline despite its chair's defense. For stranded travelers, the more immediate problem was the apparent lack of any plan for dealing with such a total shutdown. You have to ask: if the airport and BA staff have no plan for dealing with a day's worth of backed-up passengers - for getting them information, water, food, and an orderly exit - how do they intend to cope with more urgent and dangerous situations like a fire, or a tornado?

Britain is famous for "muddling through" disasters of various scales, an approach that worked well enough in analog-era catastrophes because attacks tended to be localized and to scale slowly. It's poorly suited for the digital era, when systems don't fail gracefully a bit at a time but crash catastrophically at high speed at global scale. Passengers are the cogs in such systems and, with the exception of a small percentage, are fungible, easily replaced because so many people either want or need to fly. Wherever distance and lack of alternatives means the airline industry as a whole - though not necessarily any one particular airline - lacks competition, increasingly crappy treatment has been the norm for a long time. A hidden factor: Bloomberg pointed out a couple of months ago that airlines make more money by selling the miles customers accrue by using their affiliated credit cards than they do selling seats.

Most of the time, things are different for the small percentage - that is, the minority of people for whom flying is a routine. They know which comforts to arrange for themselves, they know the layouts of their most-used airports, and they generally have help from airlines who recognize that while it's easy to replace the once-a-year vacation flyer it's a lot more expensive to replace the 100,000-mile-a-year business class executive. Particularly if, as in many parts of Europe, that executive can shrug and take the train instead, though to do this you have at least to be able to exit the airport.

The point here, however, is much more about resilience. In the early days of automating airlines and airports, when the legacy systems were still in place and, crucially, staff still remembered how to use them, system failures could be managed by returning to older methods. Today, that option is gone, especially, one presumes, for Terminal 5, the newest of Heathrow's terminals, which is all electronic gates and automated transport (surrounding the inevitable giant shopping mall). We can expect many more messes like this as the Internet of Things takes shape, partly because eternally optimistic technology companies never like to admit something might go wrong with their products and help you think how to recover, and partly because all those "smart" things will add unfathomably multi-dimensional new dependencies that will be hard to understand. Think for the want of a nail writ in weird little gizmos that provide the smallest possible increment of convenience.

***

The big hope with respect to the UK's general election results is that there might be at least a brief respite from the rhetoric demanding yet more surveillance powers (small wonder Guardian columnist John Crace writes about the Maybot). A rational response to the Westminster and London Bridge attacks might have looked something like: The ink is barely dry on the Investigatory Powers Act; we must study what went wrong; learn how best to deploy the new powers; and give them time to bed in. Instead, we got yet more demands for increased intrusions: direct access to cloud data; new demands to break encryption; and more regulation of the internet, though no one suggests banning white vans or closing London's bridges to motorized traffic. More to the point, both attacks, like others before them, were followed by the news that the attackers were known to the security services, which failed to act on the intelligence they had. In testimony he gave Parliament when it was considering the Investigatory Powers Act, NSA whistleblower William Binney warned that inundating analysts with data was counterproductive and would cost lives.

I know it's hard for politicians, particularly mid-campaign, not to immediately reach for a "something" we "must do". But why can't the something be, just once, fixing the actual things that went wrong rather than continuing to demand the same things over and over expecting different results?

Illustrations: Heathrow Terminal 5 (Fingalo Christian Bickel) in happier times; Theresa May.

Posted by Wendy M. Grossman at 5:36 PM

The power of privacy

"It's about power."

Because humans like to count in handfuls, this year's Privacy Law Scholars, the tenth, kicked off with a panel considering a simple question: how have our ideas about privacy changed in the last decade?

"Power has entered the foreground," University of Maryland professor Danielle Citron and the University of Washington's Ryan Calo summed it up. Ten years ago at this event, they said, privacy was about managing personal data. Now, we're talking managing society through information.

Founded by Daniel Solove and Chris Hoofnagle, PLSC's goal is to improve the quality of privacy law scholarship. Lawyers - academic, government, activist, and corporate - assemble here to workshop each other's papers and wrangle over how to adapt existing legal theories and case law to the new technological reality. I can't judge the ten-year comparison; I only remember 2015 and 2016.

Nearly twenty years ago, the science fiction writer David Brin set privacy activists to stun by arguing, in The Transparent Society, that we should embrace openness, rather than privacy. "Privacy," I remember him saying at that year's Computers, Freedom, and Privacy, "only protects the rich and powerful". What Brin proposed instead was reciprocal transparency: the cameras would be trained on government and police as well as the citizenry. In 2014, Dave Eggers imagined a world in which that seemed to be coming true: but in The Circle, the result is "privacy is theft" and those who opt out wind up dead.

Citron and Calo are certainly right about this year's crop of 68 papers. A number consider the impact of algorithmic decision-making - on civil rights, on policing, on extreme vetting. But if you've spent the last 25 years surrounded by privacy activists, the power balance and trust relationship implicit in "the right to be let alone" was clear long ago. The burgeoning presence of CCTV, the battle over the British national ID card and its attendant database (too many net.wars to link, from 2004 to 2010), the public outrage over care.data, the First Crypto Wars of the 1990s...the opposition to all of these displayed consciousness of power and social control. Lawyers, particularly American lawyers, may have been focused on consumer protection, data protection, and codes of practice, but things looked different on the ground.

Among the most prescient was the privacy pioneer Simon Davies, the founder of Privacy International, who has long seen privacy as the bedrock of democracy. In leading the opposition to the national ID card, No2ID and others often warned that ID cards and, especially, their attendant database would change the relationship between the citizen and the state. What is clearer now is that this relationship is changing anyway, as governments seek to impound the data so many of us have trustingly uploaded to servers run by companies like Facebook and Google. Why build a national database when you can subpoena one or demand logins at the border?

Like dramatists, lawyers go where the conflicts are. Ten years ago, the landscape did not include today's data giants. Facebook was a two-year-old upstart, Twitter was barely founded, Apple's first iPhone was a year away, and Google had been a public company for two years. In 2006, it was hard to imagine a time when Google's helpfulness was unloved. Four years later, I wanted a divorce.

Today's biggest repositories of personal data are no longer necessarily governments. GAFA - Google, Amazon, Facebook, Apple - do not have the power to imprison or detain, as governments do, and they build their datasets by seduction rather than compulsion. They and their commercial partners benefit from the greater intimacy of daily contact. Small wonder that Donald Trump saw Facebook as a hidden vector for his campaign.

The early internet's libertarians imagined that governments could not move quickly enough to command cyberspace. Had the internet remained decentralized that might have been true. Now, however, the population of monthly Facebook users is larger than that of China. Small wonder that we're seeing so many governments - and media - complain about multinational corporate tax evasion. Or that we're seeing a string of conflicts over jurisdiction: Microsoft v Ireland, Google v France, and Google v Spain. Apple's stand-off with the FBI ended in a stalemate but is doubtless only the first of many such stare-downs. In these cases, individuals are wheat grains caught between giant, implacable grinding millstones.

Three recent stories illustrate where we're headed in these debates. In Notes from an Emergency, a talk given at re:publica, Maciej Ceglowski discussed the state of the "feudal internet" and asked why Europe had ceded the internet to five giant US companies (Google, Apple, Microsoft, Amazon, Facebook) and suggested how this direction of travel might be reversed. Almost immediately, the news broke that Google intends to "work with publishers" in preparation for adding ad blocking to its Chrome web browser next year, a move that could give Google great control over the content we access and who gets paid for providing it. Finally, Cory Doctorow writes in the Guardian that technology is making the world more unequal - but that technology can fix that by enabling loosely-knit groups to force a redistribution of power at scale. This is what the next few years of privacy debates will look like.

In one sense this may make privacy advocates' jobs easier. It was always hard to explain why "privacy" mattered. Power, people understand.

Illustrations: Ryan Calo and Danielle Citron; Daniel Solove and Chris Hoofnagle; Simon Davies.

Posted by Wendy M. Grossman at 7:42 PM

net.wars

June 30, 2017

Who's the boss?

June 23, 2017

Dead weight

June 16, 2017

The ghost in the machine

June 9, 2017

Foobar

June 2, 2017

The power of privacy