Book Reviews

"The Human Face of Big Data," by Rick Smolan and Jennifer Erwitt, eds.

By Book Review Editor
Wednesday, March 27, 2013, 10:58 PM

Published by Against All Odds Productions (2012) 

Reviewed by Susan Hennessey

Big Data finally has its own coffee table book. From Day in the Life series creators Rick Smolan and Jennifer Erwitt, The Human Face of Big Data is bursting with stories of Big Data modern miracles, promising that even those will soon seem quaint. It’s a visually stunning effort on behalf of Big Data public relations. As awareness of increasingly sophisticated - and potentially invasive - Big Data tools grows, some branding has to fill the alarmist media void. But is this the PR campaign Big Data needs?

In the introduction, Smolan describes himself as a “convert” to the power of Big Data and writes that he intends to start a conversation. This particular conversation happens to be sponsored by EMC, Cisco, and FedEx among others. The book begins and ends with glossy advertisements for EMC. Although this transparency is laudable, it is hardly surprising that a leading provider of data storage and cloud computing has paid for a book that thinks Big Data is pretty great.

This is not to say The Human Face of Big Data isn’t a worthwhile contribution. The stories are fascinating and it makes a convincing case for Big Data as an immutable facet of the future. It is also refreshing to find an examination of Big Data in both commercial and government contexts, rather than the uneven coverage so common in media stories. But if Big Data teaches us anything, it is that context matters, and here it includes financial backers with big stakes in the subject matter. And a few ominous passages and thoughtful essays aside, this is an unabashed celebration of Big Data and its power to save the world.

Much of the book is dedicated to imparting the sheer scope of Big Data:

'Big Data' is a term that describes the accumulation and analysis of information. Lots of information. Oceans of information. Every time someone clicks on something on Amazon, it’s recorded and another drop is added to the ocean. Every time a scanner beeps at the supermarket checkout. Every time a home electricity meter reports a reading. Every time a parcel passes a FedEx checkpoint. Every time a customs officer checks a passport, every time someone posts to Facebook, every time someone does a Google search—the ocean swells.

This is a true coffee table book: readers can open to any given page and find a compelling vignette, illustrated by beautiful photographs and slick Silicon Valley graphics. It’s unusual to confront Big Data in tangible form, and the large, hardbound format makes the scope all the more immediate. Readable blurbs in eye-catching typography beg to be read aloud: "Today one in three children have an online presence (usually in the form of a sonogram) before they are born." Wow. If the point here is to start the conversation, then mission accomplished. But The Human Face of Big Data is perhaps more notable for what it misses than what it contains.

Readers are treated to story after story of Big Data promise. A woman stands proudly tagged by her genetic risk profile. A previously infertile couple happily watches their two-year-old blow out birthday candles. A woman with no episodic memory relives her days via a wearable sensor. Health care costs fall, diseases are cured, soldiers saved on the battlefield, baseball players improve, the next big thing in music discovered, and on and on. Sincerely, Big Data’s capabilities are amazing. Yet no features in the book examine actual or potential abuses or the costs to privacy and civil liberties.

This being said, buried among the eye-catching photo layouts are several thoughtful essays. Jonathan Harris concludes that “Big Data is powerful, but it is ethically neutral; we have to choose how to use it.” And there are worthwhile suggestions. Kate Greene’s piece on self-tracking---the impulse to catalogue our own calories, run times, sleep cycles, etc.---explores the scores of new technologies facilitating such collection. She notes:

These tools will probably never catch on with the wider public unless people are confident that their data are safe, though. ‘The key is giving individuals more control over their data, yet the flexibility to share it when they need to,’ says [Alex Pentland, of the Human Dynamics Laboratory at MIT]. To do this, he suggests, data should be protected by a ‘trust network’ that is not a company or government agency. People might then establish their own personal data vaults for which they define the rules of sharing.

These essays address concepts of data privacy based on control and contextual understanding, distinct from individual data ownership. But even the rare meditations on privacy and restraint fail to articulate what abuse might look like, and how it might change our relationship to our governments and to one another.

One blurb touts Progressive Insurance’s “Snapshot” tracking device, which reports on car location, acceleration, and distance traveled, as saving users 10-15% on auto insurance. While the section mentions that privacy activists “fear” abuses, it never gives voice to the anxiety over how this technology enables a domestic abuser, or what it means for a teen who parks at Planned Parenthood.

Relatively little of the book is dedicated to law enforcement; what is there is interesting, if remarkably uncritical. Marc Goodman introduces the section, entitled “Dark Data,” with an essay that ignores civil liberties concerns, focusing instead on the fear of technology proliferation making its way to criminals and terrorists. (Goodman does get bonus points for discussing crime and Big Data without referencing Minority Report.)

Smolan does not acknowledge the impact his intended conversation itself might have on government use of data. It’s not clear he realizes it, and perhaps that’s not a surprise; he’s a photographer and editor, not a lawyer. Still, much of the relationship between the public and law enforcement is mediated---via the Fourth Amendment---through a reasonable expectation of privacy. Wider social understanding about the power of Big Data meaningfully alters the calculus. Dan Gardner writes about the capacity to “detect patterns of behavior we are not aware of, and those patterns could reveal unconscious thought processes that drive the behavior. In a very real sense, Big Data could know us better than we know ourselves.” Theoretically, collective awareness of this type of capability could leave even unexpressed thoughts constitutionally unprotected and accessible to law enforcement.

The “Dark Data” section holds many of the stories of greatest interest to Lawfare readers. It catalogues well-known Big Data triumphs like Palantir’s Operation Fallen Hero. Its take on the NYPD’s controversial “Domain Awareness System” recounts instant access to the thousands of cameras monitoring Manhattan, paired with automated license plate readers, radiation detectors, relevant 911 calls, arrest records and vast files of individuals’ physical characteristics. The piece on tracking cargo containers using sensors to determine when they are rerouted or opened misses the opportunity to illustrate how Big Data might be simultaneously enhancing both privacy and security. The sensors allow law enforcement officials to monitor the containers without ever looking inside.

This section also highlights the ability to pinpoint the location of a gunshot through acoustic sensors. A cool spread on unmanned aerial vehicles includes stunning pictures of the Air Force’s MQ-9 Reaper and a shot of drone pilots wearing flight suits in what look to be simulated cockpits. India’s digital identification project is featured as well. According to the Indian government, national ID cards logging digital photographs, iris scans, and fingerprints of the nation’s 1.2 billion inhabitants will curb identity theft, corruption, voter fraud, and even poverty. Again, no mention is made of the potential for abuse.

One interesting piece is dedicated to Wikileaks as a data set to generate insight into the war in Afghanistan. Data scientists have used the classified* information to show an expanding conflict zone, increasing civilian deaths, and higher insurgent reliance on IEDs. In an exploration of Big Data, this seems like an obvious place to address ethical use of data sets instead of a tacit endorsement that more data is always better, regardless of the source. Instead, the section is judgment neutral at best, and ends by praising the data dump’s potential to “change the direction of future conflicts.”

Turn the page, and two facing stories juxtapose the Wikileaks data bunker in Sweden with the FBI filing room circa 1943. There is passing acknowledgment that Hoover used these files “to gather secrets about---and power over---U.S. presidents and other public figures” but no actual discussion about one of the most stunning abuses of personal data in U.S. history. The contrast of Assange and Hoover is not meant to illustrate the danger of Big Data in the hands of bad actors, but rather to demonstrate the “irony” that the primitive nature of the FBI filing reinforced security; it would be nearly impossible to remove all the paper information and even then would require years to analyze.

Interspersed with technologies that verify malaria drug integrity in Ghana and aim to cure polio in Nigeria are some interesting tidbits for the national security junkie. There is a story on China’s effort to develop a new satellite-based navigation system, called Compass or Beidou-2. Scheduled for completion in 2020, Compass could be the world’s most accurate navigation system, eclipsing U.S. G.P.S. technology. According to Brian Weeden of the Secure World Foundation, “[p]recision positioning and navigation is crucial to military power. China didn’t want to be reliant on either the Americans or Russians for those critical services.” However, the system is designed to be interoperable with U.S., Russian, and European systems, and purports to be a free, highly accurate civilian service with vast economic, and data, potential.

Among all the praise for the Big Data revolution, there is essentially no examination of cyber-security, of what U.S. persons’ data looks like in the hands of foreign companies, or of how increased dependence means increased vulnerabilities. Crowdsourcing ancient text translation is amazing, using Twitter to dispatch Marines to emergency events following the Haitian earthquake awe-inspiring. But the conversation Smolan seeks has to be about more than how impressive it all is, and The Human Face of Big Data doesn’t ask hard questions. True, it gives provocative information: did you know your credit card company can predict divorce based in part on identifying infidelity? But that particular tidbit is paragraphs below information about wedding dress advertising, next to a picture of an elated bride.

Beneath the sentiment of healing the sick and empowering mankind, The Human Face of Big Data raises the question: for all its promise, what is the driving force behind Big Data? It takes a careful read, but the answer is still there: money. The technologies to eliminate malaria are byproducts of those designed to generate advertising revenue. Sean Gourley of Quid questions our ability to be skeptical of software that isn’t acting in the user’s interest:

What would happen if Google Maps knew you were looking for a new car. Maybe when you looked up directions to a party, it would suggest a route that passed right by a dealership. ‘[Has] it got your interest at heart, or has it got making money from ads at heart?’

Jonathan Harris, who authors the best and final essay, notes that through Big Data technology a few dozen people, predominantly young males, living in San Francisco and New York, disproportionately impact our species. Harris criticizes the model of user as commodity. Whatever the underlying service, software aims to draw the user back to the screen again and again, so their attention can then be sold to advertisers. Facebook expat Jeff Hammerbacher puts it more bluntly: “The best minds of my generation are thinking about how to make people click ads. That sucks.”

Update 10:45 am: An alert reader notes that while the Wikileaks material is certainly publicly available, it is, in fact, still classified.

(Susan Hennessey is a Lawfare staff contributor and third-year student at Harvard Law School, where she is an editor of the Harvard National Security Journal and a law clerk at a laboratory that develops advanced defense technologies.)