Former Google data scientist Seth Stephens-Davidowitz offers shocking insights gleaned from big data.
Data Revelations
Former Google data scientist Seth Stephens-Davidowitz – now a visiting lecturer at the Wharton School and a New York Times op-ed contributor – uses big data to reveal underlying currents in human behaviors and attitudes.
He delves into darker aspects of US society and reveals how the new age of big data generates intriguing insights about America today. His accessible writing makes a complex subject approachable, including his explanations of how data analysis works, how analysts use data sets and why current advances in analytics are relevant. Building on his groundbreaking political research, Stephens-Davidowitz discusses sociology, psychology, economics, medicine and crime. His startling overview is informative reading for anyone seeking to understand how big data shapes contemporary life and attitudes.
The book’s foreword is by Steven Pinker, the best-selling author of The Better Angels of Our Nature. He calls Stephens-Davidowitz’s work, “An unprecedented peek into people’s psyches.” The Economist found it to be, “A whirlwind tour of the modern human psyche using search data as its guide.”
Google Searches
Stephens-Davidowitz chronicles how analysts plumb Google searches for the truths they reveal. The author’s theme is that analyzing Google searches – looking at what people seek when they think no one knows what they’re searching for – unveils what traditional survey and poll methodologies could never discover. This, Stephens-Davidowitz stresses, means you should not trust conventional polling results.
“The everyday act of typing a word or phrase into a compact, rectangular white box leaves a small trace of truth that, when multiplied by millions, eventually reveals profound realities.” (Seth Stephens-Davidowitz)
For instance, he reports that on President Barack Obama’s first election night, one in every 100 online searches for “Obama” included a racial slur for African Americans or the term “KKK.” That night, searches and new membership inquiries for the white supremacist group Stormfront were 10 times higher than average. Stephens-Davidowitz points out that this profoundly contradicted conventional polling claiming America had become a post-racial society.
Big Data
With people generating 2.5 million trillion bytes of data daily, Stephens-Davidowitz says the problem confronting analysts isn’t finding sufficient data, but applying the correct data to the most relevant questions. He regards data science as often intuitive, but explains that the sheer amount of data and analysis available may reveal counterintuitive results.
Big data’s revelatory power, Stephens-Davidowitz clarifies, derives from showing what people search for and deducing what that indicates about who they are. Data scientists can zoom in on a tiny slice of information or a small segment of the population and carry out quick, controlled tests.
“A danger of the data revolution is that, as more of our life is quantified…better prediction can lead to subtler and more nefarious discrimination.” (Seth Stephens-Davidowitz)
The author notes that analysts can leverage data in areas researchers haven’t always explored extensively, such as racism, child abuse and sexual behavior. But, he cautions strongly, for ethical, legal and data science reasons, researchers should never use analyses to predict individual actions.
Unofficial Information
Stephens-Davidowitz asserts that examples abound in which official information doesn’t align with incontrovertible analyses of Google searches, especially regarding sexuality, pornography, child abuse, gender discrimination and racism. Such analysis wasn’t previously available, and the author provides an unvarnished look at people’s unsettling private searches.
“Big data does not eliminate the need for all the other ways humans have developed over the millennia to understand the world.” (Seth Stephens-Davidowitz)
For example, he discloses that during the 2007 recession, official child abuse and neglect cases were down in the states hardest hit economically, but Google searches about abuse and neglect – and searches for “my mom beat me” – increased dramatically, as did the number of child deaths from neglect or abuse. Stephens-Davidowitz offers a parallel, counterintuitive example: Analysis shows that Donald Trump’s election didn’t fuel a surge in white nationalism; rather, it unearthed biases that previously existed but were concealed.
Fate and Geography
Stephens-Davidowitz explains that economists and social scientists examine small subsets within larger data sets. For instance, he cites experts who analyzed all US tax records since 1966 – some 1.2 billion data points – to study the chances that an American born to parents in the bottom 20% of income could reach the top 20%. They found that people from San Jose, CA, have a 12.9% chance of that rise, compared with a 4.4% chance for someone from Charlotte, NC.
Easy Complexity
As a data scientist, Stephens-Davidowitz easily could have fallen into the experts’ trap of trying to explain unnecessary details of his arcane field. Remarkably, he provides exactly the right amount of inside, technical knowledge to enable readers to understand his revelations, from big-picture findings to granular analysis methods. He is adamant about the need for moral and practical limits on how analysts and their clients should use search results.
Stephens-Davidowitz offers ample evidence that almost everything the broader media assumes and proclaims about darker aspects of American society is, in fact, wrong. His book is a compelling, revelatory, necessary read.
Parallel books that address the current uses of big data include Weapons of Math Destruction by Cathy O’Neil; Dataclysm by Christian Rudder; and Noise by Daniel Kahneman, Olivier Sibony and Cass R. Sunstein.
Warning
Stephens-Davidowitz discusses research into racism, child abuse, sexuality and pornography in stark terms. The book cites specific racial slurs utilized in research.