Why the “psychographic” techniques which underpin the Big Data ideas of Michal Kosinski and Alexander Nix have no scientific foundation.
Dr Hugh Morrison (The Queen’s University of Belfast [retired])
“There is a general disease of thinking which always looks for (and finds) what would be called a mental state from which all our acts spring as from a reservoir.” Ludwig Wittgenstein
The data mining techniques of the software company Cambridge Analytica have come in for particular scrutiny in the aftermath of the 2016 American presidential election. In recent days, two names in particular have become associated with using the Facebook “likes” of individuals to gain access to their personality: Michal Kosinski, a past Operations Director at the Cambridge University Psychometrics Centre, and Alexander Nix of the company Cambridge Analytica. While there is clearly much that divides these two men, they appear to have at least one belief in common, namely, that “Big Data” techniques make it possible to access the personality profile of an individual from his or her Facebook likes. I will make the case that this claim has no scientific merit. “Psychographics,” as it has come to be known, has no basis whatever in science.
It is important to investigate the claims made by those who use data mining to infer personality. Aside from the eye-watering sums of money involved, it is important that the general public be aware of the profound limitations of this use of Big Data. The June 16, 2017 issue of Newsweek captures the degree to which we should all be concerned: “Big Data, artificial intelligence and algorithms designed and manipulated by strategists like the folks at Cambridge [Analytica] have turned our world into a Panopticon, the 19th century circular prison designed so that guards, without moving, could observe every inmate every minute of every day.”
The central claim advanced by those who advocate the use of Big Data to uncover information about personality is that information derived from an individual’s Facebook likes can be used to infer something about that individual’s personality. This claim has no scientific basis for a very simple reason: personality is not a property of the individual which can be represented numerically. Personality is a joint property of the individual and the context in which it is manifest. Personality isn’t a trait which the individual somehow carries within themselves from context to context. Rather, personality varies with context: a child may be extrovert at home, but quiet and reserved in the classroom. She may be extrovert in the company of the children who live next door, but introvert when interacting with strangers in unfamiliar settings.
The research documented in Malcolm Gladwell’s best-seller The Tipping Point leaves one in very little doubt that – pace Kosinski and Nix – personality cannot be a trait, ascribable to the individual, and amenable to quantification. On page 186 of his book Surprise, Uncertainty, and Mental Structures, the distinguished Harvard psychologist Jerome Kagan writes: “Some men are loyal to their wives and affectionate with their children but disloyal and hostile in their relations with colleagues at work.” On page 188 he cautions: “[C]onclusions about personality that are based only on questionnaires or interviews have a meaning that is as limited as Ptolemy’s conclusions about the cosmos based on the reports of observers staring at the sky without telescopes.” Every undergraduate physicist learns (from the teaching of Niels Bohr) that the mind is not a carrier of definite states which determine behaviour, but a carrier of potentiality which cannot be represented by real numbers.
In short, in order to communicate unambiguously (the hallmark of science) one must describe the context in which a particular facet of personality is manifest. Data, no matter how “big”, is powerless to capture the complex interactions that make up human situations; we must rely on language in any attempt to represent context. On page 135 of Werner Heisenberg’s book Physics and Beyond, the following advice appears: “For if we want to say anything at all about nature – and what else does science try to do? – we must somehow pass from mathematical to everyday language.” The lesson for psychologists who use questionnaires to measure personality is that it is meaningful to speak about someone’s personality only if one details the questionnaire; personality is a relational attribute rather than an attribute intrinsic to the person.
In an article published in Proceedings of the National Academy of Sciences, Van Bavel, Mende-Siedlecki, Brady and Reinero demonstrate that the centrality of context (or environment) is something well-known to all undergraduate psychologists. “Indeed, the insight that behaviour is a function of both the person and the environment – elegantly captured by Lewin’s equation: B = f(P, E) – has shaped the direction of social psychological research for more than half a century. During that time, psychologists and other social scientists have paid considerable attention to the influence of context on the individual and have found extensive evidence that contextual factors alter human behaviour.” If personality is a joint property of the person and the context in which it is manifest, then unambiguous communication demands that a description of the context must be integral to any attempt to represent personality.
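The logic of Lewin’s equation can be illustrated with a toy sketch. It is only an illustration under stated assumptions: the labels simply restate the example of the child given earlier in this essay, they are not measured data, and the function names are hypothetical.

```python
# Toy illustration of Lewin's equation B = f(P, E).
# ASSUMPTION: these labels restate the essay's own example of the child;
# they are illustrative, not measured data.
OBSERVED = {
    ("child", "home"): "extrovert",
    ("child", "classroom"): "reserved",
    ("child", "with neighbours"): "extrovert",
    ("child", "with strangers"): "introvert",
}


def behaviour(person, environment):
    """Look up behaviour from BOTH the person and the environment."""
    return OBSERVED[(person, environment)]


def person_only_candidates(person):
    """Collect every label a context-free description would have to choose between."""
    return {label for (p, _env), label in OBSERVED.items() if p == person}


if __name__ == "__main__":
    print(behaviour("child", "home"))
    print(behaviour("child", "classroom"))
    # More than one candidate means no single context-free label fits the child.
    print(sorted(person_only_candidates("child")))
```

In this sketch the lookup succeeds only when the environment is supplied. Dropping the environment leaves several competing labels for the same child, which is the sense in which a description of context must accompany any statement about personality.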
One appreciates the full incoherence of psychographics only when one notes the intellectual standing of those whose writings oppose the thinking of Kosinski and Nix. Three thinkers who stand out among those who argue that psychological attributes cannot be separated from the context in which they are manifest are the Nobel laureate Herbert A. Simon and two of the 20th century’s greatest intellectuals: the father of quantum theory, Niels Bohr, and the philosopher Ludwig Wittgenstein.
All three reject the notion that one can ignore context and treat behaviour as wholly analysable in terms of traits and inner processes (and therefore quantifiable). Indeed, psychology itself has a name for the error at the heart of psychographics, a name familiar to all undergraduate psychologists. Gerd Gigerenzer of the Max Planck Institute writes: “The tendency to explain behaviour internally without analysing the environment is known as the ‘fundamental attribution error.’”
First, Herbert Simon uses a scissors metaphor to indicate the degree to which a psychological attribute cannot be disentangled from the context in which it is manifest. Simon writes: “Human rational behaviour is shaped by a scissors whose blades are the structure of the environment and the computational capabilities of the actor.”
Secondly, Niels Bohr – in his Discussion with Einstein on Epistemological Problems in Atomic Physics – uses quantum complementarity to argue that first-person ascriptions [the contribution of the individual] and third-person ascriptions [the contribution of the environment] of psychological attributes form an “indivisible whole.”
Finally, on page 143 of his Blue and Brown Books, Wittgenstein highlights the error at the heart of the psychographics project: “There is a general disease of thinking which always looks for (and finds) what would be called a mental state from which all our acts spring as from a reservoir.”