E.D. Hirsch Jr. is powerless to challenge direct instruction




Why E.D. Hirsch’s particular brand of “science” is powerless to challenge direct instruction

Dr Hugh Morrison (The Queen’s University of Belfast [retired])



The cover page of the Times Educational Supplement (TES) quotes the distinguished American educationalist E D Hirsch’s claim that “there is no scientific basis for direct instruction.”  Given the high regard in which Hirsch is held by educational traditionalists, there will be widespread dismay that one of their own is invoking science to attack a traditional pedagogical technique that can see off any progressivist model when it comes to raising the educational standards of poor children.  (Readers who google the words “direct instruction” will see why this classroom approach is so important to traditionalists.)  Hirsch clearly appreciates that the reasoning set out in his new book may not be received with universal acclaim: “To offend everybody is one of the few prerogatives left to old age.”  The good news for proponents of direct instruction everywhere is that the “science” Hirsch appeals to makes no sense.




The basis of Hirsch’s TES “scientific” attack is the field of cognitive science.  To convince his readers that his book represents “consensus science” he has invited two prominent cognitive scientists, Steven Pinker and Daniel Willingham, to “blurb” the book.  Now cognitive scientists hold that psychological attributes like thought, understanding, memory, meaning, and so on, are internal processes associated with the human brain/mind.  According to cognitive scientists the mind/brain is a self-contained realm where computations are performed on mental “representations.”  However, one of the towering figures of 20th century thought, Ludwig Wittgenstein, regarded this type of thinking as deeply misconceived.  He wrote that: “The confusion and barrenness of psychology is not to be explained by calling it a “young science”: its state is not comparable with that of physics, for instance, in its beginnings. … For in psychology there are experimental methods and conceptual confusion.”  In their 2003 book Philosophical Foundations of Neuroscience Max Bennett and Peter Hacker use Wittgensteinian reasoning to attack cognitive neuroscience’s central claims.  In the remainder of this essay the “consensus science” which informs Hirsch’s claims is undermined using the writings of Bennett and Hacker.  The reader needs neither a background in cognitive psychology nor a grounding in philosophy to appreciate immediately the validity of Wittgenstein’s “conceptual confusion” claim; a healthy dose of common sense will reveal immediately the error at the heart of cognitive science.


While it is clear that thinking would be impossible without a properly functioning brain, the claim that brains can think or that thinking takes place in the brain ought to be supported with scientific evidence.  No such evidence exists.  To mistakenly attribute properties to the brain which are, in fact, properties of the human being is to fall prey to what Bennett and Hacker refer to as the “mereological fallacy.” (Mereology is concerned with part/whole relations and the fallacy goes all the way back to Aristotle.)


“Psychological predicates are predicates that apply essentially to the whole living animal, not to its parts.  It is not the eye (let alone the brain) that sees, but we see with our eyes (and we do not see with our brains, although without a brain functioning normally in respect of the visual system, we would not see).  So, too, it is not the ear that hears, but the animal whose ear it is.  The organs of an animal are part of the animal, and psychological predicates are ascribable to the whole animal, not its constituent parts” (pp. 72-73).


Cognitive scientists often refer to brains “thinking,” “knowing,” “believing,” “deciding,” “seeing an image of a cube,” “reasoning,” “learning” and so on.


“We know what it is for human beings to experience things, to see things, to know or believe things, to make decisions … But do we know what it is for a brain to see … for a brain to have experiences, to know or believe something?  Do we have any conception of what it would be like for a brain to make a decision? … These are all attributes of human beings.  Is it a new discovery that brains also engage in such human activities?” (p. 70)


In the words of Wittgenstein (1953, §281): “Only of a human being and what resembles (behaves like) a living human being can one say: it has sensations; it sees, is blind; hears, is deaf; is conscious or unconscious”.  If the human brain can learn, “This would be astonishing, and we should want to hear more.  We should want to know what the evidence for this remarkable discovery was” (Bennett & Hacker, 2003, p. 71).  It is important to appreciate the depth of the error committed here.  When the claim that the brain can think is called into question, this doesn’t render valid the assertion that brains, in fact, cannot think.


“It is our contention that this application of psychological predicates to the brain makes no sense.  It is not that as a matter of fact brains do not think, hypothesise and decide, see and hear, ask and answer questions; rather, it makes no sense to ascribe such predicates or their negations to the brain.  The brain neither sees, nor is it blind – just as sticks and stones are not awake, but they are not asleep either” (p. 72).



One gets the clear impression from the cognitive science literature that understanding or remembering are inner processes.  Wittgenstein, while accepting that without a properly functioning brain one couldn’t learn, nevertheless teaches that understanding is something attributed to the whole person, and not the brain.  When a teacher asks a pupil what she thinks, the pupil expresses her thoughts in language.  Were it not for the pupil’s language skills, the teacher couldn’t ascribe thoughts to her.  Since brains aren’t language-using creatures, how can it make sense to ascribe thoughts to a brain?



Cognitive scientists may protest that the brain’s ability to make connections while it (the brain) is learning is visible in PET or fMRI images of the brain, but scientific writing should always show restraint:


“But this does not show that the brain is thinking, reflecting or ruminating; it shows that such-and-such parts of a person’s cortex are active when the person is thinking, reflecting or ruminating.  (What one sees on the scan is not the brain thinking – there is no such thing as a brain thinking – nor the person thinking – one can see that whenever one looks at someone sunk in thought, but not by looking at a PET scan – but the computer-generated image of the excitement of cells in his brain that occurs when he is thinking.)” (pp. 83-84).



In Neuroscience & Philosophy: Brain, Mind and Language (2007, p. 143), Bennett and Hacker write:


“But if one wants to see thinking going on, one should look at the Le Penseur (or the surgeon operating or the chess player playing or the debater debating), not his brain.  All his brain can show is which parts of the brain are metabolizing more oxygen than others when the patient in the scanner is thinking.”




In order to see off Hirsch’s ill-founded claims, advocates of direct instruction can appeal to no less a thinker than Aristotle.  Around 350 BC he wrote:

“to say that the soul (psyche) is angry is as if one were to say that the soul weaves or builds.  For it is surely better not to say that the soul pities, learns or thinks, but that a man does these with his soul.”











The flaw in Dweck & Boaler’s Mindset research



The flaw at the heart of Dweck and Boaler’s research, and the real source of psychology’s “reproducibility” problem

Dr Hugh Morrison (The Queen’s University of Belfast [retired])

Email: drhmorrison@gmail.com

The Times Higher Education (28.04.2016-04.05.2016) reported that Professor Carol Dweck’s “ideas on education have swept through schools.  … In a TED talk that has so far garnered more than four million views online, she shares inspiring tales of pupils in tough, inner-city areas who have zoomed ahead after being trained to believe that their talents are not fixed.”  The key word in this quotation is the word “believe”; Professor Dweck is particularly concerned with an individual’s beliefs or “mindsets.”  In the fixed mindset, “Success is about proving you’re smart or talented.  Validating yourself.”  In the growth mindset, on the other hand, success is about “stretching yourself to learn something new.  Developing yourself.”

It is vital to Dweck’s theory – a theory currently attracting millions of pounds of education funding in the UK – that beliefs are entities in the mind/brain of the individual.  Dweck urges teachers to spur their students to success by concentrating on their “mindsets.”  On page 16 of her Mindset book she writes: “You have a choice.  Mindsets are just beliefs.  They’re powerful beliefs, but they’re just something in your mind, and you can change your mind.”  Beliefs/mindsets are properties of the individual.  Her writing is replete with references to children “putting themselves” into one mindset or the other, or being “placed into” the growth mindset by a teacher or by psychologists working on Dweck’s “Brainology” programme.

At the outset, there are two issues which engender disquiet about Dweck’s theory.  First, her writings ignore the extensive literature demonstrating that beliefs are neither mental states, brain states, nor dispositions (see, for example, Peter Hacker’s 2013 book, The intellectual powers: a study of human nature).  Second, and much more importantly, Dweck treats “intelligence” and “ability” as things-in-themselves, to borrow a Kantian conceit.  All of Dweck’s writing assumes that the individual’s intelligence exists independent of the efforts of others to learn about it.  To borrow a phrase from Crispin Wright (2001), intelligence is an “investigation-independent” entity for Dweck.  At the very core of Dweck’s research is the child’s belief that his or her intelligence is either something fixed, or something capable of improvement.  Jo Boaler’s research – which applies Dweck’s thinking to the mathematics classroom – similarly views mathematical ability as a thing-in-itself.

To prepare the reader for what follows, consider the following simple thought experiment which suggests that it is meaningless to refer to mathematical ability as a thing-in-itself, existing independent of the teacher’s efforts to measure it.  In the UK children who find mathematics a struggle take the “Foundation GCSE” mathematics examination at age 16.  The test items on the Foundation mathematics paper aren’t considered mathematically demanding.  Now suppose that Richard, a pupil who has completed the Foundation mathematics curriculum, produces a perfect score in the examination.  It seems sensible to conjecture that if Einstein were alive, he too would produce a perfect score on this mathematics paper.  Since we think of the examination paper as measuring mathematical ability, are we therefore justified in saying that Einstein and Richard have the same mathematical ability?

Paradox results when one erroneously treats mathematical ability as something investigation-independent, entirely divorced from the circumstances of its ascription.  It is clear that Richard’s mathematical achievements are dwarfed by Einstein’s.  To avoid the paradox one would have to mention the measurement circumstances and say: “Richard and Einstein have the same mathematical ability relative to the Foundation mathematics paper.”  When one factors in the measuring instrument, the paradox dissolves away.  This thought experiment illustrates how the reasoning which underpins the research of Dweck and Boaler can be undermined since neither researcher makes mention of the measurement context.

Mathematical ability is not a property of the person.  Rather it is a joint property of the person and the relevant measuring tool.  In short, mathematical ability should be thought of as a property of an interaction.  It is a relational attribute rather than an intrinsic property of the person being measured.  Definitive statements about mathematical ability – such as “my mathematical ability is fixed” – are difficult to justify.  It is only by specifying the measuring instrument in the Foundation mathematics thought experiment that one “communicates unambiguously,” to borrow the quantum physicist Niels Bohr’s words.

It would be unfair to judge Dweck’s ideas by focusing exclusively on her book Mindset, which is clearly intended for the “popular psychology” bookshelves.  Dweck’s 2000 book on “self-theories” is published by Psychology Press.  It focuses on beliefs about intelligence and is directed at the psychological community.  Dweck (2000, p. xi) sets out her programme as follows: “In this book I spell out how people’s beliefs about themselves (their self-theories) can create different psychological worlds, leading them to think, feel and act differently in identical situations.”  “Entity theorists” are individuals who believe that their intelligence is fixed, while “incremental theorists” believe that their intelligence is capable of improvement.  The former are focused on grades while learning preoccupies the latter.  (The parallels with the fixed and growth mindsets are obvious.)

The items which make up the questionnaires Dweck uses to measure children’s self-theories give the clear impression that she takes two things for granted: (i) that intelligence is an intrinsic property of the child, and (ii) that intelligence exists in “amounts.”  For example, the first two items in the theory of intelligence scale require the child to consider the statements: “You have a certain amount of intelligence, and you really can’t do much to change it” and “Your intelligence is something about you that you can’t change very much” (p. 177).

The remainder of this article will centre on statements like “I believe my intelligence/ability is fixed” and “I believe my intelligence/ability can be changed.”  Such statements sit at the core of Dweck’s theory, but I will argue that they are, in fact, meaningless because neither intelligence nor mathematical ability is a thing-in-itself.

The serious conceptual errors in Dweck and Boaler’s research have their origins in psychologists’ insistence on thinking of the concepts they study as “strongly objective” in the same sense as the concepts of Newtonian physics are strongly objective (Gould, 1996).  The attraction of psychology’s Newtonian worldview is obvious in that classical physics holds out the promise of certainty and objectivity.  Psychology’s Newtonian “physics envy” is puzzling given that physicists themselves turned their backs on a Newtonian picture of reality, adopting a quantum theoretical one approximately a century ago.

A clear indication that psychology – in common with quantum theory – is “weakly objective” can be seen in psychology’s century-long quest to define what intelligence is and what memory is.  The explanation for the futility of this quest?  The question “what is intelligence?” and the question “what is memory?” are not meaningful in a weakly objective framework.  Outside the strongly objective framework, intelligence and memory cannot be regarded as things-in-themselves about which one can speak meaningfully while disregarding the measurement context.

A recent initiative to reproduce the findings of 100 important papers in the psychological literature succeeded in only 39% of cases.  Unless psychology responds definitively, the so-called “reproducibility” project will continue to threaten cherished psychological principle after cherished psychological principle, undermining the very discipline itself.  The source of psychology’s current reproducibility ills is its claim to strong objectivity, at a time when science’s most powerful theory – tested to destruction in the laboratory – embraces weak objectivity.

Werner Heisenberg (1958), summarising “the revolution in modern science,” writes: “Since the measuring device has been constructed by the observer … we have to remember that what we observe is not nature in itself but nature exposed to our method of questioning.”  His great mentor, Niels Bohr, drew parallels between quantum theory and psychology for all of his professional life, prompting Harvard physicist Abner Shimony to conjecture that: “quantum concepts can be applied to psychology, but not with as much geometrical structure as in quantum physics.”  Richardson’s (1999, p. 40) writings (in respect of psychological measurement) echo Heisenberg’s: “[W]e find that the IQ testing movement is not merely describing properties of people: rather, the IQ test has largely created them.”

No scientist of note has ever supported psychology in its claim to strong objectivity.  As far back as 1955, the great American physicist Robert Oppenheimer pleaded with psychologists to forsake their adherence to strong objectivity: “It seems to me that the worst of all possible misunderstandings would be that psychology be influenced to model itself after a physics which is not there anymore, which has been quite outdated.”  Gigerenzer (1987, p. 11) explains the context of Oppenheimer’s plea: “quantum theory was indeed discussed in the 1940s and ‘50s within psychology.  However, it was unequivocally rejected as a new ideal of science.”  Psychology shut its ears and (incredibly) continued to teach that the measurement of living beings can be modelled on Newtonian measurement principles developed to model inanimate matter! As physicist Henry Stapp (1993, p. 219) put it: “while psychology has been moving towards the mechanical concepts of nineteenth-century physics, physics itself has moved in just the opposite direction.”

If all this seems fanciful, consider two of psychology’s most studied mental predicates: memory and intelligence.  I choose memory and intelligence because both have been studied intensively for over a century and I do not wish to be accused of making the case for weak objectivity in psychology by selecting a poorly researched area of the discipline.  The psychologists quoted in what follows are all leading researchers in memory and intelligence.

First, let’s consider memory.  Jenkins (1979, p. 431) hinted that psychology should eschew its quest to understand memory as a thing-in-itself: “The memory phenomena that we see depend on what kinds of subject we study, what kinds of acquisition conditions we provide, what kinds of material we choose to work with, and what kinds of criteria measures we obtain.”

Roediger (2008, p. 247) expresses disappointment that after decades of study, the search for an answer to the question “what is memory?” must be abandoned.  In his 2008 review article Relativity of remembering: why the laws of memory vanished, he writes: “The most fundamental principle of learning and memory, perhaps its only sort of general law, is that making any generalisation about memory one must add that ‘it depends.’”  Roediger (2008, p. 247) – anticipating the Reproducibility Project – suggests that whatever claim a psychologist makes about memory, a sceptic can always say: “Very nice work, but your finding depends on many other conditions.  Change those and your effect will go away.”  Measurement of memory seems unavoidably dependent upon the measurement context.  One can detect the same concerns in Tulving’s (2007) paper entitled: Are there 256 different kinds of memory?

Psychology’s reproducibility problem is cast in a very different light when memory is treated as a joint property of the individual and the measurement context; change the measurement context and the meaning of what is measured must change.  Battig (1978) suggests that this dependence on measurement context generalises to all psychological attributes.  Battig’s reasoning goes beyond the conservative claims of the Reproducibility Project and would suggest that psychology faces years of damaging criticism if it continues to treat the attributes it studies as strongly objective.  What physics has long since come to accept as a fundamental truth, Roediger (2008, p. 228) bemoans as a matter of regret: “the great truth of the first 120 years of empirical study of human memory is captured in the phrase ‘it depends.’”

Quantum theory has taught physicists to accept that the question “what is an electron?” makes no sense in a weakly objective framework.  Because the physicist participates in what he or she “sees,” it must be acknowledged that physics is not concerned with standing back from nature and objectively reporting what one “sees.”  Rather, physics is limited to studying humankind’s interaction with nature.  Roediger’s “it depends” response is embraced in quantum theory: in experimental arrangement A, the electron manifests as a particle; in experimental arrangement B it manifests as something very different – a wave.  How are the statements “the electron is a particle” and “the electron is a wave” to be reconciled?  The obvious paradox is avoided by demanding (as Niels Bohr did) that physicists “communicate unambiguously” by always making reference to the measuring instrument.

The statements “the electron is a particle relative to experimental arrangement A” and “the electron is a wave relative to arrangement B” banish the paradox entirely.  “Particle” and “wave” are not intrinsic properties of the electron (leading to an obvious contradiction); instead, they are properties of the electron’s interaction with the measuring tool.  According to Bohr one is communicating ambiguously – for Bohr, unambiguous communication is the hallmark of science – if one speaks of the electron as a thing-in-itself, independent of how it is observed.  All of this has relevance for Dweck’s self-theory notion: because intelligence is being wrongly interpreted as a thing-in-itself, all references to beliefs that “my intelligence is fixed/malleable” are ruled out as meaningless in a weakly objective framework.

One can get the distinct impression from the psychological literature that the typical psychologist sees herself as a passive, objective observer of what she studies.  There is little evidence that psychologists see themselves as participants in what they “see.”  This mistaken adherence to strong objectivity also afflicts the thinking of the Reproducibility Project’s researchers who see themselves as mere observers of a re-run of psychology’s signature experiments.  The very notion of “reproducibility” sits awkwardly with the teachings of Bohr: “In the study of atomic phenomena, however, we are presented with a situation where the repetition of an experiment with the same arrangements may lead to different recordings” (Bohr, 1958-1962, p. 18).  Jammer (1999, p. 234) writes that “Bohr did not regard the world as an objective reality with a given structure … conceptually separable from us as observers. … Thus, there must be limits to the depth of understanding that we can hope to gain of the world, because of our joint role as spectators and actors in the drama of existence.”  Misner, Thorne & Wheeler (1973, p. 12) counsel: “‘Participator’ is the incontrovertible new concept given by quantum mechanics; it strikes down the term ‘observer’ of classical theory, the man who stands safely behind the thick glass wall and watches what goes on without taking part.  It can’t be done, quantum mechanics says.  Even with the lowly electron one must participate before one can give any meaning whatsoever to its position and velocity.”

When attention turns to the construct “intelligence,” the case against psychology’s claim to strong objectivity deepens.  Jensen (1998, p. 46) acknowledges that after a century of attempts, a widely endorsed answer to the question “what is intelligence?” has eluded psychologists: “No other term in psychology has proved harder to define than ‘intelligence.’  Not that psychologists haven’t tried.  Though they have been attempting to define ‘intelligence’ for at least a century, even experts in this field still cannot agree on a definition.  In fact, there are as many different definitions of ‘intelligence’ as there are experts.”  One high profile search for a definition of intelligence was documented in 1986 by Sternberg and Detterman.  In the concluding paragraph of the book, Detterman sums up the experts’ judgements: “For those who expected to read this volume – entitled “What is intelligence?” – and obtain the definitive definition, I apologise.”

It was argued above that a commitment to communicate unambiguously demands that the measurement context be clearly specified if statements such as “Richard and Einstein have the same mathematical ability” are to be justified.  A statement of the mathematics tested in the Foundation examination clarifies that the words “mathematical ability” refer to a very restricted subset of the entire domain of mathematics.  In respect of intelligence Gladwell (2007, p. 95) draws conclusions from the Flynn effect that strike at the heart of Dweck’s research: “For instance, Flynn shows what happens when we recognize that I.Q. is not a freestanding number but a value attached to a specific time and a specific test. … The notion that anyone “has” an I.Q. of a certain number, then, is meaningless unless you know which WISC he took, and when he took it, since there’s a substantial difference between getting a 130 on the WISC(IV) and getting a 130 on the much easier WISC.”

I.Q. is a relational attribute; it is the property of an interaction (between test-taker and test) and not an intrinsic property of the test-taker.  This makes it impossible to define what intelligence is as a thing-in-itself.  For example, I suspect that Dweck is using the word “intelligence” in the restricted sense of intelligence tests.  But Howard Gardner (1983), a highly respected Harvard psychologist, extended intelligence beyond the language and logical-mathematical realms to spatial intelligence, musical intelligence, the use of the body to solve problems or make things, and interpersonal/intrapersonal intelligences.  Indeed, why stop at Gardner’s seven so-called “multiple intelligences”?  Statements such as “I believe my intelligence is fixed” are devoid of any clear meaning without a precise specification of the measurement context.

The distinguished American physicist David Mermin developed Niels Bohr’s counsel that psychology can learn from quantum theory that there are questions which cannot be answered in a weakly objective framework.  It is instructive to quote Mermin’s words in full: “What does it mean for a property to be real?  When you study an object how can you be sure you are learning something about the object itself, and not merely discovering some irrelevant feature of the instruments you used in your study?  This is a question that has plagued generations of psychologists.  When you measure IQ are you learning something about an inherent quality of a person called “intelligence,” or are you merely acquiring information about how the person responds to something you have fancifully called an IQ Test?  Until the advent of the quantum theory in 1925 physicists were above such concerns.  But since then, with the discovery that experiments at the atomic level necessarily disturb the object of investigation, precisely such reservations have been built into the foundations of physics.”

From the physicist’s perspective: “[I]f we set out to measure the momentum, say, of an electron, what we are actually measuring is the ability of an electron to answer questions about momentum.  The electron may, indeed, not have any such property as momentum, in the way we think of it in the everyday world… .  We get experimental results – ‘answers’ – which we interpret as measures of momentum.  But they are only telling us about the ability of electrons to respond to momentum tests, not their real momentum, just as the results of IQ measurements only tell us about the ability of people to respond to IQ tests, not their real intelligence” (Gribbin, 1995, p. 148).

Weak objectivity views measurement as context-dependent.  When the position of an electron is measured, the physicist is not merely checking up on an investigation-independent property inherent in the electron.  What one is really measuring is the interaction between the electron and the measuring instrument.  The measurement outcome is a joint property of the electron and the measurement instrument.  Niels Bohr considered questions which treated the electron as a thing-in-itself (such as “what is an electron?”) as meaningless in a weakly objective framework.  As anyone who has watched popular science programmes will recognise, in quantum theory an electron manifests as a wave in one measurement context, and as a particle in another.  It is therefore meaningless to ask what an electron is as a thing-in-itself, without reference to the measurement context.

An electron is a wave relative to one measurement context, and a particle relative to another.  According to his close colleague, Aage Petersen, Bohr summarised the switch from strong to weak objectivity as follows: “It is wrong to think that the task of physics is to find out how nature is.  Physics concerns what we can say about nature.”  He uses the word “say” because in a weakly objective framework, in order to communicate unambiguously one must provide a description of the measurement apparatus.  Changes in the measurement context therefore have consequences for the very meaning of what is measured.

Bohr labelled this tendency of quantum entities to manifest as wave or particle, depending on the measurement context, as “complementarity.”  He regarded complementarity as the central concept in quantum theory.  Physicists treat these two characteristic manifestations as opposites given that a particle is confined to a tiny region of space, while a wave spreads throughout space.  A strikingly similar concept arising in the study of mind appears in the writings of one of the greats of modern philosophy, Ludwig Wittgenstein.  In the secondary literature derived from Wittgenstein’s later philosophy, it has become known as “first-person/third-person asymmetry.”  This asymmetry applies to intentional predicates in general and to intelligence and mathematical ability in particular.  Incidentally, this analogue of complementarity in respect of mind avoids all of the difficulties associated with both Cartesian dualism and behaviourism.

I will quote in full from Colin McGinn’s book The Character of Mind in order to illustrate that an analogue of complementarity informs fundamental thinking about mind.  McGinn (1996, pp. 6-7) writes: “Mental concepts are unique in that they are ascribed in two, seemingly very different, sorts of circumstances: we apply them to ourselves on the strength of ‘inner’ awareness of our mental states, as when a person judges of himself that he has a headache; and we apply them to others on the strength of their ‘outer’ manifestations in behaviour and/or speech.  These two [opposite] ways of ascribing mental concepts are referred to as first-person and third-person ascriptions. … It would be fine if we could somehow, as theorists, prescind from both perspectives and just contemplate how mental phenomena are, so to say, in themselves; but this is precisely what seems conceptually unfeasible … we seem to need the idea of a single mental reality somehow neutral between the first- and third-person perspectives; the problem is that there does not appear to be any such idea.”

Physics has taught us that a statement such as “the electron’s velocity is constant” is utterly meaningless in a weakly objective framework such as the quantum framework.  Velocity is not an intrinsic property of the electron. Quantum theory rules out all reference to the velocity of an electron without a clear description of the particular measurement context.  One gets nonsense when one omits the context of ascription.  Similarly, investigation-independent statements such as “I believe that my intelligence is fixed” and “I believe my mathematical ability can grow” make little or no sense.  Both of these statements wrongly present mind as a carrier of definite states.  Mind is a carrier of potentiality (just like the microentity) and not a carrier of definite states.  Intelligence and mathematical ability are not inner states which are somehow the source of behaviour.  In respect of the first statement above, Mermin’s reasoning (see above) rejects the notion that intelligence can ever be an intrinsic property of the individual.

But what of mathematical ability as Boaler construes it?  Just as in the case of intelligence, mathematical ability (as a thing-in-itself) is a potentiality rather than a state.  A pupil who has grasped the concept “even number” has the potential to non-collusively modify his or her behaviour so that it is in accord with accepted mathematical practice in respect of the even numbers. Mathematical ability is therefore a joint property of the individual and the fiduciary (to borrow Polanyi’s term) framework within which he or she has been educated.  As with intelligence, mathematical ability is not an intrinsic property of the person.

In conclusion, it is instructive to demonstrate the gulf between Dweck and Boaler’s research and the writings of one of the greatest physicists of all time.  On page 96 of the first volume of his essays Bohr (1934) considers the connection between “the conscious analysis of any concept” and “its immediate application.”  For example, how is the ability to apply the concept “even number” in accord with established mathematical practice connected to the possession of an introspectable mental “object,” namely, the formula which generates the even numbers: U_n = 2n?  How does having the formula in mind (a concept capable of “conscious analysis”, in Bohr’s terms) connect with one’s ability to say or write out the even numbers in accord with established mathematical practice (“immediate application”, in Bohr’s terms)?

The following connection immediately suggests itself: the mental image of the formula is the source of the individual’s ability to write out the even numbers.  Two entirely separate realms are suggested here: on the one hand, the individual’s understanding of the even numbers (to have the formula in mind is to understand the even numbers), and, on the other hand, the application of that understanding in the writing out of the even numbers in accord with established practice.  In this picture the inner world of understanding is divorced entirely from the public realm of application.  From this viewpoint it is tempting to think of the formula-in-mind as representing mathematical understanding as a thing-in-itself.  For Bohr, this strongly objective picture of the connection between “inner” and outer must be invalid.

Bohr considered this Cartesian picture (with its Newtonian, self-standing mental “objects”) as entirely wrong.  To understand Bohr’s reasoning one must turn to the later philosophy of Wittgenstein and to his writings on rule-following in particular.  The error in the strongly objective Cartesian picture, in which mathematical ability in respect of the even numbers is entirely divorced from subsequent application, is that the inner formula cannot be the source of correct application because it has no guidance properties whatever.  The formula as a thing-in-itself simply cannot determine subsequent application.  One can only derive guidance from a mathematical formula by being trained in the practice of using that formula.  However, mathematical practice is a feature of the entirely separate public realm where the formula is to be applied.  Because the formula-in-mind doesn’t have its applications written into it, it cannot guide.

It is the experience of mathematics teachers throughout the world that the appearance of the quadratic formula in the formula sheet made available to pupils taking public examinations is no guarantee that all pupils taking the examination will be able to successfully apply that formula.  Wittgenstein demonstrates that when understanding as a thing-in-itself is separated from application, any attempt to explain how these two realms are connected leads to a destructive “regress of interpretations.”  When one defines “understanding of the even numbers” as having a mental object (the formula U_n = 2n) in mind, divorced entirely from the mathematical practice which gives the formula its life, one descends into confusion (see Oakeshott, 1975).  Even experienced mathematicians often fail to recognise the role played by their long apprenticeship in the discipline.  Malcolm (1963, p. 102) questions the experienced mathematician’s mistaken intuition that formulae in themselves determine their applications: “You would like to think that your understanding of the formula determines in advance the steps to be taken, that when you understood or meant the formula in a certain way ‘your mind as it were flew ahead and took all the steps before you physically arrived at this one or that one’ (Wittgenstein, 1953, §188).”

Wittgenstein and Bohr resolve the regress of interpretations problem by treating the two realms as conceptually inseparable.  This allows the difficulties of both Cartesian dualism and behaviourism to be avoided.  Kenny (2004, p. 49) explains the pivotal role played by Wittgenstein’s notion of criteria: “According to him the connection between mental states and physical [application] is neither one of logical reduction (as in behaviourist theory) nor one of causal connection (as in Cartesian theory).  According to him the physical expression of the mental process is a criterion for that process; that is to say, it is part of the concept of a mental process of a particular kind that it should have a characteristic manifestation.  The criteria by which we attribute states of mind and mental acts, Wittgenstein showed, are bodily states and activities.”

Mind is expressed in behaviour; the teacher cannot help but “see” the child’s understanding of the concept “even number” in the ease with which she applies the rule for generating the even numbers.  This is how the connection is made.  In an appropriate school context, the ability to unhesitatingly write the next 100 even numbers, starting with 2088, for example, serves as a criterion which justifies the teacher in saying that the child understands the concept “even number.”  The Cartesian causal connection between mind, on the one hand, and behaviour, on the other, is replaced by a picture in which mind and behaviour are treated as an indivisible whole.

Malcolm (1963, pp. 101-102) summarises Wittgenstein’s (and Bohr’s) rejection of the notion that mathematical ability is a thing-in-itself, entirely divorced from application: “But the question of whether one understands the rule cannot be divorced from the question of whether one will go on in that one particular way that we call ‘right.’  The correct use is a criterion of understanding. … You would like to think that your understanding of the formula determines in advance the steps to be taken, that when you understood or meant the formula in a certain way “your mind as it were flew ahead and took all the steps before you physically arrived at this one or that one” (§188).  But how you meant it is not independent of how in fact you use it. … How he meant the formula determines his subsequent use of it, only in the sense that the latter is a criterion of how he meant it.”


Battig, W.F. (1978).  Parsimony or psychology?  Presidential address, Rocky Mountain Psychological Association, Denver, CO.

Bohr, N. (1934).  The philosophical writings of Niels Bohr.  Woodbridge: Ox Bow Press.

Bohr, N. (1958-1962).  Essays 1958-1962 on atomic physics and human knowledge.  Woodbridge: Ox Bow Press.

Dweck, C. S. (2000).  Self-theories: their role in motivation, personality, and development.  Philadelphia, PA: Psychology Press.

Dweck, C. S. (2006).  Mindset: The new psychology of success.  New York: Random House.

Gardner, H. (1983).  Frames of mind.  New York: Basic Books.

Gigerenzer, G. (1987).  Probabilistic thinking and the fight against subjectivity.  In L. Kruger, G. Gigerenzer, & M.S. Morgan (Eds.), The probabilistic revolution – Volume 2: Ideas in the sciences (pp. 11-33).  Cambridge, MA: The Massachusetts Institute of Technology Press.

Gladwell, M. (Dec. 17, 2007).  None of the above.  New Yorker magazine.

Gould, S.J. (1996).  The mismeasure of man.  London: Penguin Books.

Gribbin, J. (1995).  Schrödinger’s kittens and the search for reality.  London: Weidenfeld & Nicolson.

Hacker, P.M.S. (1997).  Insight and illusion: Themes in the philosophy of Wittgenstein.  Bristol: Thoemmes Press.

Hacker, P.M.S. (2013).  The intellectual powers: A study of human nature.  Oxford: Wiley Blackwell.

Heisenberg, W. (1958).  Physics and philosophy: the revolution in modern science.  New York: Prometheus Books.

Jammer, M. (1999).  Einstein and religion.  Princeton, NJ: Princeton University Press.

Jenkins, J.J. (1979).  Four points to remember: a tetrahedral model of memory experiments.  In L.S. Cermak, & F.I.M. Craik (Eds.), Levels of processing in human memory (pp. 429-446).  Hillsdale, NJ: Lawrence Erlbaum.

Jensen, A.R. (1998).  The g factor: the science of mental ability.  Westport, CT: Praeger.

Kenny, A. (2004).  The unknown God.  London: Continuum.

Malcolm, N. (1963).  Knowledge and certainty.  Englewood Cliffs, NJ: Prentice-Hall.

McGinn, C. (1996).  The character of mind.  Oxford: Oxford University Press.

Mermin, N.D. (1993).  Lecture given at the British Association Annual Science Festival.  London: British Association for the Advancement of Science.

Misner, C.W., Thorne, K.S., & Wheeler, J.A. (1973).  Gravitation.  San Francisco: Freeman.

Oakeshott, M. (1975).  On human conduct.  Oxford: Clarendon Press.

Oppenheimer, R. (1955, September 4).  Analogy in science.  Paper presented at the 63rd Annual Meeting of the American Psychological Association, San Francisco, CA.

Richardson, K. (1999).  The making of intelligence.  London: Weidenfeld & Nicolson.

Roediger III, H.L. (2008).  Relativity of remembering: Why the laws of memory vanished.  Annual Review of Psychology, 59, 225-254.

Stapp, H.P. (1993).  Mind, matter, and quantum mechanics.  Berlin: Springer-Verlag.

Sternberg, R.J., & Detterman, D.G. (Eds.). (1986).  What is intelligence?  Contemporary viewpoints on its nature and definitions.  Norwood, NJ: Ablex Publishing Corporation.

Tulving, E. (2007).  Are there 256 different kinds of memory?  In J.S. Nairne (Ed.), The foundations of remembering: Essays in honour of Henry L. Roediger III (pp. 39-52).  New York: Psychology Press.

Wittgenstein, L. (1953).  Philosophical investigations.  Oxford: Blackwell.

Wright, C. (2001).  Rails to infinity: Essays on themes from Wittgenstein’s Philosophical Investigations.  Cambridge, MA: Harvard University Press.

Why OECD Pisa cannot be rescued



PISA cannot be rescued by switching IRT model because all IRT modelling is flawed.

Dr Hugh Morrison (The Queen’s University of Belfast [retired]) drhmorrison@gmail.com

On page 33 of the Times Educational Supplement of Friday 25th November 2016, Andreas Schleicher, who oversees PISA, appears to accept my analysis of the shortcomings of the Rasch model which plays a central role in PISA’s league table.  The Rasch model is a “one parameter” Item Response Theory (IRT) model, and Schleicher argues that PISA’s conceptual difficulties can be resolved by abandoning the Rasch model for a two or three parameter model.  However, my criticisms apply to all IRT models, irrespective of the number of parameters.  In this essay I will set out the reasoning behind this claim.
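
For readers unfamiliar with the models under discussion, the following is a minimal sketch of the standard textbook logistic forms of the one, two and three parameter models.  It is offered only as an illustration and is not taken from PISA’s own implementation; the function name and the parameter names (theta, b, a, c) are assumptions chosen for this example.

```python
import math


def irt_probability(theta, b, a=1.0, c=0.0):
    """Probability of a correct response under the logistic three-parameter form.

    theta: examinee ability (a single real number)
    b:     item difficulty
    a:     item discrimination (fixed at 1 in the one-parameter Rasch form)
    c:     pseudo-guessing lower asymptote (0 in the one- and two-parameter forms)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical examinee and item values, chosen only for illustration.
rasch = irt_probability(theta=0.5, b=0.5)                    # one parameter
two_pl = irt_probability(theta=0.5, b=0.5, a=2.0)            # adds discrimination
three_pl = irt_probability(theta=0.5, b=0.5, a=2.0, c=0.2)   # adds guessing
```

Moving from one to two or three parameters changes only how the item is described (b, then a, then c); in every case the examinee is still represented by the single number theta.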


One can find the source of IRT’s difficulty in Niels Bohr’s 1949 paper entitled Discussion with Einstein on Epistemological Problems in Atomic Physics.  Few scientists have made a greater contribution to the study of measurement than the Nobel Laureate and founding father of quantum theory, Niels Bohr.  Given Bohr’s preoccupation with what the scientist can say about aspects of reality that are not visible (electrons, photons, and so on), one can understand his constant references to measurement in psychology.  “Ability” cannot be seen directly; rather, like the microentities that manifest as tracks in particle accelerators, ability manifests in the examinee’s responses to test items.  IRT is concerned with “measuring” something which the measurer cannot experience directly, namely, the ability of the examinee.


IRT relies on a simple inner/outer picture for its models to function.  In IRT the inner (a realm of timeless, unobserved latent variables, or abilities) is treated as independent of the outer (here examinees write or speak responses at moments in time).  This is often referred to as a “reservoir” model in which timeless abilities are treated as the source of the responses given at specific moments in time.


As early as 1929 Bohr rejected this simplistic thinking in strikingly general terms: “Strictly speaking, the conscious analysis of any concept stands in a relation of exclusion to its immediate application.  The necessity of taking recourse to a complementary … mode of description is perhaps most familiar to us from psychological problems.”  Now what did Bohr mean by these words?  Consider, for example, the concept “quadratic.”  It is tempting to adopt a reservoir approach and trace a pupil’s ability to apply that concept in accord with established mathematical practice to his or her having the formula in mind.  The guidance offered by the formula in mind (Bohr’s reference to “conscious analysis”) accounts for the successful “application,” for example, to the solution of specific items on an algebra test.


However, this temptingly simplistic model, in which the formula is in the unobserved mental realm and written or spoken applications of the concept “quadratic” take place in the observed realm, contains a fundamental flaw: the two realms cannot be meaningfully connected.  The “inner” formula (in one realm) gets its guidance properties from human practices (in the other realm).  A formula as a thing-in-itself cannot guide; one has to be trained in the established practice of using the formula before it has guidance properties.  In school mathematics examinations around the world, pupils are routinely issued with a page of formulae relevant to the examination.  Alas, it is the experience of mathematics teachers everywhere that simply having access to the formula as a thing-in-itself offers little or no guidance to the inadequately trained pupil.  The formula located in one realm cannot connect with the applications in the other.


Wittgenstein teaches that no formula, rule, principle, etc. in itself can ever determine a course of action.  The timeless mathematical formula in isolation cannot generate all the complexities of a practice (something which evolves in time); rather, as Michael Oakeshott puts it, a formula is a mere “abridgement” of the practice – the practice is primary, with the formula, rule, precept etc. deriving its “life” from the practice.


Returning to Bohr’s writing, it is instructive to explain his use of the word “complementarity” in respect of psychology and to explain the meaning of the words: “stands in a relation of exclusion.”  Complementarity was the most important concept Bohr bequeathed to physics.  It involves a combination of two mutually exclusive facets.  In order to see its relevance to the validity of IRT modelling, let’s return to the two distinct realms.


We think of the answers to a quadratic equation as being right or wrong (a typical school-level quadratic equation has two distinct answers).  In the realm of application this is indeed the case.  When the examinee is measured, his or her response is pronounced right or wrong dependent upon its relation to established mathematical practice.  However, in the unobserved realm, populated by rules, formulae and precepts (as things-in-themselves), any answer to a quadratic equation is simultaneously right and wrong!


A formula as a thing-in-itself cannot separate what accords with it from what conflicts with it, because there will always exist an interpretation of the formula for which a particular answer is correct, and another interpretation for which the same answer can be shown to conflict with the formula.  Divorced from human practices, the distinction between right and wrong collapses.  (This is a direct consequence of Wittgenstein’s celebrated “private language” argument.)  This explains Bohr’s reference to a “relation of exclusion.”  In simplistic terms, in the unobserved realm, where answers are compared with the formula for solving quadratics, responses are right-and-wrong, while in the observed realm, where answers are compared with the established practice, responses are right-or-wrong.


On this reading, ability has two mutually exclusive facets which cannot meaningfully be separated.  The distinguished Wittgenstein scholar, Peter Hacker, captures this situation as follows: “grasping an explanation of meaning and knowing how to use the word explained are not two independent abilities but two facets of one and the same ability.”  Ability, construed according to Bohr’s complementarity, is indefinite when unobserved and definite when observed.  Moreover, this definite measure is not an intrinsic property of the examinee, but a property of the examinee’s interaction with the measuring tool.


Measurement of ability is not a matter of passively checking up on what already exists – a central tenet of IRT.  Bohr teaches that the measurer effects a radical change from indefinite to definite.  Pace IRT, measurers, in effect, participate in what is measured.  No item response model can accommodate the “jump” from indefinite to definite occasioned by the measurement process.  All IRT models mistakenly treat unmeasured ability as identical to measured ability.  What scientific evidence could possibly be adduced in support of that claim?  No IRT model can represent ability’s two facets because all IRT models report ability as a single real number, construed as an intrinsic property of the measured individual.




The problem of Social Mobility explained




While the mainstream media offer endless analysis and political party talking heads pontificate on the issue of grammar schools and social mobility, the explanation for any reduction of social mobility is made clear by the actions of these Members of Parliament.

It is not the grammar schools which are responsible for restricting social mobility but those influential people who had the benefit of private, independent schooling or attended a grammar school and then deny to others something that improved their own social mobility.

All those illustrated in this post would benefit from reading a copy of Adam Swift’s book, How not to be a hypocrite.







Stroud MP Neil Carmichael, Conservative chairman of the Education Select Committee, told Radio Four’s Westminster Hour:

We have serious issues about social mobility, in particular white working-class young people, and I don’t think that having more grammar schools is going to help them.


Neil Carmichael boarded at St Peter’s, an independent school in York that dates back to AD627 and includes among its alumni Guy Fawkes, cricketer Jonny Bairstow and actor Greg Wise. Today to send your son to board would cost £27,375 a year.

Neil Carmichael’s Wikipedia page makes reference to him being a hypocrite.




John Pugh, Liberal Democrat education spokesperson, condemns grammars. Pugh attended Prescot and Maidstone grammars, and taught in the independent sector at Merchant Taylors’ Boys’ School.





Jeremy Corbyn MP, leader of the Labour Party, was educated at Castle House Independent Preparatory School and Adams Grammar School.

In 1999, the MP split from his second wife in a disagreement over their son’s education. His wife, Claudia Bracchitta, explained:

We had to make the right decision in the interests of our child. We would have been less than human if we had done anything else.



Sir Michael Wilshaw attended Clapham College Grammar School.

The Ofsted Chief recently made a plea to Theresa May, the Prime Minister, to stop grammar schools. He told Nick Ferrari on LBC, Leading Britain’s Conversation:

We need more than the top 10 or 20% of youngsters to do well in our economy and in our society.

Sir Michael Wilshaw has conveniently ignored the Northern Ireland education system which is entirely selective.  Northern Ireland leads the UK in performance in GCSE and A-level examinations and has only one private post-primary school.

Update 13th September via Guido Fawkes


Polly Toynbee today attacks Theresa May’s grammar school plans, arguing that segregation by social class is “irrational” and claiming grammars add to “splits and divisions” in society. She has some front. Polly herself failed her 11-plus and attended the independent Badminton School. Earning £110,000-a-year at the Guardian meant she was able to send two of her children to private school as well. Today Toynbee writes that “inequality is monstrously unfair… it means birth is almost always social destiny”. Some children are evidently more equal than others. Is there a bigger hypocrite in the grammar schools debate?

The DUP have failed to protect UK parity on academic standards





Despite having the power to do so, the DUP Education Minister, Peter Weir, has failed to address his predecessor’s break with United Kingdom parity in respect of academic standards.

Had he acted immediately, instead of buying time for the Northern Ireland Executive, Mr Weir could have adopted the United Kingdom model both in respect of the grading scale for examinations and longstanding concerns regarding coursework or so-called “controlled assessment.”

It appears that the DUP’s Yes Minister equivalent of Jim Hacker has been an easy victim of the green Blob’s civil servants in Rathgael House. The green Blob is Northern Ireland’s devolved version of the UK education establishment.


The text of the News Letter Lead Letter

Are GCSE and GCE exam results between GB and N. Ireland comparable?

The answer regrettably, for the moment, is that it is too early to tell.

The general public must be careful not to assume that Peter Weir, by overturning the effective monopoly John O’Dowd granted CCEA over GCSE and GCE assessment in Northern Ireland, has done anything other than tinker at the edges of the problem bequeathed him by Sinn Fein. He seems to be a ‘Yes Minister’ captured by the green Blob, the entrenched education establishment.

John O’Dowd’s break with UK parity in respect of academic standards goes beyond his expulsion of two of the largest UK awarding bodies, and presents huge technical difficulties in respect of standards.  These could have been solved at a stroke had Peter Weir responded positively by cutting this Gordian knot and adopting the UK model both in respect of its grading scale and its concerns regarding coursework or so-called “controlled assessment.”

This action would have allowed Peter Weir to significantly scale down CCEA’s GCSE/GCE functions. Northern Ireland could simply “borrow” papers from larger awarding bodies and make the substantial savings available to hard-pressed schools.

Given the achievements of our schools in the recent GCSE and AS/A2 results, it is bizarre that they now enter another time of uncertainty while CCEA – who act as their own qualifications regulator – fail to reconcile these two sets of standards.  The technical difficulties are considerable; CCEA’s assessments differ in the role given to controlled assessment.

The public have a right to know precisely what CCEA’s Qualification Regulator, Roger McCune, means when he promises: “We will start work immediately on the technical implementation of the new grading and continue to ensure that our qualifications remain comparable to other similar qualifications elsewhere in the United Kingdom.”

The CCEA Regulator is confusing squares and circles.

Peter Weir stands in danger of being compared to Jim Hacker for his failure to master his opponents within the green Blob and refusal to act decisively during the first 100 days of a new administration.

News Letter, 30-08-16



Why the UK Department for Education is wrong on promoting OECD Pisa




Why PISA ranks are founded on a methodological thought disorder


Dr Hugh Morrison

(The Queen’s University of Belfast [retired])



When psychometricians claimed to be able to measure, they used the term ‘measurement’ not just for political reasons but also for commercial ones. … Those who support scientific research economically, socially and politically have a manifest interest in knowing that the scientists they support work to advance science, not subvert it.  And those whose lives are affected by the application of what are claimed to be ‘scientific findings’ also have an interest in knowing that these ‘findings’ have been seriously investigated and are supported by evidence. (Michell, 2000, p. 660)



This essay is a response to the claim by the Department of Education that: “The OECD is at the forefront of the academic debate regarding item response theory [and] the OECD is using what is acknowledged as the best available methodology [for international comparison studies].”


Item Response Theory plays a pivotal role in the methodology of the PISA international league table.  This essay refutes the claim that item response theory is a settled, well-reasoned approach to educational measurement.  It may well be settled amongst quantitative psychologists, but I doubt if there is a natural scientist on the planet who would accept that one can measure mental attributes in a manner which is independent of the measuring instrument (a central claim of item response theory).  It will be argued below that psychology’s approach to the twin notions of “quantity” and “measurement” has been controversial (and entirely erroneous) since its earliest days.  It will be claimed that the item response methodology, in effect, misuses the two fundamental concepts of quantity and measurement by re-defining them for its own purposes.  In fact, the case will be made that PISA ranks are founded on a “methodological thought disorder” (Michell, 1997).


Given the concerns of such a distinguished statistician as Professor David Spiegelhalter, the Department of Education’s continued endorsement of PISA is difficult to understand.  This essay extends the critique of PISA and item response theory beyond the concerns of Spiegelhalter to the very data from which the statistics are generated.  Frederick Lord (1980, pp. 227-228), the father of modern psychological measurement, warned psychologists that when applied to the individual test-taker, item response theory produces “absurd” and “paradoxical” results.  Given that Lord is one of the architects of item response theory, it is surprising that this admission provoked little or no debate among quantitative psychologists.  Are politicians and the general public aware that item response theory breaks down when applied to the individual?


In order to protect the item response model from damaging criticism, Lord proposed what physicists call a “hidden variables” ensemble model when interpreting the role probability plays in item response theory.  As a consequence item response models are deterministic and draw on Newtonian measurement principles. “Ability” is construed as a measurement-independent “state” of the individual which is the source of the responses made to test items (Borsboom, Mellenbergh, & van Heerden, 2003).  Furthermore, item response theory is incapable of taking account of the fact that the psychologist participates in what he or she observes.  Richardson (1999) writes: “[W]e find that the IQ-testing movement is not merely describing properties of people: rather, the IQ test has largely created them” (p. 40).  The participative nature of psychological enquiry renders the objective Newtonian model inappropriate for psychological measurement.  This prompted Robert Oppenheimer, in his address to the American Psychological Association, to caution: “[I]t seems to me that the worst of all possible misunderstandings would be that psychology be influenced to model itself after a physics which is not there anymore, which has been quite outdated.”


Unlike psychology, Newtonian measurement has very precise definitions of “quantity” and “measurement” which item response theorists simply ignore.  This can have only one interpretation, namely, that the numerals PISA attaches to the education systems of countries aren’t quantities, and that PISA doesn’t therefore “measure” anything, in the everyday sense of that word. I have argued elsewhere that item response theory can escape these criticisms by adopting a quantum theoretical model (in which the notions of “quantity” and “measurement” lose much of their classical transparency).  However, that would involve rejecting one of the central tenets of item response theory, namely, the independence of what is measured from the measuring instrument.  Item response theory has no route out of its conceptual difficulties.


This represents a conundrum for the Department of Education.  In endorsing PISA, the Department is, in effect, supporting a methodology designed to identify shortcomings in the mathematical attainment of pupils, when that methodology itself has serious mathematical shortcomings.


Modern item response theory is founded on a definition of measurement promulgated by Stanley Stevens and addressed in detail below.  By this means, Stevens (1958, p. 384) simply pronounced psychology a quantitative science which supported measurement, ignoring established practice elsewhere in the natural sciences.  Psychology refused to confront Kant’s view that psychology couldn’t be a science because mental predicates couldn’t be quantified.  Wittgenstein’s (1953, p. 232) scathing critique had no impact on quantitative psychology: “The confusion and barrenness of psychology is not to be explained by calling it a “young science”; its state is not comparable with that of physics, for instance, in its beginnings. … For in psychology there are experimental methods and conceptual confusion. … The existence of the experimental method makes us think we have the means of solving the problems which trouble us; though problem and method pass one another by.”


Howard Gardner (2005, p. 86), the prominent Harvard psychologist, looks back in despair to the father of psychology itself, William James:


On his better days William James was a determined optimist, but he harboured his doubts about psychology.  He once declared, “There is no such thing as a science of psychology,” and added “the whole present generation (of psychologists) is predestined to become unreadable old medieval lumber, as soon as the first genuine insights are made.”  I have indicated my belief that, a century later, James’s less optimistic vision has materialised and that it may be time to bury scientific psychology, at least as a single coherent undertaking.


In a follow-up paper to this essay I will demonstrate an alternative approach which solves the measurement problem as Stevens presents it, but in a manner which is perfectly in accord with contemporary thinking in the natural sciences.  None of the seemingly intractable problems which attend item response theory trouble my account of measurement in psychology.

However, my solution renders item response theory conceptually incoherent.


In passing it should be noted that some have sought to conflate my analysis with that of Svend Kreiner, suggesting that my concerns would be assuaged if only PISA could design items which measured equally from country to country.  Nothing could be further from the truth; no adjustment in item properties can repair PISA or item response theory.  No modification of the item response model would address its conceptual difficulties.


The essay draws heavily on the research of Joel Michell (1990, 1997, 1999, 2000, 2008) who has catalogued, with great care, the troubled history of the twin notions of quantity and measurement in psychology.  The following extracts from his writings, in which he accuses quantitative psychologists of subverting science, counter the assertion that item response theory is an appropriate methodology for international comparisons of school systems.


From the early 1900s psychologists have attempted to establish their discipline as a quantitative science.  In proposing quantitative theories they adopted their own special definition of measurement and treated attributes such as cognitive abilities, personality traits and sensory intensities as though they were quantities of the type encountered in the natural sciences.  Alas, Michell (1997) presents a carefully reasoned argument that psychological attributes lack additivity and therefore cannot be quantities in the same way as the attributes of Newtonian physics.  Consequently he concludes: “These observations confirm that psychology, as a discipline, has its own definition of measurement, a definition quite unlike the traditional concept used in the physical sciences” (p. 360).


Boring (1929) points out that the pioneers of psychology quickly came to realise that if psychology was not a quantitative discipline which facilitated measurement, psychologists could not adopt the epithet “scientist” for “there would … have been little of the breath of science in the experimental body, for we hardly recognise a subject as scientific if measurement is not one of its tools” (Michell, 1990, p. 7).


The general definition of measurement accepted by most quantitative psychologists is that formulated by Stevens (1946) which states: “Measurement is the assignment of numerals to objects or events according to rules” (Michell, 1997, p. 360).  It seems that psychologists assign numbers to attributes according to some pre-determined rule and do not consider the necessity of justifying the measurement procedures used so long as the rule is followed.  This rather vague definition distances measurement in psychology from measurement in the natural sciences.  Its near universal acceptance within psychology and the reluctance of psychologists to confirm (via empirical study) the quantitative character of their attributes cast a shadow over all quantitative work in psychology.  Michell (1997, p. 361) sees far-reaching implications for psychology:


If a quantitative scientist (i) believes that measurement consists entirely in making numerical assignments to things according to some rule and (ii) ignores the fact that the measurability of an attribute presumes the contingent … hypothesis that the relevant attribute possesses an additive structure, then that scientist would be predisposed to believe that the invention of appropriate numerical assignment procedures alone produces scientific measurement.


Historically, Fechner (1860) – who coined the word “psychophysics” – is recognised as the father of quantitative psychology.  He considered that the only creditworthy contribution psychology could make to science was through quantitative approaches and he believed that reality was “fundamentally quantitative.”  His work focused on the instrumental procedures of measurement and dismissed any requirement to clarify the quantitative nature of the attribute under consideration.


His understanding of the logic of measurement was fundamentally flawed in that he merely presumed (under some Pythagorean imperative) that his psychological attributes were quantities.  Michell (1997) contends that although occasional criticisms were levelled against quantitative measurement in psychology, in general the approach was not questioned and became part of the methodology of the discipline.  Psychologists simply assumed that when the study of an attribute generated numbers, that attribute was being measured.


The first official detailed investigation of the validity of psychological measurement from beyond its professional ranks was conducted – under the auspices of the British Association for the Advancement of Science – by the Ferguson Committee in 1932.  The non-psychologists on the committee concluded that there was no evidence to suggest that psychological methods measured anything, as the additivity of psychological attributes had not been demonstrated.  Psychology moved to protect its place in the academy at all costs.  Rather than admitting the error identified by the committee and going back to the drawing board, psychologists sought to defend their modus operandi by attempting a redefinition of psychological measurement.  Stevens’ (1958, p. 384) definition that measurement involved “attaching numbers to things” legitimised the measurement practices of psychologists who subsequently were freed from the need to test the quantitative structure of psychological predicates.


Michell (1997, p. 356) declares that presently many psychological researchers are “ignorant with respect to the methods they use.”  This ignorance permeates the logic of their methodological practices in terms of their understanding of the rationale behind the measurement techniques used.  The immutable outcome of this new approach to measurement within psychology is that the natural sciences and psychology have quite different definitions of measurement.


Michell (1997, p. 374) believes that psychology’s failure to face facts constitutes a “methodological thought disorder” which he defines as “the sustained failure to see things as they are under conditions where the relevant facts are evident.”  He points to the influence of an ideological support structure within the discipline which serves to maintain this idiosyncratic approach to measurement.  He asserts that in the light of commonly available evidence, interested empirical psychologists recognise that “Stevens’ definition of measurement is nonsense and the neglect of quantitative structure a serious omission” (Michell, 1997, p. 376).


Despite the writings of Ross (1964) and Rozeboom (1966), for example, Stevens’ definition has been generally accepted as it facilitates psychological measurement by an easily attainable route.  Michell (1997, p. 395) describes psychology’s approach to measurement as “at best speculation and, at worst, a pretence at science.”


[W]e are dealing with a case of thought disorder, rather than one of simple ignorance or error and, in this instance, these states are sustained systemically by the almost universal adherence to Stevens’ definition and the almost total neglect of any other in the relevant methodology textbooks and courses offered to students.  The conclusion that follows from this history, especially that of the last five decades, is that systemic structures within psychology prevent the vast majority of quantitative psychologists from seeing the true nature of scientific measurement, in particular the empirical conditions necessary for measurement.  As a consequence, number-generating procedures are consistently thought of as measurement procedures in the absence of any evidence that the relevant psychological attributes are quantitative.  Hence, within modern psychology a situation exists which is accurately described as systemically sustained methodological thought disorder. (Michell, 1997, p. 376)


To make my case, let me first make two fundamental points which should shock those who believe that the OECD is using what is acknowledged as the best available methodology for international comparisons.  Both of these points should concern the general public and those who support the OECD’s work.  First, the numerals that PISA publishes are not quantities, and second, PISA tables do not measure anything.


To illustrate the degree of freedom afforded to psychological “measurement” by Stevens it is instructive to focus on the numerals in the PISA table.  Could any reasonable person believe in a methodology which claims to summarise the educational system of the United States or China in a single number?  Where is the empirical evidence for this claim?  Three numbers are required to specify even the position of a single dot produced by a pencil on one line of one page of one of the notebooks in the schoolbag of one of the thousands of American children tested by PISA.  The Nobel Laureate, Sir Peter Medawar, refers to such claims as “unnatural science.”  Medawar (1982, p. 10) questions such representations using Philip’s (1974) work on the physics of a particle of soil:


The physical properties and field behaviour of soil depends on particle size and shape, porosity, hydrogen ion concentration, material flora, and water content and hygroscopy.  No single figure can embody itself in a constellation of values of all these variables in any single real instance … psychologists would nevertheless like us to believe that such considerations as these do not apply to them.


Quantitative psychology, since its inception, has modelled itself on the certainty and objectivity of Newtonian mechanics.  The numerals of the PISA tables appear to the man or woman in the street to have all the precision of measurements of length or weight in classical physics.  But, by Newtonian standards, psychological measurement in general, and item response theory in particular, simply have no quantities, and do not “measure,” as that word is normally understood.


How can this audacious claim to “measure” the quality of a continent’s education provision and report it in a single number be justified?  The answer, as has already been pointed out, is to be found in the fact that quantitative psychology has its own unique definition of measurement, which is that “measurement is the business of pinning numbers on things” (Stevens, 1958, p. 384).  With such an all-encompassing definition of measurement, PISA can justify just about any rank order of countries.  But this isn’t measurement as that word is normally understood.


This laissez-faire attitude wasn’t always the case in psychology.  It is clear that, as far back as 1905, psychologists like Titchener recognised that their discipline would have to embrace the established definition of measurement in the natural sciences: “When we measure in any department of natural science, we compare a given magnitude with some conventional unit of the same kind, and determine how many times the unit is contained in the magnitude” (Titchener, 1905, p. xix).  Michell (1999) makes a compelling case that psychologists adopted Stevens’ ultimately meaningless definition of measurement – “according to Stevens’ definition, every psychological attribute is measurable” (Michell, 1999, p. 19) – because they feared that their discipline would be dismissed by the “hard” sciences without the twin notions of quantity and measurement.


The historical record shows that the profession of psychology derived economic and other social advantages from employing the rhetoric of measurement in promoting its services and that the science of psychology, likewise, benefited from supporting the profession in this by endorsing the measurability thesis and Stevens’ definition.  These endorsements happened despite the fact that the issue of the measurability of psychological attributes was rarely investigated scientifically and never resolved. (Michell, 1999, p. 192)


The mathematical symbolism in the next paragraph makes clear the contrast between the complete absence of rigorous measurement criteria in psychology and the onerous demands placed on the classical physicist.



To merit the label “quantity” in Newtonian physics, Hölder’s seven axioms must all be satisfied.  Hölder’s axioms are as follows:


  1. ∀ magnitude pairs, a and b, of Q, one and only one of the following is true:

(i)         a = b and b = a

(ii)        a > b and b < a

(iii)       b > a and a < b


  2. ∀ magnitudes a of Q, ∃ some b in Q such that b < a.


  3. ∀ magnitude pairs, a and b, in Q, ∃ c in Q such that a + b = c.


  4. ∀ magnitude pairs, a and b, in Q, a + b > a and a + b > b.


  5. ∀ magnitude pairs, a and b, in Q, if a < b, ∃ magnitudes, c and d, in Q, such that a + c = b and d + a = b.


  6. ∀ magnitude triplets, a, b and c, in Q, (a + b) + c = a + (b + c).


  7. ∀ pairs of classes, φ and ψ, of magnitudes of Q, such that

(i)         each magnitude of Q belongs to one and only one of φ and ψ

(ii)        neither φ nor ψ is empty, and

(iii)       every magnitude in φ is less than each magnitude in ψ,

∃ a magnitude x in Q such that for every other magnitude, x′, in Q, if x′ < x, then x′ ∈ φ and if x′ > x, then x′ ∈ ψ (depending on the particular case, x may belong to either class).
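
To give a feel for what these conditions demand, the following is a minimal sketch, not part of Michell's or Hölder's presentation, which spot-checks the axioms that can be tested on a finite sample.  It assumes a hypothetical stand-in for Q (a handful of positive rational numbers, such as lengths in metres); the names `sample` and `check_axioms` are illustrative only.  The axiom stating that there is no smallest magnitude and the continuity axiom quantify over infinitely many magnitudes and so cannot be verified this way.

```python
from fractions import Fraction
from itertools import product

# Hypothetical stand-in for a quantity Q: a small sample of positive rationals
# (e.g. lengths in metres). A finite sample can only spot-check the axioms.
sample = [Fraction(n, d) for n, d in product(range(1, 6), range(1, 4))]


def check_axioms(values):
    """Spot-check order, closure, additivity and associativity on a sample."""
    pairs = list(product(values, repeat=2))
    results = {}
    # Exactly one of a = b, a > b, b > a holds for every pair.
    results["trichotomy"] = all(
        [a == b, a > b, b > a].count(True) == 1 for a, b in pairs
    )
    # a + b is again a magnitude of the same kind (a positive rational).
    results["closure"] = all(
        isinstance(a + b, Fraction) and a + b > 0 for a, b in pairs
    )
    # a + b exceeds both a and b.
    results["sum_exceeds_parts"] = all(
        a + b > a and a + b > b for a, b in pairs
    )
    # If a < b, the positive magnitude c = b - a gives a + c = b and c + a = b.
    results["difference_exists"] = all(
        (b - a) > 0 and a + (b - a) == b and (b - a) + a == b
        for a, b in pairs
        if a < b
    )
    # Addition is associative for every triple.
    results["associativity"] = all(
        (a + b) + c == a + (b + c) for a, b, c in product(values, repeat=3)
    )
    return results


report = check_axioms(sample)
print(report)
```

Numbers pass these checks trivially because they were built to.  The empirical question raised in this essay is whether an attribute such as “ability” supports an operation that behaves like the `+` used above, and nothing in a number-assigning rule settles that.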


An essential step in establishing the validity of the concepts “quantity” and “measurement” in item response theory is an empirical analysis centred on Hölder’s conditions.  The reader will search in vain for evidence that quantitative psychologists in general, and item response theorists in particular, subject the predicate “ability” to Hölder’s conditions.

This is because the definition of measurement in psychology is so vague that it frees psychologists of any need to address Hölder’s conditions and permits them, without further ado, to simply accept that the predicates they purport to measure are quantifiable.


Quantitative psychology presumed that the psychological attributes which they aspired to measure were quantitative. … Quantitative attributes are attributes having a quite specific structure.  The issue of whether psychological attributes have that sort of structure is an empirical issue … Despite this, mainstream quantitative psychologists … not only neglected to investigate this issue, they presumed that psychological attributes are quantitative, as if no empirical issue were at stake.  This way of doing quantitative psychology, begun by its founder, Gustav Theodor Fechner, was followed almost universally throughout the discipline and still dominates it. … [I]t involved a defective definition of a fundamental methodological concept, that of measurement. … Its understanding of the concept of measurement is clearly mistaken because it ignores the fact that only quantitative attributes are measurable.  Because this … has persisted within psychology now for more than half a century, this tissue of errors is of special interest. (Michell, 1999, pp. xi – xii)


This essay has sought to challenge the Department of Education’s claim that in founding its methodology on item response theory, PISA is using the best available methodology to rank order countries according to their education provision.  As Sir Peter Medawar makes clear, any methodology which claims to capture the quality of a country’s entire education system in a single number is bound to be suspect.  If my analysis is correct, PISA is engaged in rank-ordering countries according to the mathematical achievements of their young people, using a methodology which itself has little or no mathematical merit.


Item response theorists have identified two broad interpretations of probability in their models: the “stochastic subject” and “repeated sampling” interpretations.  Lord has demonstrated that the former leads to absurd and paradoxical results without ever investigating why this should be the case.  Had such an investigation been initiated, quantitative psychologists would have been confronted with the profound question of the very role probability plays in psychological measurement.  Following a pattern of behaviour all too familiar from Michell’s writings, psychologists simply buried their heads in the sand and, at Lord’s urging, set the stochastic subject interpretation aside and emphasised the repeated sampling approach.


In this way the constitutive nature of irreducible uncertainty in psychology was eschewed for the objectivity of Newtonian physics.  This is reflected in item response theory’s “local hidden variables” ensemble model in which ability is an intrinsic measurement-independent property of the individual and measurement is construed as a process of merely checking up on what pre-exists measurement.  For this to be justified, Hölder’s seven axioms must apply.


In order to justify the labels “quantity” and “measurement” PISA must produce empirical evidence that the attribute it claims to measure satisfies the Hölder axioms.  Absent such evidence, it seems very difficult to justify the Department of Education’s claims that (i) “the OECD is at the forefront of the academic debate regarding item response theory,” and (ii) “the OECD is using what is acknowledged as the best available methodology [for international comparison studies].”







Boring, E.G. (1929).  A history of experimental psychology.  New York: Century.

Borsboom, D., Mellenbergh, G.J., & van Heerden, J. (2003).  The theoretical status of latent variables.  Psychological Review, 110(2), 203-219.

Fechner, G.T. (1860).  Elemente der Psychophysik.  Leipzig: Breitkopf & Härtel.  (English translation by H.E. Adler, Elements of Psychophysics, vol. 1, D.H. Howes & E.G. Boring (Eds.).  New York: Holt, Rinehart & Winston.)

Gardner, H. (2005).  Scientific psychology: Should we bury it or praise it?  In R.J. Sternberg (Ed.), Unity in psychology (pp. 77-90).  Washington DC: American Psychological Association.

Lord, F.M. (1980).  Applications of item response theory to practical testing problems.  Hillsdale, NJ: Lawrence Erlbaum Associates, Publishers.

Medawar, P.B. (1982).  Pluto’s republic.  Oxford: Oxford University Press.

Michell, J. (1990).  An introduction to the logic of psychological measurement.  Hillsdale, NJ: Lawrence Erlbaum Associates, Publishers.

Michell, J. (1997).  Quantitative science and the definition of measurement in psychology.  British Journal of Psychology, 88, 353-385.

Michell, J. (1999).  Measurement in psychology: A critical history of a methodological concept.  Cambridge: Cambridge University Press.

Michell, J. (2000).  Normal science, pathological science and psychometrics.  Theory and Psychology, 10, 639-667.

Michell, J. (2008).  Is psychometrics pathological science? Measurement: Interdisciplinary Research and Perspectives, 6, 7-24.

Oppenheimer, R. (1956).  Analogy in science.  The American Psychologist, 11, 127-135.

Philip, J.R. (1974).  Fifty years progress in soil physics.  Geoderma, 12, 265-280.

Richardson, K. (1999).  The making of intelligence.  London: Weidenfeld & Nicolson.

Ross, S. (1964).  Logical foundations of psychological measurement.  Copenhagen: Munksgaard.

Rozeboom, W.W. (1966).  Scaling theory and the nature of measurement.  Synthese, 16, 170-223.

Stevens, S.S. (1946).  On the theory of scales of measurement.  Science, 103, 677-680.

Stevens, S.S. (1958).  Measurement and man.  Science, 127, 383-389.

Titchener, E.B. (1905).  Experimental psychology: A manual of laboratory practice, vol. 2.  London: Macmillan.

Wittgenstein, L. (1953).  Philosophical Investigations.  Oxford: Blackwell.


The Growth Mindset : Telling Penguins to Flap Harder ?

 We should be rather cautious about adopting the “Growth Mindset” approach as some sort of universal principle. 

Disappointed Idealist

I’m not sure whether this particular blog might lose me friends. It’s not intended to, but I’m going to stumble into an area where I know some people have very strong views. It was prompted by a post-parents’ evening trawl through some blogs, and I came across this blog by Dylan Wiliam :

I’m generally a fan of Dylan Wiliam, although I once tried to joke with him on Twitter, and I’m not sure my humour survived the transition to 140 characters. If I made any impression, it was almost certainly a bad one. Oh well. In any case, it’s not actually his blog on feedback which is at issue here – it’s a good piece, and I agree with the central message about marking/feedback. The bit I want to write about is this :

“Students must understand that they are not born with talent (or lack of it) and…


Warning to parents prior to N. Ireland Assembly Election



The DUP’s Educational Incoherence


At the same time as the DUP has committed itself to a “No Child Left Behind” policy, Peter Weir (Chair of the Education Committee) suggested that the Party might return the Transfer Test to CCEA control.  Has he forgotten that the current AQE test was written to address shortcomings – such as an unacceptably high pupil misclassification rate – in the old CCEA test?


More worrying for the coherence of DUP education policy is the remarkably high proportion of children on free school meals (FSM) qualifying for grammar school places under the current AQE tests.  ALMOST HALF of AQE entrants eligible for FSM are meeting minimal standards for grammar school entry.  Handing the test back to CCEA would see a dramatic reduction in this number.  In short, returning to a CCEA test would be entirely at odds with a policy of leaving no child behind.

Advice to Parents on AQE & GL Transfer Tests 2015/16




Now that the majority of pupils and parents have the results of the test(s) in hand, it is right that time is taken to acknowledge the effort, celebrate and relax.  If only the media would allow it. Instead the annual circus turns up right on cue. Never let facts get in the way of a good story.


The BBCNI Education correspondent, Robbie Meredith, has prepared a package for today’s local news on the transfer test results.  He talks about the Education Minister calling for an end to academic selection; that is not news.  Sinn Fein Education Ministers have been trying to end the existence of grammar schools for sixteen years.  Dr Meredith suggests that non-Catholic grammar schools are mostly controlled; that statement is totally inaccurate.  Finally, he fleetingly mentions the “dualling” schools, ignoring entirely the fact that it is only those schools which require pupils to take multiple tests.  Dr Meredith has been informed of the potential misclassification of pupils using the ‘equating’ schemes cited by the “dualling” schools but will not investigate or report on the problem.


A question from the AQE transfer test in 2015

The schools accepting GL Assessment and/or AQE test results without accepting responsibility for the pressure their unnecessary demands cause are: Lagan College, Belfast (not a grammar school); Glenlola Collegiate, Bangor; Campbell College, Belfast; Antrim Grammar, Antrim; Victoria College, Belfast; St Patrick’s Grammar, Downpatrick; Wellington College, Belfast; Hunterhouse College, Belfast.

Source: Belfast Telegraph Transfer Test Guide published January 25, 2016 Page 19

Most politicians would like to see the end of academic selection but will not admit it to you lest they lose your vote, a problem they are evidently incapable of reconciling. Former DUP First Minister Peter Robinson made much of his determination to deliver a single test. He left office defeated by the resolve of parents and a dedicated group of principled individuals who will not allow political expediency to destroy parental choice.

Enjoy the weekend.