Time for Ofqual to take back control

The GCSE Grade 5 controversy: why it’s time for Ofqual to “take back control”

Dr Hugh Morrison, The Queen’s University of Belfast (retired)  drhmorrison@gmail.com

 The GCSE Grade 5 Controversy

A highly unusual feature of the new numbered GCSE grade scale is the claim that the new grade 5 will somehow reflect the standards of educational jurisdictions ranked near the top of international league tables.  Given the controversy surrounding such tables it will be possible, for the first time, to raise profound technical concerns about a particular grade on the GCSE grade scale.  Moreover, it will be difficult to make the case that grade 5 has any technical merit if Pisa ranks have any role in its determination.  Pisa is the acronym for the Paris-based “Programme for International Student Assessment” and a glance at the Times Educational Supplement of 26.07.2013 will reveal that Pisa league tables are fraught with technical difficulties.

 Concerns about Item Response Theory

The methodology which underpins Pisa ranks is called “Item Response Theory” (IRT).  IRT software claims to estimate the ability of individuals based on their responses to test items.  However, while the claim that ability is some inner state or “trait” of the individual from which his or her test responses flow – the so-called reservoir model – is central to IRT, the claim is rarely supported by evidence.  There is very good reason to reject the inner state approach.  Consider, for example, the child solving simple problems in arithmetic.  To explain this everyday behaviour it transpires that one must invoke inner states with talismanic properties in that the state must be timeless, infinite and future-anticipating!

Great caution is needed when using the word “ability” – while test evidence can justify us in saying that an individual has ability, that same evidence can never be used to justify the claim that the word “ability” refers to an inner (quantifiable) state of that individual.  Ability is not an intrinsic property of an individual; rather, it is a property of the interaction between individual and test items.  The individual’s responses to the test items are an inseparable part of that ability.  Indeed, divorced from a measurement context, ability is indefinite.  Individuals have definite ability only relative to a measurement context; even here it is incorrect to suggest that individuals have a quantifiable entity called “ability.”  Abandoning IRT’s appealingly simple picture of ability as an inner (quantifiable) state that individuals carry about with them renders IRT untenable.  Ability is a two-faceted entity governed by first-person/third-person asymmetry: while we ascribe ability to ourselves without criteria, criteria are an essential prerequisite when ascribing ability to others.

The picture central to all IRT modelling – that ability is something intrinsic to the individual which is definite (and quantifiable) at all times – is rejected by the Nobel laureate Herbert Simon and by two giants of 20th century thought, physicist Niels Bohr and philosopher Ludwig Wittgenstein.  Indeed, Wittgenstein described the rationale which underpins IRT modelling – that test responses can be explained by appealing to inner processes – as a “general disease of thinking.”  Psychologists have a name for this error; Gerd Gigerenzer of the Max Planck Institute writes: “The tendency to explain behaviour internally without analysing the environment is known as the ‘Fundamental Attribution Error’.”

The criticisms levelled by Bohr and Wittgenstein are particularly damaging because IRT modellers construe ability as something inner which can be measured.  Few philosophers can match Wittgenstein’s contribution to our understanding of what can be said about the “inner”; and few scientists can match Bohr’s contribution to our understanding of measurement, particularly when the object of that measurement lies beyond direct experience.  (Bohr is listed among the top ten physicists of all time in recognition of his research on the quantum measurement problem.)  Both Bohr and Wittgenstein are concerned with the same fundamental question: how can one communicate unambiguously about aspects of reality which are beyond the direct experience of the measurer?  Just as Bohr rejected entirely the existence of definite states within the atom, Wittgenstein also rejected any claim to inner mental states; potentiality replaces actuality for both men.

For the duration of his professional life, Bohr maintained that quantum attributes have a “deep going” relation to psychological attributes in that neither can be represented as quantifiable states hidden in some inner realm.  We will always be limited to talking about ability; we will never be able to answer the question “what is ability?” let alone quantify someone’s ability.  Bohr believed that “Our task is not to penetrate into the essence of things, the meaning of which we don’t know anyway, but rather to develop concepts which allow us to talk in a productive way about phenomena in nature. …The task of physics is not to find out how nature is, but to find out what we can say about nature. … For if we want to say anything at all about nature – and what else does science try to do? – we must somehow pass from mathematical to everyday language” [italics added].

Given that IRT software is designed to measure ability, it may surprise readers that the claim that ability can be construed as a quantifiable inner state is rarely defended in IRT textbooks and journal articles.  In their article “Five decades of item response modelling,” Goldstein and Wood trace the beginnings of IRT to a paper written in 1943 by Derrick Lawley.  They note: “Lawley, a statistician, was not concerned with unpacking what ‘ability’ might mean.”  Little has changed in the interim.

Why Ofqual must protect GCSE pupils from the OECD’s “sophisticated processes”

These profound conceptual difficulties with the model which underpins Pisa rankings must surely undermine the OECD’s claim that one can rank order countries for the quality of their education systems.  In a detailed analysis of the 2006 Pisa rankings, the eminent statistician Svend Kreiner revealed that “Most people don’t know that half of the students taking part in Pisa [2006] do not respond to any reading item at all.  Despite that, Pisa assigns reading scores to these children.”  Given such revelations, why are governments, the media and the general public not more sceptical about Pisa rankings?  Kreiner offers the following explanation: “One of the problems that everybody has with Pisa is that they don’t want to discuss things with people criticising or asking questions concerning the results.  They didn’t want to talk with me at all.  I am sure it is because they can’t defend themselves.”

Given the depth of the conceptual problems which afflict IRT and, as a consequence, Pisa rankings, it seems to me foolhardy in the extreme to predicate the new GCSE grade 5 on Pisa rankings.  Ofqual have announced that grade 5 will be “broadly in line with what the best available evidence tells us is the average PISA performance in countries such as Finland, Canada, the Netherlands and Switzerland.”  In addition to Ofqual, the Department for Education and Tim Oates, director of research at Cambridge Assessment, appear to endorse a role for Pisa in UK public examinations.  The Department for Education have produced a report – PISA 2009 Study: How big is the gap? – which creates the impression that “gaps” between England and high performing Pisa countries can be represented on a GCSE grade scale designed for reporting achievement rather than ability.

Finally, the director of research at Cambridge Assessment asserts: “I am more optimistic … than most other analysts, I don’t see too many problems in these kinds of international comparisons.”  Indeed, Mr Oates believes that UK assessment has much to learn from involving Pisa staff directly in solving the grade 5 problem: “If we want to do it formally then we ought to have discussions with OECD. … OECD have some pretty sophisticated processes of equating tests which contain different items in different national settings.”  There is an immediate problem with this claim.  Since the psychometric definition of equity (a standard requirement when equating tests) begins with the words: “for every group of examinees of identical ability …,” equity itself is founded on the erroneous assumption that ability can be quantified.

For the first time in the history of public examinations in the UK the technical fidelity of a GCSE grade will be linked to Pisa methodology.  Given the concerns surrounding IRT, is it not time for Ofqual to distance itself from the claim that the grade 5 standard is somehow invested with properties which allow it to track international standards in the upper reaches of the Pisa league tables?  The recent introduction of the rather vague term “strong pass” smacks of desperation; couldn’t a grade 6 also be deemed a strong pass?  Why not stop digging, sever the link with Pisa, and simply interpret grade 5 as nothing more than the grade representing a standard somewhere between grade 4 and grade 6?

The new GCSE grade 5: what Ofqual refuse to tell the public

The new GCSE grade 5 and the Fundamental Attribution Error

Dr Hugh Morrison, The Queen’s University of Belfast (retired)  drhmorrison@gmail.com

Holding the new GCSE grade 5 up to ridicule

A highly unusual feature of the new numbered GCSE grade scale is the claim that the new grade 5 will somehow reflect the standards of educational jurisdictions ranked near the top of international league tables.  Given the controversy surrounding such tables it will be possible, for the first time, to raise profound technical concerns about a particular grade on the GCSE grade scale.  Moreover, it will be impossible to make the case that grade 5 has any technical merit if (as seems likely) Pisa ranks have any role in its determination.  Pisa is the acronym for the OECD’s “Programme for International Student Assessment” and a glance at the Times Educational Supplement of 26.07.2013 will reveal that Pisa league tables are fraught with technical difficulties.

The distinguished statistician Svend Kreiner, of the University of Copenhagen, who has carried out a detailed investigation of the Pisa model, concluded: “the best we can say about Pisa rankings is that they are useless.”  The British mathematician Tony Gardner, of Birmingham University, has referred to Pisa claims as “snake oil.”  In the Times Educational Supplement piece, I argued that the model used by Pisa is flawed because, in order to explain a child’s ability to do simple arithmetic, for example, one must posit exotic inner states which are infinite, timeless and which somehow anticipate every arithmetical problem the child will subsequently encounter in a lifetime.  These impossible inner states arise because Pisa models treat “ability” as a state rather than a capacity.  How has Pisa managed to survive all these years given such damaging and unequivocal criticism?  Its secret is that it appears to enjoy a relationship with Government and the media which, in effect, insulates it from its critics.  Kreiner writes: “One of the problems that everybody has with Pisa is that they don’t want to discuss things with people criticising or asking questions concerning the results.  They didn’t want to talk to me at all.  I am sure it is because they can’t defend themselves.”

For the first time in the history of British examinations, a simple argument that anyone can understand can be deployed to undermine the technical fidelity of a particular examination grade.  Mixing the measurement of achievement with the measurement of ability exposes the new grade 5 to ridicule.  If grade 5 is to be predicated on Pisa rankings then profound validity shortcomings in respect of the rankings will have implications for grade 5.  Consider the arrangement of balls on a snooker table before a game begins.  The configuration of the 22 balls requires 44 numbers (two coordinates per ball, with the front and side rails serving as coordinate axes).  While the arrangement of balls on a snooker table cannot be summarised in fewer than 44 numbers, Pisa claims to represent the state of mathematics education in the USA – with its almost 100,000 schools – in a single number.  It would seem that what cannot be achieved for the location of simple little resin balls is nevertheless possible when the entity being “measured” is the mathematical attainment of millions of complex, intentional beings.

The Nobel laureate Sir Peter Medawar labelled such claims “unnatural science.”  Citing the research of John R. Philip, he notes that the properties of a simple particle of soil cannot be captured in a single number: “the physical properties and field behaviour of soil depend on particle size and shape, porosity, hydrogen ion concentration, material flora and water content and hygroscopy.  No single figure can embody itself in a constellation of values of all these variables.”  Once again, what is impossible for a tiny particle of soil taken from the shoe of one of the many millions of pupils who attend school in America, is nevertheless possible when the entity being “measured” is the combined mathematical attainment of a continent’s schoolchildren.

The problem with the new GCSE grade 5: a detailed critique

The OECD has now taken the bold step of analysing measures of “happiness,” “well-being” and “anxiety” for individual countries (see, for example, ‘New Pisa happiness table,’ Times Educational Supplement 19.04.2017).  In these tables “life satisfaction,” for example, is measured to two-decimal-place accuracy.  This raises the question, “Can complex constructs such as happiness or anxiety really be represented by a number such as 7.26?”  For two giants of 20th century thought – the philosopher Ludwig Wittgenstein and the father of quantum physics, Niels Bohr – the answer to this question is an unequivocal “no.”  The fundamental flaws in Pisa’s approach to measuring happiness will serve to illustrate the folly of linking a particular GCSE grade to Pisa methodology.

Once again, surely common sense itself dictates that constructs such as happiness, anxiety and well-being cannot be captured in a single number?  In his book Three Seductive Ideas, the Harvard psychologist Jerome Kagan draws on the writings of Bohr and Wittgenstein to argue that measures of constructs such as happiness cannot be attributed to individuals and cannot be represented as numbers because such measures are context-dependent.  He writes: “The first premise is that the unit of analysis … must be a person in a context, rather than an isolated characteristic of that person.”  Wittgenstein and Bohr (independently) arrived at the conclusion that what is measured cannot be separated from the measurement context.  It follows that when an individual’s happiness is being measured, a complete description of the measuring tool must appear in the measurement statement because the measuring tool helps define what the measurer means by the word happiness.

Kagan rejects the practice of reporting the measurement of complex psychological constructs using numbers: “The contrasting view, held by Whitehead [co-author of the Principia Mathematica] and Wittgenstein, insists that every description should refer to … the circumstances of the observation.”  The reason for including a description of the measuring instrument isn’t difficult to see.  Kagan points out that “Most investigators who study ‘anxiety’ or ‘fear’ use answers on a standard questionnaire or responses to an interview to decide which of their subjects are anxious or fearful.  A smaller number of scientists ask close friends or relatives of each subject to evaluate how anxious the person is.  A still smaller group measures the heart rate, blood pressure, galvanic skin response, or salivary level of subjects.”  Alas, all these methodologies yield very different “measures” of the anxiety or fear of the subject.

Kagan therefore argues that a change in the measuring tool means a change in the reported measurement; one must include a description of the measuring instrument in order to “communicate unambiguously,” as Bohr expressed it.  One can never simply write “happiness = 4.29” (as in Pisa tables) because there is no such thing as a context-independent measure of happiness.  We have no idea what happiness is as a thing-in-itself.  Kagan notes the implications for psychologists of the measurement principles set out by Niels Bohr: “Modern physicists appreciate that light can behave as a wave or a particle depending on the method of measurement.  But some contemporary psychologists write as if that maxim did not apply to consciousness, intelligence, or fear.”

According to Bohr, when one reports psychological measurements, the requirement to describe the measurement situation means that ordinary language must replace numbers.  This invalidates the entire Pisa project.  Werner Heisenberg summarised his mentor’s teachings: “If we want to say anything at all about nature – and what else does science try to do – we must pass from mathematical to everyday language.”  The consequences of accepting this counsel are clear; one cannot rank order descriptions.

(To simplify matters somewhat, while numbers function perfectly well when observing the motion of a tennis ball or a star, the psychologist cannot observe directly the pupil’s happiness.  Bohr argued that there is “a deep-going analogy” between measurement in quantum physics and measurement in psychology because both are concerned with measuring constructs which transcend the limits of ordinary experience.  According to Bohr, because the physicist, like the psychologist (in respect of attempts to measure happiness), cannot observe electrons and photons directly, “physics concerns what we can say about nature,” and numbers, therefore, must give way to ordinary language.)

The arguments advanced above apply, without modification, to Pisa’s core activity of measuring pupil ability.  A simple thought experiment (first reported in the Times Educational Supplement of 26.07.2013) makes this clear.  Suppose that a pupil is awarded a perfect score in a GCSE mathematics examination.  It seems sensible to conjecture that if Einstein were alive, he too would secure a perfect score on this mathematics paper.  Given the title on the front page of the examination paper, one has the clear sense that the examination measures ability in mathematics.  Is one therefore justified in saying that Einstein and the pupil have the same mathematical ability?

This paradoxical outcome results from the erroneous treatment of mathematical ability as something entirely divorced from the questions which make up the examination paper (the measurement context).  It is clear that the pupil’s mathematical achievements are dwarfed by Einstein’s; to ascribe equal ability to Einstein and the pupil is to communicate ambiguously.  To avoid the paradox one simply has to detail the measurement circumstances in any report of attainment and say: “Einstein and the pupil have the same mathematical ability relative to this particular GCSE mathematics paper.”  By including a description of the measuring instrument one is, in effect, making clear the restrictive meaning which attaches to the word “mathematics” as it is being used here; school mathematics omits whole areas of the discipline familiar to Einstein such as non-Euclidean geometry, tensor analysis, vector field theory, Newtonian mechanics, and so on.  As with the measurement of happiness, when one factors in a description of the measuring instrument, the paradox dissolves away.

Pace Pisa, ability is not an intrinsic property of the person.  Rather, it is a joint property of the person and the measuring tool.  Ability is the property of an interaction.  Alas for Pisa, the move from numbers to language also dissolves away that organisation’s much-lauded rank orders.  Little wonder that Wittgenstein described the reasoning which underpins the statistical model (Item Response Theory) at the heart of the Pisa rankings as “a disease of thought.”  For the first time, the many profound conceptual difficulties of the Pisa league table now become difficulties for a grade on the GCSE grade scale.  Why would anyone agree to predicate a perfectly respectable grade scale on a ranking system with such profound shortcomings?

An article published in 2016 in the USA’s Proceedings of the National Academy of Sciences by Van Bavel, Mende-Siedlecki, Brady and Reinero, serves to emphasise the degree to which Pisa thinking is isolated even in psychology: “Indeed, the insight that behaviour is a function of both the person and the environment – elegantly captured by Lewin’s equation: B = f(P, E) –  has shaped the direction of social psychological research for more than half a century.  During that time, psychologists and other social scientists have paid considerable attention to the influence of context on the individual and have found extensive evidence that contextual factors alter human behaviour.”

If ability is a joint property of the person and the context in which that ability is manifest, then unambiguous communication demands that a description of the context must be integral to any attempt to represent an individual’s ability.  Mainstream psychology rejects the notion that one can ignore context and treat behaviour as wholly analysable in terms of traits and inner processes.  Indeed, psychology itself has a name for the error which afflicts the Pisa ranking model.  Gerd Gigerenzer of the Max Planck Institute writes: “The tendency to explain behaviour internally without analysing the environment is known as the ‘fundamental attribution error.’”

Three thinkers who stand out among those who argue that ability measures cannot be separated from the context in which they are manifest are the Nobel laureate Herbert A. Simon and two of the 20th century’s greatest intellectuals: the father of quantum theory, Niels Bohr, and the philosopher Ludwig Wittgenstein.  First, Herbert Simon uses a scissors metaphor to indicate the degree to which an attribute like ability cannot be disentangled from the context in which it is manifest.  (Pursuing questions such as “which blade of the scissors cuts the cloth?” will do little to advance an explanation of how scissors cut; there seems to be little value in seeking to understand the whole (the cutting action) in terms of its parts (the unique contribution of each blade).)  Simon writes: “Human rational behaviour is shaped by a scissors whose blades are the structure of the environment and the computational capabilities of the actor.”

Secondly, Niels Bohr – in his Discussion with Einstein on Epistemological Problems in Atomic Physics – uses quantum “complementarity” to argue that first-person ascriptions [the contribution of the individual] and third-person ascriptions [the contribution of the environment] of psychological attributes form an “indivisible whole.”  Finally, on page 143 of his Blue and Brown Books, Wittgenstein highlights the error at the heart of the Pisa project: “There is a general disease of thinking which always looks for (and finds) what would be called a mental state from which all our acts spring as from a reservoir.”

Conclusions

The arguments set out above have serious implications for the technical fidelity of the new GCSE grade 5.  The more the general public find out about the modelling which underpins Pisa, the more their faith in the new GCSE grade scale will be undermined.  (For example, Kreiner reveals in the Times Educational Supplement piece that, “Most people don’t know that half of the students taking part in Pisa [2006] do not respond to any reading item at all.  Despite that, Pisa assigns reading scores to these children.”) 

The fact that a switch from numbers to language invalidates entirely the practice of ordering countries according to the efficacy of their education systems has profound implications for the validity of inferences made in respect of the new GCSE grade 5.  Given the assertion that grade 5 is designed to reflect the academic standards of high performing educational jurisdictions, as identified by their Pisa ranks, what possible justification can be offered for assigning a privileged role to the GCSE grade 5 in school performance tables?

To date, the profound conceptual difficulties which attend Pisa ranks have not impacted directly on the life chances of particular children in this country.  This would change if individual pupils failing to reach the grade 5 standard were construed as having fallen short of international standards (whatever that means).  If one accepts the reasoning of Simon, Wittgenstein and Bohr, grade 5 can represent nothing more than a standard somewhere between grade 4 and grade 6.  Any attempt to accord it special status, thereby giving it a central role in the EBacc and/or performance tables, for example, risks exposing the new GCSE grading scale to ridicule.

Why the Belfast Telegraph and the Irish News must correct their claim that St Dominic’s High School is Northern Ireland’s top grammar school.

In May 2016 the Belfast Telegraph published a league table of Northern Ireland grammar schools, based on the Advanced Level grades achieved by grammar school pupils in the school year 2014/2015.  The Belfast Telegraph’s editor failed to reply to correspondence asserting that fundamental errors in the design of her newspaper’s league table could result in unfair reputational damage to schools (see,  “Why the Belfast Telegraph and Irish News must set the record straight on grammar school league tables”).  In the recent past, the Irish News has published a similarly designed league table based on the 2015/2016 examinations.  This table has precisely the same design fault as the Belfast Telegraph table.  Once again, there is potential for reputational damage to a significant number of grammar schools.

It isn’t difficult to spot the error in the tables.  In assigning ranks to the various grammar schools, grades C, B, A and A* are treated as equal in value.  This is clearly wrong; a grade B represents a higher standard than a grade C, a grade A represents a higher standard than a grade B, and a grade A* represents a higher standard than an A.  Any instrument which treats a grade C the same as a grade A* simply cannot claim to measure academic excellence. The following scenario illustrates just how unfit-for-purpose these league tables are as tools for identifying high performing grammar schools: if every middle-sixth pupil in every school discipline across the entire grammar school estate were to simultaneously improve from C standard to A* standard, the tables published in these two newspapers would be completely powerless to detect any change whatsoever in standards.
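
The scenario above can be checked with a few lines of Python.  This is a minimal sketch, assuming the tables rank schools on the proportion of entries at grades A* to C (my reading of a design in which C, B, A and A* count equally); the function name and the grades are invented for illustration and are not taken from any school.

```python
# Grades that the (assumed) league-table statistic counts, all with equal weight.
COUNTED_GRADES = {"A*", "A", "B", "C"}

def share_a_star_to_c(grades):
    """Proportion of entries at A*-C, treating the four grades as equal in value."""
    return sum(g in COUNTED_GRADES for g in grades) / len(grades)

# Hypothetical results for one school, before and after every C improves to A*.
before = ["C", "C", "C", "B", "D"]
after = ["A*" if g == "C" else g for g in before]

print(share_a_star_to_c(before))
print(share_a_star_to_c(after))
```

Both calls print the same figure: a statistic of this kind registers whether a grade clears the C threshold and nothing else, so a wholesale move from C to A* leaves it unchanged.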

Furthermore, both newspapers are, in effect, using highly questionable analysis to call into question the quality of teaching and learning in all “non-Catholic” schools.  Here are a few quotations from Rebecca Black’s Belfast Telegraph piece: “It is impossible not to be impressed at the consistently high performances of our top Catholic schools;” “By contrast, some of the best known non-Catholic grammars have slipped below the Northern Ireland average;” “If someone could bottle the ethos for success in these (Catholic) schools, they could run the world;” “Sean Rafferty, principal of St Louis Grammar, makes a very salient point in today’s Belfast Telegraph in calling for the Department of Education to examine what makes the top Catholic schools so successful, to learn the lessons and spread that magic across the school estate.”

Is the Belfast Telegraph not guilty of promulgating what has been labelled “fake news”?  Based on highly dubious reasoning, Rebecca Black is distorting the debate on the relative efficacy of Catholic and non-Catholic education in Northern Ireland.  For Ms Black, the superiority of St Dominic’s High School (the school ranked first in both the Belfast Telegraph and the Irish News league tables) over, for example, Friends School Lisburn (ranked 12) is explained by the Catholic ethos of the former.  However, when a Grade Point Average approach is used to compute ranks, the order is reversed and Friends (a “non-Catholic” school) is the superior school!  The simplistic hypothesis that Catholic education is superior to that offered in non-Catholic schools goes up in smoke when proper account is taken of the ordinal nature of examination grades: C, B, A and A*.
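
How a rank order can reverse under a Grade Point Average is easy to illustrate.  The sketch below is hypothetical throughout: the two schools, their grades and the points scale (A* = 4, A = 3, B = 2, C = 1, anything lower = 0) are assumptions made for illustration, and none of the figures are the actual results of St Dominic’s or Friends.

```python
# Assumed points scale; grades below C score 0.
POINTS = {"A*": 4, "A": 3, "B": 2, "C": 1}

def share_a_star_to_c(grades):
    """Proportion of entries at A*-C, with all four grades counted equally."""
    return sum(g in POINTS for g in grades) / len(grades)

def grade_point_average(grades):
    """Mean points per entry, so that higher grades count for more."""
    return sum(POINTS.get(g, 0) for g in grades) / len(grades)

# Two invented schools with ten entries each.
school_x = ["C"] * 10           # every entry just clears the threshold
school_y = ["A*"] * 9 + ["D"]   # nine top grades and one entry below C

for name, grades in [("X", school_x), ("Y", school_y)]:
    print(name, share_a_star_to_c(grades), grade_point_average(grades))
```

On the threshold measure school X outranks school Y; on the Grade Point Average the order is reversed.  The choice of points scale is itself a judgement, but any scale that respects the order C < B < A < A* reverses the ranking in this example.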

Meaning for Victims

Was political failure to compensate deserving victims of the Troubles inevitable given the incoherent thinking in the 2006 Victims and Survivors Order?

 There is a clear consensus abroad in Northern Ireland that the 2006 Victims and Survivors Order No. 2953 (N.I.17) permits the terrorist who takes the life of an innocent member of the public to be deemed as much a “victim” as the unwitting individual he or she kills.   The conflation of the innocent and those who terrorised them is achieved through the definition of the term “victim” in section three of the Order.  The Order’s approach to the meaning of “victim” is such that it can apply equally to the innocent citizen who finds herself in the vicinity of a bomb, and to the terrorist who planted it.

This brief essay argues that the meaning of the term “victim” cannot consist in a definition.  Moreover, the Order’s definition is entirely at odds with the meaning of “victim.”  The case is made that the project to include those who murdered innocent men, women and children in the category “victim” stands reasoning on its head.  In order for terrorists to be construed as victims, the Order must effect a reappraisal of the very meaning of the word “victim.”

Since the Order’s definition of “victim” is entirely at odds with the meaning of that term (see below), can it be any wonder that the basic issue of addressing the needs of victims remains unresolved years after the Good Friday Agreement?

The central argument of this essay draws on the ideas of one of the greatest 20th century philosophers, Ludwig Wittgenstein.  Wittgenstein demonstrates that the meaning of a word cannot consist in any verbal or written formulation. Rather, to investigate the meaning of any word, one must look at how it is used in everyday practice.  For Wittgenstein, “the meaning of a word is its use in language.”  The difficulty with section three of the 2006 Order is that it gives the appearance of establishing (through definition) the meaning of “victim” in the Northern Ireland context.  But the meaning of the word “victim” cannot be established via re-definition because meaning resides in human practices and not in definitions.

To illustrate Wittgenstein’s reasoning, consider the meaning of the concept “force” in physics.  Force is defined as the time derivative of momentum.  But the community of scientists do not treat someone who merely has access to the definition of force as having grasped its meaning.  Rather, only those who can use the concept of force to solve a wide range of problems, of increasing complexity, over an extended period of time, are deemed to have grasped the concept.  Michael Oakeshott argues, with Wittgenstein, that the practice is the essential thing, the definition being little more than an “abridgement” of that all-important practice.  Meaning can never consist in a definition.  Meanings reside in practices and the physics undergraduate, for example, gets access to meaning through being enculturated into the practice of physics.  (This move from definition to use is vital to the analysis presented here, for people rarely use the term “victim” in respect of terrorists.)

 

In The Structure of Scientific Revolutions, Kuhn writes: “If, for example, the student of Newtonian dynamics ever discovers the meaning of terms like ‘force,’ ‘mass,’ ‘space,’ and ‘time,’ he does so less from the incomplete though sometimes helpful definitions in his text than by observing and participating in the application of these concepts to problem-solution. … That process of learning … by doing continues throughout the process of professional initiation.  As the student proceeds from his freshman course to and through his doctoral dissertation, the problems assigned to him become more complex and less completely precedented.”  We would never countenance reducing Einstein’s grasp of general relativity to his ability to complete the sentence: “General Relativity is __________________ .“

 

This all-important relation between meaning and use goes beyond physics; it is completely generalizable.  For example, the definition of the tort of battery in Black’s Law Dictionary (9th edition) is: “an intentional and offensive touching of another without lawful justification.”  The novice law student will quickly realise that legal reasoning does not involve the straightforward application of such textbook definitions or rules to particular cases.  Once again, to learn the meaning of the tort of battery one must learn to use “battery” as it is used by experienced legal experts; the practice of law is the repository of meaning.

The great American jurist Oliver Wendell Holmes argued that the life of the law is not to be found in definitions; only “experience” gives access to legal meaning.  Definitions get their life through the role they play in the practice of the law; divorced from the practice, a definition can determine no course of action.  Through experience the novice learns to use the legal term “battery” as more experienced colleagues do.  Only through experience can a student learn that legal rules have an intrinsic vagueness or “open texture” (Hart’s The Concept of Law). The meaning of a legal term does not consist in a precisely formulated definition; rather, meaning depends on the way terms are used over time.

 

In his Introduction to Legal Reasoning, Edward H. Levi writes: “It is important that the mechanism of legal reasoning should not be concealed by its pretence.  The pretence is that the law is a system of known rules applied by a judge; the pretence has long been under attack.  In an important sense, legal rules are never clear, and, if a rule had to be clear before it could be imposed, society would be impossible.”  If meaning resided in definitions and rules rather than in the practice of the law (where the emphasis is on use), the pivotal relationship between precedent and all legal deliberation would dissolve away.

 

In summary then, the central issue in this essay is Wittgenstein’s teaching that “the meaning of a word is its use in the language.”   The definition published in section three of the Victims and Survivors Order, which seeks to blur the distinction between terrorist and innocent victim, is at odds with the meaning of “victim” because that word is only used in everyday conversation to refer to the innocent.  The Order gives the impression that it is forced to define victim in such all-embracing language in order to take account of Northern Ireland’s troubled past.  However, Wittgenstein writes as follows on page 183 of his Lectures on the Foundations of Mathematics:

“To know [the meaning of a word] is to use it in the same way as other people do.  ‘In the right way’ means nothing.”

Because the Order does not use the term “victim” as other people do, the definition of victim in the Order, in effect, contradicts its meaning.

 

 

 

 

 

Why there is little cause to be happy with the new GCSE grade 5

The OECD’s Programme for International Student Assessment (Pisa) has now taken the bold step of analysing measures of “happiness,” “well-being” and “anxiety” for individual countries (see New Pisa happiness table, TES 19.04.2017 https://www.tes.com/news/school-news/breaking-news/new-pisa-happiness-table-see-where-uk-pupils-rank).

The claim is made that “life satisfaction,” for example, can be measured to two-decimal-place accuracy.  This raises the question, “Can complex constructs such as happiness or anxiety really be represented as a number like 7.26?”  For two giants of 20th century thought – the philosopher Ludwig Wittgenstein and the father of quantum physics, Niels Bohr – the answer to this question is an unequivocal “no.”

 

Surely common sense itself dictates that constructs such as happiness, anxiety and well-being cannot be captured in a single number?  In his book Three Seductive Ideas, the Harvard psychologist Jerome Kagan draws on the writings of Bohr and Wittgenstein to argue that measures of constructs such as happiness cannot be represented as numbers.  He writes: “The first premise is that the unit of analysis … must be a person in a context, rather than an isolated characteristic of that person.”  Wittgenstein and Bohr (independently) arrived at the conclusion that what is measured cannot be separated from the measurement context.  It follows that when an individual’s happiness is being measured, a description of the questions on the Pisa questionnaire must appear in the measurement statement because these questions help define what the measurer means by the word happiness.

Kagan rejects the practice of reporting the measurement of complex psychological constructs using numbers: “The contrasting view, held by Whitehead [co-author of the Principia Mathematica] and Wittgenstein, insists that every description should refer to … the circumstances of the observation.”  The reason for including a description of the measuring instrument isn’t difficult to see.  Kagan points out that “Most investigators who study ‘anxiety’ or ‘fear’ use answers on a standard questionnaire or responses to an interview to decide which of their subjects are anxious or fearful.  A smaller number of scientists ask close friends or relatives of each subject to evaluate how anxious the person is.  A still smaller group measures the heart rate, blood pressure, galvanic skin response, or salivary level of subjects.  Unfortunately, these three sources of information rarely agree.”

 

Given that a change in the measuring tool means a change in the reported measurement, one must include a description of the measuring instrument in order to “communicate unambiguously,” as Bohr expressed it.  One can never simply write “happiness = 4.29” (as in Pisa tables) because there is no such thing as an instrument-independent measure of happiness.  We have no idea what happiness is as a thing-in-itself.  Kagan notes the implications for psychologists of the measurement principles set out by Niels Bohr: “Modern physicists appreciate that light can behave as a wave or a particle depending on the method of measurement.  But some contemporary psychologists write as if that maxim did not apply to consciousness, intelligence, or fear.”  According to Bohr, when one reports psychological measurements, the requirement to describe the measurement situation means that ordinary language must replace numbers.  Werner Heisenberg summarised his mentor’s teachings: “If we want to say anything at all about nature – and what else does science try to do – we must pass from mathematical to everyday language.”

 

(To simplify matters somewhat, while numbers function perfectly well when observing the motion of a tennis ball or a star, the psychologist cannot observe directly the pupil’s happiness.  Bohr argued that there was “a deep-going analogy” between measurement in quantum physics and measurement in psychology because both were concerned with measuring constructs which transcend the limits of ordinary experience.  According to Bohr, because the physicist cannot directly experience electrons and photons (just as the psychologist cannot directly experience a pupil’s happiness), “physics concerns what we can say about nature,” and numbers must therefore give way to ordinary language.)

 

The arguments advanced above apply, without modification, to Pisa’s core activity of measuring pupil ability.  A simple thought experiment (first reported in the TES of 26.07.2013) makes this clear.  Suppose that a pupil is awarded a perfect score in a GCSE mathematics examination.  It seems sensible to conjecture that if Einstein were alive, he too would secure a perfect score on this mathematics paper.  Given the title on the front page of the examination paper, one has the clear sense that the examination measures ability in mathematics.  Is one therefore justified in saying that Einstein and the pupil have the same mathematical ability?

 

This paradoxical outcome results from the erroneous treatment of mathematical ability as something entirely divorced from the questions which make up the examination paper.  It is clear that the pupil’s mathematical achievements are dwarfed by Einstein’s; to ascribe equal ability to Einstein and the pupil is to communicate ambiguously.  To avoid the paradox one simply has to detail the measurement circumstances in any report of attainment and say: “Einstein and the pupil have the same mathematical ability relative to this particular GCSE mathematics paper.”  By including a description of the measuring instrument one is, in effect, making clear the restrictive meaning which attaches to the word “mathematics” as it is being used here; school mathematics omits whole areas of the discipline familiar to Einstein such as non-Euclidean geometry, tensor analysis, vector field theory, Newtonian mechanics, and so on.

 

As with the measurement of happiness, when one factors in a description of the measuring instrument, the paradox dissolves away.  Alas for Pisa, the move from numbers to language also dissolves away that organisation’s much-lauded rank orders.  Little wonder that Wittgenstein described the reasoning which underpins the statistical model (Item Response Theory) at the heart of the Pisa rankings as “a disease of thought.”

 

This brings us to the very serious implications of the arguments set out above for the new GCSE grade 5.  The fact that a switch from numbers to language invalidates entirely the practice of ordering countries according to the efficacy of their education systems has profound implications for the validity of claims made concerning the new GCSE grade 5.  Given the assertion that grade 5 reflects the academic standards of high performing international jurisdictions as identified by their Pisa ranks, what possible justification can be offered for assigning a privileged role to the GCSE grade 5 in school performance tables?

 

To date, Pisa rankings have not impacted directly on the life chances of particular children in this country.  This would change if individual pupils failing to reach the grade 5 standard were construed as having fallen short of international standards (whatever that means).  If one accepts the reasoning of Wittgenstein and Bohr, grade 5 can represent nothing more than a standard somewhere between grade 4 and grade 6.  Any attempt to accord it special status, thereby giving it a central role in the EBacc and/or performance tables, risks exposing the new GCSE grading scale to ridicule.

Dr Hugh Morrison, The Queen’s University of Belfast (retired)

 

 

 

The AQE CEA and GL Assessment Test Results: Advice to parents: 2017

All parents who have received a letter notifying them of the results of their applications to a post-primary school(s) would benefit from reading this post.

Why the Belfast Telegraph and Irish News must set the record straight on grammar school league table libel

Decisions made to apply to a particular grammar school based, even in part, on newspaper claims about exam performance are unsafe.

It is important to understand that the Grammar School Exam Performance lists (league table libel) presented in the form used by the Belfast Telegraph and the Irish News represent, at best, a marketing tool used by the newspapers to increase sales in a declining print media environment.

In the Irish News of May 22, 2017 Simon Doyle boldly claims:

“The Irish News performance lists are anticipated annually and some schools advertise their positions on their websites.”

Here is an example from St Dominic’s High School/Grammar School in Belfast.

https://www.stdominics.org.uk/news-archive/2017/5/22/top-performing-school-in-northern-ireland

Note the circular self-referencing between the Irish News and St. Dominic’s High / Grammar School. The school website is ambivalent. Without a hint of irony, Mr Doyle avoids acknowledging that the Irish News advertises a version of league table libel. Discerning parents will note that no information is presented to explain the methodology used to compile the tables. It is not difficult to see, however, that if a C grade is treated as exactly equal in value to an A* grade, and all examination subjects are treated as if they will equally passport a pupil on to a university course, the wheels come off these meaningless tables, whether they are ‘anticipated’ or not.

Surprisingly, Mr Doyle fails to mention the complete absence from his list of his own grammar school, the Methodist College, Belfast.

Where is the equivalent breakdown for St. Dominic’s High School, Belfast?

MCB A level results

 

 

Why the Belfast Telegraph and Irish News must set the record straight on grammar school league table libel

Why the Belfast Telegraph and Irish News must set the record straight on grammar school league tables.

Dr Hugh Morrison

Immediately below is a short letter sent to the editor of the Belfast Telegraph approximately one year ago.  It concerned a conceptual error in the paper’s A-Level league tables.  Despite repeated requests to make the public aware of the error in the rank order, the letter never appeared in print.  On May 22, 2017 the Irish News published a league table generated by precisely the same flawed algorithm.

 

To be assigned a low rank in the Belfast Telegraph’s recently published league tables is likely to do little for the reputation of a school.  Given the potential reputational damage, it is vital to ensure that the numerical rank assigned to each school is meaningful.  Examination grades are reported on what is known as an ordinal scale.  There is an ordering of standards associated with the various grades: An A* grade represents a higher standard than an A grade, which in turn represents a higher standard than a B grade, and so on.  In the Belfast Telegraph’s league tables, ranks are computed by adding grades.  Alas, this produces meaningless numbers because arithmetic in general, and addition in particular, is impermissible for ordinal scales.  The Belfast Telegraph must make this error clear to its readers.
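The point about ordinal scales can be illustrated with a minimal sketch. The two school profiles and both numerical codings below are hypothetical and are not taken from any published table. Both codings respect the order A* > A > B > C, yet adding the coded grades reverses which school comes out on top, which suggests that the resulting rank reflects the arbitrary choice of numbers rather than the grades themselves.

```python
# Hypothetical three-grade profiles for two imaginary schools (not real data).
school_x = ["A*", "C", "C"]
school_y = ["B", "B", "B"]

# Two hypothetical codings. Both preserve the ordinal order A* > A > B > C.
coding_1 = {"A*": 5, "A": 4, "B": 3, "C": 1}
coding_2 = {"A*": 10, "A": 4, "B": 3, "C": 1}


def total(profile, coding):
    """Add up the numbers assigned to each grade in a profile."""
    return sum(coding[grade] for grade in profile)


def higher_ranked(coding):
    """Return which school has the larger total under the given coding."""
    x, y = total(school_x, coding), total(school_y, coding)
    if x == y:
        return "tie"
    return "X" if x > y else "Y"


print(higher_ranked(coding_1))  # School Y comes out ahead (7 versus 9)
print(higher_ranked(coding_2))  # School X comes out ahead (12 versus 9)
```

Since both codings are equally faithful to the ordering of standards, neither rank order can claim to be the correct one.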

 

The Belfast Telegraph clearly believes that its league tables measure academic excellence.   It refers to the likely impact of cuts on “excellence,” schools which are “top of the class,” and “top performing” schools.  Rebecca Black writes: “Whether you believe the highest priority for a school should be academic excellence or not, it is impossible not to be impressed at the consistently high performance of our top Catholic schools.”  But are such inferences justified?  Are these league tables capable of even identifying excellent schools?

 

The Belfast Telegraph eschews the orthodox method used throughout the world by newspapers when publishing school league tables, the so-called “Grade Point Average” procedure.  This attempts to respect the fact that the scale of standards implicit in the various grades is ordinal by assigning a weight to each grade.  For example, with regard to the A-level league table, 12 points might be assigned to an A* grade, 10 to an A grade, 8 to a B grade, and 6 to a C grade.  However, in the Belfast Telegraph’s methodology, all four grades are assigned exactly the same value.  Unfortunately, a league table which treats a CCC profile as indistinguishable from a profile of A*A*A* cannot claim to distinguish schools on the basis of their academic excellence.
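The contrast between the two scoring schemes can be sketched in a few lines. The weighted points are the illustrative values given above (12, 10, 8, 6). The equal-value scheme is an assumption made for illustration, with each A* to C grade counted as 1; the newspaper's actual computation is not documented here.

```python
# Illustrative Grade Point Average weights, as in the example in the text.
gpa_points = {"A*": 12, "A": 10, "B": 8, "C": 6}

# Assumed equal-value scheme: every A* to C grade counts the same (here, 1).
equal_points = {grade: 1 for grade in gpa_points}


def score(profile, points):
    """Sum the points assigned to each grade in a pupil's profile."""
    return sum(points[grade] for grade in profile)


profile_ccc = ["C", "C", "C"]
profile_top = ["A*", "A*", "A*"]

# Under the equal-value scheme the two profiles cannot be told apart.
print(score(profile_ccc, equal_points), score(profile_top, equal_points))  # 3 3

# Under the weighted scheme they are separated.
print(score(profile_ccc, gpa_points), score(profile_top, gpa_points))  # 18 36
```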

 

 

 

Newsletter suspected of squandering Transfer Test exclusive for political motive

Below is the text of an article submitted by The Parental Alliance for Choice in Education to the Belfast Newsletter on Monday 20th February. It is worth noting that a senior Newsletter journalist spent two and a half hours with the author trying to come up with reasons not to publish a story that, the following Monday, would dominate the BBC Northern Ireland newscasts and fill a two-page spread in the Belfast Telegraph.

Peter Weir, DUP Education Minister, was always intended to be the beneficiary of a claim that AQE & PPTC were making progress towards a single transfer test. Weir’s appointee Professor Peter Tymms had suggested in his report, made available to the BBC, but not the public, that three 11+ transfer tests be taken on one day.

Think about the stressful implications of this suggestion before you cast a vote in the Assembly Election today.

The text of the original unpublished letter.

Professor Peter Tymms of Durham University and his team were engaged by the DUP Education Minister, Peter Weir, to explore the possibilities for a single transfer test to replace the current AQE/GL hybrid.  The AQE test is used by “state” grammar schools, while the GL test is used – in the main – to determine admission to Catholic grammars.  The Tymms report has serious implications for parents intending to send their children to a state grammar.

If the Tymms report is implemented, the current AQE arrangement of three tests taken on different days (with the best two used to determine the published AQE score) will be replaced by three tests taken on a single Saturday.  Multiple choice grids will replace the AQE practice of the child simply writing his or her answer on the test paper.  It would also appear that children seeking a place at a state grammar school will no longer have access to past papers.

In 2006, the Times Higher Education reported that Professor Tymms’ research had attracted criticism from the Prime Minister, the Secretary of State for Education, and the head of the Office for Standards in Education.  The Times Higher Education highlighted David Blunkett’s counsel that no one “with the slightest common sense” could possibly take seriously research by Peter Tymms.

Professor Tymms had a leadership role in the design of CCEA’s ill-fated InCAS tests.  To quote the Belfast Telegraph of 05.01.12: “The Department of Education has confirmed that the InCAS contract, which expires on January 19, has not been renewed.  InCAS was administered at a total cost of £3 million by the University of Durham’s Centre for Evaluating and Monitoring (CEM).”

In their report for Peter Weir, Professor Tymms and colleagues seem to draw on modern psychometrics (the field in psychology and education that is devoted to testing) to justify the preferred model of three tests on the same day.  Joel Michell has devoted much of his professional life to an in-depth analysis, presented in books and peer-reviewed articles, of psychometrics.  He comes to the conclusion that psychometrics is “a pathological science.”

One simple illustration of the validity of Michell’s claim is that psychometrics treats the central concept of “ability” as a state.  Surely ability is a capacity rather than a state?  If someone at the Department of Education or CCEA had been alert to this curious interpretation of ability in the standard measurement model, maybe £3 million wouldn’t have been lost to the public purse.

Getting down to practicalities, Peter Weir must, without delay, tell voters whether he accepts or rejects the model proposed in the report compiled by Professor Tymms and his colleagues.  After all, these academics were paid from the public purse; the report should be public property.

The AQE test has functioned without error since 2009.  It has strikingly high approval ratings from parents and, most importantly, 40 to 50% of pupils on Free School Meals who take the AQE test secure scores which will admit them to grammar school.

Those thousands of parents who wish to send their child to a state grammar must know, without further delay, the DUP’s unequivocal position on this important matter.

 

Stephen Elliott