Policy Based Evidence Making

February 8, 2013

Apologies for being a week late (and, therefore, not terribly topical) with this one. There were technical difficulties.

There seems to have been a craze recently for people engaged in partisan arguments about education to claim that those they disagree with have ignored the evidence. It seems that the more ideological one’s own position is, the more likely one is to declare other people’s views to be without evidence. So, for instance, SWP activist Michael Rosen, a man so divorced from the evidence on how children learn to read that he thinks phonics is “barking at print”, declared in one of his tedious rants directed at Michael Gove that

For you to be able to push through what is fast becoming an exam that will be a major impediment for most young people to develop as learners, you must… ignore all evidence on adolescents and learning.

Similarly, in another Guardian rant, this one so short of actual evidence or argument that Tom Bennett described it as “a joyless donkey ride across the greatest hits of armchair fantasy edu- football”, Suzanne Moore argued that:

Gove, charming as he is, is one of the most profoundly ideological of the lot. One would have thought that a man of his intelligence might push through policies based on evidence. Evidence-based policy-making is all the rage you know. Scientists even do it! But no: the entire education system is now one vast experiment without any aim except the reach of Gove’s ambition.

A third example can be found from this blogger. I was saddened to read:

Like many teachers at a senior level, I have an MA. Three years of hard work in my own time, travelling up to 80 miles on a round trip once a week or so, I wasn’t going to waste my time. The time I used was spent on gathering evidence, from the research of others and from my own work in the classroom. No evidence, no MA basically. There is little evidence in Gove’s ideology.

The one source he has quoted, Daniel Willingham, is a cognitive scientist, not a primary or secondary school teacher, in the USA, a country with perhaps more rigid curriculum rules than our own. He has researched brain mechanisms and memory, and has dismissed the usefulness of learning styles. He appears to be Gove’s guru, and the source of his obsession with rote learning and the rigour of exams.

Phonics: I am not opposed to phonics as such, it is a way of teaching reading, but not the only one. The evidence base was very narrow. A study in Clackmannanshire, the smallest authority in Scotland, is the basis for the introduction of synthetic phonics in England. Too narrow a base, in a part of our nation with a different educational system. The testing too is a political tool. The use of nonsense words, whilst enabling new language learners to show their phonic skills, actually penalised good readers who for example might read the nonsense word ‘dess’ as ‘dress’ because they want to start reading real words to make sense of the nonsense. This appears to have penalised more able readers in their scores, and impacted on schools in the ‘leafier’ suburbs.

So to sum up, the author claims:

  • A professor of psychology, who has published two books on education, knows nothing about how learning works.
  • The evidence on phonics (Hattie suggested in 2009 that there were 425 available studies of phonics instruction) is reduced to one study that apparently can be ignored due to the size of the local government boundaries.
  • The alternative to both the evidence-based discipline of cognitive psychology and the empirical evidence is: the opinion of people who have done MAs in education.
  • Probably lots of other things, I just couldn’t bring myself to read any further.

However, if these contributions were not enough to make me wonder whether “evidence” is a synonym for “my opinion” and “lack of evidence” is another way of saying “your opinion”, there was one blog, widely celebrated on Twitter, that really got my goat. Not because it could compete with the “evidence-based ranting” approach of the above, but because it seemed remarkably plausible until you actually analysed the sources and saw how they had been cherry-picked. This is “The research v the government” from Ian Gilbert, which draws on the Hattie-style analysis of education research published by the Education Endowment Foundation here.

I have issues with much of the EEF analysis for a few reasons.

1) It looks at effect sizes but seems to ignore Hattie’s claim that when you use this for analysis:

Almost everything works. Ninety per cent of all effect sizes in education are positive. Of the ten per cent that are negative, about half are expected (e.g. effects of disruptive students); thus about 95 per cent of all things we do have a positive influence on achievement. When teachers claim that they are having a positive effect on achievement or when a policy improves achievement this is almost a trivial claim: virtually everything works. One only needs a pulse and we can improve achievement.

Famously, Hattie’s answer is to compare effect sizes with the average effect size of 0.4. I am a little sceptical about such a cut-off point, but I would suggest that we have every reason to consider effects that are of the order of this “hingepoint” or less to be unproven even when statistically significant.
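For readers unfamiliar with the figure being discussed: the effect sizes in this literature are generally Cohen’s d, the difference between the group means divided by a pooled standard deviation, so that Hattie’s 0.4 “hingepoint” means an intervention shifted scores by roughly four tenths of a standard deviation. A minimal sketch in Python (the function name and the sample scores are my own, purely illustrative):

```python
import math

def cohens_d(treatment, control):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n1, n2 = len(treatment), len(control)
    m1 = sum(treatment) / n1
    m2 = sum(control) / n2
    # Sample variances (Bessel-corrected)
    v1 = sum((x - m1) ** 2 for x in treatment) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in control) / (n2 - 1)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd

# Entirely made-up test scores for illustration
treatment = [62, 58, 71, 66, 60, 69]
control = [55, 60, 52, 58, 54, 57]
print(round(cohens_d(treatment, control), 2))
```

On Hattie’s logic, a d at or below about 0.4 is what virtually any intervention achieves anyway, which is why quoting a positive effect size on its own proves very little.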

2) In the absence of decent empirical evidence, the next best thing is the evidence from experimental psychology. To ignore this on the basis of education research, which is usually of a much lower standard, strikes me as a mistake and undermines any claim to be “evidence-based”. There is an exploration of this argument here.

3) The EEF report includes both evidence from studies and opinions which do not clearly draw upon, and sometimes contradict, the studies.

Gilbert’s blog ignores these problems, but exploits the third so as to quote sometimes from the research conclusions and sometimes from the opinions accompanying them, according to whichever contradicts the government. For these reasons much of his “evidence” seems less than convincing under scrutiny, and I will address each claim in turn.

Claim 1: Ability grouping harms middle and low attainers.

This is a classic among educational ideologues. I have lost count of the number of times I’ve heard it claimed that this is what all the evidence shows by someone who promptly discovers that they cannot find the evidence in question. While I have yet to do a full review of the evidence myself, I can point out that this particular claim is made by the EEF authors on the basis of four meta-analyses. Two of them found a small positive effect for ability grouping. The one that found the largest negative effect (-0.12) for low attainers, according to their own description, actually found a positive effect for homogeneous grouping of +0.12. Given Hattie’s observations about how education research usually finds a positive effect (and a bigger one than this), inverting the result seems unfair. Also, all these meta-analyses are from the 80s and 90s, meaning one of the most rigorous pieces of education research on ability grouping isn’t included.

Claim 2: There is no evidence for the benefits of school uniform.

This may be an accurate description of the research, but given that elsewhere the opinions of the EEF authors are quoted as evidence, it seems a little odd to ignore that they go on to say:

When combined with the development of a school ethos and the improvement of behaviour and discipline, the introduction or enforcement of a school uniform can be successfully included as part of this process.

Claim 3) Performance Pay doesn’t work.

I’m not going to argue with that.

Claim 4) Evidence does not support longer school days.

The EEF authors actually concede there is evidence of effectiveness but doubt whether it is cost effective. It would have been equally possible to quote the following section:

Overall approaches to increasing the length of the school day or the school year add on average two months additional progress to pupils’ attainment over the course of a year. Additionally, research based on international comparisons, looking at average times for schooling in different countries is consistent with this conclusion.  However, it should also be noted that pupils from disadvantaged backgrounds benefit by, on average, an additional half a month’s progress relative to their peers suggesting that extending school time can be an effective means to improve learning for pupils who are most at risk of failure.

Here, we have an exact reversal of the way claim 2 was treated. For claim 2, the summary of the evidence was reported but not the opinion of the EEF authors. Here, the opinion of the EEF authors (that it is not cost-effective) is reported but the summary of the evidence (that it works) is not. Nothing could show more clearly how selective Gilbert is being.

Claim 5) SEAL works.

This is one where the EEF authors are partly to blame. They appear to have quoted a wide variety of studies related to the social and emotional aspects of learning as supporting the effectiveness of SEAL. However, they (unlike Gilbert) admit that when SEAL itself was studied the evidence was not good: “A quasi-experimental evaluation of the impact of the secondary programme did not find a significant impact on attainment in the SEAL schools.”

Claim 6) Nick Gibb was wrong to recognise the success of phonics.

This is again an outrageous selection of opinion over evidence.

The evidence, as the EEF authors admit, indicates “Phonics approaches have been consistently found to be effective in supporting younger readers to master the basics of reading. The approach tends to be more effective than other approaches to early reading (such as whole language or alphabetic approaches)…” Unfortunately the rest of the passage is marred by the usual phonics-denialist rhetoric used to obscure the clear message of the evidence with qualifications which can’t actually be deduced from it. Gilbert has quoted only from this obfuscation and opinion.

Claim 7) Despite Gove’s support for sitting in rows, collaborative learning works really well.

This is really one with a lot of background and I intend to blog about it in more detail at a later date. However, it is worth mentioning that the actual research on sitting in rows is ignored here. It is also worth mentioning that the effect size the EEF authors find for collaborative learning is 0.42. Hattie found 0.41. Neither is really distinguishable from Hattie’s “hingepoint” of 0.4, making “collaborative learning” less than clearly effective. This is a case where I would suggest we look at the evidence from psychology. We actually have 100 years of psychology research supporting the “Ringelmann effect”: a general tendency for people to become less motivated when made to work in groups.

Claim 8) In contrast to the government’s policy of ending ringfencing for one-to-one tuition, such tuition does work.

This is another one where relevant opinions of the EEF authors are ignored. They state that one-to-one tuition is expensive and other alternatives should be considered. Ending ringfencing (as opposed to stopping all one-to-one tuition) actually seems to be in line with this opinion.

Claim 9) Early Years Intervention works, despite a government minister saying Sure Start isn’t a candidate for more money.

Like claim 8, this seems to miss the difference between something being a good use of money and its having an effect. More importantly, it ignores that a general level of evidence for this form of intervention isn’t necessarily evidence for Sure Start, as the EEF authors acknowledge: “Evaluations of Sure Start in the UK do not show consistent positive effects and indicate that some caution is needed when generalising from exceptionally successful examples”.

Claim 10) Peer tutoring works and this disproves Gove’s point of view about collaborative learning.

The immediate problem with this is that Gove’s opposition to collaborative learning was actually an interpretation of a comment about sitting kids in rows (the research on which is again ignored here), so it is far from clear that the evidence on peer tutoring has any relevance to what he said. However, even if he does have a general dislike of groupwork, this cannot be said to be disproved by picking the one type of groupwork with a strong positive effect. Why not? I think the following is a really good explanation of why this is not a fair way to do research:

Claim 11: Something about meta-cognition

I don’t even get what is being claimed here.


  1. Great post. Gilbert’s post was a load of rubbish. In my opinion, what this shows more than anything is that teachers, and probably the general public, should be taught to evaluate evidence.

    On the 0.4 effect-size cut-off point: I think Hattie is saying that interventions with this effect size work, but they might not be worth our time. Why? Because we could spend that time on interventions with a higher effect size. So it’s not quite true to say that interventions whose effect size is below 0.4 are “unproven” – they’re just not as effective as others. Of course, interventions may not be mutually exclusive, so something with a low effect size but zero implementation cost is worth considering.

    • Actually, Hattie’s point is that this is an average figure, so as good a place to start as any. However, he also argues that other factors, such as the ease of a strategy, its cost, etc., might still make it worthwhile. The bottom line, which I think is the real point about all the evidence, is that there is no magic bullet in education. The role of the teacher and student expectations are key, and to be honest, while some things may have advantageous outcomes, for most strategies the critical factor is that they are implemented effectively by people who understand and believe in what they are doing.

      Collaborative working probably does work when done by a teacher who a) believes in it, b) has the competency to make it work well, and c) as a result of a and b has created high expectations amongst students to make a success of it. However, this may be just as likely for a classroom in which the students sit in rows. What the meta-analyses often then demonstrate is that many of these strategies are not miles apart, as the distribution of competence amongst those employing the strategies is balanced out.

      What the evidence is often good at is showing that some things blatantly don’t work or have limited impact. One of the biggest findings from Hattie’s work is that many of the structural things politicians/educationalists obsess about are a waste of time.

      The problem with the phonics issue is that it is being treated as the panacea for early reading and writing. It may be a good starting point, but it is quite limited in my experience, as too many words just do not succumb to phonics, other than by creating ever more complicated and numerous digraphs/trigraphs. As a consequence, the phonics screening happens, in my (admittedly not professional opinion on this one), too late to be of much use. By the end of year 1 young people should be casting off the chrysalis of phonics to use more advanced reading strategies, not being tested on the old scaffold!

      • “Actually Hattie’s point is that this is an average figure, so as good a place to start as any.”

        I’m not totally sure what you’re driving at but I don’t think this is quite right. The problem with interventions with a small effect size is opportunity cost.

        • Sorry, I just meant that an effect size of 0.4 did not mean something was not worth bothering with; at least that is what I took Hattie to mean. We could pick any value as a kind of cut-off for different motivations, but if something had an effect size of 0.4 and was really easy to do, then I think the advice would be to do it. The effect size becomes more useful when weighing up several options with limited time/money, or when a conflict between two paths exists.

  2. Slightly OT re. the Clackmannanshire phonics research (and I have to admit that I haven’t read the study itself, only the reporting on it) but why does it not seem to have occurred to anyone that a key reason you can’t generalize from it (IMO) is that the “softer” (S/E) Scottish accents are much more amenable to phonetics than any other English-English accent. (Though maybe not quite as much as ‘Old’ South accents in the US.)

    (I am not a fan of phonics, but I can tell you exactly why that is – I was taught with them, and my childhood was miserable. Which seems to be the main basis for an awful lot of educational philosophy. “X happened to me. I had a miserable childhood. Therefore X is bad.” Or alternately: “X happened to me. I have subsequently become a government minister / Very Important Person. Therefore X is good.”)

    • My point was that it does not all hinge on that one study. However, I have to say I’m sceptical about this reason for dismissing it.

      • Assuming you are skeptical about the accent thing, rather than my silly quip about education philosophy. (Although I do think my silly quip DOES explain a lot about ‘amateur’ educational philosophies – i.e. those of people like… to pull a category out of thin air… government ministers!!)

        Anyways, I wasn’t suggesting that it was a reason to dismiss it totally – just that it seems that this project is a bit of an edge case in terms of its success, and I am just surprised that it doesn’t seem to have been taken into account as a POSSIBLE reason why this project was SO successful.

        • The phonology of Scottish English isn’t different enough for this to be a plausible explanation.

  3. It is a classic mistake to think phonics doesn’t work because of exceptions/being complicated to learn.
    Phonics approaches teach the child to pay attention to the pattern of letters in the word and look at them left to right. The other current approaches instead encourage guessing with initial sounds, word shape, context and pictures. This involves the eye jumping about the page looking for clues. The approach at school meant my son saw a four letter word beginning with ‘w’ and guessed it was ‘went’. A phonics approach at home meant he learnt to read through the word and recognised it was ‘what’. In other words phonics made it easier for him to read ‘irregular’ words.
    That is why it is nonsense when critics of systematic phonics teaching say mixed methods are better. You can’t have a default strategy of both looking at the letters on the page left to right while simultaneously having a default strategy of jumping around the page looking for clues. It’s like saying a driver can drive on the left and right side of the road simultaneously.

    • If that response is to my comment then can you tell me where I said phonics doesn’t work? I just said it has limitations. As one would expect, as a child develops their reading, at some point they have to move on to other methods, or we would all still be sat here for hours sounding out every word! Also, suggesting the possibility of using more than one method does not imply mixing lots of contradictory methods, just that you may find two or three methods that work well together. Alternatively, in cases of intervention it might be that one child finds an alternative method works for them. I am not surprised at the preference for one method to rule them all, as I sometimes wonder whether the priority of education is to teach students to learn or to conform. (Note I do not assume that this is your own position; it is more a question about why we try to find one-size-fits-all solutions to education.)

  4. Hi Trudge, We don’t move onto other methods from phonics. Proficient readers just move left to right through words very quickly – we just get very fast (or if we have not been taught using phonics we finally ditch other habits and go left to right to become proficient.) My point is that the main methods ARE contradictory. There aren’t two or three methods that work well together. That is the very point I was trying to make in my last post. Research shows proficient readers all read the same way – left to right taking account of the patterns of letters. That is why dyslexia is tested by checking ‘phonological awareness’ – whether you are aware of the letter patterns in words.

    • So once you can read left to right, you just start to memorise words? I presume there must be some process by which you move beyond that. How does speed reading work, for example? How do you read words which are not spelt phonetically? I will check out what is meant by reading through phonics, as you seem to be redefining it in terms of reading from left to right, and detaching it from the ‘sounding’ element, which is what I presumed phonics was meant to be about. It seems to me that a bit more decoding is going on in a ‘mature’ reader: the symbols on the page are being converted into ideas and meanings, with their actual component sounds being invoked? I guess what I am suggesting is that the next ‘method’ would be to develop rapid recognition of words, and an expanded vocab. The method of synthetic phonics must surely be enhanced with other strategies? I.e. phonics is a good way to first learn a word, but surely you will want to teach in such a way that increases that speed of left-right reading? Anyway, thanks for your response; it may well be that the issue is what is meant by a ‘method’, and my ignorance as to what ‘phonics’ refers to in a wider context. My son uses phonics if he is unsure of a word, but seems able to recognise a number of words from memory once he has learnt them. Should that be discouraged? Does he need to ‘read’ through the whole word in case he has got the wrong word?

      • My understanding is that humans use two separate strategies to read. One of these is phonics – working out the word from the individual sounds. The other is word recognition. Word recognition is slower than phonics for uncommon words, and faster for really common words. For irregular words, you only have one strategy – recognition.

        Dyslexia isn’t one thing. It can affect either strategy. So some people with dyslexia will find word recognition extremely difficult. In other languages (eg Welsh or Spanish) they would suffer far less, as these languages have few or no irregularities.

        Other people with dyslexia find phonics extremely difficult (this type is rarer). They can only use word recognition. I came across such a person doing a psychology experiment years ago. She was a successful Cambridge University undergraduate, yet if you put a made-up word like ‘zate’ before her, she couldn’t pronounce it. Her father had ended up teaching her to read using loads of flashcards because phonics was never going to work for her.

        The morals? The vast majority of students will benefit from learning phonics. Phonics isn’t the only reading strategy. Very occasionally students won’t get phonics at all, and might need a completely different strategy.

        • I don’t think any of this can really be supported by evidence.

          • Not only that, but an extreme form of phonics, Toe-by-Toe, is regularly prescribed as a curative for dyslexia, and, in my admittedly limited experience, is much more effective than other methods. This is precisely because it uses nonsense words to make the child focus on the phonemes alone. Only when they’ve mastered them will the child be allowed to progress to words with meaning, but, boy, do they tear through them quickly after that.

            A further point about your blogger. If a child can’t tell ‘dess’ from ‘dress’, she will find it much harder to learn new words which differ very slightly from words she already knows (so at least she won’t be able to recognise that the blogger’s argument is ‘dross’).

            She will also have been trained to be a really lousy proofreader. This does at least explain why so many A level and university essays are full of basic spelling mistakes, and why the students look at you in complete bafflement when you point them out.

  5. Symptomatic of the gradual deprofessionalisation of teaching perhaps. It seems to matter little these days whether there is any evidence for a claim, and in fact often simply the fact that something is posted on the internet is enough to give most statements, no matter how foolish, some degree of validity in the eyes of many so called educators.

    Although we all agree that Inset/CPD days can be a complete waste of time, the following “unconference” approach seems to some to be the panacea for all of our professional development woes, as if sharing your good ideas makes them intrinsically valuable in some way.


    Ignoring the evidence in favour of “i have a good idea” is possibly what has resulted in the US being 17th in the recent Pearson league table of global education systems. I am just amazed that the UK is lagging so far behind the US in their educational decline.
