Last Week’s Verdict on the English GCSE Farrago

February 20, 2013

After a conversation with some English teachers, I pointed out in July 2012 that the English GCSE appeared to contain some ridiculous, dumbed-down content.

English Language GCSE – Narrowing the Horizons of the Next Generation

The week before the 2012 GCSE results came out I tweeted to say that schools seemed unaware that many of them could expect to see results fall.

On the day schools received the results (but before they were made public), I saw lots of claims that results in English must be down 10% across the board, and so I explained that some schools having a fall in scores was inevitable under “comparative outcomes”. In particular:

If too many schools target what they think is a “C”, then they won’t get it. It is no good looking at January mark schemes, or previous years’ mark schemes, and trying to replicate what was a C grade then. Everyone else will be doing the same thing and they can’t all get Cs. Boundaries will shift upwards.

Furthermore the effects of these things I have described will be disproportionately felt by schools which have focused on improving their number of C grades. If you aimed for lots of low Cs then you are likely to be in trouble. If you relied on controlled assessments and coursework to get grade Cs (i.e. cheating), then it is almost certain the goalposts will have moved. The effects will also be felt more in subjects where marking is imprecise and arbitrary.

None of the fuss so far has indicated yet that there has been a real problem with English beyond the failure of schools to realise the above. The culture of continual “improvement” (that actually just meant gaming the system) is quite heavily ingrained. An end to grade inflation will be a shock to the system with a lot of consequences for schools.

From A Note About The GCSEs

As the results were published the following day it became clear that there was no general collapse in results. I observed that the exam’s bizarre structure had resulted in many schools attempting to manipulate their results, but this had been foiled by the “comparative outcomes” approach used by the examining boards.

Actually, It Was About Cheating

A day or two later I responded to some of the arguments being put forward to establish unfairness in the exams. In particular I pointed out that there was no reason to maintain the January grade boundaries in June.

The Exam Hysteria Continues…

I then followed up on any remaining arguments a few days later and argued that what the regrading lobby were pushing for (a massive increase in C grades) was not acceptable.

More About Exams

The following month I looked at the case for comparative outcomes in more detail, and considered the arguments that had emerged since results came out.

A Note on Exams

The GCSE English Farrago

Finally, OFQUAL’s report came out in November and I observed that its key claims (that the exam was flawed and open to manipulation and that results could not have been allowed to shoot up), for which it had compiled a large amount of evidence, were exactly what I had claimed all along.

I Told You So

Now, this entire line of argument made me staggeringly unpopular with people who claimed only to care for the best interests of the students. These arguments were repeatedly dismissed as excuses and it was far more common to hear it claimed (without evidence) that Michael Gove had personally caused the results to be pushed down for political reasons, or that OFQUAL had made a mistake and, by pointing out the flaws in the exams, was “attacking teachers”.

For this reason I cannot resist pointing out that the claims went to the High Court, and last week a judgement was released with the following conclusion (I have highlighted some key points):

149. The claimants brought this case because they considered that students had been treated unfairly. There are two principal grievances: first, the actual performance of these students had not been fairly reflected in their grade because the results had been unjustly moulded to reflect predicted performance. The statistics had dominated the assessment process in a wholly unacceptable way. I have rejected that submission, essentially on the ground that it was legitimate for Ofqual to pursue a policy of comparable outcomes, ensuring a consistent standard year on  year, and assessing marks against predicted outcomes was a rational way of achieving that objective. Moreover, the Awarding Committee in each of these AOs believed that the June grades fairly reflected the quality of the candidates.

150. The second grievance is a wholly understandable one, and relates to the inconsistent treatment meted out to the students taking assessments in January and June respectively. There is no doubt with hindsight that the former were treated more generously than the latter. Some teachers, again understandably, took the January grade boundaries as a strong guide to future assessment. They did not anticipate the boundary shifting as much as it did in certain units. The reason for the change was in part that some teachers had marked papers more leniently in June specifically in order to bring them just above the C grade; but that was far from the whole story.  More significantly, there was fuller information available in June than in January and it became clear with hindsight that the January cohort had been treated too leniently.

151. Ofqual was in a difficult position. It considered and rejected the possibility of reassessing the January grade assessments.  Nobody seriously suggests that it should have retrospectively reduced a candidate’s grade in that way when the result had been made public. Yet if it were to have applied the grade boundaries in June, it would have led to a significant dilution of standards, with an unrealistically high proportion of students obtaining a C grade. That would have created an injustice as between those qualifying in June 2012 when compared with students in earlier and subsequent years. Indeed, the problem is compounded when it is appreciated that some candidates for particular units in June 2012 were qualifying in June 2013. If they were to be assessed according to the January 2012 boundary marks, that would be unfair to candidates taking the same unit in January and June 2013. It would manifest precisely the same unfairness that the claimants now allege, but shifted to different victims.

152. The problem lies in the modular nature of the examination, coupled with the fact that grade boundaries were assessed and made  public at each stage of the process. Mr Sheldon [the QC acting on behalf of the regrading lobby] was highly critical of this structure. He rightly points out that a number of experts had predicted precisely the kind of difficulties which have, in fact, arisen. He says that the problem is of Ofqual’s own making (or at least, Ofqual’s predecessor). That may be so, but the judicial review challenge is not to the modular nature of the assessment process, or to the practice of assessments being made at different points in the two year qualification period. It is a challenge to the way in which Ofqual and the AOs sought to deal with the problems once they had materialised.

153. Initially it was assumed that since the same procedures were being adopted in January as in June, there should be no change in standards.  In fact, this was not so and the January cohort were assessed more leniently.  Once that became clear, Ofqual was engaged in an exercise of damage limitation. Whichever way it chose to resolve the problem, there was going to be an element  of unfairness. If it imposed the same standard in June as it had in January, this would be unjust to subsequent cohorts of students taking the units in subsequent years. If it did not, that would favour the January cohort over the June cohort in 2012. Unless standards were to be lowered into the future and the currency of GCSE English debased, at some stage a decision would have had to be taken to depart from the less rigorous January grade boundaries and at that point, whenever it was, there would be winners and losers.  

154. The claimants submit that even if the January cohort was treated unduly favourably, it was wrong to draw a distinction between groups of candidates qualifying in the same year. This was more important than equality as between years.

155. However, there is no obvious or right answer to the question where the balance of unfairness should lie. Ofqual’s solution was in my judgment plainly open to them. Their priority was to protect the comparable outcomes objective, although it meant that January candidates were treated more generously.  However, the adverse consequences were relatively contained by acting at that point since far fewer students took the relevant units in January than in June.  

156. For these reasons, which briefly recapitulate those spelt out in some detail in this judgment, I do not think it can be said that Ofqual or the AOs erred in law.

157. I therefore dismiss these applications. As I have said, however, this is a rolled up hearing, and although nothing turns on the point, I would grant permission for the applicants to bring these proceedings. This was a matter of widespread and genuine concern; there was on the face of it an unfairness which needed to be explained. There is no question, in my view, that the matter was properly brought to court. Indeed, following the outcry when the results were published in August, Ofqual itself carried out an investigation into the concerns which were being expressed and produced two reports, an interim report and a final one produced after consulting widely with interested parties. Ofqual was not persuaded that it should require the grade boundaries to be changed, but it appreciated that there were features of the process which had operated unfairly and it proposed numerous changes for the future which are designed to ensure that the problems which arose in this case will not be repeated. It also took the unusual step of allowing students to take resits in November instead of having to wait until the following January. We are not directly concerned with those reports which simply reflect Ofqual’s own views. However, having now reviewed the evidence in detail, I am satisfied that it was indeed the structure of the qualification itself which is the source of such unfairness as has been demonstrated in this case, and not any unlawful action by either Ofqual or the AOs.

Does anybody who said I was wrong before care to reconsider their position?

  1. I stated before that the legal challenge to have all students graded by the lenient January boundaries was wrong and that the focus should be on ensuring that the June boundaries were not set at too harsh a level to try and compensate. That issue has been lost in the noise from the court case.

    I believe, based on discussions with senior OFQUAL staff and reading FOI requests, that this is what happened:

    January 2012 written foundation paper was graded too leniently (and June ’11 and Jan ’11 by inference).

    These papers were, collectively, taken by just over 30% of the full cohort.

    This will have resulted in *some* students gaining an overall grade that was above their merit (others will not have crossed a grade boundary due to the leniency in this unit).

    No consideration was undertaken by OFQUAL into whether this would impact on the applicability of the comparable outcome approach because OFQUAL initially thought that the number of students taking this unit was very small (it was only later they found out it was 30%+).

    OFQUAL have advised that this argument was not considered by the High Court – they restricted consideration to the general principle of comparable outcomes.

    I believe that OFQUAL should have identified how many ‘false Cs’ were awarded overall because of the leniency and adjusted the comparable outcomes generated A*-C target to reflect this thus ensuring that June students were not disadvantaged against the ‘correct’ standard for a grade C.

    It may be that relatively few false C grades were awarded because of the errors in grading earlier units but, and this is a key for me, OFQUAL still don’t know how many such students there were!

    I find that worrying.

    My other area of concern is that AQA moved the grade boundaries for the written controlled assessment unit by 3 marks due to ‘teacher overmarking’ yet they did not carry out any statistical analysis as to the scale of overmarking until the enquiry itself.

    The enquiry found an average overmark of just over 1 mark per student – this doesn’t really justify a 3 mark shift in boundaries and neither AQA nor OFQUAL have, to my knowledge, explained the discrepancy.

    I do find it worrying that AQA were able to move the boundaries by 3 marks (from those previously used and explicitly given out at training events) without any formal analysis taking place to guide them as to the scale and size of the problem.

    • This is from memory so I could have got this wrong, but don’t have time to look it up.

      I seem to recall that the November OFQUAL report did consider the possibility that the January sittings may have made the June marks too harsh and dismissed it for reasons I didn’t fully understand. At any rate, if there was any evidence for this it is odd that it has never been produced by the regrading lobby.

      With regard to AQA CA boundaries, I think (and I haven’t looked this up) the court judgement considered this and concluded that if they hadn’t been moved then the other boundaries would have been moved more instead and overall it would have made little difference.

  2. re: 2nd point – it may not have made a difference overall but it almost certainly would on an individual student level – where is the fairness in that?

    • I think we have long established unfairness. Question is whether unfairness best dealt with by raising grade C boundaries in June.

  3. Well done OA!

    Well done court! (one with common sense -gosh)

    Well done me! (sorry- but the court made the same, bleeding obvious, point that I did, about shifting unfairness to different sets of ‘victims’)

    Steve- if some Jan kids were ‘inflated’, it would have been a smallish chunk (as that 30% covers A* to G kids) and they would have been ‘real’ high D’s.

    Therefore the corresponding kids that were ‘depressed’ in June, again a small chunk, would have been ‘real’ low C’s.

    I don’t think it’s the crime of the century because a low C student is, in my view, already deeply flattered by our assessment system.

    I would say a low C English student in the UK is one that many would regard as barely literate or even illiterate.

    I think it’s a sound judgement and hopefully people can now move on.

    • Exactly. What really bothered me about all this was the sob stories from pupils saying that they now couldn’t do the A levels they wanted because their low C had become a D.

      Well, I’m sorry, but if you’re unable to get your GCSE grade up to something reasonable without the intervention of the exam board or Ofqual, then by no stretch of the imagination would you be capable of doing A levels in the first place. Your teachers, likewise, shouldn’t be, to adapt Rob’s line, ‘deeply flattering’ you by making you believe that you should be doing them

      • …in the first place. (Pressed the Send button by mistake.)

