The tragedy of grades based on predictions

August 16, 2020

When I wrote about an exam announcement last week it was out of date before I’d finished typing. This post too may now be out of date if the appeals system allows major changes, but I have seen so much false information that I thought I’d better get this out there.

Exams were not sat this year. The decision was made instead to predict what grades would have been given. This is probably the decision that should have been debated. Instead the debate has centred on how grades were predicted with much talk of an evil algorithm crushing children’s hopes. Some wished to predict grades deliberately inaccurately in order to allow grade inflation to hide the problems. Because opportunities such as university places and employment are finite, grade inflation actually doesn’t solve any problem. What it does is make sure that when people lose out on opportunities, it would not be clear that this year’s grades were the problem. I argued against the idea that grade inflation solves problems here and will not be going into it again now, but it is worth noting that most disagreement with any opinions I express in this post will be from advocates of using grade inflation to solve problems, rather than anything else. In particular, it needs to be acknowledged that the use of teacher assessment would have on average led to more grade inflation.

However, because people seemed to think inaccuracy in grades would justify grade inflation, and because people objected to specific grades when they arrived, there has now been huge debate about how grades were given. Much of this has been ill-informed. 

I intend to explain the following:

  1. How grades are predicted.
  2. Why predicted grades are inaccurate.
  3. What claims about the process are false or unproven.

Normally, I’d split this into 3 posts, but things are moving so fast I assumed people would want all this at once in one long post.

How grades are predicted.

Ofqual produced a statistical model that would predict the likeliest grades for each centre (usually a school or college). This used all the available data (past performance and past grades of the current cohort) to predict what this year’s performance would have been. This was done in accordance with what previous data showed would predict grades accurately. A lot of comment has assumed that if people are now unhappy with these predictions or individual results, then there must have been a mistake in this statistical model. However, this is not something where one can simply point at things one doesn’t like and say “fix it”. You can test statistical models using old data, e.g. predict 2019 grades from the years before 2019. If you have a model that predicts better than Ofqual’s then you win, you are right. If you don’t, and you don’t know why the Ofqual model predicts how it does, then you are probably wrong. In the end, proportions of grades were calculated from grades given in recent years, then adjusted in light of GCSE information about current students, then the number of expected A-levels in each subject at each grade was calculated for each centre. Centres were given information about what happened in this process in their case.
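The backtesting idea can be made concrete with a toy example. This is a minimal sketch, not Ofqual's model: the yearly figures, the function names and the two competing "models" (average of the last year versus average of the last three years) are all hypothetical, chosen only to show how two prediction rules can be scored against a year whose results are already known.

```python
from statistics import mean

# Hypothetical share of A* grades at one centre, by year (illustrative only).
a_star_share = {2016: 0.05, 2017: 0.06, 2018: 0.10, 2019: 0.08}

def predict_from_history(history, target_year, window):
    """Predict the target year's share as the mean of the previous `window` years."""
    earlier_years = [y for y in sorted(history) if y < target_year][-window:]
    return mean(history[y] for y in earlier_years)

def backtest_error(history, target_year, window):
    """Absolute gap between the prediction and what actually happened."""
    predicted = predict_from_history(history, target_year, window)
    return abs(predicted - history[target_year])

# Score two candidate rules on 2019, a year we already know the answer for.
error_one_year = backtest_error(a_star_share, 2019, window=1)
error_three_years = backtest_error(a_star_share, 2019, window=3)
```

With these made-up numbers the three-year rule lands closer to the 2019 figure than the one-year rule. That kind of comparison, run on real data, is the only way to show one model is better than another.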

Although the model came up with the grades at centre level, which students got which grades was decided by the centres. Centres ranked their students in each subject and grades were given in rank order. Some commentary has overlooked this, talking as if the statistical model decided every student’s grade. It did not. It determined what grades were available to be given (with an exception to be discussed in the next paragraph), not which student should get which grade. As a result the majority of grades were not changed and where they were, it would often have been a result of the ranking as well as the statistical model.
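The split between the two steps can be sketched in a few lines. This is an illustration under stated assumptions, not the actual implementation: the student names, the grade counts and the function name are hypothetical. The point is that the list of available grades comes from the statistical model, while the order in which they are handed out comes entirely from the centre's ranking.

```python
def allocate_grades(ranked_students, grades_available):
    """Hand out a centre's available grades in rank order.

    ranked_students: names, best first (the centre's own ranking).
    grades_available: (grade, count) pairs, best grade first
                      (what the statistical model made available).
    """
    pool = [grade for grade, count in grades_available for _ in range(count)]
    if len(pool) != len(ranked_students):
        raise ValueError("number of grades must match number of students")
    return dict(zip(ranked_students, pool))

ranking = ["Asha", "Ben", "Chloe", "Dev", "Ella"]       # hypothetical ranking
available = [("A", 1), ("B", 2), ("C", 1), ("D", 1)]    # hypothetical distribution
results = allocate_grades(ranking, available)
```

Swapping two names in `ranking` changes who gets which grade without touching the distribution, which is why a disputed individual grade may trace back to the ranking rather than to the model.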

Finally, there was an exception because of the problem of “small cohorts” taking exams i.e. where centres had very few students taking a particular exam (or very few had taken it in the past). This is because where there was less data, it would be harder to predict what grades were likely to be given. Centres had also been asked to predict grades (Centre Assessed Grades or CAGs) for each student and for the smallest cohorts these were accepted. Slightly larger cohorts were given a compromise between the CAGs and the statistical model, and for cohorts that were larger still, the statistical model alone was used.
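One way to picture the compromise for slightly larger cohorts is as a weighted average that slides from CAGs to the statistical model as the cohort grows. The thresholds (5 and 15) and the straight-line ramp below are assumptions made purely for illustration; the post does not give the actual cut-offs or formula.

```python
def model_weight(cohort_size, small=5, large=15):
    """Weight placed on the statistical model for a given cohort size.

    0.0 means CAGs only, 1.0 means statistical model only.
    The thresholds and the linear ramp are hypothetical placeholders.
    """
    if cohort_size <= small:
        return 0.0
    if cohort_size >= large:
        return 1.0
    return (cohort_size - small) / (large - small)

def blend(cag_share, model_share, cohort_size):
    """Blend the share of a grade implied by CAGs with the model's share."""
    w = model_weight(cohort_size)
    return (1 - w) * cag_share + w * model_share

# A cohort of 10 sits halfway between the two hypothetical thresholds.
blended_a_star_share = blend(cag_share=0.30, model_share=0.10, cohort_size=10)
```

Under these assumptions a cohort of 3 keeps its CAGs, a cohort of 20 gets the model's distribution, and a cohort of 10 lands halfway between the two.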

It is important to understand this process if you think a particular grade is wrong. Without knowing whether the cohort was small, why the statistical model would have predicted what it did, how the distribution was calculated for a centre, and where a student was in the ranking, you do not know how a grade came to be given. For some reason, people have jumped to declare the evils of an “algorithm”. Didn’t get your result? It’s the result of an algorithm.

As a maths teacher, I quite like algorithms. Algorithms are the rules and processes used to solve a problem, perhaps best seen as the recipe for getting an answer. Every year algorithms are used after exams to decide grade boundaries and give grades. A mark scheme is also an algorithm. The alternative to algorithms deciding things is making arbitrary judgements that don’t follow rules. This year is different in that CAGs, a statistical model (also a type of algorithm), and centre rankings have replaced exams. The first thing that people need to do to discuss this sensibly is to stop talking about an algorithm that decided everything. If you mean the statistical model then say “the statistical model”. There are other algorithms involved in the process, but they are more like the algorithms used every year: rules that turn messy information into grades. Nobody should be arguing that the process of giving grades should not happen according to rules. Nobody in an exam board should be making it up as they go along.

Why predicted grades are inaccurate.

Predicted grades, whether from teachers or from a statistical model, are not likely to be accurate. That’s why exams are taken every year. The grades given will not have been the same as those that would have been given had exams been sat. Exam results are always influenced by what seem like random factors that nobody can predict (I will discuss this further in the next section). We can reasonably argue over what is the most accurate way to predict grades, but we cannot claim that there is a very accurate method. There are also situations where exam results are very hard to predict. Here is why I think this year’s results will be depressingly inaccurate.

Some students are exceptional. Some will get an A* in a school that’s never had an A*. Some will get a U in a school that’s never had a U. Predicting who these students are is incredibly difficult and remains difficult even where historic A-level results are adjusted to account for the GCSE data of current students. Students will often have unfairly missed out (or unfairly gained) wherever very high or low grades were on the table (i.e. if students were at the top or the bottom of rankings). This is the most heartbreaking aspect of what’s happened. The exceptional is unpredictable. The statistical model will not pick up on these students. If a school normally gets some Us (or it gets Es but this cohort is weaker than usual) the model will predict Us. If a school doesn’t normally get A*s (or it does but this year’s cohort is weaker than usual) the model will not predict A*s. This will be very inaccurate in practice. You might then think that CAGs should be used to identify these students. However, just as a statistical model won’t pick up an A* or U student where normally there are none, a teacher who has never taught an A* or U student will not be able to be sure they have taught one this time. In the case of U it might be more obvious, but why even enter a student for the exam if it was completely obvious they’d get U? The inaccuracy in the CAGs for extreme grades was remarkable. In 2019, 7.7% of grades were A*; in 2020, 13.9% of CAGs were A*. In 2019, 2.5% of grades were Us; in 2020, 0.3% of CAGs were Us. Both the CAGs and the statistical model were likely to be wrong. There’s no easy way to sort this out; it’s a choice between two bad options.
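To put the gap at the extremes into proportion, the four percentages quoted above can be turned into ratios. The figures are the ones in the text; the variable names are just labels chosen for this sketch.

```python
# Figures quoted in the text: share of all grades (2019) and of CAGs (2020), in percent.
a_star_2019, a_star_cag_2020 = 7.7, 13.9
u_2019, u_cag_2020 = 2.5, 0.3

# How many times more A*s the CAGs contained than the 2019 results did.
a_star_ratio = a_star_cag_2020 / a_star_2019

# How many times fewer Us the CAGs contained than the 2019 results did.
u_ratio = u_2019 / u_cag_2020
```

On those figures the CAGs contained roughly 1.8 times as many A*s as 2019 and roughly one eighth as many Us.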

As well as exceptional students, there are exceptional schools. There are schools that do things differently now, and their results will be different. Like exceptional students, these are hard to predict. Ofqual found that looking at the recent trajectory of schools did not tell them which were going to improve and so the statistical model didn’t use that information. Some of us (myself included) are very convinced we work in schools that are on the right track and likely to do better. However, no school is going to claim otherwise and few schools will admit grades are going to get worse, so again, CAGs are not a solution. Because exceptional schools and exceptional students are by their very nature unpredictable, this is where we can expect to find the biggest injustices in predicted grades.

Perhaps the biggest source of poor predictions is the one that people seem to be reluctant to mention. The rankings rely on the ability of centres to compare students. There is little evidence that schools are good at this, and I can guarantee that some schools I’ve worked at would do a terrible job. However, if we removed this part of the process, grades given in line with the statistical model would be ignoring everything that happened during the course. Few people would argue that this should happen, so this hasn’t been debated anywhere near as much as other sources of error. But for individual students convinced their grades are wrong, this is likely to be incredibly important. Despite what I said about the problems with A*s and Us, a lot of students who missed out on their CAG of A* will have done so because they were not highly ranked, and a lot of students who have got Us will have done so because they were ranked bottom and any “error” could be attributable to their school rather than an algorithm. 

Finally, we have the small cohorts problem. There’s no real way round this, although obviously plenty of technical debate is possible about how it should be dealt with. If the cohort was so small that the statistical model would not work, something else needs to be done. The decision was to use CAGs fully or partially, despite the fact that these are likely to have been inflated. Inflated grades are probably better than random ones or ones based on GCSE results. But this is also a source of inaccuracy. It also favours centres with small cohorts in a subject and, therefore, it will allow systematic inaccuracy that will affect some institutions very differently to others. It is the likely reason that CAGs have not been adjusted downwards equally in all types of school. Popular subjects in large sixth forms are likely to have ended up with grades further below CAGs than obscure subjects in small sixth forms.

Which claims about the process are false or unproven

Much of what I have observed of the debate about how grades were given has consisted of calls for grade inflation disguised as complaints about inaccuracy, or emotive tales of students’ thwarted ambitions that assume that this was unfair or unusual without addressing the cause of the specific disappointment. As mentioned above, much debate has blamed everything on an “algorithm” rather than identifying what choices were made and why. Having accepted the problems with predicting grades and acknowledged the suffering caused by inaccuracies, it’s still worth trying to dispense with mistaken, misleading or inaccurate claims that I have seen on social media and heard on the news. Here are the biggest myths about what’s happened.

Myth 1: Exam grades are normally very accurate.

A lot of attempts to emphasise the inaccuracies in the statistical model have assumed that there is more precision in exam grades than there actually is. In reality, the difference between a B grade student and a C grade student can be far less than the difference between two B grade students. Some types of exam marking (not maths, obviously) are quite subjective and there is a significant margin of error, making luck a huge factor in what grades are given. Add to that the amount of luck involved in revising the right topics, having a good day or a bad day in the exam, and it’s no wonder grades are hard to predict with accuracy. It’s not comforting to think that a student may miss out on a university offer because of bad luck, but that is not unique to this year; it is normal. The point of exam grades is not to distinguish between a B grade and a C grade, but between a B grade and a D grade or even an E grade. It’s not that every A* grade reflects the top 7.7% of ability; it’s more a way of ensuring that anyone in the top 1%, say, should get an A*. All grades are a matter of probability, not a definitive judgement. That does not make them useless or mean that there are better alternatives to exams, but it does mean everyone should interpret grades carefully every year.

Myth 2: CAGs would have been more accurate.

As mentioned above, CAGs were higher than they should have been based on the reasonable assumption that a year group with an interrupted year 13 is unlikely to end up far more able than all previous year groups. There’s been a tendency for people to claim that aggregate errors don’t tell us anything about inaccuracies at the level of individual students. This is getting things backwards. It is possible to have inaccuracies for individual students that cancel each other out and aren’t visible at the aggregate level. So you could have half of grades being too high, and half too low, and on average the distribution of grades seems fair. You could even argue that this happens every year. But this does not work the other way. If, on average, grades were too high it does tell us something about individual grades. It tells us that they are more likely to be too high than too low. This is reason enough to adjust downwards if you want to make the most accurate predictions.
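The asymmetry can be shown with two made-up cohorts. This is a sketch under a simplifying assumption that is not from the post: every prediction is off by at most one grade. With that restriction, a positive average error can only arise if more predictions are too high than too low. The error lists are hypothetical.

```python
from statistics import mean

def summarise(errors):
    """errors: predicted grade minus true grade for each student, in whole grades."""
    too_high = sum(1 for e in errors if e > 0)
    too_low = sum(1 for e in errors if e < 0)
    return {"mean_error": mean(errors), "too_high": too_high, "too_low": too_low}

# Hypothetical cohorts of ten students; every error is at most one grade.
cancelling = [+1, -1, +1, -1, +1, -1, +1, -1, 0, 0]   # errors hidden in the aggregate
inflated = [+1, +1, +1, +1, +1, -1, 0, 0, 0, 0]       # errors visible in the aggregate

cancelling_summary = summarise(cancelling)
inflated_summary = summarise(inflated)
```

The first cohort has a mean error of zero even though eight of its ten grades are wrong. The second has a mean error of 0.4 of a grade, and a wrong grade in it is five times as likely to be too high as too low.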

Myth 3: Individual students we don’t know getting unpredicted Us and not getting predicted A*s are examples of how the statistical model was inaccurate.

As argued above, the statistical model is likely to have been inaccurate with respect to the extremes. However, because we know CAGs are also inaccurate, and that bad rankings can also explain anomalies, we cannot blindly accept every story about this from kids we don’t know. I mention this because so much commentary and news coverage has been anecdotal in this way. If there were no disappointed school leavers that would merely tell us that the results this year were way out compared to what they should have been, because disappointed school leavers are normal when exam grades are given out. Obviously, the better you know a student, the more likely you are to know a grade is wrong, but even then you need to know their ranking and the justification for the grade distribution to know the statistical model is the problem.

Myth 4: The system was particularly unfair on poor bright children.

This myth seems to have come from two sources, so I’ll deal with each in turn.

Firstly, it has been assumed that as schools which normally get no A*s would not be predicted A*s (not quite true) this means poor bright kids in badly performing schools would have lost out. This misses out the fact that even with little history of getting A*s previously, they might still be predicted if the cohort has better GCSE results than usual, so the error is less likely if the poor bright kid had good GCSEs. It also assumes that it is normal for poor kids to go on to do A-levels in institutions that get no A*s, which is unlikely for big institutions. Additionally, schools are not uniform in their intake. The bright kid at a school full of poor kids who misses out is not necessarily poor; in fact, because disadvantaged kids are likely to get worse results, they often won’t be. Finally, it’s not just low achieving schools whose A* students are hard to predict. While a school that usually gets no A*s in a subject, but which would have got one this year, makes for a more dramatic story, the situation of that child is no different to the lowest ranked child in a school that normally gets 20 A*s in a subject and this year would have got 21.

The second cause of this myth is statistics about downgrading from CAGs, like these.

Although really this shows there’s not a huge difference between children with a different socioeconomic status (SES), it has been used to claim that poorer students were harder hit by downgrading and, therefore, it is poor bright kids that will have been hit worse than wealthier bright kids. (Other arguments have looked at type of school, but I’ll deal with that next.) Whether this figure is a result of the problem of small cohorts, or from the fact that it is harder to overestimate higher achieving students, I don’t know. However, we do know the claim that these figures reflect what happened to the highest achieving kids is incorrect. If we look at the top two grades, the proportion of kids who had a high CAG and had it downgraded is smaller for lower SESs (although because fewer students received those grades overall, the chance of being downgraded given that you had a high CAG would show the opposite pattern).
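The two measures in that last sentence are easy to confuse, so here is a minimal sketch of how they can point in opposite directions. All the counts are hypothetical, chosen only to reproduce the pattern described; they are not the published figures.

```python
def downgrade_rates(group_size, high_cags, high_cags_downgraded):
    """Return two different ways of measuring downgrading of high CAGs.

    share_of_group: downgraded high CAGs as a share of everyone in the group.
    share_of_high_cags: downgraded high CAGs as a share of those who had a high CAG.
    """
    share_of_group = high_cags_downgraded / group_size
    share_of_high_cags = high_cags_downgraded / high_cags
    return share_of_group, share_of_high_cags

# Hypothetical counts, chosen only to show that the two measures can disagree.
low_ses = downgrade_rates(group_size=1000, high_cags=100, high_cags_downgraded=50)
high_ses = downgrade_rates(group_size=1000, high_cags=300, high_cags_downgraded=120)
```

In this made-up case 5% of the low SES group had a high CAG downgraded against 12% of the high SES group, yet a low SES student holding a high CAG faced a 50% chance of a downgrade against 40% for a high SES student. Which number someone quotes determines which story they appear to tell.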

 

Myth 5: The system was deliberately rigged to downgrade the CAGs of some types of students more than others

I suppose it’s probably worth saying that it’s impossible to prove beyond all doubt that this is a myth, but I can note the evidence is against it. The statistical model should not have discriminated at all. The problem of small cohorts and the fact it is easier to over-estimate low-achieving students and harder to over-estimate high achieving students seem to provide a plausible explanation of what we can observe about discrepancies in downgrading. Also, if we compare results over time, we would expect those types of institutions who on average had a fall in results last time to have a rise this year. Take those three factors into account and nobody should be surprised to see the following or to think it sinister (although it would be useful to know to what extent each type of school was affected by downgrading and by small cohort size).

If you see anyone using only one of the above two sets of data, ignoring the change from 2018 to 2019, or deciding to pick and choose which types of centre matter (like comparing independent schools with FE colleges) suspect they are being misleading. Also, recall that these are averages and individual subjects and centres will differ a lot. You cannot pick a single school like, say, Eton and claim it will have done well in avoiding downgrading in all subjects this year.

Now for some general myth-busting.

The evidence shows students were affected by rounding errors. False. Suggestions like this, often used to explain unexpected Us, seem entirely speculative and not necessary to explain why students have got Us.

Some students got higher results in further maths than maths. True. Still a tiny minority, but much higher than normal.

No students at Eton were downgraded. Almost certainly false. This claim, which was all over Twitter, is extremely unlikely: it has been denied anecdotally and there is no evidence for it. We would expect large independent schools to have been downgraded in popular subjects.

Something went wrong on results day. False. Things seem to have gone according to plan. If what happened was wrong it was because it was the wrong plan. Nothing surprising happened at the system level.

Students were denied the grades they needed by what happened. True for some students, but on average there is no reason to think it would have been more common to miss out on an offer than if exams had taken place, and some institutions might become more generous, if they can, due to the reduced reliability of the grades.

Results were given according to a normal distribution. False.

Rankings were changed by the statistical model. False. Or at least if it did happen, it wasn’t supposed to and an error has been made.

The stressful events of this year where exams were cancelled show that we shouldn’t have exams. False. Your logic does not resemble our earth logic.

And one final point. So many of the problems above come down to small cohort size that next week’s GCSE results should be far more accurate. Fingers crossed. And good luck.

34 comments

  1. Excellent blog and summary of the issues.
    Anecdotally it feels like there are more individual stories of “unfairness” this year (certainly at my school). Possibly this is because “unfairness” as part of taking exams is more readily accepted. However I do wonder how much is due to Universities being cautious with accepting students who missed their grades due to uncertainty over appeals and the strict caps they’ve had imposed this year. Certainly from my school we have a number of students who have just missed their grades who we would normally expect to still have their places accepted but this year have been rejected. We always have a few students who are left without offers, but it feels much greater than usual this year. As I say this is anecdotal but from talking to colleagues and University admissions it does seem to have some grounding in fact.
    Which brings me to my main criticism of the government and Ofqual which has been the sense of panic ever since Scotland awarded CAGs on Tuesday. Ever since then there seems to have been a desire to do something, anything, and all the uncertainty created by poor comms and decisions has made the matter 100% worse. I have an awful feeling that they will “fold” and simply award CAGs which in my view makes a bad situation worse: it places Universities in an even trickier position, penalises students at schools whose processes in the awarding of CAGs were the most robust, and leads to massive grade inflation which doesn’t help anything. It would also shift the current anger onto schools and teachers rather than make it go away, as a lot of the stories I am seeing are based on predicted grades (i.e. UCAS ones) rather than CAGs – in fact (going back to my point about comms) the conflation of CAGs with “predictions” has been thoroughly unhelpful.
    With the benefit of hindsight I would have changed a number of things in how the process worked, although I maintain that the broad ideas were the least bad option at the time. However hindsight is no use here and rather I believe we need practical solutions going forward. I would make the following adjustments to the model:
    – I wouldn’t award any Us unless the CAG was a U. Simply because I think Us are almost impossible to predict;
    – I would have built in a small margin of error for grade distributions, and award CAGs if they were within the margin of error. This would e.g. allow a school whose model suggests no A*s in a subject to award an A* to a particularly outstanding student.
    – I would look at subject quirkiness, for example I believe nobody should have received a lower grade in Maths than in Further Maths, as the typical number is vanishingly small and therefore impossible to predict
    – I would drop the ludicrous idea that students can appeal based on their Mock results.
    I believe you could make all of these adjustments now in retrospect for A-levels and before GCSE results on Thursday. I realise that this would lead to more grade inflation than we have had (and it’s important to note that grades are already up this year), but I believe (although I could be wrong) that it would be a small difference, and could help tackle some (possibly most) of the individual stories of unfairness which are real and genuine.


    • Forgot to add:
      – I would relax the cap on University places, if possible for this year, but if not (and I suspect it might be too late) for next year as otherwise I think the knock-on effect on next year’s cohort will definitely be felt.


    • With the Us, the issue is there are a lot more of them than CAGs predicted, so while they are hard to predict, CAGs would be way off.

      With regard to the maths and further maths things, this was a definite distortion, but it seems a relatively unimportant one affecting 3% of students on a course a lot of centres don’t offer.

      I’m not going near the issue of appeals.


      • Re Us: my issue is it is IMO a worse crime if one person gets a U who didn’t deserve one than 50 get an E when actually they should have gotten a U. Hence on that particular boundary I’m happy to be more wrong (statistically) in that direction. If that makes sense. I would even be happy to say no Us at all for one year, regardless of CAGs. I don’t think the impact of that inflation would be that huge, compared to the impact on individuals getting unearned Us.

        You’re right that it is only 3%, but again these are real people who have been hard done by. There are anecdotes of people not getting their Maths place at Uni despite an A* in Further Maths because their single Maths result was “only” an A.

        I guess my point is that whilst the global picture of grades awarded is just about right, there are genuine individual stories of hardship. Where possible we should seek to redress these if doing so doesn’t make the global picture wrong. I think upgrading Us and the Fm/Maths issue above wouldn’t inflate grades much, but would redress some of the individual mistakes, hence why I would apply them.

        I would add a couple of ideas:
        – no grade should have been dropped more than 2 below the CAG
        – All downgrades of 2 grades should be systematically reviewed on a case-by-case basis (but I’m not sure how this would work in practice with the details, and it’s probably impossible retrospectively)


        • Most of your suggestions here are ways of increasing grade inflation. Maybe not big ways, but if that’s what you do to get out of every difficulty, where does it end?


          • We already have grade inflation because of the small cohorts thing. So we accept that some grade inflation is necessary at the expense of perceived “individual fairness” where modelling isn’t all that likely to be that accurate anyway. I am suggesting that there are a few other cases where it could/should be done as well, where I believe the effect on grade inflation would be small enough and the effect on individuals large enough to justify it.

            If preventing grade inflation was the sole consideration then the standardisation could have been applied even to small cohorts.

            I think this conversation will be moot fairly soon, as I believe it’s very likely we will simply see CAGs awarded (which I think would be very wrong) anyway.


            • The small cohorts thing could be described as grade inflation to avoid perceived unfairness, but it could equally be described as allowing it to avoid giving any grades at random (if you think the statistical model should have been followed in those cases) or ignoring all inputs from the last two years (if you think it should have been calculated from a non-centre specific model).


  2. Before exams were cancelled, a solution was required. They had months to come up with one, and this isn’t it. Some of your points are valid (most of the mythbusting at the end), some not so much. Rounding down (very much part of the model) made it impossible for some high achievers in historically struggling 6th forms to ever achieve their predictions, because “the algorithm” dictated that there could be no A* grades awarded – previous years are to blame, but this year’s cohort suffer. And the sensitivity of cohort ranking as a grading tool (when I understand that teachers were led to believe it was more of a backup to predicted grades) is not up to the task.

    But the main issue is exactly the one grades are meant to solve – how to assess one student against another in a different school. Unusually bright children in historically unremarkable schools have been punished for attending the wrong institution, and all their classmates have paid the penalty as well, as a knock-on effect, because the top spot in the algorithm’s curve is taken already. I struggle to see how anyone can defend such a situation.

    The irony is that this algorithmic fitting is apparently necessary because of a lack of confidence in teachers’ predicted grades, but relies entirely on those same teachers’ ability to accurately assess and rank every single member of a cohort in correct order. Unless, of course, you happen to be part of a small cohort, or attend an institution without a suitable historical record for modelling, when CAGs will be accepted without question.

    There are philosophical questions behind how we handle this, and what our priorities are. I would err on the side of assuming the best at the expense of grade consistency, but can see both sides of the argument. What I cannot understand is an attitude that extrapolates from expected trends to individual results, effectively punishing children for attending the wrong college and wrecking their future prospects not due to anything they did but because “computer says no”.


    • Before they were cancelled nobody had a reason to look for one.


    • Apparently, research indicates teachers are able with better accuracy to predict relative rank performance of students compared to absolute judgement.

      It will be interesting to see how businesses respond to a plethora of “top grade” students; how these students progress during their next stage of study/employment; whether Ofqual will ask every future year for CAG data; and whether other statisticians will be invited to improve on Ofqual’s model development.


  3. As always an excellent article. It would be good if everyone commenting on the results had at least a basic understanding of maths before they did so.

    You haven’t mentioned the one thing that has annoyed me most – that every teacher who has been interviewed has been in a school with its strongest ever cohort of students. Again, this is statistically unlikely and may be due to this cohort being the first where the majority of GCSE grades were on the 9-1 scale


  4. From what I have read it is far worse than you have said with regard to the prior attainment algorithm. The prior attainment of students on a course is based on the achievement at GCSE of everyone on the course. If someone on the course had poor GCSE results it would restrict the number of A and A* grades that could be awarded to a more talented student on the course. Secondly, all GCSEs have been looked at. So someone studying for physics A level with a diabolical grade in Spanish and Art leads to a lack of access to the higher grades.


    • As I understand it, the adjustments for prior attainment varied according to the ability range. So As and A*s would only be reduced if the prior attainment showed fewer of those whose GCSEs suggested very high achievement.


  5. Thank you for your interesting and informative article. As an admissions tutor I have spent the last week dealing with the fallout from the statistical model in clearing. I have spoken to many applicants, none from a private school. That may be a function of the institution that I am at and my subject area. Nevertheless, every student I have spoken to has had marks that are downgraded from both predicted (not surprising) and CAG marks. A few points:

    1) I think I can pretty much grade student work with 99% accuracy in marine science to its band (1st, 2.1, 2.2 etc.). I cannot grade accurately to 1%. I don’t assess A-levels but I have a feeling teachers should have been asked to grade to band.
    2) If the problems are at the extremes (A* and U), we should not have the extremes this year. There are other ways that colleges/schools can recognise high achievers outside the exam system that universities will value for admissions (we don’t just look at grades). In the real world U’s are often a result of external influences – illness, bereavement etc.
    3) The trajectory of schools is a really important factor, I think in particular for those that have been struggling, often in challenging locations. As successive governments have been so intent on telling us, leadership matters. I don’t think that has been accounted for?
    4) To use exams to judge the worth or potential of a young person is pretty dreadful – at school or university. Quizzes should only ever be used to do a quick check of general understanding. We should get rid of exams and go back to a pre-Govian modular system that rewards sustained hard work.


    • 1) Private schools are a small sector and perhaps focused on some disciplines more than others. That said, the small sixth forms probably did gain, particularly in the minor subjects.
      2) How does one “not have the extremes”? And Us are more common than this suggests.
      3) It was left out of the statistical model because it was shown not to predict results.
      4) Very few teachers want to go back to the days of coursework and controlled assessment. They were abandoned for good reason, not least because they were widely manipulated.


        • For 2: Judging leniently is still judging.
          For 4: While I accept that in other countries, alternatives to exams might have worked better, here they were a disaster over a long period of time. That needs to be explained before repeating the same mistakes.


  6. I am not convinced that prior attainment has been looked at at the individual level, rather than just for the cohort. This is very unfair to an individual who may have excellent GCSE results when their classmates do not.


    • They have looked at the cohort, but remember, an exceptional individual only needs the top grade to change. I’m not aware of why 1 person being much stronger than usual at GCSE can’t change 1 grade at A-level.


      • At my school no A*s have been awarded in physics, despite having an exceptional individual and having received A*s before, and there is a big negative sign next to the prior attainment at A*. This negative sign has been generated because the prior attainment of the cohort is weak compared with previous years, but individuals within the cohort have prior attainment which is as strong as ever. This also applies to some of our A students downgraded to Bs and Cs. Now in some large top private schools this won’t happen, because they don’t allow D and E grade candidates onto the course, to preserve their A-A* statistics. We do allow D and E candidates onto our A level courses, and why shouldn’t we?


        • It should be the relative, not absolute, strength of the cohort that determines the prior achievement adjustment, i.e. how it compares with the same subject cohort in previous years.


    • Yes, but the cohort is made up of individuals.


      • Well, we’ve gone from 5% A*s over the last 4 years to 0% A*s, and from 22% A-A*s over the last 4 years to 17%, despite it being a very strong top end in terms of prior attainment: all straight 9s in all maths and science subjects. I do not understand how this is fair or accurate. We do have a lot of weak candidates on the course as well; the subject is very, very popular.


        • What does the paperwork show?


          • Hi teachingbattleground. The incorrect results happened when a cohort had more bright students but also more weak students than before. In this case the weak students add E and U to Ofqual’s “class grades quota” and the bright students add A and A*, but not always enough.

            There are examples here http://thaines.com/content/ALevels/vis/ as well as more cases where the algorithm creates ridiculous results. Now that it has been withdrawn we will probably never see how bad it was.

            This has sometimes happened to the same student across many classes, hence the horror stories of receiving 8 grades below Centre Assessed Grades because the same issue has affected all their subjects.

            The rounding issue that caused U grades is also very real and FFT Datalab have described it. Even Eton got some totally spurious U grades.


            • He said the number of A* grades went down, so that’s not the problem. I did read that blogpost, but couldn’t make much sense of the complaints, beyond the fact that it is complaining about rankings and wants more statistical methods and less input from schools, so the opposite of most criticism of what was done. I read the FFT Datalab blog; it described a situation where more than half a student would be rounded up to a whole student. This is normal. Other people claimed a tiny fraction of a student would be rounded up; this was incorrect.
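
              The rounding point in the reply above can be shown with a little arithmetic. What follows is only a minimal sketch under stated assumptions, not Ofqual's actual procedure: it assumes the expected number of students at a grade is simply the predicted share multiplied by the cohort size, and that this figure is rounded half up to a whole student. The function names and the example figures are hypothetical.

```python
import math


def expected_students(share: float, cohort_size: int) -> float:
    """Expected (fractional) number of students at a grade.

    Assumption for illustration: share of the grade times cohort size.
    """
    return share * cohort_size


def whole_students(share: float, cohort_size: int) -> int:
    """Round the expected number half up to a whole student.

    Assumption for illustration: half a student or more counts as one,
    anything less than half a student counts as none.
    """
    return math.floor(expected_students(share, cohort_size) + 0.5)


# Hypothetical figures: a 3% share of U grades.
# In a cohort of 20 that is about 0.6 of a student, which rounds up to 1.
large_class = whole_students(0.03, 20)

# In a cohort of 3 that is about 0.09 of a student, which rounds down to 0.
small_class = whole_students(0.03, 3)

print(large_class, small_class)
```

              Under these assumptions, more than half a student becomes a whole student, while a tiny fraction of a student does not.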


  7. […] The tragedy of grades based on predictions […]


  8. […] There is an excellent and dispassionate post by a mathematics teacher, Andrew Old, which outlines ho… (very much as I’ve explained it but in a little more detail) and dispelling many of the myths surrounding it. […]


  9. […] ‘The tragedy of grades based on predictions‘ – Excellent piece by education blogger Andrew Old about the A level results story […]


  10. On the BBC news there was one student who was complaining that she didn’t get the grades needed to go to Oxford university. They had asked for 2 A*s and an A. She got A*, A, B. In my mind the university didn’t want her and gave her a difficult target.
    With regards to grades, they’ve never been fixed in stone; there’s always been some latitude to try and reduce problems associated with marking. The problem is made worse if, for whatever reason, there’s bias in the teacher’s assessment leading to either inflating or deflating a particular student’s grades.
    Let those that are appealing take the proper exams next year, deferring their university place for 12 months. How on earth can an appeal work? What criteria could make a change? What happens if the appeal process downgrades a result?


  11. This is what I thought before the results came out, and although going back to predicted grades isn’t a great solution either, the original solution was not as good as you are making it out to be. You said they looked at individual GCSE results and adjusted based on their case, but that isn’t true; it looks more like they just pressed ‘go’ on a computer algorithm and everyone’s results in classes that did badly on average were brought down. I have a friend who was given CCD, getting a D in the only subject she got a 9 in at GCSE, and another friend who got ABB, with a B also in the only subject she got a 9 in. The whole point of grade 9 was to single out people with exceptional talent in each subject. The reason for the D was that the subject is a weak point in our college, not that she personally did worse in it at college.


    • GCSEs modified the distribution, not the individual result, so not much good if you were not top in the subject.


  12. […] program/routine that inhumanly decided people’s grades without any human input at all. See this blogpost for more details. There was a statistical model that was used to process the information sent from […]


  13. First sensible article I have seen on this subject. Thank you.


