Are OFSTED Judgements Reliable?

May 15, 2014

One of the properties required for any assessment to be credible is that it is reliable. Roughly speaking, this property requires that the assessment would, if repeated or carried out by somebody else, give the same result. One of the difficulties with OFSTED is that inspections of a school are usually separated by enough time for a significant change to have happened, and no two schools are identical, so it is hard to assess whether judgements differ because of genuine differences in performance or because of inconsistencies in the inspection procedure.

In order to determine reliability it would be necessary to carry out two OFSTED inspections of the same school at (roughly) the same time, and see if they came to the same conclusion. Although this sort of experiment would help ensure reliability, it is not something that, to my knowledge, is part of OFSTED’s quality assurance procedures. However, it was brought to my attention that something similar had happened recently in two cases, not as part of quality assurance but as a result of some unusual school structures.

In the first of these cases, two separate OFSTED teams simultaneously inspected both Bishop Challoner Catholic Collegiate Boys School and Bishop Challoner Catholic Collegiate Girls School. Although these schools are technically separate institutions, in reality they share a building, a staff, and a governing body. In effect, two different OFSTED teams were inspecting the same set of teachers. Yet somehow they managed to conclude that the teaching in the girls’ school was “Outstanding” and that in the boys’ school was only “Good”. These grades were also repeated in the overall judgements on the schools.

In the second case, OFSTED inspected Outwood Academy Valley on the 11th of March, and Outwood Academy Portland on the 13th of March. These two schools share a sixth form. Once again the judgements disagreed: somehow the sixth form of Outwood Academy Valley is described as “Good” and the sixth form of Outwood Academy Portland is described as “Outstanding”. Clearly, there must have been some awareness of this contradiction, as the following bizarre passage was included in the second report:

The Principal and Executive Principal maintain that there is a distinct difference between Outwood Academy Portland students and others that share the sixth form centre. Inspectors found a strong ethos established by the end of Key Stage 4, by which Portland had instilled learning habits that contributed to sixth form students’ exceptional progress from having overcome often difficult circumstances in their earlier career. This sense of purpose has been established through the care, guidance and support from their teachers throughout their time in the academy.

Yes, that’s right, the inspectors claim to have subdivided the students in the sixth form according to where they were in KS4, and then used this to credit one school with the quality of the education received in the sixth form by those students who had previously been in KS4 in that school. A bizarre precedent has been set for judging sixth forms by their intake, rather than what they do now.

Now I should point out that I am not claiming that the judgements in these cases are wrong. The procedures may have been followed perfectly and the judgements may have some very solid foundations. What this calls into question is not the validity of the judgements, but the reliability of the procedures that have been used to reach them. If the same set of teachers can be judged “Good” by one team and “Outstanding” by another on the same day, or in the same week, then the only conclusion is that those judgements tell us little about those teachers. The difference between a “Good” sixth form and an “Outstanding” sixth form is not the quality of the teaching, or the management of that sixth form. The difference between teachers who achieve “Outstanding” teaching and learning, and those who achieve “Good” teaching and learning, is apparently not down to the abilities or dedication of those teachers. These reports demonstrate that OFSTED judgements are not reliable; they do not tell us about a relevant difference in quality, only about differences in either circumstances or the personal inclinations of inspection teams.




  3. Think the same happened with Manchester Creative and Media Academy, which is split by gender but shares the same governing body, Principal, SLT and site. The girls’ academy was judged RI (Requires Improvement), whilst the boys’ was judged Inadequate.

  4. “What this calls into question is not the validity of the judgements, but the reliability of the procedures that have been used to reach them.”
    Actually, this lets Ofsted off the hook too lightly! For data to be valid, they must be reliable. A question over the reliability of Ofsted judgements automatically implies that the judgements reached are not valid.

    • I could have phrased that better, but wasn’t trying to let OFSTED off the hook. A lack of reliability does indeed call the validity into question too.
