Archive for September, 2019


Teacher autonomy is the most difficult issue in the education debate. Part 1

September 28, 2019

How much freedom should teachers be given to do their own thing?

Few things are more irritating to educators than knowing what needs to be done, and being stopped from doing it by those in charge: whether that’s the actions necessary to keep order in the classroom, to keep students safe from each other, or the way of teaching that will make the most difference. At times, one is tempted to ask for complete autonomy: freedom to make whatever decision one wants. However, this principle becomes less appealing when one realises that what happens in other classrooms affects expectations of behaviour and effort in one’s own. It also becomes less appealing when one considers the consequences for children of the worst classroom practices, both educationally and with regard to their safety and well-being. A related issue is workload. An assumption that every teacher plans their own lessons may undermine collaboration in the production of resources, leaving teachers to duplicate work their colleagues have already done. Conversely, decisions imposed on teachers by those less familiar with their classes or year group may have adverse workload consequences for teachers who find they constantly have to make adjustments for a poorly sequenced curriculum or to prepare for badly designed assessments.

It’s easy to make rhetorical arguments in favour of teacher autonomy. Teachers should be trusted. Teachers are professionals. Teachers know their classes best. It’s easy to make rhetorical arguments against teacher autonomy. Teachers must be required to teach effectively. All teachers should have high expectations. All teaching should be based on how students actually learn. Students need consistency. Once you accept that all these arguments are sometimes true, the debate becomes about where you draw the line, and that’s tricky.

In this post I will summarise my past posts on the topic, which can be found here:

In those posts I concluded a number of things.

Managers should try to avoid giving any of the following:

  • Instructions contradicted by other instructions.
  • Completely idiotic instructions.
  • Instructions that no manager would ever subsequently admit to giving.
  • Instructions which, if followed, will be used against the teacher following them.

Some of this might seem obvious, but none of these things are uncommon. Dysfunctional management is by all reasonable accounts a problem in teaching, and it is worth considering where managers should definitely leave well enough alone. However, attending to these points only slightly narrows the range of places where the line can be drawn.

Moving on from the day-to-day decisions of managers to the systems used to manage teachers, I suggested the following should be avoided in any system of holding teachers to account.

  1. Trying to achieve multiple aims simultaneously and without a clear indication of priority;
  2. Holding teachers accountable for methods and outcomes simultaneously;
  3. Enforcing, and creating paperwork for, things that would happen anyway;
  4. Creating work that does not have to be done;
  5. Measuring and judging things that don’t matter;
  6. Measuring and judging things unreliably;
  7. Encouraging behaviour that is actually counter-productive;
  8. Wasting money, particularly on management salaries.

Again, this stuff might seem obvious, but it is all incredibly common. I believe almost every large school would gain from applying these principles to all of its rules and systems for holding teachers to account.

The point about the problem of trying to achieve multiple aims simultaneously is one that applies at many levels and across the public services. The philosopher Onora O’Neill, when talking about accountability described the following problem:

Traditionally, the public sector exercised control by process. We often call it bureaucratic process. The private sector allegedly exercised control by targets. When the target setting was imposed on the public sector, the process controls were not removed, hence the problem of having to be responsive to and responsible for two completely different sets of controls whose coincidence is not guaranteed.

Teachers should never be held accountable for outcomes if they were not given the freedom to affect them. In this era of workload concerns, I would add that if they can affect those outcomes, but only by taking on more work, that should also be considered unreasonable.

One helpful way of looking at restrictions on autonomy can be found in this blogpost by Doug Lemov which appeals to the concept of “positive and negative variance”.

… one of the strongest ways a school can make a difference in student achievement is to have a coherent approach to teaching, one that outlines a shared understanding of “how we do it”—things that comprise a school’s core approach that everyone is expected to do. The school should name the things that are part of “how we do it” and then provide training so predictable implementation errors are reduced. That’s a way of both aligning and implementing a philosophy but also of reducing negative variance.

But it’s super-important to balance that reduction of negative variance with an understanding of the benefits of “positive variance”… the idea that people who have achieved proficiency with a skill should have the freedom to personalize and adapt.

The example he gives is centralised lesson planning. Preparing lessons centrally will reduce negative variation in that it will make it harder for teachers to be under prepared. However, in order to encourage positive variation, teachers will need to be allowed to adapt the lessons and be progressively given decision rights that can include dropping the centrally prepared lesson entirely.

I suggested the following principles might help with ensuring there is less negative variance and more positive variance.

  1. Outcomes must be considered before processes.
  2. Schools should be upfront about what they want.
  3. If you can’t write down clearly, concisely and objectively what you want, you have no right to ask for it.
  4. The best justification for restricting autonomy is where a teacher’s behaviour will undermine colleagues. e.g. differing expectations for behaviour across the school.
  5. Don’t take the piss, i.e. don’t have systems that can be harmful to teachers in themselves by adding to stress or encouraging bullying.

That’s about 1000 words summarising what I’ve already said on this topic. I am fully aware that everything I’ve said only suggests some constraints on where to draw the line, and doesn’t give any easy answers to the question of teacher autonomy. In my next blogpost I hope to add a few more considerations that I haven’t covered previously.


Has new research on exclusions solved the problem of causation?

September 21, 2019

Earlier this week, Siobhan Benita, the Liberal Democrat candidate for London mayor, made a speech in which she declared that:

My “feel safe, be safe” plan for Londoners will give every young person a voice, activities and the security of good schooling… No child will be permanently excluded from mainstream schools.

Naturally plenty of teachers, ex-teachers and people whose kids go to mainstream schools were able to point out on social media that it might not be best to keep young people in schools if they are dangerous or out of control. But in the online discussions that followed, a number of people who work in education, but don’t teach in schools, seemed very convinced that there was great new evidence that exclusions were bad.

It turned out to be this conference paper by Bacher-Hicks et al which, while largely having the same glaring inadequacies as any other research in this area, did have a few innovations in its methodology. It found a link between suspensions from school and various negative outcomes. It is about suspensions rather than permanent exclusions. It is based in the US not the UK. It is based on a student population where 23% of students are suspended at least once per school year, and 19% go on to be arrested between the ages of 16 and 20. However, even given the general irrelevance of this research to the debate in England about permanent exclusion, it would still be interesting if it had solved a big problem with research about school discipline.

I’ve written before about how a big problem in reaching conclusions from data is being able to identify causation from correlation, i.e. being able to identify what is an effect and what is a cause where two statistics seem to be related.

The accepted method for researching causation is to use a Randomised Control Trial, where the proposed cause is assigned at random to some part of a sample, so that the effect can be isolated by comparing the “treated” and “untreated” parts of the sample. From a practical point of view it would be very easy to test policy on exclusions in this way: you could design an exclusion process where, after the decision to exclude had been made, it was only carried out after being confirmed by the toss of a coin. Unfortunately, while practically easy, it is ethically beyond the pale to apply punishments by chance. So there is no body of RCT based evidence on exclusions. This is not an isolated problem; similar ethical considerations also leave us with huge gaps in our knowledge of the effects of other sanctions used in schools, not to mention criminal justice policy and parental discipline.
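The logic of that coin-toss design can be sketched in a short simulation (all numbers hypothetical): because exclusion is assigned at random, the excluded and non-excluded groups are comparable, so the gap in their average outcomes estimates the causal effect directly.

```python
import random

random.seed(0)

# Toy simulation of the coin-toss design described above
# (hypothetical numbers). Exclusion is assigned at random among
# students flagged for it, so treated and untreated groups are
# comparable and the gap in average outcomes estimates the effect.
TRUE_EFFECT = -0.5  # assumed causal effect of exclusion on a later outcome

treated, control = [], []
for _ in range(10_000):
    baseline = random.gauss(0, 1)   # student's underlying outcome
    if random.random() < 0.5:       # the coin toss
        treated.append(baseline + TRUE_EFFECT)
    else:
        control.append(baseline)

estimate = sum(treated) / len(treated) - sum(control) / len(control)
print(f"estimated effect of exclusion: {estimate:.2f}")  # close to -0.5
```

The point is not that anyone should run this trial, but that randomisation is what licenses the causal reading of the difference; nothing in the observational studies discussed below supplies an equivalent.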

Where there is no chance of an RCT, researchers tend to look at existing data, and try to draw conclusions from it. This is not a futile endeavour where there are clear reasons to limit which hypotheses you are testing. If it seems reasonable to think that falling out of an aeroplane without a parachute is a cause of death, but that being about to die is not a cause of falling out of aeroplanes, then looking at the death rate of people who have fallen out of planes could provide good evidence for that hypothesis. People do use correlation evidence in order to reach conclusions about causation quite often and quite reasonably in cases where there is only one reasonable hypothesis about causation that can be made. Unfortunately, this can trick us into thinking that deducing causation from correlation is a reliable “second best” method to be used in all cases where an RCT cannot be used. It isn’t. If there are multiple competing hypotheses about how causation works in a particular instance, then correlation evidence can be, not just less reliable than RCTs, but utterly useless.

This is the situation we have throughout research on sanctions and behaviour (where RCTs are ruled out). We usually wish to know whether particular sanctions improve behaviour; that is, we wish to test the hypothesis that applying sanction X improves behaviour. Unfortunately, because the sanction is a result of poor behaviour, there will always be a correlation between poor behaviour and sanctions, or between the poorly behaved and sanctions. We can pick groups to compare or we can vary the timescale, but almost every empirical claim that sanctions don’t work, or that harsher sanctions are less effective than lenient ones, runs into the problem that the punishments were a result of bad behaviour, and therefore the punished were always more likely to behave badly than the unpunished. There may also be other variables that cause both bad behaviour and sanctions, which will allow for further hypotheses about causation.
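The selection problem can be illustrated with a toy model (all numbers hypothetical): here the sanction is built to have zero causal effect on the later outcome, yet because it is triggered by bad behaviour, a naive comparison still makes sanctioned students look worse off.

```python
import random

random.seed(0)

# Toy model (hypothetical numbers): misbehaviour drives both the
# sanction and the later outcome. The sanction itself has ZERO
# causal effect by construction, yet a naive comparison still
# shows worse outcomes for the sanctioned group.
sanctioned, unsanctioned = [], []
for _ in range(10_000):
    misbehaviour = random.gauss(0, 1)
    got_sanction = misbehaviour > 1.0             # sanction triggered by behaviour
    outcome = -misbehaviour + random.gauss(0, 1)  # sanction contributes nothing
    (sanctioned if got_sanction else unsanctioned).append(outcome)

gap = sum(sanctioned) / len(sanctioned) - sum(unsanctioned) / len(unsanctioned)
print(f"outcome gap (sanctioned minus unsanctioned): {gap:.2f}")  # clearly negative
```

A researcher looking only at the gap would conclude the sanction harmed outcomes, when in this model it did nothing at all; that is exactly the inference problem described above.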

The problems with causation don’t stop ideologues from making pronouncements (“non-custodial sentences are more effective than prison sentences” or “the best way to manage children’s behaviour is to explain to them why their actions are wrong”) that the evidence cannot actually support. However, we simply cannot tell whether correlations between stricter sanctions and worse behaviour are a result of behaviour being negatively affected by harsher sanctions, or of sanctions being made harsher due to worse behaviour (except perhaps by applying common sense). The same problem exists where we try to find the effects of sanctions (or systems of sanctions) on other outcomes. Do those outcomes result from the sanctions, or from the behaviour that resulted in the sanctions? The most egregious recent example of being unable to separate cause and effect has been the supposed link between school exclusions and knife crime, where some well-intentioned people seemed unable to consider the possibility that a propensity to violent criminal behaviour is a cause of exclusions, rather than the possibility that upstanding members of society are being excluded and stabbing people as a result.

There are various techniques that can be used to solve some problems involving correlation and causation. Sometimes the timing is a clue. Where young people are both permanently excluded and convicted of possessing a knife, it is far more common for a permanent exclusion to swiftly follow the conviction than for the conviction to swiftly follow the exclusion, suggesting that causation does not run from exclusion to involvement in knife crime. Sometimes alternative chains of causation can be eliminated by using multi-variate statistical methods (although this paper makes a pretty good argument that this doesn’t work as well as we think it does). Sometimes “natural experiments” occur, where just by luck, we have data that should resemble what we might expect from RCTs. Where people think that they have grounds for comparing the effects on two groups, without randomisation, research is often referred to as “quasi-experimental”.

And this brings us to the Bacher-Hicks et al paper mentioned earlier. A change in the boundaries of school districts allowed them to eliminate some variables that might confound efforts to identify causation involving suspensions. This enabled them to create a measure of schools’ willingness to suspend that controlled for student background, and to see what effect that had on outcomes for students who went to the school after the boundaries changed and, therefore, came from different backgrounds. Being able to control for some variables allows the paper to improve on papers that couldn’t control for those variables. Unfortunately, the paper does not even begin to address the problem of controlling for behaviour. It describes schools with high conditional suspension rates (i.e. suspension rates after controlling for student background) as “strict” and concludes that strictness results in negative outcomes. The justification offered for this approach is limited. The point is made that where principals have changed schools, conditional suspension rates have also changed, in ways that suggest leadership is important. However, this does not go very far to disprove the obvious hypothesis that high suspension rates may be a result of bad behaviour that also leads to the negative outcomes the researchers found. To conclude that suspensions, not bad behaviour, are the cause of the negative outcomes that are correlated with high conditional suspension rates requires that one controls perfectly for bad behaviour. Anything less can result in a correlation that does not prove causation.
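That gap can be made concrete with a toy model (all numbers hypothetical; this is not the paper’s data or method, just an illustration of the general flaw): background is controlled for by taking simple regression residuals, behaviour is not, and suspensions are built to have zero causal effect on outcomes. The “conditional” suspension measure still comes out negatively associated with outcomes, because both track unmeasured behaviour.

```python
import random

random.seed(0)

# Hypothetical model: behaviour depends partly on background; suspensions
# track behaviour; outcomes are harmed by behaviour, NOT by suspensions.
n = 10_000
background, suspension, outcome = [], [], []
for _ in range(n):
    b = random.gauss(0, 1)
    v = 0.5 * b + random.gauss(0, 1)    # behaviour: only partly explained by background
    s = 0.8 * v + random.gauss(0, 0.5)  # suspensions triggered by behaviour
    y = -v + random.gauss(0, 1)         # outcome harmed by behaviour alone
    background.append(b); suspension.append(s); outcome.append(y)

def residualise(ys, xs):
    """OLS residuals of ys on xs (one regressor plus an intercept)."""
    m = len(ys)
    mx, my = sum(xs) / m, sum(ys) / m
    beta = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return [y - my - beta * (x - mx) for x, y in zip(xs, ys)]

# "Conditional" measures: background is controlled for, behaviour is not.
cond_susp = residualise(suspension, background)
cond_out = residualise(outcome, background)

cov = sum(s * o for s, o in zip(cond_susp, cond_out)) / n
print(f"covariance of conditional suspensions and outcomes: {cov:.2f}")  # negative despite zero causal effect
```

In this model a researcher who labelled high conditional suspenders “strict” and read the negative association causally would be entirely wrong; only a perfect control for behaviour would remove the bias.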

One additional point I’m going to make is that research that assumes you can control for behaviour by attributing suspension rates entirely to a mix of pupil background and schools’ willingness to suspend, rather than, say, school culture, makes assumptions that I think few teachers will agree with. We will get better education research when researchers start getting better at listening to teachers when theorising about whether correlation indicates causation.
