The arguments for ‘going dark’ look weak, compared with the damage to official statistics and public confidence in them, writes Rob Findlay

On the afternoon of Wednesday, 12 June, NHS England and NHS Improvement announced that 14 of the 133 trusts with Type 1 accident and emergency departments would be omitted from the waiting times statistics to be published the following morning.

Why? Because system leaders were piloting some proposed new ways of measuring and performance-managing A&E waits, and they didn’t want the trials to be influenced by the old way of doing things.

As the pilot sites are not submitting four-hour performance data, the national A&E performance data for May omits these sites’ performance during the field testing. The 14 trusts will instead be measured against the proposed new standards, so as not to “contaminate” the study design, NHSE said.

The rules, set out in a Memorandum of Understanding between NHSE and NHSI and the trusts, are pretty severe.

  • Trusts will be expected to monitor and manage their performance against the field-testing metrics.
  • Performance against these metrics should also be reported to local commissioners.
  • Performance against these metrics should not, however, be publicly reported. This includes at public board meetings.
  • If forums in which performance is discussed are subject to release of minutes under Freedom of Information legislation, then minutes are to be redacted.
  • During field testing, trusts will not be required to report performance against the four-hour standard to local commissioners.

But NHSE and NHSI argued it would not make much difference to England’s overall performance. In recent months, national performance with the field-test trusts excluded has been only 0.1–0.2 percentage points higher than performance with all providers included, according to an explanatory note published on the NHSE website.

Nevertheless, suddenly dropping one in 10 trusts from the national A&E data can hardly fail to make a difference. The national data series is no longer continuous, and no longer comparable (even though NHSE and NHSI’s statisticians have gone to the trouble of creating a new, comparable data series based on non-pilot trusts).
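To make the comparability point concrete, here is a small sketch of how excluding a handful of lower-performing trusts shifts a pooled four-hour percentage. The figures are invented for illustration only — they are not real NHS data, and the real gap reported by NHSE was far smaller (0.1–0.2 percentage points).

```python
# Illustrative only: how excluding a subset of trusts shifts aggregate
# four-hour A&E performance. All figures below are invented, not real NHS data.

def aggregate_performance(trusts):
    """Percentage of attendances concluded within four hours, pooled across trusts."""
    total_attendances = sum(t["attendances"] for t in trusts)
    total_within_4h = sum(t["within_4h"] for t in trusts)
    return 100.0 * total_within_4h / total_attendances

# Hypothetical example: 10 trusts, 2 of them pilot sites with lower performance.
trusts = (
    [{"attendances": 10_000, "within_4h": 8_800, "pilot": False}] * 8
    + [{"attendances": 10_000, "within_4h": 8_000, "pilot": True}] * 2
)

all_trusts = aggregate_performance(trusts)
non_pilots = aggregate_performance([t for t in trusts if not t["pilot"]])

print(f"All trusts:      {all_trusts:.1f}%")   # 86.4%
print(f"Pilots excluded: {non_pilots:.1f}%")   # 88.0%
print(f"Difference:      {non_pilots - all_trusts:.1f} percentage points")
```

The two figures describe different populations of trusts, which is why a series that silently switches from one to the other mid-stream is no longer comparable with its own history — the point the new non-pilot data series is designed to address.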

The problem

This is not a clear-cut issue. As a qualified scientist, I have sympathy with the need to design a good experiment. But this level of damage to an official statistic is a heavy price to pay.

This doesn’t just affect the current testing of proposed A&E measures. Exactly the same problem will affect any future field testing. Testing policies in the field before implementing them is a thoroughly sensible thing to do, but if the price is this heavy, it will deter future testing.

So I would like to unpack some possible reasons why NHSE and NHSI felt the need for the pilot sites to “go dark”, and whether the decision is justified.

Reasons to ‘go dark’

There is (as far as I am aware) no technical reason why pilot trusts need to stop reporting against the four-hour target. They continue to record the arrival and conclusion times of all patients attending, and that data will eventually surface, after a long delay, in Hospital Episode Statistics (HES). It’s just that they are no longer using that data to calculate and report performance against the four-hour target, and instead are using it to calculate performance against the new measures.

So the reasons for “going dark” must be human rather than technical. I can think of two.

Firstly, there may be a concern that, if the trust carried on measuring four-hour performance, then that would distract staff from the new measures and associated incentives being piloted.

The new measures include “mean time in A&E”, and if staff are to be incentivised by that measure then they must be aware of how long each patient has been in the department.

I find it hard to believe that staff would not instinctively compare each patient’s time against the totemic four-hour threshold. So mere knowledge by staff of a four-hour wait cannot be the concern; it must be a fear that four-hour breaches will have consequences for staff or the trust, even during the pilot.

I can see why that fear might arise. I can see why “going dark” would be sufficient to allay that fear. But the question is: is it necessary?

If trust managers and executives, commissioners, NHSE and NHSI were all to declare with one voice that neither the trust nor its staff will be performance managed against the four-hour standard, even if they continued to report on it, would that be enough?

I would have thought that it should be – it would be a sad reflection otherwise.


But if four-hour performance continued to be reported, then criticism might still come from outside the NHS, and that brings me to the second reason for “going dark”: to protect the reputation of NHS organisations.

It is possible, perhaps even likely, that a pilot trust’s four-hour performance will be no better and may well be a lot worse during the pilot, even if the new measures and incentives do indeed turn out to be better for patients. Local and national media would probably criticise the worsening four-hour performance if it were reported, which would be a blow to the reputation of the NHS trust concerned and the NHS nationally.

Is this a justification for ‘going dark’?

I can see why the NHS might stop reporting a measure when there is evidence for a better alternative. That evidence is being sought by the pilots, but it does not yet exist, so that cannot be used as the justification.

There is also ample precedent for trusts stopping reporting when they are having severe problems with data quality and are unable to submit meaningful statistics. That does not apply either, because the data is still being collected.

But what the NHS cannot do is stop reporting simply because the numbers look bad – and it must be seen not to do so, or public trust will be lost.


So the technical, management, and reputational reasons for stopping the reporting of four-hour A&E waits at pilot sites have not added up to very much. They certainly do not appear to outweigh the damage to official statistics and public confidence.

Perhaps I have missed an important justification for “going dark”. None was provided in the documents from NHS England and NHS Improvement, but one may exist.

Nevertheless, as the argument stands, it seems to me that four-hour performance should continue to be reported by the pilot sites. Future pilots of policy proposals should likewise continue to report against all existing measures, unless there is compelling justification to do otherwise.

Which would be good news for the field testing of policy proposals, and good news for the integrity of official statistics and public confidence in them.

Dr Rob Findlay is director of software company Gooroo Ltd, specialists in NHS demand and capacity planning for elective and non-elective care