People in their early 60s defer seeking treatment until they have retired, a big data investigation into hospital visits reveals. By Christelle Gendrin and Chris Morris

We have been reviewing anonymised records of 3.8 million spells of hospital treatment in financial year 2013-2014, kindly made available by Albatross Financial Services. Each record represented a “spell of treatment”, consisting of one hospital visit, or several related visits.

It reported the patient’s age, up to 13 ICD-10 codes for diagnoses, and up to 13 OPCS codes for treatments given per patient. We created a derived SQL database from the data and mined it.

We present here the two main findings of this investigation: one concerns diagnoses of primary hypertension in pregnant women, and the other concerns hospital treatment counts among the newly retired.

The patient’s sex was not recorded. Most records are fully anonymous.

Some records were pseudonymised, and in these cases it would be possible to tell whether two different spells of treatment were for the same individual. Some diagnostic codes reveal sex (eg birth, or testicular cancer) but we did not try to make any analysis based on this approach.

Diagnosis of primary hypertension

The most common single diagnosis was I10X, “Primary Hypertension”. The blue line in Fig 1 shows the percentage of spells of treatment that included this diagnosis, by age.
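A percentage of this kind can be computed with a single grouped query. The following is a minimal sketch only: the article does not describe the real database, so the table names (`spell`, `diagnosis`), the column names and the toy rows are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical schema: one row per spell, one row per diagnosis code on a spell.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE spell (spell_id INTEGER PRIMARY KEY, age INTEGER NOT NULL);
    CREATE TABLE diagnosis (
        spell_id INTEGER REFERENCES spell(spell_id),
        icd10    TEXT NOT NULL
    );
""")

# Toy rows, invented for illustration; they are not taken from the dataset.
conn.executemany(
    "INSERT INTO spell VALUES (?, ?)",
    [(1, 40), (2, 40), (3, 70), (4, 70), (5, 70), (6, 70)],
)
conn.executemany(
    "INSERT INTO diagnosis VALUES (?, ?)",
    [(1, "I10X"), (2, "J459"), (3, "I10X"), (4, "I10X"),
     (5, "I10X"), (5, "E119"), (6, "E119")],
)


def share_by_age(conn, code):
    """Return {age: percentage of spells at that age carrying `code`}."""
    query = """
        SELECT s.age,
               100.0 * COUNT(DISTINCT d.spell_id) / COUNT(DISTINCT s.spell_id)
        FROM spell AS s
        LEFT JOIN diagnosis AS d
               ON d.spell_id = s.spell_id AND d.icd10 = ?
        GROUP BY s.age
        ORDER BY s.age
    """
    return {age: pct for age, pct in conn.execute(query, (code,))}


shares = share_by_age(conn, "I10X")
```

The left join keeps spells that lack the code in the denominator, so each age's percentage is taken over all spells at that age rather than only those with a matching diagnosis.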

As is well known, the frequency of primary hypertension increased with age. All ICD-10 codes that matched the patterns “birth”, “pregnancy”, “pregnant”, or “gestation” were then reviewed, and where appropriate considered to be pregnancy-related (exceptions were ones that in fact said eg “non-gestational”).
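The pattern-matching step can be sketched as below. This is an illustration rather than the code used in the investigation: the four search terms and the “non-gestational” exception come from the text above, but the function name and the sample descriptions labelled as invented are assumptions, and a fixed exclusion pattern is only a stand-in for the manual review.

```python
import re

# The four search terms named in the article.
INCLUDE = re.compile(r"birth|pregnancy|pregnant|gestation", re.IGNORECASE)
# Stand-in for the manual review: the one exception the article mentions.
EXCLUDE = re.compile(r"non-gestational", re.IGNORECASE)


def is_pregnancy_related(description):
    """True if a code description matches a search term and no exclusion."""
    if EXCLUDE.search(description):
        return False
    return INCLUDE.search(description) is not None


descriptions = [
    # Quoted in the article (code O100).
    "Pre-existing essential hypertension complicating pregnancy, "
    "childbirth and the puerperium",
    # Quoted in the article (code I10X).
    "Primary Hypertension",
    # Invented description, used only to exercise the exclusion.
    "Example non-gestational condition",
]

flagged = [d for d in descriptions if is_pregnancy_related(d)]
```

In practice the exclusion list would grow as the reviewer works through the matches; the regular expression only narrows the codes that need to be read by eye.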

The red line shows the proportion of diagnoses of I10X and/or O100 “Pre-existing essential hypertension complicating pregnancy, childbirth and the puerperium”, in patients for whom there was also a code for pregnancy.

Fig 1

Fig 1 suggests that primary hypertension is underdiagnosed among pregnant women. When hypertension is found in a pregnant patient, it is rarely diagnosed as primary hypertension, and the rate of such diagnoses is below the incidence of primary hypertension in the same age cohort.

Fig 2 shows the count of spells of treatment by age, separating out those that were coded with a condition related to pregnancy.

Fig 2

The highest peak, at age 0, corresponds to births (118,046 spells of treatment, off the graph). Thirteen-year-olds make the fewest hospital visits.

For treatments not related to pregnancy, life then seems to be a long slide into morbidity, followed finally by a thinning of the cohort in ages above 70. (The slight fall from age 32 to 36 might be accounted for by visits that should have been attributed to pregnancy.)

Deferring treatment

This general trend is interrupted by an anomalous excess of hospital visits at age 66. There were 74,500 spells of treatment for 66-year-olds, about 17 per cent more than the 63,626 recorded at age 65 and about 14.5 per cent more than the 65,049 recorded at age 67.
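The two percentages follow directly from the three counts quoted above. A short check, in which the helper name `excess_pct` is ours:

```python
# Spell counts by age, as quoted in the article.
counts = {65: 63_626, 66: 74_500, 67: 65_049}


def excess_pct(peak, baseline):
    """Percentage by which `peak` exceeds `baseline`."""
    return 100.0 * (peak - baseline) / baseline


vs_65 = excess_pct(counts[66], counts[65])
vs_67 = excess_pct(counts[66], counts[67])

print(f"66 vs 65: {vs_65:.1f} per cent")  # 17.1
print(f"66 vs 67: {vs_67:.1f} per cent")  # 14.5
```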

We then examined the numbers of procedures and diagnoses at ages 65, 66 and 67 to see whether the peak at age 66 could be explained by a specific procedure introduced at that age. Fig 3 depicts the counts for the 20 most common procedure codes at ages 65, 66 and 67.

No single procedure code could explain the peak in hospital treatment spells at age 66, and the excess of procedures at that age was evenly distributed among the different codes. The same holds for diagnoses, so the excess in treatment spells does not seem to be explained by the practices of health service providers.
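One way to test whether an excess is spread evenly is to compare, for each code, the count at age 66 with the average of the counts at ages 65 and 67, and then look for codes whose ratio stands apart from the rest. The sketch below illustrates that idea; it is not the method used in the investigation, and the code labels, counts and tolerance are all invented.

```python
from statistics import median


def excess_ratios(counts_by_code):
    """Map each code to count(66) / mean(count(65), count(67))."""
    return {
        code: n66 / ((n65 + n67) / 2)
        for code, (n65, n66, n67) in counts_by_code.items()
    }


def outliers(counts_by_code, tolerance=0.05):
    """Codes whose ratio differs from the median ratio by more than tolerance."""
    ratios = excess_ratios(counts_by_code)
    typical = median(ratios.values())
    return [code for code, r in ratios.items() if abs(r - typical) > tolerance]


# Invented counts as (age 65, age 66, age 67); not taken from the dataset.
even = {
    "code_A": (1000, 1150, 1000),
    "code_B": (500, 580, 500),
    "code_C": (200, 230, 200),
}
# The same counts plus one code with a disproportionate jump at 66.
skewed = dict(even, code_D=(100, 300, 100))

print(outliers(even))    # no code stands out
print(outliers(skewed))  # only code_D stands out
```

If one procedure had been newly offered at 66, it would show up as `code_D` does here; an empty list corresponds to the even spread described above.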

Fig 3

One possible explanation is that people in their early 60s defer seeking treatment until they have retired. If true, this is likely to lead to worse health outcomes, and extra expenditure.

It seems that it would be useful to run an educational campaign to persuade people in their 60s to seek treatment earlier.

Albatross Financial Solutions performs patient cost benchmarking analysis using data supplied by 70+ NHS trusts to allow healthcare providers to compare their performance against other institutions. We repurposed this dataset to mine it for information about comorbidity and care pathways.

Another interesting result is that hospital treatment for asthma is most often for patients older than 11 – it would seem that childhood asthma is usually successfully managed by General Practices.

Such work is limited by two problems. One is the limited care with which data is recorded. For example, OPCS code ZZ99 “Not Identified” is used 1.5 million times in the data.

There is also bias in when codes are recorded – “Q703 Webbed toes” is a lifelong condition, but rarely recorded for adult patients.

The other limitation on such work is weakness in the standard coding system. For example ICD-10 contains codes for diseases, eg “A051 Botulism”; for symptoms including “R509 Fever, unspecified”; for accidents like “W029 Fall involving ice-skates, skis, roller-skates or skateboards: Unspecified place”; and for life events including “Z561 Change of job”.

These are not the same sort of thing, and cannot meaningfully be aggregated. These problems frustrated our attempts to mine the data for evidence of comorbidities.

Nevertheless, this investigation demonstrated that a large anonymised healthcare dataset (in this case 3.8 million records) collected primarily for another purpose, here the analysis of cost of provision, can be a useful source for secondary analysis and can help generate new hypotheses about UK healthcare.

Christelle Gendrin is a life sciences data scientist at the Science and Technology Facilities Council (STFC) and Chris Morris is a data analyst at the Hartree Centre, STFC.