The benefits of sharing patient data are being widely hailed but privacy and meaningful regulation are as important as ever, says Martyn Thomas
The age of big data has arrived. Major corporations have long recognised the value they can extract from collecting as much data as possible about the people who use their services – and people have been happy to give up their personal data in return for the convenience of credit cards and a few loyalty points.
Recently, however, governments have decided that they can stimulate investment and economic growth by publishing data that they collect about their citizens. In one example, New York has launched its Open Data Plan, including more than 1,100 public datasets covering business, health, transportation, education, environment, recreation, city government and much more.
Share and share alike
In the UK, the Health and Social Care Act 2012 introduced important changes that will allow the Health and Social Care Information Centre to collect and share confidential information from medical records without explicit patient consent. It is expected this will encourage inward investment by pharmaceutical and biotechnology companies and facilitate medical research.
‘The consequent loss of privacy could be very damaging’
There is no question that big data and data analysis can be of great benefit to society (for example for medical research, social policy and the investigation of crime) but the consequent loss of privacy could also be very damaging. It is often said that “if you have nothing to hide, you have nothing to fear”, but this is unthinking, ignorant and callous.
Devil in the detail
It is ignorant because it rests on the frequent claim that anonymised data is safe from misuse.
Modern data analysis can extract information from extremely large collections of data in a wide variety of formats and from diverse sources. One consequence is that effective anonymisation of detailed personal data has become impossible in many cases. If the data contains enough detail, it will be possible to identify a person uniquely by comparing the records with other available data (for example, from social networking sites).
‘The amount of detail needed to identify someone uniquely turns out to be surprisingly little’
The amount of detail needed to identify someone uniquely turns out to be surprisingly little: gender, age and postcode will often suffice. Even without these details, each additional fact narrows the search, so if the records contain enough personal detail to be useful for research purposes then in many cases it will be possible to identify the individual concerned.
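The narrowing effect described above can be sketched as a simple linkage attack. This is an illustrative example only: the names, conditions and postcodes below are invented, and real attacks work at far larger scale.

```python
# Sketch of a linkage attack: an "anonymised" medical dataset (names removed)
# is joined with publicly available records on the quasi-identifiers
# gender, age and postcode. All data here is invented for illustration.

anonymised_records = [
    {"gender": "F", "age": 34, "postcode": "N1 9GU", "condition": "HIV"},
    {"gender": "M", "age": 52, "postcode": "SW1A 1AA", "condition": "diabetes"},
]

# Publicly available data, e.g. gathered from a social networking site.
public_records = [
    {"name": "Alice Example", "gender": "F", "age": 34, "postcode": "N1 9GU"},
    {"name": "Bob Example", "gender": "M", "age": 52, "postcode": "SW1A 1AA"},
]

def reidentify(anon, public):
    """Match each 'anonymous' record to any name sharing its quasi-identifiers."""
    matches = {}
    for record in anon:
        key = (record["gender"], record["age"], record["postcode"])
        names = [p["name"] for p in public
                 if (p["gender"], p["age"], p["postcode"]) == key]
        if len(names) == 1:  # a unique match re-identifies the person
            matches[names[0]] = record["condition"]
    return matches

print(reidentify(anonymised_records, public_records))
# {'Alice Example': 'HIV', 'Bob Example': 'diabetes'}
```

Because the combination of gender, age and postcode is unique within each dataset, every "anonymised" medical record links straight back to a name, without any names ever appearing in the medical data itself.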
It is callous because a decent society looks after the vulnerable, and many people have honest reasons to need privacy. For example:
- Medical conditions that could attract prejudice, such as HIV or mental illness
- Past traumas, such as rape
- Escaping an abusive relationship
- Protecting adopted children
- Witness protection
- Avoiding religious prejudices
- … and much more
If privacy is not the societal norm, those who need it will find it hard to lead a normal life or to avoid attracting unwanted attention, and they will live at risk.
It is unthinking because data lasts a lifetime and circumstances change. Things you did, legally and honestly, in the past may become a problem in later circumstances. Might it happen under a future government that the NHS gives you lower priority for treatment because your Tesco Clubcard or Nectar data suggests that you have chosen to live unhealthily? Can you be certain?
Data collected for a positive reason remains available for abuse, as many Dutch citizens discovered in the 1940s. Their civic records included their religion (so that they could be given appropriate spiritual support in extremis), and an invading government used that data, processed on early Hollerith machines, to identify thousands more Jews than could have been located manually.
If we are to get the benefits of big data, then the use of personal data must be properly regulated. We have a very poor record of this in the UK. The Data Protection Act does not properly implement the European Data Protection Directive in the way that it defines personal data. The Information Commissioner’s Office is grossly understaffed and often seems to lack technical knowledge.
‘The Information Commissioner’s Office is grossly understaffed and often seems to lack technical knowledge’
The penalties for breaches of the Data Protection Act rarely exceed a tiny fine that is wholly inadequate as a deterrent. And despite parliament passing legislation that would permit custodial sentences for the worst offences, the government has rejected pleas from the Information Commissioner and refused to bring it into force.
The result is that big business and government can use personal data almost with impunity. There is a risk that a serious abuse or breach will cause a major change in citizens' attitudes towards their privacy and their willingness to have their personal data used commercially.
Our inadequate regulation and the failure by companies and government departments to use advanced privacy-enhancing technologies create the risk that we lose the benefits of big data but suffer the worst consequences.
We need more investment in privacy, greater willingness to prioritise privacy over short-term gains in productivity or profits, and better regulation, much more strongly enforced.
Dr Martyn Thomas is chair of the IT policy panel at the Institution of Engineering and Technology. He is speaking at the session “Number crunching and ethics in the era of Big Data” at the Battle of Ideas festival on 19-20 October at London’s Barbican. HSJ is a Battle of Ideas media partner.