Veracuity Blog

The impact of data breaches in healthcare on research

In the U.S., summary details of breaches of protected health information (PHI) that affected 500 or more individuals are available for the public to view on the HHS Office for Civil Rights (OCR) breach portal, commonly called the “Wall of Shame” [1]. In Europe, EU-level legislation on data privacy has to be transposed into national law, and national regulators are responsible for implementation and enforcement. Unlike in the U.S., information on data breaches and the nature of the incidents is not publicly available in Europe: there is no single place where Europeans can find information about individual incidents, the affected data, and the root causes that led to each breach. Nor is any analysis of the consequences of data breaches available to the European public. New EU legislation on data privacy provides numerous exceptions for handling PHI for a variety of legitimate purposes, including scientific research [2].

The exploitation of medical data for nefarious purposes is on the increase. The black-market value of electronic health records exceeds that of credit card data [3]. Theft of PHI to gain access to health treatment or to submit claims for reimbursement qualifies as identity theft. The consequences include financial loss if the PHI is used to obtain medical services or goods. Life-threatening situations can occur if medical records are changed, missing, or erroneous as a result of the theft. Earlier reliance on paper records limited the volume that could be stolen during an incident. Electronic records vastly increase the number of records that can be compromised in a single breach [4].

Even worse, health data controllers and processors have a limited ability to detect data breaches in real time. According to a Verizon report, two-thirds of healthcare data breaches go undiscovered for months or even years. Failure to protect personal health information undermines public trust and the willingness of the public to share information with their healthcare providers [5].

Compared to paper systems, electronic data pools are easier to access and exploit. The risk of exploitation increases with the number of legitimate users and the number of organizations that manage access rights. Patients’ willingness to participate in research is partially rooted in their trust that data management systems will preserve confidentiality and that personal health information will not be shared inappropriately.

State actors, specifically from China, also target healthcare information. In August 2017, the FBI arrested Yu Pingan, a.k.a. GoldSun, for allegedly distributing Sakula malware to 147 unique U.S.-based IP addresses, including the Office of Personnel Management (OPM) and health insurer Anthem [6]. The Anthem breach in March 2015, a hacking/IT incident involving the insurer’s network server, affected 78.8 million patients. The OPM breach exposed information on 25.7 million Americans.

Previously compartmentalized data pools are often linked into data lakes following reorganization. Such data pools are convenient because they facilitate data sharing across providers. Large data centers are also popular with researchers, who can derive a wealth of useful data for a variety of academic and commercial projects. However, centralized information systems that provide access to vast datasets are significantly more attractive to malicious actors. For example, a system that combines data from clinical trials, genetic testing, inpatient and outpatient records, emergency department visits, and administrative records is attractive enough to justify the significant effort required to penetrate it.

Vulnerability assessments of information systems need to take into account all human-machine interfaces, user behavior, awareness and training, and breach detection mechanisms, as well as historical experience and its impact on patients’ trust and, consequently, on the recruitment of subjects for clinical trials. The opportunity for exploitation grows rapidly with the number of individuals who have authorized access to any one of these interconnected compartments, and with the number of entities involved in access control.

The most critical challenges are the attractiveness of health data to cybercriminals, the loss of trust in providers’ ability to keep data safe, the reliance on designs that require pooling of sensitive data, and systems that require or allow PHI sharing for research and analysis.


[1] U.S. Department of Health and Human Services, Office for Civil Rights. OCR Portal. 2017.

[2] European Parliament and Council of the European Union. Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the Protection of Natural Persons with Regard to the Processing of Personal Data and on the Free Movement of Such Data, and Repealing Directive 95/46/EC (General Data Protection Regulation). Official Journal of the European Union; 2016.

[3] Yao M. Forbes. 2017.

[4] Hiller J, McMullen M, Chumney W, Baumer D. Privacy and Security in the Implementation of Health Information Technology (Electronic Health Records): U.S. and EU Compared. B.U. J. SCI. & TECH. L. Vol. 17; 2011.

[5] Verizon Enterprise Solutions. Protected Health Information Data Breach Report. Verizon Enterprise Solutions. 2016.

[6] Schwartz M. Chinese Man Allegedly Tied to OPM Breach Malware Arrested. Bank Info Security. 2017.