Veracuity blog

One of the most important shortcomings of the current pharmacovigilance data collection system is that the FDA Adverse Event Reporting System (FAERS) does not provide any insight into the incidence, prevalence, or rates of occurrence of adverse drug events in the post-approval period [1]. To improve the ability to detect trends and rates, data collection systems need to be modified to incorporate less traditional data points.

The use of patient-generated data, such as search engine logs, social media posts, Wikipedia access logs, and text mined from health forums, has proved valuable in public health disciplines such as disease surveillance. Digital epidemiology work has focused mainly on the description of symptoms, but recently also increasingly on sentiments and opinions and on the analysis of health behaviors. Whilst the key strength of data from traditional sources such as electronic health systems is their high veracity, patient-generated data provide high-velocity content that can be used as a supplemental source of information in public health [2].

Search engine logs

Search engine logs were used in Google Flu Trends, a predictive model of flu outbreaks. The project was abandoned in 2015, after merely seven years of experimental use, because of unsatisfactory accuracy [3]. In 2010, White et al. examined one year's worth of search logs from Google, Bing, and Yahoo! and tracked drug and event searches performed on the same machine. About 1 in 250 users searched for at least one of the top 100 drugs. The team focused on the interaction between paroxetine and pravastatin and on hyperglycemia, the adverse event associated with this interaction. The authors used disproportionality analysis to identify drug pairs associated with the symptoms of the tested event, comparing the data for each drug-drug-event combination against the background rate for each of the drugs separately [4].
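To make the idea of disproportionality concrete, here is a minimal sketch in Python. It computes a simple rate ratio: the share of users who searched for the event among those who searched for both drugs, divided by the same share in a background group. This is a generic illustration, not the exact statistic used by White et al.; the function name and all counts are hypothetical.

```python
def rate_ratio(exposed_with_event, exposed_total,
               background_with_event, background_total):
    """Event rate among exposed users divided by the background event rate."""
    if exposed_total <= 0 or background_total <= 0:
        raise ValueError("group sizes must be positive")
    if background_with_event <= 0:
        raise ValueError("background event count must be positive")
    exposed_rate = exposed_with_event / exposed_total
    background_rate = background_with_event / background_total
    return exposed_rate / background_rate


# Hypothetical counts, for illustration only:
# 300 users searched for both drugs, 30 of them later searched for the event;
# 1000 users searched for only one of the drugs, 50 of them searched for the event.
pair_vs_single = rate_ratio(30, 300, 50, 1000)
print(f"rate ratio: {pair_vs_single:.2f}")
```

A ratio well above 1 would flag the drug pair for further review. It is a statistical signal only and says nothing about causation.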

The model is based on the hypothesis that people search for information about drugs prescribed to them and then, with some delay, search for the events they experience. The associations, however, are only statistical rather than causal. Four years later, the same team compared adverse drug event signal scores for a set of 325 Observational Medical Outcomes Partnership (OMOP) test cases. The search-log signal detection method achieved an area under the curve (AUC) ranging from 0.73 to 0.92 [5]. As pointed out by Salathé, the key problems of digital epidemiology include the need to integrate the data streams into existing platforms and to provide the corresponding support and investment, including validation, because big-data hubris and algorithm dynamics can make such systems fail [2]. A decade after the first attempts, the use of search engine logs in pharmacovigilance is still a matter of cautious experimentation rather than day-to-day reality.
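For readers unfamiliar with the AUC, the sketch below shows one standard way to compute it: the fraction of (true signal, non-signal) pairs in which the true signal receives the higher score, with ties counted as half. The scores and labels are made up for illustration and are not taken from the OMOP test cases.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # true signal ranked above non-signal
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical signal scores; label 1 = known adverse event, 0 = negative control.
example_scores = [0.9, 0.8, 0.4, 0.3]
example_labels = [1, 0, 1, 0]
print(auc(example_scores, example_labels))  # 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which puts the reported range of 0.73 to 0.92 in context.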

Health-related chat rooms

Health-related chat rooms, social media platforms, and patient support groups represent another important source of information on drug-related patient injury. It must be stressed that pharmaceutical manufacturers currently have no obligation to screen social media for adverse drug events, except on sites they themselves created. Yet sometimes the patient voice is too loud to ignore. In 2002, the BBC program Panorama aired the documentary The Secrets of Seroxat [6]. The film generated a lengthy discussion and elicited consumer feedback in the form of more than 1,300 emails to the publisher and 862 messages in the discussion thread, describing a variety of adverse effects, mainly suicidal thoughts and behavior, violence, and withdrawal [7]. The platform turned out to be a goldmine for consumer intelligence. Information contained in health chat rooms is not always easily accessible to corporate case collection systems, and a variety of text mining approaches needs to be employed. Stay tuned.


[1] FDA (2018). FDA Adverse Event Reporting System (FAERS) Public Dashboard. [online] Available at: [Accessed 14 Apr. 2019].

[2] Salathé, M. (2016). Digital Pharmacovigilance and Disease Surveillance: Combining Traditional and Big-Data Systems for Better Public Health. Journal of Infectious Diseases, [online] 214(suppl 4), pp.S399-S403. Available at: [Accessed 14 Apr. 2019].

[3] McDonagh, L. (2018). Google Flu Trends is dead – long live Google Trends? | UCL Research Department of Primary Care and Population Health Blog. [online] Available at: [Accessed 14 Apr. 2019].

[4] White, R., Tatonetti, N., Shah, N., Altman, R. and Horvitz, E. (2013). Web-scale pharmacovigilance: listening to signals from the crowd. Journal of the American Medical Informatics Association, [online] 20(3), pp.404-408. Available at: [Accessed 14 Apr. 2019].

[5] White, R., Harpaz, R., Shah, N., DuMouchel, W. and Horvitz, E. (2014). Toward Enhanced Pharmacovigilance Using Patient-Generated Data on the Internet. Clinical Pharmacology & Therapeutics, [online] 96(2), pp.239-246. Available at: [Accessed 14 Apr. 2019].

[6] The Secrets of Seroxat. (2002). [film] Directed by Panorama. BBC. [Accessed 14 Apr. 2019].

[7] Medawar, C., Herxheimer, A., Bell, A. and Jofre, S. (2002). Paroxetine, Panorama and user reporting of ADRs: Consumer intelligence matters in clinical practice and post-marketing drug surveillance. The International Journal of Risk and Safety in Medicine, 15(3), pp.161-169.
