Data-driven innovation in clinical pharmaceutical research

J. New
Center for Data Innovation
United States

From screening chemical compounds to optimising clinical trials and improving post-market surveillance of drugs, the increased use of data and better analytical tools such as artificial intelligence (AI) could transform drug development. This will lead to new treatments, improved patient outcomes and lower costs. This essay examines the way data-driven innovation, particularly AI, is transforming the drug development life cycle and recommends policies to accelerate this transformation.

Several recent developments have already begun to transform the entire drug development life cycle. These include widespread adoption of electronic health records, availability of new data sources thanks to genetic sequencing and smart technologies, and maturation and increased adoption of AI technologies. This transformation is particularly apparent in the clinical research phase of drug development. This essay is an excerpt from a 2019 white paper, “The promise of data-driven drug development”, which examines the way data-driven innovation, particularly AI, is transforming the drug development life cycle and recommends policies to accelerate this transformation (New, 2019).

The US Food and Drug Administration (FDA) categorises the drug development life cycle into five stages: discovery and development, preclinical research, clinical research, FDA review and FDA post-market safety monitoring. This excerpt from New (2019) highlights the role data-driven innovation can play in improving the clinical research phase. This phase uses studies and clinical trials to examine how a drug interacts with the human body.

A major barrier to developing new treatments is the cost of evaluating candidate drugs for safety and efficacy. As of 2018, the average cost of an individual clinical trial was USD 19 million (JHSPH, 2018). This is consistent with a 2014 study for the US Department of Health and Human Services, which estimated the total costs of Phase I, II, III and IV trials for a drug at USD 44-115.3 million (Sertkaya et al., 2014). Improved use of data and analytics can significantly reduce the costs of clinical trials.

One of the most promising ways to reduce costs is through improved use of data and AI in clinical trial design, particularly to increase patient recruitment and engagement. Selecting a site to perform a clinical trial can be a significant financial commitment, especially since there are no guarantees patients will show up. To minimise this risk, companies including Vitrana have developed AI systems that can guide site-selection decisions. The systems analyse factors such as historical site-performance data and study requirements (Kaufman, 2018; Brown, 2019).

Several companies are using AI to improve patient recruitment directly. For example, Deep 6 AI analyses structured and unstructured clinical data to better identify patients who match trial criteria, allowing trial organisers to conduct more targeted recruitment (Kaufman, 2018; Brown, 2019). London-based Antidote uses machine learning (ML) for similar purposes. Indeed, the company claims ML enabled the referral of 8 000 patients for a clinical trial relating to Alzheimer’s disease in under two months. Moreover, these referrals were seven times more likely to follow through with the recruitment process than those from other sources (Sennaar, 2019).
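As a rough illustration of what matching patients against trial criteria involves, the following Python sketch filters a handful of made-up structured records against made-up inclusion and exclusion rules. It does not describe how Deep 6 AI or Antidote work: the class names, fields and criteria are all hypothetical, and real systems also have to interpret unstructured clinical notes, which this sketch does not attempt.

```python
from dataclasses import dataclass, field


@dataclass
class Patient:
    # Hypothetical structured record; field names are assumptions.
    patient_id: str
    age: int
    diagnoses: set = field(default_factory=set)
    medications: set = field(default_factory=set)


@dataclass
class TrialCriteria:
    # Hypothetical criteria for an illustrative trial.
    min_age: int
    max_age: int
    required_diagnosis: str
    excluded_medications: set = field(default_factory=set)


def is_eligible(patient, criteria):
    """Return True if the patient meets every inclusion rule and no exclusion rule."""
    in_age_range = criteria.min_age <= patient.age <= criteria.max_age
    has_diagnosis = criteria.required_diagnosis in patient.diagnoses
    takes_excluded = bool(patient.medications & criteria.excluded_medications)
    return in_age_range and has_diagnosis and not takes_excluded


def shortlist(patients, criteria):
    """Return the IDs of patients who could be approached for recruitment."""
    return [p.patient_id for p in patients if is_eligible(p, criteria)]


# Illustrative data only; none of it comes from a real trial.
criteria = TrialCriteria(
    min_age=60,
    max_age=85,
    required_diagnosis="alzheimers",
    excluded_medications={"warfarin"},
)
patients = [
    Patient("P1", 72, {"alzheimers"}, {"donepezil"}),
    Patient("P2", 58, {"alzheimers"}, set()),       # too young
    Patient("P3", 80, {"alzheimers"}, {"warfarin"}),  # excluded medication
    Patient("P4", 66, {"hypertension"}, set()),     # lacks the diagnosis
]
eligible_ids = shortlist(patients, criteria)
print(eligible_ids)
```

Only the first record passes all three checks. The value of the ML approaches described above lies in the steps this sketch skips, such as extracting diagnoses and medications from free-text notes.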

Even when a trial has enough recruits, they must participate in the full trial for it to be successful. However, failure to engage participants properly can cause them to drop out or not adhere to trial rules, thereby reducing the trial’s effectiveness. To reduce this risk, Palo Alto start-up Brite Health has developed a smartphone app that uses ML to improve and maintain patient engagement. The app provides users with notifications and nudges them to perform required tasks and site visits. It also uses a chatbot that can make trial information more accessible to patients, while algorithms identify and flag indicators of patient disengagement for trial organisers (Sennaar, 2019).
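To make the idea of flagging disengagement concrete, the sketch below applies two simple hand-written rules to hypothetical app activity data: a low task-completion rate and a long period of inactivity. This is a rule-based stand-in rather than ML, and it is not Brite Health’s method. The field names and both thresholds are assumptions chosen purely for illustration.

```python
from datetime import date


def flag_disengaged(participants, today, min_completion=0.7, max_inactive_days=7):
    """Return IDs of participants who look disengaged under two simple rules.

    Both thresholds are illustrative assumptions, not values from any study.
    """
    flagged = []
    for p in participants:
        assigned = p["tasks_assigned"]
        # Treat participants with no assigned tasks as fully compliant.
        completion = p["tasks_completed"] / assigned if assigned else 1.0
        inactive_days = (today - p["last_active"]).days
        if completion < min_completion or inactive_days > max_inactive_days:
            flagged.append(p["id"])
    return flagged


# Hypothetical activity log for three participants.
participants = [
    {"id": "A", "tasks_assigned": 10, "tasks_completed": 9, "last_active": date(2024, 3, 9)},
    {"id": "B", "tasks_assigned": 10, "tasks_completed": 4, "last_active": date(2024, 3, 9)},
    {"id": "C", "tasks_assigned": 10, "tasks_completed": 10, "last_active": date(2024, 2, 25)},
]
flagged_ids = flag_disengaged(participants, today=date(2024, 3, 10))
print(flagged_ids)
```

Participant B is flagged for completing too few tasks and participant C for two weeks of inactivity. An ML-based system would instead learn which activity patterns precede drop-out from past trial data.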

In some cases, patients may end their participation in a trial due to the negative side effects of a treatment. Here, too, AI can help. Researchers have developed ML algorithms that can identify the fewest and smallest doses of a chemotherapy regimen that can still shrink brain tumours, thus reducing the toxicity of the treatment (Yauney and Shah, 2018). In a simulated trial, the researchers’ ML model reduced treatment potency by between 25% and 50% across all doses without reducing effectiveness (Yauney and Shah, 2018). By minimising the risk of side effects, researchers can more reliably ensure patient adherence to a clinical trial (Harrer et al., 2019).

New technologies also make it possible to conduct decentralised and virtual clinical trials. This can both make it easier to recruit patients from a wide area and reduce overhead costs. In October 2017, life sciences company AOBiome Therapeutics completed a 12-week clinical trial of an acne drug that proved to be safe and effective (Mantel-Undark, 2018). Unlike a traditional clinical trial, however, participants completed the trial at home. AOBiome mailed participants either the drug or a placebo, along with an iPhone that came pre-loaded with an app for participants to take and share regular selfies of their acne, as well as communicate with study organisers throughout the trial (Mantel-Undark, 2018). This approach enabled an effective clinical trial with no in-person screening or site visits, which substantially reduced both costs and barriers to participation.

Pharmaceutical companies have been actively exploring the potential to replace or augment traditional in-person trials with data technologies. For example, the French company Sanofi launched a clinical trial that initially required participants to visit the trial site regularly so that organisers could collect data on their weight, blood pressure and blood glucose. Sanofi then extended the trial, giving participants connected sensors and wireless technology to record and share these data from their homes (Mantel-Undark, 2018). GlaxoSmithKline sponsored a study to demonstrate the feasibility of using a smartphone and app to record survey data from rheumatoid arthritis patients. The study also used the phone’s accelerometer to record wrist-motion exercises, and found the accelerometer data could be much more accurate than motion-evaluation exercises performed in person with a physician (Mantel-Undark, 2018). Finally, Novartis has partnered with Apple to use Apple’s ResearchKit to improve clinical trial recruitment and administration. The partnership helps researchers develop apps for smart devices to collect and share medically relevant data, such as biometric sensor data and user-entered information (McConaghie, 2018).

Site visits can cost between USD 3 000 and USD 7 000 per patient, and studies can involve dozens of visits and hundreds of patients. Remote data collection could therefore dramatically reduce the cost of clinical trials (Mantel-Undark, 2018).
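A back-of-the-envelope calculation shows how quickly these figures add up. The sketch below assumes the USD 3 000 to USD 7 000 range applies to each visit by each patient, and uses a hypothetical trial of 24 visits and 300 patients to stand in for “dozens” and “hundreds”. Neither number comes from the source.

```python
# Per-visit cost range cited in the text (Mantel-Undark, 2018), in USD.
COST_PER_VISIT_LOW_USD = 3_000
COST_PER_VISIT_HIGH_USD = 7_000


def site_visit_cost_range(visits_per_patient, patients):
    """Return (low, high) total site-visit cost in USD.

    Assumes the cited range applies to each visit by each patient.
    """
    total_visits = visits_per_patient * patients
    return (
        total_visits * COST_PER_VISIT_LOW_USD,
        total_visits * COST_PER_VISIT_HIGH_USD,
    )


# Hypothetical trial size: 24 visits per patient, 300 patients.
low, high = site_visit_cost_range(visits_per_patient=24, patients=300)
print(f"USD {low:,} to USD {high:,}")  # USD 21,600,000 to USD 50,400,000
```

Under these assumptions, site visits alone would cost between USD 21.6 million and USD 50.4 million, so replacing even a fraction of them with remote data collection would yield substantial savings.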

New technologies such as the Internet of Things provide opportunities to collect large amounts of data outside a traditional health-care context. Such data are known as real-world data (RWD), and the evidence derived from them to help inform drug evaluation is known as real-world evidence (RWE) (FDA, 2018). In December 2018, the FDA published the framework for its Real-World Evidence Program. It provides guidance about how to incorporate RWD into clinical trials to create meaningful RWE (FDA, 2018).

Policy makers can and should play a role in accelerating data-driven innovation in drug development, both to maximise the benefits of these new technologies and to mitigate potential risks.

Policy makers should expand access to institutional and non-traditional data. For example, they could reduce regulatory barriers to data sharing, better enforce publication of clinical trial results and promote data sharing with international partners.

Policy makers should modernise regulatory processes, including by expanding and fully supporting programmes to evaluate and share foreign clinical trial data.

Racial and ethnic minorities, as well as women, have been historically underrepresented in clinical trials. This has led to evaluation of drugs based on data unrepresentative of the general population (Castro, 2014; ACC, 2018). Policy makers should invest in programmes that promote equity in drug development.

Policy makers should invest heavily in developing a workforce with the necessary AI skills to develop and implement data-driven innovations at scale.

Data-driven innovation promises to be even more transformative in medicine than in many other sectors. The benefits of these technologies can lead to new and safer treatments, improved patient outcomes and lower costs. The clinical research phase of drug development is particularly ripe for this kind of disruption. Already, use of AI and other data-driven technologies is transforming this field.


ACC (2018), “Study explores representation of women in clinical trials”, 30 April, American College of Cardiology, Washington, DC.

Brown, C. (2019), “The next step: Using AI to formulate clinical trial research questions”, 8 January, Anju Life Sciences Software, Phoenix.

Castro, D. (2014), “The rise of data poverty in America”, 10 September, Center for Data Innovation, Washington, DC.

FDA (2018), Framework for FDA’s Real-World Evidence Program, December, US Food and Drug Administration, Washington, DC.

Harrer, S. et al. (2019), “Artificial intelligence for clinical trial design”, Trends in Pharmacological Sciences, Vol. 40/8, pp. 577-591.

JHSPH (2018), “Cost of clinical trials for new drug FDA approval are fraction of total tab”, 24 September, Press Release, Johns Hopkins Bloomberg School of Public Health, Baltimore.

Kaufman, J. (2018), “The innovative startups improving clinical trial recruitment, enrollment, retention, and design”, 30 November, MobiHealthNews.

Mantel-Undark, B. (2018), “The search for new drugs is coming to your house”, 30 August, Fast Company.

McConaghie, A. (2018), “Novartis and Apple to scale up clinical trial collaboration”, 24 January, Pharma Phorum.

New, J. (2019), “The promise of data-driven drug development”, 18 September, Center for Data Innovation, Washington, DC.

Sennaar, K. (2019), “AI and machine learning for clinical trials – examining 3 current approaches”, 5 March, Emerj, Boston.

Sertkaya, A. et al. (2014), “Examination of clinical trial costs and barriers for drug development”, submitted to the US Department of Health and Human Services Office of the Assistant Secretary for Planning and Evaluation, July.

Yauney, G. and P. Shah (2018), “Reinforcement learning with action-derived rewards for chemotherapy and clinical trial dosing regimen selection”, in Proceedings of the 3rd Machine Learning for Health Care Conference, Vol. 85, pp. 161-226.

© OECD 2023