For all of its promise in healthcare and elsewhere, deploying artificial intelligence is frequently a challenging endeavor. “Close collaboration between data science teams, other project team members and stakeholders is essential,” said Jennifer Bradford, director of data science at Phastar, the London-headquartered contract research organization. While input from computational, statistical or medical experts can be essential to inform data science models, all stakeholders must understand the requirements and work “in sync with the project,” Bradford said.
In the following interview, Bradford shares advice on how to collaborate effectively on data science projects, the impact of COVID-19 on data science in pharma and the potential for AI to accelerate R&D timelines.
What comes next after alignment between different stakeholders on data science projects is confirmed?
Bradford: Once alignment is confirmed, continuous communications among the teams must become a priority. In this way, progress milestones can be shared to ensure the project moves forward in the right direction and that concerns and questions are answered early on.
The ability to translate the technical approach into lay language is critical for bridging knowledge gaps and keeping discussions going. For example, explaining why a technical approach is being used, and what it will provide in the context of the medical question, can help drive engagement from the rest of the team. It may also further discussions and identify potential future opportunities.
Data scientists must also understand when it is necessary to bring in other disciplines, whether computational, statistical or medical. Working closely and effectively with these teams is essential, ideally from the start.
Demand for data science in clinical trials has been increasing in recent years. How has the pandemic affected demand for data science?
Bradford: Generally, data science, alongside other related disciplines (statistics, computational biology, etc.), is supporting the pandemic response in many different ways. This could include epidemiological studies, drug discovery and development, data-driven modeling for COVID-19, predictive analytics, monitoring approaches and more.
Specifically, from a trials perspective, we have seen the industry adapt during the pandemic to move forward safely and effectively. For example, we have seen an increase in remote monitoring and decentralized trials, which has driven the need for different ways to monitor data quality. In general, medical monitors are making fewer visits to clinical trial sites, so teams need alternative ways to monitor the data effectively and efficiently. Similarly, remote monitoring has a key role in ensuring data quality in decentralized trials. In both cases, data science techniques can support the industry. Monitoring tools that use different methods, including machine learning, to detect outliers, anomalies and unusual site behaviors can give trial teams analytical insights into performance and data integrity during the clinical trial.
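As a rough illustration of the kind of outlier detection described here, the Python sketch below flags sites whose value for a single metric sits far from the rest, using a robust z-score based on the median and the median absolute deviation. It is a simple statistical stand-in rather than a machine learning method or anything Phastar is known to use, and the function name, the metric (data query rate per site), the threshold of 3.5 and the numbers are all hypothetical.

```python
from statistics import median


def flag_unusual_sites(site_values, threshold=3.5):
    """Return the site IDs whose metric has a robust z-score above threshold.

    site_values maps a site ID to one numeric metric (hypothetical example:
    the fraction of data points that triggered a query).
    """
    values = list(site_values.values())
    center = median(values)
    # Median absolute deviation: a spread estimate that a single extreme
    # site cannot inflate the way a standard deviation can be inflated.
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        # More than half the sites share the same value; no usable scale.
        return []
    flagged = []
    for site, value in site_values.items():
        # 0.6745 rescales the MAD so the score is comparable to a z-score
        # when the data are roughly normal.
        score = 0.6745 * (value - center) / mad
        if abs(score) > threshold:
            flagged.append(site)
    return flagged


# Hypothetical query rates for six sites; site_05 is deliberately extreme.
query_rates = {
    "site_01": 0.040,
    "site_02": 0.050,
    "site_03": 0.045,
    "site_04": 0.038,
    "site_05": 0.210,
    "site_06": 0.048,
}
print(flag_unusual_sites(query_rates))
```

A flag from a check like this would only be a prompt for a monitor to look more closely at a site, not a conclusion about data integrity.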
What advice would you give people with backgrounds in bioinformatics and related fields who are interested in deepening their grasp of the latest data science tools?
Bradford: Education is key. This means reading and understanding what is going on in the field. Attending related webinars can help create a greater understanding of current advances and technologies. From a healthcare and biology perspective, many open data sets are available for exploring the application of different methods. It is important to focus on the question to be answered. These activities can assist in gaining greater knowledge in this fast-changing environment.
How can machine learning help accelerate R&D timelines?
Bradford: From the beginning, it is critical to work with interdisciplinary teams to understand exactly what insights are required, the data needed to support the requirements, the workflow of the data and what should be monitored. The technical approach (e.g., applying machine learning or other statistical techniques) is driven by the question or requirement. For example, what insights does the team hope to generate from a device such as a wearable, and how will the results be used? It is questions like these that will drive what data is used and in what way it will be used.
From the perspective of R&D, machine learning methods have immense potential. They may be used to better understand biological pathways and diseases, target validation, trial design and the identification of prognostic biomarkers, to name a few. From wearables alone, large amounts of instant ‘real-time’ data are generated, and machine learning can provide insights into patterns and signals that can be used to expedite and improve trial conduct.
Above all, working closely with the clinical teams is essential to understand the questions and insights the teams require to identify the right data used in the right way at the right time.
Could you summarize how machine learning can be used to target patient recruitment and other uses in clinical trials?
Bradford: Recruitment into trials is often a bottleneck and can be a costly and time-consuming step. Artificial intelligence (AI) can support recruitment in different ways. For instance, machine learning can help identify patient populations potentially responsive to treatment, for example, by identifying prognostic biomarkers based on historical trials. From a patient perspective, there are increasing applications that support searching trial registries to match patients to trials. Similarly, natural language processing, a branch of AI that can help make sense of the written word, could be useful in searching physician notes for people potentially eligible for a clinical trial.
Machine learning can also be used to create predictive models (for the mechanism of action, response, etc.). For example, different data, such as clinical data, imaging data, electronic health care records or genomic data, can be combined to create predictive models. Machine learning can also support the co-development of companion diagnostics and therapies for more effective treatments.
Additionally, machine learning can be applied to medical imaging and other approaches to support faster diagnosis and monitor disease progression.
What is your vision for how techniques like machine learning could change how we develop new therapies in the medium- to long-term?
Bradford: Basically, machines learn from data. So, as we start to see the benefits from the technology, we may also see improvements in data standards, sharing and workflows. This, in turn, will drive the development of new approaches and applications.
In the future, I think much of the technology may sit behind vendor applications so that teams may use an AI interface to support their decision-making without necessarily realizing they are doing so.
From the sponsor perspective, AI and ML approaches may help drive efficiencies in the pipeline, reducing overall costs — from development through the manufacturing process.
From the patient perspective, data and AI will help researchers better understand the safety profile of a treatment before it is administered to people, supporting how patients are monitored and managed. Additionally, models may predict how a therapy will perform in the real world, which could also impact trial designs and monitoring.