While clinical trials and regulatory filings offer a semi-structured view of drug safety, a wealth of insight lies in sources ranging from patient support programs (PSPs) to social media posts. As Natural Language Processing (NLP) evolves, a growing number of tools are becoming available to unlock this potential.
Deepanshu Saini, Director of Program Management at IQVIA, divides NLP techniques into four broad categories as they relate to pharmacovigilance. “These technologies have evolved over time, each bringing new capabilities to process and understand unstructured data in the pharmaceutical world,” Saini said.
[For more on Saini’s take on NLP, check out his article “From social media to safety signals: How AI and NLP are transforming drug safety monitoring”]
1. Keyword search: Practical and fast but limited accuracy
While seemingly straightforward, keyword search has been a cornerstone of data analysis for decades. The technology has its roots in early web search techniques that used basic keyword matching to index websites. While fast, it lacks precision. Imagine a pharmacovigilance team tasked with identifying reports of headaches associated with a specific drug. A simple keyword search for “headache” across patient forums and social media would unearth a large volume of mentions, yet many of these results could be false positives.
This simplicity comes at a cost. As Saini explains, “Keyword search examines large unstructured datasets for keywords and signals, but without any context.” This lack of nuance can lead to misleading results.
In addition, the technique can fail to surface related results. A search for “cephalalgia,” a medical term for headache, might work in medical journals but not on social media sites where patients use everyday language.
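The failure modes described above are easy to see in a minimal sketch. The posts, the drug name and the matching rule below are all hypothetical, invented purely to illustrate how context-free matching produces both false positives and false negatives:

```python
def keyword_search(posts, keyword):
    """Return every post containing the keyword, with no regard for context."""
    return [p for p in posts if keyword.lower() in p.lower()]

posts = [
    "Started DrugX last week and now I have a pounding headache.",  # true positive
    "Dealing with my insurer is such a headache.",                  # false positive: figurative use
    "DrugX gave me terrible cephalalgia.",                          # false negative: clinical synonym
]

hits = keyword_search(posts, "headache")
# The figurative complaint is matched; the cephalalgia report is missed.
```

Both errors stem from the same root cause: the matcher sees character strings, not meaning.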
2. Semantic search can spot synonyms
To overcome the shortcomings of purely keyword-based approaches, semantic search emerged as a more intelligent alternative. The technique, whose roots also stretch back decades, became more mainstream around 2012 and 2013, when Google made strides in implementing semantic search at scale with its Knowledge Graph and Hummingbird update, which prized context over keywords.
Instead of simply matching words, semantic search explores the meaning and relationships between them. Returning to the example of a pharmacovigilance team searching for headache reports: while a keyword search for “headache” or “cephalalgia” might miss mentions of “migraine” or “severe head pain,” semantic search could capture all of these related terms simultaneously. “Semantic search doesn’t just look for a keyword but considers the context and all kinds of meanings associated with it,” Saini said. This ability to recognize synonyms and related terms drastically reduces false negatives.
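One common building block for this (and one Saini later mentions pharma teams using) is an ontology of synonyms that expands a query before matching. The toy term list below is illustrative only, not a real medical vocabulary such as MedDRA:

```python
# Hand-built synonym ontology (hypothetical, for illustration).
SYNONYMS = {
    "headache": {"headache", "cephalalgia", "migraine", "head pain"},
}

def ontology_search(posts, concept):
    """Match any term the ontology maps to the concept, not just the literal word."""
    terms = SYNONYMS.get(concept, {concept})
    return [p for p in posts if any(t in p.lower() for t in terms)]

posts = [
    "DrugX gave me terrible cephalalgia.",
    "Severe head pain since my second dose.",
    "Great weather for a run today!",
]

hits = ontology_search(posts, "headache")  # catches both symptom reports
```

Query expansion like this trades a little precision for much better recall; the false positives it lets through are what the later, context-aware models are used to filter out.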
3. Early transformer models (e.g., BERT)
Google also helped shift the landscape of NLP in 2018 with the launch of BERT (Bidirectional Encoder Representations from Transformers), one of the first and most influential transformer-based language models. “This technology enables anyone to train their own state-of-the-art question answering system,” wrote Pandu Nayak, Google Fellow and Vice President of Search, in 2019.
The launch of BERT followed the publication of the seminal paper “Attention Is All You Need,” which popularized the transformer architecture and helped displace recurrent neural networks (RNNs), models that process words in the order they appear.
Transformers, on the other hand, deploy a mechanism called “self-attention.” This capability allows the model to weigh the importance of all words in a sentence simultaneously, no matter where they are, capturing their interdependencies. Imagine the model looking at every word in a sentence at the same time and determining which connections are most meaningful for understanding the overall message. BERT, being bidirectional, can explore the semantics both preceding and following a given word. Models like BERT “provide context in all directions, looking for context before and after the word you’re searching for,” Saini said.
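The “every word attends to every other word” idea can be sketched numerically. The following is a pedagogical version of scaled dot-product self-attention over tiny two-dimensional word vectors; real models use learned, high-dimensional projections and many attention heads, none of which appear here:

```python
import math

def softmax(xs):
    """Convert raw scores into positive weights that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Encode each word as a weighted mix of every word in the sentence."""
    d = len(vectors[0])
    encoded = []
    for q in vectors:  # q plays the role of the "query" word being encoded
        # Similarity of this word to every word (including itself), scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in vectors]
        weights = softmax(scores)  # the attention distribution
        # Output = attention-weighted sum of all word vectors.
        encoded.append([sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(d)])
    return encoded

sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three toy word embeddings
encoded = self_attention(sentence)
```

Because every output is a convex combination of all the inputs, each encoded word carries information from the whole sentence at once, which is exactly the property that lets the model resolve context in both directions.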
BERT also makes use of word embeddings, which convert words into vectors of floating-point numbers that can then be analyzed computationally. TensorFlow, for instance, offers an Embedding Projector (originally developed by Google) that allows users to visualize embedding data.
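The useful property of embeddings is that words with similar meanings end up near each other in vector space, which is typically measured with cosine similarity. The vectors below are hand-picked toy values, not real learned embeddings (which have hundreds of dimensions):

```python
import math

# Hypothetical 3-D embeddings for illustration only.
EMB = {
    "headache":    [0.90, 0.80, 0.10],
    "cephalalgia": [0.88, 0.82, 0.12],  # near-synonym: placed close to "headache"
    "banana":      [0.10, 0.10, 0.90],  # unrelated word: placed far away
}

def cosine(a, b):
    """Cosine similarity: 1.0 for parallel vectors, near 0 for unrelated ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

sim_syn = cosine(EMB["headache"], EMB["cephalalgia"])  # high: same concept
sim_far = cosine(EMB["headache"], EMB["banana"])       # low: unrelated
```

It is this geometry that semantic search and BERT-style models exploit: “cephalalgia” can match a query for “headache” because their vectors point in nearly the same direction.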
In pharmacovigilance, all of this translates to higher accuracy. “We are actually using [BERT] at IQVIA on the safety side to detect adverse events and reduce false positives,” Saini said. “When you detect an adverse event, your search and semantic search give you a lot of results. BERT helps us nail down the context and reduce those false positives.”
4. Large language models
While transformer models like BERT significantly advanced the NLP landscape, the emergence of large language models (LLMs) such as ChatGPT has sparked both excitement and apprehension, especially in the tightly regulated drug safety world. “Large language models have changed the game quite a bit because of the buzz,” Saini said.
While BERT is open source, the most popular large language models are proprietary, which presents a challenge for auditability.
FDA CFR guidelines, for instance, require “validation of computerized systems.” “You should be able to produce technology that is testable and can be validated as per the laws,” Saini said. “If you’re rolling out any new technology in the market, you should be able to validate it.”
The challenge with standard publicly available LLMs lies in their “black box” nature. “You’re not able to predict what the outcomes will be,” Saini observed. “For example, if you go to ChatGPT and want to rephrase or summarize something, every time you give the same input, it’s going to give you a slightly different output. That’s not testable and can’t be validated, so it can’t meet the benchmarks that the current laws have put in place.”
While LLMs show promise in processing vast amounts of unstructured data, they are known to sometimes “hallucinate” or generate plausible-sounding but incorrect information. This tendency can be particularly problematic in the drug safety context, where accuracy is paramount. Researchers have developed methods to mitigate this issue, such as Retrieval Augmented Generation (RAG), which grounds LLM outputs in verified information sources.
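The core of RAG is a retrieval step that selects verified passages and injects them into the prompt, so the model answers from curated sources rather than from memory. The document store, ranking rule, and prompt template below are hypothetical simplifications (production systems typically retrieve with embeddings rather than word overlap):

```python
# Hypothetical store of verified reference passages.
VERIFIED_DOCS = [
    "DrugX label: headache reported in 4% of trial participants.",
    "DrugX label: store at room temperature.",
]

def retrieve(question, docs, k=1):
    """Rank documents by naive word overlap with the question; keep the top k."""
    q_words = set(question.lower().split())
    ranked = sorted(docs,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return ranked[:k]

def build_prompt(question, docs):
    """Ground the LLM prompt in retrieved context instead of free generation."""
    context = "\n".join(retrieve(question, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

prompt = build_prompt("How often was headache reported for DrugX?", VERIFIED_DOCS)
```

The prompt that would be sent to the LLM now contains the relevant label text and excludes the irrelevant passage, which narrows the space in which the model can hallucinate.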
“These are Gen AI models like ChatGPT,” Saini said of LLMs. “A lot of research is being done, but I’m yet to see high accuracy, especially in the safety world.”
Off-the-shelf LLMs, when used in medical contexts, can also complicate the FDA’s guidance concerning having “audit trails or other physical, logical, or procedural security measures in place to ensure the trustworthiness and reliability.”
Building a toolbox of NLP and ML techniques empowered by human expertise
While NLP tools have evolved over the years, that doesn’t necessarily mean that each new technology supplants earlier ones. “We don’t use these models in isolation,” Saini explains. “It’s really like a toolbox. You use whichever one is best for the task or a combination of tools.”
When combing through social media posts for adverse event signals, pharma companies employ a multi-pronged strategy built from the tools outlined here. “It’s a combination of tools – ontologies for semantic search (keywords plus synonyms), BERT to reduce false positives, and another tool to break ties between those two,” Saini said. “We don’t use BERT or [other NLP] models in isolation. We use them to augment the search — either to make the initial search better or to fine-tune the results and reduce false positives after semantic search.”
Beyond these tools, Saini underscores the importance of human expertise, along with machine learning algorithms such as XGBoost and decision trees, for exploring large datasets of human-categorized information.
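A decision tree learns split rules from exactly this kind of human-labeled data. The sketch below fits only a single split (a “decision stump”) on invented features and labels; real trees (and boosted ensembles like XGBoost) stack many such splits, but the principle of choosing the split that best agrees with human labels is the same:

```python
# Hypothetical labeled posts: (mentions_symptom, mentions_drug, is_adverse_event).
labeled = [
    (1, 1, 1), (1, 1, 1), (1, 0, 0),
    (0, 1, 0), (0, 0, 0), (1, 1, 1),
]

def best_stump(rows):
    """Pick the binary feature whose value most often agrees with the label."""
    best_i, best_acc = 0, -1.0
    for i in range(2):  # candidate features: 0 and 1
        acc = sum(1 for r in rows if r[i] == r[2]) / len(rows)
        if acc > best_acc:
            best_i, best_acc = i, acc
    return best_i, best_acc

feature, accuracy = best_stump(labeled)
```

On this toy data a single feature already agrees with the labels five times out of six; the residual errors are what deeper trees, boosting, and ultimately human review are there to resolve.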
But Saini points out that simply identifying social media signals is only the first step. The larger impact comes from confirming these insights and using them to inform strategies. “It’s beneficial for pharma companies to design patient support programs and develop educational material,” Saini said. “On social media, you pick up signals and build hypotheses. We recommend our clients to confirm these signals through primary research or focus groups, inviting patients and HCPs.” Once confirmed, they develop strategies such as designing patient support programs, training materials, and better guidance for HCPs. Saini concluded: “Then they implement and gather feedback to see if the desired change has occurred. It’s a multi-pronged strategy, not as simple as reading something online and making immediate changes.”