Endometriosis, a condition in which endometrial tissue grows outside the uterus, has a strong genetic underpinning. A new study published in the Journal of Molecular Diagnostics sheds light on this connection. A team of researchers from Genzeva, LumaGene, RYLTI Biopharma, Brigham & Women’s Hospital of Harvard University and QIAGEN Digital Insights took a unique approach to analyzing endometriosis patient samples, combining multiomics data, next-generation sequencing, phenotype-driven analysis, and a biomimetic digital twin ecosystem.
Potential biomarker identified
The landmark research led by Dr. William G. Kearns found eight pathogenic mutations and four variants of unknown significance (VUSs) potentially associated with the disease across nearly all patient samples. Notably, a specific VUS in the MUC20 gene appeared in every endometriosis sample, suggesting it could be a critical biomarker for developing less invasive diagnostic tests.
Advanced analytics helps pave the way to new insights
The pioneering study also underscored how advanced machine learning capabilities, when integrated with genomic sequencing and multi-dimensional data analysis, can drive new biological insights. In particular, the group’s approach combined phenotype-ranking algorithms with knowledge engineering via a biomimetic digital twin ecosystem, as described in the Journal of Molecular Diagnostics paper. This ecosystem applied real-world reasoning principles to analyze non-normalized, raw data and identify hidden or “dark” data relationships. The paper states that this is the first study to incorporate recommendations from the National Academies of Sciences, Engineering, and Medicine on using digital twins to model biological complexity in biomedical research. The authors propose that this biomimetic digital twin method could be applied to understand disease mechanisms, conduct virtual clinical trials, and identify new therapies.
To learn more about this approach and its implications, we caught up with Dr. Kearns and Joseph Glick, an award-winning biomimetic AI pioneer and RYLTI co-founder who helped architect the digital twin ecosystem used in the study.
Your study identified four variants of unknown clinical significance potentially linked to endometriosis-related disorders in nearly all patients analyzed, with one specific VUS present in all patient samples. Could you elaborate on the potential significance of these findings?
Dr. William G. Kearns: Endometriosis is a complex, multifactorial disease that requires an invasive surgical procedure to obtain samples for pathology diagnosis. In the endometriomas of all patient samples analyzed, we identified a DNA variant classified as a variant of unknown clinical significance in the MUC20 gene. This could potentially lead to a less invasive test to diagnose endometriosis. For example, one could do a needle biopsy to obtain tissue, and then do DNA sequencing to determine if this VUS is present.
The pathophysiology of endometriosis is poorly understood. The identification of VUSs in two genes that are controlled by the same gene enhancer could lead to a clearer understanding of the pathways involved in the development of endometriosis.
Can you tell me the backstory of the biomimetic digital twin ecosystem, and how it was used in the advanced genomics experimental protocol to address the limitations of AI/ML/LLMs described in the study?
Joseph Glick: In the real-world lab of delivering solutions, RYLTI has evolved the methods of real-world complexity modeling, real-world reasoning, real-world learning and other adaptation methods, to explore highly complex, multidimensional and multiscale problem domains for global organizations. The NAS Digital Twins report is an authoritative validation of our approach and innovations. To support genomic research, we worked with Dr. Kearns to build a biomimetic digital twin ecosystem composed of four twin classes: patient profile, phenotype, gene variant and protein variant. We incorporated the ecosystem into his advanced genomics experimental protocol, which we believe is the first report of including this methodology in research to understand the pathophysiology of disease. The use of this methodology has leveraged dark data and enabled unexpected discoveries. This example is illustrative, and the methodology can be applied in drug research and development and in any information domain that is complex, multidimensional, multiscale or dynamic. Any executive who needs to weigh complex options and tradeoffs, or manage dynamic, multifactor risks, can benefit from the methodology.
Can you say a bit more about how the study incorporated the recommendations of the National Academies of Sciences, Engineering, and Medicine (NAS) to address the limitations of AI/ML/LLMs in biomedical research?
Dr. Kearns: While AI, Machine Learning (ML) and Large Language Models (LLMs) hold tremendous promise for driving advances in biomedical research, these technologies, like all technologies, have limitations. Traditional AI, ML and LLMs normalize data and remove outliers, thus hindering the identification of hidden “dark” data, which is unstructured or disconnected information. In fairness, it can be argued that normalizing data to remove outliers simplifies datasets, which can also be adjusted in other ways. AI, ML and LLMs also require a training set to perform the analysis, and this could unintentionally introduce bias into the process. However, when using AI, ML and LLMs, one can modify and enhance training sets to define the problem to be solved more accurately and so reduce bias.