Capturing the dynamic dance of proteins
“Proteins are really quite incredible,” said Melanie Adams-Cioaba, senior director/general manager at Thermo Fisher Scientific. “The best way to think about them is that they’re constantly breathing, wiggling and moving.”
The advent of cryo-EM, paired with sophisticated computational methods, has enabled structural biologists to capture the dynamic nature of proteins in unprecedented detail, as Nature noted in 2022. As a single-particle technique, cryo-EM can capture images representing different conformational states of the same protein, providing valuable insights into its functional mechanisms.
In situ electron tomography allows researchers to visualize proteins in their native cellular environment, providing a more comprehensive understanding of protein function and interactions. “Increasingly, we’re now able to visualize proteins in their native environments at high resolution, without losing the context of everything that happens in the cell around them,” Adams-Cioaba added.
Beyond protein snapshots
This ability to visualize proteins as dynamic entities has significant implications for understanding protein function and drug discovery. By integrating cryo-EM data with prior physico-chemical knowledge, researchers can now model the dynamic behavior of proteins and understand how they respond to changes in their environment. In addition, researchers’ ability to simultaneously determine protein structure and dynamics from cryo-EM data can shed light on the mechanisms of protein function and the effects of drug binding. “It’s not enough to have just one structure; you want the structure and the movement,” Adams-Cioaba said. “You want to be able to watch what happens as you put a drug on there or as we think about changing features of its environment and understand how it responds to its environment.”
Continued advances in resolution and contrast, improved accessibility, and the increasing integration of cryo-EM with AI and automation have upended the field of structural biology by enabling unprecedented structural insights into challenging biological targets. “Where we want to go, and what is becoming very tractable in the coming years, is getting closer and closer to structure on demand and our ability to extract not just single structures, but many structures from the same experimental dataset,” Adams-Cioaba said.
Managing cryo-EM’s information avalanche
While cryo-EM is a powerful tool, the amount of data it generates can overwhelm even data-savvy research organizations. A single cryo-EM dataset can consist of hundreds of thousands of individual particle images, each containing millions of pixels. Researchers are making progress in creating data banks of cryo-EM maps and models. Frontiers in Molecular Biosciences noted that as of August 2, 2023, almost 24,000 single-particle EM maps and some 15,000 structural models were in the Electron Microscopy Data Bank (EMDB).
“I think that along the cryo-EM workflow, cryo-EM itself is a big data problem,” Adams-Cioaba said. “We take thousands upon thousands of images of millions and millions of particles, and then we want to extract the best possible information from there. Increasingly, we’re seeing the use of artificial intelligence approaches to help better mine structural information and automate the workflows for data processing from those images from the microscope.”
Unveiling the secrets of protein motion
Researchers acquire thousands upon thousands of images containing millions and millions of individual particle views. “We want to extract the best possible information from there,” Adams-Cioaba noted. Increasingly, AI approaches are shedding light on this structural data and automating the complex data processing workflows required to reconstruct 3D models from the raw microscope images.
The trend has ramifications for the dynamic view of proteins described earlier. “We should be able to capture discrete and continuous conformations using AI, big data approaches, advanced processing algorithms, and leaner data. This will allow us to really start seeing not just a protein structure, but also how it moves,” Adams-Cioaba said. “I think as AI, data processing, data acquisition, image analysis, and everything improves, these dynamic views of proteins are going to become more the standard. I think that’s a really exciting way of thinking about biology.”
Integrating AI with cryo-EM and the role of federated learning
Advancing AI’s role in structural biology faces a hurdle: the proprietary nature of many drug discovery datasets. That’s where some of the early work in federated learning is promising. Cutting-edge protein folding prediction software could potentially learn from this data without compromising confidentiality.
Federated learning, which allows AI models to learn from decentralized datasets without the need for data sharing, could be a game-changer in this regard. By enabling AI algorithms to learn from proprietary drug discovery data while preserving data privacy, federated learning could accelerate the development of new drugs and therapies.
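To make the idea concrete, here is a minimal sketch of federated averaging, one common way to implement federated learning. It is a toy under stated assumptions, not a description of any system mentioned in this article: the model is a simple straight-line fit, the three "sites" hold synthetic data, and every function name is hypothetical. The point it illustrates is that each site trains on its own data and sends back only model parameters, never the underlying records.

```python
import random


def local_update(weights, data, lr=0.1, epochs=5):
    """Run a few gradient-descent steps on one site's private data.

    Only the updated (w, b) pair leaves the site; `data` is never shared.
    """
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        # Gradients of mean squared error for the model y = w * x + b
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(updates, sizes):
    """Combine site models, weighting each by its number of samples."""
    total = sum(sizes)
    w = sum(u[0] * s for u, s in zip(updates, sizes)) / total
    b = sum(u[1] * s for u, s in zip(updates, sizes)) / total
    return (w, b)


def train_federated(site_datasets, rounds=50):
    """Coordinator loop: broadcast the model, collect updates, average."""
    global_weights = (0.0, 0.0)
    sizes = [len(d) for d in site_datasets]
    for _ in range(rounds):
        updates = [local_update(global_weights, d) for d in site_datasets]
        global_weights = federated_average(updates, sizes)
    return global_weights


def make_site_data(n, seed):
    """Synthetic stand-in for one organization's private dataset.

    Assumption for illustration: every site observes y = 2x + 1 plus noise.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = 2.0 * x + 1.0 + rng.gauss(0.0, 0.01)
        data.append((x, y))
    return data


if __name__ == "__main__":
    sites = [make_site_data(30, 1), make_site_data(50, 2), make_site_data(20, 3)]
    w, b = train_federated(sites)
    print(f"learned slope={w:.3f}, intercept={b:.3f}")
```

In this sketch the coordinator sees only the slope and intercept each site returns. Real deployments typically add protections such as secure aggregation or differential privacy, because model updates on their own can still leak information about the training data.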
Looking to the future, Adams-Cioaba sees more potential in the combination of AI and structural biology. “If all of the world’s structural information was accessible to AI learning algorithms, I think we would be even further ahead in how well we could predict binding events, what happens, and how well we could do in silico design of new drugs,” she said.