Picture this: A lung cancer drug candidate was headed for abandonment until AI-powered pathology analysis identified which patients would benefit from it. Using machine learning and digital pathology, the two sponsors backing the candidate built an algorithm capable of distinguishing responders from non-responders to the compound they were developing. “Without that AI model and digital biomarker, they were considering retiring the drug,” explained Nathan Buchbinder, co-founder and chief strategy officer of the digital pathology firm Proscia (Philadelphia). “Now they have a new approach to fueling their pipeline.” This success story, involving a pair of Big Pharma partners, highlights how Proscia’s Concentriq Embeddings platform (formally launched late last year) could shift how pharmaceutical companies develop targeted therapies.
The foundational power of embeddings
As its name indicates, this technology involves embeddings—numerical representations that distill the essential features of complex data, such as pathology images, into a form AI can analyze. Put simply, embeddings translate context into numbers, allowing machine learning models to spot patterns in raw data.
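To make the idea concrete, here is a minimal, illustrative sketch (not Proscia’s code) of turning a single image patch into an embedding with a generic pretrained encoder. It assumes PyTorch and torchvision; the file name and the choice of ResNet-50 are stand-ins for demonstration only.

```python
# Minimal sketch: an image becomes a fixed-length vector of numbers.
# Assumes PyTorch/torchvision; ResNet-50 stands in for a pathology foundation model.
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

weights = models.ResNet50_Weights.DEFAULT
encoder = models.resnet50(weights=weights)
encoder.fc = torch.nn.Identity()   # drop the classifier head, keep the feature extractor
encoder.eval()

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# "tile.png" is a placeholder for any image patch, such as a tissue tile.
img = preprocess(Image.open("tile.png").convert("RGB")).unsqueeze(0)

with torch.no_grad():
    embedding = encoder(img)   # shape: (1, 2048)

# Downstream models compare and classify these vectors instead of raw pixels.
print(embedding.shape)
```

Nearby vectors correspond to visually similar inputs, which is what lets downstream models find patterns across thousands of slides.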
In the case of Concentriq, embeddings generated by foundation models empower researchers to identify biomarkers, predict treatment responses, and classify samples in pathology and beyond. Although embeddings don’t always capture headlines, they’re foundational to a range of technologies, from Google’s semantic search to ChatGPT’s grasp of context. In pathology, embeddings are promising because they can surface tissue patterns invisible to the human eye, helping distinguish responders from non-responders to treatments, identify novel biomarkers, and enable precision diagnostics without requiring exhaustive manual annotation of every slide.
The problem Concentriq Embeddings addresses
Yet, for all their power, embeddings have historically been a challenge to implement, demanding specialized expertise and significant computational resources. That’s where Concentriq Embeddings steps in. “The pathology world didn’t need another hit movie. It needed to be easy,” Julianna Ianni, Ph.D., VP of AI Research & Development at Proscia, said in a podcast with Heather Couture, Ph.D. “Some models are, of course, better for one task than another, but when we looked at where we could make the most impact, it was just obvious to me and my team that the common struggles of implementing these models, the logjam to blockbuster, if you will, those struggles were far more the barrier to use than having better model performance.”
For life sciences organizations investing in AI, the promise of finding hidden patterns in pathology data often collides with practical realities. “When these life sciences organizations try to build AI applications, they’re typically spending a lot of resources on AI—they’re throwing money at it because they know how important it is, but it’s generally a really inefficient process,” Ianni said in an interview.
The workflow of using embeddings in digital pathology can be daunting when working with off-the-shelf tools. Researchers must source data, download gigabyte-sized whole slide images, organize them, and tile them into smaller images—all before they can begin actual analysis. “You’re going through many different steps,” Ianni notes. “At that point, you can either build an end-to-end machine learning or deep learning model, which requires a good amount of data, or you need to find and deploy some kind of encoder or foundation model to generate features—to generate embeddings from those images.”
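For a sense of what that do-it-yourself pipeline involves, here is a rough, hypothetical sketch that tiles a whole slide image and runs each tile through a generic encoder. It assumes the openslide-python library and a torchvision model standing in for a foundation model; the file name, tile size, and model choice are illustrative assumptions.

```python
# Rough sketch of the DIY workflow: tile a whole slide image, embed every tile.
# Assumes openslide-python and torchvision; names and sizes are illustrative.
import torch
import torchvision.models as models
import torchvision.transforms as T
import openslide

TILE = 512  # pixels per tile side at full resolution (an assumed value)

weights = models.ResNet50_Weights.DEFAULT
encoder = models.resnet50(weights=weights)
encoder.fc = torch.nn.Identity()
encoder.eval()
preprocess = T.Compose([
    T.Resize(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

slide = openslide.OpenSlide("example_slide.svs")  # a gigabyte-sized whole slide image
width, height = slide.dimensions

embeddings = []
with torch.no_grad():
    for y in range(0, height - TILE + 1, TILE):
        for x in range(0, width - TILE + 1, TILE):
            # Read one tile, drop the alpha channel, preprocess, and embed it.
            tile = slide.read_region((x, y), 0, (TILE, TILE)).convert("RGB")
            embeddings.append(encoder(preprocess(tile).unsqueeze(0)).squeeze(0))

# One vector per tile; a single slide can yield tens of thousands of tiles,
# which is why this step is slow and storage-heavy without managed tooling.
features = torch.stack(embeddings)
print(features.shape)
```

Even this simplified version skips the data sourcing, organization, and GPU provisioning Ianni describes; multiplying it across hundreds of slides is where the effort balloons.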
In a traditional approach, only after completing these labor-intensive preliminary steps can researchers “really do what you were setting out to do in the first place: build an AI model, whether that’s a model to predict tumor regions, do anomaly detection, identify potential tissue damage from a therapeutic candidate, or countless other applications of AI in digital pathology.”
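As a rough illustration of that final step, the sketch below fits a simple scikit-learn classifier on precomputed embeddings. The arrays are random placeholders standing in for real per-slide features (for example, mean-pooled tile embeddings) and labels; they are not real data.

```python
# Minimal sketch of building a downstream model on top of precomputed embeddings.
# The arrays are random placeholders, not real pathology data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2048))    # placeholder: 200 slides x 2048-dim embeddings
y = rng.integers(0, 2, size=200)    # placeholder labels, e.g. responder vs. non-responder

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# With good embeddings, a lightweight model like this is often sufficient,
# which is why precomputing embeddings removes most of the heavy lifting.
print("AUC:", roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1]))
```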
Buchbinder adds that this complexity has created an adoption barrier in the industry. “The entire process of leveraging foundation models is pretty cumbersome, expensive, time-consuming, and requires a fair amount of infrastructure.”
How Concentriq Embeddings works
Concentriq Embeddings was designed to eliminate these barriers through a streamlined, API-based workflow. “The way this works is that users can select either a whole slide image or an entire repository of whole slide images within Concentriq, and they can also select a foundation model,” explains Ianni. “Concentriq Embeddings works via an API, so users submit an API request, and they have six foundation models to choose from.”
These models currently include DINOv2, ConvNeXt, CTransPath, FLIP, HistoMTL, and MoCo-v2, and Proscia is continuously adding more to the collection. The diversity is intentional: “Most of those are pathology-specific, some are not. We aim to provide users with a variety, and importantly, we include both vision and vision-language foundation models so users can accomplish both their visual tasks and also interact with their images via language and text, which is pretty cool.”
Once the request is submitted, “Concentriq Embeddings will tile those images correctly and do all the pre-processing needed to generate embeddings on those images and image tiles. From there, we create a file that users can download with all of their embeddings, ready to go and ready to be used to build AI models.”
In essence, Concentriq takes all of that complexity and bundles it up into a single, streamlined workflow. “This allows you to just select which tools in your toolkit—which foundation models—you want to use, and produces the results you need, all in the same platform where you’re already working with your pathology data,” Buchbinder said.
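To illustrate the shape of such an API-driven workflow, here is a purely hypothetical sketch. The endpoint, field names, and response format are invented placeholders, not Proscia’s actual Concentriq Embeddings API; only the overall flow (submit a request naming the images and a foundation model, then download a file of ready-to-use embeddings) reflects the description above.

```python
# Hypothetical sketch of an API-driven embeddings workflow. The URL, fields,
# and response structure are placeholders, NOT Proscia's actual API.
import requests

API_URL = "https://pathology-platform.example.com/api/embeddings"  # placeholder endpoint
headers = {"Authorization": "Bearer <token>"}                      # placeholder credentials

payload = {
    "repository_id": "breast-cancer-study-01",  # placeholder set of whole slide images
    "model": "CTransPath",                      # one of several foundation models on offer
}

# Submit the request; tiling and pre-processing happen server-side.
job = requests.post(API_URL, json=payload, headers=headers, timeout=30).json()

# Later, download a file of ready-to-use embeddings for model building.
result = requests.get(f"{API_URL}/{job['job_id']}/download", headers=headers, timeout=60)
with open("embeddings.bin", "wb") as f:
    f.write(result.content)
```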
In one internal case study, Proscia “found we could build AI 13 times faster than with traditional methods by using embeddings,” according to Ianni. Concentriq Embeddings processed whole slide images from the IMPRESS dataset in just 2.5 hours, compared to 33.4 hours required by a high-end Linux workstation. With this acceleration, the Proscia partner was able to build “80 AI-based breast cancer biomarker prediction models in under 24 hours.” Proscia projects that these efficiency gains could scale up to 100x for the larger datasets typically used in core discovery and development activities in pharmaceutical research.
The future of precision medicine with AI pathology
The impact of streamlined foundation model access extends beyond technical efficiency. “What AI is allowing us to do—and what embeddings are enabling us to do much faster and more cost-effectively—is find patterns in tissue that translate into whether somebody is or isn’t a good responder to a particular drug compound,” Buchbinder said. “Ultimately, precision medicine involves ensuring you deliver the best, most specific treatment attuned to a particular patient based on their biological makeup—whatever uniquely identifies them as being a good or bad candidate for particular therapies.”
In the aforementioned podcast, when asked about Proscia’s future direction, Ianni envisioned a seamless experience: “This process of finding, using, employing your pathology data with AI… should be as easy as using ChatGPT. ChatGPT didn’t pervade our culture because it’s fantastic technology,” she said. “It pervaded our culture because the fantastic technology was easy to use, and pathology should be that easy, and our aim is to drive it there. We want to help unlock those life-enabling therapies for more patients and make sure that they can be connected to the right ones.”