Enhancing automated drug substance impurity structure elucidation from tandem mass spectra through transfer learning and domain knowledge

What’s the key idea?

·        When developing a new API (active pharmaceutical ingredient), there is often very little historical impurity data for that specific molecule or route. This “cold start” problem means impurity prediction and control are weak at early stages of development.

·        This work proposes simulating possible impurity-formation pathways in silico, using SMARTS reaction templates (machine-readable reaction patterns) together with knowledge of the starting materials and reaction conditions, to predict which impurities might form.

·        The authors then integrate those predictions with ML / transfer-learning models to interpret the MS/MS (tandem mass) spectra of unknown impurities more rapidly and more accurately.
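
To make the idea concrete, here is a minimal sketch of the "predict candidates, then check them against the spectrum" loop. It is not the authors' method: real SMARTS templates act on full structures through a cheminformatics toolkit, whereas this stand-in only applies formula-level changes (add O, add H2O, lose CH2) and compares the resulting [M+H]+ masses with an observed precursor m/z. The template names, the parent formula C8H9NO2, the observed m/z, and the 5 ppm tolerance are all illustrative assumptions.

```python
import re
from collections import Counter

# Monoisotopic masses (Da) of the most abundant isotope of each element.
MONO = {"C": 12.0, "H": 1.00782503207, "N": 14.0030740048, "O": 15.99491461956}
PROTON = 1.007276466812  # added to the neutral mass to get [M+H]+

# Hypothetical formula-level transformations, standing in for SMARTS templates.
TEMPLATES = {
    "oxidation (+O)": {"O": 1},
    "hydrolysis (+H2O)": {"H": 2, "O": 1},
    "demethylation (-CH2)": {"C": -1, "H": -2},
}

def parse_formula(formula):
    """Turn a simple formula such as 'C8H9NO2' into element counts."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts

def mono_mass(counts):
    """Monoisotopic mass of a neutral molecule from its element counts."""
    return sum(MONO[element] * n for element, n in counts.items())

def enumerate_candidates(parent_formula):
    """Apply every template to the parent and keep chemically possible results."""
    parent = parse_formula(parent_formula)
    candidates = {}
    for name, delta in TEMPLATES.items():
        product = Counter(parent)
        product.update(delta)  # update() adds counts, negative deltas included
        if all(n >= 0 for n in product.values()):
            candidates[name] = product
    return candidates

def match_precursor(candidates, observed_mz, tol_ppm=5.0):
    """Return candidates whose [M+H]+ lies within tol_ppm of the observed m/z."""
    hits = []
    for name, counts in candidates.items():
        theoretical = mono_mass(counts) + PROTON
        ppm = (observed_mz - theoretical) / theoretical * 1e6
        if abs(ppm) <= tol_ppm:
            hits.append((name, round(theoretical, 4), round(ppm, 2)))
    return hits

# Illustrative values only: a parent formula and an observed precursor m/z.
candidates = enumerate_candidates("C8H9NO2")
hits = match_precursor(candidates, observed_mz=168.0655)
print(hits)
```

With these illustrative inputs, only the oxidation candidate falls inside the tolerance window. In the workflow described above, a shortlist like this would go to the ML model for MS/MS interpretation rather than being taken as an identification by itself.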

Why this matters for KSMs & impurity control

·        It ties KSM (key starting material) identification directly to impurity risk: the choice of KSMs determines which building blocks are available for impurity formation. If you know ahead of time which impurity fragments are possible, you can avoid KSMs that are likely to contribute problematic fragments.

·        It enables earlier prediction, rather than waiting until an impurity shows up in stability or clinical batches. This helps you build a control strategy and specification limits sooner.

·        For regulatory or process development teams, this approach can reduce time & cost in impurity elucidation, and make starting material selection more data-driven.