AI for small-molecule design has advanced from rule-based enumeration to generative models and reinforcement learning (RL). Many platforms still operate as black boxes trained primarily on public data, which can limit novelty and produce brittle results.
CrystalsFirst® takes a structure-first path. Our FragAI platform is a 3D-aware generative AI trained on experimental protein–ligand complexes produced in-house, then optimized in a closed loop with wet-lab validation. The result is faster convergence on optimized and IP-ready molecules because learning is grounded in real structural interactions.
3D-aware generative modeling — Unlike many AI tools that work only with chemical strings (SMILES), our FragAI platform works directly with 3D protein–ligand structures, so the molecules it proposes are consistent with the geometry of the binding pocket from the start. In practice, FragAI has delivered real, testable molecules: on an in-house target, over half of the suggested compounds were synthetically accessible, and multiple binders were confirmed crystallographically and in biophysical assays.
FragAI doesn’t just “imagine molecules”: it designs compounds under realistic medicinal chemistry constraints. To keep the model aligned with medicinal chemistry practice, we train it with reinforcement learning.
The AI is not left on its own: every campaign runs in a closed make–test–learn loop, ensuring that proposals rapidly converge on potent, selective, and novel chemotypes supported by both AI predictions and experimental proof.
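The make–test–learn loop can be sketched as a simple iteration. This is a minimal illustration with hypothetical stand-in functions, not CrystalsFirst's actual pipeline: a real system would propose 3D poses, run synthesis and assays, and retrain the generative model.

```python
import random

def propose(model_state, n):
    # Stand-in for the generative model: sample candidate "molecules"
    # as scored stubs. A real system would emit 3D structures/poses.
    return [{"id": f"cand-{i}", "pred": random.random()} for i in range(n)]

def wet_lab_assay(candidates):
    # Stand-in for synthesis plus crystallographic/biophysical readouts:
    # the measured value is the prediction plus experimental noise.
    return [{**c, "measured": c["pred"] + random.gauss(0, 0.1)}
            for c in candidates]

def update(model_state, results):
    # Fold experimental feedback back into the model.
    # Here the "model" is just the best measured score seen so far.
    best = max(r["measured"] for r in results)
    return max(model_state, best)

def make_test_learn(cycles=3, batch=8, seed=0):
    random.seed(seed)
    state = 0.0
    for _ in range(cycles):
        results = wet_lab_assay(propose(state, batch))
        state = update(state, results)
    return state
```

Because the update step only ever incorporates measured results, each cycle can improve but never degrade the retained state, which is the essence of grounding the loop in experiment.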
Classical computer-aided drug design (CADD) prioritizes physics-based docking and simulations; AI-driven drug design (AIDD) generates novel molecules de novo, then optimizes them with multi-objective RL and real experimental feedback. At CrystalsFirst, AIDD is structure-first and tightly coupled to experiments, improving reliability and IP differentiation.
FragAI learns from experimental protein–ligand complexes, so it reasons about poses and interactions in 3D. This improves binding-mode plausibility and speeds up convergence to active, selective chemotypes.
We enforce constrained generation, retrosynthesis filters, and reward terms for synthetic accessibility. Designs are benchmarked, then triaged to synthesis partners; prior campaigns showed high synthesis success and multiple validated binders.
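A reward with synthetic-accessibility terms can be illustrated as a weighted sum of normalized objectives. The function, weights, and normalization below are hypothetical, not CrystalsFirst's actual reward; the SA term assumes the common 1 (easy) to 10 (hard) synthetic-accessibility scale.

```python
def multi_objective_reward(potency, sa_score, novelty,
                           weights=(0.5, 0.3, 0.2)):
    """Combine normalized objectives into one scalar RL reward.

    potency, novelty: assumed pre-normalized to [0, 1], higher is better.
    sa_score: synthetic-accessibility score on a 1 (easy) to 10 (hard)
    scale, mapped so that easier-to-make designs earn more reward.
    """
    sa_term = (10.0 - sa_score) / 9.0  # 1 -> 1.0 (easy), 10 -> 0.0 (hard)
    w_pot, w_sa, w_nov = weights
    return w_pot * potency + w_sa * sa_term + w_nov * novelty
```

With this shape, a potent but hard-to-synthesize design is penalized relative to an equally potent, easily accessible one, which is how the RL agent is steered toward makeable chemistry.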
Our in-house GPU/CPU infrastructure supports parallel campaigns, and we can deploy the models on 100+ GPUs on cloud infrastructure.
We use in-house preference-optimization-based RLHF, actor–critic and off-policy training, and learning curricula; continuous benchmarking and wet-lab feedback stabilize training and keep quality high across iterations.
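Preference-optimization-based training can be sketched with a DPO-style loss on a single preference pair. This is a generic textbook formulation under stated assumptions (a frozen reference policy and log-likelihoods per design), not CrystalsFirst's in-house method:

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Direct-preference-optimization loss for one preference pair.

    Pushes the policy to raise the log-likelihood margin of the
    preferred design over the rejected one, relative to a frozen
    reference policy. beta controls how far the policy may drift
    from the reference.
    """
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # -log(sigmoid(margin)): small when the preferred design is
    # already favored, large when the preference is violated.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

When the policy matches the reference exactly, the margin is zero and the loss equals log 2; increasing the likelihood of the preferred design relative to the rejected one drives the loss toward zero.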