Generative AI in Chemistry

Generative models enabling systematic exploration of chemical space under biological and synthetic constraints

Research Focus

Deep Generative Models

We develop and apply multiple generative modeling approaches:

  • Variational Autoencoders (VAEs): Learning continuous latent representations of chemical space
  • Generative Adversarial Networks (GANs): Adversarial training for high-quality molecular generation
  • Flow Models: Normalizing flows for invertible molecular transformations
  • Diffusion Models: Iterative denoising models for generating 3D molecular conformations

ReLeaSE Framework

Our foundational work, published in Science Advances and cited more than 1,000 times, integrates deep learning with reinforcement learning. The ReLeaSE (Reinforcement Learning for Structural Evolution) system uses a stack-augmented recurrent neural network to produce chemically valid SMILES strings, while policy gradient methods steer generation toward molecules with desired properties. The approach achieves a 95% validity rate for generated molecules.
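
The policy gradient idea can be illustrated with a deliberately small sketch. This is not the ReLeaSE implementation: the stack-augmented recurrent network is replaced by a single position-independent softmax over a three-token alphabet, and the reward is a hypothetical stand-in (the fraction of "O" tokens) for a property predictor. The names VOCAB, toy_reward, and train are assumptions made for illustration only.

```python
import math
import random

# Toy token alphabet (assumption: not a real SMILES grammar).
VOCAB = ["C", "N", "O"]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_reward(tokens):
    """Hypothetical reward: fraction of 'O' tokens in the sequence."""
    return tokens.count("O") / len(tokens)


def train(steps=2000, length=8, lr=0.2, seed=0):
    """REINFORCE with a moving-average baseline on a position-independent policy."""
    rng = random.Random(seed)
    logits = [0.0] * len(VOCAB)
    baseline = 0.0
    for _ in range(steps):
        probs = softmax(logits)
        # Sample one token sequence from the current policy.
        idx = rng.choices(range(len(VOCAB)), weights=probs, k=length)
        reward = toy_reward([VOCAB[i] for i in idx])
        advantage = reward - baseline
        # For a shared softmax, d/d(logit_k) of sum_t log pi(a_t)
        # equals count_k - length * p_k.
        for k in range(len(VOCAB)):
            grad = idx.count(k) - length * probs[k]
            logits[k] += lr * advantage * grad
        # The baseline tracks the average reward to reduce gradient variance.
        baseline += 0.1 * (reward - baseline)
    return softmax(logits)
```

Calling train() returns the learned token probabilities, which should shift toward the rewarded token. In a realistic setting the softmax would be a sequence model conditioned on previously generated tokens, and the reward would come from a trained property predictor.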

Conditional Generation

The RRCGAN (Regression-Reward-Classification GAN) framework addresses conditional generation with continuous property targets. Through iterative transfer learning, the system can propose molecules whose properties lie beyond the range of the training data, a capability that is essential for discovering molecules with unprecedented characteristics.

Practical Applications

Our research emphasizes closed-loop workflows coupling generative models with:

  • Predictive Models: ML-based property estimation for rapid screening
  • Physics-Based Simulations: Quantum chemistry and molecular dynamics validation
  • Active Learning Strategies: Intelligent selection of molecules for detailed evaluation
  • Experimental Validation: Feedback from synthesis and testing
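
The loop formed by these components can be sketched in a few lines. The example below is a toy under stated assumptions, not our production workflow: candidates are plain numbers rather than molecules, the "generator" is uniform random sampling, the predictive model is a nearest-neighbor lookup, and expensive_oracle is a hypothetical stand-in for simulation or experiment. All function names are illustrative.

```python
import random


def expensive_oracle(x):
    """Stand-in for simulation or experiment (hypothetical): best value at x = 0.7."""
    return -(x - 0.7) ** 2


def surrogate_predict(x, labeled):
    """Cheap predictor: the measured value of the nearest labeled candidate."""
    nearest = min(labeled, key=lambda pair: abs(pair[0] - x))
    return nearest[1]


def novelty(x, labeled):
    """Distance to the closest labeled candidate; rewards unexplored regions."""
    return min(abs(x - xl) for xl, _ in labeled)


def closed_loop(rounds=5, pool_size=50, batch=3, seed=0):
    """Run generate -> predict -> select -> evaluate for a fixed number of rounds."""
    rng = random.Random(seed)
    # Seed the loop with two evaluated candidates.
    labeled = [(x, expensive_oracle(x)) for x in (0.0, 1.0)]
    for _ in range(rounds):
        # Generate: propose a pool of new candidates.
        pool = [rng.random() for _ in range(pool_size)]
        # Predict and select: rank by predicted value plus an exploration bonus.
        ranked = sorted(
            pool,
            key=lambda x: surrogate_predict(x, labeled) + novelty(x, labeled),
            reverse=True,
        )
        # Evaluate: spend the expensive budget only on the top-ranked batch.
        for x in ranked[:batch]:
            labeled.append((x, expensive_oracle(x)))
        # Feedback: the surrogate improves because it reads from `labeled`.
    return labeled
```

After closed_loop() returns, max(labeled, key=lambda pair: pair[1]) gives the best evaluated candidate. The key design choice is that only a small batch per round reaches the expensive evaluation step, while the cheap predictor screens the full pool.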

Validated Drug Discovery

A notable demonstration involved computationally designed EGFR inhibitors validated through experimental synthesis and biological testing, showcasing the practical applicability of generative AI in drug discovery pipelines.

Integration with Physical Models

We integrate generative AI with rigorous physical evaluation methods:

  • Machine Learning Potentials: Fast energy and property predictions
  • Quantum Chemistry: Accurate assessment of electronic structure
  • Molecular Stability: Evaluation of conformational preferences and strain
  • Reactivity Predictions: Assessment of synthetic accessibility and metabolic stability
  • Binding Mode Analysis: Structure-based evaluation of protein-ligand interactions

Multi-Objective Optimization

Real-world molecular design requires balancing multiple competing objectives:

  • Biological activity and target selectivity
  • ADMET properties (absorption, distribution, metabolism, excretion, toxicity)
  • Synthetic accessibility and cost
  • Intellectual property considerations
  • Physicochemical properties (solubility, stability)

Our reinforcement learning approaches navigate these trade-offs, identifying Pareto-optimal solutions in high-dimensional property space.
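
A candidate is Pareto-optimal when no other candidate is at least as good on every objective and strictly better on at least one. The sketch below shows this filtering step in isolation, assuming every objective has been scaled so that larger is better; it illustrates the definition and is not the optimization procedure itself. The function names and the example scores are hypothetical.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: List[Sequence[float]]) -> List[int]:
    """Return the indices of all non-dominated points (all objectives maximized)."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical (activity, synthetic accessibility) scores for five candidates.
candidates = [(0.9, 0.2), (0.7, 0.7), (0.4, 0.9), (0.6, 0.6), (0.3, 0.3)]
front = pareto_front(candidates)  # indices 0, 1, 2 are non-dominated
```

This pairwise check scales quadratically with the number of candidates, which is acceptable for ranking a generated batch but would need a more efficient method for very large sets.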

Future Directions

Emerging priorities in our generative AI research include:

  • Geometry-Aware Generation: Direct generation of 3D structures rather than 2D graphs
  • Uncertainty Quantification: Reliable confidence estimates for generated molecules
  • Autonomous Discovery: Minimally supervised workflows integrating generation, prediction, simulation, and experimentation
  • Synthesis Planning Integration: Generation constrained by retrosynthetic feasibility
  • Multi-Modal Learning: Combining molecular structures with experimental data and scientific literature

Impact

Our generative methods enable:

  • Exploration of vast chemical spaces (an estimated 10^60 or more drug-like molecules)
  • Discovery of molecules with novel scaffolds and mechanisms
  • Rapid iteration in lead optimization campaigns
  • Integration with automated synthesis platforms
  • Acceleration of early-stage drug discovery

Software and Tools

Key software tools from our generative AI research are available open-source, enabling researchers worldwide to apply these methods to their own discovery challenges.

Collaborations

This work is conducted in partnership with:

  • Pharmaceutical companies for drug discovery applications
  • Materials science groups for functional materials design
  • Automated synthesis platforms for experimental validation
  • Academic collaborators in machine learning and chemistry