Molecular modeling made easy

datamol.io is an open-source toolkit that simplifies molecular processing and featurization workflows for ML scientists in drug discovery.

Already used by scientists in leading organizations

  • MIT
  • ETH Zurich
  • Schrödinger
  • Mila
  • Janssen
  • AstraZeneca

Discover the next generation of open-source tools for molecular modeling

Datamol

Accelerate molecular processing workflows

Datamol is an elegant, RDKit-powered Python library optimized for molecular machine learning workflows.

  • Highly intuitive

    A familiar Pythonic API with good defaults by design. Get started in one line.

    Start Now
  • Powerful

    Seamlessly integrated with RDKit to support you in every step. Built-in parallelization to accelerate your workflows.

    Experience Efficiency
  • Modern I/O

    Read and write multiple formats (SDF, XLSX, CSV) with out-of-the-box support.

    Try Now
Molfeat

An open-source hub of molecular featurizers

Spending too much time searching for the right featurizer? Don’t know which featurizers are most effective? Molfeat makes it easy to evaluate and implement a wide range of featurizers directly into your workflow.

  • Incredibly simple

    You don't need much to get started with Molfeat.

    Start Now
  • Unrivaled diversity of featurizers

    Descriptors, 2D/3D pharmacophores, graph featurization. You name it, we have it.

    Explore
  • Extendable

    Something missing? Contribute your own featurizer.

    Contribute
Medchem

Filter by medicinal chemistry rules

Want to apply constraints that prioritize more drug-like molecules? Medchem provides a simple, uniform way to filter compounds the way medicinal chemists do.

  • Consistent application

    Access most alerts, filters, and rules through a single API.

    Start Now
  • Apply Eli Lilly, Novartis rules and more

    More than 20 different rules for you to consider.

    Explore
  • Quickly triage compounds at scale

    Run filters in parallel across processes or threads.

    Accelerate your search
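
To make the idea of rule-based filtering concrete, here is a minimal sketch of one well-known rule, Lipinski's rule of five, written in plain Python. It is not Medchem's API: the `Descriptors` container, the function names, the `max_violations` parameter, and the compound values are all hypothetical and chosen for illustration. In practice the descriptors would be computed from the molecule rather than typed in by hand.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed properties for one compound (hypothetical container)."""
    name: str
    mol_weight: float  # molecular weight in Da
    logp: float        # calculated octanol-water partition coefficient
    h_donors: int      # hydrogen-bond donor count
    h_acceptors: int   # hydrogen-bond acceptor count


def rule_of_five_violations(d: Descriptors) -> int:
    """Count how many of Lipinski's four thresholds the compound exceeds."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ])


def passes_rule_of_five(d: Descriptors, max_violations: int = 0) -> bool:
    """Strict by default; pass max_violations=1 for a more lenient filter."""
    return rule_of_five_violations(d) <= max_violations


# Illustrative values only, not measured data.
compounds = [
    Descriptors("cmpd_a", mol_weight=320.4, logp=2.1, h_donors=2, h_acceptors=5),
    Descriptors("cmpd_b", mol_weight=612.8, logp=6.3, h_donors=1, h_acceptors=8),
    Descriptors("cmpd_c", mol_weight=480.0, logp=4.9, h_donors=6, h_acceptors=9),
]

# Apply the filter across a thread pool; pool.map preserves input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    flags = list(pool.map(passes_rule_of_five, compounds))

kept = [c.name for c, ok in zip(compounds, flags) if ok]
print(kept)
```

A thread pool is used here only to mirror the "processes or threads" idea; for a check this cheap a plain loop would do, and CPU-heavy filters would typically benefit more from processes.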
Splito

Evaluate your models meaningfully

What does a good model look like for chemistry and biology? Splito provides powerful methods for splitting datasets in a way that reflects your downstream application.

  • Efficient processing

    Split your dataset with only two lines.

    Start Now
  • Get better generalization

    Compare splitting methods to see how representative they are.

    Compare splits
  • Explore chemical space

    Easily visualize your train and test distribution.

    Get Intuitions
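
As a rough illustration of why the splitting method matters, the sketch below implements a simple group-aware split in plain Python: every sample that shares a group label (for example, a molecular scaffold) lands on the same side of the split, so the test set contains only groups the model has never seen. This is not Splito's API; the `group_split` function, its `test_fraction` parameter, and the scaffold labels are hypothetical.

```python
from collections import defaultdict


def group_split(groups, test_fraction=0.2):
    """Return (train_idx, test_idx) such that no group spans both sets.

    groups: one hashable label per sample (e.g. a scaffold identifier).
    """
    members = defaultdict(list)
    for idx, label in enumerate(groups):
        members[label].append(idx)

    # Largest groups are placed first; ties are broken by first occurrence
    # so the result is deterministic.
    ordered = sorted(members.values(), key=lambda idxs: (-len(idxs), idxs[0]))

    # Maximum number of samples allowed in the training set.
    max_train = len(groups) - round(test_fraction * len(groups))

    train_idx, test_idx = [], []
    for idxs in ordered:
        if len(train_idx) + len(idxs) <= max_train:
            train_idx.extend(idxs)
        else:
            test_idx.extend(idxs)
    return sorted(train_idx), sorted(test_idx)


# Hypothetical scaffold labels for ten molecules.
scaffolds = ["A", "A", "A", "A", "B", "B", "B", "C", "C", "D"]
train_idx, test_idx = group_split(scaffolds, test_fraction=0.3)

train_groups = {scaffolds[i] for i in train_idx}
test_groups = {scaffolds[i] for i in test_idx}
print(train_idx, test_idx)
```

Because whole groups are assigned at once, the realized test fraction can differ from the requested one. A random split over the same data would usually scatter each scaffold across both sets, which tends to make test performance look better than it is on genuinely new chemistry.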

We’re only just getting started

datamol.io is creating a new, simplified experience for ML scientists working on molecular modeling.

Create a PR, work on an issue, or interact with us on Twitter to let us know what features you want.