
ADMET Prediction with Machine Learning

Why most drug candidates fail and how AI predicts absorption, metabolism, and toxicity early

7 min read

The majority of drug candidates that enter clinical trials never reach patients. Historically, a large proportion of these failures have been due to poor pharmacokinetic or safety properties: the drug is not absorbed well, is metabolized too quickly, or causes toxic side effects. ADMET prediction aims to identify these problems computationally, before committing to expensive and time-consuming experiments.

What ADMET Stands For

Absorption refers to how well a drug enters the bloodstream after administration. For oral drugs, this depends on intestinal permeability and solubility. Distribution describes how the drug spreads through the body and reaches its target tissue. Metabolism covers how the body chemically modifies the drug, primarily through liver enzymes (cytochrome P450 family). Excretion is how the drug and its metabolites are eliminated, typically through the kidneys or bile. Toxicity encompasses adverse effects — from liver damage (hepatotoxicity) and heart rhythm disruption (hERG channel inhibition) to mutagenicity and broader organ toxicity.

Poor ADMET properties are among the most common reasons for drug candidate failure. Identifying problematic molecules early — ideally before any synthesis — can save years of development effort and significant cost.

Traditional Experimental Approaches

ADMET properties have traditionally been measured through in vitro assays and animal studies. Caco-2 cell assays measure intestinal permeability. Microsomal stability assays assess metabolic clearance. The hERG assay tests for cardiac toxicity risk. These assays are well-established but require physical compounds, take time, and have limited throughput. Animal pharmacokinetic studies provide more complete ADMET profiles but are expensive, slow, and raise ethical concerns. Furthermore, animal models do not always predict human pharmacokinetics accurately.

Graph Neural Networks for Property Prediction

Machine learning models for ADMET prediction take a molecular structure as input and predict one or more ADMET properties. Graph neural networks (GNNs) have become a dominant architecture for this task. In a GNN, a molecule is represented as a graph (atoms as nodes, bonds as edges), and the network learns to propagate information along bonds to build a representation of the whole molecule. This representation is then used to predict properties.
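The propagation step described above can be sketched in plain Python, with no chemistry library. This is a toy illustration under stated assumptions: the example molecule (the heavy atoms of ethanol, C-C-O), the one-hot atom features, and the sum-based update are all chosen for readability. Real MPNNs replace the plain sums with learned weight matrices and nonlinearities.

```python
# Toy message passing on a molecular graph (illustrative; no learned weights).
# Assumed example molecule: ethanol heavy atoms, C0 - C1 - O2.

ATOM_TYPES = ["C", "O"]  # hypothetical feature vocabulary


def one_hot(symbol):
    """Encode an atom symbol as a one-hot vector over ATOM_TYPES."""
    return [1.0 if symbol == t else 0.0 for t in ATOM_TYPES]


def message_passing_step(features, bonds):
    """Each atom adds the sum of its bonded neighbors' features to its own."""
    neighbors = {i: [] for i in range(len(features))}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)
    updated = []
    for i, h in enumerate(features):
        msg = [0.0] * len(h)
        for j in neighbors[i]:
            msg = [m + x for m, x in zip(msg, features[j])]
        updated.append([x + m for x, m in zip(h, msg)])
    return updated


def readout(features):
    """Sum the atom vectors into one molecule-level vector."""
    return [sum(col) for col in zip(*features)]


atoms = ["C", "C", "O"]      # nodes
bonds = [(0, 1), (1, 2)]     # edges

h = [one_hot(a) for a in atoms]
for _ in range(2):           # two rounds: information travels up to two bonds
    h = message_passing_step(h, bonds)

molecule_vector = readout(h)
print(molecule_vector)
```

After two rounds, the vector for the first carbon has a nonzero oxygen component even though it is not bonded to the oxygen, which is the sense in which information propagates along bonds. In a trained model, the molecule-level vector would then be passed to a prediction layer.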

GNN architectures commonly used for ADMET include message-passing neural networks (MPNNs), as implemented in the open-source Chemprop library from the Barzilay and Jaakkola labs at MIT. Chemprop has been widely adopted in pharmaceutical research for property prediction. Transformer-based architectures operating on molecular graphs or SMILES strings are also increasingly used.

Benchmarks: Tox21, TDC, and ADMET-related Datasets

Standardized benchmarks are critical for evaluating ADMET models. The Tox21 program, a federal collaboration involving the NIH, EPA, and FDA, screened a library of roughly 10,000 compounds; the benchmark most often used in machine learning, derived from the 2014 Tox21 Data Challenge, covers 12 toxicity-related assay targets spanning nuclear receptor signaling and stress response pathways. Therapeutics Data Commons (TDC), developed at Harvard, provides a curated collection of ADMET benchmark datasets spanning absorption (e.g., Caco-2 permeability, lipophilicity), metabolism (e.g., CYP450 inhibition), and toxicity (e.g., hERG, Ames mutagenicity, LD50). These benchmarks enable consistent comparison across modeling approaches.
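One common way to compare classification models on multi-assay benchmarks of this kind is ROC-AUC averaged over tasks, skipping compounds that were not measured in a given assay. The sketch below shows that calculation in plain Python. The label and score values are made up for illustration, and the function names are hypothetical rather than part of any benchmark's API.

```python
# Mean ROC-AUC across tasks, ignoring missing labels (None).
# All labels and scores below are made-up illustrative values.


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Returns None if a class is absent.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # AUC is undefined without both classes
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_task_auc(label_matrix, score_matrix):
    """label_matrix[i][t] is 1, 0, or None (not measured) for compound i, task t."""
    n_tasks = len(label_matrix[0])
    aucs = []
    for t in range(n_tasks):
        measured = [
            (row[t], srow[t])
            for row, srow in zip(label_matrix, score_matrix)
            if row[t] is not None
        ]
        if not measured:
            continue
        ys, ss = zip(*measured)
        auc = roc_auc(ys, ss)
        if auc is not None:
            aucs.append(auc)
    return sum(aucs) / len(aucs) if aucs else None


labels = [
    [1, None],   # compound 0 was not measured in task 1
    [0, 1],
    [1, 0],
    [0, 0],
]
scores = [
    [0.9, 0.2],
    [0.3, 0.8],
    [0.4, 0.6],
    [0.5, 0.1],
]
print(mean_task_auc(labels, scores))
```

Masking unmeasured entries per task, rather than dropping whole compounds, keeps every available measurement in the evaluation.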

Multi-Task Learning

ADMET properties are not independent — a molecule's absorption, metabolism, and toxicity are all influenced by its underlying physicochemical characteristics. Multi-task learning exploits this by training a single model to predict multiple ADMET endpoints simultaneously. The shared lower layers of the model learn general molecular representations, while task-specific heads predict individual properties. This approach often outperforms separate single-task models, especially when data for individual endpoints is limited, because the model can transfer knowledge across related tasks.
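The shared-layers-plus-heads structure can be sketched as a forward pass in plain Python. Everything specific here is an assumption made for illustration: the weights are hand-picked rather than trained, the two endpoint names are placeholders, and the input vector stands in for a learned molecular representation. The loss skips endpoints with no measurement, which is how a compound with data for only one assay can still contribute to training the shared layer.

```python
import math

# Forward pass of a tiny multi-task model: one shared layer, one head per task.
# Weights, task names, and the input vector are hypothetical, hand-picked values.


def relu(v):
    return [max(0.0, x) for x in v]


def linear(weights, bias, x):
    """Dense layer: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


SHARED_W = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]  # 3 input features -> 2 hidden units
SHARED_B = [0.0, 0.1]

HEADS = {  # one (weights, bias) pair per endpoint
    "herg": ([[1.0, -1.0]], [0.0]),
    "ames": ([[-0.5, 0.7]], [0.2]),
}


def predict(x):
    hidden = relu(linear(SHARED_W, SHARED_B, x))  # representation shared by all tasks
    return {task: sigmoid(linear(w, b, hidden)[0]) for task, (w, b) in HEADS.items()}


def masked_bce(preds, labels):
    """Average binary cross-entropy over labeled tasks; None means unmeasured."""
    losses = []
    for task, y in labels.items():
        if y is None:
            continue
        p = preds[task]
        losses.append(-(y * math.log(p) + (1 - y) * math.log(1 - p)))
    return sum(losses) / len(losses)


x = [1.0, 0.5, 2.0]  # stand-in for a molecular representation
preds = predict(x)
loss = masked_bce(preds, {"herg": 1, "ames": None})  # only hERG was measured
print(preds, loss)
```

Because both heads read the same hidden vector, a gradient from either endpoint would update the shared weights, which is the mechanism behind the knowledge transfer described above.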

Platforms and Industry Adoption

Several companies and platforms have built integrated ADMET prediction tools. Schrödinger offers QikProp and related ADMET models alongside its physics-based molecular simulation suite. Insilico Medicine incorporates ADMET prediction into its Chemistry42 generative chemistry platform, using predicted ADMET properties as optimization objectives during molecule generation. ADMET Predictor, from Simulations Plus, is widely used by pharmaceutical companies and regulatory agencies. Freely accessible web tools like ADMETlab and SwissADME provide ADMET predictions based on established models.

Despite significant progress, ADMET prediction remains challenging. Models are limited by the quality and quantity of training data: many endpoints have small datasets with noisy measurements. Defining the applicability domain (knowing when a model's predictions can be trusted for a given molecule) remains an open problem. Finally, some of the most important toxicity signals involve complex multi-organ or immune-mediated mechanisms that are difficult to predict from molecular structure alone.
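One simple heuristic for the applicability-domain question is to check how similar a query molecule is to its nearest neighbor in the training set, for example by Tanimoto similarity between fingerprints. The sketch below illustrates the idea and is not a solution to the open problem. Fingerprints are represented as sets of "on" bit indices, and the bit values, the compound set, and the 0.4 threshold are all illustrative assumptions.

```python
# Nearest-neighbor applicability-domain check (a heuristic sketch).
# Fingerprints are sets of "on" bit indices; all values are illustrative.


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def in_domain(query, training_fps, threshold=0.4):
    """Return (flag, best): flag is True if the closest training compound
    is at least `threshold` similar to the query."""
    best = max(tanimoto(query, fp) for fp in training_fps)
    return best >= threshold, best


training_fps = [
    {1, 4, 7, 9},
    {1, 4, 8, 12},
    {2, 4, 7, 9, 15},
]

near_query = {1, 4, 7, 10}    # shares most bits with the first training compound
far_query = {4, 20, 21, 22}   # shares a single bit with the training set

print(in_domain(near_query, training_fps))
print(in_domain(far_query, training_fps))
```

A prediction for the second query would be flagged as outside the domain under this threshold. In practice the cutoff would need to be calibrated per endpoint against held-out error.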
