AI model from Google's DeepMind reads recipe for life in DNA
An AI model developed by Google's DeepMind could transform our understanding of DNA - the complete recipe for building and running the human body - and its impact on disease and medicine discovery, according to researchers.
Called AlphaGenome, the model could help scientists discover why subtle differences in our DNA put us at risk of conditions such as high blood pressure, dementia and obesity.
It could also dramatically accelerate our understanding of genetic diseases and cancer.
The developers of the model acknowledge it's not perfect, but experts have described it as "an incredible feat" and "a major milestone".
"We see AlphaGenome as a tool for understanding what the functional elements in the genome do, which we hope will accelerate our fundamental understanding of the code of life," says Natasha Latysheva, research engineer at DeepMind.
The human genome is made up of three billion letters of DNA code – represented by the letters A, C, G and T.
Around 2% of it consists of genes, which code for all the proteins the body needs to grow and function. The remaining 98%, which is less well understood, is labelled the 'dark genome'. It plays a crucial role in organising how genes are used in the body and is where many mutations linked to disease are found.
AlphaGenome can analyse one million letters of code at a time, helping to unravel the 'dark genome'.
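To put those figures in proportion, the arithmetic can be sketched in a few lines of Python. The numbers are the rounded ones quoted above. Laying the one-million-letter windows end to end without overlap is an assumption made purely for illustration and says nothing about how AlphaGenome is actually run.

```python
# Rounded figures quoted in the article.
GENOME_LETTERS = 3_000_000_000   # letters of DNA code in the human genome
GENE_PERCENT = 2                 # roughly 2% of the genome is genes
WINDOW_LETTERS = 1_000_000       # letters the model can analyse at a time


def genome_breakdown(total=GENOME_LETTERS, gene_percent=GENE_PERCENT):
    """Return (letters in genes, letters in the 'dark genome')."""
    coding = total * gene_percent // 100
    return coding, total - coding


def windows_needed(total=GENOME_LETTERS, window=WINDOW_LETTERS):
    """Non-overlapping windows needed to cover `total` letters (illustrative assumption)."""
    return -(-total // window)  # ceiling division using integers only


coding, dark = genome_breakdown()
print(f"{coding:,} letters in genes, {dark:,} letters in the 'dark genome'")
print(f"{windows_needed():,} windows of {WINDOW_LETTERS:,} letters")
```

On these rounded numbers, about 60 million letters sit in genes, 2.94 billion sit in the 'dark genome', and 3,000 windows would be needed to read the whole genome once.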
It can predict not only where the genes are but also what the 'dark genome' is influencing: for example, how it affects gene expression (whether a gene is highly active or being suppressed) and gene splicing (the process the body uses to make different proteins from a single gene).
Crucially, the model can predict the impact of changing even a single letter in genetic code.
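The idea of scoring a single-letter change can be sketched as follows: run a predictor on the original sequence, run it again with one letter swapped, and compare. Everything here is hypothetical. `toy_score` is a stand-in that merely counts a short motif, not AlphaGenome or its API, and the example sequence is invented.

```python
def apply_variant(sequence, position, alt):
    """Return a copy of `sequence` with the letter at 0-based `position` replaced by `alt`."""
    if alt not in ("A", "C", "G", "T"):
        raise ValueError(f"unexpected letter: {alt!r}")
    if not 0 <= position < len(sequence):
        raise IndexError("position is outside the sequence")
    return sequence[:position] + alt + sequence[position + 1:]


def toy_score(sequence):
    """Hypothetical stand-in for a model: count occurrences of the motif 'TATA'."""
    return sum(1 for i in range(len(sequence) - 3) if sequence[i:i + 4] == "TATA")


def variant_effect(sequence, position, alt, predict=toy_score):
    """Predicted change caused by one substitution: score(altered) - score(original)."""
    return predict(apply_variant(sequence, position, alt)) - predict(sequence)


reference = "GGCTATAAGGC"                  # invented example sequence
print(variant_effect(reference, 4, "C"))   # breaks the motif, prints -1
print(variant_effect(reference, 0, "A"))   # leaves the motif intact, prints 0
```

A real model would replace `toy_score` with a learned prediction, but the comparison between the original and altered sequence would follow the same pattern.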
'Big leap'
Latysheva said she was "really excited" by the AI model's potential to understand which mutations cause disease and help pinpoint the cause of rare genetic diseases.
The AI model could be used to "add another piece of the puzzle for the discovery of drug targets and ultimately the development of new drugs", she added.
Ultimately, it could also be used in synthetic biology and the design of new sequences of DNA which could be used in gene therapies.
AlphaGenome has now been described in the journal Nature, but it was made available for non-commercial use last year, and 3,000 scientists have since used the tool.
Dr Gareth Hawkes, from the University of Exeter, is using it to explore how mutations could be altering our risk of obesity and diabetes.
Studies that sequenced the entire genetic code of tens of thousands of people have identified variants linked to the conditions, but they are often in the dark genome.
"They're directly impacting some important piece of biology that we don't really understand," Hawkes told the BBC.
Using AlphaGenome allows researchers to rapidly predict what those variants are up to so they can be tested in the lab.
Hawkes said: "Those predictions will help to inform which biological processes those genetic variants might be impacting, and potentially lead to drug developments.
"I wouldn't say the dark side of the genome is solved by AlphaGenome, but it's a big leap. I'm really excited."
Cancer is another field where the AI model could accelerate research.
AlphaGenome has been used to predict which mutations are fuelling cancer and are also the potential targets of treatment, and which mutations are incidental.
Dr Robert Goldstone, head of genomics at the Francis Crick Institute, said the model was a "major milestone in the field of genomic AI" and the breakthrough was "an incredible technical feat" for its "ability to predict gene expression from DNA sequence alone".
Prof Ben Lehner, the head of generative and synthetic genomics at the Wellcome Sanger Institute, said they had tested AlphaGenome in more than half a million experiments and it was performing very well.
But he said it was "far from perfect" and there was still a lot of work to do.
"It's a really exciting time with three areas where the UK is world-leading - genomics, biomedical research and AI - combining to transform biology and medicine," Prof Lehner said.
DeepMind researchers shared the Nobel Prize in Chemistry in 2024 for their work on AlphaFold – an AI system that predicts the 3D structure of proteins in the body.
"I think we are at the start of a new era of scientific progress, and AI is going to enable a number of different breakthroughs," says Pushmeet Kohli, vice president of science and strategic initiatives at Google DeepMind.
How does it work?
AlphaGenome doesn't work like large language models (such as ChatGPT) that predict the next word in a sequence. Instead, it is a "sequence-to-function model" looking at how changes in the text affect the meaning at the end.
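The difference between the two kinds of model is easiest to see in what comes out of them. The toy contrast below is only a sketch under loud assumptions: both functions are invented for illustration, the "next letter" rule is a crude frequency guess, and the numeric readout (the share of G and C letters in a sliding window) stands in for a biological measurement. Neither reflects how ChatGPT or AlphaGenome works internally.

```python
from collections import Counter


def next_letter_guess(sequence):
    """Next-word-style interface: sequence in, one guessed next symbol out.

    Toy rule: guess the letter seen most often so far.
    """
    return Counter(sequence).most_common(1)[0][0]


def gc_track(sequence, window=4):
    """Sequence-to-function-style interface: sequence in, numeric readout out.

    Toy readout: fraction of G/C letters in each sliding window.
    """
    return [
        sum(letter in "GC" for letter in sequence[i:i + window]) / window
        for i in range(len(sequence) - window + 1)
    ]


dna = "ATGCGCGGAT"               # invented example sequence
print(next_letter_guess(dna))    # a single symbol
print(gc_track(dna))             # a list of numbers along the sequence
```

The first function continues the text. The second describes a property of the text, so editing one letter of the input and re-running it shows how that property shifts.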
It was trained on publicly available databases of human and mouse cell experiments.
There is general agreement that the AI model needs refining. It is less accurate in some areas such as predicting how genes are regulated over long distances (more than 100,000 letters of code away).
The team also want to improve the accuracy of the model in different tissues. A neuron in the brain, for example, has the same genetic code as a beating heart cell, but each has different properties based on the way the genetic instructions are being used in each cell type.
