[LINK] 70k new RNA viruses identified using AI metagenomics
Stephen Loosley
stephenloosley at zoho.com
Sat Oct 12 23:16:44 AEDT 2024
AI scans RNA ‘dark matter’ and uncovers 70,000 new viruses
Many are bizarre and live in salt lakes, hydrothermal vents and other extreme environments.
By Smriti Mallapaty 11 October 2024 https://www.nature.com/articles/d41586-024-03320-6
[Photo caption: Observation of the basaltic organs of the Panarea volcanic island in the Aeolian islands archipelago, Mediterranean Sea.
Some of the newly discovered viruses live in hydrothermal vents and other extreme environments. Credit: Alexis Rosenfeld/Getty]
Researchers have used artificial intelligence (AI) to uncover 70,500 viruses previously unknown to science [1], many of them weird and nothing like known species.
The RNA viruses were identified using metagenomics, in which scientists sample all the genomes present in the environment without having to culture individual viruses.
The method shows the potential of AI to explore the ‘dark matter’ of the RNA virus universe.
Viruses are ubiquitous microorganisms that infect animals, plants and even bacteria, yet only a small fraction have been identified and described. There is “essentially a bottomless pit” of viruses to discover, says Artem Babaian, a computational virologist at the University of Toronto in Canada. Some of these viruses could cause diseases in people, which means that characterizing them could help to explain mystery illnesses, he says.
Previous studies have used machine learning to find new viruses in sequencing data.
https://www.nature.com/articles/d41586-018-03358-3
The latest study, published in Cell this week, takes that work a step further and uses machine learning to look at predicted protein structures [1].
The AI model incorporates a protein-prediction tool, called ESMFold, that was developed by researchers at Meta (formerly Facebook, headquartered in Menlo Park, California). A similar AI system, AlphaFold, was developed by researchers at Google DeepMind in London, who won the Nobel Prize in Chemistry this week.
Missed viruses
In 2022, Babaian and his colleagues searched 5.7 million genomic samples archived in publicly available databases and identified almost 132,000 new RNA viruses [2]. Other groups have led similar efforts [3].
But RNA viruses evolve quickly, so existing methods for identifying RNA viruses in genomic sequence data probably miss many. A common method is to look for a section of the genome that encodes a key protein used in RNA replication, called RNA-dependent RNA polymerase (RdRp). But if the sequence that encodes this protein in a virus is vastly different from any known sequence, researchers won’t recognize it.
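The limitation described above can be illustrated with a toy similarity search. This is only a sketch, not the method used in any of the cited studies: it scores a query protein against known references by the overlap of their 3-letter fragments (k-mer Jaccard similarity), standing in for real homology-search tools. The sequences, the reference name `toy_rdrp_A`, the function names and the 0.3 threshold are all invented for illustration.

```python
def kmers(seq: str, k: int = 3) -> set[str]:
    """Return the set of overlapping k-mers in a protein sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two k-mer sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_match(query: str, references: dict[str, str], k: int = 3) -> tuple[str, float]:
    """Return the name and score of the most similar reference sequence."""
    q = kmers(query, k)
    scored = ((name, jaccard(q, kmers(ref, k))) for name, ref in references.items())
    return max(scored, key=lambda pair: pair[1])


def looks_like_known_rdrp(query: str, references: dict[str, str],
                          threshold: float = 0.3, k: int = 3) -> bool:
    """Flag the query only if it resembles a known reference closely enough."""
    return best_match(query, references, k)[1] >= threshold


# Hypothetical toy data: made-up amino-acid strings, not real RdRp sequences.
known = {"toy_rdrp_A": "MSGDDLVVICESAGTQEDAA"}
close_variant = "MSGDDLVVICKSAGTQEDAA"   # one substitution relative to the reference
distant_variant = "MTGDDKIIWSNPFHRYELCQ"  # shares only a single 3-mer with it

print(looks_like_known_rdrp(close_variant, known))    # recognized
print(looks_like_known_rdrp(distant_variant, known))  # missed
```

In this toy setting the close variant clears the threshold while the distant one does not, even though both are meant to play the same role. That is the gap a search based purely on sequence similarity leaves open.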
Shi Mang, an evolutionary biologist at Sun Yat-sen University in Shenzhen, China, and a co-author of the Cell study, and his colleagues went looking for previously unrecognized viruses in publicly available genomic samples.
They developed a model, called LucaProt, using the ‘transformer’ architecture that underpins ChatGPT, and fed it sequencing and ESMFold protein-prediction data.
They then trained their model to recognize viral RdRps and used it to find sequences that encoded these enzymes — evidence that those sequences belonged to a virus — in the large tranche of genomic data. Using this method, they identified some 160,000 RNA viruses, including some that were exceptionally long and found in extreme environments such as hot springs, salt lakes and air.
Just under half of them had not been described before.
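As a rough check, the article's own rounded figures are consistent with "just under half". The snippet below simply divides the 70,500 previously unknown viruses by the roughly 160,000 identified. Both numbers are the approximations quoted in this article, not exact counts from the Cell paper, and the variable names are illustrative.

```python
total_identified = 160_000    # "some 160,000" RNA viruses, as quoted in the article
previously_unknown = 70_500   # viruses described in the article as new to science

share = previously_unknown / total_identified
print(f"{share:.1%} of the identified viruses had not been described before")
```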
They found “little pockets of RNA virus biodiversity that are really far off in the boonies of evolutionary space”, says Babaian.
“It’s a really promising approach for expanding the virosphere,” says Jackie Mahar, an evolutionary virologist at the CSIRO Australian Centre for Disease Preparedness in Geelong.
Characterizing viruses will help researchers to understand the microbes’ origins and how they evolved in different hosts, she says.
And expanding the pool of known viruses makes it easier to find more viruses that are similar, says Babaian.
“All of a sudden you can see things that you just weren’t seeing before.”
The team wasn’t able to determine the hosts of the viruses they identified, which should be investigated further, says Mahar.
Researchers are particularly interested in knowing whether any of the new viruses infect archaea, an entire branch of the tree of life that no RNA viruses have been clearly shown to infect.
Shi is now developing a model to predict the hosts of these newly identified RNA viruses.
He hopes this will help researchers to understand the roles that viruses have in their environmental niches.
doi: https://doi.org/10.1038/d41586-024-03320-6
References
1. Hou, X. et al. Cell https://doi.org/10.1016/j.cell.2024.09.027 (2024).
2. Edgar, R. C. et al. Nature 602, 142–147 (2022).
3. Zayed, A. A. et al. Science 376, 156–162 (2022).