The Wellcome Sanger Institute has begun a research program intended to let scientists engineer biology as easily as they engineer software, electronics, and cars.
Leaders at the Sanger Institute, a Hinxton, U.K.-based genomic research center, have set up the new generative and synthetic genomics program to fix a problem that has persisted despite decades of progress in the field. As it stands, researchers still struggle to predict how biological systems will respond to change, including simple alterations such as single DNA base edits, and to engineer biology quickly and easily.
The problem, as the Sanger Institute sees it, is that decades of research have failed to answer the core question of how a genetic sequence determines the properties and regulation of proteins and RNAs. That failure has convinced the researchers that a new approach is needed.
Professor Ben Lehner, head of generative and synthetic genomics at the Sanger Institute, outlined how the program will tackle the problem. The approach rests on two pillars: technologies that enable large-scale experiments on genes and proteins, and highly predictive models that use artificial intelligence (AI).
“It will be the combination of these technologies that will enable us to solve the fundamental question of how genetic sequence determines the properties and regulation of proteins. To do this we require huge amounts of data, and the Sanger Institute’s capabilities of large-scale data generation and genomics expertise make it the natural place for us to undertake this ambitious research,” Lehner said in a statement.
The ultimate goal is to “predict the effects of editing each and every one of the building blocks of DNA” and develop “technologies to write and edit genomes at scale and speed.” Lehner and his collaborators have broken down the work toward that goal into three areas of focus: single nucleotide resolution genetics, generative biology, and synthetic genomics.
Scientists will run massively parallel perturbation experiments using DNA synthesis or editing, selection, and sequencing to generate data on the biological effects of editing every nucleotide in every genome. The researchers think the scale and diversity of the data will support the development of AI models that can predict the biological effects of genetic changes.
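To make the synthesis, selection, and sequencing loop concrete, here is a minimal Python sketch of how such data might be turned into per-variant effect scores. It is an illustration only, not the Sanger Institute's pipeline: the function names, the pseudocount, and the log2 enrichment score (variant read counts after selection versus before, normalized to the unedited sequence) are assumptions chosen because they are a common way to summarize this kind of experiment.

```python
import math

BASES = "ACGT"

def single_base_variants(seq):
    """Yield (position, ref, alt, variant_sequence) for every
    single-nucleotide substitution of seq (hypothetical helper)."""
    for i, ref in enumerate(seq):
        for alt in BASES:
            if alt != ref:
                yield i, ref, alt, seq[:i] + alt + seq[i + 1:]

def enrichment_scores(pre_counts, post_counts, wild_type, pseudocount=0.5):
    """Return log2 enrichment of each variant relative to the wild type.

    pre_counts / post_counts map a sequence to its read count before and
    after selection. The pseudocount (an assumed value) avoids division
    by zero for variants that were never sequenced.
    """
    def log_ratio(variant):
        post = post_counts.get(variant, 0) + pseudocount
        pre = pre_counts.get(variant, 0) + pseudocount
        return math.log2(post / pre)

    wt_ratio = log_ratio(wild_type)
    return {
        variant: log_ratio(variant) - wt_ratio
        for variant in pre_counts
        if variant != wild_type
    }

if __name__ == "__main__":
    wild_type = "ACG"
    # A 3-base sequence has 3 positions x 3 alternative bases = 9 variants.
    library = [v for _, _, _, v in single_base_variants(wild_type)]
    print(len(library), "single-base variants")

    # Toy read counts, invented purely for illustration.
    pre = {"ACG": 100, "TCG": 100, "AAG": 100}
    post = {"ACG": 100, "TCG": 100, "AAG": 25}
    for variant, score in enrichment_scores(pre, post, wild_type).items():
        print(variant, round(score, 2))
```

A score near zero suggests the edit behaves like the unedited sequence under selection, while a strongly negative score suggests it is damaging. Tables of such scores, collected across many sequences, are the kind of training data the predictive models described above would need.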
Making genomics predictable will support other aspects of the program, such as the Sanger Institute’s plans to facilitate the generation of new proteins and regulatory elements for industry and medicine.