A team of nearly 100 scientists from the Telomere-to-Telomere (T2T) Consortium has unveiled the first complete sequence of a human genome. The findings were published on March 31 in the journal Science.
The paper presents a complete 3.055 billion-base pair sequence of a specific human genome, T2T-CHM13, adding 400 million letters to the previously sequenced DNA and correcting several errors in it.
"There are no longer any hidden or unknown bits," said Robert Waterston, PhD, University of Washington geneticist and a leader in the original Human Genome Project. He was not involved in the new effort. "I think that is psychologically a big thing. I just admire these scientists for sticking with it."
In 2003, the Human Genome Project made history when it sequenced 92% of the human genome at a cost of $3 billion. The remaining 8% of the genome consisted of large chunks of DNA containing highly repetitive sequences that many scientists at the time dismissed as "junk DNA."
In the two decades since, scientists have labored to decipher the remaining 8%. Until now, it was unclear what these uncharted regions coded for, but the new findings reveal that they are far from "junk."
"It turns out that these genes are incredibly important for adaptation," said Evan Eichler, PhD, professor of genome sciences at the University of Washington and lead author of the paper in Science. "They contain immune response genes that help us to adapt and survive infections and plagues and viruses. They contain genes that are very important in terms of predicting drug response."
The work was made possible by rapid improvements in the DNA sequencing machines made by Oxford Nanopore Technologies and Pacific Biosciences. By 2017, machines had emerged that could accurately read a million letters of DNA at a time, opening the door to finally tackling the genome's hard bits.
This prompted Adam Phillippy of the National Human Genome Research Institute and Karen Miga of the University of California, Santa Cruz to establish the T2T Consortium to sequence all 23 chromosomes from one end, or telomere, to the other.
"We had the benefit of youthful optimism and we were fired up by the promise of these new technologies," Phillippy said.
By summer 2020, the consortium had sequenced and analyzed two chromosomes and planned what Phillippy called a "hackathon" to do the same for the other 21. With the COVID-19 pandemic in full swing, the members collaborated remotely over Zoom and Slack.
The consortium is now beginning a pan-genome effort to read the entire DNA sequences of hundreds of people from around the world. "The goal is to create as complete a human genome as possible, representing much more of human diversity," said Erich Jarvis, PhD, a Rockefeller University neurogeneticist and co-leader of the pan-genome effort.