National Institutes of Health (NIH) researchers have released an innovative software tool that makes assembling complete genome sequences from a variety of species more affordable and accessible. The study, published February 16 in Nature Biotechnology, describes the software, called Verkko -- “network” in Finnish.
The development of Verkko grew out of the assembly of the first complete, or gapless, human genome sequence, completed in 2022 by the Telomere-to-Telomere (T2T) consortium.
T2T collaborators, funded by the NIH’s National Human Genome Research Institute (NHGRI), used new DNA sequencing technologies and analytical methods to generate and assemble the remaining 8% to 10% of the human genome sequence. However, the scientists had to assemble the resulting sequence fragments manually -- a process that took a large, highly skilled team several years to complete.
The researchers who participated in the new study contend that Verkko can perform the same task in a few days.
“Verkko can democratize generating gapless genome sequences,” NHGRI senior investigator Adam Phillippy said in a statement. “This new software will make assembling complete genome sequences as affordable and routine as possible.”
The researchers say that assembling a genome sequence is similar to putting together a jigsaw puzzle. However, different DNA sequencing technologies generate different types of genomic puzzle pieces. Some are small and highly detailed, while others are much bigger but blurrier.
Verkko compares and assembles both types of pieces to generate a complete and accurate picture. It starts by putting together the small, detailed pieces, creating many partially assembled but disconnected segments. The software tool then compares the assembled regions with the larger, less precise pieces. These larger pieces help put the more detailed regions in order. The final product is an accurate and complete genome sequence.
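The ordering step described above can be illustrated with a toy Python sketch. This is not Verkko's actual algorithm or code: the function names, the sequences, and the simple mismatch-counting comparison are all assumptions made for illustration. It assumes the small, detailed pieces have already been joined into accurate segments, and uses one longer, error-containing read only to decide the order of those segments.

```python
def best_offset(segment, long_read):
    """Return the offset in long_read where segment fits with the fewest mismatches.

    Hypothetical helper: a plain sliding-window comparison standing in for
    the far more sophisticated alignment a real assembler would use.
    """
    best_pos, best_mismatches = -1, len(segment) + 1
    for pos in range(len(long_read) - len(segment) + 1):
        window = long_read[pos:pos + len(segment)]
        mismatches = sum(a != b for a, b in zip(segment, window))
        if mismatches < best_mismatches:
            best_pos, best_mismatches = pos, mismatches
    return best_pos


def order_segments(segments, long_read):
    """Sort accurate segments by where they best match the noisy long read."""
    return sorted(segments, key=lambda seg: best_offset(seg, long_read))


# Toy data (invented): three accurate but disconnected segments, out of order.
segments = ["GGCTAAGC", "ACGTTGCA", "TTCAGGAT"]

# One long read spanning the whole region, with two base errors ("blurrier").
long_read = "ACGTTGCATTCAGCATGGCTATGC"

ordered = order_segments(segments, long_read)
assembled = "".join(ordered)
print(ordered)
print(assembled)
```

In this sketch the long read supplies only the order; the final sequence is built from the accurate segments, so the errors in the long read do not carry over into the result.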
The researchers tested Verkko with both human and nonhuman genome sequencing data. The software quickly and precisely assembled the sequences of whole chromosomes -- a formerly painstaking feat. With only one gapless human genome sequence currently in existence, scientists lack knowledge about how many portions of the genome, such as regions of highly repetitive DNA, vary across human populations. The software may open the door to a greater number of complete human genome sequences, allowing researchers to better assess human genomic diversity.
Verkko may also accelerate efforts to generate gapless genome sequences for commonly used research species, such as mice, fruit flies, and zebrafish, improving their usefulness to scientists.
Additionally, generating gapless genome sequences from a wide variety of plants, animals, and other organisms will further our understanding of comparative genomics, the study of the differences and similarities among the genomes of diverse species.