UK Biobank has made deidentified whole genome sequencing (WGS) data on 500,000 people available, enabling researchers around the world to analyze the genomic results alongside 15 years of health data on those individuals.
Since 2006, the nonprofit UK Biobank has collected biological and medical data on 500,000 people, aged 40 to 69, by analyzing regular blood, urine, and saliva samples and collecting information on participants' lifestyles. The data is linked to health-related records. In recent years, the organization has invested 200 million pounds ($254 million) to sequence the genomes of participants.
Amgen, AstraZeneca, GSK, and Johnson & Johnson each invested 25 million pounds ($31.6 million) in the sequencing initiative, securing themselves nine months of exclusive access to the WGS data in the process. With the exclusivity window now over, UK Biobank has made the data available to all approved researchers.
“The sheer amount of genetic data is exceptional -- it is twice as much as anywhere else -- but UK Biobank’s data is so illuminating because we’ve been able to follow the health of our brilliant volunteers for around 15 years,” Professor Sir Rory Collins, principal investigator at UK Biobank, said in a statement.
Linking the WGS results to all the other data held by UK Biobank could enable researchers to reveal how genomic, medical, biochemical, lifestyle, and environmental factors combine to influence human health. The biobank features data on more than 10,000 variables such as blood pressure, cognitive function, diet, and bone density.
UK Biobank sees opportunities for the data to enable more targeted drug discovery and development, uncover the role that noncoding genetic variants play in disease, support precision medicines, and generate insights into the biological drivers of illnesses such as Parkinson’s, Alzheimer’s, and autoimmune diseases.
The Wellcome Sanger Institute and Amgen’s deCODE Genetics sequenced the genomes, processing up to 20,000 samples a month at the peak of the project. Sequencing ended early last year, after which data review and quality control were undertaken.
UK Biobank’s industry partners gained access to the WGS data, linked to other health data, in February and have led work to determine the frequency of all the variants in the population and create a single combined genetic dataset. UK Biobank has collaborated with Amazon Web Services to host the data and make it, along with the computing power to analyze it, available to approved researchers around the world.