Using blockchain to tackle one of the largest data sets around

(Image credit: Zapp2Photo / Shutterstock)

Science has made remarkable progress in mapping DNA over the past few decades. It took scientists 13 years and $3 billion to sequence the first human genome by 2003. This can now be done in a matter of weeks, or possibly even days.  

The costs of gathering this data have fallen fast. As recently as January 2017, Illumina, the world's leading sequencing provider, unveiled a new sequencer, the NovaSeq, which the company says will eventually deliver a whole genome for less than $100.

These technological advances and plummeting costs come at a time when more and more people are interested in learning genetic information about themselves or their family. The global consumer genetic testing market was worth $70 million in 2015 and is expected to rise to $340 million by 2022.  

The growth rate of genomic data acquisition over the past decade has been truly remarkable, with the total amount of sequence data produced doubling approximately every seven months.  
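To get a feel for what a seven-month doubling time means in practice, the short sketch below converts it into a growth factor over any period. It assumes the doubling time stays constant, which is a simplification, and the function name is purely illustrative.

```python
def growth_factor(months: float, doubling_months: float = 7.0) -> float:
    """Multiplicative growth after `months`, assuming a fixed doubling time."""
    return 2.0 ** (months / doubling_months)


# Growth in total sequence data over one year and over five years,
# if the seven-month doubling time held steady (an assumption).
print(f"1 year:  x{growth_factor(12):.2f}")
print(f"5 years: x{growth_factor(60):,.0f}")
```

At that pace, the total volume of sequence data grows roughly 3.3-fold every year.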

And this growth is only going to accelerate as the scientific community and governments spearhead major genomic data projects. For example, both England and Saudi Arabia have announced plans to sequence 100,000 of their citizens, and researchers in both the United States and China aim to sequence 1 million genomes in the next few years. Already, one-third of Iceland’s 320,000 citizens have donated blood for genetic testing purposes.

Scientists estimate that between 100 million and as many as 2 billion human genomes could be sequenced by 2025. It’s predicted that genomics will eventually create more digital information than astronomy, particle physics and even popular Internet sites like YouTube. Within 10 years, genomics could be generating between 2 and 40 exabytes of data a year.

Data collection outpacing security  

Our ability to quickly and inexpensively generate and analyze genetic data is far outstripping the security measures used to keep this very sensitive personal data safe and private. This is a huge issue that needs to be addressed before we can move forward in realizing the transformative potential of this data treasure trove.

The enormous amount of genomic data to be generated in the near future poses a serious challenge in terms of storage and security. Already, consumer genetic testing companies have access to a vast amount of data. Not only do consumers have zero control over how this sensitive data is used or who can access it, but most of them have no idea what these companies are doing with that data.  

The truth is, there are few laws regulating how companies handle the privacy and security of genetic testing results. Genetic testing companies routinely sell data to third parties, and although it’s pseudonymised before it’s passed on, it may not be difficult in some cases to link identifying information to gene data.  

In the wrong hands, access to identifiable genetic information could lead to genetic discrimination or worse — not just for the donor, but for their family as well. And a serious breach of medically sensitive data could have a major impact on the development of genomic medicine. The current lack of controls over genetic data access is a landmine that could potentially make prominent recent financial data breaches look like child’s play. 

Bring on the blockchain  

If we are to realise the vast potential of genetic data, we must solve the current crisis in genetic data security and access. Luckily, blockchain technology can make this possible.  

Blockchain technology is a powerful tool. Like a massive ledger, the blockchain records and indexes data. However, instead of storing it on a central server, the blockchain stores data across vast networks of computers that constantly verify information with each other. This distributed and decentralised ledger system allows critical data to be stored securely, making it extremely difficult to alter records or to access them without authorisation.
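The tamper-evidence described here comes from chaining records together by their hashes. The following is a minimal, single-machine sketch of that idea only: it leaves out the network, the consensus between computers and any encryption, and every function and field name in it is hypothetical. It also assumes the ledger holds small records, such as a digest of a genome file, rather than the genome itself.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, prev_hash, payload):
    """SHA-256 over a block's contents, serialised deterministically."""
    body = json.dumps(
        {"index": index, "prev_hash": prev_hash, "payload": payload},
        sort_keys=True,
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_block(chain, payload):
    """Add a block that commits to the hash of the block before it."""
    index = len(chain)
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    chain.append({
        "index": index,
        "prev_hash": prev_hash,
        "payload": payload,
        "hash": block_hash(index, prev_hash, payload),
    })


def is_valid(chain):
    """Recompute every hash and check each link back to the genesis value."""
    prev_hash = GENESIS_HASH
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev_hash"] != prev_hash:
            return False
        expected = block_hash(block["index"], block["prev_hash"], block["payload"])
        if block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


chain = []
append_block(chain, {"event": "consent granted", "genome_digest": "abc123"})
append_block(chain, {"event": "access logged", "genome_digest": "abc123"})
print(is_valid(chain))  # True

# Quietly editing an old record breaks the stored hash, so the change is detected.
chain[0]["payload"]["event"] = "consent revoked"
print(is_valid(chain))  # False
```

In a real blockchain, many independent computers hold copies of the chain and run this kind of check on each other's copies, which is what makes a silent edit impractical rather than merely detectable.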

This makes it well suited to addressing many of the trust issues in storing genomic data, such as patient consent, unclear data ownership, data integrity and user authentication. Smart contracts, in turn, enable complex data rights management and fine-grained access control.

With permission-based smart contracts, access to genomic information could be authorised via the blockchain. This means that no one would get access to a donor’s data unless the donor gives permission first. Donors would also control which parts of that data are released and to whom.
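The access rule described here (deny by default, release only what the donor has approved, and only to the party they named) can be modelled in a few lines. The sketch below is plain Python standing in for the logic a smart contract might encode. It is not an actual contract or any particular platform's API, and the class, method and category names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ConsentRegistry:
    # donor -> {requester -> set of data categories the donor has released}
    grants: dict = field(default_factory=dict)
    # append-only record of every grant, revocation and access check
    log: list = field(default_factory=list)

    def grant(self, donor, requester, category):
        """Donor releases one category of their data to one requester."""
        self.grants.setdefault(donor, {}).setdefault(requester, set()).add(category)
        self.log.append(("grant", donor, requester, category))

    def revoke(self, donor, requester, category):
        """Donor withdraws a permission they gave earlier."""
        self.grants.get(donor, {}).get(requester, set()).discard(category)
        self.log.append(("revoke", donor, requester, category))

    def can_access(self, donor, requester, category):
        """Deny by default: access requires an explicit, matching grant."""
        allowed = category in self.grants.get(donor, {}).get(requester, set())
        self.log.append(("check", donor, requester, category, allowed))
        return allowed


registry = ConsentRegistry()
registry.grant("donor-1", "lab-A", "cardiac_variants")

print(registry.can_access("donor-1", "lab-A", "cardiac_variants"))  # True
print(registry.can_access("donor-1", "lab-A", "full_genome"))       # False
print(registry.can_access("donor-1", "lab-B", "cardiac_variants"))  # False

registry.revoke("donor-1", "lab-A", "cardiac_variants")
print(registry.can_access("donor-1", "lab-A", "cardiac_variants"))  # False
```

On a blockchain, the grants and the log would live on the shared ledger instead of in one program's memory, so neither the donor nor the requester would have to trust a single company to enforce or record these decisions.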

Unlocking the power of data  

Creating a system in which people are in control of their genetic data, and in which that data can be transferred safely and easily, can help usher in an era of open, collaborative and data-driven science that paves the way for precision medicine.

Genomic medicine can enhance the understanding and treatment of as many as 7,000 rare diseases, alongside cancers, complex and long-term diseases such as cardiovascular and neurodegenerative conditions, and infections. And that’s just the tip of the iceberg.

Currently, many people are understandably reluctant to share their genomic data. But if this data can be encrypted and anonymised and, crucially, remains under the ownership of the individual, far more people will be willing to participate in genomic sequencing.  

The more genetic data that is collected, the faster medical science can progress. Accurate, comprehensive interpretation of individual genomes will only be possible when the biomedical community has analyzed enough genome sequences representing the full range of natural human variation.  

A secure, blockchain-based system would also make it easier for scientists to collaborate with other organisations and to build on each other’s research, speeding progress even further. Without a better way to safely share data, there is a risk that data will remain in individual silos, thwarting the potential that can only be realised with access to the bigger picture.

Ultimately, what genomics needs is trust between scientists, institutions and DNA donors that genomics data will be stored and shared safely. Blockchain technology can offer improved security, easier data sharing, lower-cost management, and better-quality “big data” analytics for genomic data. In turn, this will support the greater goal of making the most of this important data to transform disease prevention and personalised healthcare.

Dr. Axel Schumacher, CEO / Co-Founder of Shivom

Dr Axel Schumacher - 20 years’ experience in genetics; CEO/co-founder of blockchain-enabled genomic data-hub startup, Shivom, aiming to be the largest genomic & healthcare data-hub, securely stored with the help of blockchain technology.