I recently read a great article in Bioscience Technology that acknowledges that new technologies are generating significant amounts of data in the healthcare and life sciences sectors, and that this is viewed as a positive development for these markets.
The article notes that data is no longer produced only by lab equipment and corporate research projects; it now also comes from consumer devices made by new-to-the-game companies like Apple and Google. It points to the growing popularity of wearable devices and mobile and digital health applications, coupled with growth in the use of social media and analytics. As a result, more and more data streams are now available to medical researchers looking to extract meaningful information.
These technology companies (Apple, Google and so on), which have historically never been in the healthcare market, are very much part of it now. Because of their reach, and the public’s widespread acceptance of the technology they use to gather data (as evidenced by the success of the Apple Watch and of the companies producing ‘fitness wearables’ such as Fitbits), huge numbers of people are generating huge amounts of data that traditional institutions in these markets would very much like to get their hands on.
The article also points out that, irrespective of how data is sourced, be it from a scientific journal, an Electronic Medical Record (EMR), a social media post or a wearable device, it can only be analysed effectively if it has been semantically organised. In other words, you need to be able to make use of it. Another major factor in the successful use of this data is that it must be trustworthy and authentic, and must remain usable and accessible for years to come. As the author points out, a clear data management strategy needs to underpin the use of such data, irrespective of its source (but especially if it’s coming from new, varied and largely untried sources such as consumer wearables).
Quite rightly in my view, the author concludes by saying that data harmonisation and discovery are key to extracting meaningful information from the data. I would add that this also needs to be underpinned by data integrity: the more data there is, the greater the risk that some small percentage of it is unusable, and the impact of this on a healthcare organisation or research project could be huge.
At Arkivum we have a significant focus on the healthcare market and understand the importance of securing the integrity and authenticity of valuable data. In fact, the security, provenance and authenticity of healthcare data are our top priorities, and we have designed them into our long-term storage products. Take, for example, one of our customers, the Bristol Genetics Laboratory, which is based at Bristol Southmead NHS hospital. It delivers routine genetic testing services to the South West region, a population of approximately 5 million, as well as providing highly specialised services to the rest of the UK and internationally.
The laboratory implemented our cloud-based Arkivum/100 service to store gene-sequencing data and digital imaging from the 30,000 genetic investigations it conducts annually. These investigations cover a wide range of sample types and use a range of molecular and cytogenetic techniques.
Many of these investigations, including those conducted using its Illumina NextSeq and MiSeq next-generation sequencing (NGS) platforms, are resulting in a massive increase in the size of data sets (exacerbated by the fact that, as the testing gets cheaper, clinicians are requesting more tests and thus increasing diagnostic yield). So right now, big data is becoming a very real challenge for all genetics laboratories, not just Bristol.
Over the next two years, Bristol Genetics Laboratory will generate more than 30TB of data from NGS services alone. The true value of that data is not yet known, so it must be stored for the very long term so that it can be accessed and interrogated over time as new data analysis techniques are developed.
Nik Stanbridge, VP Marketing, Arkivum
Image source: Shutterstock/McIek