Big data has grown exponentially over the past few years - and its evolution shows no sign of stagnating.
Research predicts that the digital universe will grow to an incredible 163 zettabytes (or 163 trillion gigabytes) of data by 2025. For reference, one zettabyte equates to roughly 2 billion years' worth of music (enough to power you through a pretty long work week…)
We can apply this data across a variety of different concepts - like creating custom learning models for students or offering more personalised healthcare. But there is still a considerable amount of uncertainty wrapped around both the analysis and the deconstruction of big data.
Much of the apprehension behind the use of big data arises from the fact that 80 percent of it is unstructured - think audio, video, and social media. Equally, big data is difficult and costly to process and analyse, and by and large is being generated much faster than we can feasibly keep up with.
Still - big data is predicted to evolve considerably in the next few years. Many businesses have started planning the implementation of big data into their strategic arsenal. As the technology continues to grow, we can start to envisage its impact on our lives before the dawn of 2021.
We caught up with seven tech experts for their predictions on big data over the coming two years. These are their thoughts:
Data science will soar
Harry Dewhirst, President at Blis, a mobile location data solutions provider, stated “I recently read that the Harvard Business Review dubbed this role the ‘sexiest job of the 21st century.’ There is no denying that data is going to be the currency that powers our economy moving forward; we are already well down this road. Which means data scientists will continue to drive the future.
“It’s critical businesses start planning for the integration of data scientists into their organisational structures now, and perhaps more so for colleges and other educators to provide more opportunities for future workers to explore this field. Data has staying power, it’s not going away any time soon.”
And he’s right - data science has become one of the most rapidly progressing fields, thanks to the crucial role it plays in understanding big data.
The Quant Crunch, a report by IBM, estimated that up to 2.72 million jobs requiring data science skills will be posted by 2020.
Skipper Seabold, Co-Lead of Data Science R&D at Civis Analytics, went on to explain the anatomy of data science jobs by 2021.
“The role ‘data scientist’ will cease to be a specialised position that people hire for. The data science toolbox will become a set of skills that people in various functional roles within an organisation are expected to have. Most data scientists will no longer have to think about distributed systems – Hadoop, Spark, or HPCs. Old technologies, like traditional relational databases, will catch up in performance and capabilities to these technologies, and the need to think about and program for multiple machines connected over a network will be removed by tools available through the big cloud providers.”
Accessible big data
Sam Underwood, VP of Business Strategy at Futurety, a data analytics and marketing agency, predicts the accessibility of big data:
“By 2021, big data will become much more accessible, and therefore much more useful. A key challenge for many enterprises today is unifying all of this data; by definition, this is a big job! Building data lakes and other flexible storage environments is a major priority in 2018, and we predict that by 2021, much of this critical data will be housed in systems that are much more accessible by the tools that will use them (visualisation, analysis, predictive modelling). This opens up limitless possibilities for every aspect of business operations to be purely data-driven.”
Exactly - merely collecting and processing big data is insufficient. If business end-users and decision-makers within companies aren’t able to digest data, they will of course struggle to find value in it.
Jeff Houpt, President of DocInfusion and enterprise app developer of over 15 years, reinforces Sam’s sentiment with his own insight.
“I see the landscape for big data evolving from highly technical and expensive to more self-service and on-demand methods where the resources you need spin up automatically and you are only charged for what you use. Really, in today’s landscape to analyse big data you need massive or expensive infrastructure to capture, catalogue, and prepare the data for use. Then to query and analyse the data you need to have the skillset of a very technical programmer/mathematician or data scientist. I think that there will be platforms and apps that continue to make these tasks easier and more intuitive, and within 3 years we are going to get to a point where you feed the data straight into a single application that will handle all of the remaining details for you – and do it at scale.”
“I also think that through the use of artificial intelligence (AI) and machine learning concepts the applications will be able to automatically understand your goals by using knowledge obtained from past users who have done a similar task. This will allow the systems to optimise the data for specific purposes with very little feedback from the user.”
Natural language processing
Locating relevant data as fast as possible could be facilitated through natural language processing - a subset of AI that parses human language so machines can understand it.
KG Charles-Harris, CEO of Quarrio – a conversational interface for enterprises – argues:
“The most fundamental prediction for big data is that by 2021, information retrieval from big data repositories will be done using natural language and be instantaneous. People will just ask questions in normal language and the system will answer back in ordinary language, with auto-generated charts and graphs when applicable.”
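To make this prediction concrete, here is a minimal, hypothetical sketch of the idea: a toy function that pattern-matches a plain-English question onto a SQL query and runs it against an in-memory database. Real conversational interfaces such as Quarrio's are far more sophisticated; the table name `orders`, the column `amount`, and the `question_to_sql` helper are all invented for illustration.

```python
import re
import sqlite3

def question_to_sql(question):
    """Toy translation of a narrow class of English questions into SQL.
    Real natural-language interfaces use full NLP pipelines, not regexes."""
    q = question.lower()
    m = re.match(r"how many (\w+) are there", q)
    if m:
        return f"SELECT COUNT(*) FROM {m.group(1)}"
    m = re.match(r"what is the average (\w+) of (\w+)", q)
    if m:
        return f"SELECT AVG({m.group(1)}) FROM {m.group(2)}"
    raise ValueError("question not understood")

# Tiny in-memory database to demonstrate the round trip.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?)", [(10.0,), (30.0,)])

sql = question_to_sql("What is the average amount of orders?")
answer = conn.execute(sql).fetchone()[0]
print(answer)  # 20.0
```

The hard part, of course, is the translation step in the middle - which is exactly where the AI advances the experts describe would slot in.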
Database as a service
Ben Bromhead, CTO and Co-Founder of Instaclustr – an open-source big data technology solutions provider – talks about the relationship between DBaaS and big data.
“We expect to see Database-as-a-Service (DBaaS) providers really embrace big data analytics solutions over the next three years, as they adapt to serve a fast-growing client need. Enterprise companies have been collecting and storing more and more data, and continue to seek ways to most efficiently sift through that data and make it work for them.
By integrating big data analytics solutions into their platforms, DBaaS providers will not just host and manage data, but also help enterprise clients to better harness it. For example, Elasticsearch is a powerful open source technology we’ve become quite familiar with that enables developers to search and analyse data in real-time.
Expect this and similar technologies that put developers in command of their data to become increasingly prominent within DBaaS repertoires.”
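The core data structure behind Elasticsearch and similar full-text search engines is the inverted index: a map from each term to the documents containing it. The toy class below sketches that idea in a few lines of plain Python - it is a conceptual illustration, not Elasticsearch's actual API.

```python
from collections import defaultdict

class TinyIndex:
    """Minimal inverted index, illustrating the core idea behind
    full-text search engines such as Elasticsearch (a toy, not the real API)."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of matching doc ids
        self.docs = {}

    def index(self, doc_id, text):
        """Store the document and record each term in the postings lists."""
        self.docs[doc_id] = text
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def search(self, query):
        """Return ids of documents containing ALL query terms (AND semantics)."""
        terms = query.lower().split()
        if not terms:
            return set()
        hits = set(self.postings[terms[0]])
        for term in terms[1:]:
            hits &= self.postings[term]
        return hits

idx = TinyIndex()
idx.index(1, "big data analytics platform")
idx.index(2, "streaming data pipeline")
print(idx.search("data analytics"))  # {1}
```

Because lookups go term-by-term rather than scanning every document, this structure is what makes "real-time" search over large corpora feasible.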
Our final big data prediction comes from Jomel Alos, Online PR Lead of Spiralytics Performance Marketing, a data science-backed marketing agency.
“One of the biggest issues right now for big data is the clutter and incorrect data. Most companies right now have their own cleansing framework or are still developing theirs. Eventually, cleansing and organising will be automated with the help of various tools. Because big data is not static, these tools are also expected to automate the cleansing process on a regular basis.”
Indeed - for quick data retrieval to occur, big data needs to be cleansed for quality and relevancy. Back in 2016, an estimated $3.1 trillion was lost in the US as a result of poor data quality. This is why ‘scrubbing’ processed data is gaining relevance globally. Such procedures involve an intense amount of work from data scientists - no wonder, then, that 60 percent of data scientists stated they spend most of their time cleaning data for quality. Once these processes can be automated through the use of AI and machine learning, as mentioned by Jomel, real progress will be made.
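To illustrate what a cleansing pass actually does, here is a minimal, hypothetical sketch using only the standard library: it trims whitespace, normalises case, drops records with missing emails, and de-duplicates. The sample data and the `cleanse` helper are invented for illustration; production frameworks add validation rules, scheduling, and the ML-assisted steps described above.

```python
import csv
import io

RAW = """name,email,signup
 Alice ,ALICE@EXAMPLE.COM,2020-01-05
Bob,bob@example.com,2020-02-10
alice,alice@example.com,2020-01-05
Carol,,2020-03-01
"""

def cleanse(raw_csv):
    """Toy cleansing pass: trim whitespace, normalise case, drop rows with
    missing emails, and de-duplicate on the email column."""
    seen = set()
    clean = []
    for row in csv.DictReader(io.StringIO(raw_csv)):
        email = row["email"].strip().lower()
        if not email or email in seen:
            continue  # discard incomplete and duplicate records
        seen.add(email)
        clean.append({
            "name": row["name"].strip().title(),
            "email": email,
            "signup": row["signup"].strip(),
        })
    return clean

rows = cleanse(RAW)
print([r["email"] for r in rows])  # ['alice@example.com', 'bob@example.com']
```

Running this same pass on a schedule - as Jomel predicts the tooling will do automatically - is what keeps a continuously growing dataset clean rather than cleaning it once.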
Charlie Smith, MD, EMEA, Blis