Big data: How to choose a storage option that works for you

Big data is in full swing and it's time for your business — big or small — to get in on the action.

But where do you begin? There's no easy way to choose which service and provider are right for you and your business. And while the variety of options may seem daunting at first, once you understand your company's goals for big data, you'll be grateful for them.

After deciding to implement big data, you'll want to decide whether to store data on site or in the cloud. Being realistic about your purpose for implementing big data and the resources currently available will go a long way in determining which storage option will best meet your company's goals.

There are advantages and disadvantages to each.

Personal involvement

How much do you and your team want to be involved in the data gathering and storing process? On-premise storage generally means that you employ an IT team to monitor and fix any issues that arise. It gives you more control over the data, but also more liability for anything that goes wrong.

If you go with the cloud, you reduce the need for an in-house IT team and shift much of the liability for hiccups to the provider, but you also lose some of the customisability.


Security

This is, arguably, the strongest component of an on-premise system. You have 100 percent control over the data and who can access it. There are no middlemen or internet transfers to worry about.

That's not to say, however, that big data in the cloud is unsafe. Understanding that their success lies in the rapport developed with clients, cloud companies are doing everything in their power to keep data secure.


Mobility

On this one, the advantage goes to cloud storage. A certain infrastructure has to be in place for an on-premise site to be established. That takes time, money and personnel, and it's not something that can easily be moved if the need arises.

For the cloud, though, mobility comes naturally. There's no hardware to worry about; everything is accessed over the internet. A change in location isn't going to throw any kinks into your data service.


Cost

This is difficult to gauge. One thing is for sure: the upfront costs for on-premise storage are much higher than for the cloud. As we mentioned, infrastructure has to be developed and hardware installed. That doesn't have to be done for the cloud. Still, cloud companies need to recoup their expenses for storage, development and monitoring the safety of your data.

They pass those expenses on to you in one form or another. That being said, if you establish an infrastructure on site and end up getting more or less data than anticipated, you're in a sticky situation.

With more data, you've got to install more infrastructure and more hardware. If you don't have enough data, then you've wasted thousands of dollars on unused hardware. You don't have that problem with big data cloud storage. You can store as much or as little as you need.
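To make the trade-off concrete, here is a rough sketch in Python of how you might compare the two options when the amount of data turns out differently than planned. Every name and figure in it (the per-terabyte hardware price, the yearly upkeep, the monthly cloud rate) is a hypothetical placeholder and not a quote from any provider; plug in your own numbers.

```python
# Rough cost comparison: on-premise vs. cloud storage.
# All prices below are hypothetical placeholders, not real quotes.

def on_premise_cost(provisioned_tb, actual_tb, cost_per_tb, yearly_upkeep, years):
    """Hardware is paid for up front; if you outgrow it, you buy more."""
    capacity_tb = max(provisioned_tb, actual_tb)
    return capacity_tb * cost_per_tb + yearly_upkeep * years


def unused_hardware_cost(provisioned_tb, actual_tb, cost_per_tb):
    """Money tied up in capacity that never gets used."""
    return max(provisioned_tb - actual_tb, 0) * cost_per_tb


def cloud_cost(actual_tb, price_per_tb_month, years):
    """Pay only for what is actually stored, month by month."""
    return actual_tb * price_per_tb_month * 12 * years


if __name__ == "__main__":
    # Assumed scenario: you planned for 100 TB but only ended up with 40 TB.
    planned, actual, years = 100, 40, 3
    print("On-premise:", on_premise_cost(planned, actual, 1000, 20000, years))
    print("Of which unused hardware:", unused_hardware_cost(planned, actual, 1000))
    print("Cloud:", cloud_cost(actual, 25, years))
```

With different assumptions (a longer time span, steadier data volumes, higher cloud rates) the comparison can tip the other way, which is why cost is so hard to gauge in general.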

To make the best decision, both the IT and executive teams need to be actively involved in the process. Both are going to be interacting with the data, albeit in different ways. What the executives see as smooth and seamless, the IT team may not, and vice versa. A middle ground should be reached to ensure success on both the technical and the business side.

Today there is a plethora of options to choose from, including:

  • Google Compute Engine
  • Amazon Web Services
  • Qubole Big Data as a Service
  • Microsoft Azure
  • Oracle Data
  • Teradata Portfolio for Hadoop
  • IBM InfoSphere BigInsights
  • SAS

The options are endless for what you can do with big data. By understanding your company's goals and objectives, you can choose the best provider for you and your situation.

Gil Allouche is the Vice President of Marketing at Qubole, a vendor of big data cloud storage.