The secrets to big data tool and technology selection

In a fast-paced market where new big data tools are introduced frequently, companies can struggle to make an informed decision on the best-suited tools and technologies. The burden of siloed legacy systems, coupled with the pressure to save costs while increasing efficiency, is a hurdle many face. From that perspective, it is easy to see why drastic change can seem both daunting and expensive.

The success of any big data programme relies heavily on a robust, scalable architecture built with the right tools. Yet many companies ignore this and try to implement complex solutions on old, slow systems that are simply not designed for the volume of data now influencing the business. Having technology in place that is tailor-made for your business's big data strategy and wider business goals is key. But how do you do it, and where do you start?

Proprietary or open source?

For many organisations, open-source tools and technology are often preferable to their proprietary equivalents because of the much lower cost. The very nature of open source also helps organisations avoid the vendor lock-in many have suffered in the past - finding themselves held to ransom on price for proprietary solutions.

Considerations when selecting the tools

There are many competing open-source tools and vendors on the market, and many of them claim overlapping or ill-defined functional boundaries, so selecting the right tool for the job can be a challenge. A big data implementation is effectively a stack of many interrelated tools, some of which depend on each other, require complex integration, or exclude the use of other tools.
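One way to reason about these inter-tool relationships is to treat the candidate stack as a small graph of dependencies and exclusions and check it for conflicts before committing. The sketch below is purely illustrative - the tool names and the dependency and exclusion rules are hypothetical placeholders, not taken from any real product documentation:

```python
# Minimal sketch: validate a candidate tool stack against declared
# dependency and mutual-exclusion rules. All tool names and rules here
# are hypothetical examples, not real product constraints.

def validate_stack(stack, depends_on, excludes):
    """Return a list of problems found in the chosen tool stack."""
    chosen = set(stack)
    problems = []
    for tool in stack:
        # A tool's dependencies must also be part of the stack.
        for dep in depends_on.get(tool, []):
            if dep not in chosen:
                problems.append(f"{tool} requires {dep}, which is not in the stack")
        # Some tools cannot be deployed alongside each other.
        for rival in excludes.get(tool, []):
            if rival in chosen:
                problems.append(f"{tool} cannot be deployed alongside {rival}")
    return problems

# Illustrative relationships only - in practice these come from each
# tool's own documentation and your integration testing.
depends_on = {"stream_engine": ["message_bus"], "sql_layer": ["storage"]}
excludes = {"scheduler_a": ["scheduler_b"]}

stack = ["stream_engine", "sql_layer", "storage", "scheduler_a", "scheduler_b"]
for problem in validate_stack(stack, depends_on, excludes):
    print(problem)
```

Even a simple check like this makes the cost of a tool choice visible early: adding one component can silently pull in several others, or rule out an option you had planned to use elsewhere in the stack.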

Ideally, you'll test at least some of the options, but there's always the chance that the technology landscape will have moved on before you finish a proof of concept. So you'll need to maintain close contact with the open source community, and keep up to date with toolset development and overall development direction. You may even be able to influence the development direction, which is one of the hallmarks of the open-source ethos.

However, you may also encounter some dead ends. You may commit time to a tool, only to find the industry heads off in a different direction, leaving the tool unsupported and unmaintained. To minimise the risk of this happening to you, try to favour tools which have been widely adopted and have a large community of followers and contributors. To reduce the risk further, choose tools where the contributors on the project come from more than one company.
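A rough, quantifiable proxy for that multi-company test is to look at the email domains in a project's commit history: a single dominant domain suggests the project lives or dies with one vendor. The snippet below is a hedged sketch using made-up email addresses; a real assessment would extract committer data from the project's version-control history and treat domains only as an approximation of employer:

```python
# Hedged sketch: estimate organisational diversity among a project's
# contributors by counting commits per email domain. Domains are only a
# rough proxy for employer, and the addresses below are illustrative.

def contributor_companies(committer_emails):
    """Map each email domain to the number of commits from that domain."""
    counts = {}
    for email in committer_emails:
        domain = email.rsplit("@", 1)[-1].lower()
        counts[domain] = counts.get(domain, 0) + 1
    return counts

# Hypothetical commit authors extracted from a project's history.
emails = [
    "alice@vendor-a.example",
    "bob@vendor-b.example",
    "carol@vendor-a.example",
    "dan@university.example",
]
companies = contributor_companies(emails)
print(sorted(companies.items()))
```

If one domain accounts for nearly all commits, the project effectively has a single corporate sponsor, and the dead-end risk described above is correspondingly higher.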

Rather than choosing a tool which does everything you need right now but has a small following and may become a dead end, the better choice may be a tool with a long life and features close enough to what you want - one that can be extended or steered towards what you really need through open-source participation. Striking this balance requires careful and frequent monitoring of the direction in which the tools - and the industry as a whole - are moving.

Taking the long-term view

A big data initiative is normally a complex ongoing programme of work, rather than a single implementation project. Over time, you'll probably implement multiple use cases onto a shared platform, so it's vital to take a balanced architectural view on the tool stack.

Design a stack that supports your immediate needs, but keep sight of the longer-term strategic vision as well. Make sure your technology choices support your long-term aspirations, or can be extended to do so, and avoid an architectural dead-end.    

Four key considerations

1. Decide whether open-source or proprietary technology is right for your organisation

2. Watch the roadmap for any open-source tools; try to avoid single-contributor projects

3. Assess every angle of inter-tool dependency and avoid dead-ends

4. Design a balanced architecture to meet both current needs and long-term aspirations

Clearly, there are many options and variations when it comes to the tools needed for big data adoption, all of which depend on integration with existing systems and the budget available for the new infrastructure. One element remains consistent, however: building a big data strategy, in which the technologies play an important role, takes a long-term vision. As long as a business looks at big data tools and software through that lens, navigating the congested software market - and understanding how different technologies fit together - becomes much more manageable. Have a plan, stick to it, and pick the best-suited tools, whether open source or proprietary, that can support the long-term business needs, objectives and goals.

Rick Farnell is the Co-Founder and Senior Vice President of Think Big, a Teradata company. He is responsible for Think Big's business across international markets.