
Data integration: DIY, or off-the-shelf?

(Image credit: Shutterstock/Wright Studio)

Data integration may not be one of the most headline-grabbing or exciting-sounding elements of enterprise IT, but it is vitally important. After all, data sits (or should sit) under all business decisions. Data unlocks business insight, underpins business intelligence, and drives business action. The successful integration of disparate datasets can free up valuable developer resource to focus on new opportunities, get previously untapped information in front of key stakeholders, and ultimately be the engine of digital transformation.

So it’s worth getting right. There are two main options for organisations seeking to implement effective data integration: they can build a tool in-house using available development resource, or buy an off-the-shelf integration-Platform-as-a-Service (iPaaS) solution. The question is: which?

Why do you need a data integration solution?

A single view of your activities is essential in order to make informed decisions. However, a single view rarely happens organically. Silos of data in conflicting formats, using varying terminology and different coding schemes, are the norm. Data integration is the process of bringing those disparate parts together.

How is it achieved? Sometimes it can be accomplished manually by coding scripts or deploying simple tools. More complex demands call for a data integration platform, a product to quickly extract data from a number of systems, so it can be viewed, transformed and prepared.

From there, you can create a unified dataset, and then make that dataset available for use in other IT systems or departments, such as a reporting and analytics application.

Traditionally, data integration processes begin by extracting data from all those disparate sources. Then that data is cleaned, getting rid of duplicated, corrupted or inaccurate information. Then it is made compatible enough to provide a single view, often via a data warehouse. Sometimes it is converted into an intermediate format or schema.
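
The traditional sequence described above (extract, clean, make compatible) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the two source systems, their field names, the date formats and the rule that a record without a usable email counts as corrupted are all hypothetical, and a real integration would read from actual systems rather than in-memory lists.

```python
from datetime import datetime

# Hypothetical extracts from two systems that describe the same customers
# in conflicting formats (field names, ID types, date formats, casing).
crm_rows = [
    {"CustomerID": "001", "Email": "Ann@Example.com ", "Joined": "2021-03-04"},
    {"CustomerID": "002", "Email": "bob@example.com", "Joined": "2020-11-30"},
    {"CustomerID": "002", "Email": "bob@example.com", "Joined": "2020-11-30"},  # duplicate
]
billing_rows = [
    {"cust": 1, "mail": "ann@example.com", "since": "04/03/2021"},
    {"cust": 3, "mail": "", "since": "15/07/2022"},  # corrupted: no email
]


def from_crm(row):
    """Map a CRM row onto the shared (intermediate) schema."""
    return {
        "customer_id": int(row["CustomerID"]),
        "email": row["Email"].strip().lower(),
        "joined": datetime.strptime(row["Joined"], "%Y-%m-%d").date(),
    }


def from_billing(row):
    """Map a billing row onto the same shared schema."""
    return {
        "customer_id": int(row["cust"]),
        "email": row["mail"].strip().lower(),
        "joined": datetime.strptime(row["since"], "%d/%m/%Y").date(),
    }


def integrate(crm, billing):
    """Extract, clean and unify records into one view keyed by customer_id."""
    unified = {}
    records = [from_crm(r) for r in crm] + [from_billing(r) for r in billing]
    for record in records:
        if "@" not in record["email"]:
            continue  # cleaning step: drop records with an unusable email
        # Deduplication step: the first record seen for a customer wins.
        unified.setdefault(record["customer_id"], record)
    return [unified[key] for key in sorted(unified)]


single_view = integrate(crm_rows, billing_rows)
for record in single_view:
    print(record)
```

Even in this toy case, each new source needs its own mapping function and its own cleaning rules, which is where the development and maintenance effort of a hand-built approach tends to accumulate.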

These processes can take a long time and a great deal of skill to develop, particularly in large or complex organisations. But integration-Platform-as-a-Service (iPaaS) solutions are changing the picture. By providing hybrid integration functionality, they allow organisations to query each type of database directly and bring the results together in one data view, all without building expensive data warehouses or other storage solutions. The result is a significantly faster data integration process at a fraction of the cost, broadening the appeal of data integration to a much wider audience and a significantly larger set of use cases.
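
The idea of querying sources directly and combining them into one view, with no intermediate warehouse, can be sketched in Python as follows. This is only an analogy and not how any particular iPaaS product works: two in-memory SQLite databases stand in for separate source systems, and the table and column names are hypothetical.

```python
import sqlite3

# Two in-memory SQLite databases stand in for two separate source systems.
conn = sqlite3.connect(":memory:")                      # hypothetical sales system
conn.execute("ATTACH DATABASE ':memory:' AS support")   # hypothetical support system

conn.execute("CREATE TABLE main.orders (customer_id INTEGER, total REAL)")
conn.execute("CREATE TABLE support.tickets (customer_id INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO main.orders VALUES (?, ?)",
    [(1, 120.0), (1, 80.0), (2, 45.5)],
)
conn.executemany(
    "INSERT INTO support.tickets VALUES (?, ?)",
    [(1, "open"), (2, "closed"), (2, "open")],
)

# One query reads both sources in place and returns a combined view;
# nothing is copied into a third store first.
query = """
SELECT o.customer_id, o.order_total, COALESCE(t.open_tickets, 0)
FROM (
    SELECT customer_id, SUM(total) AS order_total
    FROM main.orders
    GROUP BY customer_id
) AS o
LEFT JOIN (
    SELECT customer_id, COUNT(*) AS open_tickets
    FROM support.tickets
    WHERE status = 'open'
    GROUP BY customer_id
) AS t ON t.customer_id = o.customer_id
ORDER BY o.customer_id
"""
combined_view = conn.execute(query).fetchall()
for row in combined_view:
    print(row)
```

Here the combined view exists only as a query result, which is the contrast with the warehouse-based approach described earlier.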

DIY vs off-the-shelf: pros and cons

The choice, then, is whether you want to do it all yourself, or rely on software from a data integration vendor.

The obvious advantage of DIY is control. You can tailor the application to your specific needs, with no unnecessary or unused features. This can also make for a more controlled budget.

However, as the demands on a home-grown solution change, whether through business growth, new requirements, changes to source and target systems and so on, the likelihood is that self-coded scripts will begin to run out of runway. It may become difficult, for example, to access new resources in the cloud effectively, or information may be inaccurate or arrive late. The effort needed to maintain the current system then grows increasingly burdensome, and there is a greater risk of error. The entire situation is compounded by the risk of in-house developers leaving and skills or knowledge gaps opening up.

With an off-the-shelf solution, by contrast, your in-house resource can focus immediately on building and automating new data management processes, freeing up time for the really important tasks. With cloud solutions now available that allow non-technical people to create data views from multiple data sources, the need for skilled technical people to be on hand for every small integration effort is significantly reduced. Smaller organisations that never thought they would be able to afford a data integration solution can now start to master their own data. Support is offered out of the box, so there is no initial coding effort required, and some simple data preparation operations such as sorting, deduplication and reformatting are built in.

On the other hand, the vendor may have a product vision which does not align closely with yours. You may not need all of the features or add-on products on offer, and additional consultancy might be required for particularly complex projects.

Questions to consider

As ever, there is a balancing act to strike. Broadly speaking, the DIY approach is best suited to small organisations with relatively simple needs, or else enormous organisations with huge in-house resource; the likes of Uber, Amazon and HSBC have all managed successful data integration projects in-house.

In turn, this means that understanding whether the time has come to buy an off-the-shelf data integration solution is all about having an accurate understanding of where your organisation is in its development. How rapidly are you likely to grow, and what impact will that have on the complexity of your data? What internal resources are available to run, develop and monitor a solution, and what will that resource look like in the future?

In order to establish whether you have passed the ‘tipping point’ into benefitting from an off-the-shelf solution, it is worth asking these five questions:

  • Is business growth outstripping the capability of existing tools and reporting?
  • Are previously adequate manual methods now error-prone and struggling to cope?
  • Is there currently nothing in place to bring multiple data sources together and unify them?
  • Is your data (or data warehouse) no longer answering the questions you need answered?
  • Do you now have an opportunity to access richer, more timely insights from an ever-changing customer base?

If the answer to all or most of these questions is ‘yes’, then it may well be time to look at iPaaS. Such solutions are now being adopted across the entire spectrum for specific use cases where other solutions would be too expensive, cumbersome, time-consuming and heavyweight to consider, or where an existing in-house tool may be failing to deliver the required results. Once you’ve answered the above questions, and perhaps more besides, you’ll be in a better place to make the right decision for your business.

Carlos Oliveira, CEO and founder, SPINR

Carlos is the CEO and founder of SPINR, leveraging his background in software and web application development in the financial services industry. He is an advocate of data management for all.