"Companies will incorporate big data to the extent it's just data": Q&A with Pentaho's Davy Nys

We sat down with Davy Nys, vice president for EMEA and APAC at business intelligence specialist Pentaho, to learn more about the company, its core product offerings, and the direction of big data heading into 2014.

Pentaho is an unusual name for a company. Where did it come from?

Pentaho was founded by five people in 2004, so that's where the 'penta' came from. This was then adapted to Pentaho to make it a completely unique name, great for online searches. It doesn't seem to translate into anything embarrassing in other languages, which is very important to me as we expand internationally!

When and why did you join Pentaho?

I joined Pentaho back in October 2007 because I thought that joining a young startup would be fun, interesting and give me a lot of potential to use my engineering and sales skills to grow my career. I also believed in the team behind Pentaho. When I started out as an eager territory sales manager, I wouldn't have dared to dream that only six years later, I would be on a tour of Australia and Japan in my current role of heading up the EMEA and APAC regions. Pentaho gives you as much responsibility and opportunity as you're willing to take on and has been an incredible experience for me and my team. It's definitely a 'work hard/play hard' culture!

Another thing that I really liked about Pentaho is that its software is built on the commercial open source model, so instead of selling software licences and maintenance plans, we're selling annual subscriptions and services. This means that we have to put the customer at the centre of our universe and treat them right so that they continue to renew and invest. I think this creates a much better sustainable business culture and commercial model than what the traditional, proprietary vendors offer.

Pentaho is known for providing big data services. Can you explain more specifically what you offer there?

First, let me say that big data is not some 'me-too' bandwagon we decided to jump on. We got involved with big data in 2010 - back before it became fashionable - when we announced we were the first business intelligence platform to support Hadoop, the most popular big data store today. So that's the gamble we took then and it's really paid off because, today, about a fifth of our roughly 1,000 enterprise customers are actively working with big data and we think this is really only the tip of the iceberg.

Fast forwarding to the present, our latest platform (version 5.0), which we announced recently, has been completely re-architected to support businesses working with any big data source at any stage - from when they initially start experimenting with it to when they get to the point that they are embedding it into every aspect of their business. Companies in the latter category are what we're referring to as the 'data-driven business' and we think most will move in this direction, albeit at very different rates depending on the industry.

Who are some of your customers and how does Pentaho benefit them specifically?

High-growth consumer internet companies like Housetrip, Travian Games and Spreadshirt are all using Pentaho to understand their customer behaviour by analysing clickstream data, so they can improve their products and services and target more effectively. But it's not just the young online companies. Large, established companies like Lufthansa, for example, use Pentaho to analyse data and improve passenger handling processes. A lot of the new functionality we introduced into version 5.0 specifically benefits large enterprise use cases.

What can you do with Pentaho that you can't do with any other software?

One of the most distinctive aspects of version 5.0 is the ability to do what we call 'data blending at the source.' Without going into too much technical detail, other products make you export data first from various host systems, move it into a 'staging' area, join it together manually, and then analyse it. Normally, by the time this long, fiddly process is complete, the data - which was probably not clean in the first place - is outdated. Using our approach, data analysts can blend and analyse data on the fly, directly from the source, including big data stores like MongoDB, where sources like machine data from Splunk are stored. This is incredibly important for data-driven companies, like Internet firms, whose every business decision depends on a current and accurate understanding of information.

Is your software only for the deep techies, or can anyone play?

The short answer is that it's for everyone in an organisation who's involved in managing, analysing and making decisions based on data. Without question, that involves IT, because in order to analyse data successfully it needs to go through some kind of hygiene and governance process first. Where some analytics applications have decided to sidestep IT and tell customers they can make do with data that's 'good enough,' we've taken the much harder route of simplifying and speeding up the data management processes on the back end. We just don't see the logic of putting pretty pictures on top of dirty data.

Another major change we've made in Pentaho 5.0 is to completely overhaul the look and feel so everyone on the business side, from your top executives to the administrators, can engage with the software more productively and easily.

As a commercial open source company, do you think the global economic recession has helped or hindered your business?

Without question, it's helped our business because the commercial open source model is so much more attractive to buyers than the traditional licence model in this climate. Not only does our software work out to be around 20 per cent of the cost of proprietary alternatives, but the ability for organisations to reuse and extend our software without having to pay royalties means that they can control their costs much better over time.

Besides, the cost or value argument is also the benefit argument. Open source software is advancing fast all the time thanks to the contributions from our massive, loyal community. There is almost no way a traditional, closed vendor could adapt their software fast and comprehensively enough to conform to all the changes in the big data ecosystem without making it prohibitively expensive.

What's Pentaho planning to focus on in the next year or so in terms of improving product and services?

We will continue to develop more and better support for big data analytics, and also invest further in the future of analytics, which may include things like predictive analytics. We recently announced Pentaho Labs, which is an initiative to experiment with the most cutting-edge technology in the realm of analytics, and we look forward to sharing some very exciting stuff with you!

Where do you think the whole big data industry is heading? Do you think big data will eventually become so mainstream that it will just start to become 'data'?

Although relational data won't be disappearing any time soon, there is absolutely no doubt in my mind that most companies will eventually incorporate big data to the extent that it is just referred to as data. In fact, I think that we will look back on the time when we used the term 'big data' with some nostalgia!

In the nearer term, I think we are going to start to see some really interesting applications for predictive analytics. Big data is actually driving this: predictive analytics, as an idea, has been around for about 20 years, but it wasn't until it became possible to analyse the massive amounts of data in the machine and online worlds that making predictions with data became practical.

I am really excited about what the future holds and I am sure Pentaho is going to continue to go from strength to strength in this brave new world!