Knowing what knowledge to share in data science

Data science findings, like laws and sausages, are something people generally enjoy the results of without paying much attention to how they are made or who makes them. Many businesses may be tempted to ignore the processes behind the snazzy insights that data scientists provide.

However, this disregards an untapped resource: data scientists are often delighted to share their methodologies with others. The only realistic limits on sharing are restrictions around IP and patents.

I believe that to fully realise the potential that data science unlocks for a business, a relationship of collaboration and mutual sharing of knowledge should be adopted.

It is much more powerful to share knowledge throughout a business at key stages of a project than simply to hand over results. Whether you’re a third-party data science provider or an in-house team, sharing knowledge with a business is one of the best ways to strengthen a relationship and maximise the value of data science.

Not only does knowledge transfer build confidence in the integrity of your findings, it also opens the door to widening the remit of data science. It equips decision makers with the knowledge required to realise that data science expertise could be applied to other areas of the business. It also helps increase the longevity of a relationship by ensuring that there are more stakeholders within the client who understand how, why and what you do. This in turn promotes ‘institutional knowledge’, the creation of a reservoir of good faith and expertise that does not dry up when a few people leave the company.

In an ideal world, knowledge sharing creates a virtuous circle where key players within an organisation understand the significance of data science. This encourages them to provide data scientists with more information and requests relating to other departments, thereby maximising the value of data science to the organisation.

What should knowledge sharing entail? There’s a big difference between, on the one hand, presenting a finding and explaining via a few PowerPoint slides how you reached your conclusions and, on the other, building a comprehensive programme of training and collaboration. An ongoing conversation should be instigated between staff to help them understand how various aspects of data science models were created. To give the knowledge sharing greater longevity, clear documentation should be created that charts the entire process and results. This will help to create the institutional knowledge I mentioned above. Of course, it is to be expected that the knowledge sharing will be limited to a degree by proprietary techniques, IP and patents.

Data science is a relatively new discipline. Many people struggle to understand the leap from the world of web and mobile analytics to data science with its machine learning, probability models and pattern recognition.

The first step to creating a quality knowledge sharing system is identifying the stakeholders within the company who are best placed to disseminate what they learn to their colleagues and provide useful feedback. Identifying these stakeholders will be integral to promoting data science and its abilities to other departments. Second, the conversation between organisations and data scientists needs to impart knowledge that is understandable and easy to digest. It needs to relay the most important information in a succinct manner. This can be facilitated by working out the key development points and the best way to convey them. It’s fairly futile to put the time and effort into creating a knowledge sharing process if the only result is that you confound the parties involved.

Sharing knowledge shouldn’t be a nice added extra to the data science process. It should be intrinsic to how data scientists work with a business, especially given the chronic shortfall of skills within businesses. A recent Teradata survey found that UK graduates lack skills for data analysis and that 60 per cent of businesses have difficulty sourcing skills for big data projects.

If you’re a business leader, it can be tempting to ‘black box’ data science and simply collect the results. However, it’s much better to treat the entire process as a collaboration. If your data science team or consultancy is unwilling to go along with this, you should question where their priorities in your relationship lie.

In contrast, if you’re a data scientist and the wider business shows no interest in learning more about data science, it may be an indication that they don’t understand its full value. The difference it makes to a business to see the overall potential of data science is immense: it is the difference between seeing data science in only one context and seeing its many possibilities. It is incumbent on the data science team or consultancy to ensure that decision makers in the business are aware that mutual collaboration has a crucial part to play.

It’s easy to get into a comfortable routine of providing a service and focusing on the results rather than the process. The truth is, no matter how mature a relationship with a client is, or how entrenched an existing internal system, it’s never too late to change how it all works. You don’t employ experts simply for what they produce; you do so because they can share their wisdom and skills with the entire business.

Building an ongoing conversation between data scientists and organisations has no downsides and should be something every data scientist or business leader considers. We have much to learn from each other – and we learn better together.

Anthony Mullen, head of research and development at data science consultancy Profusion