
The three leagues of data literacy – and how to play to win

(Image credit: Shutterstock/alexskopje)

As organizations ranging from baseball teams to tech start-ups continue to gather reams of data, the conversation is shifting from terabytes to zettabytes. Yet collecting all of this data is not the same as using it. While the sabermetrics revolution that swept through baseball forever changed the game, we haven’t yet seen an equivalent movement in business. In fact, companies leave a staggering 88% of data they collect on the cutting-room floor. 

It’s true that there’s been some progress in the business world. For starters, data scientists (the three-time consecutive winners of jobs site Glassdoor’s “Best Job in America” honor) are recognized as star players at organizations of all sizes. Someone has to make sense out of all that data, right? Yet the demand for data expertise is far outpacing the supply. IBM predicts data science job demand will soar 28% by 2020. Compounding the problem: the high demand for data science and analytics professionals leads to shorter tenures, with almost one in four changing jobs each year.

Therefore, as a business leader, it’s time to recognize that exclusively depending on data scientists is not only impractical, but outright irresponsible. In the same way a manager of a sports team doesn’t need to play the game but does need to know the game, the people making critical business decisions don’t need to be able to code in Python, but they do need to be “data literate.”

What does data literacy mean?  Broadly, it’s the ability to read, create, interpret, and communicate data as meaningful information. There are, in fact, three leagues of data literacy that drive business impact, and you need to be able to compete in each of them. 

The first league of data literacy is lexical. Just like baseball or any other specialized field, data science has its own vocabulary. Of course, to enjoy a ball game, you don’t need to know the difference between RBI (runs batted in) and ERA (earned run average), but you do need to know something about the roles of batters and pitchers. Similarly, while not every business leader needs to become an expert in the difference between data science terms like SGD (Stochastic Gradient Descent) and GANs (Generative Adversarial Networks), they all need to start learning the vernacular to participate in data discussions.

As a non-data scientist who is the president of a data science training company, I can share some of the key concepts I’ve found most valuable to understand as we’ve grown and developed our training methods:   

  • Core statistics and probability concepts, such as mean, variance, fit, error, confidence intervals, sample size, and statistical significance 
  • Model training, feature engineering, and overfitting
  • Structured vs. unstructured data
  • The difference between machine learning, deep learning, and artificial intelligence 
  • Common algorithms (like linear regression, classification, clustering, and sequence prediction) and their purposes 
  • The difference between data science and data engineering
  • The separate phases of data science (i.e., data collection, data preparation, data modeling, data interpretation, data visualization).
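For readers who want to see a few of the statistics terms above in action, here is a minimal Python sketch. The numbers are made up for illustration (a hypothetical ten days of order counts), and the confidence interval uses a simple normal approximation, which is an assumption on my part; with a sample this small, a statistician would likely use a slightly wider t-based interval.

```python
import math
import statistics

# Hypothetical sample: daily order counts over ten days (invented numbers).
daily_orders = [102, 98, 110, 95, 105, 99, 101, 108, 97, 105]

sample_size = len(daily_orders)
mean = statistics.mean(daily_orders)

# Sample variance: average squared distance from the mean, dividing by n - 1.
variance = statistics.variance(daily_orders)

# Standard error: how much the sample mean itself is expected to wobble.
std_error = math.sqrt(variance / sample_size)

# Approximate 95% confidence interval using z = 1.96 (normal approximation,
# an assumption chosen for simplicity, not a recommendation).
margin = 1.96 * std_error
ci_low, ci_high = mean - margin, mean + margin

print(f"sample size: {sample_size}")
print(f"mean: {mean}")
print(f"variance: {variance:.2f}")
print(f"approx. 95% CI: ({ci_low:.1f}, {ci_high:.1f})")
```

The point is not the arithmetic but the vocabulary: a larger sample size shrinks the standard error, which narrows the confidence interval around the mean.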

The second league of data literacy is cultural. The prominent data scientist Carl Anderson argues that becoming a truly data-driven organization “requires establishing an effective, deeply-ingrained data culture.” This is no easy task, but it’s never too late to start working toward it. Cultural data literacy should be adopted at all levels of an organization, especially by leaders who interact with data scientists and analysts. You can recognize a culture of data literacy as one that cultivates the following actions:

  • Understanding how data is collected and used and recognizing that data science is only as good as the data quality 
  • Moving beyond reports, dashboards, and alerts to actions and recommendations 
  • Embracing testing, experimentation, and constant iteration 
  • Sharing rather than hoarding data 
  • Prioritizing inquisitiveness, curiosity, and goal orientation in the hiring process
  • Designing jobs so that data scientists can excel at what they do best. 

As every good team knows, strong leadership can mean the difference between playing in the bush league and the big league. It’s up to you to instill this culture from the top down.

Finally, the third league of data literacy is strategic. As a decision-maker, you must know what problems to consider, what questions to ask, and especially what assumptions to challenge.    

As Florian Zettelmeyer, a professor at the Kellogg School of Management, says, “The most important skills in analytics are not technical skills. They’re thinking skills.” Here’s a fact that I think encapsulates the technical skills vs. thinking skills issue he’s talking about: although there are more than 100 machine learning algorithms to consider based on data size, structure, complexity, processing speed, and other factors, each one is worthless if used in response to the wrong problem. This has hugely significant real-world implications. In fact, data quality problems cost the United States $3.1 trillion per year, according to IBM.

The plain fact is that strategic data literacy drastically improves your team’s chances of hitting home runs. Consider Coca-Cola’s decision to launch New Coke based on 200,000 instances of blind taste-test data. Unfortunately, the company only asked respondents if they liked the new taste, not if they would be willing to switch. Another example: the “Big Three” credit bureaus, Equifax, Experian, and TransUnion, collect 4.5 billion pieces of data each month. In what’s now seen as a world-famous strikeout, Equifax failed to ask the questions that could have prevented a data security breach costing the company almost $10 billion in market value.

So take a lesson that the sports world grasped years ago, and the next time that your data geeks start debating the latest models in your field, don’t tune out. Whether you fully understand the conversation or are still in the data science dugout, taking the first steps toward playing in the three leagues of data literacy will help you and your company get set up for a grand slam.

Jason Moss, President and Co-Founder of Metis 


Jason Moss
Jason Moss is President and Co-Founder of Metis, a data science training provider offering full-time immersive bootcamps, evening part-time professional development courses, corporate programs, and online courses and resources.