
Tips for tackling dark data on shared drives


With the pandemic pushing life further online and remote work becoming the norm, companies face growing risks and challenges with “dark data” – data collected and stored but never used. Much of this “dark data” hides in documents, spreadsheets and content scattered across the enterprise on shared drives, Microsoft 365 and SharePoint. 

According to ASG’s recent survey report, What’s Slowing Modernization? Barriers Hindering Enterprise IT Systems and Content Management, the top content management challenge is employees saving content on shared or personal drives (41 percent), a practice that occurs at 58 percent of respondents’ organizations. This way of working slowly builds a seemingly bottomless lake of mismanaged, ungoverned data. Although many companies have had initial success in managing structured data, unstructured data remains a dangerous black hole for enterprises. If this information is not properly managed and governed, the organization risks non-compliance with privacy regulations and becomes vulnerable to steep fines.

As the explosive growth of records and “data hoarding” continues in the new normal, organizations need to rethink how they find, classify and manage critical, personal and sensitive information across the enterprise. Here are some tips for organizations seeking to track down unstructured “dark data” living in collaborative spaces. 

1. Prioritize “dark data” discovery in your governance practice  

Dark data is becoming an increasingly pressing issue in the enterprise – with IDG predicting that 93 percent of all data will be dark and unstructured by 2022. As organizations wake up to the scale of the problem, it is essential that they make it a priority to understand both the structured and unstructured data that they house within their vast stores of documents and records.

To start, an organization must catalog and map its entire data ecosystem, not just its structured data sources. Taking this first step enables organizations to classify their data and understand what is useful, what is redundant and what must be governed. Embracing content and metadata management solutions that provide discovery and classification at scale, using machine learning and AI, shines a light on “dark data.” These solutions allow enterprises to automate cumbersome, time-consuming manual discovery processes and locate the potentially valuable and risky data scattered across the enterprise.
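As a rough illustration of what a first cataloging pass could look like, the Python sketch below walks a directory tree, records basic metadata for each file and flags stale files and exact duplicates. It is a minimal sketch rather than a description of any vendor's product: the function names, the record fields and the one-year staleness threshold are all assumptions made for the example, and it reads each file fully into memory, which would not suit very large files.

```python
import hashlib
import os
import time
from collections import defaultdict


def catalog_files(root, stale_after_days=365, now=None):
    """Walk `root` and return one metadata record per file.

    `stale_after_days` is an illustrative threshold, not a recommendation.
    """
    now = time.time() if now is None else now
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.stat(path)
            # Hash the content so exact duplicates can be grouped later.
            with open(path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            records.append({
                "path": path,
                "size_bytes": info.st_size,
                "sha256": digest,
                "stale": (now - info.st_mtime) > stale_after_days * 86400,
            })
    return records


def find_duplicates(records):
    """Group paths by content hash; keep only hashes seen more than once."""
    by_hash = defaultdict(list)
    for record in records:
        by_hash[record["sha256"]].append(record["path"])
    return {h: sorted(paths) for h, paths in by_hash.items() if len(paths) > 1}
```

Calling `catalog_files` on a mounted shared drive and passing the result to `find_duplicates` gives a first view of what is redundant. Deciding what is useful or must be governed still requires classification on top of this inventory.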

Today’s content and metadata management technologies enable organizations to automate and streamline governance practices. The barriers to understanding and governing “dark data” continue to fall, allowing organizations to improve how they use and govern their data.

2. Extend your information governance to shared drives, SharePoint and Microsoft 365  

Remote work is here to stay, and so too is the use of cloud-based collaboration tools and platforms such as Microsoft 365, SharePoint and Teams. Organizations need to balance the productivity employees gain from these tools against the risks associated with sensitive information being created and shared in these less secure environments.

Organizations need to extend their governance strategies to shine a light on the “dark data” created and shared by employees, partners and even customers using these collaboration tools and platforms. Unintentional use of social security numbers, account numbers, phone numbers, pictures and images puts organizations at tremendous brand and financial risk as privacy regulations expand. As more states and countries pass data privacy regulations that impose sometimes complex constraints on enterprises, extending governance to these tools and platforms is now a must.
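To make the detection problem concrete, here is a small Python sketch that scans text for two of the identifiers mentioned above. It is illustrative only: the `PII_PATTERNS` table and the `find_pii` function are hypothetical names, the patterns assume U.S.-style formatting for social security and phone numbers, and account numbers are left out because their formats vary by institution. Real classification tools go well beyond simple pattern matching.

```python
import re

# Assumed U.S.-style formats; real data will contain many more variants.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def find_pii(text):
    """Return a dict mapping each pattern label to the matches found in `text`."""
    hits = {}
    for label, pattern in PII_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            hits[label] = matches
    return hits


if __name__ == "__main__":
    print(find_pii("SSN 123-45-6789, call 555-867-5309."))
```

Running a check like this over text extracted from documents in a collaboration workspace would flag candidates for review. Pattern matching alone produces false positives, so flagged items are a starting point for review, not a verdict.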

With today’s content and metadata management technologies, organizations can extend governance to encompass privacy considerations, automating processes and reporting while minimizing negative impacts on end-user productivity. Remote work and the use of cloud-based collaboration tools and platforms can continue to expand while “dark data” in Microsoft 365, collaborative workspaces, shared drives and more becomes better understood and governed.

3. Automate redaction and disposition policies 

While discovering untapped data across the enterprise can be beneficial to organizations in terms of both ROI and business growth, securing and protecting sensitive and personally identifiable information also needs to be a priority for businesses in today’s increasingly regulated world. 

A key aspect of data privacy regulations, such as the GDPR and CCPA, is the “Right to be Forgotten.” According to a recent poll, nearly nine in 10 U.S. voters want the right to be forgotten on the Internet. As more and more personal information is collected and stored and consumers take more ownership of their data, organizations need to ensure “dark data” isn’t undermining their compliance with these data privacy regulations. Companies need to tackle the right to be forgotten, and advanced retention and redaction capabilities are critical not only to honoring customers’ right to be forgotten, but also to overall regulatory compliance.

By adopting a content services solution, organizations can automate processes such as migration, legal holds, redaction and dispositions based on business rules and user roles to prevent privacy issues. These solutions include predefined personal and sensitive content sets covering names, addresses, telephone numbers, gender, marital status and social security numbers. They can also enable regular auditing and pruning of an organization’s databases, data lakes and unstructured data sources, cataloging the data that needs to be held for legal reasons or business value and archiving the rest.
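The rule-based approach described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how any particular content services product works: the `Document` fields, the `disposition` rules, the seven-year retention window and the `[REDACTED]` mask are all hypothetical, and the redaction covers only U.S.-style social security numbers.

```python
import re
from dataclasses import dataclass
from datetime import date, timedelta

# Assumed U.S.-style social security number format.
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact(text, mask="[REDACTED]"):
    """Replace every SSN-like string in `text` with `mask`."""
    return SSN.sub(mask, text)


@dataclass
class Document:
    name: str
    last_modified: date
    legal_hold: bool = False
    business_value: bool = False


def disposition(doc, today, retention_days=7 * 365):
    """Decide what to do with `doc` under illustrative business rules."""
    if doc.legal_hold:
        return "hold"      # must be kept for legal reasons
    if doc.business_value:
        return "retain"    # kept for its business value
    if today - doc.last_modified > timedelta(days=retention_days):
        return "archive"   # past the assumed retention window
    return "retain"
```

For example, `redact("SSN 123-45-6789 on file")` yields `"SSN [REDACTED] on file"`, and a document last modified in 2010 with no legal hold or business value would be marked `"archive"` when evaluated in 2021. The order of the checks matters: a legal hold overrides every other rule.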

Organizations must be sure not to let the very innovations that sustained their business in 2020 be their downfall in 2021. Collaboration tools and shared drives enabled the rapid shift to remote work, but it’s now time to address the risk they introduced, especially when it comes to information management. Not only does “dark data” increase risk around data privacy regulations and governance, but it’s also leaving value on the table. By identifying, governing and leveraging dark data, organizations can make better use of all their assets – turning what was at one time a liability into a truly competitive edge.

Kyle McNabb, SVP of Product Marketing, ASG Technologies

Kyle McNabb, SVP of Product Marketing at ASG Technologies, has over 25 years of experience in customer-focused product and service innovation lifecycles, developing compelling content and implementing growth strategies.