Published on September 28, 2021 AUTHOR - Privacy Research Team
Companies that are not leveraging big data may face imminent extinction, suggests a survey by Accenture.
Data is an invaluable asset that allows organizations across the world to accelerate growth and foster innovation. But to examine that data and derive meaningful insights, it is crucial for teams to have seamless access to that precise data.
Here, data discovery plays an integral role in helping organizations discover the data, classify it, and catalog it. Apart from commercial purposes and gains, data discovery enables organizations to fix security issues, mitigate risks, and meet obligations under frameworks and regulations such as NIST, PCI, HIPAA, GDPR, and CCPA.
Data discovery helps organizations keep track of the personal or sensitive data they collect, how they collect it, whose information they store, how they assess data risks, who has access, and how they protect the data. Under certain regulatory obligations, organizations also need to maintain a record of processing activities (RoPA).
The RoPA enables regulatory authorities to assess the organization’s compliance with the policies. However, data discovery is challenging for organizations that deal with a massive volume of data.
According to an IDC survey, commissioned by Ermetic, 64% of CISOs and IT leaders agree that a lack of visibility into access management and processing activities mainly contributes to cloud security breaches.
Data Intelligence (DI) harnesses Artificial Intelligence (AI), Machine Learning (ML), and Natural Language Processing (NLP) to address data discovery challenges and provide detailed insights into the information that hyper-scale enterprises collect and process.
Data Intelligence equips organizations with automated tools that allow them to look through a variety of data, classify it, and catalog it under searchable labels or metadata. Enterprises can further use DI to interact with the information in a meaningful way, assess data risks and access controls, and meet security or privacy requirements.
At Securiti.ai, our Data Intelligence workflow takes the following approach:
Sign up for a Demo to check Securiti’s Sensitive Data Intelligence in action.
Enterprises require effective Data Intelligence solutions when:
The digital landscape is experiencing a flood of data produced at massive scale. This has given rise to data lakes, which give enterprises an economical means to store and mobilize data at scale. As a result, the data lake market is forecast to grow to $17.60 billion by 2026.
Data scientists and analysts require access to data lakes to run big data analytics and translate them into actionable and meaningful insights. But to successfully do that, they need to know where the required data is in that massive data lake.
Enterprises are migrating to the cloud to accommodate their growing volumes of data or to make the most of the technologies that different cloud service providers (CSPs) offer. First, enterprises need to assess which data can be transferred to the cloud and which data must stay where it is, since security and privacy regulations vary for local and international data transfer and storage. Second, once the data is in the cloud, enterprises need to keep track of all their cloud data assets, the data in those assets, and the access controls on them.
Structured data follows a predefined schema, such as the rows and columns of a database table, so it can be queried and analyzed directly. Unstructured data is a heterogeneous collection of raw content, such as documents, emails, and images, that requires further processing before it can be analyzed.
Experts estimate that 80% to 90% of the data in companies is unstructured. Processed manually, it would take hundreds of hours of human labor to plow through that data.
Data mapping is integral because it allows enterprises not only to ensure data governance but also to meet privacy regulations. For example, the GDPR requires enterprises to maintain a RoPA to demonstrate compliance.
Since the EU’s General Data Protection Regulation (GDPR) took effect, organizations have been required to honor data subject requests (DSRs). The GDPR gives data subjects better access to, visibility into, and control over their data.
But the challenge that most organizations face while honoring DSRs is the lack of visibility into the data they hold, access control of the data, and the type of data that falls under privacy obligations.
Automation is the keyword in Data Intelligence as it delivers speed and efficiency.
To get started, organizations first need to discover the data assets and data across multi-cloud platforms, data lakes, and data warehouses. It should also include the discovery of shadow data assets that organizations have on legacy systems. Configuration management databases (CMDBs) also need to be scanned continuously as more data assets are added to the framework over time.
After asset discovery, it is important to discover the structured, semi-structured (Avro, Parquet, etc.), and unstructured data in that sea of data assets. The automated data discovery system should integrate a high-efficacy data detection system. The system must be effective enough to discover and classify the personal and sensitive data attributes that need to be handled in line with regulations such as GDPR and CCPA. The detected elements then need to be tagged with policy-based, security, and privacy labels.
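As a rough illustration of attribute detection and labeling, the sketch below scans free text with two regular expressions and maps each hit to labels. The pattern set, the label names, and the functions `classify` and `labels_for` are hypothetical and chosen only for this example; a production detector would typically add validation, context checks, and ML or NLP models on top of patterns.

```python
import re

# Hypothetical detection patterns, for illustration only.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

# Hypothetical mapping from detected attribute type to labels.
LABELS = {
    "email": {"personal"},
    "us_ssn": {"personal", "sensitive"},
}


def classify(text):
    """Return a dict of attribute type -> list of matching substrings."""
    found = {}
    for name, pattern in PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            found[name] = matches
    return found


def labels_for(text):
    """Return the union of labels for every attribute type found in the text."""
    labels = set()
    for name in classify(text):
        labels |= LABELS[name]
    return labels
```

Calling `labels_for` on a document yields the set of labels that the catalog can later filter and sort on.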
Now, bring all those discovered data assets and data into a single repository. The repository is where the organization can sort data by its sensitivity labels or content profile. The administrators then need to catalog the security controls associated with each data asset.
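A minimal sketch of such a repository might look like the following. The `CatalogEntry` fields, the sensitivity ranking, and the helper names are assumptions made for illustration, not a description of any particular product's schema.

```python
from dataclasses import dataclass, field

# Hypothetical ranking of sensitivity labels, lowest to highest.
SENSITIVITY_ORDER = {"public": 0, "internal": 1, "personal": 2, "sensitive": 3}


@dataclass
class CatalogEntry:
    asset: str                      # name of the data asset
    location: str                   # where the asset lives
    sensitivity: str                # one of the keys in SENSITIVITY_ORDER
    security_controls: list = field(default_factory=list)


def sort_by_sensitivity(entries):
    """Return entries ordered from most to least sensitive."""
    return sorted(entries, key=lambda e: SENSITIVITY_ORDER[e.sensitivity], reverse=True)


def missing_control(entries, control):
    """Return the names of assets that do not list the given security control."""
    return [e.asset for e in entries if control not in e.security_controls]
```

Keeping the security controls on the same record as the sensitivity label makes it easy to ask questions such as "which sensitive assets lack encryption?" in a single pass.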
The next requirement is to link the data to specific data owners and identities. The discovered structured and unstructured PI needs to be mapped to the users it belongs to. Data mapping plays an important role in fulfilling data subject rights (DSR) requests and complying with breach notification policies.
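One simple way to picture this mapping is an index from identity to asset to attribute. The sketch below assumes the discovery step has already produced `(identity, asset, attribute)` findings; that tuple shape and both function names are hypothetical.

```python
from collections import defaultdict


def build_identity_map(findings):
    """Index findings as identity -> asset -> set of attributes.

    `findings` is an iterable of (identity, asset, attribute) tuples.
    """
    index = defaultdict(lambda: defaultdict(set))
    for identity, asset, attribute in findings:
        index[identity][asset].add(attribute)
    return index


def locate_subject_data(index, identity):
    """Return where one person's data lives, as asset -> sorted attribute list."""
    return {asset: sorted(attrs) for asset, attrs in index.get(identity, {}).items()}
```

With an index like this, answering an access request or scoping a breach notification becomes a lookup rather than a fresh scan of every asset.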
Enterprises can mitigate and remediate risks effectively when they know the inherent risk that each data set carries. To determine the inherent risk, enterprises need to analyze data sensitivity, location, and residency, along with other indicators of risk (IoR), such as data transferred across borders, copies of data, collection of new data, etc.
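To make the idea concrete, a toy risk score could add a weight for the sensitivity label to a weight for each indicator of risk. The weights, the indicator names, and the additive formula below are all assumptions for illustration; real scoring models are usually richer and tuned to the organization.

```python
# Hypothetical weights, chosen only to illustrate the idea.
SENSITIVITY_WEIGHT = {"public": 0, "internal": 1, "personal": 3, "sensitive": 5}
IOR_WEIGHT = {
    "cross_border_transfer": 3,
    "duplicate_copies": 1,
    "new_collection": 2,
}


def inherent_risk(sensitivity, indicators):
    """Return sensitivity weight plus the sum of weights of the given indicators."""
    return SENSITIVITY_WEIGHT[sensitivity] + sum(IOR_WEIGHT[i] for i in indicators)


def rank_by_risk(datasets):
    """Sort (name, sensitivity, indicators) tuples from highest to lowest risk."""
    return sorted(datasets, key=lambda d: inherent_risk(d[1], d[2]), reverse=True)
```

Ranking data sets by a score like this gives remediation teams an ordered worklist instead of a flat inventory.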
The next step is to identify the security posture of your data assets. Scan for security misconfigurations associated with your data assets. A security posture assessment lets enterprises enforce best practices when configuring their data assets, ensure compliance with standards and regulations (PCI DSS, HIPAA, GDPR, etc.), and apply each data system's native security best practices.
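A misconfiguration scan can be thought of as comparing each asset's settings against a baseline. The sketch below uses a hypothetical three-setting baseline and hypothetical setting names; it treats a missing setting as a finding, which is a design assumption rather than a rule from any standard.

```python
# Hypothetical baseline of expected settings for a data asset.
BASELINE = {
    "encryption_at_rest": True,
    "public_access": False,
    "access_logging": True,
}


def find_misconfigurations(asset_config):
    """Return the sorted names of settings that differ from the baseline.

    A setting absent from `asset_config` counts as a misconfiguration.
    """
    return sorted(
        key for key, expected in BASELINE.items()
        if asset_config.get(key) != expected
    )
```

Running the check on every asset in the catalog, and rerunning it as configurations change, keeps the posture view current.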
Finally, enterprises can map access controls to the applicable security and privacy regulatory frameworks. This enables them to produce audit and evidence reports demonstrating compliance with those regulations.
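As a simplified picture of an evidence report, the sketch below checks each access grant against a policy that says which roles may read data with a given label. The policy contents, role names, and the `(user, role, asset, label)` grant shape are hypothetical; an actual mapping would be derived from the specific frameworks that apply to the organization.

```python
# Hypothetical policy: roles allowed to read data carrying each label.
ALLOWED_ROLES = {
    "sensitive": {"privacy_officer"},
    "personal": {"privacy_officer", "support"},
}


def audit_access(grants):
    """Return one evidence row per grant.

    `grants` is a list of (user, role, asset, label) tuples. Labels with no
    policy entry are treated as unrestricted.
    """
    report = []
    for user, role, asset, label in grants:
        allowed = ALLOWED_ROLES.get(label)
        compliant = allowed is None or role in allowed
        report.append(
            {"user": user, "asset": asset, "label": label, "compliant": compliant}
        )
    return report


def violations(report):
    """Return only the rows that breach the policy."""
    return [row for row in report if not row["compliant"]]
```

The full report serves as evidence for auditors, while the violations list points to the grants that need to be revoked or justified.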
Check out our webinar to get more insights into Data Intelligence, its importance, and its application.