The consistent increase in frequency and severity of data breach incidents, coupled with the introduction of data privacy regulations such as GDPR and CCPA (recently amended by the CPRA), is encouraging organizations to revisit their privacy operations and how they handle their consumers’ personal information.
Data discovery is the process of reviewing databases and other data stores to identify personal information (PI) and determine whether it falls within the scope of the California Consumer Privacy Act (CCPA) or under a permitted business exemption. PI can exist in any number of places within an organization.
The quest for better handling, management, and protection of consumers’ personal information begins with fully understanding the concept of “CCPA Data Discovery” and following the step-wise process to ensure CCPA compliance.
How Data Discovery Contributes to Data Protection
By one widely cited estimate, the world held about 44 zettabytes of data in 2020, and the volume keeps increasing every year. However, the sheer volume of data isn't the primary concern here. Rather, a significant share of that data is unstructured, scattered across emails, spreadsheets, invoices, IoT devices, and rich media. IDC forecasts that 85% of data will be in an unstructured format by 2025.
Helps Design Security Controls
A robust data discovery mechanism gives organizations visibility into where data resides across structured and unstructured systems. It lets them classify that data, catalog it in a single repository, and tag it according to its regulatory status, sensitivity, and confidentiality. With this classification, organizations can assess the security posture of the data residing in their hyper-scale environments and set security measures accordingly.
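The discover, classify, catalog, and tag flow described above can be sketched in a few lines. This is a minimal illustration only: the `DETECTORS` table, the sensitivity labels, and the `build_catalog` function are hypothetical names invented for this example, and real discovery tools rely on far richer detection than two regular expressions.

```python
import re
from dataclasses import dataclass, field

# Hypothetical detectors: PI type -> (pattern, sensitivity label).
# Both the patterns and the labels are assumptions for illustration.
DETECTORS = {
    "email": (re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"), "confidential"),
    "ssn": (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "restricted"),
}


@dataclass
class CatalogEntry:
    source: str                               # where the data lives
    tags: dict = field(default_factory=dict)  # PI type -> sensitivity


def build_catalog(sources):
    """Scan each source's text and record which PI types it contains."""
    catalog = []
    for name, text in sources.items():
        entry = CatalogEntry(source=name)
        for pi_type, (pattern, sensitivity) in DETECTORS.items():
            if pattern.search(text):
                entry.tags[pi_type] = sensitivity
        catalog.append(entry)
    return catalog


sources = {
    "crm_notes.txt": "Contact jane@example.com, SSN 123-45-6789",
    "readme.txt": "No personal data here",
}
catalog = build_catalog(sources)
for entry in catalog:
    print(entry.source, entry.tags)
```

Keeping the tags in one catalog, rather than on each system, is what lets security controls be set per sensitivity level instead of per data store.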
Eliminates False Positives
Moreover, there's a lot of ambiguity when it comes to PI or sensitive information. Just about anything can be deemed PI: a name, email address, social security number, credit card number, consumer's location, biometric information, and so on. With traditional discovery practices, IT teams tend to get lost in the ambiguity that the definition of PI carries under the CCPA, which ultimately gives rise to false positives.
Consequently, IT teams spend 25% of their time and effort wading through false positive or false negative alerts, which drastically affects their productivity and their ability to take timely measures.
Smart data discovery systems help organizations save time by efficiently reducing ambiguity and resolving false positives using contextual analysis, artificial intelligence, and machine learning.
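As a rough illustration of how an extra validation step cuts false positives, the sketch below pairs a pattern match with the Luhn checksum used by payment card numbers. This is a deliberately simple stand-in for the contextual analysis and machine learning mentioned above, not how any particular product works; the names `CARD_CANDIDATE`, `luhn_valid`, and `find_card_numbers` are assumptions made for this example.

```python
import re

# Candidate pattern: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return len(digits) >= 13 and total % 10 == 0


def find_card_numbers(text: str) -> list:
    """Return digit-only card candidates that also pass the Luhn check."""
    hits = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


text = "Order 1234567890123456 paid with card 4111 1111 1111 1111."
print(find_card_numbers(text))
```

A pattern-only scan would flag both 16-digit strings in the sample text. The checksum discards the order number and keeps only the well-known test card number, so one fewer alert reaches the IT team.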
Higher accuracy in data discovery, in turn, helps organizations maintain a strong security posture and stay compliant with the CCPA.
The Role of Data Discovery in CCPA Compliance
Data discovery is how businesses collect data from different sources, analyze it, and link it to a consumer. This process allows the data to be properly discovered, cataloged, and protected to stay compliant with privacy regulations. The following are some of the ways data discovery helps organizations remain compliant with the CCPA.
Data Linking for CCPA Compliance
As per CCPA Section 1798.140(o)(1) (renumbered as Section 1798.140(v)(1) by the CPRA amendments), personal information is defined as information that identifies, relates to, describes, is reasonably capable of being associated with, or could reasonably be linked, directly or indirectly, with a particular consumer or household. Examples provided by the CCPA include:
- Identifiers such as a real name, alias, postal address, unique personal identifier, online identifier, Internet Protocol address, email address, account name, social security number, driver’s license number, passport number, or other similar identifiers.
- Any categories of personal information described in subdivision (e) of Section 1798.80 of California Civil Code (the California Breach law).
- Characteristics of protected classifications under California or federal law.
- Commercial information, including records of personal property, products or services purchased, obtained, or considered, or other purchasing or consuming histories or tendencies.
- Biometric information.
- Internet or other electronic network activity information, including, but not limited to, browsing history, search history, and information regarding a consumer’s interaction with an Internet Web site, application, or advertisement.
- Geolocation data.
- Audio, electronic, visual, thermal, olfactory, or similar information.
- Professional or employment-related information.
- Education information, defined as information that is not publicly available personally identifiable information as defined in the Family Educational Rights and Privacy Act.
- Inferences that are drawn from any of the information identified in this subdivision to create a profile about a consumer reflecting the consumer’s preferences, characteristics, psychological trends, predispositions, behavior, attitudes, intelligence, abilities, and aptitudes.
Thus, one of the most crucial parts of CCPA compliance is finding the personal information held within your systems and linking it to the consumers it belongs to. Data discovery helps an organization in this process by identifying data, classifying it, and then linking it to the owner of the data through effective data mapping. It can also help visualize the data sprawl by identity and identify compliance risks based on a subject’s residency, as per the CCPA.
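The linking step can be pictured as grouping records from different systems under one consumer identity. The sketch below is only an illustration under simplifying assumptions: it uses a normalized email address as the sole join key, and the record fields (`system`, `email`, `state`) and the function names `link_records` and `california_subjects` are hypothetical. Real data mapping typically has to reconcile several identifiers per person.

```python
from collections import defaultdict


def link_records(records):
    """Group records from different systems by a normalized email address."""
    linked = defaultdict(list)
    for rec in records:
        key = rec["email"].strip().lower()  # normalize the join key
        linked[key].append(rec)
    return dict(linked)


def california_subjects(linked):
    """List identities with at least one record showing California residency."""
    return sorted(
        key
        for key, recs in linked.items()
        if any(r.get("state") == "CA" for r in recs)
    )


records = [
    {"system": "crm", "email": "Jane@Example.com", "state": "CA"},
    {"system": "billing", "email": "jane@example.com "},
    {"system": "crm", "email": "bob@example.com", "state": "NY"},
]
linked = link_records(records)
for identity, recs in linked.items():
    print(identity, [r["system"] for r in recs])
print(california_subjects(linked))
```

Here the billing record carries no residency field of its own, but once it is linked to the same identity as the CRM record, it inherits the CCPA relevance of that consumer.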