ROT Data Minimization : Reducing Data Attack Surface & Cost

Published December 11, 2023 / Updated December 20, 2023

Data breeds growth and, most importantly, innovation. How does Generative AI (GenAI) work its wonders? Through data, the crown jewel that lets machines speak human language, only smarter.

But when left unchecked, data can pose significant risks to organizations. Studies estimate that many large enterprises spend as much as $34 million on redundant, obsolete, and trivial (ROT) data that could safely be deleted.

Useless business data doesn’t only cost organizations money. It also exposes them to operational inefficiency, unauthorized access, data breaches, and legal penalties, particularly when data is retained for too long.

Organizations are now shifting from a “collect it all” to a more proactive approach, i.e., Data Minimization, to address this challenge. The approach involves more than just reducing ROT data.

Continue reading to learn more about ROT data and how data reduction helps eliminate ROT to reduce attack surface and cost.

Why Should ROT Be a Concern for Businesses?

Reducing ROT data is crucial for businesses aiming to bolster their data security posture and streamline data management practices.

The acronym ROT stands for Redundant, Obsolete, and Trivial data. ROT is any piece of information that organizations continue to retain even after it has served its purpose or when it has no operational or legal value. To put things into perspective, it is estimated that over one-third of the data that organizations store is either Dark or ROT.

Dark data may be useful to an organization but hasn’t been used to its full potential for any decision-making or operational task. It is called Dark Data because the organization doesn’t know what information exists in its environment.

As for ROT, it refers to:

Redundant Data: The term may suggest data kept for backup or recovery purposes. However, such copies are useful and therefore not redundant in this sense. The R in ROT refers to the unnecessary or purposeless duplication of data. Redundancy is one of the primary causes of ROT accumulating in an organization’s environment.

One example of redundant data is an employee handbook that may exist in different directories across a corporate server.

Obsolete Data: Organizations accumulate data over time and retain even older data for later use, such as compliance or audit. However, most data tends to lose its value over time. Hence, data that is no longer required or that is replaced with an updated version is referred to as obsolete data.

Imagine a manufacturing company that has stored designs for products it no longer produces. Deleting those outdated designs frees up storage space.

Trivial Data: As the name suggests, any piece of information that has no value to the business is called trivial data. For instance, notes from brainstorming or casual ideation sessions that are no longer required can be deemed trivial.
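The redundant-data example (one handbook stored in several directories) can be sketched in a few lines of Python. This is only an illustration: it assumes exact, byte-for-byte copies, and the `find_duplicates` function and the sample paths are hypothetical rather than taken from any product.

```python
import hashlib
from collections import defaultdict


def find_duplicates(files: dict[str, bytes]) -> list[list[str]]:
    """Group paths whose contents are byte-for-byte identical.

    `files` maps a path to its content. In practice the bytes would be read
    from disk or object storage; that part is omitted here.
    """
    by_digest: dict[str, list[str]] = defaultdict(list)
    for path, content in files.items():
        by_digest[hashlib.sha256(content).hexdigest()].append(path)
    # Only digests shared by two or more paths indicate redundancy.
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]


# Hypothetical sample: the same handbook saved in three directories.
sample = {
    "/hr/handbook.pdf": b"Employee Handbook v3",
    "/shared/onboarding/handbook.pdf": b"Employee Handbook v3",
    "/archive/2021/handbook_copy.pdf": b"Employee Handbook v3",
    "/hr/leave_policy.pdf": b"Leave Policy v1",
}
duplicate_groups = find_duplicates(sample)
```

Hashing catches only identical copies. A handbook that was lightly edited in one directory would need a similarity-based comparison instead.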

Understanding the Risks of ROT Data Retention

ROT data doesn’t accumulate in an organization’s environment overnight. It builds up in databases, on-premises servers, or cloud storage over time because of the assumption that the “data might be beneficial in the future.” However, it is imperative to understand that retaining ROT is detrimental to businesses in numerous ways.

  • Security risks: When organizations amass unnecessary data, they open backdoors for threat actors. ROT data is usually kept on unmonitored and unsecured servers because it is considered unimportant. But if this data contains sensitive information, such as outdated ex-employee credentials, it may lead to serious consequences, such as unauthorized access.

Depending on the organization, the direct cost of a breach may be minimal, but the cost of a tarnished market reputation and deteriorated customer trust may be significant.

  • Regulatory risks: Studies reveal that over 75% of records containing personally identifiable information (PII) are over-retained. Under the legal lens, the retention period refers to a specific amount of time businesses are allowed to retain data. Once the retention period expires, the data must be disposed of or anonymized.

Data retention is one of the core provisions among numerous data privacy laws across the globe. Take, for instance, the European Union’s General Data Protection Regulation (GDPR), which requires organizations to delete personal data that has served its purpose and is no longer required. Similar data retention provisions can be found in US privacy laws, such as the California Privacy Rights Act (CPRA).

Similarly, the Health Insurance Portability and Accountability Act (HIPAA) also sets retention requirements. HIPAA requires covered entities to retain required compliance documentation, such as policies and procedures, for at least six years.

  • Operational risks: When organizations have high volumes of data sitting in their environment, it becomes significantly challenging for them to discover, identify, and analyze it. More data also means that it would require more time and resources to process and analyze it.
  • Cost risks: Organizations spend as much as $34 million on keeping unnecessary data. This may include storage, management, and data protection costs, which significantly burden organizations’ budgets and resources. For instance, organizations require storage space to store data, be it on-premises or cloud storage. Hence, as the volume of data increases, so do the storage expenses.

ROT also drives up data security costs. According to a global study, organizations are forecasted to spend a whopping £1.30 trillion on cybersecurity by 2025. Data protection requires using numerous security tools, such as DSPM for data security, CSPM for cloud security, etc. When organizations keep unnecessary data, unnecessary cost also goes into protecting and managing it.

These risks underscore the significance of leveraging a robust data minimization strategy. By actively minimizing unnecessary data, organizations can proactively reduce the attack surface, streamline their compliance efforts, and focus their resources and costs on managing only the required data.
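The retention-period rule described under regulatory risks (once the period expires, the data must be disposed of or anonymized) can be illustrated with a small check. This is a sketch under stated assumptions: the `RETENTION_DAYS` schedule, the record categories, and the `Record` fields are all hypothetical, and real retention periods depend on the applicable law and internal policy.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical retention schedule in days per record category.
# Real periods depend on the applicable law and internal policy.
RETENTION_DAYS = {"job_applicant": 365, "customer_invoice": 7 * 365}


@dataclass
class Record:
    record_id: str
    category: str
    purpose_fulfilled_on: date  # date the data finished serving its purpose


def is_over_retained(record: Record, today: date) -> bool:
    """Return True when the record has outlived its retention period."""
    limit = RETENTION_DAYS.get(record.category)
    if limit is None:
        # Unknown category: no rule to apply, so do not flag automatically.
        return False
    return today > record.purpose_fulfilled_on + timedelta(days=limit)


records = [
    Record("r1", "job_applicant", date(2021, 3, 1)),
    Record("r2", "customer_invoice", date(2021, 3, 1)),
    Record("r3", "job_applicant", date(2023, 11, 1)),
]
today = date(2023, 12, 11)
over_retained = [r.record_id for r in records if is_over_retained(r, today)]
```

Records in unknown categories are deliberately left unflagged here so that a missing rule never triggers deletion. A stricter policy might route them to manual review instead.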

ROT Data Minimization - A Critical Component of Compliance & Data Security Posture Management

Security professionals face many challenges in cybersecurity. It all starts with the lack of understanding of what data exists in the environment and where it is. When organizations indiscriminately accumulate high volumes of data, it is difficult for security teams to discover, manage, and protect the data.

As data exists in large volumes and different formats, it increases threat surfaces, opening backdoors to malware attacks, ransomware attacks, insider threats, unauthorized access, and data breaches. This situation can best be handled with a data minimization strategy.

In its simplest definition, data minimization restricts data collection, storage, analysis, and processing to what is necessary or useful. Data that isn’t useful or no longer serves any purpose should be discarded. However, distinctions in some aspects may arise when the term is explored under the separate lenses of security and privacy.

Data Minimization & Data Privacy

The Data Minimization requirement is defined under most of the data privacy laws across the globe. Take, for instance, the European Union’s General Data Protection Regulation (EU GDPR).

Under Article 5 of the regulation, organizations must adhere to six principles for processing personal data. Among those six, two provisions explicitly talk about minimization and retention.

Personal data shall be:

  • adequate, relevant and limited to what is necessary in relation to the purposes for which they are processed (‘data minimisation’);
  • kept in a form which permits identification of data subjects for no longer than is necessary for the purposes for which the personal data are processed… (‘storage limitation’).

Similarly, the United States’ California Privacy Rights Act (CPRA) also prohibits businesses from retaining data that has served its purpose and is no longer required. Section 1798.100 states, “A business' collection, use, retention, and sharing of a consumer's personal information shall be reasonably necessary and proportionate to achieve the purposes for which the personal information was collected or processed.”

In privacy, data minimization starts much earlier, at data collection. It helps organizations streamline their data management and compliance efforts by defining clear purposes and establishing retention and sharing policies.

Data Minimization & Data Security

Traditionally, data minimization is seen as a core component of data privacy and responsible data management practices. However, data minimization can equally be considered a critical imperative in fighting against the ever-growing cyber threats.

In the context of cybersecurity, data minimization, or data reduction, is the practice of identifying and reducing unnecessary (ROT) data and enforcing retention policies and controls. By limiting the data to what is necessary for business objectives, data minimization helps reduce potential attack surfaces that may lead to unauthorized access or misuse of sensitive data.

Data minimization has gained more traction as new technologies surface and, along with them, new attack surfaces. Take, for instance, Generative Artificial Intelligence (GenAI). The disruptive technology has introduced new cybersecurity risks that, if not handled appropriately, may result in data breaches, financial damages, and loss of customer trust.

The strategy adopts the concept of quality over quantity. This approach helps streamline data management efforts and limits the data footprints that may attract threat actors. Apart from risk reduction, it also helps reduce security efforts. It means fewer data systems to monitor for security, less data to classify, and no need to manage access to those systems.

On the financial front, minimization significantly contributes to cost savings. Take, for instance, multicloud environments. Data reduction helps organizations reduce their storage expenses, since the less data they have, the less storage space they require. Similarly, ROT data minimization also reduces the complexity that organizations usually have to deal with in data transfers and interoperability.

Accomplishing Data Minimization with Securiti Data Security Posture Management

Discover and reduce redundant, obsolete, and trivial (ROT) data while minimizing accidental data loss with Data Security Posture Management, an integrated module within Securiti Data Command Center.

The Data Command Center enables organizations to discover all the native and shadow data assets and classify sensitive data accurately, enabling teams to efficiently operationalize their data minimization strategy.

Here’s how the Data Command Center helps you streamline your data minimization efforts:

  • Around 40% to 90% of an organization’s data is Dark data, including ROT. This data sits in both the native and shadow data assets. Securiti helps you discover and inventory the native and shadow data assets across the corporate environment(s).
  • Sensitive data discovery and classification should follow data asset discovery. Organizations should have a bird’s eye view of the entire data landscape. Securiti enables organizations to discover all the data across the environment and tag and label the data based on sensitivity, business usage, and potential value. The objective of discovery and classification is to understand the data's value, sensitivity, and potential risks.
  • Next is identifying duplicated data, which is a significant part of ROT data. One potentially efficient technique that teams can use here is cluster analysis. The technique is mostly used in data cleaning and data quality improvement practices. Cluster analysis leverages similarity measurements and clustering algorithms to group similar data points for discovering duplicated data. This process aims to reduce the overall volume of the ROT data and keep only unique information.
  • Another important part of ROT data that organizations need to reduce is over-retained data. By unifying regulatory insights and classification, organizations can discover and tag data that has exceeded its retention period. Reducing over-retained data decreases both security risk and the risk of non-compliance.
  • Trivial data can be identified by looking at data systems and files that haven’t been accessed or modified in a long time.
  • Once the ROT data is identified, the next step is remediation. To achieve minimization, ROT can be remediated via practices like data deletion, data quarantine, or data delegation.
    • Delete - when sure that data is not needed.
    • Delegate - to someone else / data steward who can decide whether to delete data or not.
    • Quarantine - data may be kept safe until the data steward determines the appropriate remediation steps.
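The duplicate-grouping, staleness, and remediation steps in the list can be sketched roughly as follows. This is not how the Data Command Center works internally. It is a minimal illustration that assumes word-set (Jaccard) similarity with a greedy grouping pass as a stand-in for cluster analysis, a two-year staleness cutoff, and a hypothetical rule for choosing between delete, delegate, and quarantine. All function names, fields, and thresholds are invented for the example.

```python
from dataclasses import dataclass
from datetime import date


def jaccard(a: str, b: str) -> float:
    """Similarity of two texts as the overlap of their lowercase word sets."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa and not wb:
        return 1.0
    return len(wa & wb) / len(wa | wb)


def cluster_similar(texts: dict[str, str], threshold: float = 0.8) -> list[list[str]]:
    """Greedy grouping: join the first cluster whose first member is similar enough."""
    clusters: list[list[str]] = []
    for name, text in texts.items():
        for cluster in clusters:
            if jaccard(text, texts[cluster[0]]) >= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters


@dataclass
class Asset:
    name: str
    last_accessed: date
    over_retained: bool
    sensitive: bool


def remediation(asset: Asset, today: date, stale_after_days: int = 730) -> str:
    """Pick a remediation option (assumed policy, for illustration only)."""
    stale = (today - asset.last_accessed).days > stale_after_days
    if not (stale or asset.over_retained):
        return "keep"
    if asset.sensitive:
        return "quarantine"  # hold safely until a data steward decides
    if asset.over_retained:
        return "delete"  # retention period has expired
    return "delegate"  # stale but unclear: let a data steward decide


texts = {
    "notes_v1.txt": "quarterly sales review meeting notes for the emea team",
    "notes_v1_copy.txt": "quarterly sales review meeting notes for the emea team draft",
    "designs.txt": "legacy product design specification for discontinued model",
}
clusters = cluster_similar(texts)

today = date(2023, 12, 11)
assets = [
    Asset("notes_v1_copy.txt", date(2020, 1, 15), False, False),
    Asset("ex_employee_creds.csv", date(2019, 6, 1), True, True),
    Asset("old_applications.zip", date(2023, 10, 1), True, False),
    Asset("price_list.xlsx", date(2023, 12, 1), False, False),
]
plan = {a.name: remediation(a, today) for a in assets}
```

In this sketch the two near-identical notes files land in one cluster, so only one copy needs to be kept. Sensitive assets are quarantined rather than deleted outright, which leaves the final call to a data steward.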

Interested in learning more? Request a demo now to see how Securiti can help you minimize ROT data for enhanced data protection.
