LLM Firewalls Are Not Enough for AI Security

AI Security requires a robust, ‘system-level’ security approach, not just a prompt-level security mindset.

OWASP Top 10 for LLMs provides a comprehensive framework for ‘AI System’ Security. We are jazzed to announce a strong offering for OWASP Top 10 for LLMs in Securiti Data+AI Command Center!

Author

Rehan Jalil

Founder & CEO Securiti

AI Security requires a robust, system-level security approach, not just a prompt-level security mindset, to mitigate risks at every stage of the data and AI pipeline. The OWASP Top 10 for LLMs provides a promising and comprehensive framework for ‘AI System’ Security. This article covers:

  • Why ‘AI System’ level security is needed, rather than prompt-level security alone.
  • How the OWASP Top 10 for LLMs defines AI System Security. As we go through the various controls, it will become evident that security controls are needed at every step of the ‘AI system’, from the data ingestion layer to the AI consumption layer.
  • How Securiti Data+AI Command Center provides robust coverage of the most important aspects of the OWASP Top 10 for LLMs.

The 2025 OWASP Top 10 for LLMs serves as a comprehensive framework, shedding light on the most critical vulnerabilities in AI Systems and debunking the misconception that protecting GenAI is solely about securing the model or analyzing prompts.

A System-Level Approach to AI Security

The 2025 OWASP Top 10 for LLMs highlights a system view of an LLM application architecture, showing how data flows within the app through various components and the vulnerabilities that can be exploited at each step. Securing your GenAI app requires implementing controls at every step of this data+AI pipeline.

[Credit: OWASP Top 10 for LLMs 2025]

“It’s obvious from this OWASP diagram that visibility and controls are needed across the entire AI system, starting from source data systems, not just at the user interaction layer.”

To truly secure LLM applications, organizations need to implement a multi-faceted security strategy:

  1. Data Sanitization: All training, fine-tuning, or inference data must be meticulously classified and sanitized.
  2. Supply Chain Risk Assessment: Third-party components within the application ecosystem must be thoroughly vetted.
  3. Infrastructure Security: The entire stack hosting the data, model, and associated agents requires robust security measures.
  4. Comprehensive LLM Firewall Controls: Multiple layers of firewalls are needed, not just at the prompt or response level but also at internal interfaces during data retrieval and plugin interactions.
  5. Zero-Trust Architecture: Access privileges across various systems should be tightened, adhering to zero-trust security principles.

Mitigating OWASP Top 10 For LLMs

To bridge the knowledge gap in AI security, the OWASP Top 10 for LLMs provides essential guidance for developers, data scientists, and security professionals. Let’s step through these Top 10 LLM Risks and discuss how Securiti can help you mitigate them.

LLM01: Prompt Injection

Prompt injection attacks exploit a large language model (LLM) by crafting malicious natural language inputs (prompts) to manipulate the model output or the intended behavior.

These attacks typically fall into two categories:

  • Direct Prompt Injection involves overriding or bypassing the developer's system-level instructions. For example, X (formerly known as Twitter) users exploited remoteli.io’s LLM chatbot to make a threat against the president.
  • Indirect Prompt Injection leverages prompts embedded in external data sources, such as web pages or documents, to hijack the conversation context. For instance, a Bing chatbot was tricked into revealing its developer instructions through a malicious prompt hidden on a website. This prompt, invisible to humans due to its font color matching the background, was still readable by the LLM. Such techniques can escalate to compromise downstream systems, including databases or backend servers.

Mitigation Strategies

Securiti's context-aware prompt firewall leverages advanced policies to detect and mitigate malicious inputs, including prompts containing harmful URLs, jailbreak instructions, or those crafted to extract sensitive data or generate offensive or off-topic content. This proactive defense protects against adversarial attacks while safeguarding sensitive data.
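To make the idea of screening inputs concrete, here is a minimal sketch of a pattern-based check. It is not Securiti's implementation: the patterns, the `ALLOWED_DOMAINS` allow-list, and the `screen_text` function are hypothetical names chosen for illustration, and a context-aware firewall would rely on far richer signals than a few regular expressions.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical deny-list of phrasings often seen in injection attempts.
INJECTION_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?(previous|prior|above)\s+instructions", re.I),
    re.compile(r"(reveal|print|show)\s+(your\s+)?(system|developer)\s+(prompt|instructions)", re.I),
]
URL_HOST = re.compile(r"https?://([^\s/]+)", re.I)
ALLOWED_DOMAINS = {"example.com"}  # assumption: hosts the application trusts


@dataclass
class Verdict:
    allowed: bool
    reasons: List[str]


def screen_text(text: str) -> Verdict:
    """Flag text that matches an injection pattern or links to an untrusted host."""
    reasons = []
    for pattern in INJECTION_PATTERNS:
        if pattern.search(text):
            reasons.append(f"injection pattern: {pattern.pattern}")
    for host in URL_HOST.findall(text):
        if host.lower() not in ALLOWED_DOMAINS:
            reasons.append(f"untrusted URL host: {host}")
    return Verdict(allowed=not reasons, reasons=reasons)
```

Because indirect injection arrives through retrieved web pages and documents rather than the user's prompt, the same check would need to run on that retrieved content as well, not only on what the user types.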

To further enhance security, Securiti incorporates multiple layers of controls. Data sanitization ensures sensitive data is not used for model training, fine-tuning, or Retrieval-Augmented Generation (RAG). Built-in entitlement controls enforce least-privileged data access, reducing the risk of unauthorized users accessing sensitive data. Additionally, data retrieval and LLM response firewalls provide critical safeguards against any attack that bypasses other defenses, preventing LLMs from leaking sensitive data or generating malicious outputs.

These comprehensive measures enable organizations to effectively counter direct and indirect prompt injection threats, ensuring the secure and reliable deployment of LLM applications.

LLM02: Sensitive Information Disclosure

Sensitive information disclosure occurs when LLMs inadvertently expose confidential, proprietary, or regulated data used during training, fine-tuning, or Retrieval-Augmented Generation (RAG). For example, Samsung employees unintentionally uploaded proprietary source code snippets and internal meeting notes to an LLM provider, raising concerns about data reuse for model retraining or unintended outputs that could reveal Samsung’s sensitive information to other users.

Mitigation Strategies

To prevent sensitive information disclosure, Securiti employs a multi-layered approach that provides robust protection across all stages of data usage within LLM systems.

Data sanitization is a cornerstone of this strategy, with Securiti classifying and masking sensitive data across on-premises, SaaS, and cloud systems. This ensures that sensitive information is secured before being used for model training, fine-tuning, or Retrieval-Augmented Generation (RAG).
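As a rough illustration of masking before indexing, the sketch below replaces two kinds of identifiers with placeholder labels. The `PATTERNS` table and `mask_sensitive` function are assumptions made for this example; real classification covers many more data types and does not rely on regular expressions alone.

```python
import re

# Illustrative patterns only; a production classifier covers far more data types.
PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def mask_sensitive(text):
    """Return the masked text and a count of replacements per label."""
    counts = {}
    for label, pattern in PATTERNS.items():
        text, replaced = pattern.subn(f"[{label}]", text)
        if replaced:
            counts[label] = replaced
    return text, counts
```

Returning the counts alongside the masked text lets a pipeline log what was found in each document without storing the sensitive values themselves.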

Entitlement controls further enhance security by enforcing least-privileged access at the data system level. These controls ensure that sensitive data is only accessible to authorized users or systems. Further, within AI pipelines, entitlements are enforced to allow users to retrieve only their own authorized data via prompts, minimizing the risk of unauthorized access.

To address vulnerabilities in cloud environments, Securiti strengthens the security posture of cloud services and systems hosting the models. This ensures that AI infrastructure is fortified against potential threats.

Additionally, Securiti deploys multi-layered firewalls at the prompt, retrieval, and response levels. These firewalls detect and block any potential leakage of sensitive data through LLM outputs, safeguarding the integrity of AI interactions.

By integrating these measures, Securiti ensures comprehensive safeguards against sensitive information disclosure, enabling organizations to secure and automate data usage effectively within their AI workflows.

LLM03: Supply Chain Vulnerability

Companies rely heavily on open-source and pre-trained foundation models to build AI systems. The AI system supply chain includes various components—such as third-party packages, pre-trained models, public or crowd-sourced datasets, hardware (GPUs, TPUs), machine learning frameworks, cloud platforms, and integrations—which can introduce vulnerabilities.

For instance, pre-trained models may contain hidden biases, backdoors, or malicious features, while poisoned datasets can be intentionally used to compromise the system. Threat actors can exploit these supply chain weaknesses to corrupt model integrity, exfiltrate data, disrupt systems, or enable harmful content generation.

Mitigation Strategies

Managing LLM supply chain vulnerabilities begins with comprehensive discovery and risk assessment of AI models and agents. Securiti's solution automatically discovers sanctioned and unsanctioned (shadow) AI models, agents, and datasets across the organization, providing a clear view of the AI ecosystem.

Building on this foundation, users can leverage model cards to evaluate AI model risks such as toxicity, bias, efficiency, copyright infringement, and misinformation using established benchmarks. Securiti also detects and remediates misconfigurations that could expose data stores or services hosting models, mitigating risks of unauthorized access and tampering. These capabilities ensure a secure and reliable foundation for AI deployments.

To further enhance supply chain security, Securiti’s third-party risk assessment capability helps organizations proactively identify vulnerabilities in supplier components, including AI embedded in SaaS applications.

By providing deep insights into supply chain risks, Securiti helps enterprises identify additional controls they need to mitigate vulnerabilities stemming from compromised supply chain components.

LLM04: Data and Model Poisoning

Data and model poisoning occurs when an attacker contaminates datasets used for pre-training, fine-tuning, or Retrieval-Augmented Generation (RAG), skewing the model’s behavior or introducing backdoors and biases. A notable example is Microsoft’s Tay chatbot, which was compromised when malicious users engaged it with abusive and offensive language. Since Tay dynamically learned from user interactions, it began generating similar offensive content, forcing Microsoft to shut it down within 24 hours of its launch.

Mitigation Strategies

Securiti’s preventive and detective controls for mitigating supply chain vulnerabilities also address the risks associated with data and model poisoning. These measures safeguard the foundational elements of the machine learning pipeline, including third-party datasets and pre-trained or open-source models.

In addition to these safeguards, Securiti provides enhanced protection against successful poisoning attacks through LLM Retrieval and Response Firewalls, which inspect RAG pipelines and monitor LLM outputs to detect and block unintended model behaviors or harmful system responses.

LLM05: Improper Output Handling

Improper output handling occurs when LLM responses are not filtered, sanitized, or validated before being passed to downstream components, such as backend servers, API calls, or LLM functions. This vulnerability can lead to severe consequences. For example, if an LLM allows users to craft SQL queries for a backend database, a malicious actor could exploit this to generate queries that delete all database tables or compromise sensitive data.
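One way to contain the SQL scenario above is to run model-generated queries on a connection that can only read. The sketch below uses SQLite's authorizer hook from Python's standard `sqlite3` module; the helper names `make_guarded_connection` and `run_untrusted_query` are hypothetical, and the in-memory database stands in for a real backend.

```python
import sqlite3

# The only SQLite actions permitted for model-generated SQL.
READ_ONLY_ACTIONS = {sqlite3.SQLITE_SELECT, sqlite3.SQLITE_READ}


def read_only_authorizer(action, arg1, arg2, db_name, source):
    """Deny every action other than a plain SELECT and its column reads."""
    return sqlite3.SQLITE_OK if action in READ_ONLY_ACTIONS else sqlite3.SQLITE_DENY


def make_guarded_connection(setup_sql):
    """Build a demo database with trusted SQL, then lock the connection down."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(setup_sql)
    conn.set_authorizer(read_only_authorizer)
    return conn


def run_untrusted_query(conn, sql):
    """Run model-generated SQL; denied or malformed statements raise ValueError."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.Error as exc:
        raise ValueError(f"query rejected: {exc}") from exc
```

Enforcing the restriction in the database layer, rather than by inspecting the query string, means a cleverly phrased statement still cannot drop tables or modify rows. Note that this allow-list also rejects SQL function calls, which a real deployment might choose to permit selectively.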

Mitigation Strategies

To address this risk, Securiti’s Response Firewall examines and filters AI outputs to ensure the system only generates appropriate content that’s aligned with your company’s security and compliance policies. With built-in policies, the solution automatically detects and blocks the AI system from generating improper outputs such as company confidential information, sensitive PII, source code, IT secrets, and other content types that may represent negative sentiment, prohibited languages or offensive statements.

LLM06: Excessive Agency

Excessive agency occurs when LLM agents have entitlements that exceed their intended scope, allowing access to systems and data they should not have permission to use. For example, a car dealership’s LLM agent with elevated privileges could inadvertently expose confidential information about upcoming sales promotions. Such a disclosure, meant only for internal sales and finance teams, could lead to customers delaying purchases, negatively impacting current quarter sales and revenue.

Mitigation Strategies

Securiti’s solution automatically discovers both sanctioned and unsanctioned AI models, along with their data access entitlements within an organization. The built-in Data Access Intelligence & Governance solution then mitigates this vulnerability by evaluating and enforcing data access controls at the source data system, ensuring both users and machines, like LLMs, adhere to a least-privileged access model. This is especially critical for securely using SaaS AI copilots, such as Microsoft 365 Copilot, as source-level access controls prevent data oversharing among users of these applications.

However, while source-level access controls are essential, they alone are insufficient to mitigate excessive agency risk in internally developed LLM-based AI systems. LLMs with source data access can potentially bypass underlying user entitlements to data, leading to unauthorized data access through prompts. To address this, Securiti’s Gencore AI solution cross-verifies the identity of the user sending the prompt with their entitlements to embeddings stored in Vector DBs. This ensures that users receive information that is generated only from data they are authorized to access.
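The general pattern of checking entitlements at retrieval time can be sketched as follows. This is an illustrative in-memory example, not Gencore AI's implementation: the `Chunk` record, its `allowed_groups` field, and the `retrieve` function are assumptions, and a real system would push the filter into the vector database query.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    embedding: tuple
    allowed_groups: frozenset  # groups entitled to the source document


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_embedding, user_groups, chunks, top_k=3):
    """Rank only the chunks the caller is entitled to see."""
    visible = [c for c in chunks if c.allowed_groups & set(user_groups)]
    visible.sort(key=lambda c: cosine(query_embedding, c.embedding), reverse=True)
    return visible[:top_k]
```

Filtering before ranking matters here: a chunk the user is not entitled to never enters the model's context, so no prompt can coax the model into repeating it.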

This dual-layered approach integrates source-level access governance with LLM-specific entitlement controls, effectively mitigating excessive agency risks and safeguarding sensitive information from unauthorized access.

LLM07: System Prompt Leakage

System prompt leakage is a specific form of prompt injection attack that involves extracting internal system prompts, potentially exposing sensitive information such as API keys, system architecture, or content moderation rules. For example, malicious actors have successfully extracted GPT-4's voice mode system prompts, revealing the model’s behavior constraints and response guidelines.

Mitigation Strategies

As system prompt leakage is a subset of prompt injection, the preventive and detective controls outlined under LLM01 also apply here. Specifically, the prompt injection policies built into the Prompt Firewall can block an attacker’s attempts to extract system prompts. In addition, organizations should implement preventive measures that ensure sensitive data, such as API keys and system configurations, is excluded from internal system prompts. They should instead store and process sensitive information separately under strict security controls to minimize the risk of exposure.
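A simple pre-deployment check can help enforce the rule that secrets stay out of system prompts. The sketch below scans a prompt for a few secret-like shapes; the `SECRET_PATTERNS` table and `find_secrets` function are hypothetical, and the patterns are deliberately small and will miss many real credential formats.

```python
import re

# Illustrative secret shapes; dedicated secret scanners ship far larger rule sets.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "bearer_token": re.compile(r"\bBearer\s+[A-Za-z0-9._~+/-]{20,}"),
    "key_assignment": re.compile(r"(?i)\b(api[_-]?key|secret|password)\s*[:=]\s*\S+"),
}


def find_secrets(prompt):
    """Return the sorted names of secret patterns found in a system prompt."""
    return sorted(name for name, pattern in SECRET_PATTERNS.items() if pattern.search(prompt))
```

A build or review step could refuse to ship any system prompt for which `find_secrets` returns a non-empty list, so that a leaked prompt exposes behavior rules at worst rather than credentials.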

LLM08: Vector and Embedding Weaknesses

Vector and embedding weaknesses arise in Retrieval-Augmented Generation (RAG) systems when adversaries exploit vulnerabilities in vector DBs to inject malicious content, alter the model’s behavior, or access sensitive information stored in embeddings. For instance, in a multi-tenant environment where different user groups share the same vector database, embeddings from one group might inadvertently be retrieved by queries from another group’s LLM, leading to the potential leakage of sensitive business information.

Mitigation Strategies

Securiti’s Data Command Center addresses Retrieval-Augmented Generation (RAG) pipeline risks with a comprehensive set of capabilities tailored for security teams. At its core, the solution incorporates built-in data sanitization to automatically classify and mask sensitive data in source systems before storing it as embeddings in vector databases. This proactive approach minimizes the risk of data exposure from attacks that exploit vulnerabilities in vector databases.

In addition to data sanitization, the solution automates the enforcement of entitlement controls, ensuring that users or agents can only retrieve embeddings derived from data they are authorized to access. This capability prevents prompts within LLM applications from bypassing underlying data access policies, maintaining strict adherence to least-privileged access principles.

To further enhance security, the Retrieval Firewall analyzes data retrieved from the RAG pipeline. If an adversary successfully injects malicious content with a fake vector designed to look like other content in the Vector DB, the retrieval firewall can detect and block such malicious content from being retrieved by the AI model. Together, these capabilities ensure robust protection for RAG pipelines.

LLM09: Misinformation

Misinformation in LLMs occurs when models hallucinate and generate false or misleading information, often due to a lack of understanding of context and meaning. For example, Air Canada’s chatbot recently misinformed a passenger, incorrectly stating that they could apply for a bereavement fare refund within 90 days, which conflicted with the airline’s actual policies.

Mitigation Strategies

Securiti’s Gencore AI solution mitigates misinformation risk by enabling users to build safe RAG pipelines. The solution securely connects LLMs to trusted internal knowledge bases, ensuring that outputs are grounded in verified source information. This reduces hallucinations and ensures the relevance and correctness of responses.

The solution also enables model risk assessment and selection based on industry benchmarks such as Stanford HELM to highlight potential issues like hallucinations, bias, and toxicity. This allows builders to compare multiple models based on risk scores and choose the most appropriate model for their specific AI application use case. By selecting models that align with the organization’s risk tolerance, Securiti helps mitigate potential inaccuracies from the outset.

Additionally, with Securiti, users can minimize redundant, obsolete, and trivial (ROT) data, such as stale information or duplicate files, that could compromise the integrity of AI workflows. By identifying and eliminating ROT data from model training, fine-tuning, and Retrieval Augmented Generation (RAG) processes, the solution enhances input quality, which in turn further improves the reliability of model responses.
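For a sense of what filtering ROT data can look like, the sketch below drops stale documents and exact duplicates before they reach a RAG index. The `drop_rot` function, its one-year default cutoff, and the `(doc_id, text, last_modified)` tuple layout are assumptions for illustration. It covers only the redundant and obsolete cases; judging whether content is trivial needs more than a hash and a date.

```python
import hashlib
from datetime import datetime, timedelta, timezone


def content_key(text):
    """Hash of the text with case and whitespace normalized, so near-identical copies collide."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def drop_rot(documents, now, max_age_days=365):
    """Keep the first copy of each distinct document modified within the cutoff.

    documents: iterable of (doc_id, text, last_modified) tuples.
    Returns the list of doc_ids to keep, in input order.
    """
    cutoff = now - timedelta(days=max_age_days)
    seen, kept = set(), []
    for doc_id, text, last_modified in documents:
        if last_modified < cutoff:
            continue  # obsolete
        key = content_key(text)
        if key in seen:
            continue  # redundant
        seen.add(key)
        kept.append(doc_id)
    return kept
```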

Together, these strategies enable organizations to safeguard their AI systems against misinformation and deliver trustworthy outputs.

LLM10: Unbounded Consumption

Unbounded consumption occurs when attackers exploit an LLM's inference capabilities by making excessive requests without proper limitations. This vulnerability can result in denial-of-service attacks, service degradation, model theft, and financial loss. For instance, an adversary might overwhelm the LLM with numerous inputs of varying lengths, exploiting processing inefficiencies to drain resources and potentially render the system unresponsive, significantly impacting service availability.

Mitigation Strategies

Securiti’s LLM Response Firewall addresses the risks of unbounded consumption by detecting and blocking attacks that cause LLMs to generate a high volume of off-topic content, mitigating resource depletion and maintaining service stability.
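The "proper limitations" whose absence defines this risk commonly include per-client rate limits and input size caps. The sketch below shows one such control, a token bucket, as a complement to output-side filtering. The `TokenBucket` class, `admit_request` function, and `MAX_PROMPT_CHARS` value are hypothetical and are not part of Securiti's product.

```python
import time

MAX_PROMPT_CHARS = 8000  # hypothetical cap on input size


class TokenBucket:
    """Allows bursts up to `capacity` requests, refilled at a steady rate."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self, cost=1.0):
        """Spend `cost` tokens if available; otherwise refuse the request."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated = now
        if cost <= self.tokens:
            self.tokens -= cost
            return True
        return False


def admit_request(bucket, prompt):
    """Reject oversized prompts outright, then apply the caller's rate limit."""
    if len(prompt) > MAX_PROMPT_CHARS:
        return False
    return bucket.allow()
```

In practice there would be one bucket per user or API key, and the `cost` argument could scale with prompt length so that long inputs consume the budget faster than short ones.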


The Path Forward: Prioritizing LLM Application Risks

Adopting LLMs within in-house applications or third-party SaaS introduces substantial uncertainties and risks to an organization’s IT security posture. The non-deterministic behavior of LLMs means their outputs cannot always be predicted, making risk management a complex challenge. Additionally, LLMs rely heavily on data, and most organizations lack the mature data security and governance frameworks necessary to address these risks effectively.

Relying solely on an edge firewall mindset is insufficient and highly risky. Only a comprehensive, multi-layered security strategy can adequately protect against the evolving threats in the GenAI landscape.

To prioritize the most critical vulnerabilities, I recommend focusing on LLM06: Excessive Agency and LLM02: Sensitive Information Disclosure first. GenAI systems are increasingly interconnected with more data systems, making it imperative to ensure they have access only to the data relevant to their tasks and that their entitlements are minimized. Without strict entitlement controls, GenAI systems capable of leveraging full APIs may be manipulated into executing unintended or harmful actions.

Customer-sensitive data remains a prime target for cybercriminals, and attackers can also exploit LLMs for reconnaissance, using them to gather valuable information about your organization to support future attacks. Addressing these two vulnerabilities early reduces the attack surface significantly, weakening the impact of exploits targeting other vulnerabilities.

Looking to sharpen your AI Security & Governance skills? Take our highly popular AI Security certification - it’s free.

Want to align your organization with OWASP Top 10 for LLMs?

Request a demo now and learn how Securiti can help you manage AI risk across internal and SaaS applications.
