Globally, Generative AI (GenAI), specifically Large Language Models (LLMs), are increasingly being assimilated into business processes. Though LLMs have certainly brought a tidal wave of innovations and breakthroughs, these models have also exposed businesses to unprecedented security, privacy, and compliance risks. Take, for instance, AI prompt injections that could allow end users to manipulate an LLM into giving unethical, often harmful responses, such as the DAN model, or revealing sensitive data.
Typically, enterprises lack the right security tools, controls, and policies to ensure the safe development, deployment, and usage of internal, public, or consumer-facing LLMs. Hence, there’s a dire need for an advanced solution that could help prevent LLMs from leaking sensitive data, responding toxically, or falling victim to emerging AI threats.
Here, LLM firewalls come into the picture. This blog discusses what an LLM firewall is, how it is different from traditional firewalls, and why it is important in the GenAI era.
What is an LLM Firewall? How Is It Different from Traditional Firewalls?
LLM firewalls are different from traditional application or network-based firewalls as they are built to respond to the way GenAI applications function. They are placed at different instances of LLM interactions, such as prompts and responses.
Traditional firewalls are deterministic in nature in that these tools react in response to pre-defined rules and policies. Take, for instance, an AWS RDS instance. A firewall policy can be placed for the AWS RDS instance, allowing only specific IPs, virtual private clouds, or subnets to connect with it or to some specific ports while filtering out the rest.
LLMs, on the other hand, cannot possibly be safeguarded against threats, which are unique to AI, with traditional firewalls. LLM interactions ingest natural languages, and hence, the response is always unique, even if the prompt is the same every time. Here, an LLM firewall can be placed to sanitize and filter out potentially malicious prompts or hallucinatory responses.
To summarize the difference between traditional and LLM firewalls, traditional applications monitor network traffic, while LLM firewalls inspect and safeguard prompts and responses. Advanced solutions, such as Securiti LLM Firewall, may go even further, adding retrieval monitoring to the mix for enhanced protection against sensitive data leaks or AI poisoning.
Critical Risks That Threaten LLMs
According to leading analysts, IT businesses have, on average, 1,689 LLM models in production, and some of those models are crucial to their success. The increasing adoption of LLM across business processes has led to a significant rise in adversarial machine learning (AML) attacks. The aim of such attacks is to either downgrade the performance of AI systems or manipulate the models into leaking confidential data.
Recognizing the need for a robust framework, the National Institute of Standards and Technology (NIST) and other online communities like the Open Worldwide Application Security Project (OWASP) have put together a list of emerging AI threats. Let’s briefly discuss the top 3 AI threats and recommended mitigations as outlined in the OWASP Top 10 List for LLM Applications.
LLM01: Prompt Injection
Injection threats, such as SQL injections, have always existed. However, prompt injection is unique to LLMs in that a malicious actor can leverage crafted or misleading prompts to manipulate an LLM’s responses. Successful prompt injections could lead to serious harm to an individual’s privacy or the reputation of an enterprise. For instance, these threats may result in the extraction of confidential or sensitive information, and they may also influence business decision-making.
Prompt injections are carried out either indirectly or directly. The DAN model is a great example of a direct prompt injection, also known as jailbreaking. The model was created to jailbreak ChatGPT and make it generate unethical and toxic responses that violate the policies of the application.
OWASP recommends restricting entitlements to prevent and mitigate prompt injection threats. Access to the LLM’s backend must be restricted to authorized individuals and kept to a minimum level.
LLM02: Insecure Output Handling
Insecure output handling can lead to excessive privileges, cross-site scripting, and information leakage risks. This type of threat is often the result of improper or insufficient controls around the validation, sanitization, and handling of LLMs’ output. OWASP recommends enterprises follow its ASVS (Application Security Verification Standard), which is a highly detailed set of guidelines created to help organizations implement robust validation and sanitization controls.
LLM06: Sensitive Data Exposure
Sensitive data exposures can result in unauthorized access to confidential information, privacy breaches, and other security threats. They may also lead to a damaged reputation, loss of customer trust, and heavy legal penalties. Sensitive data exposures in LLMs occur when a model inadvertently leaks data, either due to improper data handling during data ingestion while training the model or a lack of data curation, cleansing, and sanitization.
OWASP suggests that adequate data sanitization needs to be ensured to filter out or redact sensitive data. The guidelines further recommend implementing input validation and sanitization to prevent users from using malicious or confidential inputs.
Read the Complete OWASP Top 10 List for LLM Applications Here
NIST also highlights similar critical risks associated with LLMs in its AI Risk Management Framework (RMF). For instance, the AI Abuse Attack, where an incorrect piece of data is fed to the LLM to compromise its source and thus the resulting output.
These threats damage not only the LLMs but also users’ privacy, business performance and reputation, and the socio-economic affairs of society as a whole.
How Does an LLM Firewall Work?
Apart from other security controls, LLM firewalls add an enhanced layer of protection around LLMs, safeguarding the models from various internal and external threats. A distributed LLM firewall is placed at different stages of the GenAI application’s interaction with the LLM or LLM’s interaction with the data, such as user prompts, retrieval data, and LLM responses. This way, the LLM can effectively be protected against malicious internal users and external risks.
Let’s take a quick look at how an advanced firewall for LLM, such as Securiti’s LLM Firewall, inspects and safeguards prompts, retrievals, and responses.
LLM Firewall for Prompts
LLM prompt firewalls evaluate users' prompts, thereby identifying and preventing malicious use or misuse. The firewall redacts sensitive information, preventing important data from being accessed or used by the LLM.
LLM Firewall for Retrievals
The Retrieval Augmented Generation (RAG) process is a phase where AI threats like indirect prompt injection or AI poisoning can critically put LLMs at risk of abnormal behavior and inadvertent exposure of sensitive data. Retrieval firewalls help monitor and control important data during the RAG stage, preventing any sensitive data exposure or AI poisoning.
LLM Firewall for Responses
The primary role of an LLM response firewall is to monitor the responses generated by the AI model and ensure that it doesn’t violate ethical, privacy, security, and compliance guidelines. This firewall should check and block toxic content, filter out prohibited topics, and redact sensitive data to prevent unintended exposure.
Protect Your GenAI Pipeline with a Context-Aware, Distributed LLM Firewall
Securiti provides a new category of distributed and context-aware LLM Firewalls. The LLM firewall understands the context of AI systems, data flows, regulatory intelligence, and access entitlements. The solution safeguards GenAI pipelines against sensitive data exposure, prompt injections, prohibited topics, and harmful content. The solution ensures that the data interacted with or generated by an internal, public, or commercial LLM remains secure and compliant. Combined with the capabilities of the Securiti Data Command Center, the LLM Firewall protects GenAI applications against the threats covered under the OWASP Top 10 List for LLM Applications and NIST AI RMF v.1.
Our solution’s highlighted features include:
- Advanced Machine-Learning Protection: Protect sensitive data with inline detection, classification, and sanitization.
- Dynamic Content Filtering: Automate sensitive data detection, classification, and redaction. Prevent toxic content and enable compliance with tone and guidelines.
- OWASP-Targeted and Customizable GenAI Policies: Depending on the individual needs of your enterprise, you may tailor LLM security based on a comprehensive policy framework.
- Data+AI Compliance: Enable compliance with global data and AI regulations and industry frameworks such as the EU AI Act, NIST AI RMF, etc.
- Comprehensive Dashboard Capabilities: Get complete visibility of your AI landscape, AI usage insights, and policy violations.
Request a demo today to learn more.