Mitigating Indirect Prompt Injection Attacks on LLMs

Indirect prompt injection attacks can manipulate AI systems by embedding malicious instructions in trusted data sources. Learn how enterprises can mitigate these threats using OWASP best practices and AI Gateways to enforce security policies, validate data, and protect LLM-powered applications.

So you think your competitor is going to launch a new product and you don’t want to get caught off guard. You have a few contacts at the competitor you could check with, but they wouldn’t outright reveal what they’re working on. Could you say something to convince them you already know what it is? “Hey, so I heard you’re working on something super cool, do you think it’ll have the impact you’re hoping?” If you make them think you already know, they may just reveal what it is. This is a form of social engineering to trick someone into giving up information they shouldn’t.

Prompt injection is when an attacker manipulates an AI system into doing something it shouldn’t by crafting a seemingly legitimate prompt. Recall that prompts contain both instructions and data. It's not always clear the distinction between the two.

So what appears to be data could include malicious instructions. With a direct prompt injection, an attacker explicitly provides a cleverly crafted prompt that overrides or bypasses the model’s intended safety and content guidelines. With an indirect prompt injection the attack is embedded in external data sources that the LLM consumes and trusts.
‍
Enterprises are augmenting their LLM-powered apps with sensitive data, pulling from internal knowledge bases, customer support logs, product documentation, and internal systems all to make "AI more helpful.”

Here's the scary part: every time you augment your prompts with data, you’re inviting an attack.

Some organizations believe they’re immune to this because they “own their data.” A false sense of security may feel good, but it's just that: a false sense of security. Your AI isn’t running in a vacuum. It ingests data from customers, partners, third-party APIs, and internal sources all of which could be maliciously manipulated to turn your fancy chatbot into a compliance and competitive liability.

Indirect Prompt Injection Example

In this amazing blog post from "Embrace the Red", we see a real-life indirect prompt injection attack on GitHub Copilot (via a VS Code plugin) which has since been fixed. In this attack the attacker plants hidden instructions inside a source code file which the copilot reads and interprets as a legitimate instruction. In this case, the instruction was disguised as markdown data which pointed to a URL for an image, only this was not an image at all. The URL in the markdown instructed the LLM to populate a call out to a malicious website with sensitive data in the payload. When Copilot renders the HTML/Markdown, it sends the data to the attacker.

*Image credit:* *https://embracethered.com/blog/posts/2024/github-copilot-chat-prompt-injection-data-exfiltration/*

In this case, we see an attacker doesn’t need access to the AI itself—just the data it processes. And guess what? Your enterprise chatbot is no different.

Enhancing our enterprise data with Retrieval-Augmented Generation (RAG) is supposed to make AI more accurate by pulling in real-world data. But what happens when that data gets poisoned with a prompt injection?

Consider the following scenario: you build an LLM-powered risk management system which uses RAG to inject financial transactions to assess for fraud. An attacker injects fake but realistic transactions into emails, support tickets, or chat logs. One of these logs gets processed by the LLM as part of a risk assessment and returns a result that ignores fraud.

This isn’t a hypothetical. Researchers have proven that injecting just a handful of malicious documents into a RAG system (poison RAG) can cause an LLM to return attacker-chosen answers over 90% of the time.

What does OWASP Say?

The OWASP Top 10 for LLM Applications explicitly highlights prompt injection as the number one risk to AI-powered systems. While there is no silver bullet, they outline several mitigation strategies that enterprises should adopt:

Constrain Model Behavior – Define strict boundaries for what the LLM is allowed to do. This includes limiting the AI’s ability to take action beyond text generation and enforcing system-level constraints.
‍
Validate and Sanitize Inputs – Treat all input data as untrusted. Use strict validation techniques to sanitize both user-generated content and external documents before ingestion into the AI system.
‍
Output Filtering and Monitoring – Implement post-processing rules to analyze AI-generated responses for anomalies. Flag unexpected or potentially harmful outputs before they reach the end user.
‍
Restrict External Dependencies – Be cautious about what data sources your AI pulls from. Avoid blind trust in third-party APIs, open databases, or customer-generated data that hasn’t been vetted.
‍
Employ Adversarial Testing – Regularly conduct red-teaming exercises to simulate attacks and evaluate how the AI handles deceptive inputs.
‍
Context Isolation – Segregate data sources to prevent untrusted inputs from contaminating privileged or sensitive information streams.

OWASP's recommendations provide a strong foundation for mitigating prompt injection risks, but implementing them effectively in real-world enterprise environments is no trivial task. Traditional security measures—like input validation and adversarial testing—help, but they don’t offer a scalable, systematic way to govern how LLMs interact with data sources at runtime.

Can an AI Gateway Help?

In modern AI architectures, where LLMs ingest data from diverse internal and external sources, a more structured enforcement mechanism is needed. This is where an AI Gateway comes in. Similar to how API gateways secure and control access to backend services, an AI gateway acts as a policy enforcement layer for LLM interactions—validating inputs, filtering responses, and ensuring compliance with security best practices.

One thing we are working on here at Solo.io is implementing RAG directly in the AI gateway. With this feature, we can limit RAG adjusted prompts to only those data sources that have been approved. An AI gateway can help with the following to mitigate prompt injections (aligning with OWASP):

Data Provenance & Validation – Ensures that external data sources feeding into the LLM are verified, reducing the risk of ingesting manipulated or malicious content.
‍
Eliminate Data Source Changes - The gateway enforces that no manipulation or changes to the trusted data sources can occur (at least not through LLM queries)
‍
Content Filtering & Anomaly Detection – Monitors incoming data and AI-generated responses for patterns indicative of prompt injection attempts.
‍
Rate Limiting & Access Controls – Restricts excessive or suspicious queries that might attempt to manipulate the model.
‍
Audit Logging & Monitoring – Provides visibility into what data is being injected into prompts, allowing security teams to detect and respond to threats.

There is no 100% foolproof solution to indirect prompt injection, but following the suggestions from OWASP, and architecturally controlling, vetting, and logging traffic to your LLM systems can give you greater confidence in your LLM security.

Mitigating Indirect Prompt Injection Attacks on LLMs

Indirect Prompt Injection Example

What does OWASP Say?

Can an AI Gateway Help?

Featured content

Fortifying Your Cloud Native Connectivity Security Posture with Solo and Ambient Mesh

Migrating from Sidecars to Ambient Mesh - Risks, Challenges, and Benefits

Overhaul of Agent Gateway supporting A2A, MCP, and Kubernetes Gateway API

How Ambient Mesh Delivers Advanced Resource and Cost Savings

Getting Started with Ambient Mesh: From 0 to 100 mph

Agent Discovery, Naming, and Resolution - the Missing Pieces to A2A

Part Two: MCP Authorization The Hard Way

Part One: MCP Authorization The Hard Way

Agent Identity and Access Management - Can SPIFFE Work?

Deep Dive into llm-d and Distributed Inference

Gloo Mesh 2.8 simplifies service mesh operations with new enhanced user experience across multi-cluster environments.

Gloo Gateway 1.19 accelerates context-rich, real-time AI apps with Gateway API

llm-d: Distributed Inference Serving on Kubernetes

AI Reliability Engineering For More Dependable Humans

Kubernetes Identity the Right Way with SPIRE and Ambient

Optimizing GenAI in Production: High-Value Use Cases for AI Gateways

Solo.io Recognized as a Visionary in the 2024 Gartner® Magic Quadrant™ for API Management for the SECOND year in a row.

Guardians of the Governance: GenAI Gateway Guidance with GitOps and Gloo

Istio Ambient Waypoint Proxy explained

Hands-On with the Kubernetes Gateway API and Envoy Proxy: A Tutorial with GitOps and Gloo Gateway

Istio and the State of DevOps: Enhancing Key Metrics

What is an AI Gateway and its role in AI Applications?

Best practices for secure Istio deployment with Gloo Mesh Core

Gloo Mesh 2.6: Istio's Ambient mode now ready for production

HTTP Observability Without Compromises

Advance your knowledge of service mesh tech with Solo.io Academy certifications

Service Mesh for the developer workflow, a series

Challenges of adopting service mesh in enterprise organizations

Service Mesh in the Real World #2 — Ingress Traffic Control

Service Mesh in the Real World Video Series – Episode # 1: Egress Traffic

Service Mesh the easy way with AWS App Mesh and SuperGloo

Webinar Recap: Intro to Service Mesh Hub and SMI

D-TECK Uses Solo.io Gloo Gateway and Google Cloud to Help Businesses Make Better HR Decisions

Minimize the blast radius of changes with Solo.io Gloo Gateway and Weaveworks Flagger

Announcing Service Mesh Interface (SMI) Support and Collaboration

Service Mesh Interface (SMI) and our Vision for the Community and Ecosystem

The need for a standard, service mesh API

SuperGloo to the Rescue! Making it easier to write extensions for Service Mesh

Introducing The Service Mesh Hub -everything you need for your service mesh

Kubernetes Ingress Past, Present, and Future

Solo.io Streamlines Service Mesh and Serverless Adoption for Enterprises in Google Cloud

Ingenico

ParkMobile

Vonage

Domino’s Pizza

Gloo Mesh Feature Comparison

Service Mesh for Developers, Part 1: Exploring the Power of Observability and OpenTelemetry

Service Mesh at Scale

Compare Capabilities of the Top Service Mesh Platforms

Compare Capabilities of the Top API Gateways

Establishing zero trust security for modern cloud architectures

Unlocking the Power of Your API Gateway

API Gateways: Productivity, Resilience, and Security for Next-Generation Cloud Applications

Driving Business Value with Istio

Service Mesh Vendor Comparison

Istio Then & Now

4 Reasons Why You Need an AI Gateway

Gloo Gateway vs. Kong

Gloo Gateway vs. Apigee

3 Reasons You Need an API Gateway for Microservices Apps

Ambient Mesh Lab: EnvoyFilter Support

Ambient Mesh Lab: SPIRE integration with Gloo Mesh in Istio Ambient Mode

Ambient Mesh Lab: Introduction to ztunnel in Ambient Mesh

Solo Academy Course: Service Mesh Basics

Solo Academy Course: Istio Basics

Solo Academy Course: Envoy Basics

Solo Academy Course: API Gateway Basics

Solo Academy Course: Get Started with Istio Service Mesh

Solo Academy Course: Introduction to Envoy Proxy

Solo Academy Course: Deploying Istio for Production

Kgateway Lab: Integrating kgateway with Istio at Ingress

Kgateway Lab: Kgateway as a Waypoint

Kgateway AI Lab: Consumption Reporting

Kgateway AI Lab: Deploying kgateway as an AI Gateway

Kagent Lab: How to build an AI agent