So you think your competitor is going to launch a new product and you don’t want to get caught off guard. You have a few contacts at the competitor you could check with, but they wouldn’t outright reveal what they’re working on. Could you say something to convince them you already know what it is? “Hey, so I heard you’re working on something super cool, do you think it’ll have the impact you’re hoping?” If you make them think you already know, they may just reveal what it is. This is a form of social engineering to trick someone into giving up information they shouldn’t.
Prompt injection is when an attacker manipulates an AI system into doing something it shouldn’t by crafting a seemingly legitimate prompt. Recall that prompts contain both instructions and data. It's not always clear the distinction between the two.

So what appears to be data could include malicious instructions. With a direct prompt injection, an attacker explicitly provides a cleverly crafted prompt that overrides or bypasses the model’s intended safety and content guidelines. With an indirect prompt injection the attack is embedded in external data sources that the LLM consumes and trusts.
Enterprises are augmenting their LLM-powered apps with sensitive data, pulling from internal knowledge bases, customer support logs, product documentation, and internal systems all to make "AI more helpful.”
Here's the scary part: every time you augment your prompts with data, you’re inviting an attack.
Some organizations believe they’re immune to this because they “own their data.” A false sense of security may feel good, but it's just that: a false sense of security. Your AI isn’t running in a vacuum. It ingests data from customers, partners, third-party APIs, and internal sources all of which could be maliciously manipulated to turn your fancy chatbot into a compliance and competitive liability.
Indirect Prompt Injection Example
In this amazing blog post from "Embrace the Red", we see a real-life indirect prompt injection attack on GitHub Copilot (via a VS Code plugin) which has since been fixed. In this attack the attacker plants hidden instructions inside a source code file which the copilot reads and interprets as a legitimate instruction. In this case, the instruction was disguised as markdown data which pointed to a URL for an image, only this was not an image at all. The URL in the markdown instructed the LLM to populate a call out to a malicious website with sensitive data in the payload. When Copilot renders the HTML/Markdown, it sends the data to the attacker.

In this case, we see an attacker doesn’t need access to the AI itself—just the data it processes. And guess what? Your enterprise chatbot is no different.
Enhancing our enterprise data with Retrieval-Augmented Generation (RAG) is supposed to make AI more accurate by pulling in real-world data. But what happens when that data gets poisoned with a prompt injection?
Consider the following scenario: you build an LLM-powered risk management system which uses RAG to inject financial transactions to assess for fraud. An attacker injects fake but realistic transactions into emails, support tickets, or chat logs. One of these logs gets processed by the LLM as part of a risk assessment and returns a result that ignores fraud.
This isn’t a hypothetical. Researchers have proven that injecting just a handful of malicious documents into a RAG system (poison RAG) can cause an LLM to return attacker-chosen answers over 90% of the time.
What does OWASP Say?
The OWASP Top 10 for LLM Applications explicitly highlights prompt injection as the number one risk to AI-powered systems. While there is no silver bullet, they outline several mitigation strategies that enterprises should adopt:
- Constrain Model Behavior – Define strict boundaries for what the LLM is allowed to do. This includes limiting the AI’s ability to take action beyond text generation and enforcing system-level constraints.
- Validate and Sanitize Inputs – Treat all input data as untrusted. Use strict validation techniques to sanitize both user-generated content and external documents before ingestion into the AI system.
- Output Filtering and Monitoring – Implement post-processing rules to analyze AI-generated responses for anomalies. Flag unexpected or potentially harmful outputs before they reach the end user.
- Restrict External Dependencies – Be cautious about what data sources your AI pulls from. Avoid blind trust in third-party APIs, open databases, or customer-generated data that hasn’t been vetted.
- Employ Adversarial Testing – Regularly conduct red-teaming exercises to simulate attacks and evaluate how the AI handles deceptive inputs.
- Context Isolation – Segregate data sources to prevent untrusted inputs from contaminating privileged or sensitive information streams.
OWASP's recommendations provide a strong foundation for mitigating prompt injection risks, but implementing them effectively in real-world enterprise environments is no trivial task. Traditional security measures—like input validation and adversarial testing—help, but they don’t offer a scalable, systematic way to govern how LLMs interact with data sources at runtime.
Can an AI Gateway Help?
In modern AI architectures, where LLMs ingest data from diverse internal and external sources, a more structured enforcement mechanism is needed. This is where an AI Gateway comes in. Similar to how API gateways secure and control access to backend services, an AI gateway acts as a policy enforcement layer for LLM interactions—validating inputs, filtering responses, and ensuring compliance with security best practices.

One thing we are working on here at Solo.io is implementing RAG directly in the AI gateway. With this feature, we can limit RAG adjusted prompts to only those data sources that have been approved. An AI gateway can help with the following to mitigate prompt injections (aligning with OWASP):
- Data Provenance & Validation – Ensures that external data sources feeding into the LLM are verified, reducing the risk of ingesting manipulated or malicious content.
- Eliminate Data Source Changes - The gateway enforces that no manipulation or changes to the trusted data sources can occur (at least not through LLM queries)
- Content Filtering & Anomaly Detection – Monitors incoming data and AI-generated responses for patterns indicative of prompt injection attempts.
- Rate Limiting & Access Controls – Restricts excessive or suspicious queries that might attempt to manipulate the model.
- Audit Logging & Monitoring – Provides visibility into what data is being injected into prompts, allowing security teams to detect and respond to threats.
There is no 100% foolproof solution to indirect prompt injection, but following the suggestions from OWASP, and architecturally controlling, vetting, and logging traffic to your LLM systems can give you greater confidence in your LLM security.