LLM providers like OpenAI and Gemini offer advanced natural language processing capabilities, enabling innovative applications. An API gateway like Gloo Gateway is essential to efficiently and securely access these APIs. Gloo Gateway acts as a central hub, providing a unified endpoint and essential functionalities such as request routing, security enforcement, and monitoring.
If you’re building or considering building with LLM/GenAI APIs, please take our 30-second survey — we’d love to talk with you.
This article explores how Gloo Gateway can be configured to support LLM APIs, focusing on accessing the OpenAI API. We’ll cover topics such as enforcing security policies, managing API keys, and optimizing traffic routing. By integrating Gloo Gateway, organizations can streamline access to LLM APIs, enforce security policies, and simplify API management, enabling the development of innovative AI-driven applications.
What Is an API Gateway?
API gateways act as a bridge between clients (API consumers) and backend services, providing a single point of contact.
They offer a range of functionalities, such as request routing, protocol translation, security, monitoring, and API version management. By consolidating these features into a single service, API gateways simplify managing and securing API traffic, enhancing the system’s overall reliability and performance.
What Is an LLM API?
Large Language Model (LLM) APIs, such as OpenAI, Mistral, and Gemini, provide access to advanced natural language processing capabilities offered by language models like GPT. These APIs allow developers to use natural language prompts as inputs, enabling tasks such as text analysis, content generation, translation, and conversational interfaces.
Compliance With Organizational Standards
API gateways play a crucial role in enforcing security policies and ensuring that applications comply with organizational standards established by security and infosec teams. When it comes to preventing unauthorized access and maintaining the integrity and confidentiality of the data they process, LLM APIs are subject to the same requirements as any other API within the organization.
As the industry rushes to build new applications that consume LLM APIs, organizations need to give careful consideration to establishing clear usage policies and high security standards for these public AI models.
Let’s illustrate this scenario: To mitigate the risk of leaking customer data, the security team mandates that all applications need to use an internal gateway as the endpoint to access LLM APIs rather than connecting directly. By doing so, the gateway team can centralize control over traffic routing, observability, and security policies, enhancing the organization’s overall security posture.
While a security strategy is important, it is also critical to implement it in a way that minimally compromises developer agility. Platform and security teams looking to better control, or at a minimum monitor, the outflow of requests to public LLMs should look for solutions that provide benefits back to their consumers: in this case, the developers of these AI-enabled applications!
API gateways such as Gloo Gateway can provide:
- A unified endpoint to standardize internal API interfaces, with access to multiple LLM backends
- Authentication strategies to streamline access management and security in accordance with org-wide authentication standards (for example, issuing and refreshing individual API keys)
- Rate limiting strategies at the gateway and route level
- Traffic monitoring and logging
- Transformations for request/response and header shaping
- And many other features
Leveraging an API gateway to support LLM APIs can provide a robust solution for organizations seeking to enhance the security and consistency of their API infrastructure.
Configuring Gloo Gateway for LLMs
Assuming you’ve already set up Gloo Gateway, the next step is to use the ExternalService CRD and configure the LLM APIs you want the Gloo Gateway to handle. In this example, we’ll be using the OpenAI API; however, you can do this with other LLM APIs as well.
Here’s how you would create an ExternalService to represent the OpenAI API:
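Something along these lines should work — a minimal sketch using the Gloo Platform v2 API; the resource name and the gloo-mesh-gateways namespace are illustrative, and exact fields may vary by Gloo version:

```yaml
apiVersion: networking.gloo.solo.io/v2
kind: ExternalService
metadata:
  name: openai
  namespace: gloo-mesh-gateways
spec:
  # The external hostname this service represents
  hosts:
  - api.openai.com
  ports:
  - name: https
    number: 443
    protocol: HTTPS
    clientsideTls: {}   # originate TLS from the gateway to api.openai.com
```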
Then we can use a route table to configure Gloo Gateway to route to the external service. For example:
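A sketch of such a RouteTable — the virtual gateway reference and the route label are assumptions for this walkthrough; adapt them to your environment:

```yaml
apiVersion: networking.gloo.solo.io/v2
kind: RouteTable
metadata:
  name: direct-to-openai-routetable
  namespace: gloo-mesh-gateways
spec:
  hosts:
  - '*'                       # catch-all; use your real hostname in production
  virtualGateways:
  - name: istio-ingressgateway
    namespace: gloo-mesh-gateways
  http:
  - name: openai
    labels:
      route: openai           # used later to attach auth and transformation policies
    matchers:
    - uri:
        prefix: /openai
    forwardTo:
      # Rewrite the host and path before forwarding to the external service
      hostRewrite: api.openai.com
      pathRewrite: /v1/chat/completions
      destinations:
      - ref:
          name: openai
          namespace: gloo-mesh-gateways
        kind: EXTERNAL_SERVICE
        port:
          number: 443
```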
The above RouteTable is attached to a virtual gateway and listens for all hosts. Note that this is where you’d configure the actual hostname for your gateway, for example, mycompany.domain.com.
The RouteTable matches the incoming requests based on the configuration in the specified routes; in this case, the prefix /openai on the catch-all route.
Once an incoming request is matched on the prefix, it is forwarded to the OpenAI external service. Before the request is forwarded, the host is rewritten to api.openai.com and the path to /v1/chat/completions.
This configuration allows traffic destined for, say, mycompany.domain.com/openai to resolve to the OpenAI Completions API external service endpoint at api.openai.com/v1/chat/completions. Note that here we are specifying /v1/chat/completions in the prefix rewrite as an additional mechanism to scope the usage of this route down to the Completions API endpoint only; however, this is completely configurable by the platform team. This provides precise control over gradually rolling out additional LLM capabilities, such as routing to the image generations endpoint at /v1/images/generations using an organization-specific route path such as /openai/images.
At this point, you can send a request to Gloo Gateway and see that it is forwarded to the OpenAI API. Note that $GATEWAY points to the Gloo Gateway external address.
Let’s try this:
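A request along these lines — the body follows the OpenAI Chat Completions schema, and at this stage the caller still supplies the OpenAI key itself:

```sh
# $GATEWAY holds the gateway's external address, e.g. http://<LOAD-BALANCER-IP>
curl -s "$GATEWAY/openai" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Say hello!"}]
  }'
```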
It works! We can access the OpenAI API through Gloo Gateway, with the /openai path routing to the OpenAI LLM completions endpoint.
We can already see the benefits of implementing an “LLM proxy” compared to configuring a curl command for each public LLM directly.
OpenAI requires the API key as a header:
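For example (model name illustrative):

```sh
curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "Hello!"}]}'
```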
Gemini requires the API key as a query parameter:
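For example (model and API version illustrative):

```sh
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-pro:generateContent?key=$GEMINI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"contents": [{"parts": [{"text": "Hello!"}]}]}'
```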
Instead, we can provide unified consumer endpoints such as mycompany.domain.com/openai and mycompany.domain.com/gemini.
Protecting the Gloo Gateway With an API Key
Let’s explore how we can continue to optimize the security and developer productivity experience. Using an ExtAuthPolicy resource, you can enforce authentication and authorization for the traffic reaching the gateway. Gloo Gateway supports multiple types of external auth policies, but for this example we’ll use a simple API key to authenticate the requests.
For the following example, we will edit our current direct-to-openai-routetable so that we rewrite to /, as this example consumes the /v1/models endpoint:
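Only the forwardTo section of the RouteTable changes; a sketch:

```yaml
    forwardTo:
      hostRewrite: api.openai.com
      pathRewrite: /            # callers now append the OpenAI path, e.g. /openai/v1/models
      destinations:
      - ref:
          name: openai
          namespace: gloo-mesh-gateways
        kind: EXTERNAL_SERVICE
        port:
          number: 443
```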
Now create a Kubernetes Secret, in which we specify the API key a caller must provide to be authenticated by the Gloo Gateway:
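For example — the Secret name, label, and key value are illustrative; the extauth.solo.io/apikey type is what Gloo’s API key auth expects:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: openai-apikey
  namespace: gloo-mesh-gateways
  labels:
    auth: api-key                  # matched by the ExtAuthPolicy's label selector
type: extauth.solo.io/apikey
data:
  api-key: TXlTZWNyZXRBcGlLZXk=   # base64 of the example value "MySecretApiKey"
```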
Note that the value for api-key must be base64 encoded.
Before you can create an ExtAuthPolicy, you must configure the external auth server using the ExtAuthServer resource:
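A sketch, assuming the external auth service add-on runs in the gloo-mesh-addons namespace:

```yaml
apiVersion: admin.gloo.solo.io/v2
kind: ExtAuthServer
metadata:
  name: ext-auth-server
  namespace: gloo-mesh-gateways
spec:
  destinationServer:
    ref:
      name: ext-auth-service
      namespace: gloo-mesh-addons
    port:
      name: grpc
```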
For Gloo Gateway to enforce the API key authentication, you’ll create an ExtAuthPolicy resource. The resource tells the external auth server you configured previously to use API key authentication with the key provided in the Kubernetes Secret:
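A sketch of the policy — field names follow the Gloo Platform v2 ExtAuthPolicy API and may vary slightly by version:

```yaml
apiVersion: security.policy.gloo.solo.io/v2
kind: ExtAuthPolicy
metadata:
  name: openai-apikey-auth
  namespace: gloo-mesh-gateways
spec:
  applyToRoutes:
  - route:
      labels:
        route: openai             # the label we set on the RouteTable route
  config:
    server:
      name: ext-auth-server
      namespace: gloo-mesh-gateways
    glooAuth:
      configs:
      - apiKeyAuth:
          headerName: api-key     # header the caller must present
          k8sSecretApikeyStorage:
            labelSelector:
              auth: api-key       # selects the Secret created above
```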
The above policy configures apiKeyAuth by pointing to a Kubernetes Secret using label selectors and tells Gloo Gateway to compare the value provided in the api-key header with the value read from the Kubernetes Secret.
This time, if you repeat the same request as before, you’ll get an HTTP 401 Unauthorized response:
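For example:

```sh
curl -si "$GATEWAY/openai/v1/models" | head -1
# HTTP/1.1 401 Unauthorized
```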
The message is telling us that the Gloo Gateway is expecting an API key, but we haven’t provided one. Let’s try again with the API key value we set in the Kubernetes Secret:
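Using the example key we stored in the Secret:

```sh
curl -s "$GATEWAY/openai/v1/models" -H "api-key: MySecretApiKey"
```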
Gloo Gateway lets the request through; however, we still get an HTTP 401 message, this time from the OpenAI API.
The OpenAI API expects an OpenAI API key in the authorization header, following this format: Authorization: Bearer OPENAI_API_KEY. Instead of including this header for each request, we could automatically attach it to all requests sent to the OpenAI API at the gateway level.
Let’s modify the original Kubernetes Secret to include the OpenAI API key. Then, we can extract the OpenAI API key from the Secret and add it to a header on requests passing through Gloo Gateway:
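The updated Secret might look like this — the openai-api-key entry holds your real OpenAI key, base64 encoded:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: openai-apikey
  namespace: gloo-mesh-gateways
  labels:
    auth: api-key
type: extauth.solo.io/apikey
data:
  api-key: TXlTZWNyZXRBcGlLZXk=   # key callers present to the gateway
  openai-api-key: c2stLi4u        # base64 of your OpenAI key (here, the placeholder "sk-...")
```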
You can update the ExtAuthPolicy to read the openai-api-key value from the Secret and put it in a header called x-api-key:
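A sketch of the updated policy — headersFromMetadataEntry is Gloo’s API key auth mechanism for copying Secret values into headers; verify the exact field names for your version:

```yaml
apiVersion: security.policy.gloo.solo.io/v2
kind: ExtAuthPolicy
metadata:
  name: openai-apikey-auth
  namespace: gloo-mesh-gateways
spec:
  applyToRoutes:
  - route:
      labels:
        route: openai
  config:
    server:
      name: ext-auth-server
      namespace: gloo-mesh-gateways
    glooAuth:
      configs:
      - apiKeyAuth:
          headerName: api-key
          headersFromMetadataEntry:
            x-api-key:
              name: openai-api-key    # Secret key copied into the x-api-key header
          k8sSecretApikeyStorage:
            labelSelector:
              auth: api-key
```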
At this point, the external auth policy assigns the OpenAI API key value to the header called x-api-key. Still, we must tell Gloo Gateway to construct the authorization header in the format the OpenAI API expects. We can do that with the TransformationPolicy resource:
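A sketch of the policy — the Inja extractor syntax is assumed from Gloo’s transformation API:

```yaml
apiVersion: trafficcontrol.policy.gloo.solo.io/v2
kind: TransformationPolicy
metadata:
  name: openai-auth-header
  namespace: gloo-mesh-gateways
spec:
  applyToRoutes:
  - route:
      labels:
        route: openai
  config:
    request:
      injaTemplate:
        extractors:
          openai_api_key:
            header: 'x-api-key'   # header populated by the ExtAuthPolicy
            regex: '.*'           # capture the whole header value
        headers:
          authorization:
            text: 'Bearer {{ openai_api_key }}'
```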
The policy uses an Inja template to extract the value from the x-api-key header and store it in the openai_api_key variable. In the headers field, we then specify the value of the authorization header by concatenating the word Bearer with the value of openai_api_key.
If you send a request to the gateway, you’ll get back a valid response from the OpenAI API:
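For example — note that only the internal api-key header is needed now; the gateway attaches the OpenAI credentials:

```sh
curl -s "$GATEWAY/openai/v1/models" -H "api-key: MySecretApiKey"
```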
Let’s also try sending a simple prompt to the /v1/chat/completions endpoint:
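For example (model name illustrative):

```sh
curl -s "$GATEWAY/openai/v1/chat/completions" \
  -H "api-key: MySecretApiKey" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "What is an API gateway?"}]
  }'
```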
To summarize, in this example we have added an additional layer of security on our API gateway that allows us to mask our LLM API key(s) using custom headers. This shifts the control of what public LLM API keys are being consumed and how they are created and rotated into the platform team’s hands, reducing individual/team API key sprawl.
Customizing LLM API Definition
Notice the requests to the chat completion API are more complex and can include things such as:
- Prompts (user and/or system)
- Model
- Temperature
- Frequency penalty
- Logit bias
- Maximum number of tokens
- Number of chat completions
- and many more.
Just like we automatically attach an API key to the requests, we can also use transformation policies to create prompt templates, handle query parameters and API Key substitution, modify the request body before being sent to the OpenAI API, and much more. Take a look at some additional in-depth examples of how to configure Gloo Gateway for LLM backends.
Additionally, not all LLM APIs share the same API definition, which requires developers to learn each API separately. To solve this, you can customize the LLM API definition you provide to your applications and let Gloo Gateway transform it into the correct backend LLM API definition.
Here’s an example of a transformation policy that takes a prompt from the client requests and adds a model name and a system message at the API gateway level:
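A sketch — the model and system message are illustrative; the body template replaces the client’s request body before it reaches OpenAI:

```yaml
apiVersion: trafficcontrol.policy.gloo.solo.io/v2
kind: TransformationPolicy
metadata:
  name: openai-prompt-template
  namespace: gloo-mesh-gateways
spec:
  applyToRoutes:
  - route:
      labels:
        route: openai
  config:
    request:
      injaTemplate:
        body:
          text: |
            {
              "model": "gpt-3.5-turbo",
              "messages": [
                {"role": "system", "content": "You are a helpful assistant."},
                {"role": "user", "content": "{{ prompt }}"}
              ]
            }
```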
Note that {{ prompt }} is Inja notation that extracts the attribute value (prompt) from the request body. If you send a request to the gateway, you’ll see the response is formatted according to the prompt, and we didn’t have to follow the API definition of the backing LLM API; we only sent the prompt value.
Here’s the response we passed through the jq tool to retrieve only the actual content of the message:
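Something like the following — the .choices[0].message.content path is the standard OpenAI chat response structure:

```sh
curl -s "$GATEWAY/openai/v1/chat/completions" \
  -H "api-key: MySecretApiKey" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "What is an API gateway?"}' | jq -r '.choices[0].message.content'
```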
We can take this even further. Instead of using jq to parse the response at the client, we could update the response from OpenAI and only return the content that the caller cares about:
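A sketch of the addition to the TransformationPolicy — Inja index notation for arrays is assumed:

```yaml
  config:
    request:
      injaTemplate:
        # ... prompt template from above ...
    response:
      injaTemplate:
        body:
          text: '{{ choices.0.message.content }}'
```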
We added the response field to the configuration and defined another Inja template that extracts the specific field from the response we receive from OpenAI.
Get Started With Gloo Gateway and LLM APIs
Using Gloo Gateway to support LLM APIs offers a robust solution for organizations seeking to enhance their API infrastructure’s security, reliability, and performance.
Organizations can enforce security policies, control traffic routing, and simplify API management by centralizing access through the gateway. Developers benefit from a unified internal endpoint, reducing the complexity of managing individual API keys and ensuring application consistency.
If you’re building or considering building with LLM/GenAI APIs, please complete this 30-second survey.
Contact Solo.io to learn how we can help you leverage Gloo Gateway for your LLM APIs and other API management needs.