Event-driven architectures and event streaming applications are becoming core building blocks of modern, cloud-native applications that need to process real-time data at scale. Whether it’s website tracking, hyper-personalization, or fraud detection, applications have various requirements to produce, consume, and process events in high volumes with low latency.
In a previous blog post, I explained how Solo.io Gloo Gateway can be combined with Apache Kafka to expose Kafka systems to external consumers in a controlled manner. We demonstrated a number of Gloo Gateway’s features, adding authentication, authorization and rate-limiting capabilities to our Kafka architecture, laying the foundations for more advanced architectures powered by Gloo.
One of the most popular hosted and managed Kafka environments on the market is Confluent Cloud. As such, many of our customers have asked how they can integrate Confluent Cloud with Gloo. As in the previous article, we will explore how to integrate Gloo Gateway with Confluent Cloud, adding capabilities like authentication, authorization, and rate limiting to our Kafka clusters running in Confluent Cloud.
Apache Kafka and External Consumers
Kafka is a broker-based technology, in which a set of Kafka brokers form a cluster that processes and stores events. Kafka uses its own binary protocol for communication, which allows for efficient and flexible processing of events. At the same time, Kafka puts a lot of responsibility on the client. For example, the Kafka client is, and needs to be, fully aware of the broker topology (e.g. the locations of the individual brokers, which brokers are the topic partition leaders and replicas, etc.) and is responsible for keeping track of the offsets of the messages it has consumed. This makes Kafka clients relatively complex and non-trivial to implement. In addition, because the communication mechanism is not based on an open standard like HTTP, dedicated clients need to be built, maintained, and supported for the various programming languages and systems that connect to Kafka. This results in an ecosystem with clients at various levels of maturity, with the Java client being the most mature.
Due to the reasons above, exposing a Kafka-based system to external consumers can be more difficult than exposing HTTP-based microservices. For the latter, a large ecosystem of (API) gateways and (application) firewalls, like Solo.io Gloo Edge and Gloo Gateway, exists to expose systems to external consumers in a secured and controlled way. As long as clients can communicate over HTTP, they can consume the exposed functionality.
Event Gateway Pattern
A common architectural pattern to make an event streaming platform accessible to the widest possible range of clients is the Event Gateway pattern. In this pattern, access to the event streaming platform is provided via a standard, well-supported interface. An example of such an interface is a RESTful API over HTTP.
In the previous article, we explored how you can use a Kafka REST Proxy or HTTP Bridge to implement the Event Gateway pattern. The downside of that approach is that it adds an additional component (the Kafka REST Proxy) to your architecture, which needs to be deployed, managed, maintained, and operated.
With Confluent Cloud, users get a REST endpoint for their Kafka cluster out of the box. This allows us to directly integrate Gloo Gateway with Kafka clusters running in Confluent Cloud, removing the need for any additional components or moving parts.
Gloo Gateway
Gloo Gateway is a feature-rich, Kubernetes-native ingress controller and next-generation API gateway. Gloo Gateway provides function-level routing, discovery capabilities, advanced security, authentication and authorization semantics, rate limiting and more, with support for legacy apps, microservices, and serverless.
By using Gloo Gateway’s capabilities like authentication, authorization, rate limiting, and monitoring, combined with Confluent Cloud, we can expose our Kafka environment in a secured, controlled, and managed way to clients over HTTP(S) (REST). In addition, by exposing Kafka over HTTP, a Kafka client library no longer needs to be available for your programming language of choice. Any system that can communicate over HTTP can interact with the Kafka environment and produce events to the Kafka cluster.
Using the Confluent Cloud REST API
Prerequisites
To follow along with the example in this article, you will need:
- A Kubernetes cluster: this can be any Kubernetes cluster. In this example, we will run on a local, single-node minikube cluster.
- A Confluent Cloud Kafka cluster: this can be any Confluent Cloud Kafka cluster. Confluent provides a free plan that you can use to deploy a Kafka cluster free of charge. See here for the details.
- A Confluent Cloud API Key for your Kafka cluster.
- A Kafka topic named orders with 1 partition.
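If the orders topic does not exist yet, one way to create it is with the Confluent CLI. This is a minimal sketch, assuming the confluent CLI is installed and you are logged in to Confluent Cloud; you can also create the topic through the Confluent Cloud console:

```bash
# Select the target Kafka cluster (replace lkc-xxxxx with your cluster ID)
confluent kafka cluster use lkc-xxxxx

# Create the "orders" topic with a single partition
confluent kafka topic create orders --partitions 1
```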
Confluent Cloud REST API
The Confluent Cloud REST API provides a set of operations to:
- View existing topics
- Create topics
- Delete topics
- Produce records to a topic, etc.
In this article, we will cover the use cases of retrieving configuration information (topic information) from our Kafka cluster and producing data (records) to a Kafka topic using Gloo Gateway. Note, however, that all the other endpoints/operations can also be exposed via Gloo. This allows you to, for example, add advanced security features like a Web Application Firewall (WAF) and Open Policy Agent (OPA) policies to the management REST endpoints of your Kafka cluster.
The Confluent Cloud REST API provides two ways to produce records to a Kafka topic:
- Streaming mode: recommended when sending a batch of records to a topic
- Non-streaming mode: not recommended
As Confluent recommends using “streaming mode”, we will use that mode in this article. Note that “non-streaming mode” also works with Gloo Gateway, using the exact same setup that we will demonstrate in this article.
Retrieving Kafka Topic Information
We can use the Confluent Cloud REST API to retrieve information about our topics by calling the topics endpoint with an HTTP GET request. Use the following cURL command, replacing the {CONFLUENT_CLOUD_KAFKA_CLUSTER_REST_ENDPOINT} and {API-KEY} placeholders with the values of your Kafka cluster. Note that the {CONFLUENT_CLOUD_KAFKA_CLUSTER_REST_ENDPOINT} should look something like https://pkc-xxxxx.{region}.{cloud-provider}.confluent.cloud:443; you can find it in the cluster settings of your Confluent Cloud console.
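A sketch of that request is shown below. The Kafka REST API v3 path and the {CLUSTER_ID} placeholder (the logical cluster ID, e.g. lkc-xxxxx) are assumptions on top of the placeholders above, and {API-KEY} is assumed to be the base64-encoded key:secret pair of your Confluent Cloud API key, passed as HTTP Basic authentication:

```bash
# List the topics on the cluster via the Kafka REST API v3
curl -s \
  -H "Authorization: Basic {API-KEY}" \
  "{CONFLUENT_CLOUD_KAFKA_CLUSTER_REST_ENDPOINT}/kafka/v3/clusters/{CLUSTER_ID}/topics"
```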
This will return information about all the topics on our Kafka cluster. The output will look something like the following (abbreviated and illustrative; the exact fields and values depend on your cluster):
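```json
{
  "kind": "KafkaTopicList",
  "metadata": {
    "self": "https://pkc-xxxxx.{region}.{cloud-provider}.confluent.cloud/kafka/v3/clusters/{CLUSTER_ID}/topics"
  },
  "data": [
    {
      "kind": "KafkaTopic",
      "cluster_id": "{CLUSTER_ID}",
      "topic_name": "orders",
      "is_internal": false,
      "replication_factor": 3,
      "partitions_count": 1
    }
  ]
}
```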
Notice how the response contains the information about our orders topic.
Producing Records to Kafka
To send records to our Kafka orders topic, we can use the following cURL command. Replace the {CONFLUENT_CLOUD_KAFKA_CLUSTER_REST_ENDPOINT} and {API-KEY} placeholders with the values of your Kafka cluster:
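A sketch of the produce request, under the same assumptions as the topic listing above (Kafka REST API v3 path, an extra {CLUSTER_ID} placeholder, Basic authentication) and with a purely illustrative order payload:

```bash
# Produce a single record to the "orders" topic via the Kafka REST API v3
curl -s -X POST \
  -H "Authorization: Basic {API-KEY}" \
  -H "Content-Type: application/json" \
  "{CONFLUENT_CLOUD_KAFKA_CLUSTER_REST_ENDPOINT}/kafka/v3/clusters/{CLUSTER_ID}/topics/orders/records" \
  -d '{"value": {"type": "JSON", "data": {"orderId": 1, "item": "bicycle", "quantity": 2}}}'
```

In streaming mode, multiple records can be sent over the same connection by concatenating JSON objects like the one above in the request body.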
Send a couple of these requests to your topic. You can use the Confluent Cloud message viewer to see how the messages are being added to the topic.
Continue with part two of this blog series to learn about securing, controlling, and managing the Confluent Cloud REST API with Gloo Gateway.