Preface: Kitex Proxyless enables Kitex services to interact directly with istiod without an Envoy sidecar. It dynamically obtains the service governance rules delivered by the control plane over the xDS protocol and converts them into Kitex rules to implement service governance functions such as traffic routing. With Kitex Proxyless, Kitex can be managed by a Service Mesh without a sidecar, and the governance rule spec, governance control plane, governance delivery protocol, and heterogeneous data governance capability can be unified across multiple deployment modes. This post rewrites the Bookinfo project with Kitex and Hertz to demonstrate how to implement a traffic lane using the xDS protocol.
Kitex is a Golang RPC framework open-sourced by ByteDance. It natively supports the standard xDS protocol and can be managed by a Service Mesh in a Proxyless way. For the detailed design, refer to Proposal: Kitex support xDS Protocol. The official documentation is available at Kitex/Tutorials/Advanced Feature/xDS Support.
Kitex Proxyless simply means that Kitex services can interact directly with istiod without an Envoy sidecar and dynamically obtain the service governance rules delivered by the control plane over the xDS protocol. Those rules are translated into the corresponding Kitex rules to implement service governance functions (such as traffic routing, which is the focus of this blog).
With Kitex Proxyless, Kitex applications can be managed by a Service Mesh in a unified manner without a sidecar, so the governance rule spec, governance control plane, governance delivery protocol, and heterogeneous data governance capability can be unified across multiple deployment modes.
Traffic routing refers to the ability to route traffic to a specified destination based on its specific metadata identifier.
Traffic routing is one of the core capabilities in service governance and one of the first scenarios that Kitex Proxyless supports.
The approach Kitex takes to implement traffic routing based on xDS is as follows:
Specific procedure:
As you can see, traffic routing is a process of selecting the corresponding SubCluster according to certain rules.
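The idea of selecting a SubCluster by rule can be sketched in a few lines of Go. This is only an illustration, not Kitex's actual implementation: the `RouteRule` type, the `SelectSubCluster` function, and the SubCluster names are all hypothetical.

```go
package main

import "fmt"

// RouteRule is a hypothetical, simplified routing rule: when every key/value
// pair in Match is present in the request metadata, traffic goes to SubCluster.
// A rule with an empty Match matches every request.
type RouteRule struct {
	Match      map[string]string
	SubCluster string
}

// SelectSubCluster returns the SubCluster of the first rule whose conditions
// are all satisfied, or fallback when no rule matches.
func SelectSubCluster(rules []RouteRule, metadata map[string]string, fallback string) string {
	for _, r := range rules {
		if matches(r.Match, metadata) {
			return r.SubCluster
		}
	}
	return fallback
}

// matches reports whether got contains every key/value pair in want.
func matches(want, got map[string]string) bool {
	for k, v := range want {
		if gv, ok := got[k]; !ok || gv != v {
			return false
		}
	}
	return true
}

func main() {
	// Hypothetical rule: requests tagged env=dev go to the "reviews-dev" SubCluster.
	rules := []RouteRule{
		{Match: map[string]string{"env": "dev"}, SubCluster: "reviews-dev"},
	}
	fmt.Println(SelectSubCluster(rules, map[string]string{"env": "dev"}, "reviews-base"))
	fmt.Println(SelectSubCluster(rules, map[string]string{}, "reviews-base"))
}
```

Rules are evaluated in order and the first match wins, which mirrors how ordered route lists are usually evaluated; a real data plane would also handle weights and richer match types.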
Building on the traffic routing capability, we can support many usage scenarios, such as A/B testing, canary release, and blue-green release, as well as the focus of this post: the traffic lane.
A traffic lane can be understood as splitting a group of service instances in a certain way (for example, by deployment environment) and then using the routing capability together with global metadata so that traffic flows only through the specified lane of service instances according to exact rules (logically similar to lanes in a swimming pool). Traffic lanes can be used for full-path grey release.
In Istio we typically group instances with DestinationRule subsets, splitting a service into multiple subsets (for example, based on attributes such as version and region), and then use a VirtualService to define the corresponding routing rules and route traffic to the matching subset. This provides the single-hop routing capability within a lane.
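Grouping instances into subsets by label can be sketched as follows. This is a minimal illustration of the concept, not Istio's or Kitex's code: the `Instance` type, the `SubsetOf` function, and the addresses are hypothetical.

```go
package main

import "fmt"

// Instance is a hypothetical service instance with the labels it was deployed with.
type Instance struct {
	Addr   string
	Labels map[string]string
}

// SubsetOf returns the instances whose labels contain every key/value pair in
// selector, similar in spirit to how a subset selects pods by label.
func SubsetOf(instances []Instance, selector map[string]string) []Instance {
	var out []Instance
	for _, in := range instances {
		ok := true
		for k, v := range selector {
			if got, found := in.Labels[k]; !found || got != v {
				ok = false
				break
			}
		}
		if ok {
			out = append(out, in)
		}
	}
	return out
}

func main() {
	// Made-up addresses for illustration only.
	pods := []Instance{
		{Addr: "10.0.0.1:8888", Labels: map[string]string{"version": "v1"}},
		{Addr: "10.0.0.2:8888", Labels: map[string]string{"version": "v2"}},
		{Addr: "10.0.0.3:8888", Labels: map[string]string{"version": "v1"}},
	}
	for _, in := range SubsetOf(pods, map[string]string{"version": "v1"}) {
		fmt.Println(in.Addr)
	}
}
```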
However, the traffic routing capability alone is not enough to implement a traffic lane. When a request spans multiple services, we need a reliable mechanism to identify the traffic accurately and to configure routing rules for the traffic at each hop based on that identification.
As shown in the following figure: suppose we want a user request to be accurately routed to the v1 version of service-b.
The first thought might be to put uid = 100 in the request header and configure the corresponding VirtualService to match uid = 100 in the header.
But this approach has several obvious drawbacks:
Therefore, to achieve uniform traffic routing across the full path, we also need a more general traffic dyeing mechanism and the ability to transmit the dye identifier through the full path.
Traffic dyeing refers to marking request traffic with a special identifier and carrying this identifier along the full request path. In a traffic lane, all services in the path set their traffic routing rules based on the same grey traffic dye identifier, so that traffic can be accurately kept within different lanes.
Usually, traffic dyeing is done at the gateway layer, and the metadata in the original request is converted into corresponding dye identifiers according to certain rules (conditions and proportions).
When the request metadata meets certain conditions (for example, uid = 100 in the request header and cookie matching), the current request is marked with a dye identifier. With a unified traffic dyeing mechanism, we do not need to care about specific business attribute identifiers when configuring routing rules. We only need to configure routes based on the dye identifiers. The specific business attributes are abstracted into conditional dyeing rules, which makes the routing rules more universal. Even if the business attributes change, the routing rules do not need to change frequently.
The dyed identifier is usually transmitted through the Tracing Baggage, which is used to pass business custom KV attributes through the entire call chain (full-path), such as traffic dyeing identifiers and business identifiers such as AccountID.
We usually use the Tracing Baggage mechanism to transmit the corresponding dye identifiers through the full path. Most tracing frameworks support the Baggage concept, such as OpenTelemetry, SkyWalking, and Jaeger.
With a universal full-path transmission mechanism, a service only needs to configure tracing once, and there is no need to adapt every time the business attribute identifier changes.
The next part introduces and demonstrates how to implement a traffic lane based on Kitex Proxyless and OpenTelemetry Baggage, using a concrete engineering example.
The demo is a rewrite of the Istio Bookinfo project using Hertz and Kitex:
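To make the full-path transmission concrete, here is a minimal standard-library sketch of carrying a `baggage` entry from an inbound request to an outbound one. It assumes the W3C-style `baggage` header format (comma-separated `key=value` members); `ParseBaggage` and `Propagate` are hypothetical helpers, and a real service would rely on its tracing SDK's propagators instead of hand-written parsing.

```go
package main

import (
	"fmt"
	"strings"
)

// ParseBaggage parses a baggage header value such as "env=dev,accountId=42"
// into a map. Member properties after ';' are dropped and percent-decoding is
// skipped to keep the sketch short.
func ParseBaggage(header string) map[string]string {
	out := make(map[string]string)
	for _, member := range strings.Split(header, ",") {
		member = strings.TrimSpace(member)
		if i := strings.Index(member, ";"); i >= 0 {
			member = member[:i]
		}
		kv := strings.SplitN(member, "=", 2)
		if len(kv) != 2 {
			continue // skip malformed members
		}
		key := strings.TrimSpace(kv[0])
		if key == "" {
			continue
		}
		out[key] = strings.TrimSpace(kv[1])
	}
	return out
}

// Propagate copies the baggage entry from inbound metadata to outbound
// metadata so that the next hop sees the same dye identifier.
func Propagate(inbound, outbound map[string]string) {
	if v, ok := inbound["baggage"]; ok {
		outbound["baggage"] = v
	}
}

func main() {
	inbound := map[string]string{"baggage": "env=dev,accountId=42"}
	outbound := map[string]string{}
	Propagate(inbound, outbound)
	fmt.Println(ParseBaggage(outbound["baggage"])["env"])
}
```

Because the header is forwarded unchanged, intermediate services do not need to know which business attribute produced the dye identifier.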
In keeping with Bookinfo, the overall business architecture is divided into four separate microservices:
productpage - This microservice calls details and reviews;
details - This microservice contains information about the book;
reviews - This microservice contains book-related reviews. It also calls ratings;
ratings - This microservice contains ratings information consisting of book reviews.
reviews is available in three versions:
The whole call chain is divided into 2 lanes:
The gateway is responsible for traffic dyeing. For example, a request with uid=100 in the request header is dyed and carries the baggage env=dev.
The dyeing method may vary between gateways. For example, when we select Istio ingress as the gateway, we can use EnvoyFilter + Lua to write the gateway dyeing rules.
Take the reviews service as an example. You only need to label the corresponding pod with version: v1.
The gateway has already dyed requests that carry uid=100 in the header and automatically attached the env=dev baggage, so we only need to match the route according to the header. Here is an example of the route rule configuration:
Requests without the uid=100 header in the inbound traffic are automatically routed to the base lane, which round-robins between v1 and v3 of the reviews service, so the displayed score alternates between 0 and 1.
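The base-lane behavior can be sketched as a round-robin over the v1 and v3 subsets, with dyed traffic diverted to a branch subset. This is a simplified illustration under stated assumptions: the `RoundRobin` type and `PickLane` function are hypothetical, and the branch subset name passed in `main` is an assumption rather than something taken from the demo configuration.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// RoundRobin cycles through a fixed list of subset names.
type RoundRobin struct {
	subsets []string
	next    uint64
}

// NewRoundRobin builds a picker over the given subsets.
func NewRoundRobin(subsets ...string) *RoundRobin {
	return &RoundRobin{subsets: subsets}
}

// Pick returns the next subset in order, or "" when no subsets are configured.
func (r *RoundRobin) Pick() string {
	if len(r.subsets) == 0 {
		return ""
	}
	n := atomic.AddUint64(&r.next, 1) - 1
	return r.subsets[n%uint64(len(r.subsets))]
}

// PickLane sends requests dyed with env=dev to the branch subset and
// everything else to the base lane's round-robin.
func PickLane(baggage map[string]string, branch string, base *RoundRobin) string {
	if baggage["env"] == "dev" {
		return branch
	}
	return base.Pick()
}

func main() {
	base := NewRoundRobin("v1", "v3")
	// "v2" as the branch subset is an assumption made for this sketch.
	fmt.Println(PickLane(map[string]string{}, "v2", base))
	fmt.Println(PickLane(map[string]string{}, "v2", base))
	fmt.Println(PickLane(map[string]string{"env": "dev"}, "v2", base))
}
```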
We use the mod-header browser plug-in to simulate inbound requests that carry uid=100 in the request header.
Click the refresh button again and you can see that the request hits the branch lane, which means the traffic lane has taken effect.
So far, we have implemented a complete full-path traffic lane based on Kitex Proxyless and OpenTelemetry, and we can set the corresponding routing rules for Kitex using the standard Istio governance rule spec without an Envoy sidecar.
In addition to traffic routing, Kitex Proxyless is continuously being iterated and optimized to meet more requirements for data plane governance capabilities. As an exploration and practice of the Service Mesh data plane, Proxyless not only enriches the deployment forms of the data plane but also helps to continuously polish Kitex, enhance its compatibility with the open source ecosystem, and create a more open and inclusive microservice ecosystem.
Here is a list of the projects involved in the demo:
This demo has been submitted to the biz-demo repository and will be optimized continuously. biz-demo will include complete demos based on the CloudWeGo technology stack for specific business scenarios. The original intention is to provide valuable references for enterprise users who run CloudWeGo in production. Contributors are always welcome to take part in CloudWeGo biz-demo. Let’s try something fun together.