Skip to main content

8 posts tagged with "kubevela"

View All Tags

KubeVela v1.3 released, CNCF's Next Generation of Cloud Native Application Delivery Platform

KubeVela Community

KubeVela Community

KubeVela Team

Thanks to the contribution of hundreds of developers from KubeVela community and around 500 PRs from more than 30 contributors, KubeVela version 1.3 is officially released. Compared to v1.2 released three months ago, this version provides a large number of new features in three aspects as OAM engine (Vela Core), GUI dashboard (VelaUX) and addon ecosystem. These new features are derived from the in-depth practice of many end users such as Alibaba, LINE, China Merchants Bank, and iQiyi, and then finally become part of the KubeVela project that everyone can use out of the box.

Pain Points of Application Delivery#

So, what challenges have we encountered in cloud-native application delivery?

Hybrid clouds and multi-clusters is the new norm#

On one hand, as global cloud providers' service maturing, the way most enterprises build infrastructure has become mainly replying on cloud providers and self-built as a supplement. More and more business enterprise can directly enjoy the business convenience brought by the development of cloud technology, use the elasticity of the cloud, and reduce the cost of self-built infrastructure. Enterprises need a standardized application delivery layer, which can include containers, cloud services and various self-built services in a unified manner, so as to easily achieve cloud-to-cloud interoperability, reduce the risks brought by tedious application migration, and worry-free cloud migration.

On the other hand, for security concerns such as infrastructure stability and multi-environment isolation and due to limitations by the maximized size of Kubernetes can handle, more and more enterprises are beginning to adopt multiple Kubernetes clusters to manage container workloads. How to manage and orchestrate container applications at the multi-cluster level, and solve problems such as scheduling, dependencies, versions, and gray releasing, while providing business developers with a low-threshold experience, is a big challenge.

It can be seen that the hybrid cloud and multi-cluster involved in modern application delivery are not only multiple Kubernetes clusters, but also diverse workloads and DevOps capabilities for managing cloud services, SaaS, and self-built services.

How to pick from more than 1000+ techniques in cloud-native era#

Let's take the open-source projects that have joined the CNCF ecosystem as an example, the number of which has exceeded 1,000. For teams of different scales, different industries, and different technical backgrounds, it seems that the R&D team is doing similar business application delivery and management, but with changes in requirements and usage scenarios, huge differences in technology stacks will be derived. This involves a very large learning cost and threshold for integration and migration. And CNCF's thousands of ecological projects are always tempting us to integrate new projects, add new features, and better accomplish business goals. The era of static technology stacks is long gone.

alt Figure 1. CNCF landscape

Next-generation application delivery and management require flexible assembly capabilities. According to the needs of the team, based on the minimum capability set, new functions can be expanded at a small cost, but not significantly enlarged. The traditional PaaS solution based only on a set of fixating experiences has been proven to be difficult to meet the changing scenario needs of a team during product evolution.

Next step of DevOps, delivering and managing applications for diverse infrastructures#

For more than a decade, DevOps technology has been evolving to increase productivity. Today, the production process of business applications has also undergone great changes, from the traditional way of coding, testing, packaging, deployment, maintenance, and observation, to the continuous enhancement of cloud infrastructure meaning various SaaS services based on API directly become an integral part of the application. From the diversification of development languages to the diversification of deployment environments, to the diversification of components, the traditional DevOps toolchain is gradually unable to cope with and meanwhile, the complexity of user needs is increasing exponentially.

Although DevOps prolongs, we need some different solutions. For modern application delivery and management, we still have the same pursuit of reducing human input as much as possible and becoming more intelligent. The new generation of DevOps technology needs to have easier-to-use integration capabilities, service mesh capabilities, and management capabilities that integrate observation and maintenance. At the same time, the tools need to be simple and easy to use, and the complexity stays within the platform. When choosing, enterprises can combine their own business needs, cooperate with the new architecture and legacy systems, and assemble a platform solution suitable for their team, to avoid the new platform becoming a burden for business developers or enterprises.

The Path of KubeVela Lies Ahead#

To build the next generation application delivery platform, we do: alt Figure 2. Overlook of OAM/KubeVela ecosystem

OAM(Open Application Model): evolving methodology in fast pacing practice#

Based on the internal practical experience of Alibaba and Microsoft, we launched OAM, a brand-new application model and concept in 2019. Its core idea lies in the separation of concerns, through the unified abstraction of components and traits, it can standardize business research and development in the cloud-native era. Collaboration between development team and DevOps team becomes more efficient, and at the same time we expect to avoid the complexity caused by differences in different infrastructures. We then released KubeVela as a standardized implementation of the OAM model to help companies quickly implement OAM while ensuring that OAM-compliant applications can run anywhere. In short, OAM describes the complete components of a modern application in a declarative way, while KubeVela runs according to the final state declared by OAM. Through the reconcile loop oriented to the final state, the two jointly ensure the consistency and correctness of application delivery.

Recently, we have seen a paper published by Google announcing the results of its internal learning in infrastructure construction named as "Prodspec and Annealing". Its design concept and practice are strikingly similar to "OAM and KubeVela". It can be seen that different enterprises in global shares the same vision for delivering cloud-native applications. This paper also re-confirm the correctness of the standardized model and KubeVela. In the future, we will continue to promote the development of the OAM model based on the community's practice and evolution of KubeVela, and continue to deposit best practices into methodology.

A universal hybrid environment and multi-cluster delivery control plane#

The kernel of KubeVela exists in the form of a CRD Controller, which can be easily integrated with the Kubernetes ecosystem, and the OAM model is also compatible with the Kubernetes API. In addition to the abstraction and orchestration capabilities of the OAM model, KubeVela's microkernel is also a natural application delivery control plane designed for multi-cluster and hybrid cloud environments. This also means that KubeVela can seamlessly connect diverse workloads such as cloud resources and containers, and orchestrate and deliver them in different clouds and clusters.

In addition to the basic orchestration capabilities, one core feature of KubeVela is that it allows users to customize the delivery workflow. The workflow steps provide deploying components to the cluster, setting up manual approval, sending notifications, etc. When the workflow execution enters a stable state (such as waiting for manual approval), KubeVela will also automatically maintain the state. Or, through the CUE-based configuration language, you can integrate any IaC-based process, such as Kubernetes CRD, SaaS API, Terraform module, image script, etc. KubeVela's IaC extensibility enables it to integrate Kubernetes' ecological technology at a meager cost. It is very quickly for platform builders to incorporate into their own PaaS or delivery systems. Also, through KubeVela's powerful extensibility, other ecological capabilities can be standardized for enterprise users.

In addition to the advanced model and extended kernel, we've also heard a lot from the community to call out an out-of-the-box product that makes using KubeVela easier. Since version 1.2, the community has invested in developing the GUI dashboard (VelaUX) project, based on KubeVela's microkernel, running on top of the OAM model and creating a delivery platform for CI/CD scenarios. We hope that enterprises can swiftly adopt VelaUX to meet business needs and have a robust, extensible ability to meet the needs of future businesses. alt Figure 3. Product architecture of KubeVela

Around this path, in version 1.3, the community brought the following updates:

Enhancement as a Kubernetes Multi-Cluster Control Plane#

No migration and switch to multi-cluster seamlessly#

After the enterprise has completed the application transformation to a cloud-native architecture, is it still necessary to perform configuration transformation when switching to multi-cluster deployment? The answer is negative.

KubeVela is naturally built upon a multi-cluster basis. As shown in Figure 4, this application YAML represents an application of the Nginx component that will be published to all clusters labeled as region=hangzhou. For the same application description, we only need to specify the name of the cluster to be delivered in Policy or filter specific collections by tags. alt Figure 4. OAM application - select deployment cluster

Of course, the application description shown in Figure 4 is entirely based on the OAM specification. If your current application has been defined in Kubernetes native resources, don't worry, we support the smooth transition from it, as shown in Figure 5 below, "Referencing Kubernetes resources for multi-cluster deployment," which describes a particular application whose components depend on a Secret resource that exists in the control cluster and publishes it to all clusters labeled as region=hangzhou. alt Figure 5. Reference Kubernetes native resource

In addition to multi-cluster deployment of applications, referencing Kubernetes objects can also be used in scenarios such as multi-cluster replication of existing resources, cluster data backup, etc.

Handling multi-cluster differences#

Although the application has been described in a unified OAM Model, there may be differences in the deployment of different clusters. For example, other regions use different environment variables and image registries. Different clusters deploy various components, or a component is deployed in multiple clusters but works as high availability for all, etc. For such requirements, we provide a deployment strategy to do differentiated configuration, as shown in Figure 6 below, as part of this kind of Policy. The first and second topology types of Policy define two target strategies in two ways. The third one means to deploy only the specified components. The fourth Policy represents the deployment of the selected two kinds of components and the difference in the image configuration of one of the components. alt Figure 6. Differentiated configuration of multi-clusters

KubeVela supports flexible differential configuration policies, which can be configured by component properties, traits, and other forms. As shown in the figure above, the third strategy describes the component selection ability, and the fourth strategy describes the difference between the image version. We can see that there is no target specified when describing the difference. The differentiated configuration can be patched flexibly by combining it into the workflow steps.

Configure a multi-cluster delivery process#

The application delivery process to different target clusters is controllable and described by workflow. As shown in Figure 7, the steps of deploying to two clusters and the target policy and differentiation strategy were adopted, respectively. The above shows that policy deployment only needs to be defined atomically and can be flexibly combined in the workflow steps to meet the requirements of different scenarios. alt Figure 7. Customize the multi-cluster delivery process

There are many more usages for delivery workflow, including multi-cluster canary release, manual approval, precise release control, etc.

Version control, safe and traceable#

The description of complex applications is changing at any time with agile development. To ensure the security of application release, we need to have the ability to roll back our application to a previous correct state at the time of release or after release. Therefore, we have introduced a more robust versioning mechanism in the current version. alt Figure 8. Querying historical version of the application

We can query every past version of an application, including its release time and whether it was successful or not. We can compare the changes between versions and quickly roll back based on the snapshot rendered by the previous successful version when we encounter a failure during the release. After releasing a new version, you don't need to change the configuration source if it fails. You can directly re-release based on a history version. The version control mechanism is the centralized idea of application configuration management. After the complete description of the application is rendered uniformly, it is checked, stored, and distributed.

See more Vela Core usages#

VelaUX Introduces Multi-Tenancy Isolation and User Authentication#

Multi-tenancy and isolation for enterprises#

In VelaUX, we introduce the concept of a Project that separate multi-tenancy for safety, including application delivery targets, environments, team members and permissions, etc. Figure 9 shows the project list page. Project administrators can create different projects on this page according to the team's needs to allocate corresponding resources. This capability becomes very important when there are multiple teams or multiple project groups in the enterprise publishing their business applications using the VelaUX platform simultaneously. alt Figure 9. Project management page

Open Authentication & RBAC#

As a vital platform, user authentication is one of the basic capabilities that must be possessed. Since version 1.3, we have supported user authentication and RBAC authentication.

We believe that most enterprises have built a unified authentication platform (Oauth or LDAP) for user authentication. Therefore, VelaUX prioritizes Dex getting through the single sign-on capability, supports LDAP, OIDC, Gitlab/Github, and other user authentication methods, and regards VelaUX as one of the sub portals that let access get through. Of course, if your team does not need unified authentication, we also provide basic local user authentication capabilities. alt Figure 10. Local user management

For authentication, we use the RBAC model. Still, we also saw that the primary RBAC mode could not handle more precise permission control scenarios, such as authorizing the operation rights of an application to specific users. We inherit the design concept of IAM and expand the permissions to the policy composition of resource + action + condition + behavior. The authentication system (front-end UI authentication/back-end API authentication) has implemented policy-oriented fine-grained authentication. However, in terms of authorization, the current version only has some built-in standard permission policies, and subsequent versions provide the ability to create custom permissions.

At the same time, we have also seen that some large enterprises have built independent IAM platforms. The RBAC data model of VelaUX is the same as that of common IAM platforms. Therefore, users who wish to connect VelaUX to their self-built IAM can extend seamlessly.

More secure centralized DevOps#

There will inevitably be some configuration management of operation and maintenance requirements in application delivery, primarily based on multi-cluster. The configuration management requirements are particularly prominent, such as the authentication configuration of the private image repository, the authentication configuration of the Helm product library, or the SSL certificate Wait. We need to uniformly manage the validity of these configurations and securely synchronize them where they are needed, preferably without business developer awareness.

In version 1.3, we introduced a module for integrated configuration management in VelaUX. Its bottom layer also uses component templates and application resource distribution links to manage and distribute configurations. Currently, Secret is used for configuration storage and distribution. The configuration lifecycle is independent of business applications, and we maintain the configuration distribution process independently in each project. You only need to fill in the configuration information for administrator users according to the configuration template. alt Figure 11. Integration configuration

Various Addons provide different configuration types, and users can define more configuration types according to their needs and manage them uniformly. For business-level configuration management, the community is also planning.

See more VelaUX usages#

Introducing version control in Addon ecosystem#

The Addon function was introduced in version 1.2, providing an extended plug-in specification, installation, operation, and maintenance management capabilities. The community can expand the ecological capacities of KubeVela by making different Addons. When our plug-ins and frameworks are constantly iterating, the problem of version compatibility gradually emerges, and we urgently need a version management mechanism.

  • Addon version distribution: We develop and manage the community's official Addon on Github. In addition to the integrated third-party product version, each Addon also includes a Definition and other configurations. Therefore, after each Addon is released, we define it according to its Definition. The version number is packaged, and the history is preserved. At the same time, we reused Helm Chart's product distribution API specification to distribute Addon.
  • Addon version distribution: We develop and manage the community's official Addon on Github. In addition to the integrated third-party product version, each Addon also includes a Definition and other configurations. Therefore, after each Addon is released, we define it according to its Definition. The version number is packaged, and the history is preserved. At the same time, we reused Helm Chart's product distribution API specification to distribute Addon.

Multi-cluster Addon controllable installation#

A type of Addon needs to be installed in the subcluster when installing, such as the FluxCD plug-in shown in Figure 12, which provides Helm Chart rendering and deployment capabilities. We need to deploy it to sub-clusters, and in the past, this process was distributed to all sub-clusters. However, according to community feedback, different plug-ins do not necessarily need to be installed in all clusters. We need a differential processing mechanism to install extensions to specified clusters on demand. alt Figure 12 Addon configuration

The user can specify the cluster to be deployed when enabling Addon, and the system will deploy the Addon according to the user's configuration.

New members to Addon ecosystem#

While iteratively expanding the framework's capabilities, the existing Addons in the community are also continuously being added and upgraded. The number of supported vendors has increased to seven at the cloud service support level. Ecological technology, AI training and service plug-ins, Kruise Rollout plug-ins, Dex plug-ins, etc., have been added. At the same time, the Helm Chart plug-in and the OCM cluster management plug-in have also been updated for user experience.

More Addon usages#

Recent roadmap#

As KubeVela core becomes more and more stable, its scalability is unleashed gradually. The evolution of the 1.2/1.3 version of the community has been accelerated. In the future, we will iterate progressively new versions in a two-month cycle. In the next 1.4 release, we will add the following features:

  • Observability: Provide a complete observability solution around logs, metrics, and traces, provide out-of-the-box observability of the KubeVela system, allow custom observability configuration, and integrate existing observability components or cloud resources.
  • Offline installation: Provide relatively complete offline installation tools and solutions to facilitate more users to use KubeVela in an offline environment.
  • Multi-cluster permission management: Provides in-depth permission management capabilities for Kubernetes multi-cluster.
  • More out-of-the-box Addon capabilities.

The KubeVela community is looking forward to your joining to build an easy-to-use and standardized next-generation cloud-native application delivery and management platform!

Easily Manage your Application Shipment With Differentiated Configuration in Multi-Cluster

Wei Duan

Wei Duan

KubeVela Team

Under today's multi-cluster business scene, we often encounter these typical requirements: distribute to multiple specific clusters, specific group distributions according to business need, and differentiated configurations for multi-clusters.

KubeVela v1.3 iterates based on the previous multi-cluster function. This article will reveal how to use it to do swift multiple clustered deployment and management to address all your anxieties.

Before Starting#

  1. Prepare a Kubernetes cluster as the control plane of KubeVela.
  2. Make sure KubeVela v1.3 and KubeVela CLI v1.3.0 have been installed successfully.
  3. The list of Kubeconfig from sub clusters that you want to manage. We will take three clusters naming beijing-1, beijing-2 and us-west-1 as examples.
  4. Download and combine with Multi-Cluster-Demo to better understand how to use the KubeVela multi-cluster capabilities.

Distribute to Multiple Specified Clusters#

Distributing multiple specified clusters is the most basic multi-cluster management operation. In KubeVela, you will use a policy called topology to implement it. The cluster will be listed in the attribute clusters, an array.

First let's make sure switching kubeconfig to the control plane cluster, go with vela cluster join to include in the 3 clusters of Beijing-1, Beijing-2 and us-west-1:

➜ vela cluster join beijing-1.kubeconfig --name beijing-1
➜ vela cluster join beijing-2.kubeconfig --name beijing-2
➜ vela cluster join us-west-1.kubeconfig --name us-west-1
➜ vela cluster list
CLUSTER TYPE ENDPOINT ACCEPTED LABELS
beijing-1 X509Certificate https://47.95.22.71:6443 true
beijing-2 X509Certificate https://47.93.117.83:6443 true
us-west-1 X509Certificate https://47.88.31.118:6443 true

Then open multi-cluster-demo, look into Basic.yaml:

apiVersion: core.oam.dev/v1beta1
kind: Application
metadata:
name: example-app
namespace: default
spec:
components:
- name: hello-world-server
type: webservice
properties:
image: oamdev/hello-world
port: 8000
traits:
- type: scaler
properties:
replicas: 3
- type: gateway
properties:
domain: testsvc-mc.example.com
# classInSpec : true If the sub clusters has Kubernetes versions below v1.20 installed, please add this field
http:
"/": 8000
policies:
- type: topology
name: beijing-clusters
properties:
clusters: ["beijing-1","beijing-2"]

It can be seen that this app uses the component of type webservice and distributes 3 Deployments to beijing-1 and beijing-2 clusters through the topology policy.

Please note that the premise of successfully distributing resource into managed clusters is that it must contain the exactly same namespace as control plane did. Since each cluster has the default namespace by default, we won't be worry in this case. But suppose we change the namespace in basic.yaml to be multi-cluster, we will receive an error:

...
Status: runningWorkflow
Workflow:
mode: DAG
finished: false
Suspend: false
Terminated: false
Steps
- id:9fierfkhsc
name:deploy-beijing-clusters
type:deploy
phase:failed
message:step deploy: step deploy: run step(provider=oam,do=components-apply): Found 1 errors. [(failed to apply component beijing-1-multi-cluster-0: HandleComponentsRevision: failed to create componentrevision beijing-1/multi-cluster/hello-world-server-v1: namespaces "multi-cluster" not found)]
Services:
...

In future versions of KubeVela, we plan to support a comprehensive Authentication System, more convenient and more securely to: create namespaces in managed cluster through the hub cluster in quick moves.

After creating the sub cluster's namespace, come back to the control plane cluster to create the application and ship out resources:

➜ vela up -f basic.yaml
Applying an application in vela K8s object format...
"patching object" name="example-app" resource="core.oam.dev/v1beta1, Kind=Application"
✅ App has been deployed 🚀🚀🚀
Port forward: vela port-forward example-app
SSH: vela exec example-app
Logging: vela logs example-app
App status: vela status example-app
Service status: vela status example-app --svc hello-world-server

We use vela status <App Name> to view detailed infos about this app:

➜ vela status example-app
About:
Name: example-app
Namespace: default
Created at: 2022-03-25 17:42:33 +0800 CST
Status: running
Workflow:
mode: DAG
finished: true
Suspend: false
Terminated: false
Steps
- id:wftf9d4exj
name:deploy-beijing-clusters
type:deploy
phase:succeeded
message:
Services:
- Name: hello-world-server
Cluster: beijing-1 Namespace: default
Type: webservice
Healthy Ready:3/3
Traits:
✅ scaler ✅ gateway: Visiting URL: testsvc-mc.example.com, IP: 60.205.222.30
- Name: hello-world-server
Cluster: beijing-2 Namespace: default
Type: webservice
Healthy Ready:3/3
Traits:
✅ scaler ✅ gateway: Visiting URL: testsvc-mc.example.com, IP: 182.92.222.128

Both the beijing-1 and beijing-2 have issued the corresponding resources, they also displayed external access IP addresses, and you can therefore make it public for your users.

Use Cluster Labels to Do Grouping#

In addition to the above basic need, we often encounter additional situations: cross-regional deployment to certain clusters, specify which cloud provider's cluster, etc. In order to achieve a similar goal, the labels feature can be used.

Here, suppose the us-west-1 cluster comes from AWS, we must additionally apply to the AWS cluster. You can use the vela cluster labels add command to tag the cluster. Of course, if there is more of AWS related clusters such as us-west-2, it will be handled as well after they were labeled:

➜ ~ vela cluster labels add us-west-1 provider=AWS
Successfully update labels for cluster us-west-1 (type: X509Certificate).
provider=AWS
➜ ~ vela cluster list
CLUSTER TYPE ENDPOINT ACCEPTED LABELS
beijing-1 X509Certificate https://47.95.22.71:6443 true
beijing-2 X509Certificate https://47.93.117.83:6443 true
us-west-1 X509Certificate https://47.88.31.118:6443 true provider=AWS

Next we update the basic.yaml to add an application policy topology-aws:

...
policies:
- type: topology
name: beijing-clusters
properties:
clusters: ["beijing-1","beijing-2"]
- type: topology
name: topology-aws
properties:
clusterLabelSelector:
provider: AWS

In order save your time, please deploy intermediate.yaml directly:

➜ ~ vela up -f intermediate.yaml

Review the status of the application again:

➜ vela status example-app
...
- Name: hello-world-server
Cluster: us-west-1 Namespace: default
Type: webservice
Healthy Ready:3/3
Traits:
✅ scaler ✅ gateway: Visiting URL: testsvc-mc.example.com, IP: 192.168.40.10

Differentiated Configuration#

Apart from above scenarios, we tend to have more application strategic needs, such as high availability of hoping to distribute 5 replicas. In this case, use the override policy:

...
clusterLabelSelector:
provider: AWS
- type: override
name: override-high-availability
properties:
components:
- type: webservice
traits:
- type: scaler
properties:
replicas: 5

At the same time, we hope that only AWS clusters can get high availability. Then we can expect KubeVela's workflow give us a hand. We use the following workflow: it aims to deploy this app by, first distributing to Beijing's clusters through the deploy-beijing policy, then distributing 5 copies to clusters which were labeled as AWS:

...
properties:
replicas: 5
workflow:
steps:
- type: deploy
name: deploy-beijing
properties:
policies: ["beijing-clusters"]
- type: deploy
name: deploy-aws
properties:
policies: ["override-high-availability","topology-aws"]

Then we attach the above policy and workflow to intermediate.yaml and make it to advanced.yaml:

...
policies:
- type: topology
name: beijing-clusters
properties:
clusters: ["beijing-1","beijing-2"]
- type: topology
name: topology-aws
properties:
clusterLabelSelector:
provider: AWS
- type: override
name: override-high-availability
properties:
components:
- type: webservice
traits:
- type: scaler
properties:
replicas: 5
workflow:
steps:
- type: deploy
name: deploy-beijing
properties:
policies: ["beijing-clusters"]
- type: deploy
name: deploy-aws
properties:
policies: ["override-high-availability","topology-aws"]

Then deploy it, view the status of the application:

➜ vela up -f advanced.yaml
Applying an application in vela K8s object format...
"patching object" name="example-app" resource="core.oam.dev/v1beta1, Kind=Application"
✅ App has been deployed 🚀🚀🚀
Port forward: vela port-forward example-app
SSH: vela exec example-app
Logging: vela logs example-app
App status: vela status example-app
Service status: vela status example-app --svc hello-world-serverapplication.core.oam.dev/podinfo-app configured
➜ vela status example-app
...
- Name: hello-world-server
Cluster: us-west-1 Namespace: default
Type: webservice
Healthy Ready:5/5
Traits:
✅ scaler ✅ gateway: Visiting URL: testsvc-mc.example.com, IP: 192.168.40.10

The above all are what we'd like to share with you for this time, thank you for reading and trying them out.

We invite you to explore KubeVela v1.3 for more to meet further complex requirements on business, such as dig deep in differentiated configurations to use override application policy to either override all resources on one type or only certain specific components.

China Merchants Bank's Practice on Offline Installation with KubeVela

Xiangbo Ma

Xiangbo Ma

(Cloud platform development team)

The cloud platform development team of China Merchants Bank has been trying out KubeVela since 2021 internally and aims to using it for enhancing our primary application delivery and management capabilities. Due to the specific security concern for financial insurance industry, network control measurements are relatively strict, and our intranet cannot directly pull Docker Hub image, and there is no Helm image source available as well. Therefore, in order to landing KubeVela in the intranet, you must perform a complete offline installation.

This article will take the KubeVela V1.2.5 version as an example, introduce the offline installation practice to help other users easier to complete KubeVela's deployment in offline environment.

KubeVela Offline Installation Solution#

We divide the offline installation of KubeVela in three parts, which are Vela CLI, Vela Core, and Addon offline installation. Each part mainly involves the loading of the relevant Docker image and Helm's package, which can greatly speed up deployment process in offline environment.

Before doing so, please ensure that Kubernetes cluster version is >= v1.19 && < v1.22. One way of KubeVela as a control plane relies on Kubernetes, which can be placed in any product or in any cloud provider. At the same time, you can also use Kind or Minikube to deploy KubeVela locally.

Vela CLI Offline Installation#

  • First, you need to download of the binary version of vela that you want by checking KubeVela Release Log
  • Unzip binary files and configure the appropriate environment variables in $PATH
    • Unzip binary file
      • tar -zxvf vela-v1.2.5-linux-amd64.tar.gz
      • mv ./linux-amd64/vela /usr/local/bin/vela
    • Set environment variables
      • vi /etc/profile
      • export PATH="$PATH:/usr/local/bin"
      • source /etc/profile
    • Verify the installation of Vela CLI through vela version
CLI VERSION: V1.2.5
Core Version:
GitRevision: git-ef80b66
GOLANGVERSION: Go1.17.7
  • At this point, Vela CLI has been deployed offline!

Vela Core Offline Installation#

  • Before deploying Vela Core offline, first you need to install Helm in an offline environment and its version needs to meet v3.2.0+
  • Prepare Docker image. Vela Core's deployment mainly involves 5 images, you need to first visit the Docker Hub in extranet to download the corresponding images, then load them to offline environment
    • Pull the image from Docker Hub
      • docker pull oamdev/vela-core:v1.2.5
      • docker pull oamdev/cluster-gateway:v1.1.7
      • docker pull oamdev/kube-webhook-certgen:v2.3
      • docker pull oamdev/alpine-k8s:1.18.2
      • docker pull oamdev/hello-world:v1
    • Save image to local disks
      • docker save -o vela-core.tar oamdev/vela-core:v1.2.5
      • docker save -o cluster-gateway.tar oamdev/cluster-gateway:v1.1.7
      • docker save -o kube-webhook-certgen.tar oamdev/kube-webhook-certgen:v2.3
      • docker save -o alpine-k8s.tar oamdev/alpine-k8s:1.18.2
      • docker save -o hello-world.tar oamdev/hello-world:v1
    • Re-load the image in the offline environment
      • docker load vela-core.tar
      • docker load cluster-gateway.tar
      • docker load kube-webhook-certgen.tar
      • docker load alpine-k8s.tar
      • docker load hello-world.tar
  • Download KubeVela Core, copy it to offline environment and use Helm to repackage
    • Repackage the KubeVela source code and install the chart package to the control cluster offline
      • helm package kubevela/charts/vela-core --destination kubevela/charts
      • helm install --create-namespace -n vela-system kubevela kubevela/charts/vela-core-0.1.0.tgz --wait
    • Check the output
KubeVela Control Plane Has Been successfully set up on your cluster.
  • At this point, Vela Core has been deployed offline!

Addon Offline Installation#

  • First download Catalog Source and copy it to offline environment
  • Here, we will take VelaUX, one of many more addons, as an example. First prepare its Docker image, VelaUX mainly involve 2 images, you need to first access the extranet to download the corresponding image from Docker Hub, then load it to offline environment
    • Pull the image from Docker Hub
      • docker pull oamdev/vela-apiserver:v1.2.5
      • docker pull oamdev/velaux:v1.2.5
    • Save image to local disks
      • docker save -o vela-apiserver.tar oamdev/vela-apiserver:v1.2.5
      • docker save -o velaux.tar oamdev/velaux:v1.2.5
    • Re-load the image in the offline environment
      • docker load vela-apiserver.tar
      • docker load velaux.tar
  • Install VelaUX
    • Install VelaUX via Vela CLI
      • vela addon enable catalog-master/addons/velaux
    • Check the output
Addon: velaux enabled Successfully.
  • If there is a cluster installed route Controller or Nginx Ingress Controller and also linked with an available domain, you can deploy external routing to make VelaUX accessible. Here present Openshift Route as an example, you can also choose Ingress if you wish
apiVersion: route.openshift.io/v1
kind: Route
metadata:
name: velaux-route
namespace: vela-system
spec:
host: velaux.xxx.xxx.cn
port:
targetPort: 80
to:
kind: Service
name: velaux
weight: 100
wildcardPolicy: None
  • Check the installation
curl -I -m 10 -o /dev/null -s -w %{http_code} http://velaux.xxx.xxx.cn/applications
  • At this point, VelaUX has been deployed offline! At the same time, for other types of Addon's offline deployment, access to the corresponding directory of the Catalog Source and repeat the above moves, you would complete all the addons' offline deployments for good.

Summarize#

During offline deployment, we also try to save Vela Core and Addon's resource that generated to be YAML files after deploying in extranet and re-deploy them in an offline environment, but because of all different kinds of resource involved in and it requires many other authorization issues to resolve, this way is more than cumbersome.

With this practice of KubeVela's offline deployment, we hope it help you build a complete set of KubeVela in offline environment much faster. Offline installation is pretty much a pain point for most developers, we also see that the KubeVela community is introducing the brand new velad, a fully offline, highly accountable installation tool. Velad can help automate completion by making many steps as one, such as preparing clusters, downloading and packing image, installing and etc. Further more, it do support many features: In Linux machine (such as Alibaba Cloud ECS) we can locally spin up a cluster to install Vela-Core; while starting a KubeVela control plane, do not have to worry about its data to be lost when machine behind it accidentally was shutdown; Velad can stores all the data from control plane cluster into a traditional database (such as MySQL deployed on another ECS).

In the recent version to come, China Merchants Bank will increase the efforts in the open source community of KubeVela, actively building: enterprise-level capacity, enhancement on multi-cluster, offline deployment and application-level observability. We'll also be contributing the financial industry's user scenarios and business needs, driving cloud-native ecology achieve more easily and efficient application management experience, and at last but not at least, welcome you the community member to join us together in this journey.

Machine Learning Practice with KubeVela

Tianxin Dong

Tianxin Dong

KubeVela team

At the background of Machine learning goes viral, AI engineers not only need to train and debug their models, but also need to deploy them online to verify how it looks(of course sometimes, this part of the work is done by AI platform engineers. ). It is very tedious and draining AI engineers.

In the cloud-native era, our model training and model serving are also usually performed on the cloud. Doing so not only improves scalability, but also improves resource utility. This is very effective for machine learning scenarios that consume a lot of computing resources.

But it is often difficult for AI engineers to use cloud-native techniques. The concept of cloud native has become more complex over time. Even to deploy a simple model serving on cloud native architecture, AI engineers may need to learn several additional concepts: Deployment, Service, Ingress, etc.

As a simple, easy-to-use, and highly scalable cloud-native application management tool, KubeVela enables developers to quickly and easily define and deliver applications on Kubernetes without knowing any details about the underlying cloud-native infrastructure. KubeVela's rich extensibility extends to AI addons and provide functions such as model training, model serving, and A/B testing, covering the basic needs of AI engineers and helping AI engineers quickly conduct model training and model serving in a cloud-native environment.

This article mainly focus on how to use KubeVela's AI addon to help engineers complete model training and model serving more easily.

KubeVela AI Addon#

The KubeVela AI addon is divided into two: model training and model serving. The model training addon is based on KubeFlow's training-operator and can support distributed model training in different frameworks such as TensorFlow, PyTorch, and MXNet. The model serving addon is based on Seldon Core, which can easily use the model to start the model serving, and also supports advanced functions such as traffic distribution and A/B testing.

alt

Through the KubeVela AI addon, the deployment of model training and serving tasks can be significantly simplified. At the same time, the process of model training and serving can be combined with KubeVela's own workflow, multi-cluster and other functions to complete production-level services.

Note: You can find all source code and YAML files in KubeVela Samples. If you want to use the model pretrained in this example, style-model.yaml and color-model.yaml in the folder will do that and copy the model into the PVC.

Model Training#

First enable the two addons for model training and model serving.

vela addon enable model-training
vela addon enable model-serving

Model training includes two component types, model-training and jupyter-notebook, and model serving includes the model-serving component type. The specific parameters of these three components can be viewed through the vela show command.

You can also read KubeVela AI Addon Documentation for more information.

vela show model-training
vela show jupyter-notebook
vela show model-serving

Let's train a simple model using the TensorFlow framework that turns gray images into colored ones. Deploy the following YAML file:

Note: The source code for model training comes from: emilwallner/Coloring-greyscale-images

apiVersion: core.oam.dev/v1beta1
kind: Application
metadata:
name: training-serving
namespace: default
spec:
components:
# Train the model
- name: demo-training
type: model-training
properties:
# Mirror of the trained model
image: fogdong/train-color:v1
# A framework for model training
framework: tensorflow
# Declare storage to persist models. Here, the default storage class in the cluster will be used to create the PVC
storage:
- name: "my-pvc"
mountPath: "/model"

At this point, KubeVela will pull up a TFJob for model training.

It's hard to see what's going on just by training the model. Let's modify this YAML file and put the model serving after the model training step. At the same time, because the model serving will directly start the model, and the input and output of the model are not intuitive (ndarray or Tensor), therefore, we deploy a test service to call the service and convert the result into an image.

Deploy the following YAML file:

apiVersion: core.oam.dev/v1beta1
kind: Application
metadata:
name: training-serving
namespace: default
spec:
components:
# Train the model
- name: demo-training
type: model-training
properties:
image: fogdong/train-color:v1
framework: tensorflow
storage:
- name: "my-pvc"
mountPath: "/model"
# Start the model serving
- name: demo-serving
type: model-serving
# The model serving will start after model training is complete
dependsOn:
- demo-training
properties:
# The protocol used to start the model serving can be left blank. By default, seldon's own protocol is used.
protocol: tensorflow
predictors:
- name: model
# The number of replicas for the model serving
replicas: 1
graph:
# model name
name: my-model
# model frame
implementation: tensorflow
# Model address, the previous step will save the trained model to the pvc of my-pvc, so specify the address of the model through pvc://my-pvc
modelUri: pvc://my-pvc
# test model serving
- name: demo-rest-serving
type: webservice
# The test service will start after model training is complete
dependsOn:
- demo-serving
properties:
image: fogdong/color-serving:v1
# Use LoadBalancer to expose external addresses for easy to access
exposeType: LoadBalancer
env:
- name: URL
# The address of the model serving
value: http://ambassador.vela-system.svc.cluster.local/seldon/default/demo-serving/v1/models/my-model:predict
ports:
# Test service port
- port: 3333
expose: true

After deployment, check the status of the application with vela ls:

$ vela ls
training-serving demo-training model-training running healthy Job Succeeded 2022-03-02 17:26:40 +0800 CST
├─ demo-serving model-serving running healthy Available 2022-03-02 17:26:40 +0800 CST
└─ demo-rest-serving webservice running healthy Ready:1/1 2022-03-02 17:26:40 +0800 CST

As you can see, the application has started normally. Use vela status <app-name> --endpoint to view the service address of the application.

$ vela status training-serving --endpoint
+---------+-----------------------------------+---------------------------------------------------+
| CLUSTER | REF(KIND/NAMESPACE/NAME) | ENDPOINT |
+---------+-----------------------------------+---------------------------------------------------+
| | Service/default/demo-rest-serving | tcp://47.251.10.177:3333 |
| | Service/vela-system/ambassador | http://47.251.36.228/seldon/default/demo-serving |
| | Service/vela-system/ambassador | https://47.251.36.228/seldon/default/demo-serving |
+---------+-----------------------------------+---------------------------------------------------+

The application has three service addresses, the first is the address of our test service, the second and third are the addresses of the native model.

We can call the test service to see the effect of the model: the test service will read the content of the image, convert it into a Tensor and request the model serving, and finally convert the Tensor returned by the model serving into an image to return.

We choose a black and white female image as input:

alt

After the request, you can see that a color image is output:

alt

Model Servings: Canary Testing#

In addition to starting the model serving directly, we can also use multiple versions of the model in one model serving and assign different traffic to them for canary testing.

Deploy the following YAML, you can see that both the v1 version of the model and the v2 version of the model are set to 50% traffic. Again, we deploy a test service behind the model serving:

apiVersion: core.oam.dev/v1beta1
kind: Application
metadata:
name: color-serving
namespace: default
spec:
components:
- name: color-model-serving
type: model-serving
properties:
protocol: tensorflow
predictors:
- name: model1
replicas: 1
# v1 version model traffic is 50
traffic: 50
graph:
name: my-model
implementation: tensorflow
# Model address, our v1 version model is stored under the /model/v1 path in the pvc of color-model, so specify the address of the model through pvc://color-model/model/v1
modelUri: pvc://color-model/model/v1
- name: model2
replicas: 1
# v2 version model traffic is 50
traffic: 50
graph:
name: my-model
implementation: tensorflow
# Model address, our v2 version model is stored under the /model/v2 path in the pvc of color-model, so specify the address of the model through pvc://color-model/model/v2
modelUri: pvc://color-model/model/v2
- name: color-rest-serving
type: webservice
dependsOn:
- color-model-serving
properties:
image: fogdong/color-serving:v1
exposeType: LoadBalancer
env:
- name: URL
value: http://ambassador.vela-system.svc.cluster.local/seldon/default/color-model-serving/v1/models/my-model:predict
ports:
- port: 3333
expose: true

When the model deployment is complete, use vela status <app-name> --endpoint to view the address of the model serving:

$ vela status color-serving --endpoint
+---------+------------------------------------+----------------------------------------------------------+
| CLUSTER | REF(KIND/NAMESPACE/NAME) | ENDPOINT |
+---------+------------------------------------+----------------------------------------------------------+
| | Service/vela-system/ambassador | http://47.251.36.228/seldon/default/color-model-serving |
| | Service/vela-system/ambassador | https://47.251.36.228/seldon/default/color-model-serving |
| | Service/default/color-rest-serving | tcp://47.89.194.94:3333 |
+---------+------------------------------------+----------------------------------------------------------+

Request the model with a black and white city image:

alt

As you can see, the result of the first request is as follows. While the sky and ground are rendered in color, the city itself is black and white:

alt

Request again, you can see that in the result of this request, the sky, ground and city are rendered in color:

alt

By distributing traffic to different versions of the model, it can help us better judge the model results.

Model Serving: A/B Testing#

For a black and white image, we can turn it into color through the model. Or in another way, we can transfer the style of the original image by uploading another style image.

Do our users love colorful pictures more or pictures of different styles more? We can explore this question by conducting A/B testing.

Deploy the following YAML, by setting customRouting, forward the request with style: transfer in the Header to the model of style transfer. At the same time, make this style transfer model share the same address as the colorized model.

Note: The model for style transfer comes from TensorFlow Hub

apiVersion: core.oam.dev/v1beta1
kind: Application
metadata:
name: color-style-ab-serving
namespace: default
spec:
components:
- name: color-ab-serving
type: model-serving
properties:
protocol: tensorflow
predictors:
- name: model1
replicas: 1
graph:
name: my-model
implementation: tensorflow
modelUri: pvc://color-model/model/v2
- name: style-ab-serving
type: model-serving
properties:
protocol: tensorflow
# The model of style migration takes a long time, set the timeout time so that the request will not be timed out
timeout: "10000"
customRouting:
# Specify custom Header
header: "style: transfer"
# Specify custom routes
serviceName: "color-ab-serving"
predictors:
- name: model2
replicas: 1
graph:
name: my-model
implementation: tensorflow
modelUri: pvc://style-model/model
- name: ab-rest-serving
type: webservice
dependsOn:
- color-ab-serving
- style-ab-serving
properties:
image: fogdong/style-serving:v1
exposeType: LoadBalancer
env:
- name: URL
value: http://ambassador.vela-system.svc.cluster.local/seldon/default/color-ab-serving/v1/models/my-model:predict
ports:
- port: 3333
expose: true

After successful deployment, view the address of the model serving through vela status <app-name> --endpoint:

$ vela status color-style-ab-serving --endpoint
+---------+---------------------------------+-------------------------------------------------------+
| CLUSTER | REF(KIND/NAMESPACE/NAME) | ENDPOINT |
+---------+---------------------------------+-------------------------------------------------------+
| | Service/vela-system/ambassador | http://47.251.36.228/seldon/default/color-ab-serving |
| | Service/vela-system/ambassador | https://47.251.36.228/seldon/default/color-ab-serving |
| | Service/vela-system/ambassador | http://47.251.36.228/seldon/default/style-ab-serving |
| | Service/vela-system/ambassador | https://47.251.36.228/seldon/default/style-ab-serving |
| | Service/default/ab-rest-serving | tcp://47.251.5.97:3333 |
+---------+---------------------------------+-------------------------------------------------------+

In this application, the two services have two addresses each, but the model service address of the second style-ab-serving is invalid because the model service is already pointed to the address of color-ab-serving . Again, we see how it works by requesting the test service.

First, without the header, the image changes from black and white to color:

alt

Let's add an image of an ocean wave as a style render:

alt

We add the Header of style: transfer to this request, and you can see that the city has become a wave style:

alt

We can also use an ink painting image as a style rendering:

alt

It can be seen that this time the city has become an ink painting style:

alt

Summary#

Through KubeVela's AI plug-in, it can help you to perform model training and model serving more conveniently.

In addition, together with KubeVela, we can also deliver the tested model to different environments through KubeVela's multi-environment function, so as to realize the flexible deployment of the model.

Using GitOps + KubeVela for Application Continuous Delivery

Tianxin Dong

Tianxin Dong

KubeVela Team

KubeVela is a simple, easy-to-use, and highly extensible cloud-native application platform. It can make developers deliver microservices applications easily, without knowing Kubernetes details.

KubeVela is based on OAM model, which naturally solves the orchestration problems of complex resources. It means that KubeVela can manage complex large-scale applications with GitOps. Convergence of team and system size after the system complexity problem.

What is GitOps#

GitOps is a modern way to do continuous delivery. Its core idea is to have a Git repository which contains environmental and application configurations. An automated process is also needed for sync the config to cluster.

By changing the files in repository, developers can apply the applications automatically. The benefits of applying GitOps include:

  • Increased productivity. Continuous delivery can speed up the time of deployment.
  • Lower the barrier for developer to deploy. By pushing code instead of container configuration, developers can easily deploy Kubernetes without knowing its internal implementation.
  • Trace the change records. Managing the cluster with Git makes every change traceable, enhancing the audit trail.
  • Recover the cluster with Git's rollback and branch.

GitOps with KubeVela#

KubeVela as an declarative application delivery control plane can be naturally used in GitOps approach, and this will provide below extra bonus to end users alongside with GitOps benefits:

  • application delivery workflow (CD pipeline)
    • i.e. KubeVela supports pipeline style application delivery process in GitOps, instead of simply declaring final status;
  • handling deployment dependencies and designing typologies (DAG);
  • unified higher level abstraction atop various GitOps tools' primitives;
  • declare, provision and consume cloud resources in unified application definition;
  • various out-of-box deployment strategies (Canary, Blue-Green ...);
  • various out-of-box hybrid/multi-cloud deployment policies (placement rule, cluster selectors etc.);
  • Kustomize-style patch for multi-env deployment without the need to learn Kustomize at all;
  • ... and much more.

In this section, we will introduce steps of using KubeVela directly in GitOps approach.

GitOps workflow#

The GitOps workflow is divided into CI and CD:

  • CI(Continuous Integration): Continuous integration builds code and images, and pushes images to the registry. There are many CI tools like GitHub Action, Travis, Jenkins and so on. In this article, we use GitHub Action for CI. You can also use other CI tools. KubeVela can connect CI processes under any tool around GitOps.
  • CD(Continuous Delivery): Continuous delivery automatically updates the configuration in the cluster. For example, update the latest images in the registry to the cluster.
    • Currently there are two main CD modes:
      • Push-based: Push mode CD is mainly accomplished by configuring CI pipeline. In this way, the access key of the cluster is shared with CI so that the CI pipeline can push changes to the cluster. For this mode, please refer to our previous blog post: Using Jenkins + KubeVela for Application Continuous Delivery.
      • Pull-based: Pull mode CD listens for changes to the repository (code repository or configuration repository) in the cluster and synchronizes those changes to the cluster. In this way, the cluster actively pulls the update, thus avoiding the problem of exposing the secret key. This article will introduce using KubeVela and GitOps in pull mode.

This article will separate into two perspectives:

  1. For platform administrators/SREs, they can update the config in Git repo. It will trigger automated re-deployment.

  2. For developers, they can update the app source code and then push it to Git. It will trigger building latest image and re-deployment.

For platform administrators/SREs#

Platform administrators/SREs prepares the Git repo for operational config. Every config change will be traceable by that. KubeVela will watch the repo and apply changes to the clusters.

alt

Setup Config Repository#

The configuration files are from the Example Repo.

In this example, we will deploy an application and a database, the application uses the database to store data.

The structure of the config repository looks below:

  • The clusters/ contains the GitOps config. It will command KubeVela to watch the specified repo and apply latest changes.
  • The apps/ contains the Application yaml for deploying the user-facing app.
  • The infrastructure/ contains infrastructure tools, i.e. MySQL database.
├── apps
│   └── my-app.yaml
├── clusters
│   ├── apps.yaml
│   └── infra.yaml
└── infrastructure
└── mysql.yaml

KubeVela recommends using the directory structure above to manage your GitOps repository. clusters/ holds the associated KubeVela GitOps configuration that need to be applied to cluster manually, apps/ holds your application and infrastructure/ holds your base configuration. By separating applications from basic configurations, you can manage your deployment environment more reasonably and isolate application changes.

Directory clusters/#

The clusters/ is the initialize configuration directory for KubeVela GitOps.

Below is how the clusters/infra.yaml looks like:

apiVersion: core.oam.dev/v1beta1
kind: Application
metadata:
name: infra
spec:
components:
- name: database-config
type: kustomize
properties:
repoType: git
# replace it with your repo url
url: https://github.com/FogDong/KubeVela-GitOps-Infra-Demo
# replace it with your git secret if it's a private repo
# secretRef: git-secret
# the pull interval time, set to 10m since the infrastructure is steady
pullInterval: 10m
git:
# the branch name
branch: main
# the path to sync
path: ./infrastructure

apps.yaml and infra.yaml in clusters/ are similar. Their difference is to watch different directories. In apps.yaml, the properties.path will be ./apps.

Apply the files in clusters/ manually. They will sync the files in infrastructure/ and apps/ dir of the Git repo.

Directory apps/#

The file in apps/ is a simple application with database information and Ingress. The app serves HTTP service and connects to a MySQL database. In the '/' path, it will display the version in the code; in the /db path, it will list the data in database.

apiVersion: core.oam.dev/v1beta1
kind: Application
metadata:
name: my-app
namespace: default
spec:
components:
- name: my-server
type: webservice
properties:
image: <your image address> # {"$imagepolicy": "default:apps"}
port: 8088
env:
- name: DB_HOST
value: mysql-cluster-mysql.default.svc.cluster.local:3306
- name: DB_PASSWORD
valueFrom:
secretKeyRef:
name: mysql-secret
key: ROOT_PASSWORD
traits:
- type: ingress
properties:
domain: testsvc.example.com
http:
/: 8088

This is an Application binds with Traits Ingress. In this way, the underlying Deployment, Service, and Ingress can be brought together in a single file, making it easier to manage the application.

Directory infrastructure/#

The infrastructure/ contains the config of some infrastructures like database. In the following, we will use MySQL operator to deploy a MySQL cluster.

Notice that there must be a secret in your cluster with MySQL password specified in key ROOT_PASSWORD.

apiVersion: core.oam.dev/v1beta1
kind: Application
metadata:
name: mysql
namespace: default
spec:
components:
- name: mysql-controller
type: helm
properties:
repoType: helm
url: https://presslabs.github.io/charts
chart: mysql-operator
version: "0.4.0"
- name: mysql-cluster
type: raw
dependsOn:
- mysql-controller
properties:
apiVersion: mysql.presslabs.org/v1alpha1
kind: MysqlCluster
metadata:
name: mysql-cluster
spec:
replicas: 1
# replace it with your secret
secretName: mysql-secret

We use workflow in this Application. The first step is to deploy the MySQL controller, after the controller is running, the second step will deploy the MySQL cluster.

Apply the files in clusters/#

After storing bellow files in the Git config repo, we need to apply the GitOps config files in clusters/ manually.

First, apply the clusters/infra.yaml to cluster, we can see that the MySQL in infrastructure/ is automatically deployed:

kubectl apply -f clusters/infra.yaml
$ vela ls
APP COMPONENT TYPE TRAITS PHASE HEALTHY STATUS CREATED-TIME
infra database-config kustomize running healthy 2021-09-26 20:48:09 +0800 CST
mysql mysql-controller helm running healthy 2021-09-26 20:48:11 +0800 CST
└─ mysql-cluster raw running healthy 2021-09-26 20:48:11 +0800 CST

Apply the clusters/apps.yaml to cluster, we can see that the application in apps/ is automatically deployed:

kubectl apply -f clusters/apps.yaml
APP COMPONENT TYPE TRAITS PHASE HEALTHY STATUS CREATED-TIME
apps apps kustomize running healthy 2021-09-27 16:55:53 +0800 CST
infra database-config kustomize running healthy 2021-09-26 20:48:09 +0800 CST
my-app my-server webservice ingress running healthy 2021-09-27 16:55:55 +0800 CST
mysql mysql-controller helm running healthy 2021-09-26 20:48:11 +0800 CST
└─ mysql-cluster raw running healthy 2021-09-26 20:48:11 +0800 CST

By deploying the KubeVela GitOps config files, we now automatically apply the application and database in cluster.

curl the Ingress of the app, we can see that the current version is 0.1.5 and the application is connected to the database successfully:

$ kubectl get ingress
NAME CLASS HOSTS ADDRESS PORTS AGE
my-server <none> testsvc.example.com <ingress-ip> 80 162m
$ curl -H "Host:testsvc.example.com" http://<ingress-ip>
Version: 0.1.5
$ curl -H "Host:testsvc.example.com" http://<ingress-ip>/db
User: KubeVela
Description: It's a test user

Modify the config for GitOps trigger#

After the first deployment, we can modify the files in config repo to update the applications in the cluster.

Modify the domain of the application's Ingress:

...
traits:
- type: ingress
properties:
domain: kubevela.example.com
http:
/: 8089

Check the Ingress in cluster after a while:

NAME CLASS HOSTS ADDRESS PORTS AGE
my-server <none> kubevela.example.com <ingress-ip> 80 162m

The host of the Ingress has been updated successfully!

In this way, we can edit the files in the Git repo to update the cluster.

For developers#

Developers writes the application source code and push it to a Git repo (aka app repo). Once app repo updates, the CI will build the image and push it to the image registry. KubeVela watches the image registry, and updates the image in config repo. Finally, it will apply the config to the cluster.

User can update the configuration in the cluster automatically when the code is updated.

alt

Setup App Code Repository#

Setup a Git repository with source code and Dockerfile.

The app serves HTTP service and connects to a MySQL database. In the '/' path, it will display the version in the code; in the /db path, it will list the data in database.

http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
_, _ = fmt.Fprintf(w, "Version: %s\n", VERSION)
})
http.HandleFunc("/db", func(w http.ResponseWriter, r *http.Request) {
rows, err := db.Query("select * from userinfo;")
if err != nil {
_, _ = fmt.Fprintf(w, "Error: %v\n", err)
}
for rows.Next() {
var username string
var desc string
err = rows.Scan(&username, &desc)
if err != nil {
_, _ = fmt.Fprintf(w, "Scan Error: %v\n", err)
}
_, _ = fmt.Fprintf(w, "User: %s \nDescription: %s\n\n", username, desc)
}
})
if err := http.ListenAndServe(":8088", nil); err != nil {
panic(err.Error())
}

In this tutorial, we will setup a CI pipeline using GitHub Actions to build the image and push it to a registry. The code and configuration files are from the Example Repo.

Create Git Secret for KubeVela committing to Config Repo#

After the new image is pushed to the image registry, KubeVela will be notified and update the Application file in the Git repository and cluster. Therefore, we need a secret with Git information for KubeVela to commit to the Git repository. Fill the following yaml files with your password and apply it to the cluster:

apiVersion: v1
kind: Secret
metadata:
name: my-secret
type: kubernetes.io/basic-auth
stringData:
username: <your username>
password: <your password>

Setup Config Repository#

The configuration repository is almost the same as the previous configuration, you only need to add the image registry config to the file. For more details, please refer to Example Repository.

Add the config of image registry in clusters/apps.yaml, it listens for image updates in the image registry:

...
imageRepository:
image: <your image>
# if it's a private image registry, use `kubectl create secret docker-registry` to create the secret
# secretRef: imagesecret
filterTags:
# filter the image tag
pattern: '^master-[a-f0-9]+-(?P<ts>[0-9]+)'
extract: '$ts'
# use the policy to sort the latest image tag and update
policy:
numerical:
order: asc
# add more commit message
commitMessage: "Image: {{range .Updated.Images}}{{println .}}{{end}}"

Modify the image field in apps/my-app.yaml and add annotation # {"$imagepolicy": "default:apps"}. Notice that KubeVela will only be able to modify the image field if the annotation is added after the field. default:apps is namespace:name of the GitOps config file above.

spec:
components:
- name: my-server
type: webservice
properties:
image: ghcr.io/fogdong/test-fog:master-cba5605f-1632714412 # {"$imagepolicy": "default:apps"}

After update the files in clusters/ to cluster, we can then update the application by modifying the code.

Modify the code#

Change the Version to 0.1.6 and modify the data in database:

const VERSION = "0.1.6"
...
func InsertInitData(db *sql.DB) {
stmt, err := db.Prepare(insertInitData)
if err != nil {
panic(err)
}
defer stmt.Close()
_, err = stmt.Exec("KubeVela2", "It's another test user")
if err != nil {
panic(err)
}
}

Commit the change to the Git Repository, we can see that our CI pipelines has built the image and push it to the image registry.

KubeVela will listen to the image registry and update the apps/my-app.yaml in Git Repository with the latest image tag.

We can see that there is a commit form kubevelabot, the commit message is always with a prefix Update image automatically. You can use format like {{range .Updated.Images}}{{println .}}{{end}} to specify the image name in the commitMessage field.

alt

Note that if you want to put the code and config in the same repository, you need to filter out the commit from KubeVela in CI configuration like below to avoid the repeat build of pipeline.

jobs:
publish:
if: "!contains(github.event.head_commit.message, 'Update image automatically')"

Re-check the Application in cluster, we can see that the image of the my-app has been updated after a while.

KubeVela polls the latest information from the code and image repo periodically (at an interval that can be customized):

  • When the Application file in the Git repository is updated, KubeVela will update the Application in the cluster based on the latest configuration.
  • When a new tag is added to the image registry, KubeVela will filter out the latest tag based on your policy and update it to Git repository. When the files in the repository are updated, KubeVela repeats the first step and updates the files in the cluster, thus achieving automatic deployment.

We can curl to Ingress to see the current version and data:

$ kubectl get ingress
NAME CLASS HOSTS ADDRESS PORTS AGE
my-server <none> kubevela.example.com <ingress-ip> 80 162m
$ curl -H "Host:kubevela.example.com" http://<ingress-ip>
Version: 0.1.6
$ curl -H "Host:kubevela.example.com" http://<ingress-ip>/db
User: KubeVela
Description: It's a test user
User: KubeVela2
Description: It's another test user

The Version has been updated successfully! Now we're done with everything from changing the code to automatically applying to the cluster.

Summary#

For platform admins/SREs, they update the config repo to operate the application and infrastructure. KubeVela will synchronize the config to the cluster, simplifying the deployment process.

For end users/developers, they write the source code, push it to Git, and then re-deployment will happen. It will make CI to build the image. KubeVela will then update the image field and apply the deployment config.

By integrating with GitOps, KubeVela helps users speed up deployment and simplify continuous deployment.

Using Jenkins + KubeVela for Application Continuous Delivery

Da Yin, Yang Song

Da Yin, Yang Song

KubeVela Team

KubeVela bridges the gap between applications and infrastructures, enabling easy delivery and management of development codes. Compared to Kubernetes objects, the Application in KubeVela better abstracts and simplifies the configurations which developers care about, and leave complex infrastruature capabilities and orchestration details to platform engineers. The KubeVela apiserver further exposes HTTP interfaces, which help developers to deploy applications even without Kubernetes cluster access.

This article will use Jenkins, a popular continuous integration tool, as basis and give a brief introduction to how to build GitOps-based application continuous delivery highway.

Continuous Delivery Highway#

As application developer, you might care more about whether your application is functioning correctly and if development is convenient. There will be several system components on this highway to help you achieve that.

  1. First, you need a git repo to place program codes, test codes and a YAML file to declare your KubeVela application.
  2. Second, you also need a continuous integration tool to help you automate the integration test of codes, build container images and push images to image repo.
  3. Finally, you need to have a Kubernetes cluster and install KubeVela in it, with its apiserver function enabled.

Currently, the access management for KubeVela apiserver is under construction. You will need to configure apiserver access in later version of KubeVela (after v1.1).

In this article, we adopt GitHub as the git repo, Jenkins as the CI tool, DockerHub as the image repo. We use a simple HTTP Server written in Golang as example. The whole process of continuous delivery is shown as below. We can see that on this highway of continuous delivery, developers only need to care about application development and managing code version with Git. The highway will help developer run integration test and deploy applications into target Kubernetes cluster automatically.

arch

Set-up Environment#

Jenkins#

This article takes Jenkins as the CI tool. Developers can choose other CI tools like Travis or GitHub Action.

First you need to set up Jenkins to deploy CI pipelines. The installation and initialization of Jenkins could refer to the official docs.

Notice that since the CI pipeline in this example is based on Docker and GitHub, you need to install related plugins in Jenkins (Dashboard > Manage Jenkins > Manage Plugins), including Pipeline、HTTP Request Plugin、Docker Pipeline、Docker Plugin.

Besides, you also need to configure Docker environment for Jenkins to use (Dashboard > Manage Jenkins > Configure System > Docker Builder). If Docker has already been installed, you can set Docker URL as unix:///var/run/docker.sock.

Since the docker image will be pushed to image repo during the running of CI pipelines, you also need to store image repo accounts in Jenkins Credintial (Dashboard > Manage Jenkins > Manage Credentials > Add Credentials), such as DockerHub username and password.

jenkins-credential

GitHub#

This example uses GitHub as git repo. Developer can change it to other repos on demand, such as Gitlab.

To enable Jenkins to retrieve GitHub updates and write pipeline status back to GitHub, you need to execute the following two steps in GitHub.

  1. Configure Personal Access Token. Notice to check repo:status to get the permission for writing commit status.

github-pat

Then fill Personal Access Token from GitHub in Jenkins Credential (with Secret Text type).

jenkins-secret-text

Finally, go to Dashboard > Manage Jenkins > Configure System > GitHub in Jenkins and click Add GitHub Server to fill the newly created credential in. You can click Test connection to check if the configuration is correct.

jenkins-github

  1. Add Webhook to GitHub code repo settings. Fill Jenkins Webhook address into it. For example, http://my-jenkins.example.com/github-webhook/ . In this way, all Push events in this code repo will be pushed to Jenkins.

github-webhook

KubeVela#

You need to install KubeVela in your Kubernetes cluster and enable the apiserver function. Refer to official doc for details.

Composing Applications#

We use a simple HTTP Server as example. Here, we declare a constant named VERSION and print it when accessing the HTTP service. A simple test is also set up, which can be used to validate the format of VERSION.

// main.go
package main
import (
"fmt"
"net/http"
)
const VERSION = "0.1.0-v1alpha1"
func main() {
http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
_, _ = fmt.Fprintf(w, "Version: %s\n", VERSION)
})
if err := http.ListenAndServe(":8088", nil); err != nil {
println(err.Error())
}
}
// main_test.go
package main
import (
"regexp"
"testing"
)
const verRegex string = `^v?([0-9]+)(\.[0-9]+)?(\.[0-9]+)?` +
`(-([0-9A-Za-z\-]+(\.[0-9A-Za-z\-]+)*))?` +
`(\+([0-9A-Za-z\-]+(\.[0-9A-Za-z\-]+)*))?$`
func TestVersion(t *testing.T) {
if ok, _ := regexp.MatchString(verRegex, VERSION); !ok {
t.Fatalf("invalid version: %s", VERSION)
}
}

To build container image for the HTTP server and publishing it as KubeVela Application into Kubernetes, we also need another two files Dockerfile and app.yaml in the code repo. They are used to describe how container image is built and configure the KubeVela Application respectively.

# Dockerfile
FROM golang:1.13-rc-alpine3.10 as builder
WORKDIR /app
COPY main.go .
RUN go build -o kubevela-demo-cicd-app main.go
FROM alpine:3.10
WORKDIR /app
COPY --from=builder /app/kubevela-demo-cicd-app /app/kubevela-demo-cicd-app
ENTRYPOINT ./kubevela-demo-cicd-app
EXPOSE 8088

In app.yaml, we declare the application should contain 5 replica and expose the service through Ingress. The labels trait is used to tag Application Pods with current git commit id. Then the delivery pipeline in Jenkins will inject GIT_COMMIT into it and submit the Application configuration to KubeVela apiserver. Then the updates for Application will be triggered. The application will update 2 replica first, then hang and wait for manual approve. After developer confirms the change is valid, the rest 3 replica will be updated. This canary release is configured by the rollout trait declared in the Application.

# app.yaml
apiVersion: core.oam.dev/v1beta1
kind: Application
metadata:
name: kubevela-demo-app
spec:
components:
- name: kubevela-demo-app-web
type: webservice
properties:
image: somefive/kubevela-demo-cicd-app
imagePullPolicy: Always
port: 8080
traits:
- type: rollout
properties:
rolloutBatches:
- replicas: 2
- replicas: 3
batchPartition: 0
targetSize: 5
- type: labels
properties:
jenkins-build-commit: GIT_COMMIT
- type: ingress
properties:
domain: kubevela-demo-cicd-app.cf7c0ed25b151437ebe1ef58efc29bca4.us-west-1.alicontainer.com
http:
"/": 8088

Configure CI pipelines#

In this article, we set up two pipelines in Jenkins. One is the test pipeline, which is for running tests for application codes. The other one is the delivery pipeline, which builds container images and uploads them to image repo. Then the application configuration will be updated.

Test Pipeline#

Create a new pipeline in Jenkins. Set Build Triggers as GitHub hook trigger for GITScm polling.

test-pipeline-create test-pipeline-config

This pipeline uses golang image as execution environment at first. Next, it checkouts the dev branch of the target GitHub repo, indicating that this pipeline will be triggered by push events to dev branch. The piepline status will be written back to GitHub after execution finished.

void setBuildStatus(String message, String state) {
step([
$class: "GitHubCommitStatusSetter",
reposSource: [$class: "ManuallyEnteredRepositorySource", url: "https://github.com/Somefive/KubeVela-demo-CICD-app"],
contextSource: [$class: "ManuallyEnteredCommitContextSource", context: "ci/jenkins/test-status"],
errorHandlers: [[$class: "ChangingBuildStatusErrorHandler", result: "UNSTABLE"]],
statusResultSource: [ $class: "ConditionalStatusResultSource", results: [[$class: "AnyBuildResult", message: message, state: state]] ]
]);
}
pipeline {
agent {
docker { image 'golang:1.13-rc-alpine3.10' }
}
stages {
stage('Prepare') {
steps {
script {
def checkout = git branch: 'dev', url: 'https://github.com/Somefive/KubeVela-demo-CICD-app.git'
env.GIT_COMMIT = checkout.GIT_COMMIT
env.GIT_BRANCH = checkout.GIT_BRANCH
echo "env.GIT_BRANCH=${env.GIT_BRANCH},env.GIT_COMMIT=${env.GIT_COMMIT}"
}
setBuildStatus("Test running", "PENDING");
}
}
stage('Test') {
steps {
sh 'CGO_ENABLED=0 GOCACHE=$(pwd)/.cache go test *.go'
}
}
}
post {
success {
setBuildStatus("Test success", "SUCCESS");
}
failure {
setBuildStatus("Test failed", "FAILURE");
}
}
}

Delivery Pipeline#

The delivery pipeline, similar to the test pipeline, first pulls codes in prod branch of the git repo. Then use Docker to build images and push it to remote image repo. (Here we use DockerHub, the withRegistry function takes image repo location and the Credential ID of the repo as parameters). After image been built, the pipeline converts Application YAML file into JSON file, with GIT_COMMIT injected. Finally, the pipeline sends POST requests to KubeVela apiserver (here is http://47.88.24.19/) for creating or updating target application.

Currently, KubeVela apiserver takes JSON object as inputs. Therefore we do extra conversion in the delivery pipeline. In the future, the KubeVela apiserver will further improve and simplify this interaction process. The admission management will be added as well to address the security issue.

In this case we will create an application named cicd-demo-app in Namespace kubevela-demo-namespace. Notice that the Namespace need to be created in Kubernetes in advance. KubeVela apiserver will simplify it in later version.

void setBuildStatus(String message, String state) {
step([
$class: "GitHubCommitStatusSetter",
reposSource: [$class: "ManuallyEnteredRepositorySource", url: "https://github.com/Somefive/KubeVela-demo-CICD-app"],
contextSource: [$class: "ManuallyEnteredCommitContextSource", context: "ci/jenkins/deploy-status"],
errorHandlers: [[$class: "ChangingBuildStatusErrorHandler", result: "UNSTABLE"]],
statusResultSource: [ $class: "ConditionalStatusResultSource", results: [[$class: "AnyBuildResult", message: message, state: state]] ]
]);
}
pipeline {
agent any
stages {
stage('Prepare') {
steps {
script {
def checkout = git branch: 'prod', url: 'https://github.com/Somefive/KubeVela-demo-CICD-app.git'
env.GIT_COMMIT = checkout.GIT_COMMIT
env.GIT_BRANCH = checkout.GIT_BRANCH
echo "env.GIT_BRANCH=${env.GIT_BRANCH},env.GIT_COMMIT=${env.GIT_COMMIT}"
setBuildStatus("Deploy running", "PENDING");
}
}
}
stage('Build') {
steps {
script {
docker.withRegistry("https://registry.hub.docker.com", "DockerHubCredential") {
def customImage = docker.build("somefive/kubevela-demo-cicd-app")
customImage.push()
}
}
}
}
stage('Deploy') {
steps {
sh 'wget -q "https://github.com/mikefarah/yq/releases/download/v4.12.1/yq_linux_amd64"'
sh 'chmod +x yq_linux_amd64'
script {
def app = sh (
script: "./yq_linux_amd64 eval -o=json '.spec' app.yaml | sed -e 's/GIT_COMMIT/$GIT_COMMIT/g'",
returnStdout: true
)
echo "app: ${app}"
def response = httpRequest acceptType: 'APPLICATION_JSON', contentType: 'APPLICATION_JSON', httpMode: 'POST', requestBody: app, url: "http://47.88.24.19/v1/namespaces/kubevela-demo-namespace/applications/cicd-demo-app"
println('Status: '+response.status)
println('Response: '+response.content)
}
}
}
}
post {
success {
setBuildStatus("Deploy success", "SUCCESS");
}
failure {
setBuildStatus("Deploy failed", "FAILURE");
}
}
}

NOTE: the deploy stage is written with KubeVela v1.1. The apiserver interaction method is updated in KubeVela v1.2, leveraging VelaUX (the UI dashboard) and webhook trigger. If you are using KubeVela v1.2.0+, you should refer to the latest documents.

Performance#

After finishing the configuration process described above, the whole process of continuous delivery has already been set up. Let's check how it works.

pipeline-overview

First, we set the VERSION constant in main.go to Bad Version Number, aka,

const VERSION = "Bad Version Number"

Then, we submit this change to dev branch. We can see that the test pipeline in Jenkins is triggered and the failure status is written back to GitHub.

test-pipeline-fail test-github-fail

We edit the VERSION to 0.1.1 again and resubmit it. Now we see that the test pipeline is successfully executed, with the commit in GitHub marked as succeeded.

test-pipeline-success test-github-success

Then we issue a Pull Request to merge dev branch into prod branch.

pull-request

The Jenkins delivery pipeline is triggered once the Pull Request is accepted. After execution finished, the latest commit in prod branch is also marked as succeeded.

deploy-pipeline-success deploy-github-success

$ kubectl get app -n kubevela-demo-namespace
NAME COMPONENT TYPE PHASE HEALTHY STATUS AGE
kubevela-demo-cicd-app kubevela-demo-app-web webservice running true 112s
$ kubectl get deployment -n kubevela-demo-namespace
NAME READY UP-TO-DATE AVAILABLE AGE
kubevela-demo-app-web-v1 2/2 2 2 2m1s

As shown above, the target application is successfully accepted by KubeVela apiserver and related resources are created by KubeVela controller. The current replica number of Deployment is 2. After deleting batchPartition : 0 in the rollout trait of the application, which means confirming current release, the Deployment replica is updated to 5. Now we can access the domain configured in Ingress and get the current version number.

$ kubectl edit app -n kubevela-demo-namespace
application.core.oam.dev/kubevela-demo-cicd-app edited
$ kubectl get deployment -n kubevela-demo-namespace -w
NAME READY UP-TO-DATE AVAILABLE AGE
kubevela-demo-app-web-v1 4/5 5 4 3m39s
kubevela-demo-app-web-v1 5/5 5 5 3m39s
kubevela-demo-app-web-v1 5/5 5 5 3m40s
kubevela-demo-app-web-v1 5/5 5 5 3m40s
$ curl http://kubevela-demo-cicd-app.cf7c0ed25b151437ebe1ef58efc29bca4.us-west-1.alicontainer.com/
Version: 0.1.1

Repeat the steps above. Upgrade the version number to 0.1.2. Finish both test pipeline and delivery pipeline. Then we will see there is a version change to the Deployment managed by the target application. The replica number of the old Deployment decreases from 5 to 3 while the new one contains 2 replica at this moment. If we access the service now, we will find sometimes the old version number is returned and sometimes the new version number is displayed. This is because when rolling update the application, both new version replica and old version replica exist. The incoming traffic will be dispatched to different version replica. Therefore we can observe two different version at the same time.

$ kubectl get deployment -n kubevela-demo-namespace -w
NAME READY UP-TO-DATE AVAILABLE AGE
kubevela-demo-app-web-v1 5/5 5 5 11m
kubevela-demo-app-web-v2 0/0 0 0 0s
kubevela-demo-app-web-v2 0/0 0 0 0s
kubevela-demo-app-web-v2 0/0 0 0 0s
kubevela-demo-app-web-v1 5/5 5 5 12m
kubevela-demo-app-web-v2 0/0 0 0 0s
kubevela-demo-app-web-v2 0/0 0 0 0s
kubevela-demo-app-web-v2 0/0 0 0 0s
kubevela-demo-app-web-v2 0/0 0 0 0s
kubevela-demo-app-web-v2 0/2 0 0 0s
kubevela-demo-app-web-v2 0/2 0 0 0s
kubevela-demo-app-web-v2 0/2 0 0 0s
kubevela-demo-app-web-v2 0/2 2 0 0s
kubevela-demo-app-web-v1 5/5 5 5 12m
kubevela-demo-app-web-v2 1/2 2 1 2s
kubevela-demo-app-web-v2 2/2 2 2 2s
kubevela-demo-app-web-v1 5/3 5 5 13m
kubevela-demo-app-web-v1 5/3 5 5 13m
kubevela-demo-app-web-v1 3/3 3 3 13m
$ curl http://kubevela-demo-cicd-app.cf7c0ed25b151437ebe1ef58efc29bca4.us-west-1.alicontainer.com/
Version: 0.1.1
$ curl http://kubevela-demo-cicd-app.cf7c0ed25b151437ebe1ef58efc29bca4.us-west-1.alicontainer.com/
Version: 0.1.1
$ curl http://kubevela-demo-cicd-app.cf7c0ed25b151437ebe1ef58efc29bca4.us-west-1.alicontainer.com/
Version: 0.1.1
$ curl http://kubevela-demo-cicd-app.cf7c0ed25b151437ebe1ef58efc29bca4.us-west-1.alicontainer.com/
Version: 0.1.1
$ curl http://kubevela-demo-cicd-app.cf7c0ed25b151437ebe1ef58efc29bca4.us-west-1.alicontainer.com/
Version: 0.1.2
$ curl http://kubevela-demo-cicd-app.cf7c0ed25b151437ebe1ef58efc29bca4.us-west-1.alicontainer.com/
Version: 0.1.2
$ curl http://kubevela-demo-cicd-app.cf7c0ed25b151437ebe1ef58efc29bca4.us-west-1.alicontainer.com/
Version: 0.1.2
$ curl http://kubevela-demo-cicd-app.cf7c0ed25b151437ebe1ef58efc29bca4.us-west-1.alicontainer.com/
Version: 0.1.2
$ curl http://kubevela-demo-cicd-app.cf7c0ed25b151437ebe1ef58efc29bca4.us-west-1.alicontainer.com/
Version: 0.1.1

After confirming new services are functioning correctly, we can remove the batchPartition: 0 as described above to complete the whole canary release process.

$ kubectl get deployment -n kubevela-demo-namespace -w
NAME READY UP-TO-DATE AVAILABLE AGE
kubevela-demo-app-web-v1 3/3 3 3 18m
kubevela-demo-app-web-v2 2/2 2 2 5m24s
kubevela-demo-app-web-v2 2/5 2 2 5m36s
kubevela-demo-app-web-v2 2/5 2 2 5m37s
kubevela-demo-app-web-v2 2/5 2 2 5m37s
kubevela-demo-app-web-v2 2/5 5 2 5m37s
kubevela-demo-app-web-v2 3/5 5 3 5m38s
kubevela-demo-app-web-v2 4/5 5 4 5m38s
kubevela-demo-app-web-v2 5/5 5 5 5m39s
kubevela-demo-app-web-v1 3/0 3 3 18m
kubevela-demo-app-web-v1 3/0 3 3 18m
kubevela-demo-app-web-v1 0/0 0 0 18m
kubevela-demo-app-web-v1 0/0 0 0 18m
kubevela-demo-app-web-v2 5/5 5 5 5m41s
kubevela-demo-app-web-v2 5/5 5 5 5m41s
kubevela-demo-app-web-v1 0/0 0 0 18m

Conclusion#

In summary, we executed the whole continuous delivery process successfully. In this process, developers can easily update and deploy their applications, with the help of KubeVela and Jenkins. Besides, developers can use their favourite tools in different stages, such as substituting GitHub with Gitlab, or using TravisCI instead of Jenkins.

Readers might also notice that this progress can not only upgrade the application service, but also change deployment plan via editing app.yaml, such as scaling up or adding sidecars, which works like classical push-style GitOps. About more KubeVela GitOps content, you can refer to other related case studies.

KubeVela Performance Test - Managing Massive Applications

As an application management and integration platform, KubeVela needs to handle thousands of applications in production scenario. To evaluate the performance of KubeVela, develop team has conducted performance tests based on simultated environments and demonstrated the capability of managing a large number of applications concurrently.

Setup#

Cluster Environment#

Working with large clusters requires lots of resources, such as Machines, Network Bandwidth, Storages and many other devices. Therefore, KubeVela team adopts kubemark, an official tool provided by kubernetes, to simulate large clusters by mocking hundreds of kubelets. Each kubelet works like a real node except that they do not run real containers inside pods. The aim of KubeVela performance test mainly focus on whether KubeVela controller can manage thousands of applications effectively, instead of pulling images or executing commands inside pods. As a result, we only need to get resources hosting these fake nodes, also named as hollow-nodes.

We set up the Kubernetes clusters on Alibaba Cloud which includes 5 master nodes and 15 worker nodes. The master nodes wil host Kubernetes core components such as kube-apiserver and kube-controller-manager. The worker nodes need to run other pressure-test related componets, including monitoring tools, KubeVela controller and kubemark pods. Since the major target is to test the performance of KubeVela controller, we do not expect other components to be the bottleneck of the pressure test. To this end, both master nodes and worker nodes are equiped with 32 cores and 128 Gi memory. We use the combination of Prometheus, Loki and Grafana as the monitoring suites and grant them enough resources in avoid of crash caused by Out-of-Memory.

Notice that KubeVela controller and monitoring tools need to be placed on real nodes to function while all pods created during performance tests should be assigned to hollow-nodes correctly. To achieve that, we give different taints to hollow-nodes and real nodes, and add corresponding tolerations to different pods.

Application#

To simulate real applications in production, we design an application template with 2 components and 5 functional traits, including

  • 1 webservice component
    • a scaler trait setting its replica to 3
    • a sidecar trait attaching another container to each pod
    • an ingress trait generating one ingress instance and one service instance
  • 1 worker component
    • a scaler trait also setting its replica to 3
    • a configmap trait generating a new configmap and attaching it to worker pods

In the following experiment, we test the performance of KubeVela controller managing 3,000 Applications (12,000 Pods in total) on 200 nodes. Applications are created in parallel at first, then kept running for a while, and finally deleted from the cluster. Each application will be reconciled multiple times with latencies and consumed resources recorded by monitoring tools.

In practice, we also have another trait used for adding tolerations as described above.

KubeVela Controller#

The KubeVela controller is set up with a group of recommendation configurations as follows

  • Kubernetes Resource
    • 0.5 core CPU
    • 1 Gi Memory
    • 1 replica
  • Program
    • concurrent-reconciles=2 (The number of reconcile threads)
    • kube-api-qps=300 (The qps of kubernetes client used in controller)
    • kube-api-burst=500 (The burst of kubernetes client used in controller)
    • informer-re-sync-interval=20m (The interval of routine reconciles.) We will analyze these settings in the below sections.

To evaluate the performance of KubeVela Controller itself, we disabled the Ingress MutatingWebhook and Application ValidatingWebhook which is beyond the focus of this test but will affect the performance of KubeVela Controller by increasing the latency of creating/patching resources.

Experiments#

Creation#

The creation of all 3,000 applications lasted 25min. Getting all pods running takes a bit longer time, which is out of the scope of KubeVela controller.

For each application creation, it will trigger three turns of reconciling. The usage of CPU will reach 100% in the late period of creation. The memory usage will increase as the number of applications rises. It reaches around 67% at the end of creation.

create-cpu create-memory

The average time of the first turn reconciling is relatively short since it only needs patch finalizer. The second and third turn reconciling contain full reconcile cycles and need more time to process. The following charts record the time consumptions of different period during reconciling applications. The average time is generally below 200ms while 99% of reconciles uses less than 800ms.

create-avg-time create-p99-time

Regular Reconciles#

After creation, applications are reconciled by controller every 20min. the monitoring of 8-hour reconcile process are displayed as below. The CPU usage will come up to 90% once reconcile happens routinely. The memory usage generally keeps a stable pattern, up to 75% memory usage.

med-cpu med-memory

The average reconcile time is under 200ms while 99% are about 800ms~900ms. Each regular reconcile for all applications generally takes around 10min.

med-avg med-p99

Deletion#

The application deletion process is fast and low-resource consumptive. It takes less than 3min to delete all applications. However, notice that the deletion of resources managed by application usually takes longer time. This is because the cleanup of these resources (such as deployments or pods) are not directly controlled by the KubeVela controller. KubeVela controller takes charge of deleting their owner and cleanup them by triggering cascading deletion. In addition, each deletion is associated with two turns of reconcile where the second turn returns immediately when it fails to retrieve target applciation (since it is deleted).

del-cpu del-memory del-avg del-p99

Analysis#

Number of Applications#

The experiment displayed above demonstrates a classical scenario for KubeVela. Although 3,000 applications are successfully managed by the KubeVela controller in this case, it is strongly recommended to adopt a smaller number (such as 2,000) of applications with the above configuration for the following reasons: The time and resource consumption is closely associated with the spec of applications. If a lot of users apply larger applications with more pods and more other resources, 3,000 applications might break the resource limit more easily. The memory and CPU usage shown above is approching resource limits. If memory drains, the KubeVela controller will crash and restart. If high CPU usage maintains for a long time, it might cause a long waiting queue in KubeVela controller which further lead to longer response time for application changes.

Configurations#

There are several parameters users could config to adapt into their own scenario. Using more replica for KubeVela controller do not scale up the ability of KubeVela controller. The leader election mechanism ensures that only one replica will work while others will wait. The aim of multi-replica is to support fast recovery when the working one crash. However, if the crash is caused by OOM, the recovery usually will not be able to fix that. The number of qps and burst in the program configuration should be increased accordingly while scaling up KubeVela controller. These two parameters limit the capability for controller to send requests to apiserver. Generally, in order to scale up KubeVela controller, scale up the resource limits and all the program parameters mentioned above (except the reconcile interval). If you have more applications to manage, add more memory. If you have higher operation frequency, add more CPU and threads, then increase qps and burst accordingly. Longer reconcile interval allows more applications to handle, at the cost of longer time to fix potential underlying resource problems. For example, if one deployment managed by one application disappears, the routine reconciling can discover this problem and fix it.

Scaling Up#

In addition to the experiment described above, we conducted another two experiment to test how well KubeVela controller can scale to larger clusters and more applications.

In a 500-node cluster, we tried to create 5,000 applications with the same spec described above, at the speed of 4 per second and lasted for around 21min. 1 core and 2 Gi memory are granted to KubeVela controller with the use of 4 concurrent reconcile threads. The kube-api-qps and kube-api-burst are raised to 500/800 correspondingly. All 30,000 pods successfully turn into Running phase. The time costs for each reconcile is similar to the previous experiment and CPU/Memory cost is not very high compared to the given resources. The regular reconciles for 5,000 applications takes 7~8 minutes, and no significant resource cost is observed. During this scaling, we found that the throughput of kube-apiserver starts to block the creation of application, as too many resources need to be created while applying applications.

std-cpu std-memory

Scaling up to 12,000 applications on 1,000 nodes is much harder than previous attempts. With the same creation speed, the apiserver will be flooded by lots of pod scheduling requests and finally start to drop application creation request. To overcome this difficulty, we divide the creation process of applications into several stage. Each stage only create 1,000~3,000 applications and the next stage will not begin until all pods are ready. With this strategy, we successfully create 12,000 applications, 24,000 deployments, 12,000 services, 12,000 ingress, 12,000 configmaps and 72,000 pods. The whole process takes about 30 hours. To hold this number of applications, KubeVela controller consumes 1.7 cores and 2.45 Gi memory. It takes about 12min to finish a full turn of regular reconciles for all 12,000 applications.

large-cpu large-memory large-all

Future#

The performance test of KubeVela demonstrates its ability of managing thousands of applications with limited resource consumption. It can also scale up to over 10,000 applications on large clusters with 1,000 nodes. In addition, KubeVela team also conducted similar pressure test for non-deployment based applications such as CloneSet in OpenKruise (which is not enclosed in this report) and reach same conclusions. In the future, we will add more performance tests for more complex scenario like Workflow or MultiCluster.

KubeVela - The Extensible App Platform Based on Open Application Model and Kubernetes

Lei Zhang and Fei Guo

Lei Zhang and Fei Guo

CNCF TOC Member/Kubernetes

7 Dec 2020 12:33pm, by Lei Zhang and Fei Guo

image

Last month at KubeCon+CloudNativeCon 2020, the Open Application Model (OAM) community launched KubeVela, an easy-to-use yet highly extensible application platform based on OAM and Kubernetes.

For developers, KubeVela is an easy-to-use tool that enables you to describe and ship applications to Kubernetes with minimal effort, yet for platform builders, KubeVela serves as a framework that empowers them to create developer-facing yet fully extensible platforms at ease.

The trend of cloud native technology is moving towards pursuing consistent application delivery across clouds and on-premises infrastructures using Kubernetes as the common abstraction layer. Kubernetes, although excellent in abstracting low-level infrastructure details, does introduce extra complexity to application developers, namely understanding the concepts of pods, port exposing, privilege escalation, resource claims, CRD, and so on. We’ve seen the nontrivial learning curve and the lack of developer-facing abstraction have impacted user experiences, slowed down productivity, led to unexpected errors or misconfigurations in production.

Abstracting Kubernetes to serve developers’ requirements is a highly opinionated process, and the resultant abstractions would only make sense had the decision-makers been the platform builders. Unfortunately, the platform builders today face the following dilemma: There is no tool or framework for them to easily extend the abstractions if any.

Thus, many platforms today introduce restricted abstractions and add-on mechanisms despite the extensibility of Kubernetes. This makes easily extending such platforms for developers’ requirements or to wider scenarios almost impossible.

In the end, developers complain those platforms are too rigid and slow in response to feature requests or improvements. The platform builders do want to help but the engineering effort is daunting: any simple API change in the platform could easily become a marathon negotiation around the opinionated abstraction design.

Introducing KubeVela#

With KubeVela, we aim to solve these two challenges in an approach that separates concerns of developers and platform builders.

For developers, KubeVela is an easy-to-use yet extensible tool that enables you to describe and deploy microservices applications with minimal effort. And instead of managing a handful of Kubernetes YAML files, a simple docker-compose style appfile is all you need.

A Sample Appfile#

In this example, we will create a vela.yaml along with your app. This file describes how to build the image, how to deploy the image to Kubernetes, how to access the application and how the system would scale it automatically.

name: testapp
services:
express-server:
image: oamdev/testapp:v1
build:
docker:
file: Dockerfile
contrxt: .
cmd: ["node", "server.js"]
port: 8080
cpu: "0.01"
route:
domain: example.com
rules:
- path: /testapp
rewriteTarget: /
autoscale:
min: 1
max: 4
cpuPercent: 5

Just do: $ vela up, your app will then be alive on https://example.com/testapp.

Behind the Appfile#

The appfile in KubeVela does not have a fixed schema specification, instead, what you can define in this file is determined by what kind of workload types and traits are available in your platform. These two concepts are core concepts from OAM, in detail:

  • Workload type, which declares the characteristics that runtime infrastructure should take into account in application deployment. In the sample above, it defines a “Web Service” workload named express-server as part of your application.
  • Trait, which represents the operation configurations that are attached to an instance of workload type. Traits augment a workload type instance with operational features. In the sample above, it defines a route trait to access the application and an autoscale trait for the CPU based horizontal automatic scaling policy.

Whenever a new workload type or trait is added, it would become immediately available to be declared in the appfile. Let’s say, a new trait named metrics is added, developers could check the schema of this trait by simply $ vela show metrics and define it in the previous sample appfile:

name: testapp
services:
express-server:
type: webservice
image: oamdev/testapp:v1
build:
docker:
file: Dockerfile
contrxt: .
cmd: ["node", "server.js"]
port: 8080
cpu: "0.01"
route:
domain: example.com
rules:
- path: /testapp
rewriteTarget: /
autoscale:
min: 1
max: 4
cpuPercent: 5
metrices:
port: 8080
path: "/metrics"
scheme: "http"
enabled: true

Vela Up#

The vela up command deploys the application defined in appfile to Kubernetes. After deployment, you can use vela status to check how to access your application following the route trait declared in appfile.

Apps deployed with KubeVela will receive a URL (and versioned pre-release URLs) with valid TLS certificate automatically generated via cert-manager. KubeVela also provides a set of commands (i.e. vela logs, vela exec) to best support your application management without becoming a Kubernetes expert. Learn more about vela up and appfile.

KubeVela for Platform Builders#

The above experience cannot be achieved without KubeVela’s innovative offerings to the platform builders as an extensible platform engine. These features are the hidden gems that make KubeVela unique. In details, KubeVela relieves the pains of building developer facing platforms on Kubernetes by doing the following:

  • Application Centric. Behind the appfile, KubeVela enforces “application” as its main API and all KubeVela’s capabilities serve the applications’ requirements only. This is how KubeVela brings application-centric context to the platform by default and changes building such platforms into working around application architecture.
  • Extending Natively. As mentioned in the developer section, an application described by appfile is composed of various pluggable workload types and operation features (i.e. traits). Capabilities from Kubernetes ecosystem can be added to KubeVela as new workload types or traits through Kubernetes CRD registry mechanism at any time.
  • Simple yet Extensible User Interface. Behind the appfile, KubeVela uses CUELang as the “last mile” abstraction engine between user-facing schema and the control plane objects. KubeVela provides a set of built-in abstractions to start with and the platform builders are free to modify them at any time. Capability adding/updating or abstraction changes will all take effect at runtime, neither recompilation nor redeployment of KubeVela is required.

Under the hood, KubeVela core is built on top of Crossplane OAM Kubernetes Runtime with KEDA, Flagger, Prometheus, etc as dependencies, yet its feature pool is “unlimited” and can be extended at any time.

With KubeVela, platform builders now have the tooling support to design and ship any new capabilities with abstractions to end-users with high confidence and low turnaround time. And for a developer, you only need to learn these abstractions, describe the app with them in a single file, and then ship it.

Not Another PaaS System#

Most typical Platform-as-a-Service (PaaS) systems also provide full application management capabilities and aim to improve developer experience and efficiency. In this context, KubeVela shares the same goal.

Though unlike most typical PaaS systems which are either inextensible or create their own addon systems maintained by their own communities. KubeVela is designed to fully leverage the Kubernetes ecosystems as its capability pool. Hence, there’s no additional addon system introduced in this project. For platform builders, a new capability can be installed in KubeVela at any time by simply registering its API resource to OAM and providing a CUE template. We hope and expect that with the help of the open source community, the number of the KubeVela’s capabilities will grow dramatically over time. Learn more about using community capabilities by $vela cap.

So in a nutshell, KubeVela is a Kubernetes plugin for building application-centric abstractions. It leverages the native Kubernetes extensibility and capabilities to resolve a hard problem – making application management enjoyable on Kubernetes.

Learn More#

KubeVela is incubated by the OAM community as the successor of Rudr project, while rather than being a reference implementation, KubeVela intends to be an end-to-end implementation that could be used in wider scenarios. The design of KubeVela’s appfile is also part of the experimental attempt in OAM specification to bring a simplified user experience to developers.

To learn more about KubeVela, please visit KubeVela’s documentation site. The following content are also good next steps:

  • Try out KubeVela following the step-by-step tutorial in its Quick Start page.
  • Give us feedback! KubeVela is still in its early stage and we are happy to ask the community for feedback via OAM Gitter or Slack channel.
  • Extend KubeVela to build your own platforms. If you have an idea for a new workload type, trait or try to build something more complex like a database or AI PaaS with KubeVela, post your idea as a GitHub Issue or propose it to the OAM community, we are eager to know.
  • Contribute to KubeVela. KubeVela is initialized by the open source community with bootstrap contributors from 8+ different organizations. We intend to donate this project to a neutral foundation as soon as it gets stable.