In this article, we will build a custom operator from start to finish that will deploy a simple web app. The web app on its own is not capable of much and depends on the operator to provide the necessary bits of information to deploy and run successfully. We will provide all the required information necessary to run our web app in a custom resource (CR), which our operator will then use to deploy our application. We can deploy multiple web apps to demonstrate this workflow and how the operator will maintain different sets of deployments and related variables. Click here to learn more about operators and how they help extend Kubernetes.
Our focus will be on building the operator and diving into its logic. I will not cover installing Operator-SDK, as you can find that here. I will break down each function within the business logic, along with the related helper functions, to give you a deeper understanding of how it works. There is a fair bit of code to cover, and the purpose is to get you up and running with the concepts so you can roll your own custom controllers specific to your business needs.
The Operator-SDK is a framework that uses the controller-runtime library to make writing operators easier by providing:
- High level APIs and abstractions to write the operational logic more intuitively
- Tools for scaffolding and code generation to bootstrap a new project quickly
- Extensions to cover common operator use cases
Let’s get to it!
Initialize your environment
1) Make sure you’ve got your Kubernetes cluster running
2) Get into your Go workspace (cd $GOPATH/src/github.com/username/)
3) Initialize a new webapp-operator project using the operator-sdk.
$ operator-sdk new webapp-operator
INFO[0000] Creating new Go operator 'webapp-operator'.
INFO[0000] Created go.mod
INFO[0000] Created tools.go
INFO[0000] Created cmd/manager/main.go
INFO[0000] Created build/Dockerfile
INFO[0000] Created build/bin/entrypoint
INFO[0000] Created build/bin/user_setup
INFO[0000] Created deploy/service_account.yaml
INFO[0000] Created deploy/role.yaml
INFO[0000] Created deploy/role_binding.yaml
INFO[0000] Created deploy/operator.yaml
INFO[0000] Created pkg/apis/apis.go
INFO[0000] Created pkg/controller/controller.go
INFO[0000] Created version/version.go
INFO[0000] Created .gitignore
INFO[0000] Validating project
INFO[0006] Project validation successful.
INFO[0006] Project creation complete.
webapp-operator/
By running the operator-sdk new command, we scaffold out a framework for our project.
Folders | Purpose |
---|---|
pkg/apis | Contains the APIs for our CR |
pkg/controller | Contains the Controller implementations for our Operator |
build | Contains the Dockerfile and build scripts used to build the operator |
deploy | Contains various YAML manifests for registering CRDs, setting up RBAC, and deploying the operator as a Deployment |
cmd | Contains manager/main.go, which is the main program of the operator. It creates a new manager, which registers all custom resource definitions and starts all controllers |
.
├── build
│ ├── Dockerfile
│ └── bin
│ ├── entrypoint
│ └── user_setup
├── cmd
│ └── manager
│ └── main.go
├── deploy
│ ├── operator.yaml
│ ├── role.yaml
│ ├── role_binding.yaml
│ └── service_account.yaml
├── go.mod
├── go.sum
├── pkg
│ ├── apis
│ │ └── apis.go
│ └── controller
│ └── controller.go
├── tools.go
└── version
└── version.go
4) Create the Custom Resource and its API using the operator-sdk.
$ operator-sdk add api --api-version=blog.arctiq.com/v1alpha1 --kind=WebApp
INFO[0000] Generating api version blog.arctiq.com/v1alpha1 for kind WebApp.
INFO[0000] Created pkg/apis/blog/group.go
INFO[0001] Created pkg/apis/blog/v1alpha1/webapp_types.go
INFO[0001] Created pkg/apis/addtoscheme_blog_v1alpha1.go
INFO[0001] Created pkg/apis/blog/v1alpha1/register.go
INFO[0001] Created pkg/apis/blog/v1alpha1/doc.go
INFO[0001] Created deploy/crds/blog.arctiq.com_v1alpha1_webapp_cr.yaml
INFO[0001] Running deepcopy code-generation for Custom Resource group versions: [blog:[v1alpha1], ]
INFO[0009] Code-generation complete.
INFO[0009] Running CRD generator.
INFO[0010] CRD generation complete.
INFO[0010] API generation complete.
What was created?
A new CustomResourceDefinition defining our WebApp object so Kubernetes will know about it.
deploy/crds:
├── blog.arctiq.com_v1alpha1_webapp_cr.yaml
└── blog.arctiq.com_webapps_crd.yaml
APIs - A general manifest for deploying apps of type WebApp
└── blog
├── group.go
└── v1alpha1
├── doc.go
├── register.go
├── webapp_types.go
└── zz_generated.deepcopy.go
- Modify webapp_types.go and add the following custom variables to WebAppSpec and WebAppStatus structs
- The fields in WebAppSpec map directly back to the CR
- The fields in WebAppStatus are used to report status of the CR
// WebAppSpec defines the desired state of WebApp
type WebAppSpec struct {
	Count    int32  `json:"count"`
	Image    string `json:"image"`
	Port     int32  `json:"port"`
	Webgroup string `json:"webgroup"`
	Message  string `json:"message"`
}

// WebAppStatus defines the observed state of WebApp
type WebAppStatus struct {
	Nodes   []string `json:"nodes"`
	Message string   `json:"message"`
}
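To illustrate how the `json` tags tie the struct to the CR, here is a minimal, standalone sketch that decodes the JSON form of a spec section into `WebAppSpec`. The `parseSpec` helper and the sample values are hypothetical and exist only for this illustration; in the operator itself the client library performs the decoding for you.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// WebAppSpec mirrors the struct from webapp_types.go shown above.
type WebAppSpec struct {
	Count    int32  `json:"count"`
	Image    string `json:"image"`
	Port     int32  `json:"port"`
	Webgroup string `json:"webgroup"`
	Message  string `json:"message"`
}

// parseSpec is a hypothetical helper: it decodes the JSON form of a
// CR's spec section into a WebAppSpec using the struct's json tags.
func parseSpec(raw []byte) (WebAppSpec, error) {
	var spec WebAppSpec
	err := json.Unmarshal(raw, &spec)
	return spec, err
}

func main() {
	// Sample values are made up for the illustration.
	raw := []byte(`{"count": 3, "image": "example/web:latest", "port": 8080, "webgroup": "demo", "message": "hello"}`)
	spec, err := parseSpec(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", spec)
}
```

Each key in the CR's spec section lands in the field whose tag carries the same name, which is why renaming a tag also changes the key users must write in their CR.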
5) Important: Run operator-sdk generate k8s to regenerate code after modifying this file
$ operator-sdk generate k8s
INFO[0000] Running deepcopy code-generation for Custom Resource group versions: [blog:[v1alpha1], ]
INFO[0007] Code-generation complete.
This command generated the DeepCopy methods.
└── blog
└── zz_generated.deepcopy.go
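To give a feel for what a DeepCopy method does, here is a hand-written approximation for `WebAppStatus`. It is a sketch for illustration only, not the generated code from zz_generated.deepcopy.go, which you should never edit by hand.

```go
package main

import "fmt"

// WebAppStatus mirrors the struct from webapp_types.go.
type WebAppStatus struct {
	Nodes   []string `json:"nodes"`
	Message string   `json:"message"`
}

// DeepCopy returns a copy that shares no memory with the receiver.
// Hand-written approximation of what the generator produces.
func (in *WebAppStatus) DeepCopy() *WebAppStatus {
	if in == nil {
		return nil
	}
	out := new(WebAppStatus)
	*out = *in // copies Message and the slice header
	if in.Nodes != nil {
		// Allocate a fresh backing array so the copy does not alias the original.
		out.Nodes = make([]string, len(in.Nodes))
		copy(out.Nodes, in.Nodes)
	}
	return out
}

func main() {
	orig := &WebAppStatus{Nodes: []string{"pod-a", "pod-b"}, Message: "hi"}
	cp := orig.DeepCopy()
	cp.Nodes[0] = "changed"
	fmt.Println(orig.Nodes[0], cp.Nodes[0]) // prints: pod-a changed
}
```

Without the explicit slice copy, modifying the copy would also modify the original, which is why the code needs regenerating whenever the fields of the types change.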
6) Let’s add a controller
$ operator-sdk add controller --api-version=blog.arctiq.com/v1alpha1 --kind=WebApp
INFO[0000] Generating controller version blog.arctiq.com/v1alpha1 for kind WebApp.
INFO[0000] Created pkg/controller/webapp/webapp_controller.go
INFO[0000] Created pkg/controller/add_webapp.go
INFO[0000] Controller generation complete.
.
├── build
...
│ └── controller
│ ├── add_webapp.go
│ ├── controller.go
│ └── webapp
│ └── webapp_controller.go
What was created?
- The pkg/controller/webapp/webapp_controller.go and pkg/controller/add_webapp.go files were generated
The webapp_controller.go file is where our main controller logic lives, so let’s dive into it. The reconcile function is responsible for bringing the actual state of our resources in line with the desired state declared in the CR, according to the business logic we implement. It works like a loop and keeps attempting to reconcile state until every check in that logic passes. The functions within our business logic need to return status to our controller to indicate whether an operation was successful. The controller will re-queue the request to be processed again if the returned error is non-nil or Result.Requeue is true; otherwise, upon completion, it will remove the work from the queue.
Purpose | Return Method |
---|---|
Return to parent with an error | return reconcile.Result{}, err |
Return to parent without an error and requeue the operation | return reconcile.Result{Requeue: true}, nil |
Return to parent and stop the Reconcile | return reconcile.Result{}, nil |
For more information on the underlying Go package reconcile see here
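The requeue rule described above can be sketched in a few lines. The `Result` type here is a local stand-in for `reconcile.Result`, reduced to its `Requeue` field, and `shouldRequeue` is a hypothetical name for a decision the controller makes internally.

```go
package main

import (
	"errors"
	"fmt"
)

// Result is a local stand-in for reconcile.Result (assumption: only
// the Requeue field matters for this illustration).
type Result struct {
	Requeue bool
}

// errUpdateFailed is a sample error used only for the demonstration.
var errUpdateFailed = errors.New("update failed")

// shouldRequeue mirrors the rule from the text: the request is queued
// again when the error is non-nil or Result.Requeue is true.
func shouldRequeue(res Result, err error) bool {
	return err != nil || res.Requeue
}

func main() {
	fmt.Println(shouldRequeue(Result{}, errUpdateFailed))  // true: returned with an error
	fmt.Println(shouldRequeue(Result{Requeue: true}, nil)) // true: explicit requeue
	fmt.Println(shouldRequeue(Result{}, nil))              // false: work is done
}
```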
This is the main reconcile function within the controller.
Line | Purpose |
---|---|
5 | Fetch the WebApp instance |
10 | If requested object not found, return and don’t requeue |
12 | Error reading the object - requeue the request. |
17 | End of loop, all is well, return with no error |
All the business logic below will live after line 15 and before the final return.
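As a rough sketch of the flow in the table above (fetch, handle not-found, handle read errors, finish), the following standalone program models the skeleton with local stand-ins. `Result`, `WebApp`, `getter`, `errNotFound`, and `errAPI` are all hypothetical substitutes for the controller-runtime types and client; the line numbers in the table refer to the original listing, not to this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins (assumptions) for reconcile.Result and the WebApp type.
type Result struct{ Requeue bool }
type WebApp struct{ Name string }

var (
	errNotFound = errors.New("not found")       // stands in for a NotFound API error
	errAPI      = errors.New("api unavailable") // stands in for any other read error
)

// getter fetches a WebApp by name; a stand-in for the client's Get call.
type getter func(name string) (*WebApp, error)

// reconcileWebApp sketches the skeleton of the main reconcile function.
func reconcileWebApp(get getter, name string) (Result, error) {
	instance, err := get(name)
	if err != nil {
		if errors.Is(err, errNotFound) {
			// The object is gone; return and don't requeue.
			return Result{}, nil
		}
		// Error reading the object; returning it requeues the request.
		return Result{}, err
	}
	_ = instance // the business logic would operate on instance here
	return Result{}, nil
}

func main() {
	found := func(name string) (*WebApp, error) { return &WebApp{Name: name}, nil }
	missing := func(string) (*WebApp, error) { return nil, errNotFound }
	broken := func(string) (*WebApp, error) { return nil, errAPI }
	for _, get := range []getter{found, missing, broken} {
		res, err := reconcileWebApp(get, "example-webapp")
		fmt.Printf("result=%+v err=%v\n", res, err)
	}
}
```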
The first deployment function is our primary workhorse. Since our operator’s purpose is to maintain one or more Deployments with particular attributes, this function does exactly that: it checks whether the Deployment already exists and, if not, creates a new one. Let’s step through the function.
Get/Create Deployment Object
Line | Purpose |
---|---|
1 | Fetch the Deployment instance (appsv1.Deployment spec) |
4 | Calls our Deployment helper function |
6 | Client library call to API to create deployment (dep) |
7-9 | If failed advise parent |
11 | Deployment recreated - Requeue |
14 | Failed to get deployment, advise parent |
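The get-or-create pattern in the table above can be sketched without a cluster. In this illustration, `store` is an in-memory stand-in for the Kubernetes API, and `Deployment`, `Result`, and `ensureDeployment` are simplified, hypothetical names rather than the real `appsv1` or controller-runtime types.

```go
package main

import (
	"errors"
	"fmt"
)

// Simplified stand-ins (assumptions) for reconcile.Result and appsv1.Deployment.
type Result struct{ Requeue bool }
type Deployment struct {
	Name     string
	Replicas int32
}

var errNotFound = errors.New("not found")

// store is an in-memory stand-in for the Kubernetes API.
type store map[string]*Deployment

func (s store) Get(name string) (*Deployment, error) {
	if d, ok := s[name]; ok {
		return d, nil
	}
	return nil, errNotFound
}

func (s store) Create(d *Deployment) error {
	if _, ok := s[d.Name]; ok {
		return errors.New("already exists")
	}
	s[d.Name] = d
	return nil
}

// ensureDeployment creates the Deployment when it is missing and asks
// for a requeue so the next pass can continue with the remaining checks.
func ensureDeployment(s store, name string, count int32) (Result, error) {
	_, err := s.Get(name)
	if errors.Is(err, errNotFound) {
		dep := &Deployment{Name: name, Replicas: count}
		if err := s.Create(dep); err != nil {
			return Result{}, err // creation failed, advise the parent
		}
		return Result{Requeue: true}, nil // created, requeue
	}
	if err != nil {
		return Result{}, err // failed to get the Deployment
	}
	return Result{}, nil // already exists, nothing to do here
}

func main() {
	s := store{}
	fmt.Println(ensureDeployment(s, "example-webapp", 3)) // first pass creates it
	fmt.Println(ensureDeployment(s, "example-webapp", 3)) // second pass finds it
}
```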
Ensure the count value from the CR is the same as the replica count in the Deployment
Line | Purpose |
---|---|
1 | Get Count value from the CR |
2-4 | If the CR value and Deployment value do not match, update the Deployment via the API client |
7 | Return to parent that Deployment could not be updated |
9 | Return to parent to Requeue. |
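A minimal sketch of this replica check, using local stand-ins, might look as follows. `Deployment`, `Result`, `updater`, and `syncReplicas` are hypothetical names; note that in the real `appsv1.DeploymentSpec` the `Replicas` field is a `*int32`, so the actual comparison dereferences a pointer.

```go
package main

import "fmt"

// Simplified stand-ins (assumptions) for reconcile.Result and appsv1.Deployment.
type Result struct{ Requeue bool }
type Deployment struct {
	Name     string
	Replicas int32
}

// updater persists a changed Deployment; a stand-in for the client's Update call.
type updater func(*Deployment) error

// syncReplicas makes the Deployment's replica count match the CR's count.
func syncReplicas(dep *Deployment, count int32, update updater) (Result, error) {
	if dep.Replicas == count {
		return Result{}, nil // already in the desired state
	}
	dep.Replicas = count
	if err := update(dep); err != nil {
		return Result{}, err // could not update, advise the parent
	}
	return Result{Requeue: true}, nil // spec changed, requeue
}

func main() {
	dep := &Deployment{Name: "example-webapp", Replicas: 3}
	noop := func(*Deployment) error { return nil }
	res, err := syncReplicas(dep, 5, noop)
	fmt.Println(dep.Replicas, res.Requeue, err) // prints: 5 true <nil>
}
```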
List the pods for this instance’s deployment and update the status.Nodes value. (This is reflected when we describe the WebApp object)
Line | Purpose |
---|---|
1-5 | Prepare PodList and ListOptions for pods in this namespace |
6 | Fetch PodList |
11 | Call Helper function to list pods |
13-21 | Check if Pod list in Status.Nodes matches those Currently Deployed. If not update to match names. |
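The comparison step at the end of this function can be sketched with `reflect.DeepEqual`. The reduced `WebAppStatus` and the `syncNodes` helper below are assumptions made for the illustration; in the operator, a change would be followed by a status update through the API client.

```go
package main

import (
	"fmt"
	"reflect"
)

// WebAppStatus is reduced to the one field this sketch needs.
type WebAppStatus struct {
	Nodes []string
}

// syncNodes replaces status.Nodes with podNames when they differ and
// reports whether a status update would be required.
func syncNodes(status *WebAppStatus, podNames []string) bool {
	if reflect.DeepEqual(status.Nodes, podNames) {
		return false
	}
	status.Nodes = podNames
	return true
}

func main() {
	status := &WebAppStatus{}
	fmt.Println(syncNodes(status, []string{"pod-a", "pod-b"})) // true: status was empty
	fmt.Println(syncNodes(status, []string{"pod-a", "pod-b"})) // false: already matches
}
```

Keep in mind that `reflect.DeepEqual` is order-sensitive, so the same pods listed in a different order count as a change.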
Similar to the Pods function above, this will match the CR’s Spec.Message to Status.Message. (This is reflected when we describe the WebApp object)
Line | Purpose |
---|---|
1-7 | Prepare PodList and ListOptions for pods in this namespace |
6 | Fetch PodList |
11 | Call Helper function to list pods |
13-21 | Check if Pod list in Status.Nodes matches those Currently Deployed. If not update to match names. |
These are our helper functions called by the reconcile loop
Line | Purpose |
---|---|
1-7 | Loops over the pods, appends each pod name to a slice of strings, and returns the slice |
13 | Builds our deployment object |
14-44 | This is the main deployment generator of the controller. Let’s take a look at this in detail. |
- All m.Spec attributes (m.Spec.Message, m.Spec.Port, m.Spec.Image, m.Spec.Count) are pulled from our CR.
- The blocks starting at lines 22 (appsv1.DeploymentSpec), 27 (corev1.PodTemplateSpec), and 31 (corev1.PodSpec) are the key components used to build a Deployment
- Once these are populated, the deployment configuration is returned to the parent and the reconcile loop attempts to apply it.
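The two helpers described above can be approximated with plain structs. Everything in this sketch is a simplified stand-in: `Pod` and `Deployment` replace the `corev1` and `appsv1` types, and the label keys and the `MESSAGE` environment variable name are assumptions made for illustration, since they are not stated in this walkthrough.

```go
package main

import "fmt"

// Simplified stand-ins (assumptions) for corev1.Pod and appsv1.Deployment.
type Pod struct{ Name string }

type WebAppSpec struct {
	Count    int32
	Image    string
	Port     int32
	Webgroup string
	Message  string
}

type Deployment struct {
	Name, Namespace string
	Replicas        int32
	Image           string
	ContainerPort   int32
	Labels          map[string]string
	Env             map[string]string
}

// getPodNames loops over the pods and collects their names.
func getPodNames(pods []Pod) []string {
	var names []string
	for _, p := range pods {
		names = append(names, p.Name)
	}
	return names
}

// deploymentForWebApp builds a Deployment from the CR's spec fields.
// The label keys and the MESSAGE variable name are hypothetical.
func deploymentForWebApp(name, namespace string, spec WebAppSpec) *Deployment {
	return &Deployment{
		Name:          name,
		Namespace:     namespace,
		Replicas:      spec.Count,
		Image:         spec.Image,
		ContainerPort: spec.Port,
		Labels:        map[string]string{"app": name, "webgroup": spec.Webgroup},
		Env:           map[string]string{"MESSAGE": spec.Message},
	}
}

func main() {
	spec := WebAppSpec{Count: 3, Image: "example/web:latest", Port: 8080, Webgroup: "demo", Message: "hello"}
	dep := deploymentForWebApp("example-webapp", "test-op", spec)
	fmt.Printf("%+v\n", *dep)
	fmt.Println(getPodNames([]Pod{{Name: "pod-a"}, {Name: "pod-b"}}))
}
```

In the real builder these values are nested inside the DeploymentSpec, PodTemplateSpec, and PodSpec structures rather than sitting on one flat struct.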
Testing
- Create a namespace and deploy the service account, role and role binding
- Make sure to deploy the CRD prior to deploying the operator.
$ kubectl create -f deploy/service_account.yaml -n test-op
serviceaccount/webapp-operator created
$ kubectl create -f deploy/role.yaml -n test-op
role.rbac.authorization.k8s.io/webapp-operator created
$ kubectl create -f deploy/role_binding.yaml -n test-op
rolebinding.rbac.authorization.k8s.io/webapp-operator created
$ kubectl create -f deploy/crds/blog.arctiq.com_webapps_crd.yaml -n test-op
customresourcedefinition.apiextensions.k8s.io/webapps.blog.arctiq.com created
Run the operator locally to debug and test
$ operator-sdk run --local --namespace=test-op
INFO[0000] Running the operator locally in namespace test-op.
{"level":"info","ts":1587343428.3924742,"logger":"cmd","msg":"Operator Version: 0.0.1"}
{"level":"info","ts":1587343428.392537,"logger":"cmd","msg":"Go Version: go1.14.1"}
{"level":"info","ts":1587343428.392544,"logger":"cmd","msg":"Go OS/Arch: darwin/amd64"}
{"level":"info","ts":1587343428.392548,"logger":"cmd","msg":"Version of operator-sdk: v0.16.0"}
{"level":"info","ts":1587343428.39569,"logger":"leader","msg":"Trying to become the leader."}
{"level":"info","ts":1587343428.3957288,"logger":"leader","msg":"Skipping leader election; not running in a cluster."}
{"level":"info","ts":1587343429.457581,"logger":"controller-runtime.metrics","msg":"metrics server is starting to listen","addr":"0.0.0.0:8383"}
{"level":"info","ts":1587343429.457721,"logger":"cmd","msg":"Registering Components."}
{"level":"info","ts":1587343429.457837,"logger":"cmd","msg":"Skipping CR metrics server creation; not running in a cluster."}
{"level":"info","ts":1587343429.4578428,"logger":"cmd","msg":"Starting the Cmd."}
{"level":"info","ts":1587343429.45803,"logger":"controller-runtime.manager","msg":"starting metrics server","path":"/metrics"}
{"level":"info","ts":1587343429.458099,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"webapp-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1587343429.560235,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"webapp-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1587343429.6614149,"logger":"controller-runtime.controller","msg":"Starting Controller","controller":"webapp-controller"}
{"level":"info","ts":1587343429.661464,"logger":"controller-runtime.controller","msg":"Starting workers","controller":"webapp-controller","worker count":1}
Deploy the CR
apiVersion: blog.arctiq.com/v1alpha1
kind: WebApp
metadata:
  name: example-webapp
spec:
  # Add fields here
  count: 3
  webgroup: "Demo-WebApp"
  image: r00tsh3ll/op-websrv
  port: 8080
  message: "Hello From WebApp 1"
$ kubectl create -f deploy/crds/blog.arctiq.com_v1alpha1_webapp_cr.yaml -n test-op
webapp.blog.arctiq.com/example-webapp created
Operator Reconciles Changes
{"level":"info","ts":1587343572.649143,"logger":"controller_webapp","msg":"Reconciling WebApp","Request.Namespace":"test-op","Request.Name":"example-webapp"}
{"level":"info","ts":1587343572.753247,"logger":"controller_webapp","msg":"Creating a new Deployment %s/%s\n","Request.Namespace":"test-op","Request.Name":"example-webapp","test-op":"example-webapp"}
{"level":"info","ts":1587343572.7898061,"logger":"controller_webapp","msg":"Reconciling WebApp","Request.Namespace":"test-op","Request.Name":"example-webapp"}
If we take a look in our namespace we can see three Pods as per our request have been deployed and a related deployment config has also been created.
$ kubectl get pods -n test-op
NAME READY STATUS RESTARTS AGE
example-webapp-768748bd7-cs52v 1/1 Running 0 7s
example-webapp-768748bd7-kdw9j 1/1 Running 0 7s
example-webapp-768748bd7-qrd7x 1/1 Running 0 7s
$ kubectl get deployments -n test-op
NAME READY UP-TO-DATE AVAILABLE AGE
example-webapp 3/3 3 3 10s
Let’s describe our WebApp CR
We can see that the Status section has been updated with the pod list from our deployment and the message field has been updated as well. This is all being driven from our operator using the functions described above.
$ kubectl describe WebApp example-webapp -n test-op
Name: example-webapp
Namespace: test-op
...
Spec:
Count: 3
Image: r00tsh3ll/op-websrv
Message: Hello From WebApp 1
Port: 8080
Webgroup: Demo-WebApp
Status:
Message: Hello From WebApp 1
Nodes:
example-webapp-768748bd7-kdw9j
example-webapp-768748bd7-cs52v
example-webapp-768748bd7-qrd7x
Events: <none>
Let’s edit our CR and increase the count value to 5
...
spec:
  count: 5
  image: r00tsh3ll/op-websrv
  message: Hello From WebApp 1
  port: 8080
  webgroup: Demo-WebApp
...
$ kubectl get pods -n test-op
NAME READY STATUS RESTARTS AGE
example-webapp-768748bd7-cs52v 1/1 Running 0 3m11s
example-webapp-768748bd7-kdw9j 1/1 Running 0 3m11s
example-webapp-768748bd7-qrd7x 1/1 Running 0 3m11s
example-webapp-768748bd7-x6fkf 1/1 Running 0 4s <<<
example-webapp-768748bd7-zcmr2 1/1 Running 0 4s <<<
Our operator reconciles the change and updates the replica count which spins up two new pods.
Let’s validate that our message field is properly picked up by our web app with a cURL test. Expose the deployment and service as per your environment.
curl -kv http://192.168.1.60:8080
...
< HTTP/1.1 200 OK
< Date: Wed, 22 Apr 2020 14:50:47 GMT
< Content-Length: 28
< Content-Type: text/html; charset=utf-8
<
* Connection #0 to host 192.168.123.60 left intact
Hello From WebApp 1
Above you can see that the message payload from our CR was passed through our operator to the deployment config, where our pods picked it up and displayed it through their web service.
Build and push the webapp-operator image to a registry
$ operator-sdk build docker.io/USERNAME/webapp-operator
# Push image
$ docker push USERNAME/webapp-operator
# Update the operator manifest to use the built image name deploy/operator.yaml
REPLACE_IMAGE with docker.io/USERNAME/webapp-operator
# Deploy the webapp-operator
$ kubectl create -f deploy/operator.yaml
Clean Up
$ kubectl delete -f deploy/crds/blog.arctiq.com_v1alpha1_webapp_cr.yaml
$ kubectl delete -f deploy/operator.yaml
$ kubectl delete -f deploy/role.yaml
$ kubectl delete -f deploy/role_binding.yaml
$ kubectl delete -f deploy/service_account.yaml
$ kubectl delete -f deploy/crds/blog.arctiq.com_webapps_crd.yaml
$ kubectl delete namespace test-op
TL;DR
- operator-sdk new webapp-operator
- operator-sdk add api --api-version=blog.arctiq.com/v1alpha1 --kind=WebApp
- modify the spec and status of the CRD
- operator-sdk generate k8s
- operator-sdk generate crds
- operator-sdk add controller --api-version=blog.arctiq.com/v1alpha1 --kind=WebApp
- write controller logic
- kubectl create -f deploy/service_account.yaml -n test-op
- kubectl create -f deploy/role.yaml -n test-op
- kubectl create -f deploy/role_binding.yaml -n test-op
- kubectl create -f deploy/crds/blog.arctiq.com_webapps_crd.yaml -n test-op
- local testing: operator-sdk run --local --namespace=test-op
- package and push
- deploy stand alone operator
Code used in this article is available here. Interested in learning more about operators and Kubernetes? We would love to hear from you.