Author: Aly Khimji


In this article, we will build a custom operator from start to finish that will deploy a simple web app. The web app on its own is not capable of much and depends on the operator to provide the necessary bits of information to deploy and run successfully. We will provide all the required information necessary to run our web app in a custom resource (CR), which our operator will then use to deploy our application. We can deploy multiple web apps to demonstrate this workflow and how the operator will maintain different sets of deployments and related variables. Click here to learn more about operators and how they help extend Kubernetes.


Our focus will be on building the operator and diving into its logic. I will not cover installing the Operator-SDK, as you can find that here. I will break down each function within the business logic, along with the related helper functions, to give you a deeper understanding of how it works. There is a fair bit of code to cover; the purpose is to get you up and running with the concepts so you can roll your own custom controllers specific to your business needs.

The Operator-SDK is a framework that uses the controller-runtime library to make writing operators easier by providing:

  • High level APIs and abstractions to write the operational logic more intuitively
  • Tools for scaffolding and code generation to bootstrap a new project quickly
  • Extensions to cover common operator use cases



Let’s get to it!

Initialize your environment

1) Make sure you’ve got your Kubernetes cluster running

2) Get into your Go workspace (cd $GOPATH/src/github.com/username/)

3) Initialize a new webapp-operator project using the operator-sdk.

$ operator-sdk new webapp-operator
INFO[0000] Creating new Go operator 'webapp-operator'.  
INFO[0000] Created go.mod                               
INFO[0000] Created tools.go                             
INFO[0000] Created cmd/manager/main.go                  
INFO[0000] Created build/Dockerfile                     
INFO[0000] Created build/bin/entrypoint                 
INFO[0000] Created build/bin/user_setup                 
INFO[0000] Created deploy/service_account.yaml          
INFO[0000] Created deploy/role.yaml                     
INFO[0000] Created deploy/role_binding.yaml             
INFO[0000] Created deploy/operator.yaml                 
INFO[0000] Created pkg/apis/apis.go                     
INFO[0000] Created pkg/controller/controller.go         
INFO[0000] Created version/version.go                   
INFO[0000] Created .gitignore                           
INFO[0000] Validating project                           
INFO[0006] Project validation successful.               
INFO[0006] Project creation complete.  

webapp-operator/

By running the operator-sdk new command, we scaffold out a framework for our project.

Folder          Purpose
pkg/apis        Contains the APIs for our CR
pkg/controller  Contains the controller implementations for our operator
build           Contains the Dockerfile and build scripts used to build the operator
deploy          Contains various YAML manifests for registering CRDs, setting up RBAC, and deploying the operator as a Deployment
cmd             Contains manager/main.go, the main program of the operator. It creates a new manager that registers all custom resource definitions and starts all controllers

.
├── build
│   ├── Dockerfile
│   └── bin
│       ├── entrypoint
│       └── user_setup
├── cmd
│   └── manager
│       └── main.go
├── deploy
│   ├── operator.yaml
│   ├── role.yaml
│   ├── role_binding.yaml
│   └── service_account.yaml
├── go.mod
├── go.sum
├── pkg
│   ├── apis
│   │   └── apis.go
│   └── controller
│       └── controller.go
├── tools.go
└── version
    └── version.go

4) Create the Custom Resource and its API using the operator-sdk.

$ operator-sdk add api --api-version=blog.arctiq.com/v1alpha1 --kind=WebApp

INFO[0000] Generating api version blog.arctiq.com/v1alpha1 for kind WebApp. 
INFO[0000] Created pkg/apis/blog/group.go               
INFO[0001] Created pkg/apis/blog/v1alpha1/webapp_types.go 
INFO[0001] Created pkg/apis/addtoscheme_blog_v1alpha1.go 
INFO[0001] Created pkg/apis/blog/v1alpha1/register.go   
INFO[0001] Created pkg/apis/blog/v1alpha1/doc.go        
INFO[0001] Created deploy/crds/blog.arctiq.com_v1alpha1_webapp_cr.yaml 
INFO[0001] Running deepcopy code-generation for Custom Resource group versions: [blog:[v1alpha1], ] 
INFO[0009] Code-generation complete.                    
INFO[0009] Running CRD generator.                       
INFO[0010] CRD generation complete.                     
INFO[0010] API generation complete.                                 

What was created?

A new CustomResourceDefinition defining our WebApp object so Kubernetes will know about it.

deploy/crds: 
 ├── blog.arctiq.com_v1alpha1_webapp_cr.yaml
 └── blog.arctiq.com_webapps_crd.yaml

APIs - A general manifest for deploying apps of type WebApp

└── blog
    ├── group.go
    └── v1alpha1
        ├── doc.go
        ├── register.go
        ├── webapp_types.go
        └── zz_generated.deepcopy.go

  • Modify webapp_types.go and add the following custom fields to the WebAppSpec and WebAppStatus structs
  • The fields in WebAppSpec map directly back to the CR
  • The fields in WebAppStatus are used to report the status of the CR

// WebAppSpec defines the desired state of WebApp
type WebAppSpec struct {
	Count    int32  `json:"count"`
	Image    string `json:"image"`
	Port     int32  `json:"port"`
	Webgroup string `json:"webgroup"`
	Message  string `json:"message"`
}

// WebAppStatus defines the observed state of WebApp
type WebAppStatus struct {
	Nodes   []string `json:"nodes"`
	Message string   `json:"message"`
}
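
To make the mapping between the CR and WebAppSpec concrete, here is a minimal sketch that decodes a spec written as JSON into the struct using only the standard library. The parseSpec helper and the sample values are assumptions for illustration; in the real operator the client library performs this decoding for us.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// WebAppSpec mirrors the struct from webapp_types.go above.
type WebAppSpec struct {
	Count    int32  `json:"count"`
	Image    string `json:"image"`
	Port     int32  `json:"port"`
	Webgroup string `json:"webgroup"`
	Message  string `json:"message"`
}

// parseSpec is a hypothetical helper (not part of the operator) that decodes
// the JSON form of a CR's spec section into a WebAppSpec.
func parseSpec(raw []byte) (WebAppSpec, error) {
	var spec WebAppSpec
	err := json.Unmarshal(raw, &spec)
	return spec, err
}

func main() {
	// Sample values chosen for illustration only.
	raw := []byte(`{"count": 3, "image": "example/image", "port": 8080, "webgroup": "demo", "message": "hello"}`)
	spec, err := parseSpec(raw)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Printf("%+v\n", spec)
}
```

The json tags are what tie each Go field to the lowercase key used in the CR, which is why a key such as count ends up in the Count field.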

5) Important: Run operator-sdk generate k8s to regenerate code after modifying this file

$ operator-sdk generate k8s
INFO[0000] Running deepcopy code-generation for Custom Resource group versions: [blog:[v1alpha1], ] 
INFO[0007] Code-generation complete.   

This command regenerates the DeepCopy methods.

└── blog
    └── v1alpha1
        └── zz_generated.deepcopy.go

6) Let’s add a controller

$ operator-sdk add controller --api-version=blog.arctiq.com/v1alpha1 --kind=WebApp
INFO[0000] Generating controller version blog.arctiq.com/v1alpha1 for kind WebApp. 
INFO[0000] Created pkg/controller/webapp/webapp_controller.go 
INFO[0000] Created pkg/controller/add_webapp.go         
INFO[0000] Controller generation complete.

.
├── build
...
│   └── controller
│       ├── add_webapp.go
│       ├── controller.go
│       └── webapp
│           └── webapp_controller.go

What was created?

  • pkg/controller/webapp/webapp_controller.go
  • pkg/controller/add_webapp.go

The webapp_controller.go file is where our main controller logic lives, so let’s dive into it. The Reconcile function is responsible for bringing the resources in the cluster in line with the specification in the CR, according to the business logic we implement. It works like a loop: it keeps attempting to reconcile until the actual state matches the desired state. The functions within our business logic need to return a status to our controller to indicate whether an operation was successful. The controller will re-queue the request to be processed again if the returned error is non-nil or Result.Requeue is true; otherwise, upon completion, it will remove the work from the queue.

Purpose                                        Return Method
Return to parent with an error                 return reconcile.Result{}, err
Return to parent without an error and requeue  return reconcile.Result{Requeue: true}, nil
Return to parent and stop the Reconcile        return reconcile.Result{}, nil

For more information on the underlying Go package reconcile see here


This is the main reconcile function within the controller.

codeblock1

Line  Purpose
5     Fetch the WebApp instance
10    If the requested object is not found, return and don’t requeue
12    Error reading the object: requeue the request
17    End of loop, all is well, return with no error

All the business logic below will live after line 15 and before the final return.
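
For readers who cannot see the code image, the shape of that fetch step can be approximated with plain Go. Everything here is a simplified assumption: WebApp and Result are cut-down stand-ins, getter replaces the real client call, and errNotFound and errTimeout are hypothetical errors. It is a sketch of the control flow, not the generated controller code.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound and errTimeout are hypothetical errors standing in for what
// the API client could return.
var (
	errNotFound = errors.New("webapp not found")
	errTimeout  = errors.New("timed out reading webapp")
)

// WebApp and Result are simplified stand-ins for the real types.
type WebApp struct{ Name string }
type Result struct{ Requeue bool }

// getter is a hypothetical replacement for the client call that fetches the CR.
type getter func(name string) (*WebApp, error)

// reconcileOnce mirrors the shape of the fetch step described above.
func reconcileOnce(get getter, name string) (Result, error) {
	instance, err := get(name)
	if err != nil {
		if errors.Is(err, errNotFound) {
			// The object was deleted after the request was queued:
			// return without an error so the request is not requeued.
			return Result{}, nil
		}
		// Any other read error is returned, which requeues the request.
		return Result{}, err
	}
	// Business logic for the fetched instance would run here.
	_ = instance
	return Result{}, nil
}

func main() {
	missing := func(string) (*WebApp, error) { return nil, errNotFound }
	failing := func(string) (*WebApp, error) { return nil, errTimeout }

	_, err := reconcileOnce(missing, "example-webapp")
	fmt.Println("not found returns error:", err != nil)
	_, err = reconcileOnce(failing, "example-webapp")
	fmt.Println("read failure returns error:", err != nil)
}
```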

The first deployment function is our primary workhorse. Since our operator’s purpose is to maintain one or more Deployments with particular attributes, this function does exactly that: it checks whether the Deployment already exists and, if not, creates a new one. Let’s step through the function.

Get/Create Deployment Object

codeblock2

Line  Purpose
1     Fetch the Deployment instance (appsv1.Deployment spec)
4     Call our Deployment helper function
6     Client library call to the API to create the deployment (dep)
7-9   If creation failed, advise the parent
11    Deployment created: requeue
14    Failed to get the deployment, advise the parent
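
The get-or-create flow in this table can be sketched with an in-memory map in place of the cluster. The cluster type, its get and create methods, and ensureDeployment are all hypothetical names for illustration; the real function talks to the API through the client library and builds an appsv1.Deployment.

```go
package main

import (
	"errors"
	"fmt"
)

var errNotFound = errors.New("deployment not found")

// Deployment and Result are simplified stand-ins for the real types.
type Deployment struct {
	Name     string
	Replicas int32
}
type Result struct{ Requeue bool }

// cluster is a hypothetical in-memory stand-in for the API client.
type cluster struct {
	deployments map[string]*Deployment
}

func (c *cluster) get(name string) (*Deployment, error) {
	if d, ok := c.deployments[name]; ok {
		return d, nil
	}
	return nil, errNotFound
}

func (c *cluster) create(d *Deployment) error {
	if _, exists := c.deployments[d.Name]; exists {
		return errors.New("deployment already exists")
	}
	c.deployments[d.Name] = d
	return nil
}

// ensureDeployment creates the Deployment when it is missing and asks for a
// requeue so the next pass can work with the newly created object.
func ensureDeployment(c *cluster, name string, replicas int32) (Result, error) {
	_, err := c.get(name)
	if errors.Is(err, errNotFound) {
		dep := &Deployment{Name: name, Replicas: replicas}
		if err := c.create(dep); err != nil {
			return Result{}, err // creation failed, advise the parent
		}
		return Result{Requeue: true}, nil // created, requeue
	}
	if err != nil {
		return Result{}, err // failed to get the deployment
	}
	return Result{}, nil // already exists, carry on
}

func main() {
	c := &cluster{deployments: map[string]*Deployment{}}
	first, _ := ensureDeployment(c, "example-webapp", 3)
	second, _ := ensureDeployment(c, "example-webapp", 3)
	fmt.Println("first pass requeues:", first.Requeue)
	fmt.Println("second pass requeues:", second.Requeue)
}
```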

Ensure the count value from the CR is the same as the replica count in the Deployment

codeblock3

Line  Purpose
1     Get the Count value from the CR
2-4   If the CR value and the Deployment value do not match, update the deployment via the API client
7     Return to parent that the Deployment could not be updated
9     Return to parent to requeue
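
The comparison at the heart of this step is small enough to show on its own. In this sketch, Deployment is a cut-down stand-in that keeps the pointer-typed replica count used by appsv1.DeploymentSpec, and syncReplicas is a hypothetical helper; the API update call and its error handling are left out.

```go
package main

import "fmt"

// Deployment is a simplified stand-in; like appsv1.DeploymentSpec, it keeps
// the replica count behind a pointer.
type Deployment struct {
	Replicas *int32
}

// syncReplicas is a hypothetical helper: it sets the replica count to want
// when it differs and reports whether an update (and requeue) is needed.
func syncReplicas(dep *Deployment, want int32) bool {
	if dep.Replicas != nil && *dep.Replicas == want {
		return false
	}
	dep.Replicas = &want
	return true
}

func main() {
	three := int32(3)
	dep := &Deployment{Replicas: &three}

	changed := syncReplicas(dep, 5)
	fmt.Println("changed:", changed, "replicas:", *dep.Replicas)

	changed = syncReplicas(dep, 5)
	fmt.Println("changed:", changed, "replicas:", *dep.Replicas)
}
```

In the controller, a true result would be followed by an update through the API client and a requeue, as the table describes.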

List the pods for this instance’s deployment and update status.Nodes value. (This is reflected when we describe the WebApp object)

codeblock4

Line   Purpose
1-5    Prepare PodList and ListOptions for pods in this namespace
6      Fetch PodList
11     Call the helper function to list pods
13-21  Check whether the pod list in Status.Nodes matches the pods currently deployed. If not, update it to match their names.
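
The status comparison can be sketched as follows. Pod and WebAppStatus are simplified stand-ins, and syncNodes is a hypothetical name; fetching the PodList and writing the status back through the API are omitted.

```go
package main

import (
	"fmt"
	"reflect"
)

// Pod and WebAppStatus are simplified stand-ins for the real types.
type Pod struct{ Name string }

type WebAppStatus struct {
	Nodes   []string
	Message string
}

// getPodNames collects the names of the given pods into a slice of strings.
func getPodNames(pods []Pod) []string {
	var names []string
	for _, p := range pods {
		names = append(names, p.Name)
	}
	return names
}

// syncNodes is a hypothetical helper: it replaces Status.Nodes with the
// current pod names when they differ and reports whether it changed anything.
func syncNodes(status *WebAppStatus, pods []Pod) bool {
	names := getPodNames(pods)
	if reflect.DeepEqual(names, status.Nodes) {
		return false
	}
	status.Nodes = names
	return true
}

func main() {
	status := &WebAppStatus{}
	pods := []Pod{{Name: "example-webapp-a"}, {Name: "example-webapp-b"}}

	changed := syncNodes(status, pods)
	fmt.Println("changed:", changed, "nodes:", status.Nodes)

	changed = syncNodes(status, pods)
	fmt.Println("changed:", changed)
}
```

Note that reflect.DeepEqual is order-sensitive, so in this sketch the same pods listed in a different order count as a change.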

Similar to the pods logic above, this updates Status.Message to match Spec.Message from the CR. (This is reflected when we describe the WebApp object.)

codeblock5


These are our helper functions called by the reconcile loop

codeblock6

Line   Purpose
1-7    Loops through a range of pods, appends their names to a slice of strings, and returns the list
13     Builds our deployment object
14-44  This is the main deployment generator of the controller. Let’s take a look at this in detail.
  • All m.Spec attributes (m.Spec.Message, m.Spec.Port, m.Spec.Image, m.Spec.Count) are pulled from our CR.
  • The lines starting at 22 (appsv1.DeploymentSpec), 27 (corev1.PodTemplateSpec), and 31 (corev1.PodSpec) are all key components used to build a Deployment.
  • Once these are populated, the deployment configuration is returned to the parent and the reconcile loop attempts to apply it.

Testing

  • Create a namespace (for example, kubectl create namespace test-op) and deploy the service account, role and role binding
  • Make sure to deploy the CRD prior to deploying the operator.
$ kubectl create -f deploy/service_account.yaml -n test-op
serviceaccount/webapp-operator created

$ kubectl create -f deploy/role.yaml -n test-op
role.rbac.authorization.k8s.io/webapp-operator created

$ kubectl create -f deploy/role_binding.yaml -n test-op
rolebinding.rbac.authorization.k8s.io/webapp-operator created

$ kubectl create -f deploy/crds/blog.arctiq.com_webapps_crd.yaml -n test-op
customresourcedefinition.apiextensions.k8s.io/webapps.blog.arctiq.com created

Run the operator locally to debug and test

$ operator-sdk run --local --namespace=test-op
INFO[0000] Running the operator locally in namespace test-op. 
{"level":"info","ts":1587343428.3924742,"logger":"cmd","msg":"Operator Version: 0.0.1"}
{"level":"info","ts":1587343428.392537,"logger":"cmd","msg":"Go Version: go1.14.1"}
{"level":"info","ts":1587343428.392544,"logger":"cmd","msg":"Go OS/Arch: darwin/amd64"}
{"level":"info","ts":1587343428.392548,"logger":"cmd","msg":"Version of operator-sdk: v0.16.0"}
{"level":"info","ts":1587343428.39569,"logger":"leader","msg":"Trying to become the leader."}
{"level":"info","ts":1587343428.3957288,"logger":"leader","msg":"Skipping leader election; not running in a cluster."}
{"level":"info","ts":1587343429.457581,"logger":"controller-runtime.metrics","msg":"metrics server is starting to listen","addr":"0.0.0.0:8383"}
{"level":"info","ts":1587343429.457721,"logger":"cmd","msg":"Registering Components."}
{"level":"info","ts":1587343429.457837,"logger":"cmd","msg":"Skipping CR metrics server creation; not running in a cluster."}
{"level":"info","ts":1587343429.4578428,"logger":"cmd","msg":"Starting the Cmd."}
{"level":"info","ts":1587343429.45803,"logger":"controller-runtime.manager","msg":"starting metrics server","path":"/metrics"}
{"level":"info","ts":1587343429.458099,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"webapp-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1587343429.560235,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"webapp-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1587343429.6614149,"logger":"controller-runtime.controller","msg":"Starting Controller","controller":"webapp-controller"}
{"level":"info","ts":1587343429.661464,"logger":"controller-runtime.controller","msg":"Starting workers","controller":"webapp-controller","worker count":1}

Deploy the CR

apiVersion: blog.arctiq.com/v1alpha1
kind: WebApp
metadata:
  name: example-webapp
spec:
  # Add fields here
  count: 3
  webgroup: "Demo-WebApp"
  image: r00tsh3ll/op-websrv
  port: 8080
  message: "Hello From WebApp 1"

$ kubectl create -f deploy/crds/blog.arctiq.com_v1alpha1_webapp_cr.yaml -n test-op
webapp.blog.arctiq.com/example-webapp created

Operator Reconciles Changes

{"level":"info","ts":1587343572.649143,"logger":"controller_webapp","msg":"Reconciling WebApp","Request.Namespace":"test-op","Request.Name":"example-webapp"}
{"level":"info","ts":1587343572.753247,"logger":"controller_webapp","msg":"Creating a new Deployment %s/%s\n","Request.Namespace":"test-op","Request.Name":"example-webapp","test-op":"example-webapp"}
{"level":"info","ts":1587343572.7898061,"logger":"controller_webapp","msg":"Reconciling WebApp","Request.Namespace":"test-op","Request.Name":"example-webapp"}

If we look in our namespace, we can see that three Pods have been deployed as requested and that the related Deployment has also been created.

$ kubectl get pods -n test-op

NAME                             READY   STATUS    RESTARTS   AGE
example-webapp-768748bd7-cs52v   1/1     Running   0          7s
example-webapp-768748bd7-kdw9j   1/1     Running   0          7s
example-webapp-768748bd7-qrd7x   1/1     Running   0          7s

$ kubectl get deployments -n test-op

NAME             READY   UP-TO-DATE   AVAILABLE   AGE
example-webapp   3/3     3            3           10s

Let’s describe our WebApp CR

We can see that the Status section has been updated with the pod list from our deployment and the message field has been updated as well. This is all being driven from our operator using the functions described above.

$ kubectl describe WebApp example-webapp -n test-op
Name:         example-webapp
Namespace:    test-op
...
Spec:
  Count:     3
  Image:     r00tsh3ll/op-websrv
  Message:   Hello From WebApp 1
  Port:      8080
  Webgroup:  Demo-WebApp
Status:
  Message:  Hello From WebApp 1
  Nodes:
    example-webapp-768748bd7-kdw9j
    example-webapp-768748bd7-cs52v
    example-webapp-768748bd7-qrd7x
Events:  <none>

Let’s edit our CR and increase the count value to 5

...
spec:
  count: 5
  image: r00tsh3ll/op-websrv
  message: Hello From WebApp 1
  port: 8080
  webgroup: Demo-WebApp
...

$ kubectl get pods -n test-op 
NAME                             READY   STATUS    RESTARTS   AGE
example-webapp-768748bd7-cs52v   1/1     Running   0          3m11s
example-webapp-768748bd7-kdw9j   1/1     Running   0          3m11s
example-webapp-768748bd7-qrd7x   1/1     Running   0          3m11s
example-webapp-768748bd7-x6fkf   1/1     Running   0          4s  <<<
example-webapp-768748bd7-zcmr2   1/1     Running   0          4s  <<<

Our operator reconciles the change and updates the replica count which spins up two new pods.

Let’s validate that our message field is properly picked up by our web app with a cURL test. Expose the deployment and service as per your environment.

$ curl -kv http://192.168.1.60:8080
...

< HTTP/1.1 200 OK
< Date: Wed, 22 Apr 2020 14:50:47 GMT
< Content-Length: 28
< Content-Type: text/html; charset=utf-8
< 
* Connection #0 to host 192.168.123.60 left intact

Hello From WebApp 1

Above you can see that the message payload from our CR was passed through our operator to the Deployment, where our pods picked it up and displayed it through their web service.

Build and push the webapp-operator image to a registry

$ operator-sdk build docker.io/USERNAME/webapp-operator

# Push image
$ docker push USERNAME/webapp-operator

# Update the operator manifest (deploy/operator.yaml) to use the built image name:
# replace REPLACE_IMAGE with docker.io/USERNAME/webapp-operator

# Deploy the webapp-operator
$ kubectl create -f deploy/operator.yaml

Clean Up

$ kubectl delete -f deploy/crds/blog.arctiq.com_v1alpha1_webapp_cr.yaml 
$ kubectl delete -f deploy/operator.yaml
$ kubectl delete -f deploy/role.yaml
$ kubectl delete -f deploy/role_binding.yaml
$ kubectl delete -f deploy/service_account.yaml
$ kubectl delete -f deploy/crds/blog.arctiq.com_webapps_crd.yaml
$ kubectl delete namespace test-op

TL;DR

  • operator-sdk new webapp-operator
  • operator-sdk add api --api-version=blog.arctiq.com/v1alpha1 --kind=WebApp
  • modify the spec and status types in webapp_types.go
  • operator-sdk generate k8s
  • operator-sdk generate crds
  • operator-sdk add controller --api-version=blog.arctiq.com/v1alpha1 --kind=WebApp
  • write controller logic
  • kubectl create -f deploy/service_account.yaml -n test-op
  • kubectl create -f deploy/role.yaml -n test-op
  • kubectl create -f deploy/role_binding.yaml -n test-op
  • kubectl create -f deploy/crds/blog.arctiq.com_webapps_crd.yaml -n test-op
  • local testing: operator-sdk run --local --namespace=test-op
  • package and push
  • deploy the standalone operator

Code used in this article is available here. Interested in learning more about operators and Kubernetes? We would love to hear from you.
