Let’s create your first Kubernetes Operator with operator-sdk

Masato Naka
Nov 12, 2021

Introduction

Because it is quite hard for beginners to start studying Kubernetes Operators, I’d like to write down and share my experience of creating my first one. In this post, I will follow the operator-sdk tutorial.

This post breaks the tutorial into smaller, more detailed steps so that beginners can get started more easily.

Overview

Following Go Operator Tutorial (operator-sdk), we’ll implement memcached-operator, whose detailed implementation is in memcached_controller.go

memcached-operator does the following:

Custom Resource Memcached

  1. spec.size determines the number of pods for Memcached
  2. status.nodes stores the names of the pods for Memcached

Controller with the following reconciliation loop

  1. Fetch the Memcached instance.
  2. Create a new Deployment if one does not exist.
  3. Keep Memcached.Spec.Size and Deployment.Replicas the same.
  4. Update Memcached.Status.Nodes with the names of the corresponding Pods.

Code is available for each stage with a separate commit: https://github.com/nakamasato/memcached-operator/commits/master

Prerequisite

  1. Git
  2. Go version 1.15
  3. Docker version 17.03+
  4. kubectl (require access to Kubernetes cluster with compatible version)
  5. Docker registry (optional). If you only develop locally and do not need to run the operator inside a Kubernetes cluster, a registry is not necessary.

Steps

1. Create a Project with operator-sdk

Create a directory

mkdir -p $HOME/projects/memcached-operator
cd $HOME/projects/memcached-operator

Initialize a Kubernetes operator

operator-sdk init --domain example.com --repo github.com/example/memcached-operator

2. Create API with operator-sdk

Create Kubernetes API resource and controller

operator-sdk create api --group cache --version v1alpha1 --kind Memcached --resource --controller

3. Define API (api/v1alpha1/memcached_types.go)

Start editing api/v1alpha1/memcached_types.go, the automatically generated Go file for the Memcached resource:

  • MemcachedSpec: Add a field size to determine how many pods are deployed for Memcached
  • MemcachedStatus: Store the name of each Memcached Pod in Nodes of Memcached status.
  • Memcached: struct defined with the spec and status above, both of which are also structs.

I attached each of the definitions:

MemcachedSpec:

MemcachedStatus:

Memcached:

After modifying the go file that defines the API resource, we need to update the automatically generated files with the following command:

make generate

This command will update api/v1alpha1/zz_generated.deepcopy.go

After defining API we also need to update the manifest files for CRD:

make manifests

This command will update config/crd/bases/cache.example.com_memcacheds.yaml.

We’ve completed defining our Memcached API resource.

4. Implement controller

Now that we have finalized the custom resource Memcached in the previous section, we need to watch and control it. We’ll implement this in the controller, which is responsible for watching the target resources and keeping their desired state and actual state the same.

However, implementing the whole controller at once can seem very complicated, especially for beginners. Thus, I will explain it step by step, and you can run the controller at any of those steps to see exactly what each step implements.

4.1. Implement controller (Fetch the Memcached instance)

First, we add the logic to fetch the Memcached instance in the Reconcile function. To make the logs easy to check, I prefixed every log message in this step with 1. Fetch the Memcached instance.

The most important point here is the return value. The Reconcile function returns (ctrl.Result, error), which tells the caller whether the current request was processed properly. If we return a non-nil error, the request is re-queued and the Reconcile function is called with it again later. Otherwise, the request is considered complete and is not re-queued.

The basic rule for now is:

  1. Error occurs → Return ctrl.Result{}, err
  2. Completed process without any problem → Return ctrl.Result{}, nil

For more details, you can see reconcile loop section in the tutorial or reconcile package.

At this point, you can run the controller with your kubectl configured for an accessible Kubernetes cluster. Please make sure that kubectl points to a cluster where you can freely deploy and delete resources. If you’re not sure, I would recommend using a local Kubernetes cluster, e.g. Kubernetes on Docker Desktop or kind.

You can run the controller with CRD with the following command:

make install run

What the command does is:

  1. Install the CRD for Memcached
  2. Run the controller on your machine with go run ./main.go

You can run your controller with just make run once you have installed the CRD. Whenever you change the CRD, you need to run make install again to update the CRD installed in the Kubernetes cluster.

4.2. Implement controller (Create Deployment if not exist)

The next step is to check if a Deployment exists for the Memcached object and create one if it does not exist.

  1. Import necessary libraries. (Mainly to use the definition of Deployment)
  2. Add necessary RBAC to the controller by markers. (The controller needs access to Deployment and Pods)
  3. Call the deploymentForMemcached function if the Deployment does not exist (IsNotFound) for the corresponding NamespacedName.
  4. Define deploymentForMemcached function to create Deployment with memcached:1.4.36-alpine image.
  5. Define labelsForMemcached function to return the labels that are added to the newly created Deployment.

As you can see in the code, we create a separate function deploymentForMemcached to create a Deployment object and attach labels app: memcached, memcached_cr: <memcached name> with labelsForMemcached function. These labels will be used to get Pods to update the status in the later section.
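The label helper is small enough to sketch in full. The labelsForMemcached body below follows the label keys named above; matches is a hypothetical helper added only to illustrate how the same labels can select the Pods later, and the pod-template-hash label in main is just an example of an extra label a Pod may carry.

```go
package main

import "fmt"

// labelsForMemcached returns the labels attached to the Deployment (and its Pods)
// that belong to the Memcached custom resource with the given name.
func labelsForMemcached(name string) map[string]string {
	return map[string]string{"app": "memcached", "memcached_cr": name}
}

// matches reports whether podLabels contains every key/value pair of selector.
// Hypothetical helper, not part of the controller.
func matches(selector, podLabels map[string]string) bool {
	for k, v := range selector {
		got, ok := podLabels[k]
		if !ok || got != v {
			return false
		}
	}
	return true
}

func main() {
	selector := labelsForMemcached("memcached-sample")
	podLabels := map[string]string{
		"app":               "memcached",
		"memcached_cr":      "memcached-sample",
		"pod-template-hash": "6c765df685",
	}
	fmt.Println(matches(selector, podLabels))
}
```

Because the custom resource name is part of the labels, Pods of different Memcached objects in the same namespace stay distinguishable.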

One more important part is Owns(&appsv1.Deployment{}), which we add to the controller builder in SetupWithManager so that the reconciliation loop is triggered when the Deployment is changed.

Let’s check the controller at this point!

  1. make install: Install CRD if you haven’t installed the CRD.
  2. make run: Run the controller locally.
  3. Modify config/samples/cache_v1alpha1_memcached.yaml by replacing foo: bar with size: 3 under spec.
  4. Open another terminal and run kubectl apply -f config/samples/cache_v1alpha1_memcached.yaml: Create the custom resource ( Memcached object) with name memcached-sample
  5. Check the log in the terminal where you’re running make run. You’ll see some logs from the controller.
  6. kubectl get deployment memcached-sample: Check if Deployment is created by the controller.
  7. kubectl delete -f config/samples/cache_v1alpha1_memcached.yaml
  8. kubectl get deployment memcached-sample: Check if Deployment is deleted. ← Deleted because of ctrl.SetControllerReference(m, dep, r.Scheme)

The controller reference set by SetControllerReference is used for garbage collection in Kubernetes.

Kubernetes checks for and deletes objects that no longer have owner references

In our case, when the Memcached object is deleted, the owner of the corresponding Deployment no longer exists, so the Deployment is deleted as well.

The diagram shows the overview of what we’ve just done.

4.3. Implement controller (Keep the Deployment replicas and Memcached size same)

Although the controller now creates a new Deployment if one does not exist, we still cannot update the replicas of the Deployment by changing size in the Memcached object. In this section, we’ll implement the logic to keep the size in the Memcached object and the replicas in the corresponding Deployment object the same.

The basic steps are as follows:

  1. Get the size in Memcached instance by memcached.Spec.Size.
  2. Compare the replicas of the Deployment and the obtained size.
  3. If the size and the replicas are different, set the Deployment.Spec.Replicas field to the size.
  4. Update the Deployment object.

Now you can run the new controller with make run, and you’ll see the replicas of the Deployment change when you update the size with kubectl patch memcached memcached-sample -p '{"spec": {"size": 5}}' --type merge.
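The comparison in this step can be sketched as follows. The deployment type is a simplified stand-in for appsv1.Deployment (the real field is Spec.Replicas, which is also a pointer to int32), and syncReplicas is a hypothetical helper name; in the controller, the Deployment would be updated through the client whenever it reports a change.

```go
package main

import "fmt"

// deployment is a stand-in for appsv1.Deployment; only the replica count is modeled.
type deployment struct {
	Replicas *int32
}

// syncReplicas sets dep.Replicas to size when they differ and reports
// whether the Deployment needs to be updated in the cluster.
func syncReplicas(dep *deployment, size int32) bool {
	if dep.Replicas != nil && *dep.Replicas == size {
		return false
	}
	dep.Replicas = &size
	return true
}

func main() {
	two := int32(2)
	dep := &deployment{Replicas: &two}

	// size was patched from 2 to 5 in the Memcached object.
	if syncReplicas(dep, 5) {
		// In the controller, this is where the Deployment would be updated.
		fmt.Println("update Deployment, replicas =", *dep.Replicas)
	}

	// A second pass finds nothing to change.
	fmt.Println("changed again:", syncReplicas(dep, 5))
}
```

Updating only when the values differ keeps the loop from writing to the cluster on every reconciliation.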

4.4. Implement controller (Update Memcached status with Pod name)

As the last part of the controller implementation, we’ll store the Pod names in Memcached status.

Here are the steps:

  1. Get a list of Pods that have labels that we added in the earlier section where we created Deployment with labelsForMemcached function.
  2. Get the name of each Pod with getPodNames function.
  3. Set the name list to memcached.Status.Nodes field.

You can check the status by the following steps:

  1. make run: Run the controller.
  2. kubectl apply -f config/samples/cache_v1alpha1_memcached.yaml: Apply Memcached object.
  3. kubectl get memcached memcached-sample -o yaml: Get the Memcached object in YAML format, which contains the status.

5. Recap the controller function

In this section, let’s recap the controller.

1. Delete Memcached object if it remains in your cluster.

kubectl delete -f config/samples/cache_v1alpha1_memcached.yaml

2. Install CRD if you haven’t installed it.

make install

3. Start running the controller.

make run

4. Create Memcached object.

kubectl apply -f config/samples/cache_v1alpha1_memcached.yaml

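5. You can see the following from the logs in the terminal where make run is running:

  • Fetch Memcached instance
  • Create Deployment
  • Update Memcached Status

If the operator is instead running inside the cluster (in the memcached-operator-system namespace), you can follow its logs with:

kubectl logs $(kubectl get po -n memcached-operator-system | grep memcached-operator-controller-manager | awk '{print $1}') -c manager -n memcached-operator-system -f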

Logs:

2021-04-13T01:01:47.494Z        INFO    controllers.Memcached   1. Fetch the Memcached instance. Memchached resource found      {"memcached": "default/memcached-sample", "memcached.Name": "memcached-sample", "memcached.Namespace": "default"}
2021-04-13T01:01:47.495Z        INFO    controllers.Memcached   2. Check if the deployment already exists, if not create a new one. Creating a new Deployment    {"memcached": "default/memcached-sample", "Deployment.Namespace": "default", "Deployment.Name": "memcached-sample"}
2021-04-13T01:02:24.109Z        INFO    controllers.Memcached   1. Fetch the Memcached instance. Memchached resource found      {"memcached": "default/memcached-sample", "memcached.Name": "memcached-sample", "memcached.Namespace": "default"}
2021-04-13T01:02:24.109Z        INFO    controllers.Memcached   4. Update the Memcached status with the pod names. Pod list     {"memcached": "default/memcached-sample", "podNames": ["memcached-sample-6c765df685-2mx8x", "memcached-sample-6c765df685-9t2fl"]}
2021-04-13T01:02:24.125Z        INFO    controllers.Memcached   4. Update the Memcached status with the pod names. Update memcached.Status       {"memcached": "default/memcached-sample", "memcached.Status.Nodes": ["memcached-sample-6c765df685-2mx8x", "memcached-sample-6c765df685-9t2fl"]}

6. Check the Memcached status

kubectl get memcached memcached-sample -o jsonpath='{.status}'

{"nodes":["memcached-sample-6c765df685-gtstq","memcached-sample-6c765df685-lxj8z"]}

7. Change Memcached size.

kubectl patch memcached memcached-sample -p '{"spec":{"size": 5}}' --type=merge

You can see the number of pods is changed:

kubectl get deployment memcached-sample

6. Clean up

1. Delete Memcached object.

kubectl delete -f config/samples/cache_v1alpha1_memcached.yaml

2. Stop the controller (ctrl-c in the terminal where you’re running make run).

3. Uninstall the CRD from the Kubernetes cluster.

make uninstall

Summary

How was your experience of creating your first Kubernetes operator? The controller may look overwhelming at the beginning, but if you break it down into smaller pieces, it’s not that hard to start, is it?

For me, operator-sdk’s tutorial opened the door to the Kubernetes operator world. Hopefully, this post will be helpful for newbies to start learning Kubernetes operators.


Masato Naka

An SRE engineer, mainly working on Kubernetes. CKA (Feb 2021). His interests include cloud-native application development and machine learning.