Author: Marek Anderson


With the OpenShift Container Platform (OCP) release 4.1 in June 2019, Red Hat introduced Infrastructure MachineSets. These sets allow you to host only infrastructure components, such as:

  • The default router
  • The container image registry
  • The cluster metrics collection, or monitoring service
  • Cluster aggregated logging

An Infrastructure MachineSet consists of Machine resources (kind: Machine). These Machine resources spin up new virtual machines in your cloud.

Specific Kubernetes labels can be applied to these machines so that one or more of the above-mentioned infrastructure components run only on those machines.

The kicker: The infrastructure nodes do not count towards the number of subscriptions that are required to run the environment!

Unleashing Worker Nodes

Worker nodes in the OCP cluster must be covered by subscriptions, and their primary purpose is to run your application workloads. By default, these worker nodes also run the OCP infrastructure components. Moving those components to dedicated infrastructure nodes frees worker resources for your applications.

So let’s get started.

Creating an Infrastructure MachineSet for Production

For a production-ready deployment, deploy at least three MachineSets to run infrastructure components. The aggregated logging solution, i.e., Elasticsearch, requires three instances that run on different nodes. Since each MachineSet is assigned to exactly one availability zone of the (public) cloud provider, three MachineSets are needed to spread those instances across three zones.

For demonstration purposes, we will limit the scope to only one MachineSet in the next section.
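To make the production layout concrete, here is a minimal sketch that prints one MachineSet name and zone per line. The helper name plan_infra_machinesets and the naming scheme (cluster ID, then "-infra-", then the zone letter) are my assumptions for illustration, not something OCP prescribes.

```shell
# plan_infra_machinesets CLUSTER_ID REGION ZONE_LETTER...
# Hypothetical helper: prints "<machineset-name> <zone>" for each zone letter.
# The "<cluster-id>-infra-<letter>" naming scheme is an assumption.
plan_infra_machinesets() {
  cluster_id=$1
  region=$2
  shift 2
  for letter in "$@"; do
    printf '%s-infra-%s %s-%s\n' "$cluster_id" "$letter" "$region" "$letter"
  done
}

# Example:
#   plan_infra_machinesets myclus-khb5h us-east1 b c d
```

Each printed line corresponds to one MachineSet YAML file you would prepare for that zone.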

Defining the MachineSet Custom Resource for the Google Cloud Platform

Once your OCP cluster is deployed to your Google Cloud Platform (GCP) project, you can create your first MachineSet to host infrastructure components. (Sidenote: OCP 4.3 supports the installer-provisioned infrastructure (IPI) installation method into pre-existing Virtual Private Clouds (VPCs) and subnets.) Use the GCP region in which you deployed your OCP4 cluster, then select a GCP zone within that region for the MachineSet.

Note: Double-check that the GCP zone actually exists. I tried to deploy to us-east1-a, which does not exist ;-) Unfortunately, no logs or events revealed this to me. Instead, a kind colleague showed me the light.
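A quick pre-flight check would have caught that mistake. The sketch below assumes the gcloud CLI is installed and authenticated. zone_exists is a hypothetical helper name, not part of gcloud. It reads zone names from standard input, one per line, and succeeds only on an exact match.

```shell
# zone_exists ZONE
# Hypothetical helper: reads zone names from stdin (one per line) and
# returns 0 only if ZONE matches a whole line exactly.
zone_exists() {
  grep -qxF -- "$1"
}

# Example (assumes an authenticated gcloud CLI):
#   gcloud compute zones list --format="value(name)" | zone_exists us-east1-b \
#     && echo "zone exists" || echo "zone not found"
```

The exact-line match matters here: a plain substring search would happily accept the region name us-east1 as if it were a zone.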

The YAML file machineset1.yaml below defines the MachineSet. Change the following values to match your environment:

  • Replace the string myclus-khb5h with your OCP cluster ID
  • Replace region with the region your OCP cluster is in
  • Replace zone with an (existing ;-) ) GCP zone
  • Replace projectID with your GCP project ID
  • Replace the email under serviceAccounts with your service account
  • The MachineSet name (metadata.name) must be unique in your OCP cluster

apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  labels:
    machine.openshift.io/cluster-api-cluster: myclus-khb5h
    machine.openshift.io/cluster-api-machine-role: infra
    machine.openshift.io/cluster-api-machine-type: infra
  name: myclus-khb5h-infra-a
  namespace: openshift-machine-api
spec:
  replicas: 1
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: myclus-khb5h
      machine.openshift.io/cluster-api-machineset: myclus-khb5h-infra-a
  template:
    metadata:
      creationTimestamp: null
      labels:
        machine.openshift.io/cluster-api-cluster: myclus-khb5h
        machine.openshift.io/cluster-api-machine-role: infra
        machine.openshift.io/cluster-api-machine-type: infra
        machine.openshift.io/cluster-api-machineset: myclus-khb5h-infra-a
    spec:
      metadata:
        labels:
          node-role.kubernetes.io/infra: ""
      providerSpec:
        value:
          apiVersion: gcpprovider.openshift.io/v1beta1
          canIPForward: false
          credentialsSecret:
            name: gcp-cloud-credentials
          deletionProtection: false
          disks:
          - autoDelete: true
            boot: true
            image: myclus-khb5h-rhcos-image
            labels: null
            sizeGb: 128
            type: pd-ssd
          kind: GCPMachineProviderSpec
          machineType: n1-standard-4
          metadata:
            creationTimestamp: null
          networkInterfaces:
          - network: myclus-khb5h-network
            subnetwork: myclus-khb5h-worker-subnet
          projectID: marek-ocp4-blog
          region: us-east1
          serviceAccounts:
          - email: <your-service-account-email>
            scopes:
            - https://www.googleapis.com/auth/cloud-platform
          tags:
          - myclus-khb5h-infra
          userDataSecret:
            name: worker-user-data
          zone: us-east1-b
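If you would rather not edit the file by hand, the substitutions listed above can be sketched with sed. render_machineset is a hypothetical helper of mine, and it assumes the file still contains the sample values from this post. It does not touch the service account email, which you still need to set yourself.

```shell
# render_machineset CLUSTER_ID REGION ZONE PROJECT_ID < in.yaml > out.yaml
# Hypothetical helper: rewrites the sample values from this post.
# Assumes the input still uses the sample cluster ID myclus-khb5h.
render_machineset() {
  cluster_id=$1
  region=$2
  zone=$3
  project_id=$4
  sed \
    -e "s/myclus-khb5h/${cluster_id}/g" \
    -e "s/^\( *zone: \).*/\1${zone}/" \
    -e "s/^\( *region: \).*/\1${region}/" \
    -e "s/^\( *projectID: \).*/\1${project_id}/"
}

# Example (the oc call prints the cluster's infrastructure ID):
#   cluster_id=$(oc get -o jsonpath='{.status.infrastructureName}' infrastructure cluster)
#   render_machineset "$cluster_id" us-east1 us-east1-b my-project \
#     < machineset1.yaml > machineset1-rendered.yaml
```

Review the rendered file before applying it; a text substitution like this knows nothing about YAML structure.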

Now that the YAML file is prepared, apply it to your cluster.

oc create -f machineset1.yaml
machineset.machine.openshift.io/myclus-khb5h-infra-a created

Verify that the MachineSet is being created:

oc get machinesets -n openshift-machine-api
NAME                   DESIRED   CURRENT   READY   AVAILABLE   AGE
myclus-khb5h-infra-a   1         1                             7s
myclus-khb5h-w-b       1         1         1       1           19h
myclus-khb5h-w-c       1         1         1       1           19h
myclus-khb5h-w-d       1         1         1       1           19h
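To spot MachineSets that are not ready yet without eyeballing the table, you can filter that output. This is a minimal sketch: unready_machinesets is a hypothetical helper, and it assumes the column layout shown above (NAME, DESIRED, CURRENT, READY, AVAILABLE, AGE), where READY stays blank until machines are ready.

```shell
# unready_machinesets
# Hypothetical helper: reads "oc get machinesets" table output from stdin
# and prints the names of MachineSets whose READY count is missing or
# differs from DESIRED. Assumes the six-column layout shown in this post.
unready_machinesets() {
  awk 'NR > 1 && (NF < 6 || $4 != $2) { print $1 }'
}

# Example:
#   oc get machinesets -n openshift-machine-api | unready_machinesets
```

Run against the listing above, this prints only myclus-khb5h-infra-a, since its READY and AVAILABLE columns are still empty.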

Further insight into the creation process can be gained with oc describe. Note that the output should include the Status: and Events: sections; if it does not, there is likely an error in the YAML file.

oc describe machine myclus-khb5h-infra-a -n openshift-machine-api
<output omitted>
Status:
  Addresses:
    Address:     10.0.64.2
    Type:        InternalIP
    Address:     myclus-khb5h-infra-a-69j7n.us-east1-b.c.marek-ocp4-blog.internal
    Type:        InternalDNS
    Address:     myclus-khb5h-infra-a-69j7n.c.marek-ocp4-blog.internal
    Type:        InternalDNS
  Last Updated:  2020-04-28T14:46:43Z
  Phase:         Provisioned
  Provider Status:
    Conditions:
      Last Probe Time:       2020-04-28T14:46:23Z
      Last Transition Time:  2020-04-28T14:46:23Z
      Message:               machine successfully created
      Reason:                MachineCreationSucceeded
      Status:                True
      Type:                  MachineCreated
    Instance Id:             myclus-khb5h-infra-a-69j7n
    Instance State:          RUNNING
    Metadata:
      Creation Timestamp:  <nil>
Events:
  Type     Reason        Age                From           Message
  ----     ------        ----               ----           -------
  Warning  FailedCreate  14m                gcpcontroller  requeue in: 20s
  Warning  FailedUpdate  14m (x4 over 14m)  gcpcontroller  requeue in: 20s
  Normal   Update        14m (x3 over 14m)  gcpcontroller  Updated Machine myclus-khb5h-infra-a-69j7n

Now, the GCP console shows the new instance.

[Figure: New infrastructure node]

Moving the Container Image Registry

To free resources from the worker node, let’s move the container image registry to the newly created infrastructure node.

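Since the image registry resource already exists, we will edit the existing image registry configuration (the config object named cluster) and add the infra nodeSelector to move the registry to our new infrastructure node. The fully qualified resource name avoids ambiguity with other resources that are also called config.

oc edit configs.imageregistry.operator.openshift.io/cluster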

# Add these two lines to the spec: section
  nodeSelector:
    node-role.kubernetes.io/infra: ""
# Save and exit the file
config.imageregistry.operator.openshift.io/cluster edited
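The same change can be made non-interactively with oc patch, which is handy for scripts. This is a sketch, not the only way to do it: registry_infra_patch is a hypothetical helper that just prints the merge patch matching the two lines above.

```shell
# registry_infra_patch
# Hypothetical helper: prints a JSON merge patch that sets the infra
# nodeSelector on the image registry configuration.
registry_infra_patch() {
  printf '%s\n' '{"spec":{"nodeSelector":{"node-role.kubernetes.io/infra":""}}}'
}

# Example:
#   oc patch configs.imageregistry.operator.openshift.io/cluster \
#     --type=merge -p "$(registry_infra_patch)"
```

A merge patch only touches the fields it names, so the rest of the registry spec stays as it is.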

Watch the resources being moved:

watch -n 1 'oc get pods -n openshift-image-registry -o wide'

# Following output is edited for brevity
NAME                                            READY   STATUS    AGE   IP            NODE
cluster-image-registry-operator-9754995-rg2n5   2/2     Running   21h   10.128.0.28   myclus-khb5h-m-2.c.marek-ocp4-blog.internal
image-registry-75b4bd664f-rvrn5                 0/1     Pending   28s   <none>        <none>
image-registry-dd874db66-29hzp                  1/1     Running   21h   10.128.2.4    myclus-khb5h-w-c-d94s4.c.marek-ocp4-blog.internal

After a few moments, the original image-registry pod will be removed.

Getting Started

As you can see, once Infrastructure MachineSets have been created, existing OCP infrastructure components can easily be moved to the dedicated infrastructure nodes.

Next, try to move the cluster monitoring service, cluster aggregated logging, or the default router to your new infrastructure nodes.
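As a starting point for the default router, here is a hedged sketch. It assumes the default IngressController in the openshift-ingress-operator namespace, which places router pods through spec.nodePlacement rather than a top-level nodeSelector. router_infra_patch is a hypothetical helper name.

```shell
# router_infra_patch
# Hypothetical helper: prints a JSON merge patch that places the default
# router on nodes carrying the infra role label.
router_infra_patch() {
  printf '%s\n' '{"spec":{"nodePlacement":{"nodeSelector":{"matchLabels":{"node-role.kubernetes.io/infra":""}}}}}'
}

# Example:
#   oc patch ingresscontroller/default -n openshift-ingress-operator \
#     --type=merge -p "$(router_infra_patch)"
```

The monitoring and logging stacks are configured through their own resources, so check their documentation before applying the same pattern.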

You are now ready to apply this technique in your new and existing OCP4 clusters.
