With the OpenShift Container Platform (OCP) release 4.1 in June 2019, Red Hat introduced Infrastructure MachineSets. These MachineSets create nodes that host only infrastructure components, such as:
- The default router
- The container image registry
- The cluster metrics collection, or monitoring service
- Cluster aggregated logging
An Infrastructure MachineSet consists of Machine resources (kind: Machine).
These Machine resources spin up new virtual machines in your cloud.
Specific Kubernetes labels can be applied to these machines so that one or more of the above-mentioned infrastructure components run only on those machines.
The kicker: The infrastructure nodes do not count towards the number of subscriptions that are required to run the environment!
Unleashing Worker Nodes
Worker nodes in the OCP cluster must be covered by subscriptions, and their primary purpose is to run your application workloads. By default, these worker nodes also run the OCP infrastructure components. To free their resources, it is beneficial to move the infrastructure components to dedicated infrastructure nodes.
So let’s get started.
Creating an Infrastructure MachineSet for Production
For a production-ready deployment, it is recommended to deploy at least three MachineSets to run infrastructure components. The aggregated logging solution, i.e., Elasticsearch, requires three instances that run on different nodes. Since each MachineSet is assigned to only one availability zone of the (public) cloud provider, three MachineSets are needed to spread these nodes across three zones.
For demonstration purposes, we will limit the scope to only one MachineSet in the next section.
Defining the MachineSet Custom Resource for the Google Cloud Platform
Once your OCP cluster is deployed to your Google Cloud Platform (GCP) project, you can create your first MachineSet to host infrastructure components. Sidenote: OCP 4.3 supports the installer-provisioned infrastructure (IPI) installation method into pre-existing Virtual Private Clouds (VPC) and subnets. Choose the GCP region in which you deployed your OCP4 cluster. Then, select a GCP zone within that region to deploy the MachineSet.
Note: Double-check that the GCP zone actually exists. I tried to deploy to us-east1-a, which does not exist ;-) Unfortunately, no logs or events revealed this to me. Instead, a kind colleague showed me the light.
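One way to catch this early is to ask GCP for its zone list before writing the manifest. The following is a minimal sketch, assuming the gcloud CLI is installed and authenticated for your project; the helper name zone_exists is made up for this example.

```shell
# Hypothetical helper: succeeds only if the given zone is listed for the
# currently configured GCP project. Assumes an authenticated gcloud CLI.
zone_exists() {
  # --format="value(name)" prints one bare zone name per line;
  # grep -x matches whole lines only, -F treats the zone as a fixed string.
  gcloud compute zones list --format="value(name)" | grep -qxF -- "$1"
}

# Usage:
#   zone_exists us-east1-a || echo "zone us-east1-a does not exist"
```

Running the check before creating the MachineSet gives the explicit error message that the cluster itself did not provide.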
Please find the YAML file machineset1.yaml defining the MachineSet below.
Change the following values according to your environment:
- Replace the string myclus-khb5h with your OCP cluster ID
- Replace region with the region your OCP cluster is in
- Replace zone with an (existing ;-) ) GCP zone
- Replace projectID with your GCP project ID
- Replace serviceAccounts with your service account
- The name must be unique in your OCP cluster
apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  labels:
    machine.openshift.io/cluster-api-cluster: myclus-khb5h
    machine.openshift.io/cluster-api-machine-role: infra
    machine.openshift.io/cluster-api-machine-type: infra
  name: myclus-khb5h-infra-a
  namespace: openshift-machine-api
spec:
  replicas: 1
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: myclus-khb5h
      machine.openshift.io/cluster-api-machineset: myclus-khb5h-infra-a
  template:
    metadata:
      creationTimestamp: null
      labels:
        machine.openshift.io/cluster-api-cluster: myclus-khb5h
        machine.openshift.io/cluster-api-machine-role: infra
        machine.openshift.io/cluster-api-machine-type: infra
        machine.openshift.io/cluster-api-machineset: myclus-khb5h-infra-a
    spec:
      metadata:
        labels:
          node-role.kubernetes.io/infra: ""
      providerSpec:
        value:
          apiVersion: gcpprovider.openshift.io/v1beta1
          canIPForward: false
          credentialsSecret:
            name: gcp-cloud-credentials
          deletionProtection: false
          disks:
          - autoDelete: true
            boot: true
            image: myclus-khb5h-rhcos-image
            labels: null
            sizeGb: 128
            type: pd-ssd
          kind: GCPMachineProviderSpec
          machineType: n1-standard-4
          metadata:
            creationTimestamp: null
          networkInterfaces:
          - network: myclus-khb5h-network
            subnetwork: myclus-khb5h-worker-subnet
          projectID: marek-ocp4-blog
          region: us-east1
          serviceAccounts:
          - email: [email protected]
            scopes:
            - https://www.googleapis.com/auth/cloud-platform
          tags:
          - myclus-khb5h-infra
          userDataSecret:
            name: worker-user-data
          zone: us-east1-b
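Most of the values to replace can be read from the running cluster. The following is a minimal sketch, assuming you are logged in with oc as a cluster administrator; the function names are hypothetical and only wrap the commands so they can be reused.

```shell
# Print the infrastructure name of the cluster (the cluster ID,
# e.g. myclus-khb5h) as recorded in the cluster's Infrastructure resource.
cluster_id() {
  oc get infrastructure cluster -o jsonpath='{.status.infrastructureName}'
}

# Dump an existing worker MachineSet as YAML. Its providerSpec shows the
# image, network, subnetwork, projectID, region, and serviceAccounts values
# that the installer chose for this cluster.
dump_machineset() {
  oc get machineset "$1" -n openshift-machine-api -o yaml
}

# Usage:
#   cluster_id
#   dump_machineset myclus-khb5h-w-b
```

Copying the provider values from an existing worker MachineSet is usually less error-prone than typing them by hand.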
Now that the YAML file is prepared, apply it to your cluster.
oc create -f machineset1.yaml
machineset.machine.openshift.io/myclus-khb5h-infra-a created
You can check that the new MachineSet is being created.
oc get machinesets -n openshift-machine-api
NAME                   DESIRED   CURRENT   READY   AVAILABLE   AGE
myclus-khb5h-infra-a   1         1                             7s
myclus-khb5h-w-b       1         1         1       1           19h
myclus-khb5h-w-c       1         1         1       1           19h
myclus-khb5h-w-d       1         1         1       1           19h
Further insight into the creation process can be gained with oc describe.
Note that the output should include the Status: and Events: sections; if it does not, there is likely an error in the YAML file.
oc describe machine myclus-khb5h-infra-a -n openshift-machine-api
<output omitted>
Status:
  Addresses:
    Address:     10.0.64.2
    Type:        InternalIP
    Address:     myclus-khb5h-infra-a-69j7n.us-east1-b.c.marek-ocp4-blog.internal
    Type:        InternalDNS
    Address:     myclus-khb5h-infra-a-69j7n.c.marek-ocp4-blog.internal
    Type:        InternalDNS
  Last Updated:  2020-04-28T14:46:43Z
  Phase:         Provisioned
  Provider Status:
    Conditions:
      Last Probe Time:       2020-04-28T14:46:23Z
      Last Transition Time:  2020-04-28T14:46:23Z
      Message:               machine successfully created
      Reason:                MachineCreationSucceeded
      Status:                True
      Type:                  MachineCreated
    Instance Id:             myclus-khb5h-infra-a-69j7n
    Instance State:          RUNNING
    Metadata:
      Creation Timestamp:  <nil>
Events:
  Type     Reason        Age                From           Message
  ----     ------        ----               ----           -------
  Warning  FailedCreate  14m                gcpcontroller  requeue in: 20s
  Warning  FailedUpdate  14m (x4 over 14m)  gcpcontroller  requeue in: 20s
  Normal   Update        14m (x3 over 14m)  gcpcontroller  Updated Machine myclus-khb5h-infra-a-69j7n
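Once the machine has booted and joined the cluster, it should show up as a node carrying the infra role label set in the MachineSet. A small sketch for checking this, assuming the label node-role.kubernetes.io/infra from the manifest; the function name is hypothetical.

```shell
# List only the nodes that carry the infra role label.
# An empty result means the new machine has not registered as a node yet.
infra_nodes() {
  oc get nodes -l node-role.kubernetes.io/infra
}

# Usage:
#   infra_nodes
```

It can take a few minutes after the Provisioned phase before the node appears and reports Ready.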
Now, the GCP console shows the new instance (see below).
Figure: New infrastructure node
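For the production setup with three MachineSets mentioned earlier, the manifests for the other zones differ from machineset1.yaml only in the MachineSet name and the zone. The following is a minimal sketch that derives them with sed; the helper name render_machineset and the output file names are made up for this example, and the zones must exist in your region.

```shell
# Hypothetical helper: print a copy of a MachineSet manifest for another zone.
# Arguments: source file, old MachineSet name, new MachineSet name,
#            old zone, new zone.
render_machineset() {
  # The first expression renames the MachineSet everywhere it occurs
  # (metadata.name and the cluster-api-machineset labels).
  # The second expression rewrites only the "zone:" line.
  sed -e "s/$2/$3/g" -e "s/zone: $4\$/zone: $5/" "$1"
}

# Usage:
#   render_machineset machineset1.yaml myclus-khb5h-infra-a myclus-khb5h-infra-b \
#     us-east1-b us-east1-c > machineset2.yaml
#   render_machineset machineset1.yaml myclus-khb5h-infra-a myclus-khb5h-infra-c \
#     us-east1-b us-east1-d > machineset3.yaml
#   oc create -f machineset2.yaml
#   oc create -f machineset3.yaml
```

Review the generated files before applying them, since sed replaces every occurrence of the old name.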
Moving the Container Image Registry
To free resources from the worker node, let’s move the container image registry to the newly created infrastructure node.
Since the image registry resource already exists, we will edit the existing config/cluster object and add the infra nodeSelector to move the registry to our new infrastructure node.
oc edit config/cluster
# Add these two lines to the spec: section
  nodeSelector:
    node-role.kubernetes.io/infra: ""
# Save and exit the file
config.imageregistry.operator.openshift.io/cluster edited
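The same change can also be made without an interactive editor, which is handy for scripts. A minimal sketch using oc patch with a merge patch, assuming cluster administrator rights; the function name is hypothetical.

```shell
# Add the infra nodeSelector to the image registry operator configuration.
# The merge patch only touches spec.nodeSelector and leaves the rest as is.
move_registry_to_infra() {
  oc patch configs.imageregistry.operator.openshift.io/cluster --type=merge \
    -p '{"spec":{"nodeSelector":{"node-role.kubernetes.io/infra":""}}}'
}

# Usage:
#   move_registry_to_infra
```

The fully qualified resource name avoids ambiguity with other resources that are also called config.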
Watch the resources being moved:
watch -n 1 'oc get pods -n openshift-image-registry -o wide'
# Following output is edited for brevity
NAME                                            READY   STATUS    AGE   IP            NODE
cluster-image-registry-operator-9754995-rg2n5   2/2     Running   21h   10.128.0.28   myclus-khb5h-m-2.c.marek-ocp4-blog.internal
image-registry-75b4bd664f-rvrn5                 0/1     Pending   28s   <none>        <none>
image-registry-dd874db66-29hzp                  1/1     Running   21h   10.128.2.4    myclus-khb5h-w-c-d94s4.c.marek-ocp4-blog.internal
After a few moments, the original image-registry pod will be removed.
Getting Started
As you can see, once Infrastructure MachineSets have been created, existing OCP infrastructure components can easily be moved to the dedicated infrastructure nodes.
Next, try to move the cluster monitoring service, cluster aggregated logging, or the default router to your new infrastructure nodes.
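For the default router, the node placement is set on the default IngressController rather than on a Deployment. A minimal sketch, assuming the IngressController named default in the openshift-ingress-operator namespace and its nodePlacement field; the function names are hypothetical, and the field names should be checked against the documentation for your OCP version.

```shell
# Ask the ingress operator to schedule the default router pods onto
# nodes labeled with the infra role.
move_router_to_infra() {
  oc patch ingresscontroller/default -n openshift-ingress-operator --type=merge \
    -p '{"spec":{"nodePlacement":{"nodeSelector":{"matchLabels":{"node-role.kubernetes.io/infra":""}}}}}'
}

# Show on which nodes the router pods are running.
router_pods() {
  oc get pods -n openshift-ingress -o wide
}

# Usage:
#   move_router_to_infra
#   router_pods
```

If the default router runs more replicas than you have infrastructure nodes, some router pods may stay Pending until more infrastructure nodes exist.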
You are now ready to apply this technique in your new and existing OCP4 clusters.