This is the first in a series of blog posts explaining the different components of Kubernetes, primarily because if I can explain it here, I’ll have learned it quite well myself.
Primer on Containers
I think most people are at least aware of the existence of containers. Fundamentally, a container is a construct used to make an application component self-contained and portable. It holds all the libraries and binaries required to run the component. Think of it as a virtual machine without its own operating system: the container shares the host’s kernel instead. If you don’t have an operating system to boot, how much less resource is required to run it? How much faster will it start?
The answer to both of these is a lot.
It’s a natural evolution from physical servers to virtual machines to containers. It helps developers work in isolation without worrying about trying to run a code merge against the rest of the team’s work. It enables fast feedback when the code is committed and the unit tests are run. Because all the libraries and binaries are held within the container, it helps with the age-old problem – “It worked on my laptop.”
Because of the faster start time, they’re easier to scale up when you need to and scale down when you don’t. Or even just start a container when you need a task performing and delete it when that task is finished.
Pods Vs Containers
Kubernetes doesn’t deal in containers directly; it runs containers within a pod. This makes the pod the lowest common denominator, the atomic unit within Kubernetes.
A pod can hold multiple containers, which is the reason the pod exists rather than Kubernetes managing containers directly. It is a higher-level construct that lets multiple containers be scheduled together on the same machine and share the same network namespace.
Multiple pods are usually better than multiple containers in the same pod. When a container starts, its main process runs as PID 1. If PID 1 dies, the container stops, and Kubernetes restarts it according to the pod’s restart policy. If the pod is managed by a controller and is lost altogether, a replacement pod is created, possibly on another node.
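Just because you can run multiple containers in the same pod doesn’t mean you should. Scaling and scheduling both happen per pod, and the scheduler has to find a single node with enough resources for all of the pod’s containers together.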
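To make the shared network namespace a bit more concrete, here’s a minimal sketch of a two-container pod. The pod name, container names and images are my own choices for illustration, and I’m assuming the stock nginx image listening on port 80. The second container reaches the first over localhost, which only works because both sit in the same pod.

```yaml
# Hypothetical two-container pod; all names and images are illustrative only.
apiVersion: v1
kind: Pod
metadata:
  name: two-container-pod
spec:
  containers:
    - name: web
      image: nginx
    - name: probe
      image: busybox
      # Both containers share one network namespace, so "localhost"
      # here reaches the nginx container listening on port 80.
      command:
        - sh
        - -c
        - "while true; do wget -q -O /dev/null http://localhost:80; sleep 10; done"
```

Both containers always land on the same node and are scaled as one unit, which is why you’d only pair containers that genuinely belong together.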
Creating a pod
Creating a pod is done by one of two methods: imperatively or declaratively. Both (usually) involve the CLI, but with one the pod is created directly on the command line, while with the other it is declared in a YAML or JSON file that is then posted to the API server. The end result is broadly the same, in that both directly or indirectly post JSON to the API server. Imperatively, a pod can be created with a single command:
kubectl run test-pod --image=busybox --restart=Never
As you can see, it’s not rocket science to perform this task, and your pod will be created in no time. If you want to make any changes to that pod, you edit the pod directly. One way is kubectl edit, which brings up your text editor for you to edit the YAML before posting it back to the API server (bear in mind that only a few fields of an existing pod, such as the container image, can be changed this way):
kubectl edit pod test-pod
Or by deleting and recreating the pod with your changes.
kubectl delete pod test-pod
kubectl run test-pod --image=busybox --restart=Never --command -- sleep 3600
Once you’ve done this, where is the record of your configuration and the changes you’ve posted?
In my view using the CLI directly is great for quickly spinning up a new pod for testing or development but I’d avoid it for production use.
The declarative approach involves creating a YAML (or JSON, but YAML seems to be the de facto standard) file containing all your configuration options and posting the whole file to the API server.
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
  - image: busybox
    name: c1
    command:
    - sleep
    - "3600"
Once you’ve written and saved this, you post it to the API server using:
kubectl apply -f test-pod.yaml
As this is stored in a file, this can be kept in the source control repository of your choice along with all the good stuff that brings. It also means if you need to edit the file, you simply edit the original and post it again.
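For comparison, here’s roughly the manifest that the earlier kubectl run command corresponds to. Treat it as a sketch rather than the exact output: I’m assuming kubectl’s habit of naming the container after the pod and adding a run label, and I’ve left out the defaulted fields the API server fills in.

```yaml
# Approximate declarative equivalent of:
#   kubectl run test-pod --image=busybox --restart=Never
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
  labels:
    run: test-pod        # label added by kubectl run (assumption, worth verifying on your version)
spec:
  restartPolicy: Never   # what --restart=Never translates to
  containers:
    - name: test-pod     # kubectl run reuses the pod name for the container
      image: busybox
```

Adding --dry-run=client -o yaml to the kubectl run command prints the real thing without creating anything, which is a handy way to bootstrap a YAML file.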
Stop or Delete pods
Now we’ve talked about creating pods, it stands to reason that you might need to delete pods sometimes. The easiest way to achieve this is by name:
kubectl delete pod test-pod
It can also be done by selecting by label (labels covered next):
kubectl delete pod -l app=test-app
Or by deleting the whole namespace (namespaces covered later):
kubectl delete namespace test-ns
Labels
A label is a key-value pair and is an important construct in Kubernetes. In summary, Kubernetes uses labels to select resources to be managed by any number of higher level controllers. This enables scaling, self-healing and load balancing to name but a few.
There are a number of ways to label a pod. The most common (and you could argue the better way) is to include the label(s) in the pod’s YAML descriptor.
...
metadata:
  name: test-pod
  labels:
    app: test-app
    env: test
...
This can also be done imperatively:
kubectl label pod test-pod app=test-app
Once these labels are created, you can list your pods with all their labels:
kubectl get pod --show-labels
Or you can filter by a specific label:
kubectl get pod -l app=test-app
Or you can list all pods that have the label set:
kubectl get pod -l app
Or not set:
kubectl get pod -l '!app'
You can also label nodes, which comes in pretty handy if you want to advertise features that certain pods need in order to run, such as a GPU or SSD storage. To do this, label the node:
kubectl label node node1 gpu-installed=true
Then add this label under the nodeSelector attribute in the pod’s YAML (the key has to match the node label exactly):
...
spec:
  nodeSelector:
    gpu-installed: "true"
...
In summary, labels are an incredibly flexible way to organise your pods. As mentioned earlier, they’re also incredibly important to the rest of the Kubernetes platform.
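As one example of a higher level object leaning on labels, here’s a sketch of a Service that load balances across every pod carrying app=test-app. The Service name and the port numbers are assumptions for illustration; the selector is the part that matters.

```yaml
# Hypothetical Service; the name and ports are illustrative only.
apiVersion: v1
kind: Service
metadata:
  name: test-app-svc
spec:
  # Any pod whose labels include app=test-app becomes a backend,
  # regardless of how or when that pod was created.
  selector:
    app: test-app
  ports:
    - port: 80          # port exposed by the Service
      targetPort: 80    # port the pods are assumed to listen on
```

Remove the label from a pod and it silently drops out of the Service, which is the same mechanism controllers use to decide which pods they own.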
Annotations
Following on from labels, annotations are another kind of key-value pair that can form part of an object’s metadata. The main difference between the two is that annotations can’t be used to select multiple objects, and annotations can hold much larger (and therefore more descriptive) pieces of information.
I find the best use case for annotations is to add a description so that all users of a cluster understand what each component is. Sometimes annotations are also updated automatically by applications, or are used by alpha/beta features within Kubernetes.
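As a rough illustration, here’s how a descriptive annotation might sit alongside labels in a pod’s metadata. The annotation keys and values are hypothetical ones I’ve made up, using a prefixed key to avoid clashing with anyone else’s.

```yaml
# Fragment of a pod descriptor; the annotation keys and values are hypothetical.
metadata:
  name: test-pod
  labels:
    app: test-app          # short, selectable
  annotations:
    # Free-form text is fine here; annotations are not used for selection.
    example.com/description: "Throwaway busybox pod used for testing label selectors"
    example.com/owner: "platform-team"
```

The same can be done imperatively with kubectl annotate, for example kubectl annotate pod test-pod example.com/owner=platform-team.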
Namespaces
Namespaces are a means of grouping resources in a meaningful way. They’re most commonly used to separate the same cluster into development, test and sandbox environments, or possibly to give each branch of code its own namespace.
Namespaces allow you to re-use the same name for pods, ConfigMaps, Secrets etc. This means that all objects can be created from the same YAML files, with the environment-specific settings coming from the ConfigMap within each namespace.
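Here’s a minimal sketch of that idea. The ConfigMap name app-config and its ENVIRONMENT key are assumptions of mine; the point is that the pod manifest stays identical everywhere, while each namespace holds its own ConfigMap under the same name.

```yaml
# Pod manifest with no namespace set, so the same file can be applied to
# any namespace (for example with the -n flag on kubectl apply).
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
    - name: c1
      image: busybox
      command: ["sh", "-c", "env && sleep 3600"]
      envFrom:
        - configMapRef:
            name: app-config   # looked up in the pod's own namespace
---
# One ConfigMap like this per namespace, each with its own values
# (hypothetical data; test-ns is the example namespace used earlier).
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  namespace: test-ns
data:
  ENVIRONMENT: test
```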
A number of other items can be applied to individual namespaces, such as permissions and resource quotas.
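To give a flavour of the quota side, here’s a sketch of a namespace with a ResourceQuota attached. The quota name and the limits are arbitrary numbers picked for illustration, not recommendations.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: test-ns
---
# Hypothetical quota; the name and limits are illustrative only.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: test-quota
  namespace: test-ns
spec:
  hard:
    pods: "10"              # at most 10 pods in this namespace
    requests.cpu: "2"       # total CPU requested across all pods
    requests.memory: 4Gi    # total memory requested across all pods
```

Once CPU or memory requests are under quota, pods in that namespace need to declare those requests, otherwise they’ll be rejected.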