Finout's Complete Guide to Kubernetes Labels

Apr 24th, 2022

Whether you run your own clusters or use a managed Kubernetes service from Amazon, Google, or their competitors, you need to stay on top of managing resource usage and your container lifecycle.

Kubernetes is the industry's de facto solution for container management. With lightweight deployments that are not tied to an operating system, K8s has a lot to offer. It enables engineers to optimize clusters for cost, performance, resilience, capacity, and scalability by proactively creating, maintaining, and removing container instances as needed. And it does this while supporting world-class tools for developing, testing, and deploying your applications.

However, it is easy to miss out on a key control layer supported by K8s: Kubernetes labels. Labels give users a big-picture perspective and control over this modular system, which means that all K8s users need a firm understanding of:

Namespaces
What K8s labels are
How to apply Labels
How to conduct searches and the benefits K8s labels offer

With labels, you can optimize how you leverage the simple K8s API and third-party integrations. For example, client tools and libraries such as kubectl and Helm use labels to select and organize the objects they work with. Not only that, labels ensure that everyone on your team has instant access to the application metadata they need.

What is a Kubernetes Namespace?

A namespace is a high-level grouping of Kubernetes resources. Namespaces are intended to simplify the management of resources in environments where many users are spread across multiple teams or projects. A resource’s name must be unique within a namespace (but not across namespaces). Because each namespaced resource can belong to only one namespace, namespaces create natural divisions between teams and applications.
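To make the uniqueness rule concrete, here is a minimal Python sketch. The `Registry` class is a hypothetical in-memory stand-in, not part of any Kubernetes client, and keying on the resource kind as well as the name is an assumption made for the illustration.

```python
# Hypothetical in-memory stand-in for the API server's name check.
class Registry:
    def __init__(self):
        self._objects = set()

    def create(self, namespace, kind, name):
        key = (namespace, kind, name)
        if key in self._objects:
            raise ValueError(f"{kind} {name!r} already exists in {namespace!r}")
        self._objects.add(key)


registry = Registry()
registry.create("team-a", "Pod", "web")
registry.create("team-b", "Pod", "web")  # same name, other namespace: accepted

try:
    registry.create("team-a", "Pod", "web")  # same name, same namespace
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```

The second `web` pod is accepted because it lives in a different namespace; only the third call, which reuses both the namespace and the name, is refused.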

Your team may not need namespaces at all. In fact, for clusters with a few to tens of users, you will probably not need to create them. They matter to our discussion because labels are what you use to distinguish resources within the same namespace.

What are Kubernetes Labels?

K8s labels are key-value pairs attached to objects such as pods, as part of the metadata that describes an application.

With Kubernetes, the concept of an application is (deliberately) left very open and is defined through metadata. Creating your own labels is therefore an open-ended process: what data you choose to hold is entirely within your control.

That is not to say there are no conventions. In fact, Kubernetes itself uses labels to schedule pods to nodes, manage the replicas of deployments, and route network traffic to services.

Labels or Annotations?

Two metadata properties can hold key-value pairs: labels and annotations. Labels should be used to attach identifying metadata, for example:

```YAML
metadata:
  labels:
    tenant: "explo-6834"
    environment: "production"
    tier: "backend"
    app: "5784762"
    version: "1.82"
```

Annotations are used to attach additional arbitrary data to objects. For example:

```YAML
metadata:
  annotations:
    first_deployment: "1646126616"
    deployed_by: "daena@example.com"
```

Note that if an annotation later comes to be used for grouping or selecting objects, it has effectively been promoted to a label, and you should reassign it as one.
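The difference shows up as soon as you try to group objects. The sketch below uses plain Python dictionaries to stand in for object metadata; the pod names and the `team` annotation are hypothetical, and `promote_annotation` is an illustrative helper rather than anything Kubernetes provides.

```python
from collections import defaultdict

# Hypothetical pod metadata, reduced to the two properties discussed above.
pods = [
    {"name": "api-1", "labels": {"tier": "backend"}, "annotations": {"team": "payments"}},
    {"name": "api-2", "labels": {"tier": "backend"}, "annotations": {"team": "payments"}},
    {"name": "web-1", "labels": {"tier": "frontend"}, "annotations": {}},
]


def group_by_label(objects, key):
    """Group object names by the value of one label; unlabeled objects are skipped."""
    groups = defaultdict(list)
    for obj in objects:
        value = obj["labels"].get(key)
        if value is not None:
            groups[value].append(obj["name"])
    return dict(groups)


def promote_annotation(obj, key):
    """Move an annotation into labels once it starts being used for grouping."""
    if key in obj["annotations"]:
        obj["labels"][key] = obj["annotations"].pop(key)


by_tier = group_by_label(pods, "tier")
for pod in pods:
    promote_annotation(pod, "team")
by_team = group_by_label(pods, "team")
```

Before the promotion, grouping by `team` would find nothing, because grouping only looks at labels. Bear in mind that a promoted value must also satisfy the label syntax rules, which are stricter than those for annotations.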

Kubernetes Standard Labels

Kubernetes services and replication controllers use labels to manage pods, target workloads to specific instance types, and control services across multiple cloud provider zones. Therefore, label use is hard-baked into the Kubernetes design.

Standard labels include:

Key:	Value (what the label holds)
app.kubernetes.io/name:	Name of the application
app.kubernetes.io/part-of:	The higher-level application this micro-service supports
app.kubernetes.io/managed-by:	The package management system

Many standard labels are filled in by tooling rather than by hand, so it is well worth adopting them for your daily operations and client tools. For example, in a chart scaffolded with `helm create`:

`app.kubernetes.io/managed-by: {{ .Release.Service }}`

will be rendered as:

`app.kubernetes.io/managed-by: Helm`

because Helm is the package manager.

Why Set Up Custom Kubernetes Labels?

Custom labels are an excellent solution to several challenges that you will face when setting up a Kubernetes environment. They are very similar to the tagging concept in AWS: AWS tagging also relies on key-value pairs that identify AWS resources in EC2, S3, Redshift, and EFS.

Kubernetes labels allow DevOps to optimize searches, apply configurations, manage deployment administration, and enable FinOps by implementing a cost monitoring mechanism.

Say you want to monitor the status of pods according to their environment; you can set up key-value pairs that identify the environment, such as:

environment: development

environment: staging

environment: production

Such granular data allows you to make specific calls. For example, to list the status of all production pods:

```

kubectl get pods -l 'environment=production'

```

This is far superior to having to make an API call for all pods and then filtering through the output after.
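For comparison, this is roughly the client-side filtering that the `-l` flag saves you from writing. It is a sketch over hypothetical pod records held in plain Python dictionaries, not a call to the real API.

```python
# Hypothetical pod records; a real call would read them from the cluster.
pods = [
    {"name": "checkout-1", "status": "Running", "labels": {"environment": "production"}},
    {"name": "checkout-2", "status": "Pending", "labels": {"environment": "production"}},
    {"name": "checkout-dev", "status": "Running", "labels": {"environment": "development"}},
    {"name": "scratch", "status": "Running", "labels": {}},
]


def select(pods, **wanted):
    """Keep pods whose labels equal every requested key=value pair."""
    return [
        pod for pod in pods
        if all(pod["labels"].get(key) == value for key, value in wanted.items())
    ]


for pod in select(pods, environment="production"):
    print(pod["name"], pod["status"])
```

With a label selector, the same filtering happens on the server, so only the matching pods are returned to the client.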

Labels are also very useful for release management. Found a backend bug and want to release a patch? Simply deploy a new set of version: 1.83 backend instances, then replace tier: backend, version: 1.82 with tier: backend, version: 1.83 in the service label selector. The pods running 1.82 no longer match the selector and stop receiving traffic, while the new set of instances takes over.
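The selector switch can be sketched in a few lines of Python. The pod names are hypothetical and `endpoints` only imitates how a service picks its pods: a pod is selected when its labels contain every pair in the selector.

```python
# Hypothetical pods from the old and new releases, plus an unrelated frontend.
pods = [
    {"name": "backend-old-1", "labels": {"tier": "backend", "version": "1.82"}},
    {"name": "backend-new-1", "labels": {"tier": "backend", "version": "1.83"}},
    {"name": "backend-new-2", "labels": {"tier": "backend", "version": "1.83"}},
    {"name": "frontend-1", "labels": {"tier": "frontend", "version": "1.82"}},
]


def endpoints(selector, pods):
    """Return the names of pods whose labels contain every selector pair."""
    return sorted(
        pod["name"] for pod in pods
        if selector.items() <= pod["labels"].items()
    )


before = endpoints({"tier": "backend", "version": "1.82"}, pods)
after = endpoints({"tier": "backend", "version": "1.83"}, pods)
```

Changing one value in the selector moves traffic from `backend-old-1` to the two new pods without touching the pods themselves.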

Constraints on Labels

The following syntax constraints are applied to labels:

Key must be unique within a given object
Key name segment (required): 1 to 63 characters; key prefix (optional): up to 253 characters
Value: 0 to 63 characters
Name segments and values start and end with alphanumerics [a-z0-9A-Z] (unless the value is empty)
Dashes "-", underscores "_", and dots "." are allowed in between
(Optional) prefix must be a series of DNS labels separated by dots and followed by a slash "/"

The prefix identifies who owns a label. Keys without a prefix are presumed to be private to the user, while automated system components (for example, kube-scheduler) and third-party integrations that add labels to your objects must use a prefix.

Let's unpack the two syntax constraints that are most likely to cause confusion:

Enforcing the key as unique prevents us from making copy-and-paste mistakes such as duplicating the environment property.

```YAML
metadata:
  labels:
    environment: "production"
    environment: "development"  # invalid: "environment" appears twice
```
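A linter can catch this kind of slip before the manifest ever reaches the cluster. The sketch below assumes the manifest is supplied as JSON and uses only Python's standard `json` module; `reject_duplicates` is a hypothetical helper name.

```python
import json


def reject_duplicates(pairs):
    """object_pairs_hook that refuses a JSON object with a repeated key."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen


manifest = (
    '{"metadata": {"labels": '
    '{"environment": "production", "environment": "development"}}}'
)

try:
    json.loads(manifest, object_pairs_hook=reject_duplicates)
    error = None
except ValueError as exc:
    error = str(exc)
```

Without the hook, `json.loads` would quietly keep the last value, which is exactly the kind of silent surprise the uniqueness rule is meant to prevent.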

Consider a standard label key such as app.kubernetes.io/name:

{app.kubernetes.io} is the prefix, a DNS subdomain
{name} is the name segment.
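The syntax rules translate naturally into a small validator. This is a simplified sketch built on Python's `re` module, not the validation code Kubernetes itself runs; the function names are hypothetical.

```python
import re

# Name segments and non-empty values: 1 to 63 characters, alphanumeric at both
# ends, with "-", "_" and "." allowed in between.
_NAME = re.compile(r"[A-Za-z0-9]([A-Za-z0-9_.-]{0,61}[A-Za-z0-9])?")
# Each DNS label in a prefix: lowercase alphanumerics and "-", alphanumeric at both ends.
_DNS_LABEL = re.compile(r"[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?")


def split_key(key):
    """Split a label key into (prefix, name); prefix is None when absent."""
    prefix, slash, name = key.rpartition("/")
    return (prefix if slash else None), name


def is_valid_key(key):
    prefix, name = split_key(key)
    if not _NAME.fullmatch(name):
        return False
    if prefix is None:
        return True
    return len(prefix) <= 253 and all(
        _DNS_LABEL.fullmatch(part) for part in prefix.split(".")
    )


def is_valid_value(value):
    """Values may be empty; otherwise they follow the name-segment rules."""
    return value == "" or bool(_NAME.fullmatch(value))
```

For example, `split_key("app.kubernetes.io/name")` separates the prefix `app.kubernetes.io` from the name segment `name`, and `is_valid_key("-bad")` is rejected because the name does not start with an alphanumeric character.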

Searches

The Kubernetes API supports searches for:

equality, i.e., 1:1 matches
inequality, i.e., specify a "does not match"
sets

Equality uses = (or, if the fear of resetting a value leaves you feeling itchy, ==). Inequality is the standard !=, and a set of values is written in parentheses with a comma separator.

From our previous example of labeling our environment, therefore, we could use:

equality, to return the data on the pods in production:

```

kubectl get pods -l 'environment==production'

```

inequality, to return the pods in production and development (along with any pods that have no environment label at all):

```
kubectl get pods -l 'environment!=staging'
```

Or we can search against sets of values. Set-based searches support "in", "notin", and "exists" (written as just the key, or !key for "does not exist"):

```

kubectl get pods -l 'environment in (production)'

```

to return the data on the pods in production.

```

kubectl get pods -l 'environment notin (development, staging)'

```

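Here the comma inside the parentheses separates the values in the set. A comma between two separate requirements acts as an AND (&&) operator.

Note that there is no OR (||) operator between requirements; to match any one of several values for the same key, list them in a single "in" set.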
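To see how these operators combine, here is a simplified selector matcher in Python. It is a sketch of the semantics described above, with a deliberately reduced grammar, and not the parser that kubectl or the API server uses.

```python
import re

_SET = re.compile(r"^\s*([\w./-]+)\s+(in|notin)\s+\(([^)]*)\)\s*$")
_EQ = re.compile(r"^\s*([\w./-]+)\s*(==|!=|=)\s*([\w.-]*)\s*$")
_EXISTS = re.compile(r"^\s*(!?)\s*([\w./-]+)\s*$")


def split_requirements(selector):
    """Split on commas that are not inside parentheses."""
    parts, depth, current = [], 0, ""
    for ch in selector:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append(current)
            current = ""
        else:
            current += ch
    parts.append(current)
    return parts


def matches(selector, labels):
    """True only if the labels satisfy every requirement (logical AND)."""
    for req in split_requirements(selector):
        if m := _SET.match(req):
            key, op, raw = m.groups()
            values = {v.strip() for v in raw.split(",")}
            if op == "in" and labels.get(key) not in values:
                return False
            if op == "notin" and labels.get(key) in values:
                return False
        elif m := _EQ.match(req):
            key, op, value = m.groups()
            if (labels.get(key) == value) == (op == "!="):
                return False
        elif m := _EXISTS.match(req):
            negate, key = m.groups()
            if (key in labels) == bool(negate):
                return False
        else:
            raise ValueError(f"cannot parse requirement: {req!r}")
    return True
```

Calling `matches("environment notin (development, staging)", {})` returns `True`: an object with no environment label at all still satisfies a "notin" requirement, which is worth remembering when some pods are unlabeled.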

Multiple Conditions

If you provide more than one equality condition, the matching objects must satisfy all of the constraints. For example:

environment=production

tier!=frontend

Similarly, set-based conditions return the sub-set of objects that match all the given conditions.

Thus, according to our example:

```YAML
metadata:
  labels:
    tenant: "explo-6834"
    environment: "production"
    tier: "backend"
    app: "5784762"
    version: "1.82"
```

If environment can only be staging, development, or production, then:

```
kubectl get pods -l 'environment notin (development, staging),tier in (backend)'
```

would return the same object as

```kubectl get pods -l 'environment=production,tier!=frontend'```
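The two selectors agree for the example object, but only because it carries both labels. The Python sketch below models each requirement as a hypothetical (key, operator, values) tuple to show where they diverge: "notin" and "!=" also match objects that lack the key, whereas "in" and "=" do not.

```python
def satisfies(labels, requirements):
    """True if the labels meet every (key, operator, values) requirement."""
    for key, op, values in requirements:
        hit = key in labels and labels[key] in values
        if op in ("in", "=") and not hit:
            return False
        if op in ("notin", "!=") and hit:
            return False
    return True


set_based = [
    ("environment", "notin", {"development", "staging"}),
    ("tier", "in", {"backend"}),
]
equality_based = [
    ("environment", "=", {"production"}),
    ("tier", "!=", {"frontend"}),
]

example_pod = {"environment": "production", "tier": "backend"}
no_tier_pod = {"environment": "production"}  # hypothetical pod missing the tier label

same_for_example = satisfies(example_pod, set_based) and satisfies(example_pod, equality_based)
differs_without_tier = satisfies(no_tier_pod, set_based) != satisfies(no_tier_pod, equality_based)
```

A pod without a tier label passes `tier!=frontend` but fails `tier in (backend)`, so the equivalence holds only when every object carries all of the labels involved. That is one more argument for enforcing a required set of labels.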

Labels: Best Practices

There are three key considerations when implementing best practice for your K8s labels:

1. Your Organization's Labeling Strategy

When a system is designed to be so open-ended, it is vital to apply your own strategies and conventions to ensure that your labels provide you with the functionality you need.

Once such conventions are established, you can add checks at the Pull Request (PR) level to verify that configuration files include all the required labels.

Setting an informative prefix helps you instantly identify which service or family of functions a label applies to. It is good practice to choose a prefix to represent your company and sub-prefixes to identify specific projects.

If you want to see the labels applied to an object, you can add this flag to your call:

```kubectl get pod my-example-pod --show-labels```

2. Templates

In K8s, the concept of a template has a very specific application, thanks to pod templates. However, it used to be that the term “template” meant "a shaped piece of rigid material used as a pattern for processes such as cutting out".

So, let’s apply the term beyond pod templates, because it is good practice to apply a rigid structure to shape all your configuration files with ready-to-use patterns. From `PodTemplate` (specifications for creating Pods) to metadata structures for applications, your team will all be on the same page with a ready-to-use label strategy in place.

What you tag will depend on your needs, but will probably include those provided in our examples, such as:

environment
tier
version
application uuid

And, in a multi-tenant environment, where a pod is dedicated to one tenant, never forget:

tenant

Because you will love the cloud cost management that Finout can hand you by including tenancy!

Once you have defined a labeling strategy that teams can apply, the next best practice step is to validate the process. Conduct static code analysis of all resource config YAML files to verify the presence of all required labels. A PR should only be merged if the configuration file provides all the required labels.
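A check of this kind can be very small. The sketch below assumes the manifest has already been parsed into a Python dictionary; the `REQUIRED_LABELS` policy and the `missing_labels` helper are hypothetical names chosen for the illustration.

```python
# Assumed policy for this example; use whatever your own strategy requires.
REQUIRED_LABELS = {"environment", "tier", "version", "tenant"}


def missing_labels(manifest, required=REQUIRED_LABELS):
    """Return the required label keys that the manifest does not define."""
    labels = (manifest.get("metadata") or {}).get("labels") or {}
    return sorted(required - set(labels))


manifest = {
    "kind": "Pod",
    "metadata": {
        "name": "checkout",
        "labels": {"environment": "production", "tier": "backend"},
    },
}

problems = missing_labels(manifest)
if problems:
    print("missing required labels:", ", ".join(problems))
```

A PR check would run this over every changed manifest and fail when the returned list is not empty.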

3. Automate Labeling for CI/CD

Within your continuous integration/continuous delivery (CI/CD) pipeline, you can automate some labels for cross-cutting concerns. Attaching labels automatically with CD tooling ensures consistency and spares developer effort. Again, validate that those labels are in place: CI jobs should enforce proper labeling by making a build fail and notifying the responsible team if a label is missing.

You can define variables in Jenkinsfiles or GitHub Actions workflows and parameterize the label values to automate labels in Kubernetes manifests. At the same time, you can use Helm to easily deploy each version you want. Automated labels help here with every deployment strategy (canary, rolling update, etc.).
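As a rough illustration of the parameterized approach, the following Python sketch merges pipeline-provided values into a manifest's labels. The variable names `APP_VERSION` and `DEPLOY_ENV` are hypothetical; your pipeline will expose its own.

```python
import copy


def apply_ci_labels(manifest, pipeline_vars):
    """Return a copy of the manifest with pipeline-derived labels merged in."""
    labeled = copy.deepcopy(manifest)
    labels = labeled.setdefault("metadata", {}).setdefault("labels", {})
    labels["version"] = pipeline_vars["APP_VERSION"]
    labels["environment"] = pipeline_vars["DEPLOY_ENV"]
    return labeled


manifest = {
    "kind": "Deployment",
    "metadata": {"name": "checkout", "labels": {"tier": "backend"}},
}
labeled = apply_ci_labels(manifest, {"APP_VERSION": "1.83", "DEPLOY_ENV": "staging"})
```

Working on a copy keeps the checked-in manifest untouched, so the same source file can be labeled differently for each environment the pipeline deploys to.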

Helm sample usage:

apiVersion: apps/v1
kind: Deployment
metadata:
  name:
  labels:
    app.kubernetes.io/name:
    app.kubernetes.io/instance:
  annotations:
    kubernetes.io/change-cause:

Reaping the Benefits in a Multi-Tenancy or Multi-Cloud Environment

Kubernetes can run on any public cloud provider or on-premises system — or apply a hybrid approach. Therefore, it is possible to have different bills from different service providers for your clusters. Even if you are not stretched across different providers, you may provide services to multiple tenants via your pods on servers and load balancers. That means you need a system capable of tracking the cost of these components.

Simplified Searching

One of the benefits of applying a labeling strategy is that although each component may be granular, it becomes part of a bigger picture. And, those labels let you zoom into or rebuild that picture. Let's say you have a multi-tenant environment. If you want to know all of the services a particular tenant uses, then you can collate that tenant's data just by filtering.

Say our example tenant:

"tenant": "explo-6834"

is supported on tier==backend and tier==frontend.

Should you receive a query regarding a perceived service issue, a simple API call retrieves all the service-related data you need for that tenant. No need to look up any system diagrams to see what applications support that tenant's service.
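In code, collating a tenant's footprint is a single pass over the labeled objects. The sketch below works on hypothetical pod records in plain Python rather than on a live API response.

```python
# Hypothetical pods; only the first two belong to the example tenant.
pods = [
    {"name": "api-1", "labels": {"tenant": "explo-6834", "tier": "backend"}},
    {"name": "web-1", "labels": {"tenant": "explo-6834", "tier": "frontend"}},
    {"name": "api-9", "labels": {"tenant": "other-tenant", "tier": "backend"}},
]


def services_for_tenant(pods, tenant):
    """Group the names of one tenant's pods by their tier label."""
    by_tier = {}
    for pod in pods:
        if pod["labels"].get("tenant") == tenant:
            tier = pod["labels"].get("tier", "unknown")
            by_tier.setdefault(tier, []).append(pod["name"])
    return by_tier


footprint = services_for_tenant(pods, "explo-6834")
```

The result lists the tenant's backend and frontend pods side by side, which is the "rebuilt picture" that labels make possible.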

You retain your modularity without losing any visibility.

Advanced Cost Observability

Meaningful cost observability requires that FinOps teams can accurately calculate the costs per pod, pod-label, deployment, namespace, and other resources in your cluster. The key to achieving this is implementing a well-executed labeling strategy.

Cost insights can be relatively simple to achieve if you are in the enviable position of being able to assign a Kubernetes namespace to each tenant. In reality, DevOps usually faces the challenge of measuring a tenant’s usage of shared, autoscaled resources – which makes cost allocation more complicated. Cost allocation often requires assigning a tenant’s pro-rated usage of the cluster’s resources, including CPU, GPU, memory, disk, and network. This is where labels assist FinOps to allocate cost per customer, tenant, dev team, or business application.
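A pro-rated allocation can be sketched as follows. This is a deliberately simplified model that splits one cost figure by CPU only, using hypothetical pods and numbers; it is not a description of how Finout or any other tool calculates costs.

```python
from collections import defaultdict


def allocate_cost(pods, cluster_cost, label="tenant"):
    """Split cluster_cost across label values in proportion to CPU cores used."""
    usage = defaultdict(float)
    for pod in pods:
        owner = pod["labels"].get(label, "unallocated")
        usage[owner] += pod["cpu_cores"]
    total = sum(usage.values())
    if total == 0:
        return {}
    return {owner: cluster_cost * cores / total for owner, cores in usage.items()}


# Hypothetical usage figures for two tenants sharing one cluster.
pods = [
    {"labels": {"tenant": "explo-6834"}, "cpu_cores": 2.0},
    {"labels": {"tenant": "explo-6834"}, "cpu_cores": 1.0},
    {"labels": {"tenant": "other-tenant"}, "cpu_cores": 1.0},
]

shares = allocate_cost(pods, cluster_cost=100.0)
```

Pods without the label fall into an "unallocated" bucket, which makes gaps in the labeling strategy visible in the cost report instead of hiding them.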

But everything is shared! Don’t worry if your autoscaled architecture means that many pods support a multi-tenant service. Finout can also provide an abstraction layer, the Unit of Economics, to let FinOps zoom in on single-tenant costs.

Conclusion

Whether a dev wants to debug an issue, DevOps wants to shut down non-essential infrastructure resources over a long weekend, or FinOps wants to understand the costs in a multi-tenant environment, in K8s, labels give you that power.

As you may have noticed, managing resource usage in a highly volatile environment means that tracking actual usage levels and performing cloud cost management to distribute overhead expenses is no small challenge. Whether you are deploying Kubernetes clusters directly or with a cloud service provider such as AWS EKS, a robust tagging strategy pays huge dividends.
