Whether you run your own clusters or use a managed Kubernetes service from Amazon, Google, or their competitors, you need to stay on top of managing resource usage and your container lifecycle.
Kubernetes is the industry's de facto solution for container management. With lightweight deployments that are not tied to an operating system, K8s has a lot to offer. It enables engineers to optimize clusters for cost, performance, resilience, capacity, and scalability by proactively creating, supporting, and removing container instances as needed. And it does this while supporting world-class tools for developing, testing, and deploying your applications.
However, it is possible to miss out on a key control layer supported by K8s: Kubernetes labels. Kubernetes labels give users a big-picture perspective and control over this modular system, which means that all K8s users need a firm understanding of:
- Namespaces
- What K8s labels are
- How to apply labels
- How to conduct searches and the benefits K8s labels offer
With labels, you can optimize how you leverage the simple K8s API and third-party integrations. For example, labels give client tools and libraries such as kubectl and Helm a common way to identify and select the resources in a cluster. Not only that, they ensure that everyone on your team has instant access to the application metadata they need.
What is a Kubernetes Namespace?
A namespace is a high-level collection of Kubernetes resources. Namespaces are intended to simplify the management of resources in environments where many users are spread across multiple teams or projects. A resource’s name must be unique within a namespace (but not across namespaces). Because each Kubernetes resource can only belong to one namespace, namespaces create natural divisions between teams and applications.
Your team may not need to create namespaces. In fact, for clusters with a few to tens of users, you will probably not need to create them at all. They matter to our discussion because labels are what you use to distinguish resources within the same namespace.
What are Kubernetes Labels?
K8s labels are key-value pairs that are part of an application's metadata.
With Kubernetes, the concept of an application is (deliberately) left very open and is defined with metadata. This means that creating your own labels is an open-ended process: what data you choose to hold is entirely within your control.
That is not to say there are no conventions. In fact, Kubernetes itself uses labels to schedule pods to nodes, manage the replicas of deployments, and route network traffic to services.
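Labels or Annotations?
Labels identify objects and live in the labels map of an object's metadata:
"metadata": {
  "labels": {
    "tenant": "explo-6834",
    "environment": "production",
    "tier": "backend",
    "app": "5784762",
    "version": "1.82"
  }
}
Annotations are used to attach additional arbitrary data to objects. Unlike labels, annotations are not used to identify or select objects.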
Kubernetes Standard Labels
Kubernetes services and replication controllers use labels to manage pods, target workloads to specific instance types, and control services across multiple cloud provider zones. Label use is therefore baked into the Kubernetes design.
Standard labels include:
| Key | Value |
| --- | --- |
| app.kubernetes.io/name | Name of the application |
| app.kubernetes.io/part-of | The higher-level application this microservice supports |
| app.kubernetes.io/managed-by | The package management system |
Many standard labels are filled in automatically by the tools you deploy with, so it is well worth applying them for your daily operations and client tools. For example:
`app.kubernetes.io/managed-by: ""`
will be populated with:
`app.kubernetes.io/managed-by: Helm`
if Helm is the package manager.
Why Set Up Custom Kubernetes Labels?
Custom labels are an excellent solution to several challenges that you will face when setting up a Kubernetes environment. They are very similar to the tagging concept in AWS: AWS tagging also relies on key-value pairs that identify AWS resources in EC2, S3, Redshift, and EFS.
Kubernetes labels allow DevOps to optimize searches, apply configurations, manage deployment administration, and enable FinOps by implementing a cost monitoring mechanism.
Say you want to monitor the status of pods according to their environment; you can set up key-value pairs that identify the environment, such as:
environment: development
environment: staging
environment: production
Such granular data allows you to make specific calls. For example, to list the status of all production pods:
```
kubectl get pods -l environment=production
```
This is far superior to making an API call for all pods and then filtering through the output afterward.
Labels are also very useful for release management. Found a backend bug and want to release a patch? Simply deploy a new set of backend instances labeled version: 1.83, then replace tier: backend, version: 1.82 with tier: backend, version: 1.83 in the service's label selector. The service now routes traffic to the new instances, and the pods running 1.82 are orphaned.
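To make the selector switch concrete, here is a minimal Python sketch of how an equality-based service selector picks pods. It models labels as plain dictionaries instead of calling the Kubernetes API, and the pod names and the `selected_pods` helper are hypothetical.

```python
from typing import Dict, List

Labels = Dict[str, str]


def selected_pods(pods: Dict[str, Labels], selector: Labels) -> List[str]:
    """Return the names of pods whose labels contain every selector pair."""
    return sorted(
        name
        for name, labels in pods.items()
        if all(labels.get(key) == value for key, value in selector.items())
    )


# Hypothetical pods: the old and the patched backend run side by side.
pods = {
    "backend-old": {"tier": "backend", "version": "1.82"},
    "backend-new": {"tier": "backend", "version": "1.83"},
    "frontend": {"tier": "frontend", "version": "1.82"},
}

before_patch = selected_pods(pods, {"tier": "backend", "version": "1.82"})
after_patch = selected_pods(pods, {"tier": "backend", "version": "1.83"})
```

Changing a single value in the selector moves the match from `backend-old` to `backend-new`; none of the pods has to be edited.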
Constraints on Labels
The following syntax constraints are applied to labels:
- Each key must be unique within a given object
- A key consists of a name segment (required, at most 63 characters) and an optional prefix (at most 253 characters)
- The name segment must start and end with an alphanumeric character [a-z0-9A-Z]; a label value follows the same rules but may also be empty (length 0)
- Dashes "-", underscores "_", and dots "." are allowed in between
- The (optional) prefix must be a DNS subdomain: a series of DNS labels separated by dots, followed by a slash
A key without a prefix is presumed to be private to the user. Automated system components (for example, kube-scheduler) and third-party integrations that add labels to your objects must use a prefix, which keeps their labels from colliding with yours.
Let's unpack the two syntax constraints most likely to cause confusion:
- Enforcing unique keys prevents copy/paste mistakes such as duplicating the environment property.
- Consider a standard label such as app.kubernetes.io/name:
  - {app.kubernetes.io} is the prefix, a DNS subdomain
  - {name} is the name segment.
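The constraints above can be checked mechanically. The following is a rough Python sketch of such a check, written from the rules listed in this section; it is not the validation code Kubernetes itself runs, and the function names are hypothetical.

```python
import re

# Name segment and value: alphanumeric at both ends; '-', '_' and '.' allowed in between.
_NAME_RE = re.compile(r"[A-Za-z0-9]([A-Za-z0-9_.-]*[A-Za-z0-9])?")
# One DNS label of the prefix: lowercase alphanumerics and '-', alphanumeric at both ends.
_DNS_LABEL_RE = re.compile(r"[a-z0-9]([a-z0-9-]*[a-z0-9])?")


def is_valid_label_key(key: str) -> bool:
    """Check a key of the form [prefix/]name against the constraints above."""
    prefix, slash, name = key.rpartition("/")
    if slash:
        # A prefix is present: at most 253 characters of dot-separated DNS labels.
        if not prefix or len(prefix) > 253:
            return False
        if not all(_DNS_LABEL_RE.fullmatch(part) for part in prefix.split(".")):
            return False
    return 0 < len(name) <= 63 and _NAME_RE.fullmatch(name) is not None


def is_valid_label_value(value: str) -> bool:
    """A value may be empty; otherwise it follows the name-segment rules."""
    if value == "":
        return True
    return len(value) <= 63 and _NAME_RE.fullmatch(value) is not None
```

For example, `is_valid_label_key("app.kubernetes.io/name")` passes, while a value such as `"#backend"` is rejected because it does not start with an alphanumeric character.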
Searches
The Kubernetes API supports searches for:
- equality, i.e., 1:1 matches
- inequality, i.e., specify a "does not match"
- sets
Equality uses = (or, if the fear of resetting a value leaves you feeling itchy, ==). Inequality is the standard !=, and a set or array of values is specified with a comma separator.
From our previous example of labeling our environment, therefore, we could use:
- equality, to return the data on the pods in production:
```
kubectl get pods -l environment=production
```
- inequality, to return the pods in production and development (everything that is not staging):
```
kubectl get pods -l 'environment!=staging'
```
Or we can search for sets, i.e., an array. Set searches apply "in", "notin", and "exists":
```
kubectl get pods -l 'environment in (production)'
```
to return the data on the pods in production.
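or
```
kubectl get pods -l 'environment in (production),tier in (backend)'
```
where the separating "," comma acts as an AND (&&) operator.
Note that there is no OR (||) operator between conditions; to match any one of several values for a single key, list them inside the parentheses of an "in" condition.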
Multiple Conditions
If you provide more than one equality condition, the matching objects must satisfy all of the constraints. For example:
environment=production
tier!=frontend
Similarly, set-based conditions return the sub-set of objects that match all the given conditions.
Thus, in our example, where environment may be staging, development, or production, tier is either frontend or backend, and every pod carries both labels:
```
kubectl get pods -l 'environment notin (development, staging),tier in (backend)'
```
would return the same objects as
```
kubectl get pods -l 'environment=production,tier!=frontend'
```
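The matching rules can be sketched in a few lines of Python. This is a simplified model of selector evaluation over plain dictionaries, not the Kubernetes implementation; the `matches` helper, the tuple format for conditions, and the pod names are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Labels = Dict[str, str]
# One condition: (key, operator, values); the operators mirror the selector syntax.
Requirement = Tuple[str, str, Tuple[str, ...]]


def matches(labels: Labels, requirements: List[Requirement]) -> bool:
    """Return True only if the labels satisfy every condition (comma = AND)."""
    for key, op, values in requirements:
        if op == "=":
            ok = labels.get(key) == values[0]
        elif op == "!=":
            # Also matches objects that do not carry the key at all.
            ok = labels.get(key) != values[0]
        elif op == "in":
            ok = labels.get(key) in values
        elif op == "notin":
            # Also matches objects that do not carry the key at all.
            ok = labels.get(key) not in values
        else:
            raise ValueError(f"unsupported operator: {op}")
        if not ok:
            return False
    return True


# Hypothetical pods covering every environment and tier from the example.
pods = {
    "api-1": {"environment": "production", "tier": "backend"},
    "web-1": {"environment": "production", "tier": "frontend"},
    "api-2": {"environment": "staging", "tier": "backend"},
    "api-3": {"environment": "development", "tier": "backend"},
}

set_based = [
    ("environment", "notin", ("development", "staging")),
    ("tier", "in", ("backend",)),
]
equality_based = [
    ("environment", "=", ("production",)),
    ("tier", "!=", ("frontend",)),
]

set_result = [name for name, labels in pods.items() if matches(labels, set_based)]
equality_result = [name for name, labels in pods.items() if matches(labels, equality_based)]
```

Both selectors pick the same pod here. They diverge for a production pod that has no tier label: `tier!=frontend` accepts it, while `tier in (backend)` does not, which is one more reason to label every pod consistently.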
Labels: Best Practices
There are three key considerations when implementing best practice for your K8s labels:
1. Your Organization's Labeling Strategy
When a system is designed to be so open-ended, it is vital to apply your own strategies and conventions to ensure that your labels provide you with the functionality you need.
Once such conventions are established, you can add checks at the Pull Request (PR) level to verify that configuration files include all the required labels.
Setting an informative prefix helps you identify at a glance which service or family of functions a label applies to. It is good practice to choose a prefix that represents your company, with sub-prefixes that identify specific projects.
If you want to see the labels applied to an object, you can add the `--show-labels` flag to your call, for example `kubectl get pods --show-labels`.
2. Templates
In K8s, the concept of a template has a very specific application, thanks to pod templates. However, it used to be that the term “template” meant "a shaped piece of rigid material used as a pattern for processes such as cutting out".
So, let’s apply the term beyond pod templates, because it is good practice to apply a rigid structure to shape all your configuration files with ready-to-use patterns. From `PodTemplate` (specifications for creating Pods) to metadata structures for applications, your team will all be on the same page with a ready-to-use label strategy in place.
What you tag will depend on your needs, but will probably include those provided in our examples, such as:
- environment
- tier
- version
- application uuid
And, in a multi-tenant environment, where a pod is dedicated to one tenant, never forget:
- tenant
Because you will love the cloud cost management that Finout can hand you by including tenancy!
Once you have defined a labeling strategy that teams can apply, the next best-practice step is to validate that it is followed. Conduct static code analysis of all resource config YAML files to verify the presence of all required labels. A PR should only be merged if its configuration files provide all the required labels.
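A check of this kind can be very small. The sketch below assumes the manifests have already been parsed into Python dictionaries (for instance by a YAML parser in your pipeline); the list of required labels and the function names are hypothetical and should be replaced by your own policy.

```python
from typing import Any, Dict, List, Sequence

# Hypothetical policy: the labels every resource must carry.
REQUIRED_LABELS = ("environment", "tier", "version", "tenant")


def missing_labels(
    manifest: Dict[str, Any], required: Sequence[str] = REQUIRED_LABELS
) -> List[str]:
    """Return the required label keys that the manifest does not define."""
    labels = (manifest.get("metadata") or {}).get("labels") or {}
    return [key for key in required if key not in labels]


def check_manifests(manifests: List[Dict[str, Any]]) -> List[str]:
    """Build one error message per manifest that lacks required labels."""
    errors = []
    for manifest in manifests:
        missing = missing_labels(manifest)
        if missing:
            name = (manifest.get("metadata") or {}).get("name", "<unnamed>")
            errors.append(f"{name}: missing labels {', '.join(missing)}")
    return errors
```

A PR check could run `check_manifests` over every changed file and block the merge whenever the returned list is not empty.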
3. Automate Labeling for CI/CD
Within your continuous integration/continuous delivery (CI/CD) pipeline, you can automate some labels for cross-cutting concerns. Attaching labels automatically with CD tooling ensures consistency and spares developer effort. Again, validate that those labels are in place: CI jobs should enforce proper labeling by making a build fail and notifying the responsible team if a label is missing.
You can define variables in Jenkinsfiles or GitHub Actions workflows and parameterize the label section to automate labels in Kubernetes manifests. At the same time, you can use Helm to easily deploy each version you want. Automated labels will help you here with each deployment strategy (canary, rolling update, etc.).
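As an illustration of parameterized labels, the following Python sketch stamps a parsed manifest with values taken from pipeline variables. The variable names `APP_VERSION` and `DEPLOY_ENV`, the fallback values, and the helper itself are assumptions; your CI system will expose its own names.

```python
import copy
import os
from typing import Any, Dict, Mapping


def with_pipeline_labels(
    manifest: Dict[str, Any], env: Mapping[str, str] = os.environ
) -> Dict[str, Any]:
    """Return a copy of the manifest with labels filled in from pipeline variables."""
    labeled = copy.deepcopy(manifest)  # leave the caller's manifest untouched
    labels = labeled.setdefault("metadata", {}).setdefault("labels", {})
    # Hypothetical variable names and defaults.
    labels["version"] = env.get("APP_VERSION", "unknown")
    labels["environment"] = env.get("DEPLOY_ENV", "development")
    return labeled
```

Because the labels come from the pipeline rather than from hand-edited files, every deployment of a given build carries the same version and environment values.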
Kubernetes can run on any public cloud provider or on-premises system — or apply a hybrid approach. Therefore, it is possible to have different bills from different service providers for your clusters. Even if you are not stretched across different providers, you may provide services to multiple tenants via your pods on servers and load balancers. That means you need a system capable of tracking the cost of these components.
Simplified Searching
One of the benefits of applying a labeling strategy is that although each component may be granular, it becomes part of a bigger picture, and those labels let you zoom into or rebuild that picture. Let's say you have a multi-tenant environment. If you want to know all of the services a particular tenant uses, you can collate that tenant's data just by filtering.
Say our example tenant is explo-6834: a selector such as `tenant=explo-6834` returns every resource labeled for that tenant.
Should you receive a query regarding a perceived service issue, a simple API call retrieves all the service-related data you need for that tenant. There is no need to look up system diagrams to see which applications support that tenant's service.
You retain your modularity without losing any visibility.
Advanced Cost Observability
Meaningful cost observability requires that FinOps teams can accurately calculate the costs per pod, pod-label, deployment, namespace, and other resources in your cluster. The key to achieving this is implementing a well-executed labeling strategy.
Cost insights can be relatively simple to achieve if you are in the enviable position of being able to assign a Kubernetes namespace to each tenant. In reality, DevOps usually faces the challenge of measuring a tenant’s usage of shared, autoscaled resources, which makes cost allocation more complicated. Cost allocation often requires assigning a tenant’s pro-rated usage of the cluster’s resources, including CPU, GPU, memory, disk, and network. This is where labels help FinOps allocate cost per customer, tenant, dev team, or business application.
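A pro-rated split can be sketched as follows. This is a deliberately simplified illustration, not how any particular cost tool works: it uses a single usage metric (CPU hours) where a real allocation would also weigh GPU, memory, disk, and network, and the pod records, the second tenant name, and the cost figure are made up.

```python
from collections import defaultdict
from typing import Any, Dict, List


def allocate_cost(
    pods: List[Dict[str, Any]], total_cost: float, label: str = "tenant"
) -> Dict[str, float]:
    """Split total_cost across label values in proportion to each pod's usage."""
    usage: Dict[str, float] = defaultdict(float)
    for pod in pods:
        # Pods without the label are collected under "unallocated".
        owner = pod["labels"].get(label, "unallocated")
        usage[owner] += pod["cpu_hours"]
    total_usage = sum(usage.values())
    if total_usage == 0:
        return {}
    return {owner: total_cost * used / total_usage for owner, used in usage.items()}


# Hypothetical usage figures for one billing period.
pods = [
    {"labels": {"tenant": "explo-6834"}, "cpu_hours": 30.0},
    {"labels": {"tenant": "explo-6834"}, "cpu_hours": 10.0},
    {"labels": {"tenant": "acme-0001"}, "cpu_hours": 50.0},
    {"labels": {}, "cpu_hours": 10.0},
]

costs = allocate_cost(pods, total_cost=200.0)
```

The "unallocated" bucket shows at a glance how much spend cannot be attributed because a label is missing.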
But everything is shared! Don’t worry if your autoscaled architecture means that many pods support a multi-tenant service. Finout can also provide an abstraction layer, the Unit of Economics, to let FinOps zoom in on single-tenant costs.
Conclusion
Whether a dev wants to debug an issue, DevOps wants to shut down non-essential infrastructure resources over a long weekend, or FinOps wants to understand the costs in a multi-tenant environment, in K8s, labels give you that power.
As you may have noticed, managing resource usage in a highly volatile environment means that tracking actual usage levels and performing cloud cost management to distribute overhead expenses is no small challenge. Whether you are deploying Kubernetes clusters directly or with a cloud service provider such as AWS EKS, a robust tagging strategy pays huge dividends.