Application Consistency in Kubernetes - Zerto

Application Consistency in Kubernetes

June 15, 2021
Est. Reading Time: 4 minutes

Gone are the days that a backup of a virtual machine is enough for backing up an entire application.

Applications, especially modern, cloud-native applications, span across virtual machines, containers and cloud services. Backing up any one of those individual components is not enough to capture the entire application’s state. Instead, we need to capture all application components at the same moment in time for the backup to be usable. In other words: if we don’t capture the entire application, we cannot meaningfully recover data from a backup.

What is Application Consistency

Application Consistency is the practice of capturing the entire application’s state as a whole, coordinating backup across its constituents across virtual machines, containers and cloud services.

Modern, cloud-native applications, by design, are made up of many smaller application services, called microservices. These services live across cloud availability zones (or span data centers) by design to increase resilience.

This makes application consistency for modern applications more complex than ever — the number of components and diversity of storage locations is higher than ever.

In the case of Kubernetes, application consistency is even more important – and challenging. There are two significant contributing factors to this challenge.

Containers are ephemeral

The first is that containers are ephemeral – you can’t rely on any one single container running. The sum of running containers make up that application service; individual container instances pop in and out of existence during normal operations, caused by scaling events (adding or removing replicas) and host events (host failures, cluster events).

This means that, for data protection purposes, we cannot use the containers themselves to capture state. We have to rely on the underlying Kubernetes platform, understand the taxonomy of the application, and use its knowledge of the application to capture container images, application configuration and state and persistent storage.

Luckily, application taxonomy is built into the core of Kubernetes (we won’t go into detail on any of the core Kubernetes objects, but you can go here for a great explanation). Using Deployments, ReplicaSets and Pods, you can tell Kubernetes what your application looks like, exactly. And in combination with Services, Ingress, ConfigMaps, Secrets and some more object types, Kubernetes unravels applications to its individual constituents, which we can use for application consistency.

For instance, this yaml is an example of a Deployment. It creates a ReplicaSet to bring up three nginx Pods. The Deployment, called nginx-deployment, creates three replicated Pods, each consisting of a single container, based on the nginx Docker Hub image at version 1.14.2.

Source: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/

As you can see, Kubernetes is very prescriptive. The desired state approach of Kubernetes, where you define what you want your applications to look like, is helpful for application consistency: the entire taxonomy of an application is available, in code, ready to use. Kubernetes cleanly separates the application’s container base images, application configuration and state, secrets and persistent storage.

Stateless containers lead to fragmentation

Which brings us to the second significant contributing factor to the Kubernetes application consistency challenge: containers are stateless, and lead to fragmentation of persistent data.

Kubernetes cleanly separates application runtime (the container image), application configuration (ConfigMaps, persistent state on disk) and application data (persistent volumes). That means that if we want to capture an application’s state, we need to capture more than the container images, but also need to protect additional Kubernetes objects (like ConfigMaps, Secrets, Services and more), but also its persistent data across on-premises and cloud block storage, but also persistent data on file and object stores at the same point-in-time.

Which, inevitably, leads to data fragmentation across heterogenous storage services and locations, including across availability zones and on-prem data centers. Keeping tabs on what data is stored where (and how) is not easy, adding to the complexity of application consistency of modern applications.

This is why it’s important to keep state in one place: the Kubernetes yaml, so you can treat that yaml as your single source of truth (as well as its history of changes) to an application’s anatomy. With that visibility into your applications landscape’s state, you can consistently, correctly and completely protect its data.