Kubernetes – Volumes

Kubernetes is an open-source container orchestration tool originally developed by Google. It is primarily used to automate the deployment, scaling, and management of containerized applications. Kubernetes is often abbreviated as K8s and is currently maintained by the Cloud Native Computing Foundation. It was first designed to work with the Docker Engine runtime; today it works with runtimes such as containerd and CRI-O. Automating container management operations is Kubernetes' primary goal. It has built-in commands for deploying applications and rolling out changes to them. Companies such as Google, Spotify, and Capital One use it.

What are Volumes?

On-disk files in a container are ephemeral, which causes problems for non-trivial applications running in containers. One major problem is the loss of files when a container crashes: the kubelet restarts the container, but it comes back in a clean state without the earlier data. A second problem arises when containers running together in a Pod need to share files. The Kubernetes volume abstraction solves both problems.

Volumes in Kubernetes

Docker has a concept of volumes, but it is looser and less managed. In Docker, a volume is a directory on disk or in another container. Docker provides volume drivers, but their functionality is limited. Kubernetes supports many types of volumes, and a Pod can use any number of them simultaneously. Ephemeral volume types have the lifetime of a Pod and are destroyed when the Pod ceases to exist, whereas persistent volumes exist beyond the lifetime of a Pod. For any kind of volume in a Pod, data is preserved across container restarts.

At its core, a volume is a directory, possibly with some data in it, that is accessible to the containers in a Pod. How that directory comes to be, the medium that backs it, and its contents are determined by the volume type used.
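The file-sharing case described above can be sketched with an emptyDir volume mounted by two containers in the same Pod. This is a minimal illustration, not a production manifest: the Pod name, container names, image tag, and file paths are all assumptions chosen for the example.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-volume-demo          # hypothetical name
spec:
  containers:
  - name: writer
    image: busybox:1.36             # assumed image; any image with a shell works
    # Appends a timestamp to a file on the shared volume every five seconds.
    command: ["sh", "-c", "while true; do date >> /data/log.txt; sleep 5; done"]
    volumeMounts:
    - name: shared-data
      mountPath: /data
  - name: reader
    image: busybox:1.36
    # Waits for the file to appear, then follows it through its own mount point.
    command: ["sh", "-c", "until [ -f /logs/log.txt ]; do sleep 1; done; tail -f /logs/log.txt"]
    volumeMounts:
    - name: shared-data
      mountPath: /logs
      readOnly: true
  volumes:
  - name: shared-data
    emptyDir: {}                    # ephemeral: removed together with the Pod
```

Both containers see the same directory under different mount paths. If either container crashes and is restarted, the file is still there, but deleting the Pod removes the emptyDir and its contents.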

Types of Volumes

As discussed earlier, Kubernetes supports multiple types of volumes:

1. awsElasticBlockStore (deprecated)

An awsElasticBlockStore volume mounts an Amazon Web Services (AWS) EBS volume into a Pod. Unlike emptyDir, which is erased when a Pod is removed, the contents of an EBS volume are persisted and the volume is only unmounted. This means that an EBS volume can be pre-populated with data, and that data can be shared between Pods.
There are some restrictions on using an awsElasticBlockStore volume: the nodes on which the Pods run must be AWS EC2 instances, and those instances must be in the same region and availability zone as the EBS volume.

2. AWS EBS CSI migration

When the CSIMigration feature for awsElasticBlockStore is enabled, it redirects all plugin operations from the existing in-tree plugin to the ebs.csi.aws.com Container Storage Interface (CSI) driver. To use this feature, the AWS EBS CSI driver must be installed on the cluster.
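Once the AWS EBS CSI driver is installed, storage is typically requested through a StorageClass that names the driver as its provisioner. The following is a minimal sketch under that assumption; the class name and the chosen binding mode and reclaim policy are illustrative, not something the migration feature requires.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-csi-example                    # hypothetical name
provisioner: ebs.csi.aws.com               # the CSI driver named above
# Delay provisioning until a Pod is scheduled, so the EBS volume is
# created in the same availability zone as the node that runs the Pod.
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete                      # remove the volume when its claim is deleted
```

A PersistentVolumeClaim that sets storageClassName to this class would then be served by the CSI driver rather than the in-tree plugin.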

3. AWS EBS CSI migration complete

To prevent the awsElasticBlockStore storage plugin from being loaded by the controller manager and the kubelet, set the InTreePluginAWSUnregister flag to true.
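For the kubelet, one way to express such a flag is the featureGates map of a kubelet configuration file. The sketch below assumes a Kubernetes release that still ships the in-tree plugin and still recognizes this gate; newer releases may have removed both, so check the feature gate list for the version in use.

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  # Assumption: this gate is still recognized by the running Kubernetes version.
  InTreePluginAWSUnregister: true
```

The controller manager takes the equivalent setting through its --feature-gates command-line flag.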

4. AzureDisk CSI migration

When the CSIMigration feature for azureDisk is enabled, it redirects all plugin operations from the existing in-tree plugin to the disk.csi.azure.com CSI driver. To use this feature, the Azure Disk CSI driver must be installed on the cluster.

5. AzureDisk CSI migration complete

To prevent the azureDisk storage plugin from being loaded by the controller manager and the kubelet, set the InTreePluginAzureDiskUnregister flag to true.

6. AzureFile CSI migration

When the CSIMigration feature for azureFile is enabled, it redirects all plugin operations from the existing in-tree plugin to the file.csi.azure.com CSI driver. To use this feature, the Azure File CSI driver must be installed on the cluster and the CSIMigrationAzureFile feature gate must be enabled. Note that the Azure File CSI driver does not support using the same volume with different fsGroups.

Kubernetes Storage Plugins

Kubernetes storage plugins lay the foundation for managing and operating storage solutions for containers. They configure storage for the containers that Kubernetes orchestrates, which increases the platform's adaptability and lets a cluster be fitted to a range of storage requirements. The following are two popular Kubernetes storage plugins:

Network File System Plugin
Container Storage Interface (CSI)

Network File System Plugin

Effortless Data Sharing: The NFS plugin plays a vital role in connecting Kubernetes volumes to network file systems, so that data can be shared without extra effort.
Cluster-wide Accessibility: An NFS share can be mounted by Pods on different nodes of the cluster at the same time, which improves the efficiency of resource usage.
Scalable Solutions: NFS is a good option for tailoring storage to an application's needs, enabling effective data exchange and offering scalable solutions.
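As a rough illustration of the NFS plugin, the sketch below mounts an existing NFS export into a Pod. The server address and export path are placeholders, and the Pod and volume names are hypothetical; an NFS server with that export must already exist and be reachable from the cluster nodes.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nfs-demo                    # hypothetical name
spec:
  containers:
  - name: web
    image: nginx:latest
    volumeMounts:
    - name: nfs-share
      mountPath: /usr/share/nginx/html
  volumes:
  - name: nfs-share
    nfs:
      server: nfs.example.com       # placeholder: address of an existing NFS server
      path: /exports/web            # placeholder: directory exported by that server
      readOnly: false
```

Because the data lives on the NFS server, it is kept when the Pod is deleted, and several Pods can mount the same export at once.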

Container Storage Interface, or CSI

Standardized Integration: The Container Storage Interface (CSI) defines a standard interface, a common language that storage providers implement, which makes integration with Kubernetes straightforward.
Flexibility and Choice: CSI plugins let users select the storage options that best meet the needs of their applications.
Interoperability: Because CSI is a standardized approach, it improves interoperability and provides a smooth, efficient storage experience in containerized environments.
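From the application side, a CSI-backed volume is usually requested with an ordinary PersistentVolumeClaim that names a StorageClass. The sketch below assumes a StorageClass called example-csi-sc already exists and points at an installed CSI driver; both the class name and the claim name are hypothetical.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: csi-claim                       # hypothetical name
spec:
  storageClassName: example-csi-sc      # assumed StorageClass backed by a CSI driver
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```

The claim looks the same whichever CSI driver serves it, which is the practical benefit of the standardized interface: switching storage providers mostly means switching the StorageClass.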

Creating a Pod Yaml File with PV and PVC

When a developer deploys an application on Kubernetes and needs its data to be persistent, the developer requests storage through a PersistentVolumeClaim (PVC). Cluster administrators review the specifications of PVC requests and provide PersistentVolumes (PVs) that meet them.

Step 1: Creating a PVC (Persistent Volume Claim) with a YAML File

Here we create a PVC named mypvc in a YAML file that requests 2Gi of storage with the ReadWriteOnce access mode, which allows the volume to be mounted read-write by a single node. Save the file as my_pvc.yml.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mypvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi

Run this YAML file with either the kubectl create command or the kubectl apply command, as follows:

kubectl create -f my_pvc.yml

[Screenshot: Step 1, output of the above command]

Step 2: Check the Status of the PVC

After completing step 1, check the status of the PVC by running the following command:

kubectl get pvc

To learn more about the PVC, view its description by running the following command:

kubectl describe pvc/mypvc

[Screenshot: Step 2, output of the above commands]

Step 3: Creating a PV (Persistent Volume) with a YAML File

A Persistent Volume is created to satisfy the PVC specification and is then bound to the claim. Here we create the PV manually, instead of provisioning it through storage classes, to make the workflow easier to follow. Write the YAML for the persistent volume as follows:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: mypv
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: "/mnt/data"

In this file, the reclaim policy is set to Retain, so the volume and its data are kept even after the Pod, and the claim bound to the volume, are deleted.

Save the above YAML file as my_pv.yml and then apply it with the following command:

kubectl apply -f my_pv.yml

[Screenshot: Step 3, output of the above command]

Step 4: Checking the Status of PV

After completing step 3, check the status of the PV by running the following command:

kubectl get pv

To see the details of the PV that you created, run the following command with the name of the PV:

kubectl describe pv mypv

[Screenshot: Step 4, output of the above commands]
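One caveat about binding the PV and PVC from the previous steps: on a cluster that has a default StorageClass, a claim without a storageClassName may be satisfied by a dynamically provisioned volume instead of the manually created PV. A common way to pair the two explicitly, sketched below, is to give both objects the same storageClassName. The value manual is an assumption used only as a matching label here; it does not need to correspond to an existing StorageClass object.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mypv
spec:
  storageClassName: manual        # assumed label; must match the claim below
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: "/mnt/data"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mypvc
spec:
  storageClassName: manual        # same value as on the PV, so the two can bind
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
```

If kubectl get pvc shows the claim as Bound to mypv, no change is needed. Otherwise, note that the storageClassName of an existing claim cannot be edited, so the claim would have to be deleted and recreated with the new value.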

Step 5: Creating a Pod YAML File with PV and PVC

Now create a Pod YAML file, based on the following code, that uses the PV and PVC created above in a Pod running the nginx:latest image.

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
  - name: my-container
    image: nginx:latest
    volumeMounts:
    - name: my-persistent-storage
      mountPath: "/usr/share/nginx/html"
  volumes:
  - name: my-persistent-storage
    persistentVolumeClaim:
      claimName: mypvc

Save this Pod YAML file as my_pod.yml and then create the Pod with kubectl create, as follows:

kubectl create -f my_pod.yml

[Screenshot: Step 5, output of the above command]

Step 6: Checking the Status of the Created Pod

To confirm that the Pod was created, run the following command:

kubectl get pods

To see the details of that Pod, run the following command:

kubectl describe pods/my-pod

[Screenshots: Step 6.1 and Step 6.2, output of the above commands]

Conclusion

In the evolving field of container orchestration, Kubernetes plays an essential role in managing persistent data. It offers flexible data management within Pods, from short-lived to permanent storage, with volume types such as emptyDir for easy-to-use temporary storage and hostPath for keeping data on the host filesystem. Kubernetes also provides many features for orchestrating containers effectively and automatically, such as the ConfigMap resource for streamlining configuration management, which together improve the overall efficiency of containerized applications.

Using volumes effectively in Kubernetes not only helps manage data within a cluster but also supports the scalability and agility of Kubernetes orchestration. The need for Kubernetes volumes keeps growing with the adoption of microservices and distributed architectures.