In my previous blog post (http://bit.ly/4l5DIK8), I demonstrated a basic use case: deploying a Cirros virtual machine on a Kubernetes cluster. The process was relatively straightforward because it did not require DataVolumes (DV) or the Containerized Data Importer (CDI). The complexity increases once DV and CDI enter the workflow. Here is a chart to explain when each is needed.
| OS Image | Data Volume | Explanation |
| --- | --- | --- |
| RHEL/Rocky | ✅ Yes, DV is needed | RHEL/Rocky comes as a .qcow2 or .iso image that must be imported and written to a persistent volume. CDI does this automatically. |
| Cirros | ❌ No, DV not needed | Cirros is tiny and often bundled as a containerDisk (ephemeral). It can run directly without importing to a PV. |
| Feature | ContainerDisk | DataVolume |
| --- | --- | --- |
| Volatile | Yes (removed after VM shutdown) | No (persistent storage) |
| Format support | Only container images | Can import .qcow2, .iso, or clone PVCs |
| Use case | Quick test VMs (like Cirros) | Real OS installation (like RHEL, Windows) |
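For contrast with the DataVolume-based manifests later in this post, an ephemeral containerDisk volume looks roughly like this inside a VirtualMachine spec (a minimal sketch; the Cirros demo image name is illustrative and should be replaced with whatever image you actually use):

# Ephemeral containerDisk volume: no DV/CDI involved, disk contents are lost on restart
volumes:
  - name: rootdisk
    containerDisk:
      image: quay.io/kubevirt/cirros-container-disk-demo   # illustrative image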
When working with Kubevirt, it is common to encounter new storage terms such as DataVolume (DV), CDI, StorageClass (SC), PV, and PVC. These terms can be confusing at first glance, but with the right analogy they can be easily understood and applied in real-world scenarios. In this section of the blog post, we will delve deeper into these objects and correlate them with an analogy from a car parking lot.
Let's imagine you are looking for a spot in a crowded parking lot. The parking lot itself represents the storage space available on your cluster. Just as you need to find a spot that fits your car's size and requirements, with Kubevirt you need to allocate storage resources based on the needs of your virtual machines (VMs).
Firstly, let’s discuss Data volumes. These are analogous to designated parking spots in the lot that have been marked out for specific cars or vehicles. Similarly, data volumes act as designated chunks of storage space that are allocated for use by VMs. They serve as persistent storage that remains even after the VM has been deleted.
Next up is CDI (Containerized Data Importer), which can be compared to valet services offered at some parking lots. With CDI, you can import existing disk images into your cluster and use them as data volumes for your VMs without having to create them manually.
Now let’s look at SC (Storage Class) – this parameter determines what type of parking spot or storage resource will be allocated for your VM. Just like how different types of cars require different sized spots in a parking lot, similarly different types of applications may require specific types of storage resources such as SSD or HDD drives.
Moving on to PV (Persistent Volume) and PVC (Persistent Volume Claim), we can compare them to reserved spaces in the parking lot that have already been assigned but not yet occupied. PVs are storage resources that have been provisioned by the administrator, while PVCs are requests made by applications for specific storage resources from the available PV pool.
Understanding these storage terms in Kubevirt can be compared to finding a parking spot that fits your car’s needs in a crowded lot. By using this analogy, it becomes easier to grasp their purpose and how they work together to provide efficient storage.
What does each object keep?
- StorageClass (SC)
  - SC is like the parking office.
  - Keeps information like:
    - Type of provisioner (e.g., nfs.csi.k8s.io)
    - Parameters like size, mount options
    - Reclaim policy
    - Default parking behavior (e.g., automatic or manual allocation)
- PersistentVolumeClaim (PVC)
  - PVC is the application requesting a parking spot.
  - Holds:
    - Access mode (ReadWriteOnce, RWX)
    - Requested size (20Gi, etc.)
    - Name of the StorageClass (e.g., nfs-sc)
  - Think of it like: “I want a 20Gi, RWX spot from SC nfs-sc” (see the sketch after this list)
- DataVolume (DV)
  - DV wraps a PVC and adds:
    - Where to get the image from (HTTP URL, PVC clone, upload)
    - Whether to import or clone the image
  - Think of it as: “I want a spot, and please unpack this RHEL.qcow2 onto it”
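For reference, the PVC request described above would look roughly like this as a stand-alone manifest (a sketch for illustration only; in this post the PVC is generated automatically by the DataVolume in 04-datavolume.yaml, and the name below is hypothetical):

# Illustrative stand-alone PVC: "a 20Gi, RWX spot from SC nfs-sc"
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: parking-request          # hypothetical name, for illustration
spec:
  accessModes:
    - ReadWriteMany              # RWX
  resources:
    requests:
      storage: 20Gi
  storageClassName: nfs-sc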
Summary Flow
⮕ DV (says: “I need a volume and here’s an image to unpack”)
↓
⮕ PVC (created as part of DV, asks for space from SC)
↓
⮕ SC (provisions a PV based on its rules)
↓
⮕ PV (physical parking spot)
↓
⮕ CDI (downloads and installs image into PV)
↓
⮕ VM mounts the volume and boots.
📥 Does DV appear in PVC or SC?
- PVC is embedded inside DV (not vice versa)
- SC is referenced by name inside PVC
- DV does not modify SC
- DV → generates PVC → requests from SC → binds to PV
In my current lab environment, I’ve deployed a Kubernetes cluster comprising three virtual machines hosted on an ESXi server. The cluster architecture includes:
- One Master Node:
  - IP Address: 192.168.0.40
  - OS: RHEL 9.2
  - Kubernetes Version: v1.29
- Two Worker Nodes:
  - Worker 1: 192.168.0.41
  - Worker 2: 192.168.0.42
  - OS: RHEL 9.2
Additionally, an auxiliary NFS and web server is provisioned on a separate VM with the IP address 192.168.0.30. This node is responsible for serving persistent volumes and image artifacts as needed during the cluster and application deployments.
For the networking layer, I am leveraging Calico as the Container Network Interface (CNI) plugin to enable secure pod-to-pod communication with robust NetworkPolicy support.
I have uploaded all the VM manifests and scripts to GitHub at https://github.com/ranjeetbadhe/Kubevirt-VM-On-Kubernetes.git. Feel free to download and use them.
01-nfs-sc.yaml
02-pv-rocky.yaml
03-pv-scratch.yaml
04-datavolume.yaml
05-vm.yaml
scratch-pvc.yaml
vm-health-check.sh
Let’s have a look at each of the manifests and discuss their contents.
01-nfs-sc.yaml
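This StorageClass simply gives the statically provisioned NFS PersistentVolumes below a common class to bind against. The exact file is in the GitHub repo; the sketch below is an assumption of its minimal shape. Since the PVs here are created by hand (with explicit claimRefs), a no-provisioner class is sufficient, though a dynamic NFS CSI provisioner (e.g. nfs.csi.k8s.io) could be used instead.

# Minimal sketch (assumed): a class named nfs-sc for statically provisioned NFS PVs
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-sc
provisioner: kubernetes.io/no-provisioner   # PVs are pre-created manually below
reclaimPolicy: Retain
volumeBindingMode: Immediate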
02-pv-rocky.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: rocky-dv-pv
spec:
  capacity:
    storage: 20Gi
  accessModes:
    - ReadWriteMany
  nfs:
    path: /nfs/rocky
    server: 192.168.0.30
  persistentVolumeReclaimPolicy: Retain
  storageClassName: nfs-sc
  claimRef:
    namespace: default
    name: rocky-datavolume
03-pv-scratch.yaml
CDI requests a scratch volume as temporary working space while it downloads and converts the qcow2 image, so a matching PV is pre-created here.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: rocky-datavolume-scratch-pv
spec:
  capacity:
    storage: 20Gi
  accessModes:
    - ReadWriteMany
  nfs:
    path: /nfs/rocky-scratch
    server: 192.168.0.30
  persistentVolumeReclaimPolicy: Retain
  storageClassName: nfs-sc
  claimRef:
    name: rocky-datavolume-scratch
    namespace: default
04-datavolume.yaml
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: rocky-datavolume
spec:
  source:
    http:
      url: http://192.168.0.30/images/rocky.qcow2
  pvc:
    accessModes:
      - ReadWriteMany
    resources:
      requests:
        storage: 20Gi
    storageClassName: nfs-sc
05-vm.yaml
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: rocky-vm
spec:
  running: true
  template:
    metadata:
      labels:
        kubevirt.io/domain: rocky-vm
    spec:
      domain:
        cpu:
          cores: 2
        devices:
          disks:
            - name: rootdisk
              disk:
                bus: virtio
              bootOrder: 1
        resources:
          requests:
            memory: 2Gi
      volumes:
        - name: rootdisk
          dataVolume:
            name: rocky-datavolume
scratch-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rocky-datavolume-scratch
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 20Gi
  storageClassName: nfs-sc
  volumeName: rocky-datavolume-scratch-pv
Verification commands and scripts: I have created a shell script to verify the installation, and we will also use CLI commands.
[root@kubemaster] # cat vm-health-check.sh
#!/bin/bash
echo "🔍 Checking PVCs..."
kubectl get pvc rocky-datavolume rocky-datavolume-scratch -o wide
echo -e "\n🔍 Checking PersistentVolumes..."
kubectl get pv | grep 'rocky'
echo -e "\n🔍 Checking DataVolume status..."
kubectl get dv rocky-datavolume -o custom-columns=NAME:.metadata.name,PHASE:.status.phase
echo -e "\n🔍 Checking importer pod status..."
kubectl get pods | grep importer
echo -e "\n🔍 Checking VM status..."
kubectl get vm rocky-vm -o custom-columns=NAME:.metadata.name,STATUS:.status.printableStatus
echo -e "\n🔍 Checking VMI (VirtualMachineInstance)..."
kubectl get vmi rocky-vm -o wide
echo -e "\n🔍 Events related to rocky-datavolume..."
kubectl get events --sort-by=.metadata.creationTimestamp | grep rocky
echo -e "\n📡 Testing console connection (virtctl)..."
if command -v virtctl &> /dev/null; then
echo "Running: virtctl console rocky-vm (Press Ctrl+] to exit)"
virtctl console rocky-vm
else
echo "virtctl is not installed or not in PATH."
fi
On running the script:
[root@kubemaster]# sh vm-health-check.sh
🔍 Checking PVCs...
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE VOLUMEMODE
rocky-datavolume Bound rocky-dv-pv 20Gi RWX nfs-sc 3h6m Filesystem
rocky-datavolume-scratch Bound rocky-datavolume-scratch-pv 20Gi RWX nfs-sc 155m Filesystem
🔍 Checking PersistentVolumes...
rocky-datavolume-scratch-pv 20Gi RWX Retain Bound default/rocky-datavolume-scratch nfs-sc 171m
rocky-dv-pv 20Gi RWX Retain Bound default/rocky-datavolume nfs-sc 3h7m
🔍 Checking DataVolume status...
NAME PHASE
rocky-datavolume Succeeded
🔍 Checking importer pod status...
🔍 Checking VM status...
NAME STATUS
rocky-vm Running
🔍 Checking VMI (VirtualMachineInstance)...
NAME AGE PHASE IP NODENAME READY LIVE-MIGRATABLE PAUSED
rocky-vm 151m Running 10.244.127.81 kubeworker1.ranjeetbadhe.com True False
🔍 Events related to rocky-datavolume...
No resources found in default namespace.
📡 Testing console connection (virtctl)...
Running: virtctl console rocky-vm (Press Ctrl+] to exit)
Successfully connected to rocky-vm console. The escape sequence is ^]