The purpose of the upcoming posts is to discuss Day 2 operations. Assuming that you have a running cluster, we will explore additional capabilities that can be added to ensure smooth operation and management. In particular, this post will focus on CSI and Storage within vSphere (and Windows), although most of the process can be applied to other clouds.
The Container Storage Interface (CSI) reached general availability in the Kubernetes v1.13 release and replaces the legacy “in-tree” volume plugins that lived in the k/k codebase. With the old architecture, vendors were required to maintain their plugins through the entire Kubernetes release lifecycle. CSI provides a standard interface pattern (alongside CNI, CRI, etc.) that allows third-party plugins to live outside the main Kubernetes repository. The goal of CSI is to provide a standard way to expose arbitrary block and file storage to containers, so storage vendors can develop and ship their solutions independently.
A CSI driver is required for the cluster to use the underlying infrastructure resources. The vSphere CSI driver is a plugin that lives outside the Kubernetes codebase and allows containerized workloads to access vSphere storage. It supports different types of storage, including vSAN. For all storage provisioning operations, the vSphere CSI driver communicates with the CNS control plane in vCenter Server. On Kubernetes, the CSI driver is used together with the vSphere CPI (Cloud Provider Interface). The CSI driver is shipped as a container image and must be deployed in the cluster.
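Once the driver is deployed, a quick sanity check (a hedged sketch; the CSIDriver object name matches the provisioner used later in this post, and the namespace assumes the default vanilla deployment) is to confirm the driver object, the node registrations, and the driver Pods:
$ kubectl get csidriver csi.vsphere.vmware.com
$ kubectl get csinodes
$ kubectl get pods -n vmware-system-csi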
For this post, it is important to understand the Cloud Native Storage (CNS) server component on vSphere, the CNS control plane that runs inside vCenter Server, while the focus stays on the CSI and Kubernetes layers. The CNS control plane is an extension of vCenter Server management that implements provisioning and lifecycle operations for container volumes. When provisioning container volumes, it interacts with vCenter Server to create the storage objects that back the volumes. Storage Policy-Based Management (SPBM) guarantees the required level of service for the volumes and provides health monitoring of the backing storage objects.
From a vSphere vSAN datastore, we can inspect the attributes of the datastore that will back our RWO (ReadWriteOnce, per-Pod mount) volumes:
$ govc datastore.info
Name:        vsanDatastore
  Path:      /dc0/datastore/vsanDatastore
  Type:      vsan
  URL:       ds:///vmfs/volumes/vsan:5253520081e7f5cb-482ce3096504d5fd/
  Capacity:  1024.0 GB
  Free:      1009.4 GB
A StorageClass sets the parameters for a particular type or class of storage that can be used to dynamically provision PersistentVolumes. StorageClasses are not namespaced, and their name in etcd is determined by ObjectMeta.Name. Kubernetes does not ship with a default StorageClass, so one must be created. On vSphere, the storagePolicyName parameter is particularly useful: it attaches a Storage Policy Based Management (SPBM) policy to the provisioned volumes, giving finer-grained control over container volume placement and service levels.
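As a hedged example, a StorageClass can reference an SPBM policy by name instead of pinning a specific datastore URL (the policy name below is an assumption; substitute one defined in vCenter):
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: vsan-policy-sc
provisioner: csi.vsphere.vmware.com
parameters:
  storagepolicyname: "vSAN Default Storage Policy"  # SPBM policy defined in vCenter (assumed name)
  csi.storage.k8s.io/fstype: ext4
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer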
The StorageClass used in this post, from kubectl get sc vsan-sc -o yaml:
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: vsan-sc
provisioner: csi.vsphere.vmware.com
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
parameters:
  csi.storage.k8s.io/fstype: ext4
  datastoreurl: ds:///vmfs/volumes/vsan:5253520081e7f5cb-482ce3096504d5fd/
A PersistentVolumeClaim (PVC) is a user's request for storage, claiming a PersistentVolume in the cluster for a specific Pod. It works like a Pod: while Pods consume node resources, PVCs consume PV resources. Pods can request specific levels of resources such as CPU and memory; PVCs can request a specific size and access modes. The access modes determine how the volume can be mounted: ReadWriteOnce (RWO), ReadOnlyMany (ROX), ReadWriteMany (RWX), and ReadWriteOncePod (RWOP). The following claim requests a 15Gi raw block volume from the vsan-sc StorageClass:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rwo-pvc
spec:
  volumeMode: Block
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 15Gi
  storageClassName: vsan-sc
Create the Pod with a volume that points to the PVC so the request can be fulfilled:
apiVersion: v1
kind: Pod
metadata:
  name: pod
spec:
  containers:
    - name: test-container
      image: gcr.io/google_containers/busybox:1.24
      command: ["/bin/sh", "-c", "while true ; do sleep 2 ; done"]
      volumeDevices:
        - devicePath: /dev/xvda
          name: data
  restartPolicy: Never
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: rwo-pvc
Check the PVC status; it must be Bound:
$ kubectl get pvc
NAME      STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
rwo-pvc   Bound    pvc-35e6c773-2b21-4408-a046-33f27b7238cb   15Gi       RWO            vsan-sc        110s
The CNS view on vSphere shows the container volumes currently provisioned on the datastore.
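The CNS volumes can also be listed from the CLI with govc (assuming a recent govc build that includes the CNS volume commands):
$ govc volume.ls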
A PersistentVolume (PV) is a storage unit in the cluster that is provided by an administrator or dynamically provisioned through StorageClasses. It’s a resource in the cluster similar to a node. PVs are volume plugins like Volumes but have a lifecycle that isn’t dependent on any individual Pod that uses the PV. This API object captures the storage implementation details, such as NFS, iSCSI, or a cloud-provider-specific storage system.
Persistent Volumes (PVs) can be provisioned in two ways: static and dynamic. In the static approach, the admin manually creates a fixed number of PVs, which are then available for consumption. In the dynamic approach, the cluster may try to provision a volume dynamically for a PersistentVolumeClaim (PVC); this only happens if the PVC requests a StorageClass and the admin has previously created and configured that class for dynamic provisioning. On vSphere, persistent volumes map to VMDKs on the datastore, managed as First Class Disks (FCD), also known as Improved Virtual Disks (IVD).
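For the static approach, a minimal hedged sketch of a manually created PV is shown below; the volumeHandle is a placeholder and must reference a disk that already exists on the datastore:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: static-vsphere-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: vsan-sc
  csi:
    driver: csi.vsphere.vmware.com
    volumeHandle: <existing-fcd-id>  # placeholder: ID of a pre-existing First Class Disk
    fsType: ext4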
With dynamic provisioning, the PV object is created automatically after the PVC is created:
$ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM             STORAGECLASS   REASON   AGE
pvc-35e6c773-2b21-4408-a046-33f27b7238cb   15Gi       RWO            Delete           Bound    default/rwo-pvc   vsan-sc                 114s
The PV will be something like:
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/provisioned-by: csi.vsphere.vmware.com
    volume.kubernetes.io/provisioner-deletion-secret-name: ""
    volume.kubernetes.io/provisioner-deletion-secret-namespace: ""
  finalizers:
    - kubernetes.io/pv-protection
    - external-attacher/csi-vsphere-vmware-com
  name: pvc-35e6c773-2b21-4408-a046-33f27b7238cb
spec:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 15Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: rwo-pvc
    namespace: default
  csi:
    driver: csi.vsphere.vmware.com
    volumeAttributes:
      storage.kubernetes.io/csiProvisionerIdentity: 1706894229053-8081-csi.vsphere.vmware.com
      type: vSphere CNS Block Volume
    volumeHandle: c6dd8286-e961-4bc5-bccb-21af5d454306
  persistentVolumeReclaimPolicy: Delete
  storageClassName: vsan-sc
  volumeMode: Block
status:
  phase: Bound
In the controller Pod specification, sidecar containers are responsible for managing the volumes in CNS through the ControllerPublish and ControllerUnpublish functions of the CSI driver. On each node there is a local host driver and a driver registrar.
The vSphere Container Storage Controller provides an interface for the container orchestrator to manage the lifecycle of vSphere volumes: it allows the creation, expansion, and deletion of volumes, as well as attaching and detaching volumes to Node VMs. The node component of the vSphere Container Storage driver formats and mounts volumes to the node and uses bind mounts to expose the volumes inside the Pod; before a volume is detached, the node component unmounts it from the node. Finally, the Syncer pushes PV, PVC, and Pod metadata to CNS. This data helps vSphere administrators determine which Kubernetes clusters, apps, Pods, etc. are using a volume. Full sync keeps CNS up to date with the Kubernetes volume metadata.
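To see which sidecars run alongside the driver in the controller Pod, inspect the controller Deployment (the Deployment name below assumes the default vanilla manifest):
$ kubectl -n vmware-system-csi get deployment vsphere-csi-controller \
    -o jsonpath='{.spec.template.spec.containers[*].name}'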
Instructions for installing the drivers on your Kubernetes cluster are provided on the official documentation page.
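As a rough sketch of the documented flow (the driver version and manifest URL below are assumptions; take the exact steps and the csi-vsphere.conf format from the docs), the install boils down to creating the namespace, creating the vSphere configuration secret, and applying the driver manifest:
$ kubectl create namespace vmware-system-csi
$ kubectl create secret generic vsphere-config-secret \
    --from-file=csi-vsphere.conf -n vmware-system-csi
$ kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/vsphere-csi-driver/v3.0.1/manifests/vanilla/vsphere-csi-driver.yaml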
The Windows installation has a few subtle distinctions (as usual). After installing the CSI driver on Linux and letting the core components run, the next step is to install csi-proxy on Windows. A prebuilt binary is not available, so it must be built first and copied to the Windows node:
$ git clone https://github.com/kubernetes-csi/csi-proxy
$ sudo make build
$ ls bin/
csi-proxy.exe
To install the binary as a Windows service, register it with sc.exe (in this example csi-proxy.exe is placed at C:\etc\kubernetes\node\bin\csi-proxy.exe):
$flags = "-windows-service -log_file=C:\etc\kubernetes\logs\csi-proxy.log -logtostderr=false"
sc.exe create csiproxy start= "auto" binPath= "C:\etc\kubernetes\node\bin\csi-proxy.exe $flags"
sc.exe failure csiproxy reset= 0 actions= restart/10000
sc.exe start csiproxy
The CSI proxy is a binary that exposes gRPC APIs for local storage operations on Windows (a few API groups such as Disk, Volume, SMB, and Filesystem are available). The CSI driver installed by the DaemonSet accesses the proxy's named pipes and invokes these operations.
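A quick way to confirm the proxy is healthy on the Windows node (the log path matches the service flags used above):
PS C:\> sc.exe query csiproxy
PS C:\> Get-Content C:\etc\kubernetes\logs\csi-proxy.log -Tail 20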
After the CSI proxy installation, apply the vSphere CSI Node daemonset for the Windows nodes:
---
kind: DaemonSet
apiVersion: apps/v1
metadata:
  name: vsphere-csi-node-windows
  namespace: vmware-system-csi
spec:
  selector:
    matchLabels:
      app: vsphere-csi-node-windows
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  template:
    metadata:
      labels:
        app: vsphere-csi-node-windows
        role: vsphere-csi-windows
    spec:
      priorityClassName: system-node-critical
      nodeSelector:
        kubernetes.io/os: windows
      serviceAccountName: vsphere-csi-node
      containers:
        - name: node-driver-registrar
          image: registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.8.0
          args:
            - "--v=5"
            - "--csi-address=$(ADDRESS)"
            - "--kubelet-registration-path=$(DRIVER_REG_SOCK_PATH)"
          env:
            - name: ADDRESS
              value: 'unix://C:\\csi\\csi.sock'
            - name: DRIVER_REG_SOCK_PATH
              value: 'C:\\var\\lib\\kubelet\\plugins\\csi.vsphere.vmware.com\\csi.sock'
          volumeMounts:
            - name: plugin-dir
              mountPath: /csi
            - name: registration-dir
              mountPath: /registration
          livenessProbe:
            exec:
              command:
                - /csi-node-driver-registrar.exe
                - --kubelet-registration-path=C:\\var\\lib\\kubelet\\plugins\\csi.vsphere.vmware.com\\csi.sock
                - --mode=kubelet-registration-probe
            initialDelaySeconds: 3
        - name: vsphere-csi-node
          image: gcr.io/cloud-provider-vsphere/csi/release/driver:v3.0.1
          args:
            - "--fss-name=internal-feature-states.csi.vsphere.vmware.com"
            - "--fss-namespace=$(CSI_NAMESPACE)"
          imagePullPolicy: "Always"
          env:
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: spec.nodeName
            - name: CSI_ENDPOINT
              value: 'unix://C:\\csi\\csi.sock'
            - name: MAX_VOLUMES_PER_NODE
              value: "59" # Maximum number of volumes that controller can publish to the node. If value is not set or zero Kubernetes decide how many volumes can be published by the controller to the node.
            - name: X_CSI_MODE
              value: node
            - name: X_CSI_SPEC_REQ_VALIDATION
              value: 'false'
            - name: X_CSI_SPEC_DISABLE_LEN_CHECK
              value: "true"
            - name: LOGGER_LEVEL
              value: "PRODUCTION" # Options: DEVELOPMENT, PRODUCTION
            - name: X_CSI_LOG_LEVEL
              value: DEBUG
            - name: CSI_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: NODEGETINFO_WATCH_TIMEOUT_MINUTES
              value: "1"
          volumeMounts:
            - name: plugin-dir
              mountPath: 'C:\csi'
            - name: pods-mount-dir
              mountPath: 'C:\var\lib\kubelet'
            - name: csi-proxy-volume-v1
              mountPath: \\.\pipe\csi-proxy-volume-v1
            - name: csi-proxy-filesystem-v1
              mountPath: \\.\pipe\csi-proxy-filesystem-v1
            - name: csi-proxy-disk-v1
              mountPath: \\.\pipe\csi-proxy-disk-v1
            - name: csi-proxy-system-v1alpha1
              mountPath: \\.\pipe\csi-proxy-system-v1alpha1
          ports:
            - name: healthz
              containerPort: 9808
              protocol: TCP
          livenessProbe:
            httpGet:
              path: /healthz
              port: healthz
            initialDelaySeconds: 10
            timeoutSeconds: 5
            periodSeconds: 5
            failureThreshold: 3
        - name: liveness-probe
          image: registry.k8s.io/sig-storage/livenessprobe:v2.10.0
          args:
            - "--v=4"
            - "--csi-address=/csi/csi.sock"
          volumeMounts:
            - name: plugin-dir
              mountPath: /csi
      volumes:
        - name: registration-dir
          hostPath:
            path: 'C:\var\lib\kubelet\plugins_registry\'
            type: Directory
        - name: plugin-dir
          hostPath:
            path: 'C:\var\lib\kubelet\plugins\csi.vsphere.vmware.com\'
            type: DirectoryOrCreate
        - name: pods-mount-dir
          hostPath:
            path: \var\lib\kubelet
            type: Directory
        - name: csi-proxy-disk-v1
          hostPath:
            path: \\.\pipe\csi-proxy-disk-v1
            type: ''
        - name: csi-proxy-volume-v1
          hostPath:
            path: \\.\pipe\csi-proxy-volume-v1
            type: ''
        - name: csi-proxy-filesystem-v1
          hostPath:
            path: \\.\pipe\csi-proxy-filesystem-v1
            type: ''
        - name: csi-proxy-system-v1alpha1
          hostPath:
            path: \\.\pipe\csi-proxy-system-v1alpha1
            type: ''
      tolerations:
        - effect: NoExecute
          operator: Exists
        - effect: NoSchedule
          operator: Exists
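Once the DaemonSet is applied, confirm that the Windows node Pods are running and that the driver registered on each Windows node:
$ kubectl -n vmware-system-csi get pods -l app=vsphere-csi-node-windows -o wide
$ kubectl get csinode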
If, after installing the CSI driver, the gRPC socket keeps getting reset while the node-driver-registrar registers the kubelet plugin, check the node driver logs for this error:
2024-02-04T05:40:54.2393408-08:00 stderr F {"level":"error","time":"2024-02-04T05:40:54.238813-08:00","caller":"osutils/windows_os_utils.go:509","msg":"csi plugin started on windows node without enabling feature switch","TraceId":"a719f958-b149-4d92-b432-8dc8cfb4b4a4","stacktrace":"sigs.k8s.io/vsphere-csi-driver/v3/pkg/csi/service/osutils.(*OsUtils).ShouldContinue\n\t/build/pkg/csi/service/osutils/windows_os_utils.go:509\nsigs.k8s.io/vsphere-csi-driver/v3/pkg/csi/service.(*vsphereCSIDriver).NodeGetInfo\n\t/build/pkg/csi/service/node.go:340\ngithub.com/container-storage-interface/spec/lib/go/csi._Node_NodeGetInfo_Handler\n\t/go/pkg/mod/github.com/container-storage-interface/spec@v1.7.0/lib/go/csi/csi.pb.go:6231\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/go/pkg/mod/google.golang.org/grpc@v1.47.0/server.go:1283\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/go/pkg/mod/google.golang.org/grpc@v1.47.0/server.go:1620\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.2\n\t/go/pkg/mod/google.golang.org/grpc@v1.47.0/server.go:922"}
The Windows feature gate must be enabled in the CSI configuration ConfigMap internal-feature-states.csi.vsphere.vmware.com:
$ kubectl edit configmap -n vmware-system-csi internal-feature-states.csi.vsphere.vmware.com
data:
  csi-windows-support: "true"
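Depending on the driver version, the node Pods may need to be recreated to pick up the new feature state (this is an assumption; verify against your version):
$ kubectl -n vmware-system-csi rollout restart daemonset vsphere-csi-node-windows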
For SMB installation and usage, check Jaime Gonzales' documentation and nodes repository, a great source of Windows content!
The first open source tool around for migrating PVCs (and not related to vSphere) is pv-migrate; the usage is as simple as:
pv-migrate migrate \
--source-namespace default \
--dest-namespace backup \
old-pvc new-pvc
The official solution is to use Volume Snapshots; install the snapshot components with bash deploy-csi-snapshot-components.sh. After the CRDs are installed, it is possible to back up the PVC; first create a VolumeSnapshotClass (for example vs-class).
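A minimal VolumeSnapshotClass sketch (the deletionPolicy value is an assumption; adjust it to your retention needs):
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: vs-class
driver: csi.vsphere.vmware.com
deletionPolicy: Delete
With the class in place, create a VolumeSnapshot pointing at the source PVC: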
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  finalizers:
    - snapshot.storage.kubernetes.io/volumesnapshot-as-source-protection
    - snapshot.storage.kubernetes.io/volumesnapshot-bound-protection
  name: pvc-snapshot
  namespace: default
spec:
  source:
    persistentVolumeClaimName: windows-pvc
  volumeSnapshotClassName: vs-class
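Before restoring, confirm the snapshot is ready to use:
$ kubectl get volumesnapshot pvc-snapshot -n default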
The restore is simple: create a PVC and point its dataSource to the VolumeSnapshot just created:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-restore
spec:
  storageClassName: windows
  dataSource:
    name: pvc-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
You will end up with another PVC with the same configuration (and content) as the original:
kubo@MJvjY0ETPFqph:~$ kubectl get pvc
NAME          STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
windows-pvc   Bound    pvc-2916cdff-f5ca-4d0b-ae97-1abe9ae83f0f   5Gi        RWO            windows        85m
pvc-restore   Bound    pvc-560324d0-719a-4cb0-bc7c-5181d544bb8f   5Gi        RWO            windows        5s