vSphere CSI Driver - v2.1.0 release
New Feature
- CSI Migration for in-tree vSphere volumes on vSphere 7.0 Update 1. This feature requires upgrading Kubernetes to the 1.19 release.
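In Kubernetes 1.19, migration of in-tree vSphere volumes is controlled by the CSIMigration and CSIMigrationvSphere feature gates. A minimal sketch of enabling them, assuming you manage the kube-controller-manager and kubelet flags directly (other flags omitted):

```shell
# Both kube-controller-manager and kubelet need these feature gates enabled.
kube-controller-manager --feature-gates=CSIMigration=true,CSIMigrationvSphere=true ...
kubelet --feature-gates=CSIMigration=true,CSIMigrationvSphere=true ...
```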
Deployment files
Kubernetes Release
- Minimum: 1.17
- Maximum: 1.19
Note: For the vSphere CSI Migration feature, the minimum Kubernetes version requirement is v1.19.0.
Supported sidecar containers versions
- csi-provisioner - v2.0.0
- csi-attacher - v3.0.0
- csi-resizer - v1.0.0
- livenessprobe - v2.1.0
- csi-node-driver-registrar - v2.0.1
Known Issues
vSphere CSI Driver issues
- When a static persistent volume with a file-share is re-created with the same PV name, the volume does not get registered as a container volume with vSphere.
- Impact: attach/delete cannot be performed on such a Persistent Volume.
- Workaround: Wait for 1 hour before re-creating the static file-share persistent volume with the same name. Note: This issue is fixed in v2.2.0.
- Metadata syncer container deletes the volume physically from the datastore when a Persistent Volume with Bound status and reclaim policy Delete is deleted by the user while StorageObjectInUseProtection is disabled on the Kubernetes cluster.
- Impact: The Persistent Volume Claim goes into the Lost status. The volume cannot be recovered.
- Workaround: Do not disable StorageObjectInUseProtection, and do not attempt to delete a Persistent Volume directly without deleting its PVC. Note: This issue is fixed in v2.2.0.
- Migrated in-tree vSphere volumes deleted by the in-tree vSphere plugin remain on the CNS UI.
- Impact: Migrated in-tree vSphere volumes deleted by the in-tree vSphere plugin remain visible on the CNS UI.
- Workaround: The admin needs to manually reconcile discrepancies in the Managed Virtual Disk Catalog by following this KB article: https://kb.vmware.com/s/article/2147750
- Volume expansion might fail when it is invoked at the same time as pod creation.
- Impact: Users can resize a PVC and create a pod using that PVC simultaneously. In this case, pod creation might complete first, using the PVC with its original size. Volume expansion will then fail because online resize is not supported in vSphere 7.0 Update 1.
- Workaround: Wait for the PVC to reach the FilesystemResizePending condition before attaching a pod to it (see the sketch below).
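One way to check, with a placeholder PVC name, is to inspect the PVC's status conditions:

```shell
# Inspect the PVC's status conditions; wait until FilesystemResizePending appears
# before creating a pod that uses this PVC.
kubectl get pvc my-pvc -o jsonpath='{.status.conditions}'
```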
- Deleting a PV before deleting its PVC leaves an orphan volume on the datastore.
- Upstream issue is tracked at: https://github.com/kubernetes-csi/external-provisioner/issues/546
- Impact: Orphan volumes remain on the datastore, and the admin needs to delete those volumes manually using the govc command.
- Workaround:
  - No workaround. Users should not attempt to delete a PV that is bound to a PVC. A PV should only be deleted when it is known that the underlying volume in the storage system is gone.
  - If orphan volumes have accidentally been left on the datastore by not following this guideline, and the volume handles or First Class Disk (FCD) IDs of the deleted PVs were captured, the storage admin can delete those volumes using the govc disk.rm <volume-handle/FCD ID> command, as sketched below.
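A hypothetical govc session for this cleanup, with a placeholder datastore name and FCD ID, might look like this:

```shell
# Assumes govc is already configured via GOVC_URL / GOVC_USERNAME / GOVC_PASSWORD.

# List First Class Disks (FCDs) on the datastore to locate the orphan volume
govc disk.ls -ds vsanDatastore

# Delete the orphan volume by its volume handle / FCD ID
govc disk.rm -ds vsanDatastore 8c9d0e1f-aaaa-bbbb-cccc-0123456789ab
```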
- When the in-tree vSphere plugin is configured to use the default datastore in the format default-datastore: /<datacenter-name>/datastore/<default-datastore-name>, migration of the volume will fail as mentioned here: https://github.com/kubernetes-sigs/vsphere-csi-driver/issues/628. If the default datastore is configured in the format <default-datastore-name>, this issue is not seen (see the illustration below).
- Impact: In-tree vSphere volumes will not get migrated successfully.
- Workaround: This issue is fixed in v2.2.0. Consider upgrading the driver to v2.2.0.
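For illustration, the default-datastore setting lives in the in-tree provider configuration; the path and datastore names below are placeholders:

```shell
# Failing format (datastore path):  default-datastore = "/MyDC/datastore/vsanDatastore"
# Working format (datastore name):  default-datastore = "vsanDatastore"
grep default-datastore /etc/kubernetes/vsphere.conf
```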
- When a Pod is rescheduled to a new node, there may be some lock contention which causes a delay in the volume getting detached from the old node and attached to the new node.
- Impact: Rescheduled Pods remain in Pending state for a varying amount of time.
- Workaround: This issue is fixed in v2.1.1. Consider upgrading the driver to v2.1.1.
- When a pod using a PVC is rescheduled to another node while the metadata syncer is down, fullsync might fail with the error Duplicated entity for each entity type in one cluster is found.
- Impact: CNS may hold stale volume metadata.
- Workaround: This issue is fixed in v2.2.0. Consider upgrading driver to v2.2.0.
- If there are no datastores present in any one of the datacenters in your vSphere environment, CSI file volume provisioning fails with the error failed to get all the datastores.
- Impact: File volume provisioning keeps failing.
- Workaround: Either remove the ReadOnly privilege on this datacenter for the user listed in the vsphere-config-secret secret, or add a datastore to this datacenter. Refer to the vsphere-roles-and-privileges section in the prerequisites page to change the permissions on this datacenter. Note: This issue is fixed in v2.2.0.
- vSAN file share volumes (ReadWriteMany/ReadOnlyMany vSphere CSI volumes) become inaccessible after the vSphere CSI driver Node DaemonSet Pod restarts.
- Impact: Pods cannot read/write data on vSAN file share volumes.
- Workaround: Remount the vSAN file services volumes used by the application pods at the same location, directly from the Node VM guest OS. Detailed steps are available in the vSphere CSI driver documentation; a rough sketch follows.
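The sketch below assumes an NFSv4.1 vSAN file share and uses placeholder values for the server address, share path, pod UID, and PV name:

```shell
# Run on the node VM guest OS; all values are placeholders.
# The mount target is the same path the CSI driver originally used for the volume.
mount -t nfs -o vers=4.1 10.20.30.40:/share-name \
  /var/lib/kubelet/pods/<pod-uid>/volumes/kubernetes.io~csi/<pv-name>/mount
```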
- A user and ca-file change in the vsphere-config-secret is not honored until the vSphere CSI Controller Pod restarts.
- Impact: Volume lifecycle operations fail until the controller pod restarts.
- Workaround: Restart the vsphere-csi-controller deployment Pod, for example as shown below.
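For example, assuming the driver is deployed in the kube-system namespace as in the v2.1.0 deployment manifests:

```shell
# Restart the controller so that the updated vsphere-config-secret is re-read.
kubectl rollout restart deployment vsphere-csi-controller -n kube-system
```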
Kubernetes issues
- Filesystem resize is skipped if the original PVC is deleted when FilesystemResizePending condition is still on the PVC, but PV and its associated volume on the storage system are not deleted due to the Retain policy.
- Issue: https://github.com/kubernetes/kubernetes/issues/88683
- Impact: User may create a new PVC to statically bind to the undeleted PV. In this case, the volume on the storage system is resized but the filesystem is not resized accordingly. User may try to write to the volume whose filesystem is out of capacity.
- Workaround: The user can log into the container to manually resize the filesystem, for example as sketched below.
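As an illustration (the pod name, mount path, and device are placeholders; the container needs the resize tooling and sufficient privileges, and xfs_growfs would be used instead for XFS):

```shell
# Identify the block device backing the volume inside the container
kubectl exec -it my-app-pod -- df -h /data

# Grow an ext4 filesystem to match the expanded block device
kubectl exec -it my-app-pod -- resize2fs /dev/sdb
```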
- Volume associated with a StatefulSet cannot be resized
- Issue: https://github.com/kubernetes/enhancements/pull/660
- Impact: User cannot resize volume in a StatefulSet.
- Workaround: If the StatefulSet is not managed by an operator, there is a slightly risky workaround which users can apply at their own discretion depending on their use case. Please refer to https://serverfault.com/questions/955293/how-to-increase-disk-size-in-a-stateful-set for more details; a rough sketch follows.
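A rough sketch of that approach, with placeholder names, assuming the StorageClass allows volume expansion:

```shell
# 1. Expand each PVC owned by the StatefulSet directly
kubectl patch pvc data-my-sts-0 -p '{"spec":{"resources":{"requests":{"storage":"20Gi"}}}}'

# 2. Delete the StatefulSet object but keep its Pods running
#    (use --cascade=orphan with kubectl 1.20 and later)
kubectl delete statefulset my-sts --cascade=false

# 3. Re-apply the StatefulSet manifest with the updated storage request
kubectl apply -f my-sts.yaml
```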
- Recover from volume expansion failure.
- Impact: If a user tries to expand a PVC to a size which may not be supported by the underlying storage system, volume expansion will keep failing and there is no way to recover.
- Issue: https://github.com/kubernetes/enhancements/pull/1516
- Workaround: None
vSphere issues
- CNS file volume has a limitation of 8K for metadata.
- Impact: It is quite possible that not all the metadata can be pushed to the CNS file share, since a maximum of 64 clients per file volume needs to be supported.
- Workaround: None, this is a vSphere limitation.