vSphere CSI Driver - v1.0.2 release
New Feature
- Implement the GetVolumeStats CSI node capability (#108, @flofourcade)
Other Notable Changes
- Mitigate CVE-2019-11255 by bumping external-provisioner image version to v1.2.2. CVE announcement #98
- Remove vsphere credentials from csi node daemonset #96
- Support for running the CSI controller outside the controlled cluster. #102
- Hotfix compatibility issue on 7.0. Using the correct field name to govmomi. #127
- Invoke NodeRegister only when Provider Id change is observed. #132
- removing pending status check for the pod Delete #117
- Replace container images to match published manifests #105
- updating deployment yamls with v1.0.1 images #86
Deployment files
Kubernetes Release
- Minimum: 1.14
- Maximum: 1.16
Supported sidecar containers versions
- csi-provisioner - v1.2.2
- csi-attacher - v1.1.1
- livenessprob - v1.1.0
- csi-node-driver-registrar - v1.1.0
Known Issues
vSphere CSI Driver issues
v1.0.2
does not have capabilities to reuse long-running vCenter tasks. While creating volume if vCenter takes longer than 5 minutes, CSI Provisioner times out and sends another request to create a volume.- Impact: This results into Orphan volumes on the datastore.
- Workaround: use larger timeout value to reduce occurrence of orphan volumes if vCenter is taking long time to create volumes.
Volume never gets detached from the node if volume is being deleted before it is detached from Node VM.
- Reasons volume gets into this state are
- CSI provisioner is issuing delete call before volume is detached.
- vCenter API in
vSphere 67u3
is not marking volume back as Container Volume when Delete call fails while Volume is attached to the Node VM. This issue is fixed in the vSphere 7.0.- On vSphere 67u3, issue can be fixed by upgrading driver to v2.0.1 release.
- Impact: When Pod and PVC are deleted together upon deletion of namespace, we have a race to delete and detach volume. Due to this, Pod remains in the terminating state and PV remains in the released state.
- Upstream issue was tracked at: https://github.com/kubernetes/kubernetes/issues/84226
- Workaround:
- Delete the Pod with force:
kubectl delete pods <pod> --grace-period=0 --force
- Find VolumeAttachment for the volume that remained undeleted. Get Node from this VolumeAttachment.
- Manually detach the disk from the Node VM.
- Edit this VolumeAttachment and remove the finalizer. It will get deleted.
- Use
govc
to manually delete the FCD. - Edit PV and remove the finalizer. It will get deleted.
- Delete the Pod with force:
- Reasons volume gets into this state are
Metadata syncer container deletes the volume physically from the datastore when Persistent Volumes with
Bound
status and reclaim policyDelete
is deleted by the user whenStorageObjectInUseProtection
is disabled on Kubernetes Cluster.- Impact: Persistent Volumes Claim goes in the lost status. Volume can not be recovered.
- Workaround: Do not disable
StorageObjectInUseProtection
and attempt to delete Persistent Volume directly without deleting PVC.
- Deployment yaml uses
hostPath
volume in the CSI driver deployment for unix domain socket path.- Impact: when the controller Pod does not have access to the file system on the node VM, driver fails to create socket file and thus does not come up.
- Workaround: use
emptydir
volume instead ofhostPath
volume.
- Deleting PV before deleting PVC, leaves orphan volume on the datastore.
- Impact: Orphan volumes remain on the datastore, and admin needs to delete those volumes manually using
govc
command. - Upstream issue is tracked at: https://github.com/kubernetes-csi/external-provisioner/issues/546
- Workaround:
- No workaround. User should not attempt to delete PV which is bound to PVC. User should only delete a PV if they know that the underlying volume in the storage system is gone.
- If user has accidentally left orphan volumes on the datastore by not following the guideline, and if user has captured the volume handles or First Class Disk IDs of deleted PVs, storage admin can help delete those volumes using
govc disk.rm <volume-handle/FCD ID>
command.
- Impact: Orphan volumes remain on the datastore, and admin needs to delete those volumes manually using