Skip to content

Conversation

@phoenix-bjoern
Copy link

@phoenix-bjoern phoenix-bjoern commented Dec 7, 2023

What type of PR is this?
/kind bug

What this PR does / why we need it:
This PR replaces the Update calls for VolumeSnapshot and VolumeSnapshotContent with JSON patch calls. Addresses the "object has been modified" issue we see a lot in the snapshot-controller/snapshotter.

Which issue(s) this PR fixes:
Partial fix (only for Update() calls) #748

Special notes for your reviewer:
N/A

Does this PR introduce a user-facing change?:

Cherry-pick from PR 876: Update VolumeSnapshot and VolumeSnapshotContent using JSON patch 
fix unit tests operate on cloned copy revert to update call for removeSnapshotFinalizer fix content delete finalizer
@k8s-ci-robot k8s-ci-robot added the do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. label Dec 7, 2023
@k8s-ci-robot
Copy link
Contributor

Welcome @phoenix-bjoern!

It looks like this is your first PR to kubernetes-csi/external-snapshotter 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-csi/external-snapshotter has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Dec 7, 2023
@k8s-ci-robot
Copy link
Contributor

Hi @phoenix-bjoern. Thanks for your PR.

I'm waiting for a kubernetes-csi member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Dec 7, 2023
Copy link
Contributor

@shubham-pampattiwar shubham-pampattiwar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you !

@xing-yang
Copy link
Collaborator

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Dec 8, 2023
@xing-yang
Copy link
Collaborator

Please add a release note.

@phoenix-bjoern
Copy link
Author

phoenix-bjoern commented Dec 9, 2023

Compiled a release note draft from the original PR of @shubham-pampattiwar:

Replace update calls for VolumeSnapshot and VolumeSnapshotContent with JSON patches. Addresses the "object has been modified" issue during snapshots. Partial fix for #748.

@xing-yang is the draft okay? Shall I commit it somewhere?

@xing-yang
Copy link
Collaborator

@phoenix-bjoern The PR description section of the original PR has a release note: #876. You can add the same release note in your PR description and also mention that this is a cherry-pick of 876.

@shubham-pampattiwar
Copy link
Contributor

@phoenix-bjoern please update the PR description as suggested by @xing-yang

@phoenix-bjoern
Copy link
Author

@shubham-pampattiwar @xing-yang I've copied the issue description from the original PR as requested.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Dec 13, 2023
@xing-yang
Copy link
Collaborator

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 13, 2023
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: phoenix-bjoern, shubham-pampattiwar, xing-yang

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 13, 2023
@k8s-ci-robot k8s-ci-robot merged commit 23208ac into kubernetes-csi:release-6.3 Dec 13, 2023
@phoenix-bjoern phoenix-bjoern deleted the backport-issue-748 branch December 13, 2023 04:40
@pbozeman
Copy link

FYI, I was having very frequent snapshot errors with velero and longhorn on k3s. I build a custom v6.3.3 of the external-snapshotter with this fix and my errors appear to have been resolved. I look forward to having this in an officially released image. Thank you folks :)

@xing-yang
Copy link
Collaborator

FYI, I was having very frequent snapshot errors with velero and longhorn on k3s. I build a custom v6.3.3 of the external-snapshotter with this fix and my errors appear to have been resolved. I look forward to having this in an officially released image. Thank you folks :)

We do plan to cut a patch release for 6.3.x. We are still waiting for another important fix before doing that.

@Hamza-Aziz
Copy link

@pbozeman I have the same stack as yours and having this issue, could you please describe how did you fix it ? As far as I know, the latest longhorn version 1.5.3 uses the CSI snapshotter v6.2.1, so I'm imaging I can't use the CSI snapshotter v6.3.3

@pbozeman
Copy link

pbozeman commented Jan 25, 2024

@Hamza-Aziz: I cloned the external-shaphotter repo and made sure I was on a branch with this commit, then tagged it with 6.3.3. I built the repo, pushed it to a private docker registry. That part may not be necessary anymore if the k8s-csi folks have pushed this release to their repository. From there, add the following to your longhorn values.yaml (or however you are configuring it during deployment):

image: csi: snapshotter: repository: <repo>/csi-snapshotter tag: v6.3.3 

I was initially worried about mixing and matching components/images like this from different versions, but it seems fine.

@Hamza-Aziz
Copy link

@phoenix-bjoern Thank you, yes it's released you can use now

image: csi: snapshotter: repository: registry.k8s.io/sig-storage/csi-snapshotter tag: v6.3.3 
@bernardgut
Copy link

Hello
I am still seeing this issue on v7.0.2 . These are the logs right after creating a volumesnapshot object on a running workload:

I0526 19:26:32.830256 1 snapshot_controller.go:660] createSnapshotContent: Creating content for snapshot www/zfspv-snap-test through the plugin ... E0526 19:26:32.842133 1 snapshot_controller.go:938] cannot add finalizer on claim [www/wordpress] for snapshot [www/zfspv-snap-test]: [Operation cannot be fulfilled on persistentvolumeclaims "wordpress": the object has been modified; please apply your changes to the latest version and try again] E0526 19:26:32.842170 1 snapshot_controller.go:666] createSnapshotContent failed to add finalizer for source of snapshot snapshot controller failed to update wordpress on API server: Operation cannot be fulfilled on persistentvolumeclaims "wordpress": the object has been modified; please apply your changes to the latest version and try again E0526 19:26:32.846645 1 snapshot_controller_base.go:470] could not sync snapshot "www/zfspv-snap-test": snapshot controller failed to update wordpress on API server: Operation cannot be fulfilled on persistentvolumeclaims "wordpress": the object has been modified; please apply your changes to the latest version and try again I0526 19:26:32.846759 1 snapshot_controller.go:660] createSnapshotContent: Creating content for snapshot www/zfspv-snap-test through the plugin ... I0526 19:26:32.846753 1 event.go:364] Event(v1.ObjectReference{Kind:"VolumeSnapshot", Namespace:"", Name:"", UID:"", APIVersion:"snapshot.storage.k8s.io/v1", ResourceVersion:"", FieldPath:""}): type: 'Warning' reason: 'SnapshotContentCreationFailed' Failed to create snapshot content with error snapshot controller failed to update wordpress on API server: Operation cannot be fulfilled on persistentvolumeclaims "wordpress": the object has been modified; please apply your changes to the latest version and try again I0526 19:26:32.846783 1 snapshot_controller.go:925] Protection finalizer already exists for persistent volume claim www/wordpress E0526 19:26:32.868796 1 snapshot_controller.go:755] failed to update content store error parsing ResourceVersion "" of content "snapcontent-6858adb1-347e-49a8-9fc1-b9641c27ab22": strconv.ParseInt: parsing "": invalid syntax I0526 19:26:32.868877 1 event.go:364] Event(v1.ObjectReference{Kind:"VolumeSnapshot", Namespace:"www", Name:"zfspv-snap-test", UID:"6858adb1-347e-49a8-9fc1-b9641c27ab22", APIVersion:"snapshot.storage.k8s.io/v1", ResourceVersion:"375980299", FieldPath:""}): type: 'Normal' reason: 'CreatingSnapshot' Waiting for a snapshot www/zfspv-snap-test to be created by the CSI driver. I0526 19:26:32.901338 1 event.go:364] Event(v1.ObjectReference{Kind:"VolumeSnapshot", Namespace:"www", Name:"zfspv-snap-test", UID:"6858adb1-347e-49a8-9fc1-b9641c27ab22", APIVersion:"snapshot.storage.k8s.io/v1", ResourceVersion:"375980312", FieldPath:""}): type: 'Normal' reason: 'SnapshotCreated' Snapshot www/zfspv-snap-test was successfully created by the CSI driver. I0526 19:26:34.625529 1 event.go:364] Event(v1.ObjectReference{Kind:"VolumeSnapshot", Namespace:"www", Name:"zfspv-snap-test", UID:"6858adb1-347e-49a8-9fc1-b9641c27ab22", APIVersion:"snapshot.storage.k8s.io/v1", ResourceVersion:"375980321", FieldPath:""}): type: 'Normal' reason: 'SnapshotReady' Snapshot www/zfspv-snap-test is ready to use. I0526 19:26:34.831545 1 snapshot_controller.go:1020] checkandRemovePVCFinalizer[zfspv-snap-test]: Remove Finalizer for PVC wordpress as it is not used by snapshots in creation 
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

7 participants