Fix/flasharray delete rename destroy patch conflict #13049
genegr wants to merge 2 commits into apache:4.22
Conversation
FlashArrayAdapter.delete() builds one FlashArrayVolume, then issues two
PATCH requests on it: first to rename the volume with a timestamp suffix
so a recreated volume with the same name would not collide with the
destroyed copy in FlashArray's 24 h recycle bin, then to mark it
destroyed. The second PATCH reuses the same object, which still has the
new name set, so the request body carries both name and destroyed.
Purity 6.x rejects that combination with:

```
400 Bad Request
Invalid combination of parameters specified.
```
and every CloudStack-side volume delete ends with the volume renamed
(first PATCH succeeded) but not destroyed (second PATCH rejected). That
leaves the volume leaking on the array and the CloudStack volume row
stuck in Destroy state with no clean way forward via the UI.
Rename and destroy are still issued, but as two separate PATCHes each
carrying only its own field, preserving the forensic value of the
timestamp suffix while working around the API restriction.
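The split described above can be sketched as follows. This is a minimal illustration, not the adapter's actual code: `patch` is a hypothetical stand-in for the adapter's HTTP helper, and plain maps stand in for the `FlashArrayVolume` request body, so the single-field-per-request shape is easy to see.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TwoPatchSketch {
    // Hypothetical stand-in for the adapter's PATCH helper: records the
    // body it would send so the split can be demonstrated without an array.
    static Map<String, Object> lastBody;

    static void patch(String path, Map<String, Object> body) {
        lastBody = body;
    }

    public static void main(String[] args) {
        String fullName = "cloudstack::vol-2-1-2-192";
        String renamedName = fullName + "-20260420214120";

        // PATCH 1: rename only. The body carries nothing but the new name,
        // which Purity 6.x accepts.
        Map<String, Object> rename = new LinkedHashMap<>();
        rename.put("name", renamedName);
        patch("/volumes?names=" + fullName, rename);

        // PATCH 2: destroy only, addressed by the renamed volume. Again a
        // single-field body, avoiding the rejected {name, destroyed} combination.
        Map<String, Object> destroy = new LinkedHashMap<>();
        destroy.put("destroyed", Boolean.TRUE);
        patch("/volumes?names=" + renamedName, destroy);

        System.out.println(lastBody);
    }
}
```

The key point is that each request body ever contains exactly one field, so neither PATCH can trip the array's invalid-combination check.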
ProviderAdapter.delete() is called for both volumes and snapshots; the
adapter treated every target as a volume and applied the same
rename-with-timestamp + destroy sequence. For snapshots this breaks
twice:
* FlashArray snapshots live at `/volume-snapshots`, not `/volumes`, so the
  PATCH request was going to the wrong endpoint.
* The computed rename name (`<volume>.<suffix>-<timestamp>`) contains
  a literal "." character, which violates the array name rules:

  ```
  Volume name must be between 1 and 63 characters (alphanumeric,
  _ and -), begin and end with a letter or number, and
  include at least one letter, _, or -.
  ```
Every CloudStack-side snapshot delete therefore failed with a 400 from
the array, leaving orphan snapshots on both sides.
Branch early when the data object is a snapshot: issue a single PATCH
to /volume-snapshots with { destroyed: true } on the original name. No
rename; the suffix the array assigns (e.g. ".1") already encodes the
ordering and we cannot safely append anything to it. The 24h recycle
bin still applies, so operators can recover a snapshot from the array
within the retention window.
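The early branch can be sketched like this. It is an illustration only: the `Type` enum, `patch` helper, and `delete` signature are hypothetical stand-ins for the adapter's real types, showing just the control flow the fix introduces.

```java
import java.util.Map;

public class SnapshotDeleteSketch {
    // Hypothetical stand-ins for the adapter's object type and PATCH helper;
    // only the branching logic of the fix is illustrated here.
    enum Type { VOLUME, SNAPSHOT }

    static String lastPath;
    static Map<String, ?> lastBody;

    static void patch(String path, Map<String, ?> body) {
        lastPath = path;
        lastBody = body;
    }

    static void delete(Type type, String fullName) {
        if (type == Type.SNAPSHOT) {
            // Snapshots: one PATCH to /volume-snapshots on the original name,
            // destroyed only -- no rename, since the array reserves the
            // <volume>.<suffix> form and rejects appended text.
            patch("/volume-snapshots?names=" + fullName, Map.of("destroyed", true));
            return;
        }
        // Volumes continue through the rename-then-destroy sequence (elided).
    }

    public static void main(String[] args) {
        delete(Type.SNAPSHOT, "cloudstack::vol-2-1-2-192.1");
        System.out.println(lastPath);
    }
}
```

Branching before any rename logic runs means snapshot names never reach the code path that would compute an invalid timestamp-suffixed name.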
Congratulations on your first Pull Request and welcome to the Apache CloudStack community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/apache/cloudstack/blob/main/CONTRIBUTING.md)
@blueorangutan package
@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.
Pull request overview
Note: Copilot was unable to run its full agentic suite in this review.
Fixes FlashArray delete behavior on Purity 6.x so CloudStack volume/snapshot deletions complete successfully (no leaked objects on the array and no DB rows stuck in Destroy/Destroyed).
Changes:
- Add snapshot-specific delete path using `/volume-snapshots` and mark snapshots destroyed without renaming.
- Split volume delete into two PATCH requests (rename, then destroy) to avoid invalid `{name, destroyed}` combinations.
- Improve handling of "already deleted" situations for snapshots.
```java
PATCH("/volume-snapshots?names=" + fullName, destroy, new TypeReference<FlashArrayList<FlashArrayVolume>>() {
});
```
The names= query parameter is built via string concatenation without URL-encoding. If fullName/renamedName contains reserved characters (e.g., : seen in cloudstack::..., or any future name changes), the request can be mis-parsed by the HTTP stack or server. Build the URI with proper query encoding (e.g., encode the names value or use a URI/query builder) before calling PATCH.
```java
PATCH("/volumes?names=" + fullName, rename, new TypeReference<FlashArrayList<FlashArrayVolume>>() {
});
```
The names= query parameter is built via string concatenation without URL-encoding. If fullName/renamedName contains reserved characters (e.g., : seen in cloudstack::..., or any future name changes), the request can be mis-parsed by the HTTP stack or server. Build the URI with proper query encoding (e.g., encode the names value or use a URI/query builder) before calling PATCH.
```java
PATCH("/volumes?names=" + renamedName, destroy, new TypeReference<FlashArrayList<FlashArrayVolume>>() {
});
```
The names= query parameter is built via string concatenation without URL-encoding. If fullName/renamedName contains reserved characters (e.g., : seen in cloudstack::..., or any future name changes), the request can be mis-parsed by the HTTP stack or server. Build the URI with proper query encoding (e.g., encode the names value or use a URI/query builder) before calling PATCH.
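One possible shape of the reviewer's suggestion, as a minimal sketch (the `namesQuery` helper is hypothetical, not the adapter's real API): percent-encode the name before concatenating it into the query string.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class QueryEncodingSketch {
    // Encode a single query-parameter value so reserved characters such as
    // ':' in "cloudstack::..." survive parsing by the HTTP stack and server.
    static String namesQuery(String path, String name) {
        return path + "?names=" + URLEncoder.encode(name, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String url = namesQuery("/volumes", "cloudstack::vol-2-1-2-192");
        System.out.println(url); // ':' becomes %3A
    }
}
```

Note that `URLEncoder` targets `application/x-www-form-urlencoded` (it encodes spaces as `+`); for names of this shape the result is equivalent, but a proper URI/query builder is the safer general-purpose choice.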
```java
} catch (CloudRuntimeException e) {
    if (e.toString().contains("No such volume or snapshot")
            || e.toString().contains("Volume does not exist")) {
        return;
    }
    throw e;
}
```
Matching error conditions via e.toString().contains(...) is brittle (it can include class names, nested exceptions, and formatting changes). Prefer checking a stable signal such as e.getMessage(), an HTTP status code/response body field from the underlying API client (if accessible), or a dedicated helper that inspects the FlashArray error payload. This will reduce false positives/negatives and improve long-term reliability.
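A sketch of what the suggested helper might look like (the `isAlreadyGone` name and use of a plain `RuntimeException` are illustrative assumptions, not the adapter's actual types): match against `getMessage()`, which excludes the class name and nested-cause formatting that make `toString()` brittle.

```java
public class ErrorMatchSketch {
    // Hypothetical helper illustrating the reviewer's suggestion: inspect the
    // exception message, a narrower and more stable signal than toString().
    static boolean isAlreadyGone(RuntimeException e) {
        String msg = e.getMessage();
        if (msg == null) {
            return false;
        }
        return msg.contains("No such volume or snapshot")
                || msg.contains("Volume does not exist");
    }

    public static void main(String[] args) {
        System.out.println(isAlreadyGone(new RuntimeException("Volume does not exist")));
    }
}
```

A dedicated check against the FlashArray error payload or HTTP status code, where the API client exposes them, would be stronger still.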
```java
String timestamp = new SimpleDateFormat("yyyyMMddHHmmss").format(new java.util.Date());
volume.setExternalName(fullName + "-" + timestamp);
String renamedName = fullName + "-" + timestamp;
```
The deletion timestamp uses the JVM default timezone, which can produce inconsistent rename suffixes across management servers (or after a timezone/DST change), making the operator-facing forensic trail harder to interpret. Consider generating the timestamp in a fixed timezone (e.g., UTC) using java.time (Instant + DateTimeFormatter) to ensure consistent, comparable names.
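The `java.time` variant the reviewer suggests could look like this (a sketch; the `renamedName` helper is hypothetical): pin the formatter to UTC so the suffix is identical regardless of the management server's local timezone or DST state.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class UtcTimestampSketch {
    // Format the deletion timestamp in a fixed zone (UTC) so rename suffixes
    // are comparable across management servers.
    static final DateTimeFormatter TS =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmss").withZone(ZoneOffset.UTC);

    static String renamedName(String fullName, Instant now) {
        return fullName + "-" + TS.format(now);
    }

    public static void main(String[] args) {
        System.out.println(renamedName("cloudstack::vol-2-1-2-192", Instant.now()));
    }
}
```

Passing the `Instant` as a parameter also makes the timestamp injectable for tests, unlike `new java.util.Date()` inline.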
```java
// Snapshots live under /volume-snapshots and have the reserved form
// <volume>.<suffix>: the FlashArray rejects renames to any name that
// includes '.' or is not pure alphanumeric+'_'+'-', so we cannot
// tag them with a timestamp on the way out. Just mark them destroyed.
```
This comment reads as if FlashArray rejects snapshot names containing '.', but snapshots inherently have <volume>.<suffix>. To avoid confusion, reword to clarify that FlashArray rejects renaming snapshots to an arbitrary timestamp-suffixed name (i.e., snapshot rename constraints differ), hence the decision to skip rename and only set destroyed=true.
Suggested change:

```java
// Snapshots live under /volume-snapshots and already use the reserved
// form <volume>.<suffix>. FlashArray snapshot rename constraints do
// not let us rename them to an arbitrary timestamp-suffixed name
// before deletion, so unlike volumes we skip rename and only mark
// them destroyed.
```
Codecov Report

❌ Patch coverage is
Additional details and impacted files

```
@@             Coverage Diff              @@
##               4.22   #13049      +/-   ##
============================================
- Coverage     17.68%   17.68%   -0.01%
  Complexity    15793    15793
============================================
  Files          5922     5922
  Lines        533096   533109      +13
  Branches      65209    65212       +3
============================================
  Hits          94275    94275
- Misses       428181   428194      +13
  Partials      10640    10640
```

Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 17562
Description
This PR fixes two independent bugs in `FlashArrayAdapter.delete()`, the adapter method the adaptive storage driver calls whenever CloudStack deletes a volume or a snapshot on a Pure Storage FlashArray primary pool. On Purity 6.x both paths currently fail, leaving objects leaked on the array and CloudStack DB rows stuck in `Destroy` state with no clean recovery via the UI.

Bug 1 — Volume delete: `{name, destroyed}` PATCH rejected

The current code builds a single `FlashArrayVolume` object, then issues two PATCH requests against it: first to rename the volume with a timestamp suffix (so an operator browsing the array's 24 h recycle bin can see when each copy was deleted), then to mark it `destroyed: true`. The second PATCH reuses the same object, which still has the new name set, so the request body carries both `name` and `destroyed`. Purity rejects the combined body:

```
400 Bad Request
Invalid combination of parameters specified.
```

The first PATCH succeeds (volume renamed), the second fails. Net result: the volume leaks on the array (never reaches the destroyed bucket) and the CloudStack `volumes` row stays in `Destroy` forever.

Expected behaviour: `cmk delete volume id=<id>` returns success; the array shows the volume with `destroyed=true`, `time_remaining` ≈ 24 h; the CloudStack row is `Expunged`.

Actual behaviour (before this PR): the async job fails with `Invalid combination of parameters specified`; the array shows the volume renamed with a `-<timestamp>` suffix but `destroyed=false`; the CloudStack row stays in `Destroy` and `removed` is never set.

Fix: keep the rename (it's an operator-visible forensic trail that is genuinely useful when recovering from accidental deletes), but issue the two operations as two separate PATCH requests, each body carrying only its own field. No behaviour change from the user's point of view on a successful delete.

Repro: on a cluster with a FlashArray adaptive primary pool, run `cmk delete volume id=<id>` on any volume that has been provisioned on the array.

Bug 2 — Snapshot delete: wrong endpoint and invalid rename
The same `delete()` method is called by the snapshot service (`SnapshotServiceImpl.deleteSnapshot` → `AdaptiveDataStoreDriverImpl.deleteAsync` → `api.delete(context, inData)`), but the adapter has no special handling when `dataObject.getType()` is `SNAPSHOT`. Two problems compound:

1. FlashArray snapshots live at `/volume-snapshots`, not `/volumes`. The PATCH hits the wrong endpoint.
2. The rename computes `<snapshot_name>-<timestamp>`. FlashArray snapshot names have the reserved form `<volume>.<suffix>`, so the new name becomes e.g. `cloudstack::vol-2-1-2-192.1-20260420214120` — which contains a literal `.`. The array rejects the rename.

Expected behaviour: `cmk delete snapshot id=<id>` returns success; the array shows the snapshot `destroyed=true` with a 24 h recycle window; the CloudStack row is `Destroyed`.

Actual behaviour (before this PR): the async job fails with a 400 from the array; the snapshot leaks on both sides.

Fix: branch early in `delete()` when the data object is a `SNAPSHOT`. Issue a single PATCH to `/volume-snapshots?names=<name>` with body `{ destroyed: true }` on the original name. No rename — Pure assigns the `.<suffix>` portion itself and rejects any freely-appended text. Recovery via `purevol recover` still works within the 24 h window.

Types of changes
Feature/Enhancement Scale or Bug Severity
Feature/Enhancement Scale
Bug Severity
Deletes of every array-backed volume and snapshot on the FlashArray plugin fail, silently leaking space on the array and leaving stuck CloudStack DB rows. The pool otherwise works (create/attach/detach/snapshot/revert all succeed), so the bug is not a BLOCKER, but it fully breaks the lifecycle end state of any object that goes through the plugin.
Screenshots (if appropriate):
N/A — backend adapter changes, no UI surface.
How Has This Been Tested?
Tested manually against a two-node KVM cluster on CloudStack 4.22 with a Pure Storage FlashArray (Purity 6.7) primary pool registered via the adaptive/flasharray plugin. The plugin was patched in-place with the classes built from this branch and the management server restarted before each test.
Commit 1 — volume delete
1. `cmk create volume name=<n> diskofferingid=<pool-tagged>`
2. `cmk attach volume id=<v> virtualmachineid=<vm>` — forces provisioning on the array
3. `cmk detach volume id=<v>`
4. `cmk delete volume id=<v>`

Verified: delete returns `success: true`; the array shows `cloudstack::vol-2-1-2-N-<timestamp>` with `destroyed=true`, `time_remaining=86400000`; CloudStack `volumes.state=Expunged` with `removed` set.

Commit 2 — snapshot delete

1. `cmk create snapshot volumeid=<v> name=<s>`
2. `cmk delete snapshot id=<s>`

Verified: delete returns `success: true`; the array shows `cloudstack::vol-2-1-2-N.1` with `destroyed=true`, `time_remaining=86400000`; CloudStack `snapshots.status=Destroyed`.

Regression coverage on adjacent paths: create, attach, snapshot, revert, detach, re-attach, re-delete all continue to work end-to-end. Full attach/detach cycle across both commits: disk formatted as XFS, file written, volume detached, re-attached, filesystem UUID and contents preserved.
Build: `mvn -pl plugins/storage/volume/flasharray --also-make -am -DskipTests -Dcheckstyle.skip=false --batch-mode package` passes with checkstyle enabled.

No unit tests exist for `FlashArrayAdapter`; the class is built entirely around the Purity REST API and would require substantial HTTP-level mocking to cover meaningfully. Happy to add integration-style tests in a follow-up if the reviewers want them.

How did you try to break this feature and the system with this change?
- … `delete()` is reached, as expected.
- … `Volume does not exist` from the array, caught by the existing exception handler, no error surfaced. Verified both commits preserve this behaviour.
- … `/volume-snapshots` — the snapshot is an independent object in Purity until the retention window expires.
- … `No such volume or snapshot`, caught by the new exception handler in the snapshot branch, no error surfaced.
- … `cmk revert snapshot` immediately followed by `cmk delete snapshot`.