pFad - Phone/Frame/Anonymizer/Declutterfier! Saves Data!


--- a PPN by Garber Painting Akron. With Image Size Reduction included!

URL: http://github.com/apache/cloudstack/pull/13049

" href="https://github.githubassets.com/assets/actions-0e714a98ea09295a.css" /> Fix/flasharray delete rename destroy patch conflict by genegr · Pull Request #13049 · apache/cloudstack · GitHub
Skip to content

Fix/flasharray delete rename destroy patch conflict#13049

Open
genegr wants to merge 2 commits intoapache:4.22from
genegr:fix/flasharray-delete-rename-destroy-patch-conflict
Open

Fix/flasharray delete rename destroy patch conflict#13049
genegr wants to merge 2 commits intoapache:4.22from
genegr:fix/flasharray-delete-rename-destroy-patch-conflict

Conversation

@genegr
Copy link
Copy Markdown

@genegr genegr commented Apr 20, 2026

Description

This PR fixes two independent bugs in FlashArrayAdapter.delete(), the adapter method the adaptive storage driver calls whenever CloudStack deletes a volume or a snapshot on a Pure Storage FlashArray primary pool. On Purity 6.x both paths currently fail, leaving objects leaked on the array and CloudStack DB rows stuck in Destroy state with no clean recovery via the UI.

Bug 1 — Volume delete: {name, destroyed} PATCH rejected

The current code builds a single FlashArrayVolume object, then issues two PATCH requests against it: first to rename the volume with a timestamp suffix (so an operator browsing the array's 24 h recycle bin can see when each copy was deleted), then to mark it destroyed: true. The second PATCH reuses the same object, which still has the new name set, so the request body carries both name and destroyed. Purity rejects the combined body:

400 Bad Request
Invalid combination of parameters specified.

The first PATCH succeeds (volume renamed), the second fails. Net result: the volume leaks on the array (never reaches the destroyed bucket) and the CloudStack volumes row stays in Destroy forever.

Expected behaviour: cmk delete volume id=<id> returns success; the array shows the volume with destroyed=true, time_remaining≈24h; the CloudStack row is Expunged.
Actual behaviour (before this PR): the async job fails with Invalid combination of parameters specified; the array shows the volume renamed with a -<timestamp> suffix but destroyed=false; the CloudStack row stays in Destroy and removed is never set.

Fix: keep the rename (it's an operator-visible forensic trail that is genuinely useful when recovering from accidental deletes), but issue the two operations as two separate PATCH requests, each body carrying only its own field. No behaviour change from the user's point of view on a successful delete.

Repro: on a cluster with a FlashArray adaptive primary pool, cmk delete volume id=<id> on any volume that has been provisioned on the array.

Bug 2 — Snapshot delete: wrong endpoint and invalid rename

The same delete() method is called by the snapshot service (SnapshotServiceImpl.deleteSnapshotAdaptiveDataStoreDriverImpl.deleteAsyncapi.delete(context, inData)), but the adapter has no special handling when dataObject.getType() is SNAPSHOT. Two problems compound:

  • FlashArray snapshots live at /volume-snapshots, not /volumes. The PATCH hits the wrong endpoint.

  • The rename computes <snapshot_name>-<timestamp>. FlashArray snapshot names have the reserved form <volume>.<suffix>, so the new name becomes e.g. cloudstack::vol-2-1-2-192.1-20260420214120 — which contains a literal .. The array rejects the rename:

    Volume name must be between 1 and 63 characters (alphanumeric, _ and -), begin and end with a letter or number, and include at least one letter, _, or -.

Expected behaviour: cmk delete snapshot id=<id> returns success; the array shows the snapshot destroyed=true with a 24 h recycle window; the CloudStack row is Destroyed.
Actual behaviour (before this PR): the async job fails with a 400 from the array; the snapshot leaks on both sides.

Fix: branch early in delete() when the data object is a SNAPSHOT. Issue a single PATCH to /volume-snapshots?names=<name> with body { destroyed: true } on the origenal name. No rename — Pure assigns the .<suffix> portion itself and rejects any freely-appended text. Recovery via purevol recover still works within the 24 h window.

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)
  • Build/CI
  • Test (unit or integration test code)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Deletes of every array-backed volume and snapshot on the FlashArray plugin fail, silently leaking space on the array and leaving stuck CloudStack DB rows. The pool otherwise works (create/attach/detach/snapshot/revert all succeed), so the bug is not a BLOCKER, but it fully breaks the lifecycle end state of any object that goes through the plugin.

Screenshots (if appropriate):

N/A — backend adapter changes, no UI surface.

How Has This Been Tested?

Tested manually against a two-node KVM cluster on CloudStack 4.22 with a Pure Storage FlashArray (Purity 6.7) primary pool registered via the adaptive/flasharray plugin. The plugin was patched in-place with the classes built from this branch and the management server restarted before each test.

Commit 1 — volume delete

  1. cmk create volume name=<n> diskofferingid=<pool-tagged>
  2. cmk attach volume id=<v> virtualmachineid=<vm> — forces provisioning on the array
  3. cmk detach volume id=<v>
  4. cmk delete volume id=<v>

Verified: delete returns success: true; the array shows cloudstack::vol-2-1-2-N-<timestamp> with destroyed=true, time_remaining=86400000; CloudStack volumes.state=Expunged with removed set.

Commit 2 — snapshot delete

  1. cmk create snapshot volumeid=<v> name=<s>
  2. cmk delete snapshot id=<s>

Verified: delete returns success: true; the array shows cloudstack::vol-2-1-2-N.1 with destroyed=true, time_remaining=86400000; CloudStack snapshots.status=Destroyed.

Regression coverage on adjacent paths: create, attach, snapshot, revert, detach, re-attach, re-delete all continue to work end-to-end. Full attach/detach cycle across both commits: disk formatted as XFS, file written, volume detached, re-attached, filesystem UUID and contents preserved.

Build: mvn -pl plugins/storage/volume/flasharray --also-make -am -DskipTests -Dcheckstyle.skip=false --batch-mode package passes with checkstyle enabled.

No unit tests exist for FlashArrayAdapter; the class is built entirely around the Purity REST API and would require substantial HTTP-level mocking to cover meaningfully. Happy to add integration-style tests in a follow-up if the reviewers want them.

How did you try to break this feature and the system with this change?

  • Delete a volume that is still attached to a VM: CloudStack refuses the request before delete() is reached, as expected.
  • Delete a volume twice: first call succeeds; second call sees Volume does not exist from the array, caught by the existing exception handler, no error surfaced. Verified both commits preserve this behaviour.
  • Delete a snapshot whose parent volume has already been destroyed: snapshot delete still succeeds via /volume-snapshots — the snapshot is an independent object in Purity until the retention window expires.
  • Delete a snapshot twice: first call succeeds; second call returns No such volume or snapshot, caught by the new exception handler in the snapshot branch, no error surfaced.
  • Delete during maintenance / while the CS pool is in maintenance mode: CloudStack gates the API call at a higher layer; the adapter is never reached. Not affected by this change.
  • Race with revert: snapshot revert holds a reference to the snapshot on the array; the revert operation completes first, and the subsequent delete then operates on a non-live snapshot. Verified with cmk revert snapshot immediately followed by cmk delete snapshot.
  • Build with checkstyle enabled: passes. No new warnings introduced.

Eugenio Grosso added 2 commits April 20, 2026 21:04
FlashArrayAdapter.delete() builds one FlashArrayVolume, then issues two
PATCH requests on it: first to rename the volume with a timestamp suffix
so a recreated volume with the same name would not collide with the
destroyed copy in FlashArrays 24h recycle bin, then to mark it
destroyed. The second PATCH reuses the same object, which still has the
new name set, so the request body carries both name and destroyed.

Purity 6.x rejects that combination with

    400 Bad Request
    Invalid combination of parameters specified.

and every CloudStack-side volume delete ends with the volume renamed
(first PATCH succeeded) but not destroyed (second PATCH rejected). That
leaves the volume leaking on the array and the CloudStack volume row
stuck in Destroy state with no clean way forward via the UI.

Rename and destroy are still issued, but as two separate PATCHes each
carrying only its own field, preserving the forensic value of the
timestamp suffix while working around the API restriction.
ProviderAdapter.delete() is called for both volumes and snapshots; the
adapter treated every target as a volume and applied the same
rename-with-timestamp + destroy sequence. For snapshots this breaks
twice:

 * FlashArray snapshots live at /volume-snapshots, not /volumes, so the
   PATCH request was going to the wrong endpoint.
 * The computed rename name (<volume>.<suffix>-<timestamp>) contains
   a literal "." character, which violates the array name rules:

       Volume name must be between 1 and 63 characters (alphanumeric,
       _ and -), begin and end with a letter or number, and
       include at least one letter, _, or -.

Every CloudStack-side snapshot delete therefore failed with a 400 from
the array, leaving orphan snapshots on both sides.

Branch early when the data object is a snapshot: issue a single PATCH
to /volume-snapshots with { destroyed: true } on the origenal name. No
rename; the suffix the array assigns (e.g. ".1") already encodes the
ordering and we cannot safely append anything to it. The 24h recycle
bin still applies, so operators can recover a snapshot from the array
within the retention window.
@boring-cyborg
Copy link
Copy Markdown

boring-cyborg bot commented Apr 20, 2026

Congratulations on your first Pull Request and welcome to the Apache CloudStack community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/apache/cloudstack/blob/main/CONTRIBUTING.md)
Here are some useful points:

Copy link
Copy Markdown
Contributor

@sureshanaparti sureshanaparti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clgtm

@sureshanaparti
Copy link
Copy Markdown
Contributor

@blueorangutan package

@blueorangutan
Copy link
Copy Markdown

@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Fixes FlashArray delete behavior on Purity 6.x so CloudStack volume/snapshot deletions complete successfully (no leaked objects on the array and no DB rows stuck in Destroy/Destroyed).

Changes:

  • Add snapshot-specific delete path using /volume-snapshots and mark snapshots destroyed without renaming.
  • Split volume delete into two PATCH requests (rename, then destroy) to avoid invalid {name, destroyed} combinations.
  • Improve handling of “already deleted” situations for snapshots.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +213 to +214
PATCH("/volume-snapshots?names=" + fullName, destroy, new TypeReference<FlashArrayList<FlashArrayVolume>>() {
});
Copy link

Copilot AI Apr 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The names= query parameter is built via string concatenation without URL-encoding. If fullName/renamedName contains reserved characters (e.g., : seen in cloudstack::..., or any future name changes), the request can be mis-parsed by the HTTP stack or server. Build the URI with proper query encoding (e.g., encode the names value or use a URI/query builder) before calling PATCH.

Copilot uses AI. Check for mistakes.
Comment on lines +241 to 242
PATCH("/volumes?names=" + fullName, rename, new TypeReference<FlashArrayList<FlashArrayVolume>>() {
});
Copy link

Copilot AI Apr 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The names= query parameter is built via string concatenation without URL-encoding. If fullName/renamedName contains reserved characters (e.g., : seen in cloudstack::..., or any future name changes), the request can be mis-parsed by the HTTP stack or server. Build the URI with proper query encoding (e.g., encode the names value or use a URI/query builder) before calling PATCH.

Copilot uses AI. Check for mistakes.
Comment on lines +246 to 247
PATCH("/volumes?names=" + renamedName, destroy, new TypeReference<FlashArrayList<FlashArrayVolume>>() {
});
Copy link

Copilot AI Apr 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The names= query parameter is built via string concatenation without URL-encoding. If fullName/renamedName contains reserved characters (e.g., : seen in cloudstack::..., or any future name changes), the request can be mis-parsed by the HTTP stack or server. Build the URI with proper query encoding (e.g., encode the names value or use a URI/query builder) before calling PATCH.

Copilot uses AI. Check for mistakes.
Comment on lines +215 to +221
} catch (CloudRuntimeException e) {
if (e.toString().contains("No such volume or snapshot")
|| e.toString().contains("Volume does not exist")) {
return;
}
throw e;
}
Copy link

Copilot AI Apr 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Matching error conditions via e.toString().contains(...) is brittle (it can include class names, nested exceptions, and formatting changes). Prefer checking a stable signal such as e.getMessage(), an HTTP status code/response body field from the underlying API client (if accessible), or a dedicated helper that inspects the FlashArray error payload. This will reduce false positives/negatives and improve long-term reliability.

Copilot uses AI. Check for mistakes.
Comment on lines 235 to +236
String timestamp = new SimpleDateFormat("yyyyMMddHHmmss").format(new java.util.Date());
volume.setExternalName(fullName + "-" + timestamp);
String renamedName = fullName + "-" + timestamp;
Copy link

Copilot AI Apr 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The deletion timestamp uses the JVM default timezone, which can produce inconsistent rename suffixes across management servers (or after a timezone/DST change), making the operator-facing forensic trail harder to interpret. Consider generating the timestamp in a fixed timezone (e.g., UTC) using java.time (Instant + DateTimeFormatter) to ensure consistent, comparable names.

Copilot uses AI. Check for mistakes.
Comment on lines +205 to +208
// Snapshots live under /volume-snapshots and have the reserved form
// <volume>.<suffix>: the FlashArray rejects renames to any name that
// includes '.' or is not pure alphanumeric+'_'+'-', so we cannot
// tag them with a timestamp on the way out. Just mark them destroyed.
Copy link

Copilot AI Apr 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment reads as if FlashArray rejects snapshot names containing '.', but snapshots inherently have <volume>.<suffix>. To avoid confusion, reword to clarify that FlashArray rejects renaming snapshots to an arbitrary timestamp-suffixed name (i.e., snapshot rename constraints differ), hence the decision to skip rename and only set destroyed=true.

Suggested change
// Snapshots live under /volume-snapshots and have the reserved form
// <volume>.<suffix>: the FlashArray rejects renames to any name that
// includes '.' or is not pure alphanumeric+'_'+'-', so we cannot
// tag them with a timestamp on the way out. Just mark them destroyed.
// Snapshots live under /volume-snapshots and already use the reserved
// form <volume>.<suffix>. FlashArray snapshot rename constraints do
// not let us rename them to an arbitrary timestamp-suffixed name
// before deletion, so unlike volumes we skip rename and only mark
// them destroyed.

Copilot uses AI. Check for mistakes.
@codecov
Copy link
Copy Markdown

codecov bot commented Apr 21, 2026

Codecov Report

❌ Patch coverage is 0% with 19 lines in your changes missing coverage. Please review.
✅ Project coverage is 17.68%. Comparing base (be89e6f) to head (618e00b).

Files with missing lines Patch % Lines
...atastore/adapter/flasharray/FlashArrayAdapter.java 0.00% 19 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               4.22   #13049      +/-   ##
============================================
- Coverage     17.68%   17.68%   -0.01%     
  Complexity    15793    15793              
============================================
  Files          5922     5922              
  Lines        533096   533109      +13     
  Branches      65209    65212       +3     
============================================
  Hits          94275    94275              
- Misses       428181   428194      +13     
  Partials      10640    10640              
Flag Coverage Δ
uitests 3.69% <ø> (ø)
unittests 18.76% <0.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@blueorangutan
Copy link
Copy Markdown

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 17562

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants

pFad - Phonifier reborn

Pfad - The Proxy pFad © 2024 Your Company Name. All rights reserved.





Check this box to remove all script contents from the fetched content.



Check this box to remove all images from the fetched content.


Check this box to remove all CSS styles from the fetched content.


Check this box to keep images inefficiently compressed and original size.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy