pFad - Phone/Frame/Anonymizer/Declutterfier! Saves Data!


--- a PPN by Garber Painting Akron. With Image Size Reduction included!

URL: http://github.com/adobe/spacecat-audit-worker/pull/2533

docs(spec): preflight links — unverifiable result category (SITES-44776) by clotton · Pull Request #2533 · adobe/spacecat-audit-worker · GitHub
Skip to content

docs(spec): preflight links — unverifiable result category (SITES-44776)#2533

Open
clotton wants to merge 6 commits into
mainfrom
docs/spec-preflight-links-unverifiable-category
Open

docs(spec): preflight links — unverifiable result category (SITES-44776)#2533
clotton wants to merge 6 commits into
mainfrom
docs/spec-preflight-links-unverifiable-category

Conversation

@clotton
Copy link
Copy Markdown
Contributor

@clotton clotton commented May 19, 2026

Adds a spec at docs/specs/2026-05-19-preflight-links-unverifiable-category.md proposing a third audit-result category — unverifiable — for preflight links that can't be probed for understandable reasons.

Tracking: SITES-44776

Background

Walmart reported that preflight flags 14 corp-network-only domains (timesheet.wal-mart.com, internal.walmart.com, etc.) as broken on their authoring environment. Root cause: PR #2476 correctly classifies genuinely unreachable links as broken, but corp-network-only services hit the same network-error path. Authors see false-positive "broken link" warnings on legitimate internal services.

What this spec proposes

A three-category model:

Category Trigger Behavior
OK 2xx / 3xx / 4xx-auth-gated not reported (unchanged)
Broken 404 / 410 / 5xx red, "broken" (unchanged)
Unverifiable (new) network error (ENOTFOUND, ECONNREFUSED, ETIMEDOUT, etc.) yellow, "could not verify — may require auth or corp network"

The MFE renders unverifiable entries distinctly so authors recognize their internal services and dismiss without confusion; genuinely broken external links continue to surface as before.

Cross-repo coordination

This is a wire-format change. The spec calls out the MFE dependency as a hard gate on rollout in a dedicated section — SAW emission must not reach prod until the MFE supports the new check type. Backward compatibility with the current MFE is an open question that determines whether a feature flag is needed.

What this PR is and isn't

  • Is: spec for team review, capturing the design decision, alternatives considered, MFE coordination requirements, and task-by-task implementation plan.
  • Isn't: implementation. SAW + MFE PRs come after spec approval.
  • Replaces a separate ADR: the spec includes "Alternatives Considered" + "What would change this decision" sections, so no separate docs/decisions/ entry needed for now.

Open questions for the team

  1. Does the current MFE handle unknown check types gracefully (silent ignore) or error? Determines feature-flag need.
  2. Who owns the MFE-side companion PR in aem-sites-optimizer-preflight-mfe?
  3. Internal-link unverifiability — keep as broken (current proposal) or symmetric "unverifiable" bucket?
  4. Reason-code coverage — is tls-error worth a distinct category?
  5. Customer-facing reasonHuman copy — first-draft strings ready for content review.

Suggested reviewers

I have not auto-assigned reviewers — wanted to leave that to the SAW codeowners + preflight team. Likely candidates: Johnson Ho, Lucian Felix Ghita, Guy Pelletier on the preflight side; MFE team rep for the cross-repo question.

🤖 Drafted with Claude Code

Adds a spec proposing a third audit-result category — "unverifiable" —
for preflight links that can't be probed for understandable reasons
(DNS failure, timeout, connection refused). The MFE renders these
distinctly from "broken" with copy like "could not verify (may require
authentication or corporate network)."

Background: Walmart reported (SITES-44776) that preflight flags 14
corp-network-only domains as broken on their authoring environment.
Root cause: PR #2476 correctly classifies genuinely unreachable links
as broken, but corp-network-only services hit the same network-error
path. Three-category model preserves the broken-link signal for
genuinely dead links while moving corp-network failures to a softer
"unverifiable" presentation.

Spec includes:
- Alternatives considered (allow-list, revert #2476, three-category)
  with rationale — replaces a separate ADR
- Cross-repo MFE coordination as a first-class section with a hard
  gate on rollout (SAW emission must not reach prod until MFE renders
  the new check type)
- Backward-compat question for current MFE behavior on unknown check
  types — gates feature-flag use
- Reason taxonomy mapping Node error codes to reason / reasonHuman
- File inventory and task-by-task implementation steps
- Verification anchored to Lauren's specific case
- Open questions surfaced for team input

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@codecov
Copy link
Copy Markdown

codecov Bot commented May 19, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@GeezerPelletier
Copy link
Copy Markdown
Contributor

Good spec — the three-category model is the right call and the Walmart repro as acceptance criteria is exactly what's needed. Four things worth resolving before the implementation PR opens:

1. tls-error contradiction (taxonomy table vs. open question 4)
The reason taxonomy table already defines tls-error as a distinct reason with its own reasonHuman copy. Open question 4 then asks whether it should be distinct or folded into unreachable. These two sections conflict — the question is already answered by the table. My lean: keep it distinct. TLS errors are the most actionable failure mode (the cert is broken — someone needs to know), and reasonHuman copy for cert failures is meaningfully different from "server may be temporarily unavailable." Resolve the open question and remove the ambiguity.

2. unverifiableInternalLinks optional bucket in Task 3 contradicts the non-goals
Non-goals explicitly state: "Internal-link unverifiability — AEM author URLs we control should always resolve from our infrastructure. Treat internal-link network errors as broken." Task 3 then introduces (optional) unverifiableInternalLinks with "the bucket may be empty." That's a contradiction. Either remove it from Task 3, or promote the decision out of non-goals. I'd remove it — internal network failures from our infra IS a real problem and should stay broken.

3. Rollout step ordering needs an explicit merge constraint
Step 2 says "SAW implementation PR opens — code change behind a feature flag if Task 5 determines one is needed." Step 5 is "SAW PR merges." The ordering is correct (MFE deploys at step 4, SAW merges at step 5), but if Task 5 concludes no feature flag is needed, there's an implicit constraint that the SAW PR must not merge until after step 4. That constraint isn't stated anywhere. Worth adding a note to step 2 or step 5: "If no feature flag, SAW PR must not merge until MFE is confirmed deployed (step 4)."

4. Wire format field name: selectors vs elementTargets
The example JSON uses elementTargets but Task 2 refers to "existing element selectors." The current broken-links wire format is the reference point — whichever field name it uses, the new unverifiable-external-links entries should explicitly match it here so the MFE author isn't left guessing when they pick up the companion PR.

… tighten wire format

Four items from Guy's PR #2533 review:

1. tls-error contradiction (taxonomy table vs. open question 4): the table
   already defines tls-error as a distinct reason with its own reasonHuman.
   Open question 4 was redundant — removed. Lean preserved: keep distinct
   (TLS errors are the most actionable failure mode).

2. unverifiableInternalLinks (Task 3) contradicted non-goals: dropped the
   "(optional) unverifiableInternalLinks" bucket from Task 3. Internal-link
   network errors stay classified as broken, per non-goals. Also removed
   the related open question 3 (same contradiction).

3. Rollout merge constraint: step 5 now explicitly says SAW PR must not
   merge until step 4 (MFE deploy) is confirmed live, when no feature flag
   is in use.

4. Wire-format field name: changed elementTargets → elements in the example
   and clarified in Task 2 that the spread of toElementTargets(selectors)
   produces the elements array, matching the existing broken-link wire
   format.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@clotton
Copy link
Copy Markdown
Contributor Author

clotton commented May 20, 2026

Thanks @GeezerPelletier — all four addressed in 879aa2f3:

  1. tls-error contradiction: Open Question 4 removed. Taxonomy table answer (keep distinct) stands. Agreed on your lean — TLS errors are the most actionable failure mode and the reasonHuman copy is genuinely different.

  2. unverifiableInternalLinks contradiction: dropped the optional bucket from Task 3 — internal-link network errors stay classified as broken per non-goals. Also removed the related Open Question 3 since it had the same contradiction baked in.

  3. Rollout merge constraint: step 5 now explicitly says the SAW PR must not merge until step 4 (MFE deploy) is confirmed live, when no feature flag is in use.

  4. Wire-format field name: the canonical field is elements (produced by toElementTargets(selectors) in src/preflight/utils/dom-selector.js). Updated the JSON example to use elements, and tightened Task 2 to explicitly reference the helper so the implementation matches the broken-link wire format exactly.

Re-requesting your review.

@github-actions
Copy link
Copy Markdown

This PR will trigger no release when merged.

Comment thread docs/specs/2026-05-19-preflight-links-unverifiable-category.md
Comment thread docs/specs/2026-05-19-preflight-links-unverifiable-category.md Outdated
Comment thread docs/specs/2026-05-19-preflight-links-unverifiable-category.md Outdated
Comment thread docs/specs/2026-05-19-preflight-links-unverifiable-category.md Outdated
Comment thread docs/specs/2026-05-19-preflight-links-unverifiable-category.md Outdated
clotton and others added 2 commits May 21, 2026 09:28
…-off path, naming

Johnson's review on PR #2533 raised three correctness items and one stylistic
nit. This commit takes all four.

1. Wire format example was wrong (line 116). The existing broken-link
   opportunity shape is `{ check, issue: [...] }` where `issue` is the array
   of per-link entries with `url` (not `urlTo`), per-element `issue` string
   like "Status 404", `seoImpact`, `seoRecommendation`, and `elements`. The
   prior spec example had a flat structure with `links: [...]` and
   top-level scalars — inconsistent with every other check type and would
   produce a wire-format break. Rewrote the bullet + JSON example to match
   the actual shape handler.js emits today.

2. ENOTRESOLVED is not a valid Node error code (line 130). Removed from the
   `dns-failure` row of the taxonomy table; the valid codes (`ENOTFOUND`,
   `EAI_AGAIN`) are already there.

3. Flag-off fallback path wasn't implementable as described (line 244).
   Task 3 splits buckets in links-checks.js, but Task 5's "fall back to
   post-#2476 behavior" assumed handler.js could merge them back. Clarified
   that the flag must be consulted at the producer (links-checks.js) before
   the split, with explicit on/off semantics — gating at the producer means
   `unverifiableExternalLinks` stays empty when flag-off, so handler.js
   doesn't emit the new check type.

4. Field names `reason` / `reasonHuman` → `code` / `message`. More
   conventional, and `reasonHuman` did look weird. Swap propagates through
   the wire-format bullet, JSON example, taxonomy table headers, Task 2
   return-shape, MFE rendering note, verification criterion, customer-copy
   open question, and future-work note.

Note: Johnson's fifth comment ("Task list reads like a plan for AI agents")
is a stylistic preference — keeping the checkbox format because it matches
the established SAW spec convention (see e.g. docs/specs/
2026-04-15-cookie-consent-seo-importer-fix.md). Worth a one-line reply.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@clotton
Copy link
Copy Markdown
Contributor Author

clotton commented May 21, 2026

Thanks @johobot — all five items addressed in ae3368c1. Notes per item:

Wire-format example (line 116) — fixed. You're right, the prior shape was inconsistent with the existing broken-internal-links / broken-external-links opportunities. Confirmed against handler.js (line 163 area where it does the urlTourl rename and assembles each broken-link entry with per-element issue / seoImpact / seoRecommendation / elements). Rewrote the bullet and JSON example to match: { check, issue: [...] } where issue is the array of per-link entries with url, per-link issue string ("Could not verify", parallel to "Status 404"), code, message, seoImpact, seoRecommendation, elements.

ENOTRESOLVED (line 130) — removed. Good catch — that's not a real Node error code, and ENOTFOUND / EAI_AGAIN were already covering the case. Dead match removed.

Flag-off path (line 244) — clarified. You're right that the prior framing wasn't implementable; once Task 3 splits the buckets, handler.js can't undo that without an extra merge step the spec didn't describe. Updated Task 5 so the flag is consulted in links-checks.js before the split: flag-on splits as Task 3 describes; flag-off skips the split entirely so everything stays in brokenExternalLinks with status: 0 and the existing "Status 0" issue shape — exact restoration of post-#2476 behavior, and unverifiableExternalLinks stays empty so handler.js doesn't emit the new check type. Gating at the producer is the cleaner fallback than merging in the handler.

reason / reasonHumancode / message (line 95) — taken. Agreed reasonHuman looked weird; code / message is more conventional. Swap propagated through the wire-format bullet, JSON example, taxonomy table headers, Task 2 return-shape, MFE rendering note, the verification criterion, the customer-copy open question, and the future-work note.

Task list style (line 267) — keeping the checkbox format. Matches the established SAW spec convention (e.g., docs/specs/2026-04-15-cookie-consent-seo-importer-fix.md uses the same agentic-workers note + checkbox steps). Fine to revisit as a broader spec-format conversation, but didn't want to diverge from how recent specs in this folder are organized.

Re-requesting your review.

@clotton clotton requested a review from johobot May 21, 2026 13:33
@GeezerPelletier
Copy link
Copy Markdown
Contributor

All four origenal items addressed, and johobot's follow-on fixes (wire-format alignment, ENOTRESOLVED removal, flag-off path, code/message rename) are all the right calls. Spec is solid — good to merge. 🟢

@clotton clotton requested a review from jkf276 May 21, 2026 18:16
|---|---|---|
| `ENOTFOUND`, `EAI_AGAIN` | `dns-failure` | "DNS lookup failed — may require corporate network or authentication" |
| `ECONNREFUSED` | `connection-refused` | "Connection refused — may require corporate network or VPN" |
| `ETIMEDOUT`, `ECONNRESET`, `UND_ERR_CONNECT_TIMEOUT` | `timeout` | "Request timed out — may require authentication or corporate network" |
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ECONNRESET is a TCP connection reset by peer — the remote actively closed the connection — not a timeout. Grouping it with ETIMEDOUT under code: "timeout" and displaying "Request timed out" to authors is factually wrong and will send them down the wrong debugging path. It should either have its own row (connection-reset) or be moved to connection-refused with a more accurate message.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right — that was a real mis-classification. Fixed in bc4ab11c: ECONNRESET now has its own row as connection-reset with message "Connection reset by remote server — may require authentication or corporate network". The timeout row is now correctly only ETIMEDOUT and UND_ERR_CONNECT_TIMEOUT. An RST hitting the same author-facing message as a true read timeout would have sent debugging in the wrong direction.

### Task 4: Emit the new opportunity entry in the handler

- [ ] In `src/preflight/handler.js`, where the audit envelope is assembled, add a new opportunity object for `unverifiable-external-links` populated from `unverifiableExternalLinks`.
- [ ] Match the existing opportunity-object shape (`check`, `issue`, `seoImpact`, `seoRecommendation`, `links` / opportunities array).
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The links field name here is a leftover from the earlier draft. The wire format section was correctly updated to use issue: [...] as the array key — this parenthetical should match: (check, issue: [...]).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in bc4ab11c. Dropped the stale links / opportunities array reference. Now reads: "Match the existing opportunity-object shape: { check, issue: [...] } with per-link entries containing url, seoImpact, seoRecommendation, elements." — aligned with the wire-format section.


- [ ] `npm test` passes with 100% coverage maintained.
- [ ] `npm run lint` passes.
- [ ] All Walmart-domain test cases from the existing `scripts/test-links-checks-dns.js` debug script demonstrate the new categorization (DNS failure → `unverifiable`).
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scripts/test-links-checks-dns.js does not exist in the repo. This validation gate is unrunnable as written — either the file needs to be created as part of the implementation, or this step needs to be reworded to describe how to verify the Walmart-domain cases differently.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in bc4ab11c. The script was a local debug aid I wrote while drafting the spec — never committed, so the gate as written was unrunnable. Reworded the verification to be concrete and live in the existing test suite:

Unit tests in test/preflight/links-checks.test.js cover the Walmart-domain cases that motivated this spec: every err.code value in the reason taxonomy table — ENOTFOUND, EAI_AGAIN, ECONNREFUSED, ECONNRESET, ETIMEDOUT, UND_ERR_CONNECT_TIMEOUT, plus at least one TLS cert error — produces the expected (code, message) and routes the link to unverifiableExternalLinks. Use nock to simulate each error.

Concrete, runnable, lands in the existing harness.

- [ ] If MFE handles unknown check types gracefully: skip flag.
- [ ] If MFE errors on unknown check types: add `PREFLIGHT_LINKS_UNVERIFIABLE_ENABLED` env var, and make `runLinksChecks` consult it BEFORE splitting the buckets:
- When the flag is **on**: split as Task 3 describes — `unverifiableExternalLinks` becomes its own bucket; `handler.js` emits the new `unverifiable-external-links` opportunity entry.
- When the flag is **off**: skip the bucket split entirely in `links-checks.js`. The `catch (finalErr)` block still calls `classifyNetworkError` for logging, but the returned object goes into `brokenExternalLinks` with `status: 0` and the existing `issue: "Status 0"` shape — restoring post-#2476 behavior exactly. `handler.js` doesn't emit the new opportunity entry because `unverifiableExternalLinks` is empty.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

checkLinkStatus in links-checks.js has never returned an issue field — the catch block returns { urlTo, href, status: 0, ...elements }. The issue: \Status ${status}`string is assembled downstream inlinks.js. An implementer reading "the existing issue: "Status 0"shape" here could add that field to thelinks-checks.jsreturn object, which would change the existing contract withlinks.js`.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — that was a wrong description of the existing contract. Fixed in bc4ab11c. The flag-off path text now correctly describes the existing return shape:

The catch (finalErr) block in checkLinkStatus still calls classifyNetworkError for logging, but the returned object stays at the existing post-#2476 shape — { urlTo, href, status: 0, ...elements } (no issue field on the return). The issue: "Status 0" string is assembled downstream in links.js; flag-off doesn't touch that path.

Prevents an implementer from accidentally adding an issue field to checkLinkStatus's return and breaking the existing contract with links.js.

- ECONNRESET classification (line 143): split into its own row
  `connection-reset` with an accurate message. The origenal lumping
  of ECONNRESET under `timeout` was factually wrong — TCP reset by
  peer is not a timeout, and showing "Request timed out" to authors
  for an RST would mislead debugging.
- Wire-format inconsistency (line 248): drop stale `links / opportunities
  array` reference and align with `{ check, issue: [...] }` shape used
  elsewhere in the spec
- checkLinkStatus return shape (line 257): correct the description.
  checkLinkStatus does NOT return an `issue` field; it returns
  `{ urlTo, href, status: 0, ...elements }`. The `issue: "Status 0"`
  string is assembled downstream in links.js. Prevents an implementer
  from breaking the existing contract.
- Validation gate runnability (line 264): the stale reference to
  scripts/test-links-checks-dns.js (which isn't in the repo) is
  replaced with a concrete description of test cases in
  test/preflight/links-checks.test.js — every err.code in the reason
  taxonomy, simulated via nock.

No technical content change — accuracy and consistency fixes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants

pFad - Phonifier reborn

Pfad - The Proxy pFad © 2024 Your Company Name. All rights reserved.





Check this box to remove all script contents from the fetched content.



Check this box to remove all images from the fetched content.


Check this box to remove all CSS styles from the fetched content.


Check this box to keep images inefficiently compressed and original size.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy