Skip pnpm install in UI pre-commit hook when deps are unchanged#64124
Skip pnpm install in UI pre-commit hook when deps are unchanged#64124dstandish wants to merge 1 commit into
Conversation
Hashing package.json + pnpm-lock.yaml and caching the result in .build/ui/pnpm-install-hash.txt allows the hook to skip the expensive pnpm install step when neither file has changed and node_modules already exists. This mirrors the hash-based caching pattern already used in compile_ui_assets.py.
|
@potiuk not a huge time savings but 1s is something |
bugraoz93
left a comment
There was a problem hiding this comment.
It looks good! Only added a nit on utilising the common processes to leverage the same file rather than creating additional one although it is not blocking
Oh absolutely - anythind we can shave-off from a thing that is executed 1000s of times but 100s of people is good. |
There was a problem hiding this comment.
Pull request overview
This PR speeds up the ts-compile-lint-ui prek hook by avoiding repeated pnpm install runs when UI dependencies have not changed, using a stored SHA256 hash of package.json + pnpm-lock.yaml as a cache key.
Changes:
- Compute a SHA256 hash of
airflow-core/src/airflow/ui/package.jsonandpnpm-lock.yaml. - Skip
pnpm install --frozen-lockfilewhen the stored hash matches andnode_modules/exists. - Persist the computed hash in
.build/ui/pnpm-install-hash.txtafter a successful install.
| hash_file = AIRFLOW_ROOT_PATH / ".build" / "ui" / "pnpm-install-hash.txt" | ||
| node_modules = dir / "node_modules" | ||
| current_hash = hashlib.sha256( | ||
| (dir / "package.json").read_bytes() + (dir / "pnpm-lock.yaml").read_bytes() | ||
| ).hexdigest() | ||
| stored_hash = hash_file.read_text().strip() if hash_file.exists() else "" | ||
|
|
||
| if node_modules.exists() and current_hash == stored_hash: | ||
| print("pnpm deps unchanged — skipping install.") | ||
| else: | ||
| run_command(["pnpm", "config", "set", "store-dir", ".pnpm-store"], cwd=dir) | ||
| run_command(["pnpm", "install", "--frozen-lockfile", "--config.confirmModulesPurge=false"], cwd=dir) |
There was a problem hiding this comment.
The cache-hit condition can become a false positive if node_modules/ is modified outside this hook (e.g., running pnpm install manually or via another prek hook) without updating .build/ui/pnpm-install-hash.txt. In that case current_hash == stored_hash can still be true while node_modules no longer corresponds to the current lockfile, causing the hook to skip the needed install and then run eslint/tsc against mismatched deps. Consider adding an extra validity check tied to the actual node_modules state (e.g., compare timestamps, or read pnpm’s node_modules/.modules.yaml lockfile checksum) before skipping.
| current_hash = hashlib.sha256( | ||
| (dir / "package.json").read_bytes() + (dir / "pnpm-lock.yaml").read_bytes() | ||
| ).hexdigest() |
There was a problem hiding this comment.
Hashing by simple byte concatenation can (in rare but real cases) produce the same combined byte string for different (package.json, pnpm-lock.yaml) pairs (e.g., if bytes are shifted between files), causing a false cache hit and skipping a required install. To make the hash unambiguous, include a delimiter and/or the length of each file, or hash each file separately and combine the digests.
| current_hash = hashlib.sha256( | |
| (dir / "package.json").read_bytes() + (dir / "pnpm-lock.yaml").read_bytes() | |
| ).hexdigest() | |
| package_json_bytes = (dir / "package.json").read_bytes() | |
| pnpm_lock_bytes = (dir / "pnpm-lock.yaml").read_bytes() | |
| current_hash_hasher = hashlib.sha256() | |
| current_hash_hasher.update(f"package.json:{len(package_json_bytes)}:".encode("utf-8")) | |
| current_hash_hasher.update(package_json_bytes) | |
| current_hash_hasher.update(f"pnpm-lock.yaml:{len(pnpm_lock_bytes)}:".encode("utf-8")) | |
| current_hash_hasher.update(pnpm_lock_bytes) | |
| current_hash = current_hash_hasher.hexdigest() |
| hash_file.parent.mkdir(parents=True, exist_ok=True) | ||
| hash_file.write_text(current_hash) |
There was a problem hiding this comment.
Writing the hash file non-atomically can leave a truncated/partial hash if the process is interrupted mid-write (or if multiple runs overlap), leading to unnecessary reinstalls or inconsistent behavior. Prefer an atomic write pattern (write to a temp file in the same directory, then replace/rename) to make the cache robust.
| ).hexdigest() | ||
| stored_hash = hash_file.read_text().strip() if hash_file.exists() else "" | ||
|
|
||
| if node_modules.exists() and current_hash == stored_hash: |
There was a problem hiding this comment.
node_modules.exists() will also be true if there’s a file (or unexpected path type) named node_modules. Using node_modules.is_dir() (or checking exists() + is_dir()) makes the guard more precise and avoids incorrectly skipping installs in edge cases.
| if node_modules.exists() and current_hash == stored_hash: | |
| if node_modules.is_dir() and current_hash == stored_hash: |
The
Compile / format / lint UIpre-commit hook runspnpm install --frozen-lockfileon every invocation, even whenpackage.jsonandpnpm-lock.yamlhaven't changed. This is the main source of slowness in the hook.This PR adds a hash-based guard: SHA256 of
package.json+pnpm-lock.yamlis stored in.build/ui/pnpm-install-hash.txt(consistent with the caching pattern already used incompile_ui_assets.py). When the hash matches andnode_modules/exists,pnpm installis skipped entirely.Benchmarks
Measured on macOS Darwin 25.3.0, Apple Silicon, pnpm 9.9.0:
pnpm install(cold pnpm store)pnpm install(warm pnpm store)Typical dev savings: ~1.0–1.4 s per commit (~14–20x faster on the hot path) for contributors who frequently touch UI files without changing deps — the common case.
Cold-store savings (~22 s) apply on a fresh checkout. Cache miss adds only ~0.08 s of overhead before falling through to the normal install path.
The hash covers both
package.jsonandpnpm-lock.yaml, so any direct or transitive dep change correctly invalidates the cache.Was generative AI tooling used to co-author this PR?
Generated-by: Claude Sonnet 4.6 following the guidelines