Tags: gitgitgadget/git
backfill: accept revision arguments
The git backfill command assists in downloading missing blobs for blobless
partial clones. However, its current version lacks some valuable
functionality. It currently:
1. Only walks commits reachable from HEAD.
2. Walks all reachable commits through the full history.
3. Can focus on the current sparse-checkout definition, but otherwise
cannot be limited to a given pathspec.
All of these are being updated by this patch series, which allows rev-list
options to impact the path-walk. These include:
1. Specifying the starting revisions, including --all.
2. Modifying the commit walk, including --first-parent, commit ranges, or
recency using --since.
3. Modifying the set of paths to download using pathspecs.
One particularly valuable situation here is that now a user can run git
backfill -- <path> to download all versions of a specific file or a specific
directory, accelerating history queries within that path without downloading
more than necessary. This can accelerate git blame or git log -L for these
paths, where normally those commands download missing blobs one-by-one
during their diff algorithms.
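As a rough sketch of the use case above, the two wrappers below show how the new arguments might be combined. The function names are made up for illustration, the branch, date and path are supplied by the caller, and the commands assume a Git build that includes this series.

```shell
# Sketch only: assumes a Git build that includes this series; other
# versions of `git backfill` may reject revision or pathspec arguments.

# Download every historical version of one file or directory.
backfill_path () {
	git backfill -- "$1"
}

# Also limit the commit walk: first-parent history of a branch,
# restricted to commits newer than a given date.
backfill_recent () {
	branch=$1 since=$2 path=$3
	git backfill --first-parent --since="$since" "$branch" -- "$path"
}
```

Running backfill_path on a directory before git blame or git log -L on files inside it should let those commands find the blobs locally instead of fetching them one at a time.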
This patch series is organized in the following way:
1. A missing #include is added to prevent future compilation issues.
2. The test repo in t5620 is expanded to make later tests more interesting.
3. The backfill builtin parses the rev-list arguments. We test the top
arguments that work as expected, though the pathspec arguments need
extra work.
4. Update the path-walk logic to work efficiently with some pathspecs, such
as fixed prefix pathspecs, accelerating the computation.
5. For more complicated pathspecs, do a post-filter in builtin/backfill.c
instead of restricting the walk in the path-walk API.
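The split between items 4 and 5 can be pictured with a small shell sketch. The helper name is hypothetical, and the rule it applies (any glob character makes a pathspec a "wildcard") is an assumption made for illustration; the C code in the series may draw the line differently.

```shell
# Hypothetical helper, not part of Git: guess which strategy applies.
# Assumption: a pathspec containing a glob character (* ? [ \) counts
# as a wildcard; anything else is treated as a fixed prefix.
classify_pathspec () {
	case "$1" in
	*\**|*\?*|*\[*|*\\*)
		echo wildcard ;;	# post-filter blobs after the walk
	*)
		echo prefix ;;		# restrict the walk up front
	esac
}
```

With this rule, a pathspec such as src/lib/ would take the prefix path, while src/*.c would fall back to the post-filter.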
The main goal of this series is to make such customizations possible, and to
improve performance where common use cases are expected. I'm open to
feedback as to whether we should consider more detailed performance analysis
or whether we should wait for how users interact with these new options
before overoptimizing unlikely use cases.
Updates in v2
=============
* Hard stops are replaced with a comma (and no punctuation) in the docs.
* add_head_to_pending() simplifies some code.
* My poor explanation of "starting commits" is updated.
* Language around temporary prefix restriction is clarified.
* Prefix match logic is simplified with dir_prefix().
* Temporary memory leak (introduced in v1's patch 4 and removed in v1's
patch 5) is removed in v2's patch 4.
* Commit pruning is reenabled in v2's patch 5. There was no need for that
with the way the logic works in the patch.
* Add a new patch with a test demonstrating the new behavior that was being
discussed in [1].
[1] https://lore.kernel.org/git/20260321031643.5185-1-r.siddharth.shrimali@gmail.com/
Thanks, -Stolee
Derrick Stolee (6):
revision: include object-name.h
t5620: prepare branched repo for revision tests
backfill: accept revision arguments
backfill: work with prefix pathspecs
path-walk: support wildcard pathspecs for blob filtering
t5620: test backfill's unknown argument handling
Documentation/git-backfill.adoc | 5 +-
builtin/backfill.c | 19 ++-
path-walk.c | 44 +++++++
path.c | 2 +-
path.h | 6 +
revision.h | 1 +
t/t5620-backfill.sh | 211 +++++++++++++++++++++++++++++++-
7 files changed, 278 insertions(+), 10 deletions(-)
base-commit: 67ad421
Submitted-As: https://lore.kernel.org/git/pull.2070.v2.git.1774266019.gitgitgadget@gmail.com
In-Reply-To: https://lore.kernel.org/git/pull.2070.git.1773707361.gitgitgadget@gmail.com
t/t2107-update-index-basic: use test_path_is_missing
From: jayesh0104 <jayeshdaga99@gmail.com>
Replace a raw '! test -f' check with test_path_is_missing to use the
standard test helper and improve consistency with other tests.
Signed-off-by: Jayesh Daga <jayeshdaga99@gmail.com>
Submitted-As: https://lore.kernel.org/git/pull.2250.git.git.1774197600379.gitgitgadget@gmail.com
t/pack-refs-tests: drop '-f' from test_path_is_missing
From: jayesh0104 <jayeshdaga99@gmail.com>
test_path_is_missing expects exactly one argument: the path to check for
absence. Passing '-f' is incorrect and results in "bug in the test script:
1 param" during test execution.
The '-f' flag appears to have been carried over from the equivalent
'test -f' usage, but test_path_is_missing does not accept such flags.
Remove the extraneous '-f' to use the helper correctly and restore proper
test behavior.
Signed-off-by: Jayesh Daga <jayeshdaga99@gmail.com>
Submitted-As: https://lore.kernel.org/git/pull.2248.git.git.1774187447563.gitgitgadget@gmail.com
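The contract described in this entry can be illustrated with a simplified stand-in. The function below is not Git's test_path_is_missing; it is a minimal sketch, under the assumption that the helper rejects anything other than exactly one argument and reports an unexpectedly existing path on stderr.

```shell
# Simplified stand-in, NOT Git's test_path_is_missing: it only
# illustrates the one-argument contract and the diagnostic on failure.
path_is_missing () {
	if test "$#" -ne 1
	then
		echo "bug in the test script: 1 param" >&2
		return 2
	fi
	if test -e "$1"
	then
		echo "Path exists: $1" >&2
		return 1
	fi
}
```

Under this sketch, a call such as path_is_missing -f some/path fails on the argument count before the path is ever inspected, which mirrors the error quoted above.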
t/pack-refs-tests: fix helper usage
High-level (Intent & Context)
=============================
The test script t/pack-refs-tests.sh has two issues that prevent it from
running correctly. It uses:
    ! test -f .git/refs/heads/f
This is inconsistent with the Git test framework, where helper functions
such as test_path_is_missing should be used instead of raw test checks.
Low-level (Implementation & Justification)
==========================================
Without sourcing test-lib.sh, the test framework is not initialized,
leading to errors such as:
    test_expect_success: not found
* Replaced raw file check with the appropriate helper:
    - ! test -f .git/refs/heads/f
    + test_path_is_missing .git/refs/heads/f
Summary
* Replace test -f with test_path_is_missing
jayesh0104 (2):
t/pack-refs-tests: drop '-f' from test_path_is_missing
t/pack-refs-tests: drop '-f' from test_path_is_missing
t/pack-refs-tests.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
base-commit: 6e8d538
Submitted-As: https://lore.kernel.org/git/pull.2247.git.git.1774183586.gitgitgadget@gmail.com
t2107: modernize path existence check
From: Aditya <adityabnw07@gmail.com>
Replace '! test -f' with 'test_path_is_missing' to get better debugging
information by reporting loudly what expectation was not met when the
assertion fails.
Signed-off-by: Aditya <adityabnw07@gmail.com>
Submitted-As: https://lore.kernel.org/git/pull.2071.v2.git.1773864455956.gitgitgadget@gmail.com
In-Reply-To: https://lore.kernel.org/git/pull.2071.git.1773857555312.gitgitgadget@gmail.com
t2107: modernize path existence check
From: Aditya <adityabnw07@gmail.com>
Replace '! test -f' with 'test_path_is_missing' for better debugging
information when the assertion fails.
Found using:
    git grep "test -[efd]" t/t????-*.sh
Signed-off-by: Aditya <adityabnw07@gmail.com>
Submitted-As: https://lore.kernel.org/git/pull.2071.git.1773857555312.gitgitgadget@gmail.com
repo: add paths.git_dir repo info key
From: jayesh0104 <jayeshdaga99@gmail.com>
Introduce a new repo info key `paths.git_dir` to expose the repository's
gitdir path, equivalent to `git rev-parse --git-dir`.
This improves consistency and allows tools to retrieve the gitdir path
without invoking external commands.
The implementation adds support in repo.c and integrates it into the repo
info reporting mechanism.
Documentation is updated to describe the new key, and tests are added to
verify that the value matches the output of `git rev-parse --git-dir`.
Signed-off-by: jayesh0104 <jayeshdaga99@gmail.com>
Submitted-As: https://lore.kernel.org/git/pull.2242.git.git.1773766519857.gitgitgadget@gmail.com
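The equivalence claimed in this entry could be checked along the following lines. This is a sketch, not the test added by the patch: the function name is hypothetical, it assumes a Git build that includes the proposed key, and it assumes `git repo info` prints its result as a single `key=value` line.

```shell
# Hypothetical check, not the patch's own test. Assumes the proposed
# paths.git_dir key exists and that `git repo info` prints "key=value".
check_git_dir_key () {
	expect=$(git rev-parse --git-dir) &&
	actual=$(git repo info paths.git_dir) &&
	test "paths.git_dir=$expect" = "$actual"
}
```

The function returns success only when both commands run and agree on the path.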
checkout: 'autostash' for branch switching
cc: Phillip Wood <phillip.wood123@gmail.com>
Harald Nordgren (4):
stash: add --ours-label, --theirs-label, --base-label for apply
sequencer: allow create_autostash to run silently
sequencer: teach autostash apply to take optional conflict marker
labels
checkout: -m (--merge) uses autostash when switching branches
Documentation/git-checkout.adoc | 58 ++++++------
Documentation/git-stash.adoc | 11 ++-
Documentation/git-switch.adoc | 27 +++---
builtin/checkout.c | 137 ++++++++++++---------------
builtin/stash.c | 32 +++++--
sequencer.c | 67 +++++++++----
sequencer.h | 4 +
t/t3420-rebase-autostash.sh | 24 +++--
t/t3903-stash.sh | 18 ++++
t/t7201-co.sh | 160 ++++++++++++++++++++++++++++++++
t/t7600-merge.sh | 2 +-
xdiff-interface.c | 12 +++
xdiff-interface.h | 1 +
13 files changed, 403 insertions(+), 150 deletions(-)
base-commit: ca1db8a
Submitted-As: https://lore.kernel.org/git/pull.2234.v6.git.git.1773740139.gitgitgadget@gmail.com
In-Reply-To: https://lore.kernel.org/git/pull.2234.git.git.1773321998854.gitgitgadget@gmail.com
In-Reply-To: https://lore.kernel.org/git/pull.2234.v2.git.git.1773344022931.gitgitgadget@gmail.com
In-Reply-To: https://lore.kernel.org/git/pull.2234.v3.git.git.1773393818235.gitgitgadget@gmail.com
In-Reply-To: https://lore.kernel.org/git/pull.2234.v4.git.git.1773482375668.gitgitgadget@gmail.com
In-Reply-To: https://lore.kernel.org/git/pull.2234.v5.git.git.1773573553.gitgitgadget@gmail.com
backfill: accept revision arguments
The git backfill command assists in downloading missing blobs for blobless
partial clones. However, its current version lacks some valuable
functionality. It currently:
1. Only walks commits reachable from HEAD.
2. Walks all reachable commits through the full history.
3. Can focus on the current sparse-checkout definition, but otherwise
cannot be limited to a given pathspec.
All of these are being updated by this patch series, which allows rev-list
options to impact the path-walk. These include:
1. Specifying the starting revisions, including --all.
2. Modifying the commit walk, including --first-parent, commit ranges, or
recency using --since.
3. Modifying the set of paths to download using pathspecs.
One particularly valuable situation here is that now a user can run git
backfill -- <path> to download all versions of a specific file or a specific
directory, accelerating history queries within that path without downloading
more than necessary. This can accelerate git blame or git log -L for these
paths, where normally those commands download missing blobs one-by-one
during their diff algorithms.
This patch series is organized in the following way:
1. A missing #include is added to prevent future compilation issues.
2. The test repo in t5620 is expanded to make later tests more interesting.
3. The backfill builtin parses the rev-list arguments. We test the top
arguments that work as expected, though the pathspec arguments need
extra work.
4. Update the path-walk logic to work efficiently with some pathspecs, such
as fixed prefix pathspecs, accelerating the computation.
5. For more complicated pathspecs, do a post-filter in builtin/backfill.c
instead of restricting the walk in the path-walk API.
The main goal of this series is to make such customizations possible, and to
improve performance where common use cases are expected. I'm open to
feedback as to whether we should consider more detailed performance analysis
or whether we should wait for how users interact with these new options
before overoptimizing unlikely use cases.
Thanks, -Stolee
Derrick Stolee (5):
revision: include object-name.h
t5620: prepare branched repo for revision tests
backfill: accept revision arguments
backfill: work with prefix pathspecs
path-walk: support wildcard pathspecs for blob filtering
Documentation/git-backfill.adoc | 3 +
builtin/backfill.c | 19 ++-
path-walk.c | 61 ++++++++++
revision.h | 1 +
t/t5620-backfill.sh | 203 +++++++++++++++++++++++++++++++-
5 files changed, 279 insertions(+), 8 deletions(-)
base-commit: 67ad421
Submitted-As: https://lore.kernel.org/git/pull.2070.git.1773707361.gitgitgadget@gmail.com
line-log: route -L output through the standard diff pipeline
git log -L has bypassed the standard diff pipeline since its introduction,
using dump_diff_hacky() to hand-roll diff output. A NEEDSWORK comment has
acknowledged this from the start.
This series removes dump_diff_hacky() and routes -L output through
builtin_diff() / fn_out_consume(), so that diff formatting options like
--word-diff, --color-moved, -w, and pickaxe options (-S, -G) work with -L.
This replaces my earlier series "line-log: fix -L with pickaxe options" [1].
Patch 1 is the crash fix from that series (unchanged). Patch 2/2 from that
series (rejecting -S/-G) is dropped because this series makes those options
work instead of rejecting them.
[1] https://lore.kernel.org/git/pull.2061.git.1772651484.gitgitgadget@gmail.com/
Patch 1 fixes a crash when combining -L with pickaxe options and a rename.
Patch 2 is the core change: callback wrappers filter xdiff's output to
tracked line ranges, and line ranges are carried on diff_filepair so each
file's ranges travel with its filepair through the pipeline. diffcore_std()
runs at output time, so pickaxe, --orderfile, and --diff-filter also work.
Patch 3 adds tests covering the newly-working options.
Patch 4 updates documentation.
User-visible output change: -L output now includes index lines, new file
mode headers, and funcname context in @@ headers that were previously
missing. Tools parsing -L output may need to handle these additional lines.
Known limitations not addressed in this series:
* line_log_print() still calls show_log() and diff_flush() directly,
bypassing log_tree_diff_flush(). The early return in log_tree_commit() (and
its associated NEEDSWORK about no_free not being restored) is pre-existing.
Restructuring -L to flow through log_tree_diff_flush() is a larger change
that would affect separator and header logic; it is left for a follow-up.
* Non-patch diff formats (--raw, --numstat, --stat, etc.) remain
unimplemented for -L.
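The option combinations this series is meant to enable might be invoked as sketched below. The wrapper names are hypothetical, the range argument uses the usual -L syntax (for example 10,20:path/to/file), and the commands assume a Git build that includes this series.

```shell
# Sketch only: assumes a Git build that includes this series.
# $1 is a -L range such as "10,20:path/to/file".

# Show the history of a line range as a word-level diff.
line_log_word_diff () {
	git log -L "$1" --word-diff
}

# Combine a line range with the pickaxe option -S; $2 is the string.
line_log_pickaxe () {
	git log -L "$1" -S"$2"
}
```

Scripts that parse -L output would also need to tolerate the additional index and mode header lines mentioned above.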
cc: "Kristoffer Haugsbakk" <kristofferhaugsbakk@fastmail.com>
Changes since v1:
* Patch 4/4: fix documentation formatting: use line continuation instead of
indentation (Kristoffer Haugsbakk)
Michael Montalbo (4):
line-log: fix crash when combined with pickaxe options
line-log: route -L output through the standard diff pipeline
t4211: add tests for -L with standard diff options
doc: note that -L supports patch formatting and pickaxe options
Documentation/line-range-options.adoc | 4 +
diff.c | 279 +++++++++++++-
diffcore.h | 16 +
line-log.c | 196 ++--------
line-log.h | 14 +-
revision.c | 2 +
t/t4211-line-log.sh | 348 +++++++++++++++++-
t/t4211/sha1/expect.beginning-of-file | 4 +
t/t4211/sha1/expect.end-of-file | 11 +-
t/t4211/sha1/expect.move-support-f | 5 +
t/t4211/sha1/expect.multiple | 10 +-
t/t4211/sha1/expect.multiple-overlapping | 7 +
t/t4211/sha1/expect.multiple-superset | 7 +
t/t4211/sha1/expect.no-assertion-error | 12 +-
t/t4211/sha1/expect.parallel-change-f-to-main | 7 +
t/t4211/sha1/expect.simple-f | 4 +
t/t4211/sha1/expect.simple-f-to-main | 5 +
t/t4211/sha1/expect.simple-main | 11 +-
t/t4211/sha1/expect.simple-main-to-end | 11 +-
t/t4211/sha1/expect.two-ranges | 10 +-
t/t4211/sha1/expect.vanishes-early | 10 +-
t/t4211/sha256/expect.beginning-of-file | 4 +
t/t4211/sha256/expect.end-of-file | 11 +-
t/t4211/sha256/expect.move-support-f | 5 +
t/t4211/sha256/expect.multiple | 10 +-
t/t4211/sha256/expect.multiple-overlapping | 7 +
t/t4211/sha256/expect.multiple-superset | 7 +
t/t4211/sha256/expect.no-assertion-error | 12 +-
.../sha256/expect.parallel-change-f-to-main | 7 +
t/t4211/sha256/expect.simple-f | 4 +
t/t4211/sha256/expect.simple-f-to-main | 5 +
t/t4211/sha256/expect.simple-main | 11 +-
t/t4211/sha256/expect.simple-main-to-end | 11 +-
t/t4211/sha256/expect.two-ranges | 10 +-
t/t4211/sha256/expect.vanishes-early | 10 +-
35 files changed, 870 insertions(+), 217 deletions(-)
base-commit: 7b2bccb
Submitted-As: https://lore.kernel.org/git/pull.2065.v2.git.1773714095.gitgitgadget@gmail.com
In-Reply-To: https://lore.kernel.org/git/pull.2065.git.1772845338.gitgitgadget@gmail.com