From e417277ae99687b576e48cb477a7a50241ea0096 Mon Sep 17 00:00:00 2001 From: Yee Cheng Chin Date: Mon, 2 Mar 2026 14:54:25 +0000 Subject: [PATCH 01/33] xdiff: re-diff shifted change groups when using histogram algorithm After a diff algorithm has been run, the compaction phase (xdl_change_compact()) shifts and merges change groups to produce a cleaner output. However, this shifting could create a new matched group where both sides now have matching lines. This results in a wrong-looking diff output which contains redundant lines that are the same on both files. Fix this by detecting this situation, and re-diff the texts on each side to find similar lines, using the fall-back Myer's diff. Only do this for histogram diff as it's the only algorithm where this is relevant. Below contains an example, and more details. For an example, consider two files below: file1: A A A A A A A file2: A A x A A A A When using Myer's diff, the algorithm finds that only the "x" has been changed, and produces a final diff result (these are line diffs, but using word-diff syntax for ease of presentation): A A[-A-]{+x+}A AAA When using histogram diff, the algorithm first discovers the LCS "A AAA", which it uses as anchor, then produces an intermediate diff: {+A Ax+}A AAA[- AAA-]. This is a longer diff than Myer's, but it's still self-consistent. However, the compaction phase attempts to shift the first file's diff group upwards (note that this shift crosses the anchor that histogram had used), leading to the final results for histogram diff: [-A AA-]{+A Ax+}A AAA This is a technically correct patch but looks clearly redundant to a human as the first 3 lines should not be in the diff. The fix would detect that a shift has caused matching to a new group, and re-diff the "A AA" and "A Ax" parts, which results in "A A" correctly re-marked as unchanged. This creates the now correct histogram diff: A A[-A-]{+x+}A AAA This issue is not applicable to Myer's diff algorithm as it already generates a minimal diff, which means a shift cannot result in a smaller diff output (the default Myer's diff in xdiff is not guaranteed to be minimal for performance reasons, but it typically does a good enough job). It's also not applicable to patience diff, because it uses only unique lines as anchor for its splits, and falls back to Myer's diff within each split. Shifting requires both ends having the same lines, and therefore cannot cross the unique line boundaries established by the patience algorithm. In contrast histogram diff uses non-unique lines as anchors, and therefore shifting can cross over them. This issue is rare in a normal repository. Below is a table of repositories (`git log --no-merges -p --histogram -1000`), showing how many times a re-diff was done and how many times it resulted in finding matching lines (therefore addressing this issue) with the fix. In general it is fewer than 1% of diff's that exhibit this offending behavior: | Repo (1k commits) | Re-diff | Found matching lines | |--------------------|---------|----------------------| | llvm-project | 45 | 11 | | vim | 110 | 9 | | git | 18 | 2 | | WebKit | 168 | 1 | | ripgrep | 22 | 1 | | cpython | 32 | 0 | | vscode | 13 | 0 | Signed-off-by: Yee Cheng Chin Signed-off-by: Junio C Hamano --- t/meson.build | 1 + t/t4074-diff-shifted-matched-group.sh | 164 ++++++++++++++++++++++++++ xdiff/xdiffi.c | 47 +++++++- 3 files changed, 210 insertions(+), 2 deletions(-) create mode 100755 t/t4074-diff-shifted-matched-group.sh diff --git a/t/meson.build b/t/meson.build index a5531df415ffe2..d6c61b5ab55ea5 100644 --- a/t/meson.build +++ b/t/meson.build @@ -496,6 +496,7 @@ integration_tests = [ 't4070-diff-pairs.sh', 't4071-diff-minimal.sh', 't4072-diff-max-depth.sh', + 't4074-diff-shifted-matched-group.sh', 't4100-apply-stat.sh', 't4101-apply-nonl.sh', 't4102-apply-rename.sh', diff --git a/t/t4074-diff-shifted-matched-group.sh b/t/t4074-diff-shifted-matched-group.sh new file mode 100755 index 00000000000000..d77fa3b79d2b53 --- /dev/null +++ b/t/t4074-diff-shifted-matched-group.sh @@ -0,0 +1,164 @@ +#!/bin/sh + +test_description='shifted diff groups re-diffing during histogram diff' + +. ./test-lib.sh + +test_expect_success 'shifted/merged diff group should re-diff to minimize patch' ' + test_write_lines A x A A A x A A A >file1 && + test_write_lines A x A Z A x A A A >file2 && + + file1_h=$(git rev-parse --short $(git hash-object file1)) && + file2_h=$(git rev-parse --short $(git hash-object file2)) && + + cat >expect <<-EOF && + diff --git a/file1 b/file2 + index $file1_h..$file2_h 100644 + --- a/file1 + +++ b/file2 + @@ -1,7 +1,7 @@ + A + x + A + -A + +Z + A + x + A + EOF + + test_expect_code 1 git diff --no-index --histogram file1 file2 >output && + test_cmp expect output +' + +test_expect_success 'merged diff group with no shift' ' + test_write_lines A Z B x >file1 && + test_write_lines C D x Z E x >file2 && + + file1_h=$(git rev-parse --short $(git hash-object file1)) && + file2_h=$(git rev-parse --short $(git hash-object file2)) && + + cat >expect <<-EOF && + diff --git a/file1 b/file2 + index $file1_h..$file2_h 100644 + --- a/file1 + +++ b/file2 + @@ -1,4 +1,6 @@ + -A + +C + +D + +x + Z + -B + +E + x + EOF + + test_expect_code 1 git diff --no-index --histogram file1 file2 >output && + test_cmp expect output +' + +test_expect_success 're-diff should preserve diff flags' ' + test_write_lines a b c a b c >file1 && + test_write_lines x " b" z a b c >file2 && + + file1_h=$(git rev-parse --short $(git hash-object file1)) && + file2_h=$(git rev-parse --short $(git hash-object file2)) && + + cat >expect <<-EOF && + diff --git a/file1 b/file2 + index $file1_h..$file2_h 100644 + --- a/file1 + +++ b/file2 + @@ -1,6 +1,6 @@ + -a + -b + -c + +x + + b + +z + a + b + c + EOF + + test_expect_code 1 git diff --no-index --histogram file1 file2 >output && + test_cmp expect output && + + cat >expect_iwhite <<-EOF && + diff --git a/file1 b/file2 + index $file1_h..$file2_h 100644 + --- a/file1 + +++ b/file2 + @@ -1,6 +1,6 @@ + -a + +x + b + -c + +z + a + b + c + EOF + + test_expect_code 1 git diff --no-index --histogram --ignore-all-space file1 file2 >output_iwhite && + test_cmp expect_iwhite output_iwhite +' + +test_expect_success 'shifting on either side should trigger re-diff properly' ' + test_write_lines a b c a b c a b c >file1 && + test_write_lines a b c a1 a2 a3 b c1 a b c >file2 && + + file1_h=$(git rev-parse --short $(git hash-object file1)) && + file2_h=$(git rev-parse --short $(git hash-object file2)) && + + cat >expect1 <<-EOF && + diff --git a/file1 b/file2 + index $file1_h..$file2_h 100644 + --- a/file1 + +++ b/file2 + @@ -1,9 +1,11 @@ + a + b + c + -a + +a1 + +a2 + +a3 + b + -c + +c1 + a + b + c + EOF + + test_expect_code 1 git diff --no-index --histogram file1 file2 >output1 && + test_cmp expect1 output1 && + + cat >expect2 <<-EOF && + diff --git a/file2 b/file1 + index $file2_h..$file1_h 100644 + --- a/file2 + +++ b/file1 + @@ -1,11 +1,9 @@ + a + b + c + -a1 + -a2 + -a3 + +a + b + -c1 + +c + a + b + c + EOF + + test_expect_code 1 git diff --no-index --histogram file2 file1 >output2 && + test_cmp expect2 output2 +' + +test_done diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c index 6f3998ee54c01e..23cc64efd8112f 100644 --- a/xdiff/xdiffi.c +++ b/xdiff/xdiffi.c @@ -793,6 +793,7 @@ static int group_slide_up(xdfile_t *xdf, struct xdlgroup *g) */ int xdl_change_compact(xdfile_t *xdf, xdfile_t *xdfo, long flags) { struct xdlgroup g, go; + struct xdlgroup g_orig; long earliest_end, end_matching_other; long groupsize; @@ -806,10 +807,12 @@ int xdl_change_compact(xdfile_t *xdf, xdfile_t *xdfo, long flags) { if (g.end == g.start) goto next; + g_orig = g; + /* * Now shift the change up and then down as far as possible in * each direction. If it bumps into any other changes, merge - * them. + * them and restart the process. */ do { groupsize = g.end - g.start; @@ -862,7 +865,8 @@ int xdl_change_compact(xdfile_t *xdf, xdfile_t *xdfo, long flags) { /* * Move the possibly merged group of changes back to * line up with the last group of changes from the - * other file that it can align with. + * other file that it can align with. This avoids breaking + * a single change into a separate addition/deletion. */ while (go.end == go.start) { if (group_slide_up(xdf, &g)) @@ -915,6 +919,45 @@ int xdl_change_compact(xdfile_t *xdf, xdfile_t *xdfo, long flags) { } } + /* + * If we merged change groups during shifting, the new + * combined group could now have matching lines in both files, + * even if the original separate groups did not. Re-diff the + * new group to find these matching lines to mark them as + * unchanged. + * + * Only do this if the corresponding group in the other file is + * non-empty, as it's trivial otherwise. + * + * Only do this for histogram diff as its LCS algorithm allows + * for this scenario. In contrast, patience diff finds LCS + * of unique lines that groups cannot be shifted across. + * Myer's diff (standalone or used as fall-back in patience + * diff) already finds minimal edits so it is not possible for + * shifted groups to result in a smaller diff. (Without + * XDF_NEED_MINIMAL, Myer's isn't technically guaranteed to be + * minimal, but it should be so most of the time) + */ + if (go.end != go.start && + XDF_DIFF_ALG(flags) == XDF_HISTOGRAM_DIFF && + (g.start != g_orig.start || + g.end != g_orig.end)) { + xpparam_t xpp; + xdfenv_t xe; + + memset(&xpp, 0, sizeof(xpp)); + xpp.flags = flags & ~XDF_DIFF_ALGORITHM_MASK; + + xe.xdf1 = *xdf; + xe.xdf2 = *xdfo; + + if (xdl_fall_back_diff(&xe, &xpp, + g.start + 1, g.end - g.start, + go.start + 1, go.end - go.start)) { + return -1; + } + } + next: /* Move past the just-processed group: */ if (group_next(xdf, &g)) From 6e4d923267ca80dd1392bf7e0673c74711e8cb68 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Mon, 2 Mar 2026 13:13:05 +0100 Subject: [PATCH 02/33] add-patch: split out header from "add-interactive.h" While we have a "add-patch.c" code file, its declarations are part of "add-interactive.h". This makes it somewhat harder than necessary to find relevant code and to identify clear boundaries between the two subsystems. Split up concerns and move declarations that relate to "add-patch.c" into a new "add-patch.h" header. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- add-interactive.h | 24 +++--------------------- add-patch.c | 1 + add-patch.h | 27 +++++++++++++++++++++++++++ 3 files changed, 31 insertions(+), 21 deletions(-) create mode 100644 add-patch.h diff --git a/add-interactive.h b/add-interactive.h index 784339777509f7..6c62489bfef287 100644 --- a/add-interactive.h +++ b/add-interactive.h @@ -1,15 +1,11 @@ #ifndef ADD_INTERACTIVE_H #define ADD_INTERACTIVE_H +#include "add-patch.h" #include "color.h" -struct add_p_opt { - int context; - int interhunkcontext; - int auto_advance; -}; - -#define ADD_P_OPT_INIT { .context = -1, .interhunkcontext = -1, .auto_advance = 1 } +struct pathspec; +struct repository; struct add_i_state { struct repository *r; @@ -37,21 +33,7 @@ void init_add_i_state(struct add_i_state *s, struct repository *r, struct add_p_opt *add_p_opt); void clear_add_i_state(struct add_i_state *s); -struct repository; -struct pathspec; int run_add_i(struct repository *r, const struct pathspec *ps, struct add_p_opt *add_p_opt); -enum add_p_mode { - ADD_P_ADD, - ADD_P_STASH, - ADD_P_RESET, - ADD_P_CHECKOUT, - ADD_P_WORKTREE, -}; - -int run_add_p(struct repository *r, enum add_p_mode mode, - struct add_p_opt *o, const char *revision, - const struct pathspec *ps); - #endif diff --git a/add-patch.c b/add-patch.c index 8c03f710d380c1..8ce2fc02f6d2e5 100644 --- a/add-patch.c +++ b/add-patch.c @@ -3,6 +3,7 @@ #include "git-compat-util.h" #include "add-interactive.h" +#include "add-patch.h" #include "advice.h" #include "editor.h" #include "environment.h" diff --git a/add-patch.h b/add-patch.h new file mode 100644 index 00000000000000..88b00ca788722a --- /dev/null +++ b/add-patch.h @@ -0,0 +1,27 @@ +#ifndef ADD_PATCH_H +#define ADD_PATCH_H + +struct pathspec; +struct repository; + +struct add_p_opt { + int context; + int interhunkcontext; + int auto_advance; +}; + +#define ADD_P_OPT_INIT { .context = -1, .interhunkcontext = -1, .auto_advance = 1 } + +enum add_p_mode { + ADD_P_ADD, + ADD_P_STASH, + ADD_P_RESET, + ADD_P_CHECKOUT, + ADD_P_WORKTREE, +}; + +int run_add_p(struct repository *r, enum add_p_mode mode, + struct add_p_opt *o, const char *revision, + const struct pathspec *ps); + +#endif From e3d4d7787cc3b2f0281e808042ceaa08e05c281b Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Mon, 2 Mar 2026 13:13:06 +0100 Subject: [PATCH 03/33] add-patch: split out `struct interactive_options` The `struct add_p_opt` is reused both by our infra for "git add -p" and "git add -i". Users of `run_add_i()` for example are expected to pass `struct add_p_opt`. This is somewhat confusing and raises the question of which options apply to what part of the stack. But things are even more confusing than that: while callers are expected to pass in `struct add_p_opt`, these options ultimately get used to initialize a `struct add_i_state` that is used by both subsystems. So we are basically going full circle here. Refactor the code and split out a new `struct interactive_options` that hosts common options used by both. These options are then applied to a `struct interactive_config` that hosts common configuration. This refactoring doesn't yet fully detangle the two subsystems from one another, as we still end up calling `init_add_i_state()` in the "git add -p" subsystem. This will be fixed in a subsequent commit. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- add-interactive.c | 177 +++++++++--------------------------------- add-interactive.h | 24 +----- add-patch.c | 187 ++++++++++++++++++++++++++++++++++++--------- add-patch.h | 38 ++++++++- builtin/add.c | 26 +++---- builtin/checkout.c | 4 +- builtin/commit.c | 16 ++-- builtin/reset.c | 20 ++--- builtin/stash.c | 54 ++++++------- commit.h | 2 +- 10 files changed, 290 insertions(+), 258 deletions(-) diff --git a/add-interactive.c b/add-interactive.c index 158063968266ad..152e2a02979e45 100644 --- a/add-interactive.c +++ b/add-interactive.c @@ -3,7 +3,6 @@ #include "git-compat-util.h" #include "add-interactive.h" #include "color.h" -#include "config.h" #include "diffcore.h" #include "gettext.h" #include "hash.h" @@ -20,120 +19,18 @@ #include "prompt.h" #include "tree.h" -static void init_color(struct repository *r, enum git_colorbool use_color, - const char *section_and_slot, char *dst, - const char *default_color) -{ - char *key = xstrfmt("color.%s", section_and_slot); - const char *value; - - if (!want_color(use_color)) - dst[0] = '\0'; - else if (repo_config_get_value(r, key, &value) || - color_parse(value, dst)) - strlcpy(dst, default_color, COLOR_MAXLEN); - - free(key); -} - -static enum git_colorbool check_color_config(struct repository *r, const char *var) -{ - const char *value; - enum git_colorbool ret; - - if (repo_config_get_value(r, var, &value)) - ret = GIT_COLOR_UNKNOWN; - else - ret = git_config_colorbool(var, value); - - /* - * Do not rely on want_color() to fall back to color.ui for us. It uses - * the value parsed by git_color_config(), which may not have been - * called by the main command. - */ - if (ret == GIT_COLOR_UNKNOWN && - !repo_config_get_value(r, "color.ui", &value)) - ret = git_config_colorbool("color.ui", value); - - return ret; -} - void init_add_i_state(struct add_i_state *s, struct repository *r, - struct add_p_opt *add_p_opt) + struct interactive_options *opts) { s->r = r; - s->context = -1; - s->interhunkcontext = -1; - s->auto_advance = add_p_opt->auto_advance; - - s->use_color_interactive = check_color_config(r, "color.interactive"); - - init_color(r, s->use_color_interactive, "interactive.header", - s->header_color, GIT_COLOR_BOLD); - init_color(r, s->use_color_interactive, "interactive.help", - s->help_color, GIT_COLOR_BOLD_RED); - init_color(r, s->use_color_interactive, "interactive.prompt", - s->prompt_color, GIT_COLOR_BOLD_BLUE); - init_color(r, s->use_color_interactive, "interactive.error", - s->error_color, GIT_COLOR_BOLD_RED); - strlcpy(s->reset_color_interactive, - want_color(s->use_color_interactive) ? GIT_COLOR_RESET : "", COLOR_MAXLEN); - - s->use_color_diff = check_color_config(r, "color.diff"); - - init_color(r, s->use_color_diff, "diff.frag", s->fraginfo_color, - diff_get_color(s->use_color_diff, DIFF_FRAGINFO)); - init_color(r, s->use_color_diff, "diff.context", s->context_color, - "fall back"); - if (!strcmp(s->context_color, "fall back")) - init_color(r, s->use_color_diff, "diff.plain", - s->context_color, - diff_get_color(s->use_color_diff, DIFF_CONTEXT)); - init_color(r, s->use_color_diff, "diff.old", s->file_old_color, - diff_get_color(s->use_color_diff, DIFF_FILE_OLD)); - init_color(r, s->use_color_diff, "diff.new", s->file_new_color, - diff_get_color(s->use_color_diff, DIFF_FILE_NEW)); - strlcpy(s->reset_color_diff, - want_color(s->use_color_diff) ? GIT_COLOR_RESET : "", COLOR_MAXLEN); - - FREE_AND_NULL(s->interactive_diff_filter); - repo_config_get_string(r, "interactive.difffilter", - &s->interactive_diff_filter); - - FREE_AND_NULL(s->interactive_diff_algorithm); - repo_config_get_string(r, "diff.algorithm", - &s->interactive_diff_algorithm); - - if (!repo_config_get_int(r, "diff.context", &s->context)) - if (s->context < 0) - die(_("%s cannot be negative"), "diff.context"); - if (!repo_config_get_int(r, "diff.interHunkContext", &s->interhunkcontext)) - if (s->interhunkcontext < 0) - die(_("%s cannot be negative"), "diff.interHunkContext"); - - repo_config_get_bool(r, "interactive.singlekey", &s->use_single_key); - if (s->use_single_key) - setbuf(stdin, NULL); - - if (add_p_opt->context != -1) { - if (add_p_opt->context < 0) - die(_("%s cannot be negative"), "--unified"); - s->context = add_p_opt->context; - } - if (add_p_opt->interhunkcontext != -1) { - if (add_p_opt->interhunkcontext < 0) - die(_("%s cannot be negative"), "--inter-hunk-context"); - s->interhunkcontext = add_p_opt->interhunkcontext; - } + interactive_config_init(&s->cfg, r, opts); } void clear_add_i_state(struct add_i_state *s) { - FREE_AND_NULL(s->interactive_diff_filter); - FREE_AND_NULL(s->interactive_diff_algorithm); + interactive_config_clear(&s->cfg); memset(s, 0, sizeof(*s)); - s->use_color_interactive = GIT_COLOR_UNKNOWN; - s->use_color_diff = GIT_COLOR_UNKNOWN; + interactive_config_clear(&s->cfg); } /* @@ -287,7 +184,7 @@ static void list(struct add_i_state *s, struct string_list *list, int *selected, return; if (opts->header) - color_fprintf_ln(stdout, s->header_color, + color_fprintf_ln(stdout, s->cfg.header_color, "%s", opts->header); for (i = 0; i < list->nr; i++) { @@ -355,7 +252,7 @@ static ssize_t list_and_choose(struct add_i_state *s, list(s, &items->items, items->selected, &opts->list_opts); - color_fprintf(stdout, s->prompt_color, "%s", opts->prompt); + color_fprintf(stdout, s->cfg.prompt_color, "%s", opts->prompt); fputs(singleton ? "> " : ">> ", stdout); fflush(stdout); @@ -433,7 +330,7 @@ static ssize_t list_and_choose(struct add_i_state *s, if (from < 0 || from >= items->items.nr || (singleton && from + 1 != to)) { - color_fprintf_ln(stderr, s->error_color, + color_fprintf_ln(stderr, s->cfg.error_color, _("Huh (%s)?"), p); break; } else if (singleton) { @@ -993,7 +890,7 @@ static int run_patch(struct add_i_state *s, const struct pathspec *ps, free(files->items.items[i].string); } else if (item->index.unmerged || item->worktree.unmerged) { - color_fprintf_ln(stderr, s->error_color, + color_fprintf_ln(stderr, s->cfg.error_color, _("ignoring unmerged: %s"), files->items.items[i].string); free(item); @@ -1015,10 +912,10 @@ static int run_patch(struct add_i_state *s, const struct pathspec *ps, opts->prompt = N_("Patch update"); count = list_and_choose(s, files, opts); if (count > 0) { - struct add_p_opt add_p_opt = { - .context = s->context, - .interhunkcontext = s->interhunkcontext, - .auto_advance = s->auto_advance + struct interactive_options opts = { + .context = s->cfg.context, + .interhunkcontext = s->cfg.interhunkcontext, + .auto_advance = s->cfg.auto_advance, }; struct strvec args = STRVEC_INIT; struct pathspec ps_selected = { 0 }; @@ -1030,7 +927,7 @@ static int run_patch(struct add_i_state *s, const struct pathspec *ps, parse_pathspec(&ps_selected, PATHSPEC_ALL_MAGIC & ~PATHSPEC_LITERAL, PATHSPEC_LITERAL_PATH, "", args.v); - res = run_add_p(s->r, ADD_P_ADD, &add_p_opt, NULL, &ps_selected); + res = run_add_p(s->r, ADD_P_ADD, &opts, NULL, &ps_selected); strvec_clear(&args); clear_pathspec(&ps_selected); } @@ -1066,10 +963,10 @@ static int run_diff(struct add_i_state *s, const struct pathspec *ps, struct child_process cmd = CHILD_PROCESS_INIT; strvec_pushl(&cmd.args, "git", "diff", "-p", "--cached", NULL); - if (s->context != -1) - strvec_pushf(&cmd.args, "--unified=%i", s->context); - if (s->interhunkcontext != -1) - strvec_pushf(&cmd.args, "--inter-hunk-context=%i", s->interhunkcontext); + if (s->cfg.context != -1) + strvec_pushf(&cmd.args, "--unified=%i", s->cfg.context); + if (s->cfg.interhunkcontext != -1) + strvec_pushf(&cmd.args, "--inter-hunk-context=%i", s->cfg.interhunkcontext); strvec_pushl(&cmd.args, oid_to_hex(!is_initial ? &oid : s->r->hash_algo->empty_tree), "--", NULL); for (i = 0; i < files->items.nr; i++) @@ -1087,17 +984,17 @@ static int run_help(struct add_i_state *s, const struct pathspec *ps UNUSED, struct prefix_item_list *files UNUSED, struct list_and_choose_options *opts UNUSED) { - color_fprintf_ln(stdout, s->help_color, "status - %s", + color_fprintf_ln(stdout, s->cfg.help_color, "status - %s", _("show paths with changes")); - color_fprintf_ln(stdout, s->help_color, "update - %s", + color_fprintf_ln(stdout, s->cfg.help_color, "update - %s", _("add working tree state to the staged set of changes")); - color_fprintf_ln(stdout, s->help_color, "revert - %s", + color_fprintf_ln(stdout, s->cfg.help_color, "revert - %s", _("revert staged set of changes back to the HEAD version")); - color_fprintf_ln(stdout, s->help_color, "patch - %s", + color_fprintf_ln(stdout, s->cfg.help_color, "patch - %s", _("pick hunks and update selectively")); - color_fprintf_ln(stdout, s->help_color, "diff - %s", + color_fprintf_ln(stdout, s->cfg.help_color, "diff - %s", _("view diff between HEAD and index")); - color_fprintf_ln(stdout, s->help_color, "add untracked - %s", + color_fprintf_ln(stdout, s->cfg.help_color, "add untracked - %s", _("add contents of untracked files to the staged set of changes")); return 0; @@ -1105,21 +1002,21 @@ static int run_help(struct add_i_state *s, const struct pathspec *ps UNUSED, static void choose_prompt_help(struct add_i_state *s) { - color_fprintf_ln(stdout, s->help_color, "%s", + color_fprintf_ln(stdout, s->cfg.help_color, "%s", _("Prompt help:")); - color_fprintf_ln(stdout, s->help_color, "1 - %s", + color_fprintf_ln(stdout, s->cfg.help_color, "1 - %s", _("select a single item")); - color_fprintf_ln(stdout, s->help_color, "3-5 - %s", + color_fprintf_ln(stdout, s->cfg.help_color, "3-5 - %s", _("select a range of items")); - color_fprintf_ln(stdout, s->help_color, "2-3,6-9 - %s", + color_fprintf_ln(stdout, s->cfg.help_color, "2-3,6-9 - %s", _("select multiple ranges")); - color_fprintf_ln(stdout, s->help_color, "foo - %s", + color_fprintf_ln(stdout, s->cfg.help_color, "foo - %s", _("select item based on unique prefix")); - color_fprintf_ln(stdout, s->help_color, "-... - %s", + color_fprintf_ln(stdout, s->cfg.help_color, "-... - %s", _("unselect specified items")); - color_fprintf_ln(stdout, s->help_color, "* - %s", + color_fprintf_ln(stdout, s->cfg.help_color, "* - %s", _("choose all items")); - color_fprintf_ln(stdout, s->help_color, " - %s", + color_fprintf_ln(stdout, s->cfg.help_color, " - %s", _("(empty) finish selecting")); } @@ -1154,7 +1051,7 @@ static void print_command_item(int i, int selected UNUSED, static void command_prompt_help(struct add_i_state *s) { - const char *help_color = s->help_color; + const char *help_color = s->cfg.help_color; color_fprintf_ln(stdout, help_color, "%s", _("Prompt help:")); color_fprintf_ln(stdout, help_color, "1 - %s", _("select a numbered item")); @@ -1165,7 +1062,7 @@ static void command_prompt_help(struct add_i_state *s) } int run_add_i(struct repository *r, const struct pathspec *ps, - struct add_p_opt *add_p_opt) + struct interactive_options *interactive_opts) { struct add_i_state s = { NULL }; struct print_command_item_data data = { "[", "]" }; @@ -1208,15 +1105,15 @@ int run_add_i(struct repository *r, const struct pathspec *ps, ->util = util; } - init_add_i_state(&s, r, add_p_opt); + init_add_i_state(&s, r, interactive_opts); /* * When color was asked for, use the prompt color for * highlighting, otherwise use square brackets. */ - if (want_color(s.use_color_interactive)) { - data.color = s.prompt_color; - data.reset = s.reset_color_interactive; + if (want_color(s.cfg.use_color_interactive)) { + data.color = s.cfg.prompt_color; + data.reset = s.cfg.reset_color_interactive; } print_file_item_data.color = data.color; print_file_item_data.reset = data.reset; diff --git a/add-interactive.h b/add-interactive.h index 6c62489bfef287..eefa2edc7c124b 100644 --- a/add-interactive.h +++ b/add-interactive.h @@ -2,38 +2,20 @@ #define ADD_INTERACTIVE_H #include "add-patch.h" -#include "color.h" struct pathspec; struct repository; struct add_i_state { struct repository *r; - enum git_colorbool use_color_interactive; - enum git_colorbool use_color_diff; - char header_color[COLOR_MAXLEN]; - char help_color[COLOR_MAXLEN]; - char prompt_color[COLOR_MAXLEN]; - char error_color[COLOR_MAXLEN]; - char reset_color_interactive[COLOR_MAXLEN]; - - char fraginfo_color[COLOR_MAXLEN]; - char context_color[COLOR_MAXLEN]; - char file_old_color[COLOR_MAXLEN]; - char file_new_color[COLOR_MAXLEN]; - char reset_color_diff[COLOR_MAXLEN]; - - int use_single_key; - char *interactive_diff_filter, *interactive_diff_algorithm; - int context, interhunkcontext; - int auto_advance; + struct interactive_config cfg; }; void init_add_i_state(struct add_i_state *s, struct repository *r, - struct add_p_opt *add_p_opt); + struct interactive_options *opts); void clear_add_i_state(struct add_i_state *s); int run_add_i(struct repository *r, const struct pathspec *ps, - struct add_p_opt *add_p_opt); + struct interactive_options *opts); #endif diff --git a/add-patch.c b/add-patch.c index 8ce2fc02f6d2e5..756143eb846dba 100644 --- a/add-patch.c +++ b/add-patch.c @@ -5,6 +5,8 @@ #include "add-interactive.h" #include "add-patch.h" #include "advice.h" +#include "config.h" +#include "diff.h" #include "editor.h" #include "environment.h" #include "gettext.h" @@ -279,6 +281,123 @@ struct add_p_state { const char *revision; }; +static void init_color(struct repository *r, + enum git_colorbool use_color, + const char *section_and_slot, char *dst, + const char *default_color) +{ + char *key = xstrfmt("color.%s", section_and_slot); + const char *value; + + if (!want_color(use_color)) + dst[0] = '\0'; + else if (repo_config_get_value(r, key, &value) || + color_parse(value, dst)) + strlcpy(dst, default_color, COLOR_MAXLEN); + + free(key); +} + +static enum git_colorbool check_color_config(struct repository *r, const char *var) +{ + const char *value; + enum git_colorbool ret; + + if (repo_config_get_value(r, var, &value)) + ret = GIT_COLOR_UNKNOWN; + else + ret = git_config_colorbool(var, value); + + /* + * Do not rely on want_color() to fall back to color.ui for us. It uses + * the value parsed by git_color_config(), which may not have been + * called by the main command. + */ + if (ret == GIT_COLOR_UNKNOWN && + !repo_config_get_value(r, "color.ui", &value)) + ret = git_config_colorbool("color.ui", value); + + return ret; +} + +void interactive_config_init(struct interactive_config *cfg, + struct repository *r, + struct interactive_options *opts) +{ + cfg->context = -1; + cfg->interhunkcontext = -1; + cfg->auto_advance = opts->auto_advance; + + cfg->use_color_interactive = check_color_config(r, "color.interactive"); + + init_color(r, cfg->use_color_interactive, "interactive.header", + cfg->header_color, GIT_COLOR_BOLD); + init_color(r, cfg->use_color_interactive, "interactive.help", + cfg->help_color, GIT_COLOR_BOLD_RED); + init_color(r, cfg->use_color_interactive, "interactive.prompt", + cfg->prompt_color, GIT_COLOR_BOLD_BLUE); + init_color(r, cfg->use_color_interactive, "interactive.error", + cfg->error_color, GIT_COLOR_BOLD_RED); + strlcpy(cfg->reset_color_interactive, + want_color(cfg->use_color_interactive) ? GIT_COLOR_RESET : "", COLOR_MAXLEN); + + cfg->use_color_diff = check_color_config(r, "color.diff"); + + init_color(r, cfg->use_color_diff, "diff.frag", cfg->fraginfo_color, + diff_get_color(cfg->use_color_diff, DIFF_FRAGINFO)); + init_color(r, cfg->use_color_diff, "diff.context", cfg->context_color, + "fall back"); + if (!strcmp(cfg->context_color, "fall back")) + init_color(r, cfg->use_color_diff, "diff.plain", + cfg->context_color, + diff_get_color(cfg->use_color_diff, DIFF_CONTEXT)); + init_color(r, cfg->use_color_diff, "diff.old", cfg->file_old_color, + diff_get_color(cfg->use_color_diff, DIFF_FILE_OLD)); + init_color(r, cfg->use_color_diff, "diff.new", cfg->file_new_color, + diff_get_color(cfg->use_color_diff, DIFF_FILE_NEW)); + strlcpy(cfg->reset_color_diff, + want_color(cfg->use_color_diff) ? GIT_COLOR_RESET : "", COLOR_MAXLEN); + + FREE_AND_NULL(cfg->interactive_diff_filter); + repo_config_get_string(r, "interactive.difffilter", + &cfg->interactive_diff_filter); + + FREE_AND_NULL(cfg->interactive_diff_algorithm); + repo_config_get_string(r, "diff.algorithm", + &cfg->interactive_diff_algorithm); + + if (!repo_config_get_int(r, "diff.context", &cfg->context)) + if (cfg->context < 0) + die(_("%s cannot be negative"), "diff.context"); + if (!repo_config_get_int(r, "diff.interHunkContext", &cfg->interhunkcontext)) + if (cfg->interhunkcontext < 0) + die(_("%s cannot be negative"), "diff.interHunkContext"); + + repo_config_get_bool(r, "interactive.singlekey", &cfg->use_single_key); + if (cfg->use_single_key) + setbuf(stdin, NULL); + + if (opts->context != -1) { + if (opts->context < 0) + die(_("%s cannot be negative"), "--unified"); + cfg->context = opts->context; + } + if (opts->interhunkcontext != -1) { + if (opts->interhunkcontext < 0) + die(_("%s cannot be negative"), "--inter-hunk-context"); + cfg->interhunkcontext = opts->interhunkcontext; + } +} + +void interactive_config_clear(struct interactive_config *cfg) +{ + FREE_AND_NULL(cfg->interactive_diff_filter); + FREE_AND_NULL(cfg->interactive_diff_algorithm); + memset(cfg, 0, sizeof(*cfg)); + cfg->use_color_interactive = GIT_COLOR_UNKNOWN; + cfg->use_color_diff = GIT_COLOR_UNKNOWN; +} + static void add_p_state_clear(struct add_p_state *s) { size_t i; @@ -299,9 +418,9 @@ static void err(struct add_p_state *s, const char *fmt, ...) va_list args; va_start(args, fmt); - fputs(s->s.error_color, stdout); + fputs(s->s.cfg.error_color, stdout); vprintf(fmt, args); - puts(s->s.reset_color_interactive); + puts(s->s.cfg.reset_color_interactive); va_end(args); } @@ -424,12 +543,12 @@ static int parse_diff(struct add_p_state *s, const struct pathspec *ps) int res; strvec_pushv(&args, s->mode->diff_cmd); - if (s->s.context != -1) - strvec_pushf(&args, "--unified=%i", s->s.context); - if (s->s.interhunkcontext != -1) - strvec_pushf(&args, "--inter-hunk-context=%i", s->s.interhunkcontext); - if (s->s.interactive_diff_algorithm) - strvec_pushf(&args, "--diff-algorithm=%s", s->s.interactive_diff_algorithm); + if (s->s.cfg.context != -1) + strvec_pushf(&args, "--unified=%i", s->s.cfg.context); + if (s->s.cfg.interhunkcontext != -1) + strvec_pushf(&args, "--inter-hunk-context=%i", s->s.cfg.interhunkcontext); + if (s->s.cfg.interactive_diff_algorithm) + strvec_pushf(&args, "--diff-algorithm=%s", s->s.cfg.interactive_diff_algorithm); if (s->revision) { struct object_id oid; strvec_push(&args, @@ -458,9 +577,9 @@ static int parse_diff(struct add_p_state *s, const struct pathspec *ps) } strbuf_complete_line(plain); - if (want_color_fd(1, s->s.use_color_diff)) { + if (want_color_fd(1, s->s.cfg.use_color_diff)) { struct child_process colored_cp = CHILD_PROCESS_INIT; - const char *diff_filter = s->s.interactive_diff_filter; + const char *diff_filter = s->s.cfg.interactive_diff_filter; setup_child_process(s, &colored_cp, NULL); xsnprintf((char *)args.v[color_arg_index], 8, "--color"); @@ -693,7 +812,7 @@ static void render_hunk(struct add_p_state *s, struct hunk *hunk, hunk->colored_end - hunk->colored_start); return; } else { - strbuf_addstr(out, s->s.fraginfo_color); + strbuf_addstr(out, s->s.cfg.fraginfo_color); p = s->colored.buf + header->colored_extra_start; len = header->colored_extra_end - header->colored_extra_start; @@ -715,7 +834,7 @@ static void render_hunk(struct add_p_state *s, struct hunk *hunk, if (len) strbuf_add(out, p, len); else if (colored) - strbuf_addf(out, "%s\n", s->s.reset_color_diff); + strbuf_addf(out, "%s\n", s->s.cfg.reset_color_diff); else strbuf_addch(out, '\n'); } @@ -1104,12 +1223,12 @@ static void recolor_hunk(struct add_p_state *s, struct hunk *hunk) strbuf_addstr(&s->colored, plain[current] == '-' ? - s->s.file_old_color : + s->s.cfg.file_old_color : plain[current] == '+' ? - s->s.file_new_color : - s->s.context_color); + s->s.cfg.file_new_color : + s->s.cfg.context_color); strbuf_add(&s->colored, plain + current, eol - current); - strbuf_addstr(&s->colored, s->s.reset_color_diff); + strbuf_addstr(&s->colored, s->s.cfg.reset_color_diff); if (next > eol) strbuf_add(&s->colored, plain + eol, next - eol); current = next; @@ -1238,7 +1357,7 @@ static int run_apply_check(struct add_p_state *s, static int read_single_character(struct add_p_state *s) { - if (s->s.use_single_key) { + if (s->s.cfg.use_single_key) { int res = read_key_without_echo(&s->answer); printf("%s\n", res == EOF ? "" : s->answer.buf); return res; @@ -1252,7 +1371,7 @@ static int read_single_character(struct add_p_state *s) static int prompt_yesno(struct add_p_state *s, const char *prompt) { for (;;) { - color_fprintf(stdout, s->s.prompt_color, "%s", _(prompt)); + color_fprintf(stdout, s->s.cfg.prompt_color, "%s", _(prompt)); fflush(stdout); if (read_single_character(s) == EOF) return -1; @@ -1541,7 +1660,7 @@ static size_t patch_update_file(struct add_p_state *s, size_t idx) /* Everything decided? */ if (undecided_previous < 0 && undecided_next < 0 && hunk->use != UNDECIDED_HUNK) { - if (!s->s.auto_advance) + if (!s->s.cfg.auto_advance) all_decided = 1; else { patch_update_resp++; @@ -1595,11 +1714,11 @@ static size_t patch_update_file(struct add_p_state *s, size_t idx) permitted |= ALLOW_EDIT; strbuf_addstr(&s->buf, ",e"); } - if (!s->s.auto_advance && s->file_diff_nr > 1) { + if (!s->s.cfg.auto_advance && s->file_diff_nr > 1) { permitted |= ALLOW_GOTO_NEXT_FILE; strbuf_addstr(&s->buf, ",>"); } - if (!s->s.auto_advance && s->file_diff_nr > 1) { + if (!s->s.cfg.auto_advance && s->file_diff_nr > 1) { permitted |= ALLOW_GOTO_PREVIOUS_FILE; strbuf_addstr(&s->buf, ",<"); } @@ -1614,7 +1733,7 @@ static size_t patch_update_file(struct add_p_state *s, size_t idx) else prompt_mode_type = PROMPT_HUNK; - printf("%s(%"PRIuMAX"/%"PRIuMAX") ", s->s.prompt_color, + printf("%s(%"PRIuMAX"/%"PRIuMAX") ", s->s.cfg.prompt_color, (uintmax_t)hunk_index + 1, (uintmax_t)(file_diff->hunk_nr ? file_diff->hunk_nr @@ -1627,8 +1746,8 @@ static size_t patch_update_file(struct add_p_state *s, size_t idx) } printf(_(s->mode->prompt_mode[prompt_mode_type]), hunk_use_decision, s->buf.buf); - if (*s->s.reset_color_interactive) - fputs(s->s.reset_color_interactive, stdout); + if (*s->s.cfg.reset_color_interactive) + fputs(s->s.cfg.reset_color_interactive, stdout); fflush(stdout); if (read_single_character(s) == EOF) { patch_update_resp = s->file_diff_nr; @@ -1679,7 +1798,7 @@ static size_t patch_update_file(struct add_p_state *s, size_t idx) } else if (ch == 'q') { patch_update_resp = s->file_diff_nr; break; - } else if (!s->s.auto_advance && s->answer.buf[0] == '>') { + } else if (!s->s.cfg.auto_advance && s->answer.buf[0] == '>') { if (permitted & ALLOW_GOTO_NEXT_FILE) { if (patch_update_resp == s->file_diff_nr - 1) patch_update_resp = 0; @@ -1690,7 +1809,7 @@ static size_t patch_update_file(struct add_p_state *s, size_t idx) err(s, _("No next file")); continue; } - } else if (!s->s.auto_advance && s->answer.buf[0] == '<') { + } else if (!s->s.cfg.auto_advance && s->answer.buf[0] == '<') { if (permitted & ALLOW_GOTO_PREVIOUS_FILE) { if (patch_update_resp == 0) patch_update_resp = s->file_diff_nr - 1; @@ -1813,7 +1932,7 @@ static size_t patch_update_file(struct add_p_state *s, size_t idx) err(s, _("Sorry, cannot split this hunk")); } else if (!split_hunk(s, file_diff, hunk - file_diff->hunk)) { - color_fprintf_ln(stdout, s->s.header_color, + color_fprintf_ln(stdout, s->s.cfg.header_color, _("Split into %d hunks."), (int)splittable_into); rendered_hunk_index = -1; @@ -1831,7 +1950,7 @@ static size_t patch_update_file(struct add_p_state *s, size_t idx) } else if (s->answer.buf[0] == '?') { const char *p = _(help_patch_remainder), *eol = p; - color_fprintf(stdout, s->s.help_color, "%s", + color_fprintf(stdout, s->s.cfg.help_color, "%s", _(s->mode->help_patch_text)); /* @@ -1855,13 +1974,13 @@ static size_t patch_update_file(struct add_p_state *s, size_t idx) if (file_diff->hunk[i].use == SKIP_HUNK) skipped += 1; } - color_fprintf_ln(stdout, s->s.help_color, _(p), + color_fprintf_ln(stdout, s->s.cfg.help_color, _(p), total, used, skipped); } if (*p != '?' && !strchr(s->buf.buf, *p)) continue; - color_fprintf_ln(stdout, s->s.help_color, + color_fprintf_ln(stdout, s->s.cfg.help_color, "%.*s", (int)(eol - p), p); } } else { @@ -1870,7 +1989,7 @@ static size_t patch_update_file(struct add_p_state *s, size_t idx) } } - if (s->s.auto_advance) + if (s->s.cfg.auto_advance) apply_patch(s, file_diff); putchar('\n'); @@ -1878,7 +1997,7 @@ static size_t patch_update_file(struct add_p_state *s, size_t idx) } int run_add_p(struct repository *r, enum add_p_mode mode, - struct add_p_opt *o, const char *revision, + struct interactive_options *opts, const char *revision, const struct pathspec *ps) { struct add_p_state s = { @@ -1886,7 +2005,7 @@ int run_add_p(struct repository *r, enum add_p_mode mode, }; size_t i, binary_count = 0; - init_add_i_state(&s.s, r, o); + init_add_i_state(&s.s, r, opts); if (mode == ADD_P_STASH) s.mode = &patch_mode_stash; @@ -1932,7 +2051,7 @@ int run_add_p(struct repository *r, enum add_p_mode mode, if ((i = patch_update_file(&s, i)) == s.file_diff_nr) break; } - if (!s.s.auto_advance) + if (!s.s.cfg.auto_advance) for (i = 0; i < s.file_diff_nr; i++) apply_patch(&s, s.file_diff + i); diff --git a/add-patch.h b/add-patch.h index 88b00ca788722a..e6868c60a2f343 100644 --- a/add-patch.h +++ b/add-patch.h @@ -1,16 +1,48 @@ #ifndef ADD_PATCH_H #define ADD_PATCH_H +#include "color.h" + struct pathspec; struct repository; -struct add_p_opt { +struct interactive_options { int context; int interhunkcontext; int auto_advance; }; -#define ADD_P_OPT_INIT { .context = -1, .interhunkcontext = -1, .auto_advance = 1 } +#define INTERACTIVE_OPTIONS_INIT { \ + .context = -1, \ + .interhunkcontext = -1, \ + .auto_advance = 1, \ +} + +struct interactive_config { + enum git_colorbool use_color_interactive; + enum git_colorbool use_color_diff; + char header_color[COLOR_MAXLEN]; + char help_color[COLOR_MAXLEN]; + char prompt_color[COLOR_MAXLEN]; + char error_color[COLOR_MAXLEN]; + char reset_color_interactive[COLOR_MAXLEN]; + + char fraginfo_color[COLOR_MAXLEN]; + char context_color[COLOR_MAXLEN]; + char file_old_color[COLOR_MAXLEN]; + char file_new_color[COLOR_MAXLEN]; + char reset_color_diff[COLOR_MAXLEN]; + + int use_single_key; + char *interactive_diff_filter, *interactive_diff_algorithm; + int context, interhunkcontext; + int auto_advance; +}; + +void interactive_config_init(struct interactive_config *cfg, + struct repository *r, + struct interactive_options *opts); +void interactive_config_clear(struct interactive_config *cfg); enum add_p_mode { ADD_P_ADD, @@ -21,7 +53,7 @@ enum add_p_mode { }; int run_add_p(struct repository *r, enum add_p_mode mode, - struct add_p_opt *o, const char *revision, + struct interactive_options *opts, const char *revision, const struct pathspec *ps); #endif diff --git a/builtin/add.c b/builtin/add.c index 4357f87b7f807b..84f9bcb789fc7b 100644 --- a/builtin/add.c +++ b/builtin/add.c @@ -31,7 +31,7 @@ static const char * const builtin_add_usage[] = { NULL }; static int patch_interactive, add_interactive, edit_interactive; -static struct add_p_opt add_p_opt = ADD_P_OPT_INIT; +static struct interactive_options interactive_opts = INTERACTIVE_OPTIONS_INIT; static int take_worktree_changes; static int add_renormalize; static int pathspec_file_nul; @@ -160,7 +160,7 @@ static int refresh(struct repository *repo, int verbose, const struct pathspec * int interactive_add(struct repository *repo, const char **argv, const char *prefix, - int patch, struct add_p_opt *add_p_opt) + int patch, struct interactive_options *interactive_opts) { struct pathspec pathspec; int ret; @@ -172,9 +172,9 @@ int interactive_add(struct repository *repo, prefix, argv); if (patch) - ret = !!run_add_p(repo, ADD_P_ADD, add_p_opt, NULL, &pathspec); + ret = !!run_add_p(repo, ADD_P_ADD, interactive_opts, NULL, &pathspec); else - ret = !!run_add_i(repo, &pathspec, add_p_opt); + ret = !!run_add_i(repo, &pathspec, interactive_opts); clear_pathspec(&pathspec); return ret; @@ -256,10 +256,10 @@ static struct option builtin_add_options[] = { OPT_GROUP(""), OPT_BOOL('i', "interactive", &add_interactive, N_("interactive picking")), OPT_BOOL('p', "patch", &patch_interactive, N_("select hunks interactively")), - OPT_BOOL(0, "auto-advance", &add_p_opt.auto_advance, + OPT_BOOL(0, "auto-advance", &interactive_opts.auto_advance, N_("auto advance to the next file when selecting hunks interactively")), - OPT_DIFF_UNIFIED(&add_p_opt.context), - OPT_DIFF_INTERHUNK_CONTEXT(&add_p_opt.interhunkcontext), + OPT_DIFF_UNIFIED(&interactive_opts.context), + OPT_DIFF_INTERHUNK_CONTEXT(&interactive_opts.interhunkcontext), OPT_BOOL('e', "edit", &edit_interactive, N_("edit current diff and apply")), OPT__FORCE(&ignored_too, N_("allow adding otherwise ignored files"), 0), OPT_BOOL('u', "update", &take_worktree_changes, N_("update tracked files")), @@ -402,9 +402,9 @@ int cmd_add(int argc, prepare_repo_settings(repo); repo->settings.command_requires_full_index = 0; - if (add_p_opt.context < -1) + if (interactive_opts.context < -1) die(_("'%s' cannot be negative"), "--unified"); - if (add_p_opt.interhunkcontext < -1) + if (interactive_opts.interhunkcontext < -1) die(_("'%s' cannot be negative"), "--inter-hunk-context"); if (patch_interactive) @@ -414,13 +414,13 @@ int cmd_add(int argc, die(_("options '%s' and '%s' cannot be used together"), "--dry-run", "--interactive/--patch"); if (pathspec_from_file) die(_("options '%s' and '%s' cannot be used together"), "--pathspec-from-file", "--interactive/--patch"); - exit(interactive_add(repo, argv + 1, prefix, patch_interactive, &add_p_opt)); + exit(interactive_add(repo, argv + 1, prefix, patch_interactive, &interactive_opts)); } else { - if (add_p_opt.context != -1) + if (interactive_opts.context != -1) die(_("the option '%s' requires '%s'"), "--unified", "--interactive/--patch"); - if (add_p_opt.interhunkcontext != -1) + if (interactive_opts.interhunkcontext != -1) die(_("the option '%s' requires '%s'"), "--inter-hunk-context", "--interactive/--patch"); - if (!add_p_opt.auto_advance) + if (!interactive_opts.auto_advance) die(_("the option '%s' requires '%s'"), "--no-auto-advance", "--interactive/--patch"); } diff --git a/builtin/checkout.c b/builtin/checkout.c index eefefd5d0fe6d5..bebe18c1d9029a 100644 --- a/builtin/checkout.c +++ b/builtin/checkout.c @@ -532,7 +532,7 @@ static int checkout_paths(const struct checkout_opts *opts, if (opts->patch_mode) { enum add_p_mode patch_mode; - struct add_p_opt add_p_opt = { + struct interactive_options interactive_opts = { .context = opts->patch_context, .interhunkcontext = opts->patch_interhunk_context, .auto_advance = opts->auto_advance @@ -562,7 +562,7 @@ static int checkout_paths(const struct checkout_opts *opts, else BUG("either flag must have been set, worktree=%d, index=%d", opts->checkout_worktree, opts->checkout_index); - return !!run_add_p(the_repository, patch_mode, &add_p_opt, + return !!run_add_p(the_repository, patch_mode, &interactive_opts, rev, &opts->pathspec); } diff --git a/builtin/commit.c b/builtin/commit.c index 9e3a09d532bfce..a1d64dd699b000 100644 --- a/builtin/commit.c +++ b/builtin/commit.c @@ -123,7 +123,7 @@ static const char *edit_message, *use_message; static char *fixup_message, *fixup_commit, *squash_message; static const char *fixup_prefix; static int all, also, interactive, patch_interactive, only, amend, signoff; -static struct add_p_opt add_p_opt = ADD_P_OPT_INIT; +static struct interactive_options interactive_opts = INTERACTIVE_OPTIONS_INIT; static int edit_flag = -1; /* unspecified */ static int quiet, verbose, no_verify, allow_empty, dry_run, renew_authorship; static int config_commit_verbose = -1; /* unspecified */ @@ -357,9 +357,9 @@ static const char *prepare_index(const char **argv, const char *prefix, const char *ret; char *path = NULL; - if (add_p_opt.context < -1) + if (interactive_opts.context < -1) die(_("'%s' cannot be negative"), "--unified"); - if (add_p_opt.interhunkcontext < -1) + if (interactive_opts.interhunkcontext < -1) die(_("'%s' cannot be negative"), "--inter-hunk-context"); if (is_status) @@ -408,7 +408,7 @@ static const char *prepare_index(const char **argv, const char *prefix, old_index_env = xstrdup_or_null(getenv(INDEX_ENVIRONMENT)); setenv(INDEX_ENVIRONMENT, the_repository->index_file, 1); - if (interactive_add(the_repository, argv, prefix, patch_interactive, &add_p_opt) != 0) + if (interactive_add(the_repository, argv, prefix, patch_interactive, &interactive_opts) != 0) die(_("interactive add failed")); the_repository->index_file = old_repo_index_file; @@ -433,9 +433,9 @@ static const char *prepare_index(const char **argv, const char *prefix, ret = get_lock_file_path(&index_lock); goto out; } else { - if (add_p_opt.context != -1) + if (interactive_opts.context != -1) die(_("the option '%s' requires '%s'"), "--unified", "--interactive/--patch"); - if (add_p_opt.interhunkcontext != -1) + if (interactive_opts.interhunkcontext != -1) die(_("the option '%s' requires '%s'"), "--inter-hunk-context", "--interactive/--patch"); } @@ -1743,8 +1743,8 @@ int cmd_commit(int argc, OPT_BOOL('i', "include", &also, N_("add specified files to index for commit")), OPT_BOOL(0, "interactive", &interactive, N_("interactively add files")), OPT_BOOL('p', "patch", &patch_interactive, N_("interactively add changes")), - OPT_DIFF_UNIFIED(&add_p_opt.context), - OPT_DIFF_INTERHUNK_CONTEXT(&add_p_opt.interhunkcontext), + OPT_DIFF_UNIFIED(&interactive_opts.context), + OPT_DIFF_INTERHUNK_CONTEXT(&interactive_opts.interhunkcontext), OPT_BOOL('o', "only", &only, N_("commit only specified files")), OPT_BOOL('n', "no-verify", &no_verify, N_("bypass pre-commit and commit-msg hooks")), OPT_BOOL(0, "dry-run", &dry_run, N_("show what would be committed")), diff --git a/builtin/reset.c b/builtin/reset.c index 88f95f9fc7aa9e..4a74a82c0a3ae8 100644 --- a/builtin/reset.c +++ b/builtin/reset.c @@ -346,7 +346,7 @@ int cmd_reset(int argc, struct object_id oid; struct pathspec pathspec; int intent_to_add = 0; - struct add_p_opt add_p_opt = ADD_P_OPT_INIT; + struct interactive_options interactive_opts = INTERACTIVE_OPTIONS_INIT; const struct option options[] = { OPT__QUIET(&quiet, N_("be quiet, only report errors")), OPT_BOOL(0, "no-refresh", &no_refresh, @@ -371,10 +371,10 @@ int cmd_reset(int argc, PARSE_OPT_OPTARG, option_parse_recurse_submodules_worktree_updater), OPT_BOOL('p', "patch", &patch_mode, N_("select hunks interactively")), - OPT_BOOL(0, "auto-advance", &add_p_opt.auto_advance, + OPT_BOOL(0, "auto-advance", &interactive_opts.auto_advance, N_("auto advance to the next file when selecting hunks interactively")), - OPT_DIFF_UNIFIED(&add_p_opt.context), - OPT_DIFF_INTERHUNK_CONTEXT(&add_p_opt.interhunkcontext), + OPT_DIFF_UNIFIED(&interactive_opts.context), + OPT_DIFF_INTERHUNK_CONTEXT(&interactive_opts.interhunkcontext), OPT_BOOL('N', "intent-to-add", &intent_to_add, N_("record only the fact that removed paths will be added later")), OPT_PATHSPEC_FROM_FILE(&pathspec_from_file), @@ -425,9 +425,9 @@ int cmd_reset(int argc, oidcpy(&oid, &tree->object.oid); } - if (add_p_opt.context < -1) + if (interactive_opts.context < -1) die(_("'%s' cannot be negative"), "--unified"); - if (add_p_opt.interhunkcontext < -1) + if (interactive_opts.interhunkcontext < -1) die(_("'%s' cannot be negative"), "--inter-hunk-context"); prepare_repo_settings(the_repository); @@ -438,14 +438,14 @@ int cmd_reset(int argc, die(_("options '%s' and '%s' cannot be used together"), "--patch", "--{hard,mixed,soft}"); trace2_cmd_mode("patch-interactive"); update_ref_status = !!run_add_p(the_repository, ADD_P_RESET, - &add_p_opt, rev, &pathspec); + &interactive_opts, rev, &pathspec); goto cleanup; } else { - if (add_p_opt.context != -1) + if (interactive_opts.context != -1) die(_("the option '%s' requires '%s'"), "--unified", "--patch"); - if (add_p_opt.interhunkcontext != -1) + if (interactive_opts.interhunkcontext != -1) die(_("the option '%s' requires '%s'"), "--inter-hunk-context", "--patch"); - if (!add_p_opt.auto_advance) + if (!interactive_opts.auto_advance) die(_("the option '%s' requires '%s'"), "--no-auto-advance", "--patch"); } diff --git a/builtin/stash.c b/builtin/stash.c index e79d612e572e7c..c467c02c7fb2aa 100644 --- a/builtin/stash.c +++ b/builtin/stash.c @@ -1306,7 +1306,7 @@ static int stash_staged(struct stash_info *info, struct strbuf *out_patch, static int stash_patch(struct stash_info *info, const struct pathspec *ps, struct strbuf *out_patch, int quiet, - struct add_p_opt *add_p_opt) + struct interactive_options *interactive_opts) { int ret = 0; struct child_process cp_read_tree = CHILD_PROCESS_INIT; @@ -1331,7 +1331,7 @@ static int stash_patch(struct stash_info *info, const struct pathspec *ps, old_index_env = xstrdup_or_null(getenv(INDEX_ENVIRONMENT)); setenv(INDEX_ENVIRONMENT, the_repository->index_file, 1); - ret = !!run_add_p(the_repository, ADD_P_STASH, add_p_opt, NULL, ps); + ret = !!run_add_p(the_repository, ADD_P_STASH, interactive_opts, NULL, ps); the_repository->index_file = old_repo_index_file; if (old_index_env && *old_index_env) @@ -1427,7 +1427,8 @@ static int stash_working_tree(struct stash_info *info, const struct pathspec *ps } static int do_create_stash(const struct pathspec *ps, struct strbuf *stash_msg_buf, - int include_untracked, int patch_mode, struct add_p_opt *add_p_opt, + int include_untracked, int patch_mode, + struct interactive_options *interactive_opts, int only_staged, struct stash_info *info, struct strbuf *patch, int quiet) { @@ -1509,7 +1510,7 @@ static int do_create_stash(const struct pathspec *ps, struct strbuf *stash_msg_b untracked_commit_option = 1; } if (patch_mode) { - ret = stash_patch(info, ps, patch, quiet, add_p_opt); + ret = stash_patch(info, ps, patch, quiet, interactive_opts); if (ret < 0) { if (!quiet) fprintf_ln(stderr, _("Cannot save the current " @@ -1595,7 +1596,8 @@ static int create_stash(int argc, const char **argv, const char *prefix UNUSED, } static int do_push_stash(const struct pathspec *ps, const char *stash_msg, int quiet, - int keep_index, int patch_mode, struct add_p_opt *add_p_opt, + int keep_index, int patch_mode, + struct interactive_options *interactive_opts, int include_untracked, int only_staged) { int ret = 0; @@ -1667,7 +1669,7 @@ static int do_push_stash(const struct pathspec *ps, const char *stash_msg, int q if (stash_msg) strbuf_addstr(&stash_msg_buf, stash_msg); if (do_create_stash(ps, &stash_msg_buf, include_untracked, patch_mode, - add_p_opt, only_staged, &info, &patch, quiet)) { + interactive_opts, only_staged, &info, &patch, quiet)) { ret = -1; goto done; } @@ -1841,7 +1843,7 @@ static int push_stash(int argc, const char **argv, const char *prefix, const char *stash_msg = NULL; char *pathspec_from_file = NULL; struct pathspec ps; - struct add_p_opt add_p_opt = ADD_P_OPT_INIT; + struct interactive_options interactive_opts = INTERACTIVE_OPTIONS_INIT; struct option options[] = { OPT_BOOL('k', "keep-index", &keep_index, N_("keep index")), @@ -1849,10 +1851,10 @@ static int push_stash(int argc, const char **argv, const char *prefix, N_("stash staged changes only")), OPT_BOOL('p', "patch", &patch_mode, N_("stash in patch mode")), - OPT_BOOL(0, "auto-advance", &add_p_opt.auto_advance, + OPT_BOOL(0, "auto-advance", &interactive_opts.auto_advance, N_("auto advance to the next file when selecting hunks interactively")), - OPT_DIFF_UNIFIED(&add_p_opt.context), - OPT_DIFF_INTERHUNK_CONTEXT(&add_p_opt.interhunkcontext), + OPT_DIFF_UNIFIED(&interactive_opts.context), + OPT_DIFF_INTERHUNK_CONTEXT(&interactive_opts.interhunkcontext), OPT__QUIET(&quiet, N_("quiet mode")), OPT_BOOL('u', "include-untracked", &include_untracked, N_("include untracked files in stash")), @@ -1909,21 +1911,21 @@ static int push_stash(int argc, const char **argv, const char *prefix, } if (!patch_mode) { - if (add_p_opt.context != -1) + if (interactive_opts.context != -1) die(_("the option '%s' requires '%s'"), "--unified", "--patch"); - if (add_p_opt.interhunkcontext != -1) + if (interactive_opts.interhunkcontext != -1) die(_("the option '%s' requires '%s'"), "--inter-hunk-context", "--patch"); - if (!add_p_opt.auto_advance) + if (!interactive_opts.auto_advance) die(_("the option '%s' requires '%s'"), "--no-auto-advance", "--patch"); } - if (add_p_opt.context < -1) + if (interactive_opts.context < -1) die(_("'%s' cannot be negative"), "--unified"); - if (add_p_opt.interhunkcontext < -1) + if (interactive_opts.interhunkcontext < -1) die(_("'%s' cannot be negative"), "--inter-hunk-context"); ret = do_push_stash(&ps, stash_msg, quiet, keep_index, patch_mode, - &add_p_opt, include_untracked, only_staged); + &interactive_opts, include_untracked, only_staged); clear_pathspec(&ps); free(pathspec_from_file); @@ -1948,7 +1950,7 @@ static int save_stash(int argc, const char **argv, const char *prefix, const char *stash_msg = NULL; struct pathspec ps; struct strbuf stash_msg_buf = STRBUF_INIT; - struct add_p_opt add_p_opt = ADD_P_OPT_INIT; + struct interactive_options interactive_opts = INTERACTIVE_OPTIONS_INIT; struct option options[] = { OPT_BOOL('k', "keep-index", &keep_index, N_("keep index")), @@ -1956,10 +1958,10 @@ static int save_stash(int argc, const char **argv, const char *prefix, N_("stash staged changes only")), OPT_BOOL('p', "patch", &patch_mode, N_("stash in patch mode")), - OPT_BOOL(0, "auto-advance", &add_p_opt.auto_advance, + OPT_BOOL(0, "auto-advance", &interactive_opts.auto_advance, N_("auto advance to the next file when selecting hunks interactively")), - OPT_DIFF_UNIFIED(&add_p_opt.context), - OPT_DIFF_INTERHUNK_CONTEXT(&add_p_opt.interhunkcontext), + OPT_DIFF_UNIFIED(&interactive_opts.context), + OPT_DIFF_INTERHUNK_CONTEXT(&interactive_opts.interhunkcontext), OPT__QUIET(&quiet, N_("quiet mode")), OPT_BOOL('u', "include-untracked", &include_untracked, N_("include untracked files in stash")), @@ -1979,22 +1981,22 @@ static int save_stash(int argc, const char **argv, const char *prefix, memset(&ps, 0, sizeof(ps)); - if (add_p_opt.context < -1) + if (interactive_opts.context < -1) die(_("'%s' cannot be negative"), "--unified"); - if (add_p_opt.interhunkcontext < -1) + if (interactive_opts.interhunkcontext < -1) die(_("'%s' cannot be negative"), "--inter-hunk-context"); if (!patch_mode) { - if (add_p_opt.context != -1) + if (interactive_opts.context != -1) die(_("the option '%s' requires '%s'"), "--unified", "--patch"); - if (add_p_opt.interhunkcontext != -1) + if (interactive_opts.interhunkcontext != -1) die(_("the option '%s' requires '%s'"), "--inter-hunk-context", "--patch"); - if (!add_p_opt.auto_advance) + if (!interactive_opts.auto_advance) die(_("the option '%s' requires '%s'"), "--no-auto-advance", "--patch"); } ret = do_push_stash(&ps, stash_msg, quiet, keep_index, - patch_mode, &add_p_opt, include_untracked, + patch_mode, &interactive_opts, include_untracked, only_staged); strbuf_release(&stash_msg_buf); diff --git a/commit.h b/commit.h index f2f39e1a89ef62..94181de9fda903 100644 --- a/commit.h +++ b/commit.h @@ -287,7 +287,7 @@ int for_each_commit_graft(each_commit_graft_fn, void *); int interactive_add(struct repository *repo, const char **argv, const char *prefix, - int patch, struct add_p_opt *add_p_opt); + int patch, struct interactive_options *opts); struct commit_extra_header { struct commit_extra_header *next; From d51b61f5dab9c8e715fa792f31d572bc96fb5687 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Mon, 2 Mar 2026 13:13:07 +0100 Subject: [PATCH 04/33] add-patch: remove dependency on "add-interactive" subsystem With the preceding commit we have split out interactive configuration that is used by both "git add -p" and "git add -i". But we still initialize that configuration in the "add -p" subsystem by calling `init_add_i_state()`, even though we only do so to initialize the interactive configuration as well as a repository pointer. Stop doing so and instead store and initialize the interactive configuration in `struct add_p_state` directly. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- add-patch.c | 88 ++++++++++++++++++++++++++++------------------------- 1 file changed, 46 insertions(+), 42 deletions(-) diff --git a/add-patch.c b/add-patch.c index 756143eb846dba..4f089c82d0385d 100644 --- a/add-patch.c +++ b/add-patch.c @@ -2,7 +2,6 @@ #define DISABLE_SIGN_COMPARE_WARNINGS #include "git-compat-util.h" -#include "add-interactive.h" #include "add-patch.h" #include "advice.h" #include "config.h" @@ -263,7 +262,8 @@ struct hunk { }; struct add_p_state { - struct add_i_state s; + struct repository *r; + struct interactive_config cfg; struct strbuf answer, buf; /* parsed diff */ @@ -409,7 +409,7 @@ static void add_p_state_clear(struct add_p_state *s) for (i = 0; i < s->file_diff_nr; i++) free(s->file_diff[i].hunk); free(s->file_diff); - clear_add_i_state(&s->s); + interactive_config_clear(&s->cfg); } __attribute__((format (printf, 2, 3))) @@ -418,9 +418,9 @@ static void err(struct add_p_state *s, const char *fmt, ...) va_list args; va_start(args, fmt); - fputs(s->s.cfg.error_color, stdout); + fputs(s->cfg.error_color, stdout); vprintf(fmt, args); - puts(s->s.cfg.reset_color_interactive); + puts(s->cfg.reset_color_interactive); va_end(args); } @@ -438,7 +438,7 @@ static void setup_child_process(struct add_p_state *s, cp->git_cmd = 1; strvec_pushf(&cp->env, - INDEX_ENVIRONMENT "=%s", s->s.r->index_file); + INDEX_ENVIRONMENT "=%s", s->r->index_file); } static int parse_range(const char **p, @@ -543,12 +543,12 @@ static int parse_diff(struct add_p_state *s, const struct pathspec *ps) int res; strvec_pushv(&args, s->mode->diff_cmd); - if (s->s.cfg.context != -1) - strvec_pushf(&args, "--unified=%i", s->s.cfg.context); - if (s->s.cfg.interhunkcontext != -1) - strvec_pushf(&args, "--inter-hunk-context=%i", s->s.cfg.interhunkcontext); - if (s->s.cfg.interactive_diff_algorithm) - strvec_pushf(&args, "--diff-algorithm=%s", s->s.cfg.interactive_diff_algorithm); + if (s->cfg.context != -1) + strvec_pushf(&args, "--unified=%i", s->cfg.context); + if (s->cfg.interhunkcontext != -1) + strvec_pushf(&args, "--inter-hunk-context=%i", s->cfg.interhunkcontext); + if (s->cfg.interactive_diff_algorithm) + strvec_pushf(&args, "--diff-algorithm=%s", s->cfg.interactive_diff_algorithm); if (s->revision) { struct object_id oid; strvec_push(&args, @@ -577,9 +577,9 @@ static int parse_diff(struct add_p_state *s, const struct pathspec *ps) } strbuf_complete_line(plain); - if (want_color_fd(1, s->s.cfg.use_color_diff)) { + if (want_color_fd(1, s->cfg.use_color_diff)) { struct child_process colored_cp = CHILD_PROCESS_INIT; - const char *diff_filter = s->s.cfg.interactive_diff_filter; + const char *diff_filter = s->cfg.interactive_diff_filter; setup_child_process(s, &colored_cp, NULL); xsnprintf((char *)args.v[color_arg_index], 8, "--color"); @@ -812,7 +812,7 @@ static void render_hunk(struct add_p_state *s, struct hunk *hunk, hunk->colored_end - hunk->colored_start); return; } else { - strbuf_addstr(out, s->s.cfg.fraginfo_color); + strbuf_addstr(out, s->cfg.fraginfo_color); p = s->colored.buf + header->colored_extra_start; len = header->colored_extra_end - header->colored_extra_start; @@ -834,7 +834,7 @@ static void render_hunk(struct add_p_state *s, struct hunk *hunk, if (len) strbuf_add(out, p, len); else if (colored) - strbuf_addf(out, "%s\n", s->s.cfg.reset_color_diff); + strbuf_addf(out, "%s\n", s->cfg.reset_color_diff); else strbuf_addch(out, '\n'); } @@ -1223,12 +1223,12 @@ static void recolor_hunk(struct add_p_state *s, struct hunk *hunk) strbuf_addstr(&s->colored, plain[current] == '-' ? - s->s.cfg.file_old_color : + s->cfg.file_old_color : plain[current] == '+' ? - s->s.cfg.file_new_color : - s->s.cfg.context_color); + s->cfg.file_new_color : + s->cfg.context_color); strbuf_add(&s->colored, plain + current, eol - current); - strbuf_addstr(&s->colored, s->s.cfg.reset_color_diff); + strbuf_addstr(&s->colored, s->cfg.reset_color_diff); if (next > eol) strbuf_add(&s->colored, plain + eol, next - eol); current = next; @@ -1357,7 +1357,7 @@ static int run_apply_check(struct add_p_state *s, static int read_single_character(struct add_p_state *s) { - if (s->s.cfg.use_single_key) { + if (s->cfg.use_single_key) { int res = read_key_without_echo(&s->answer); printf("%s\n", res == EOF ? "" : s->answer.buf); return res; @@ -1371,7 +1371,7 @@ static int read_single_character(struct add_p_state *s) static int prompt_yesno(struct add_p_state *s, const char *prompt) { for (;;) { - color_fprintf(stdout, s->s.cfg.prompt_color, "%s", _(prompt)); + color_fprintf(stdout, s->cfg.prompt_color, "%s", _(prompt)); fflush(stdout); if (read_single_character(s) == EOF) return -1; @@ -1559,7 +1559,7 @@ static void apply_patch(struct add_p_state *s, struct file_diff *file_diff) strbuf_reset(&s->buf); reassemble_patch(s, file_diff, 0, &s->buf); - discard_index(s->s.r->index); + discard_index(s->r->index); if (s->mode->apply_for_checkout) apply_for_checkout(s, &s->buf, s->mode->is_reverse); @@ -1570,9 +1570,9 @@ static void apply_patch(struct add_p_state *s, struct file_diff *file_diff) NULL, 0, NULL, 0)) error(_("'git apply' failed")); } - if (repo_read_index(s->s.r) >= 0) - repo_refresh_and_write_index(s->s.r, REFRESH_QUIET, 0, - 1, NULL, NULL, NULL); + if (repo_read_index(s->r) >= 0) + repo_refresh_and_write_index(s->r, REFRESH_QUIET, 0, + 1, NULL, NULL, NULL); } } @@ -1660,7 +1660,7 @@ static size_t patch_update_file(struct add_p_state *s, size_t idx) /* Everything decided? */ if (undecided_previous < 0 && undecided_next < 0 && hunk->use != UNDECIDED_HUNK) { - if (!s->s.cfg.auto_advance) + if (!s->cfg.auto_advance) all_decided = 1; else { patch_update_resp++; @@ -1714,11 +1714,11 @@ static size_t patch_update_file(struct add_p_state *s, size_t idx) permitted |= ALLOW_EDIT; strbuf_addstr(&s->buf, ",e"); } - if (!s->s.cfg.auto_advance && s->file_diff_nr > 1) { + if (!s->cfg.auto_advance && s->file_diff_nr > 1) { permitted |= ALLOW_GOTO_NEXT_FILE; strbuf_addstr(&s->buf, ",>"); } - if (!s->s.cfg.auto_advance && s->file_diff_nr > 1) { + if (!s->cfg.auto_advance && s->file_diff_nr > 1) { permitted |= ALLOW_GOTO_PREVIOUS_FILE; strbuf_addstr(&s->buf, ",<"); } @@ -1733,7 +1733,7 @@ static size_t patch_update_file(struct add_p_state *s, size_t idx) else prompt_mode_type = PROMPT_HUNK; - printf("%s(%"PRIuMAX"/%"PRIuMAX") ", s->s.cfg.prompt_color, + printf("%s(%"PRIuMAX"/%"PRIuMAX") ", s->cfg.prompt_color, (uintmax_t)hunk_index + 1, (uintmax_t)(file_diff->hunk_nr ? file_diff->hunk_nr @@ -1746,8 +1746,8 @@ static size_t patch_update_file(struct add_p_state *s, size_t idx) } printf(_(s->mode->prompt_mode[prompt_mode_type]), hunk_use_decision, s->buf.buf); - if (*s->s.cfg.reset_color_interactive) - fputs(s->s.cfg.reset_color_interactive, stdout); + if (*s->cfg.reset_color_interactive) + fputs(s->cfg.reset_color_interactive, stdout); fflush(stdout); if (read_single_character(s) == EOF) { patch_update_resp = s->file_diff_nr; @@ -1798,7 +1798,7 @@ static size_t patch_update_file(struct add_p_state *s, size_t idx) } else if (ch == 'q') { patch_update_resp = s->file_diff_nr; break; - } else if (!s->s.cfg.auto_advance && s->answer.buf[0] == '>') { + } else if (!s->cfg.auto_advance && s->answer.buf[0] == '>') { if (permitted & ALLOW_GOTO_NEXT_FILE) { if (patch_update_resp == s->file_diff_nr - 1) patch_update_resp = 0; @@ -1809,7 +1809,7 @@ static size_t patch_update_file(struct add_p_state *s, size_t idx) err(s, _("No next file")); continue; } - } else if (!s->s.cfg.auto_advance && s->answer.buf[0] == '<') { + } else if (!s->cfg.auto_advance && s->answer.buf[0] == '<') { if (permitted & ALLOW_GOTO_PREVIOUS_FILE) { if (patch_update_resp == 0) patch_update_resp = s->file_diff_nr - 1; @@ -1932,7 +1932,7 @@ static size_t patch_update_file(struct add_p_state *s, size_t idx) err(s, _("Sorry, cannot split this hunk")); } else if (!split_hunk(s, file_diff, hunk - file_diff->hunk)) { - color_fprintf_ln(stdout, s->s.cfg.header_color, + color_fprintf_ln(stdout, s->cfg.header_color, _("Split into %d hunks."), (int)splittable_into); rendered_hunk_index = -1; @@ -1950,7 +1950,7 @@ static size_t patch_update_file(struct add_p_state *s, size_t idx) } else if (s->answer.buf[0] == '?') { const char *p = _(help_patch_remainder), *eol = p; - color_fprintf(stdout, s->s.cfg.help_color, "%s", + color_fprintf(stdout, s->cfg.help_color, "%s", _(s->mode->help_patch_text)); /* @@ -1974,13 +1974,13 @@ static size_t patch_update_file(struct add_p_state *s, size_t idx) if (file_diff->hunk[i].use == SKIP_HUNK) skipped += 1; } - color_fprintf_ln(stdout, s->s.cfg.help_color, _(p), + color_fprintf_ln(stdout, s->cfg.help_color, _(p), total, used, skipped); } if (*p != '?' && !strchr(s->buf.buf, *p)) continue; - color_fprintf_ln(stdout, s->s.cfg.help_color, + color_fprintf_ln(stdout, s->cfg.help_color, "%.*s", (int)(eol - p), p); } } else { @@ -1989,7 +1989,7 @@ static size_t patch_update_file(struct add_p_state *s, size_t idx) } } - if (s->s.cfg.auto_advance) + if (s->cfg.auto_advance) apply_patch(s, file_diff); putchar('\n'); @@ -2001,11 +2001,15 @@ int run_add_p(struct repository *r, enum add_p_mode mode, const struct pathspec *ps) { struct add_p_state s = { - { r }, STRBUF_INIT, STRBUF_INIT, STRBUF_INIT, STRBUF_INIT + .r = r, + .answer = STRBUF_INIT, + .buf = STRBUF_INIT, + .plain = STRBUF_INIT, + .colored = STRBUF_INIT, }; size_t i, binary_count = 0; - init_add_i_state(&s.s, r, opts); + interactive_config_init(&s.cfg, r, opts); if (mode == ADD_P_STASH) s.mode = &patch_mode_stash; @@ -2051,7 +2055,7 @@ int run_add_p(struct repository *r, enum add_p_mode mode, if ((i = patch_update_file(&s, i)) == s.file_diff_nr) break; } - if (!s.s.cfg.auto_advance) + if (!s.cfg.auto_advance) for (i = 0; i < s.file_diff_nr; i++) apply_patch(&s, s.file_diff + i); From 0c5583a57d618db191afceeff54e8cafbee89f41 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Mon, 2 Mar 2026 13:13:08 +0100 Subject: [PATCH 05/33] add-patch: add support for in-memory index patching With `run_add_p()` callers have the ability to apply changes from a specific revision to a repository's index. This infra supports several different modes, like for example applying changes to the index, working tree or both. One feature that is missing though is the ability to apply changes to an in-memory index different from the repository's index. Add a new function `run_add_p_index()` to plug this gap. This new function will be used in a subsequent commit. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- add-patch.c | 149 +++++++++++++++++++++++++++++++++++++++++++--------- add-patch.h | 8 +++ 2 files changed, 132 insertions(+), 25 deletions(-) diff --git a/add-patch.c b/add-patch.c index 4f089c82d0385d..b4dc7d2293930c 100644 --- a/add-patch.c +++ b/add-patch.c @@ -4,11 +4,13 @@ #include "git-compat-util.h" #include "add-patch.h" #include "advice.h" +#include "commit.h" #include "config.h" #include "diff.h" #include "editor.h" #include "environment.h" #include "gettext.h" +#include "hex.h" #include "object-name.h" #include "pager.h" #include "read-cache-ll.h" @@ -263,6 +265,8 @@ struct hunk { struct add_p_state { struct repository *r; + struct index_state *index; + const char *index_file; struct interactive_config cfg; struct strbuf answer, buf; @@ -438,7 +442,7 @@ static void setup_child_process(struct add_p_state *s, cp->git_cmd = 1; strvec_pushf(&cp->env, - INDEX_ENVIRONMENT "=%s", s->r->index_file); + INDEX_ENVIRONMENT "=%s", s->index_file); } static int parse_range(const char **p, @@ -1559,7 +1563,7 @@ static void apply_patch(struct add_p_state *s, struct file_diff *file_diff) strbuf_reset(&s->buf); reassemble_patch(s, file_diff, 0, &s->buf); - discard_index(s->r->index); + discard_index(s->index); if (s->mode->apply_for_checkout) apply_for_checkout(s, &s->buf, s->mode->is_reverse); @@ -1570,9 +1574,11 @@ static void apply_patch(struct add_p_state *s, struct file_diff *file_diff) NULL, 0, NULL, 0)) error(_("'git apply' failed")); } - if (repo_read_index(s->r) >= 0) + if (read_index_from(s->index, s->index_file, s->r->gitdir) >= 0 && + s->index == s->r->index) { repo_refresh_and_write_index(s->r, REFRESH_QUIET, 0, 1, NULL, NULL, NULL); + } } } @@ -1996,18 +2002,51 @@ static size_t patch_update_file(struct add_p_state *s, size_t idx) return patch_update_resp; } +static int run_add_p_common(struct add_p_state *state, + const struct pathspec *ps) +{ + size_t binary_count = 0; + size_t i; + + if (parse_diff(state, ps) < 0) + return -1; + + for (i = 0; i < state->file_diff_nr;) { + if (state->file_diff[i].binary && !state->file_diff[i].hunk_nr) { + binary_count++; + i++; + continue; + } + if ((i = patch_update_file(state, i)) == state->file_diff_nr) + break; + } + + if (!state->cfg.auto_advance) + for (i = 0; i < state->file_diff_nr; i++) + apply_patch(state, state->file_diff + i); + + if (state->file_diff_nr == 0) + err(state, _("No changes.")); + else if (binary_count == state->file_diff_nr) + err(state, _("Only binary files changed.")); + + return 0; +} + int run_add_p(struct repository *r, enum add_p_mode mode, struct interactive_options *opts, const char *revision, const struct pathspec *ps) { struct add_p_state s = { .r = r, + .index = r->index, + .index_file = r->index_file, .answer = STRBUF_INIT, .buf = STRBUF_INIT, .plain = STRBUF_INIT, .colored = STRBUF_INIT, }; - size_t i, binary_count = 0; + int ret; interactive_config_init(&s.cfg, r, opts); @@ -2040,30 +2079,90 @@ int run_add_p(struct repository *r, enum add_p_mode mode, if (repo_read_index(r) < 0 || (!s.mode->index_only && repo_refresh_and_write_index(r, REFRESH_QUIET, 0, 1, - NULL, NULL, NULL) < 0) || - parse_diff(&s, ps) < 0) { - add_p_state_clear(&s); - return -1; + NULL, NULL, NULL) < 0)) { + ret = -1; + goto out; } - for (i = 0; i < s.file_diff_nr;) { - if (s.file_diff[i].binary && !s.file_diff[i].hunk_nr) { - binary_count++; - i++; - continue; - } - if ((i = patch_update_file(&s, i)) == s.file_diff_nr) - break; - } - if (!s.cfg.auto_advance) - for (i = 0; i < s.file_diff_nr; i++) - apply_patch(&s, s.file_diff + i); + ret = run_add_p_common(&s, ps); + if (ret < 0) + goto out; - if (s.file_diff_nr == 0) - err(&s, _("No changes.")); - else if (binary_count == s.file_diff_nr) - err(&s, _("Only binary files changed.")); + ret = 0; +out: add_p_state_clear(&s); - return 0; + return ret; +} + +int run_add_p_index(struct repository *r, + struct index_state *index, + const char *index_file, + struct interactive_options *opts, + const char *revision, + const struct pathspec *ps) +{ + struct patch_mode mode = { + .apply_args = { "--cached", NULL }, + .apply_check_args = { "--cached", NULL }, + .prompt_mode = { + N_("Stage mode change [y,n,q,a,d%s,?]? "), + N_("Stage deletion [y,n,q,a,d%s,?]? "), + N_("Stage addition [y,n,q,a,d%s,?]? "), + N_("Stage this hunk [y,n,q,a,d%s,?]? ") + }, + .edit_hunk_hint = N_("If the patch applies cleanly, the edited hunk " + "will immediately be marked for staging."), + .help_patch_text = + N_("y - stage this hunk\n" + "n - do not stage this hunk\n" + "q - quit; do not stage this hunk or any of the remaining " + "ones\n" + "a - stage this hunk and all later hunks in the file\n" + "d - do not stage this hunk or any of the later hunks in " + "the file\n"), + .index_only = 1, + }; + struct add_p_state s = { + .r = r, + .index = index, + .index_file = index_file, + .answer = STRBUF_INIT, + .buf = STRBUF_INIT, + .plain = STRBUF_INIT, + .colored = STRBUF_INIT, + .mode = &mode, + .revision = revision, + }; + char parent_tree_oid[GIT_MAX_HEXSZ + 1]; + struct commit *commit; + int ret; + + interactive_config_init(&s.cfg, r, opts); + + commit = lookup_commit_reference_by_name(revision); + if (!commit) { + err(&s, _("Revision does not refer to a commit")); + ret = -1; + goto out; + } + + if (commit->parents) + oid_to_hex_r(parent_tree_oid, get_commit_tree_oid(commit->parents->item)); + else + oid_to_hex_r(parent_tree_oid, r->hash_algo->empty_tree); + + mode.diff_cmd[0] = "diff-tree"; + mode.diff_cmd[1] = "-r"; + mode.diff_cmd[2] = parent_tree_oid; + + ret = run_add_p_common(&s, ps); + if (ret < 0) + goto out; + + ret = 0; + +out: + add_p_state_clear(&s); + return ret; } diff --git a/add-patch.h b/add-patch.h index e6868c60a2f343..cf2a31a40fb918 100644 --- a/add-patch.h +++ b/add-patch.h @@ -3,6 +3,7 @@ #include "color.h" +struct index_state; struct pathspec; struct repository; @@ -56,4 +57,11 @@ int run_add_p(struct repository *r, enum add_p_mode mode, struct interactive_options *opts, const char *revision, const struct pathspec *ps); +int run_add_p_index(struct repository *r, + struct index_state *index, + const char *index_file, + struct interactive_options *opts, + const char *revision, + const struct pathspec *ps); + #endif From 48f6d9232834be661f0d1dc4f187b324124ccbe0 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Mon, 2 Mar 2026 13:13:09 +0100 Subject: [PATCH 06/33] add-patch: allow disabling editing of hunks The "add-patch" mode allows the user to edit hunks to apply custom changes. This is incompatible with a new `git history split` command that we're about to introduce in a subsequent commit, so we need a way to disable this mode. Add a new flag to disable editing hunks. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- add-interactive.c | 2 +- add-patch.c | 22 ++++++++++++++-------- add-patch.h | 11 +++++++++-- builtin/add.c | 2 +- builtin/checkout.c | 2 +- builtin/reset.c | 2 +- builtin/stash.c | 2 +- 7 files changed, 28 insertions(+), 15 deletions(-) diff --git a/add-interactive.c b/add-interactive.c index 152e2a02979e45..3cf8a1dbf85e3f 100644 --- a/add-interactive.c +++ b/add-interactive.c @@ -927,7 +927,7 @@ static int run_patch(struct add_i_state *s, const struct pathspec *ps, parse_pathspec(&ps_selected, PATHSPEC_ALL_MAGIC & ~PATHSPEC_LITERAL, PATHSPEC_LITERAL_PATH, "", args.v); - res = run_add_p(s->r, ADD_P_ADD, &opts, NULL, &ps_selected); + res = run_add_p(s->r, ADD_P_ADD, &opts, NULL, &ps_selected, 0); strvec_clear(&args); clear_pathspec(&ps_selected); } diff --git a/add-patch.c b/add-patch.c index b4dc7d2293930c..4e28e5c18786e2 100644 --- a/add-patch.c +++ b/add-patch.c @@ -1604,7 +1604,9 @@ static bool get_first_undecided(const struct file_diff *file_diff, size_t *idx) return false; } -static size_t patch_update_file(struct add_p_state *s, size_t idx) +static size_t patch_update_file(struct add_p_state *s, + size_t idx, + unsigned flags) { size_t hunk_index = 0; ssize_t i, undecided_previous, undecided_next, rendered_hunk_index = -1; @@ -1715,7 +1717,8 @@ static size_t patch_update_file(struct add_p_state *s, size_t idx) permitted |= ALLOW_SPLIT; strbuf_addstr(&s->buf, ",s"); } - if (hunk_index + 1 > file_diff->mode_change && + if (!(flags & ADD_P_DISALLOW_EDIT) && + hunk_index + 1 > file_diff->mode_change && !file_diff->deleted) { permitted |= ALLOW_EDIT; strbuf_addstr(&s->buf, ",e"); @@ -2003,7 +2006,8 @@ static size_t patch_update_file(struct add_p_state *s, size_t idx) } static int run_add_p_common(struct add_p_state *state, - const struct pathspec *ps) + const struct pathspec *ps, + unsigned flags) { size_t binary_count = 0; size_t i; @@ -2017,7 +2021,7 @@ static int run_add_p_common(struct add_p_state *state, i++; continue; } - if ((i = patch_update_file(state, i)) == state->file_diff_nr) + if ((i = patch_update_file(state, i, flags)) == state->file_diff_nr) break; } @@ -2035,7 +2039,8 @@ static int run_add_p_common(struct add_p_state *state, int run_add_p(struct repository *r, enum add_p_mode mode, struct interactive_options *opts, const char *revision, - const struct pathspec *ps) + const struct pathspec *ps, + unsigned flags) { struct add_p_state s = { .r = r, @@ -2084,7 +2089,7 @@ int run_add_p(struct repository *r, enum add_p_mode mode, goto out; } - ret = run_add_p_common(&s, ps); + ret = run_add_p_common(&s, ps, flags); if (ret < 0) goto out; @@ -2100,7 +2105,8 @@ int run_add_p_index(struct repository *r, const char *index_file, struct interactive_options *opts, const char *revision, - const struct pathspec *ps) + const struct pathspec *ps, + unsigned flags) { struct patch_mode mode = { .apply_args = { "--cached", NULL }, @@ -2156,7 +2162,7 @@ int run_add_p_index(struct repository *r, mode.diff_cmd[1] = "-r"; mode.diff_cmd[2] = parent_tree_oid; - ret = run_add_p_common(&s, ps); + ret = run_add_p_common(&s, ps, flags); if (ret < 0) goto out; diff --git a/add-patch.h b/add-patch.h index cf2a31a40fb918..fb6d975b68cc4d 100644 --- a/add-patch.h +++ b/add-patch.h @@ -53,15 +53,22 @@ enum add_p_mode { ADD_P_WORKTREE, }; +enum add_p_flags { + /* Disallow "editing" hunks. */ + ADD_P_DISALLOW_EDIT = (1 << 0), +}; + int run_add_p(struct repository *r, enum add_p_mode mode, struct interactive_options *opts, const char *revision, - const struct pathspec *ps); + const struct pathspec *ps, + unsigned flags); int run_add_p_index(struct repository *r, struct index_state *index, const char *index_file, struct interactive_options *opts, const char *revision, - const struct pathspec *ps); + const struct pathspec *ps, + unsigned flags); #endif diff --git a/builtin/add.c b/builtin/add.c index 84f9bcb789fc7b..eeab779328e42f 100644 --- a/builtin/add.c +++ b/builtin/add.c @@ -172,7 +172,7 @@ int interactive_add(struct repository *repo, prefix, argv); if (patch) - ret = !!run_add_p(repo, ADD_P_ADD, interactive_opts, NULL, &pathspec); + ret = !!run_add_p(repo, ADD_P_ADD, interactive_opts, NULL, &pathspec, 0); else ret = !!run_add_i(repo, &pathspec, interactive_opts); diff --git a/builtin/checkout.c b/builtin/checkout.c index bebe18c1d9029a..a8863277f225c4 100644 --- a/builtin/checkout.c +++ b/builtin/checkout.c @@ -563,7 +563,7 @@ static int checkout_paths(const struct checkout_opts *opts, BUG("either flag must have been set, worktree=%d, index=%d", opts->checkout_worktree, opts->checkout_index); return !!run_add_p(the_repository, patch_mode, &interactive_opts, - rev, &opts->pathspec); + rev, &opts->pathspec, 0); } repo_hold_locked_index(the_repository, &lock_file, LOCK_DIE_ON_ERROR); diff --git a/builtin/reset.c b/builtin/reset.c index 4a74a82c0a3ae8..3590be57a5f03c 100644 --- a/builtin/reset.c +++ b/builtin/reset.c @@ -438,7 +438,7 @@ int cmd_reset(int argc, die(_("options '%s' and '%s' cannot be used together"), "--patch", "--{hard,mixed,soft}"); trace2_cmd_mode("patch-interactive"); update_ref_status = !!run_add_p(the_repository, ADD_P_RESET, - &interactive_opts, rev, &pathspec); + &interactive_opts, rev, &pathspec, 0); goto cleanup; } else { if (interactive_opts.context != -1) diff --git a/builtin/stash.c b/builtin/stash.c index c467c02c7fb2aa..7c68a1d7f9a85b 100644 --- a/builtin/stash.c +++ b/builtin/stash.c @@ -1331,7 +1331,7 @@ static int stash_patch(struct stash_info *info, const struct pathspec *ps, old_index_env = xstrdup_or_null(getenv(INDEX_ENVIRONMENT)); setenv(INDEX_ENVIRONMENT, the_repository->index_file, 1); - ret = !!run_add_p(the_repository, ADD_P_STASH, interactive_opts, NULL, ps); + ret = !!run_add_p(the_repository, ADD_P_STASH, interactive_opts, NULL, ps, 0); the_repository->index_file = old_repo_index_file; if (old_index_env && *old_index_env) From a021e4f92cbacdb028b3efa49f619b076e72c9a6 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Mon, 2 Mar 2026 13:13:10 +0100 Subject: [PATCH 07/33] cache-tree: allow writing in-memory index as tree The function `write_in_core_index_as_tree()` takes a repository and writes its index into a tree object. What this function cannot do though is to take an _arbitrary_ in-memory index. Introduce a new `struct index_state` parameter so that the caller can pass a different index than the one belonging to the repository. This will be used in a subsequent commit. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- builtin/checkout.c | 3 ++- cache-tree.c | 4 ++-- cache-tree.h | 3 ++- 3 files changed, 6 insertions(+), 4 deletions(-) diff --git a/builtin/checkout.c b/builtin/checkout.c index a8863277f225c4..f8b3a7b08ce120 100644 --- a/builtin/checkout.c +++ b/builtin/checkout.c @@ -891,7 +891,8 @@ static int merge_working_tree(const struct checkout_opts *opts, 0); init_ui_merge_options(&o, the_repository); o.verbosity = 0; - work = write_in_core_index_as_tree(the_repository); + work = write_in_core_index_as_tree(the_repository, + the_repository->index); ret = reset_tree(new_tree, opts, 1, diff --git a/cache-tree.c b/cache-tree.c index 16c3a36b48267b..60bcc07c3b8357 100644 --- a/cache-tree.c +++ b/cache-tree.c @@ -723,11 +723,11 @@ static int write_index_as_tree_internal(struct object_id *oid, return 0; } -struct tree* write_in_core_index_as_tree(struct repository *repo) { +struct tree *write_in_core_index_as_tree(struct repository *repo, + struct index_state *index_state) { struct object_id o; int was_valid, ret; - struct index_state *index_state = repo->index; was_valid = index_state->cache_tree && cache_tree_fully_valid(index_state->cache_tree); diff --git a/cache-tree.h b/cache-tree.h index b82c4963e7c805..f8bddae5235f54 100644 --- a/cache-tree.h +++ b/cache-tree.h @@ -47,7 +47,8 @@ int cache_tree_verify(struct repository *, struct index_state *); #define WRITE_TREE_UNMERGED_INDEX (-2) #define WRITE_TREE_PREFIX_ERROR (-3) -struct tree* write_in_core_index_as_tree(struct repository *repo); +struct tree *write_in_core_index_as_tree(struct repository *repo, + struct index_state *index_state); int write_index_as_tree(struct object_id *oid, struct index_state *index_state, const char *index_path, int flags, const char *prefix); void prime_cache_tree(struct repository *, struct index_state *, struct tree *); From 98f839425db94eb8d4c1f773e923ba1cff622d66 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Mon, 2 Mar 2026 13:13:11 +0100 Subject: [PATCH 08/33] builtin/history: split out extended function to create commits In the next commit we're about to introduce a new command that splits up a commit into two. Most of the logic will be shared with rewording commits, except that we also need to have control over the parents and the old/new trees. Extract a new function `commit_tree_with_edited_message_ext()` to prepare for this commit. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- builtin/history.c | 67 ++++++++++++++++++++++++++++------------------- 1 file changed, 40 insertions(+), 27 deletions(-) diff --git a/builtin/history.c b/builtin/history.c index 1cf6c668cfd814..80726ce14b5779 100644 --- a/builtin/history.c +++ b/builtin/history.c @@ -83,10 +83,13 @@ static int fill_commit_message(struct repository *repo, return 0; } -static int commit_tree_with_edited_message(struct repository *repo, - const char *action, - struct commit *original, - struct commit **out) +static int commit_tree_with_edited_message_ext(struct repository *repo, + const char *action, + struct commit *commit_with_message, + const struct commit_list *parents, + const struct object_id *old_tree, + const struct object_id *new_tree, + struct commit **out) { const char *exclude_gpgsig[] = { /* We reencode the message, so the encoding needs to be stripped. */ @@ -100,44 +103,27 @@ static int commit_tree_with_edited_message(struct repository *repo, struct commit_extra_header *original_extra_headers = NULL; struct strbuf commit_message = STRBUF_INIT; struct object_id rewritten_commit_oid; - struct object_id original_tree_oid; - struct object_id parent_tree_oid; char *original_author = NULL; - struct commit *parent; size_t len; int ret; - original_tree_oid = repo_get_commit_tree(repo, original)->object.oid; - - parent = original->parents ? original->parents->item : NULL; - if (parent) { - if (repo_parse_commit(repo, parent)) { - ret = error(_("unable to parse parent commit %s"), - oid_to_hex(&parent->object.oid)); - goto out; - } - - parent_tree_oid = repo_get_commit_tree(repo, parent)->object.oid; - } else { - oidcpy(&parent_tree_oid, repo->hash_algo->empty_tree); - } - /* We retain authorship of the original commit. */ - original_message = repo_logmsg_reencode(repo, original, NULL, NULL); + original_message = repo_logmsg_reencode(repo, commit_with_message, NULL, NULL); ptr = find_commit_header(original_message, "author", &len); if (ptr) original_author = xmemdupz(ptr, len); find_commit_subject(original_message, &original_body); - ret = fill_commit_message(repo, &parent_tree_oid, &original_tree_oid, + ret = fill_commit_message(repo, old_tree, new_tree, original_body, action, &commit_message); if (ret < 0) goto out; - original_extra_headers = read_commit_extra_headers(original, exclude_gpgsig); + original_extra_headers = read_commit_extra_headers(commit_with_message, + exclude_gpgsig); - ret = commit_tree_extended(commit_message.buf, commit_message.len, &original_tree_oid, - original->parents, &rewritten_commit_oid, original_author, + ret = commit_tree_extended(commit_message.buf, commit_message.len, new_tree, + parents, &rewritten_commit_oid, original_author, NULL, NULL, original_extra_headers); if (ret < 0) goto out; @@ -151,6 +137,33 @@ static int commit_tree_with_edited_message(struct repository *repo, return ret; } +static int commit_tree_with_edited_message(struct repository *repo, + const char *action, + struct commit *original, + struct commit **out) +{ + struct object_id parent_tree_oid; + const struct object_id *tree_oid; + struct commit *parent; + + tree_oid = &repo_get_commit_tree(repo, original)->object.oid; + + parent = original->parents ? original->parents->item : NULL; + if (parent) { + if (repo_parse_commit(repo, parent)) { + return error(_("unable to parse parent commit %s"), + oid_to_hex(&parent->object.oid)); + } + + parent_tree_oid = repo_get_commit_tree(repo, parent)->object.oid; + } else { + oidcpy(&parent_tree_oid, repo->hash_algo->empty_tree); + } + + return commit_tree_with_edited_message_ext(repo, action, original, original->parents, + &parent_tree_oid, tree_oid, out); +} + enum ref_action { REF_ACTION_DEFAULT, REF_ACTION_BRANCHES, From d563ecec2845467880f5742e178a9723afef495a Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Mon, 2 Mar 2026 13:13:12 +0100 Subject: [PATCH 09/33] builtin/history: implement "split" subcommand It is quite a common use case that one wants to split up one commit into multiple commits by moving parts of the changes of the original commit out into a separate commit. This is quite an involved operation though: 1. Identify the commit in question that is to be dropped. 2. Perform an interactive rebase on top of that commit's parent. 3. Modify the instruction sheet to "edit" the commit that is to be split up. 4. Drop the commit via "git reset HEAD~". 5. Stage changes that should go into the first commit and commit it. 6. Stage changes that should go into the second commit and commit it. 7. Finalize the rebase. This is quite complex, and overall I would claim that most people who are not experts in Git would struggle with this flow. Introduce a new "split" subcommand for git-history(1) to make this way easier. All the user needs to do is to say `git history split $COMMIT`. From hereon, Git asks the user which parts of the commit shall be moved out into a separate commit and, once done, asks the user for the commit message. Git then creates that split-out commit and applies the original commit on top of it. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- Documentation/git-history.adoc | 62 +++ builtin/history.c | 250 +++++++++++ t/meson.build | 1 + t/t3452-history-split.sh | 757 +++++++++++++++++++++++++++++++++ 4 files changed, 1070 insertions(+) create mode 100755 t/t3452-history-split.sh diff --git a/Documentation/git-history.adoc b/Documentation/git-history.adoc index cc019de69764d2..24dc907033b469 100644 --- a/Documentation/git-history.adoc +++ b/Documentation/git-history.adoc @@ -9,6 +9,7 @@ SYNOPSIS -------- [synopsis] git history reword [--dry-run] [--update-refs=(branches|head)] +git history split [--dry-run] [--update-refs=(branches|head)] [--] [...] DESCRIPTION ----------- @@ -57,6 +58,26 @@ The following commands are available to rewrite history in different ways: details of this commit remain unchanged. This command will spawn an editor with the current message of that commit. +`split [--] [...]`:: + Interactively split up into two commits by choosing + hunks introduced by it that will be moved into the new split-out + commit. These hunks will then be written into a new commit that + becomes the parent of the previous commit. The original commit + stays intact, except that its parent will be the newly split-out + commit. ++ +The commit messages of the split-up commits will be asked for by launching +the configured editor. Authorship of the commit will be the same as for the +original commit. ++ +If passed, __ can be used to limit which changes shall be split out +of the original commit. Files not matching any of the pathspecs will remain +part of the original commit. For more details, see the 'pathspec' entry in +linkgit:gitglossary[7]. ++ +It is invalid to select either all or no hunks, as that would lead to +one of the commits becoming empty. + OPTIONS ------- @@ -72,6 +93,47 @@ OPTIONS descendants of the original commit will be rewritten. With `head`, only the current `HEAD` reference will be rewritten. Defaults to `branches`. +EXAMPLES +-------- + +Split a commit +~~~~~~~~~~~~~~ + +---------- +$ git log --stat --oneline +3f81232 (HEAD -> main) original + bar | 1 + + foo | 1 + + 2 files changed, 2 insertions(+) + +$ git history split HEAD +diff --git a/bar b/bar +new file mode 100644 +index 0000000..5716ca5 +--- /dev/null ++++ b/bar +@@ -0,0 +1 @@ ++bar +(1/1) Stage addition [y,n,q,a,d,p,?]? y + +diff --git a/foo b/foo +new file mode 100644 +index 0000000..257cc56 +--- /dev/null ++++ b/foo +@@ -0,0 +1 @@ ++foo +(1/1) Stage addition [y,n,q,a,d,p,?]? n + +$ git log --stat --oneline +7cebe64 (HEAD -> main) original + foo | 1 + + 1 file changed, 1 insertion(+) +d1582f3 split-out commit + bar | 1 + + 1 file changed, 1 insertion(+) +---------- + GIT --- Part of the linkgit:git[1] suite diff --git a/builtin/history.c b/builtin/history.c index 80726ce14b5779..dfb9d3f1802733 100644 --- a/builtin/history.c +++ b/builtin/history.c @@ -1,6 +1,7 @@ #define USE_THE_REPOSITORY_VARIABLE #include "builtin.h" +#include "cache-tree.h" #include "commit.h" #include "commit-reach.h" #include "config.h" @@ -8,17 +9,24 @@ #include "environment.h" #include "gettext.h" #include "hex.h" +#include "lockfile.h" +#include "oidmap.h" #include "parse-options.h" +#include "path.h" +#include "read-cache.h" #include "refs.h" #include "replay.h" #include "revision.h" #include "sequencer.h" #include "strvec.h" #include "tree.h" +#include "unpack-trees.h" #include "wt-status.h" #define GIT_HISTORY_REWORD_USAGE \ N_("git history reword [--dry-run] [--update-refs=(branches|head)]") +#define GIT_HISTORY_SPLIT_USAGE \ + N_("git history split [--dry-run] [--update-refs=(branches|head)] [--] [...]") static void change_data_free(void *util, const char *str UNUSED) { @@ -484,6 +492,246 @@ static int cmd_history_reword(int argc, return ret; } +static int write_ondisk_index(struct repository *repo, + struct object_id *oid, + const char *path) +{ + struct unpack_trees_options opts = { 0 }; + struct lock_file lock = LOCK_INIT; + struct tree_desc tree_desc; + struct index_state index; + struct tree *tree; + int ret; + + index_state_init(&index, repo); + + opts.head_idx = -1; + opts.src_index = &index; + opts.dst_index = &index; + + tree = repo_parse_tree_indirect(repo, oid); + init_tree_desc(&tree_desc, &tree->object.oid, tree->buffer, tree->size); + + if (unpack_trees(1, &tree_desc, &opts)) { + ret = error(_("unable to populate index with tree")); + goto out; + } + + prime_cache_tree(repo, &index, tree); + + if (hold_lock_file_for_update(&lock, path, 0) < 0) { + ret = error_errno(_("unable to acquire index lock")); + goto out; + } + + if (write_locked_index(&index, &lock, COMMIT_LOCK)) { + ret = error(_("unable to write new index file")); + goto out; + } + + ret = 0; + +out: + rollback_lock_file(&lock); + release_index(&index); + return ret; +} + +static int split_commit(struct repository *repo, + struct commit *original, + struct pathspec *pathspec, + struct commit **out) +{ + struct interactive_options interactive_opts = INTERACTIVE_OPTIONS_INIT; + struct strbuf index_file = STRBUF_INIT; + struct index_state index = INDEX_STATE_INIT(repo); + const struct object_id *original_commit_tree_oid; + const struct object_id *old_tree_oid, *new_tree_oid; + struct object_id parent_tree_oid; + char original_commit_oid[GIT_MAX_HEXSZ + 1]; + struct commit *first_commit, *second_commit; + struct commit_list *parents = NULL; + struct tree *split_tree; + int ret; + + if (original->parents) { + if (repo_parse_commit(repo, original->parents->item)) { + ret = error(_("unable to parse parent commit %s"), + oid_to_hex(&original->parents->item->object.oid)); + goto out; + } + + parent_tree_oid = *get_commit_tree_oid(original->parents->item); + } else { + oidcpy(&parent_tree_oid, repo->hash_algo->empty_tree); + } + original_commit_tree_oid = get_commit_tree_oid(original); + + /* + * Construct the first commit. This is done by taking the original + * commit parent's tree and selectively patching changes from the diff + * between that parent and its child. + */ + repo_git_path_replace(repo, &index_file, "%s", "history-split.index"); + + ret = write_ondisk_index(repo, &parent_tree_oid, index_file.buf); + if (ret < 0) + goto out; + + ret = read_index_from(&index, index_file.buf, repo->gitdir); + if (ret < 0) { + ret = error(_("failed reading temporary index")); + goto out; + } + + oid_to_hex_r(original_commit_oid, &original->object.oid); + ret = run_add_p_index(repo, &index, index_file.buf, &interactive_opts, + original_commit_oid, pathspec, ADD_P_DISALLOW_EDIT); + if (ret < 0) + goto out; + + split_tree = write_in_core_index_as_tree(repo, &index); + if (!split_tree) { + ret = error(_("failed split tree")); + goto out; + } + + unlink(index_file.buf); + strbuf_release(&index_file); + + /* + * We disallow the cases where either the split-out commit or the + * original commit would become empty. Consequently, if we see that the + * new tree ID matches either of those trees we abort. + */ + if (oideq(&split_tree->object.oid, &parent_tree_oid)) { + ret = error(_("split commit is empty")); + goto out; + } else if (oideq(&split_tree->object.oid, original_commit_tree_oid)) { + ret = error(_("split commit tree matches original commit")); + goto out; + } + + /* + * The first commit is constructed from the split-out tree. The base + * that shall be diffed against is the parent of the original commit. + */ + ret = commit_tree_with_edited_message_ext(repo, "split-out", original, + original->parents, &parent_tree_oid, + &split_tree->object.oid, &first_commit); + if (ret < 0) { + ret = error(_("failed writing first commit")); + goto out; + } + + /* + * The second commit is constructed from the original tree. The base to + * diff against and the parent in this case is the first split-out + * commit. + */ + commit_list_append(first_commit, &parents); + + old_tree_oid = &repo_get_commit_tree(repo, first_commit)->object.oid; + new_tree_oid = &repo_get_commit_tree(repo, original)->object.oid; + + ret = commit_tree_with_edited_message_ext(repo, "split-out", original, + parents, old_tree_oid, + new_tree_oid, &second_commit); + if (ret < 0) { + ret = error(_("failed writing second commit")); + goto out; + } + + *out = second_commit; + ret = 0; + +out: + if (index_file.len) + unlink(index_file.buf); + strbuf_release(&index_file); + free_commit_list(parents); + release_index(&index); + return ret; +} + +static int cmd_history_split(int argc, + const char **argv, + const char *prefix, + struct repository *repo) +{ + const char * const usage[] = { + GIT_HISTORY_SPLIT_USAGE, + NULL, + }; + enum ref_action action = REF_ACTION_DEFAULT; + int dry_run = 0; + struct option options[] = { + OPT_CALLBACK_F(0, "update-refs", &action, N_(""), + N_("control ref update behavior (branches|head|print)"), + PARSE_OPT_NONEG, parse_ref_action), + OPT_BOOL('n', "dry-run", &dry_run, + N_("perform a dry-run without updating any refs")), + OPT_END(), + }; + struct commit *original, *rewritten = NULL; + struct strbuf reflog_msg = STRBUF_INIT; + struct pathspec pathspec = { 0 }; + struct rev_info revs = { 0 }; + int ret; + + argc = parse_options(argc, argv, prefix, options, usage, 0); + if (argc < 1) { + ret = error(_("command expects a committish")); + goto out; + } + repo_config(repo, git_default_config, NULL); + + if (action == REF_ACTION_DEFAULT) + action = REF_ACTION_BRANCHES; + + parse_pathspec(&pathspec, 0, + PATHSPEC_PREFER_FULL | + PATHSPEC_SYMLINK_LEADING_PATH | + PATHSPEC_PREFIX_ORIGIN, + prefix, argv + 1); + + original = lookup_commit_reference_by_name(argv[0]); + if (!original) { + ret = error(_("commit cannot be found: %s"), argv[0]); + goto out; + } + + ret = setup_revwalk(repo, action, original, &revs); + if (ret < 0) + goto out; + + if (original->parents && original->parents->next) { + ret = error(_("cannot split up merge commit")); + goto out; + } + + ret = split_commit(repo, original, &pathspec, &rewritten); + if (ret < 0) + goto out; + + strbuf_addf(&reflog_msg, "split: updating %s", argv[0]); + + ret = handle_reference_updates(&revs, action, original, rewritten, + reflog_msg.buf, dry_run); + if (ret < 0) { + ret = error(_("failed replaying descendants")); + goto out; + } + + ret = 0; + +out: + strbuf_release(&reflog_msg); + clear_pathspec(&pathspec); + release_revisions(&revs); + return ret; +} + int cmd_history(int argc, const char **argv, const char *prefix, @@ -491,11 +739,13 @@ int cmd_history(int argc, { const char * const usage[] = { GIT_HISTORY_REWORD_USAGE, + GIT_HISTORY_SPLIT_USAGE, NULL, }; parse_opt_subcommand_fn *fn = NULL; struct option options[] = { OPT_SUBCOMMAND("reword", &fn, cmd_history_reword), + OPT_SUBCOMMAND("split", &fn, cmd_history_split), OPT_END(), }; diff --git a/t/meson.build b/t/meson.build index 6d91470ebc1cf7..2d578ef58b5169 100644 --- a/t/meson.build +++ b/t/meson.build @@ -392,6 +392,7 @@ integration_tests = [ 't3438-rebase-broken-files.sh', 't3450-history.sh', 't3451-history-reword.sh', + 't3452-history-split.sh', 't3500-cherry.sh', 't3501-revert-cherry-pick.sh', 't3502-cherry-pick-merge.sh', diff --git a/t/t3452-history-split.sh b/t/t3452-history-split.sh new file mode 100755 index 00000000000000..8ed0cebb500296 --- /dev/null +++ b/t/t3452-history-split.sh @@ -0,0 +1,757 @@ +#!/bin/sh + +test_description='tests for git-history split subcommand' + +. ./test-lib.sh +. "$TEST_DIRECTORY/lib-log-graph.sh" + +# The fake editor takes multiple arguments, each of which represents a commit +# message. Subsequent invocations of the editor will then yield those messages +# in order. +# +set_fake_editor () { + printf "%s\n" "$@" >fake-input && + write_script fake-editor.sh <<-\EOF && + head -n1 fake-input >"$1" + sed 1d fake-input >fake-input.trimmed && + mv fake-input.trimmed fake-input + EOF + test_set_editor "$(pwd)"/fake-editor.sh +} + +expect_graph () { + cat >expect && + lib_test_cmp_graph --graph --format=%s "$@" +} + +expect_log () { + git log --format="%s" >actual && + cat >expect && + test_cmp expect actual +} + +expect_tree_entries () { + git ls-tree --name-only "$1" >actual && + cat >expect && + test_cmp expect actual +} + +test_expect_success 'refuses to work with merge commits' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + test_commit base && + git branch branch && + test_commit ours && + git switch branch && + test_commit theirs && + git switch - && + git merge theirs && + test_must_fail git history split HEAD 2>err && + test_grep "cannot split up merge commit" err && + test_must_fail git history split HEAD~ 2>err && + test_grep "replaying merge commits is not supported yet" err + ) +' + +test_expect_success 'errors on missing commit argument' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + test_commit initial && + test_must_fail git history split 2>err && + test_grep "command expects a committish" err + ) +' + +test_expect_success 'errors on unknown revision' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + test_commit initial && + test_must_fail git history split does-not-exist 2>err && + test_grep "commit cannot be found" err + ) +' + +test_expect_success '--dry-run does not modify any refs' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + test_commit base && + touch bar foo && + git add . && + git commit -m split-me && + + git refs list --include-root-refs >before && + + set_fake_editor "first" "second" && + git history split --dry-run HEAD <<-EOF && + y + n + EOF + + git refs list --include-root-refs >after && + test_cmp before after + ) +' + +test_expect_success 'can split up tip commit' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + test_commit initial && + touch bar foo && + git add . && + git commit -m split-me && + + git symbolic-ref HEAD >expect && + set_fake_editor "first" "second" && + git history split HEAD <<-EOF && + y + n + EOF + git symbolic-ref HEAD >actual && + test_cmp expect actual && + + expect_log <<-EOF && + second + first + initial + EOF + + expect_tree_entries HEAD~ <<-EOF && + bar + initial.t + EOF + + expect_tree_entries HEAD <<-EOF && + bar + foo + initial.t + EOF + + git reflog >reflog && + test_grep "split: updating HEAD" reflog + ) +' + +test_expect_success 'can split up root commit' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + touch bar foo && + git add . && + git commit -m root && + test_commit tip && + + set_fake_editor "first" "second" && + git history split HEAD~ <<-EOF && + y + n + EOF + + expect_log <<-EOF && + tip + second + first + EOF + + expect_tree_entries HEAD~2 <<-EOF && + bar + EOF + + expect_tree_entries HEAD~ <<-EOF && + bar + foo + EOF + + expect_tree_entries HEAD <<-EOF + bar + foo + tip.t + EOF + ) +' + +test_expect_success 'can split up in-between commit' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + test_commit initial && + touch bar foo && + git add . && + git commit -m split-me && + test_commit tip && + + set_fake_editor "first" "second" && + git history split HEAD~ <<-EOF && + y + n + EOF + + expect_log <<-EOF && + tip + second + first + initial + EOF + + expect_tree_entries HEAD~2 <<-EOF && + bar + initial.t + EOF + + expect_tree_entries HEAD~ <<-EOF && + bar + foo + initial.t + EOF + + expect_tree_entries HEAD <<-EOF + bar + foo + initial.t + tip.t + EOF + ) +' + +test_expect_success 'can split HEAD only' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + test_commit base && + touch a b && + git add . && + git commit -m split-me && + git branch unrelated && + + set_fake_editor "ours-a" "ours-b" && + git history split --update-refs=head HEAD <<-EOF && + y + n + EOF + expect_graph --branches <<-EOF + * ours-b + * ours-a + | * split-me + |/ + * base + EOF + ) +' + +test_expect_success 'can split detached HEAD' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + test_commit initial && + touch bar foo && + git add . && + git commit -m split-me && + git checkout --detach HEAD && + + set_fake_editor "first" "second" && + git history split --update-refs=head HEAD <<-EOF && + y + n + EOF + + # HEAD should be detached and updated. + test_must_fail git symbolic-ref HEAD && + + expect_log <<-EOF + second + first + initial + EOF + ) +' + +test_expect_success 'can split commit in unrelated branch' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + test_commit base && + git branch ours && + git switch --create theirs && + touch theirs-a theirs-b && + git add . && + git commit -m theirs && + git switch ours && + test_commit ours && + + # With --update-refs=head it is not possible to split up a + # commit that is unrelated to HEAD. + test_must_fail git history split --update-refs=head theirs 2>err && + test_grep "rewritten commit must be an ancestor of HEAD" err && + + set_fake_editor "theirs-rewritten-a" "theirs-rewritten-b" && + git history split theirs <<-EOF && + y + n + EOF + expect_graph --branches <<-EOF && + * ours + | * theirs-rewritten-b + | * theirs-rewritten-a + |/ + * base + EOF + + expect_tree_entries theirs~ <<-EOF && + base.t + theirs-a + EOF + + expect_tree_entries theirs <<-EOF + base.t + theirs-a + theirs-b + EOF + ) +' + +test_expect_success 'updates multiple descendant branches' ' + test_when_finished "rm -rf repo" && + git init repo --initial-branch=main && + ( + cd repo && + test_commit base && + touch file-a file-b && + git add . && + git commit -m split-me && + git branch branch && + test_commit on-main && + git switch branch && + test_commit on-branch && + git switch main && + + set_fake_editor "split-a" "split-b" && + git history split HEAD~ <<-EOF && + y + n + EOF + + # Both branches should now descend from the split commits. + expect_graph --branches <<-EOF + * on-branch + | * on-main + |/ + * split-b + * split-a + * base + EOF + ) +' + +test_expect_success 'can pick multiple hunks' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + touch bar baz foo qux && + git add . && + git commit -m split-me && + + set_fake_editor "first" "second" && + git history split HEAD <<-EOF && + y + n + y + n + EOF + + expect_tree_entries HEAD~ <<-EOF && + bar + foo + EOF + + expect_tree_entries HEAD <<-EOF + bar + baz + foo + qux + EOF + ) +' + +test_expect_success 'can use only last hunk' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + touch bar foo && + git add . && + git commit -m split-me && + + set_fake_editor "first" "second" && + git history split HEAD <<-EOF && + n + y + EOF + + expect_log <<-EOF && + second + first + EOF + + expect_tree_entries HEAD~ <<-EOF && + foo + EOF + + expect_tree_entries HEAD <<-EOF + bar + foo + EOF + ) +' + +test_expect_success 'can split commit with file deletions' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + echo a >a && + echo b >b && + echo c >c && + git add . && + git commit -m base && + git rm a b && + git commit -m delete-both && + + set_fake_editor "delete-a" "delete-b" && + git history split HEAD <<-EOF && + y + n + EOF + + expect_log <<-EOF && + delete-b + delete-a + base + EOF + + expect_tree_entries HEAD~ <<-EOF && + b + c + EOF + + expect_tree_entries HEAD <<-EOF + c + EOF + ) +' + +test_expect_success 'preserves original authorship' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + test_commit initial && + touch bar foo && + git add . && + GIT_AUTHOR_NAME="Other Author" \ + GIT_AUTHOR_EMAIL="other@example.com" \ + git commit -m split-me && + + set_fake_editor "first" "second" && + git history split HEAD <<-EOF && + y + n + EOF + + git log -1 --format="%an <%ae>" HEAD~ >actual && + echo "Other Author " >expect && + test_cmp expect actual && + + git log -1 --format="%an <%ae>" HEAD >actual && + test_cmp expect actual + ) +' + +test_expect_success 'aborts with empty commit message' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + touch bar foo && + git add . && + git commit -m split-me && + + set_fake_editor "" && + test_must_fail git history split HEAD <<-EOF 2>err && + y + n + EOF + test_grep "Aborting commit due to empty commit message." err + ) +' + +test_expect_success 'commit message editor sees split-out changes' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + touch bar foo && + git add . && + git commit -m split-me && + + write_script fake-editor.sh <<-\EOF && + cat "$1" >>MESSAGES && + echo "some commit message" >"$1" + EOF + test_set_editor "$(pwd)"/fake-editor.sh && + + git history split HEAD <<-EOF && + y + n + EOF + + # Note that we expect to see the messages twice, once for each + # of the commits. The committed files are different though. + cat >expect <<-EOF && + split-me + + # Please enter the commit message for the split-out changes. Lines starting + # with ${SQ}#${SQ} will be ignored, and an empty message aborts the commit. + # Changes to be committed: + # new file: bar + # + split-me + + # Please enter the commit message for the split-out changes. Lines starting + # with ${SQ}#${SQ} will be ignored, and an empty message aborts the commit. + # Changes to be committed: + # new file: foo + # + EOF + test_cmp expect MESSAGES && + + expect_log <<-EOF + some commit message + some commit message + EOF + ) +' + +test_expect_success 'can use pathspec to limit what gets split' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + touch bar foo && + git add . && + git commit -m split-me && + + set_fake_editor "first" "second" && + git history split HEAD -- foo <<-EOF && + y + EOF + + expect_tree_entries HEAD~ <<-EOF && + foo + EOF + + expect_tree_entries HEAD <<-EOF + bar + foo + EOF + ) +' + +test_expect_success 'pathspec matching no files produces empty split error' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + test_commit initial && + touch bar foo && + git add . && + git commit -m split-me && + + set_fake_editor "first" "second" && + test_must_fail git history split HEAD -- nonexistent 2>err && + test_grep "split commit is empty" err + ) +' + +test_expect_success 'split with multiple pathspecs' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + test_commit initial && + touch a b c d && + git add . && + git commit -m split-me && + + # Only a and c should be offered for splitting. + set_fake_editor "split-ac" "remainder" && + git history split HEAD -- a c <<-EOF && + y + y + EOF + + expect_tree_entries HEAD~ <<-EOF && + a + c + initial.t + EOF + + expect_tree_entries HEAD <<-EOF + a + b + c + d + initial.t + EOF + ) +' + +test_expect_success 'split with file mode change' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + echo content >script && + git add . && + git commit -m base && + test_chmod +x script && + echo change >script && + git commit -a -m "mode and content change" && + + set_fake_editor "mode-change" "content-change" && + git history split HEAD <<-EOF && + y + n + EOF + + expect_log <<-EOF + content-change + mode-change + base + EOF + ) +' + +test_expect_success 'refuses to create empty split-out commit' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + test_commit base && + touch bar foo && + git add . && + git commit -m split-me && + + test_must_fail git history split HEAD 2>err <<-EOF && + n + n + EOF + test_grep "split commit is empty" err + ) +' + +test_expect_success 'hooks are not executed for rewritten commits' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + touch bar foo && + git add . && + git commit -m split-me && + old_head=$(git rev-parse HEAD) && + + ORIG_PATH="$(pwd)" && + export ORIG_PATH && + for hook in prepare-commit-msg pre-commit post-commit post-rewrite commit-msg + do + write_script .git/hooks/$hook <<-\EOF || exit 1 + touch "$ORIG_PATH"/hooks.log + EOF + done && + + set_fake_editor "first" "second" && + git history split HEAD <<-EOF && + y + n + EOF + + expect_log <<-EOF && + second + first + EOF + + test_path_is_missing hooks.log + ) +' + +test_expect_success 'refuses to create empty original commit' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + touch bar foo && + git add . && + git commit -m split-me && + + test_must_fail git history split HEAD 2>err <<-EOF && + y + y + EOF + test_grep "split commit tree matches original commit" err + ) +' + +test_expect_success 'retains changes in the worktree and index' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + echo a >a && + echo b >b && + git add . && + git commit -m "initial commit" && + echo a-modified >a && + echo b-modified >b && + git add b && + set_fake_editor "a-only" "remainder" && + git history split HEAD <<-EOF && + y + n + EOF + + expect_tree_entries HEAD~ <<-EOF && + a + EOF + expect_tree_entries HEAD <<-EOF && + a + b + EOF + + cat >expect <<-\EOF && + M a + M b + ?? actual + ?? expect + ?? fake-editor.sh + ?? fake-input + EOF + git status --porcelain >actual && + test_cmp expect actual + ) +' + +test_done From 4f6a803aba9e97650af304b5c266cfbc25b11fcc Mon Sep 17 00:00:00 2001 From: Tian Yuchen Date: Tue, 10 Mar 2026 17:50:17 +0800 Subject: [PATCH 10/33] diff: document -U without as using default context The documentation for '-U' implies that the numeric value '' is mandatory. However, the command line parser has historically accepted '-U' without a number. Strictly requiring a number for '-U' would break existing tests (e.g., in 't4013') and likely disrupt user scripts relying on this undocumented behavior. Hence we retain this fallback behavior for backward compatibility, but document it as such. Signed-off-by: Tian Yuchen Signed-off-by: Junio C Hamano --- Documentation/diff-context-options.adoc | 6 ++++-- Documentation/diff-options.adoc | 6 ++++-- 2 files changed, 8 insertions(+), 4 deletions(-) diff --git a/Documentation/diff-context-options.adoc b/Documentation/diff-context-options.adoc index e161260358fff5..b9ace2aa4b3092 100644 --- a/Documentation/diff-context-options.adoc +++ b/Documentation/diff-context-options.adoc @@ -1,7 +1,9 @@ `-U`:: `--unified=`:: - Generate diffs with __ lines of context. Defaults to `diff.context` - or 3 if the config option is unset. + Generate diffs with __ lines of context. The number of context + lines defaults to `diff.context` or 3 if the configuration variable + is unset. (`-U` without `` is silently accepted as a synonym for + `-p` due to a historical accident). `--inter-hunk-context=`:: Show the context between diff hunks, up to the specified __ diff --git a/Documentation/diff-options.adoc b/Documentation/diff-options.adoc index 9cdad6f72a0c7d..867bef631280e3 100644 --- a/Documentation/diff-options.adoc +++ b/Documentation/diff-options.adoc @@ -127,8 +127,10 @@ endif::git-log[] `-U`:: `--unified=`:: - Generate diffs with __ lines of context instead of - the usual three. + Generate diffs with __ lines of context. The number of context + lines defaults to `diff.context` or 3 if the configuration variable + is unset. (`-U` without `` is silently accepted as a synonym for + `-p` due to a historical accident). ifndef::git-format-patch[] Implies `--patch`. endif::git-format-patch[] From 233545cc60315863fea3fa17e2df86de23a6a4f7 Mon Sep 17 00:00:00 2001 From: Justin Tobler Date: Thu, 12 Mar 2026 20:39:36 -0500 Subject: [PATCH 11/33] commit: remove unused forward declaration In 6206089cbd (commit: write commits for both hashes, 2023-10-01), `sign_with_header()` was removed, but its forward declaration in "commit.h" was left. Remove the unused declaration. Signed-off-by: Justin Tobler Signed-off-by: Junio C Hamano --- commit.h | 2 -- 1 file changed, 2 deletions(-) diff --git a/commit.h b/commit.h index f2f39e1a89ef62..e9dee8d26bc9ee 100644 --- a/commit.h +++ b/commit.h @@ -400,8 +400,6 @@ LAST_ARG_MUST_BE_NULL int run_commit_hook(int editor_is_used, const char *index_file, int *invoked_hook, const char *name, ...); -/* Sign a commit or tag buffer, storing the result in a header. */ -int sign_with_header(struct strbuf *buf, const char *keyid); /* Parse the signature out of a header. */ int parse_buffer_signed_by_header(const char *buffer, unsigned long size, From 86ebf870b909a7f4707aa2601d290bc992d21a53 Mon Sep 17 00:00:00 2001 From: Justin Tobler Date: Thu, 12 Mar 2026 20:39:37 -0500 Subject: [PATCH 12/33] gpg-interface: allow sign_buffer() to use default signing key The `sign_commit_to_strbuf()` helper in "commit.c" provides fallback logic to get the default configured signing key when a key is not provided and handles generating the commit signature accordingly. This signing operation is not really specific to commits as any arbitrary buffer can be signed. Also, in a subsequent commit, this same logic is reused by git-fast-import(1) when signing commits with invalid signatures. Remove the `sign_commit_to_strbuf()` helper from "commit.c" and extend `sign_buffer()` in "gpg-interface.c" to support using the default key as a fallback when the `SIGN_BUFFER_USE_DEFAULT_KEY` flag is provided. Call sites are updated accordingly. Signed-off-by: Justin Tobler Signed-off-by: Junio C Hamano --- builtin/tag.c | 4 ++-- commit.c | 19 +++++-------------- gpg-interface.c | 13 +++++++++++-- gpg-interface.h | 12 ++++++++++-- send-pack.c | 2 +- 5 files changed, 29 insertions(+), 21 deletions(-) diff --git a/builtin/tag.c b/builtin/tag.c index aeb04c487fe95a..540d783c67ded2 100644 --- a/builtin/tag.c +++ b/builtin/tag.c @@ -167,7 +167,7 @@ static int do_sign(struct strbuf *buffer, struct object_id **compat_oid, char *keyid = get_signing_key(); int ret = -1; - if (sign_buffer(buffer, &sig, keyid)) + if (sign_buffer(buffer, &sig, keyid, 0)) goto out; if (compat) { @@ -176,7 +176,7 @@ static int do_sign(struct strbuf *buffer, struct object_id **compat_oid, if (convert_object_file(the_repository ,&compat_buf, algo, compat, buffer->buf, buffer->len, OBJ_TAG, 1)) goto out; - if (sign_buffer(&compat_buf, &compat_sig, keyid)) + if (sign_buffer(&compat_buf, &compat_sig, keyid, 0)) goto out; add_header_signature(&compat_buf, &sig, algo); strbuf_addbuf(&compat_buf, &compat_sig); diff --git a/commit.c b/commit.c index 0ffdd6679eec80..80d8d078757dbc 100644 --- a/commit.c +++ b/commit.c @@ -1170,18 +1170,6 @@ int add_header_signature(struct strbuf *buf, struct strbuf *sig, const struct gi return 0; } -static int sign_commit_to_strbuf(struct strbuf *sig, struct strbuf *buf, const char *keyid) -{ - char *keyid_to_free = NULL; - int ret = 0; - if (!keyid || !*keyid) - keyid = keyid_to_free = get_signing_key(); - if (sign_buffer(buf, sig, keyid)) - ret = -1; - free(keyid_to_free); - return ret; -} - int parse_signed_commit(const struct commit *commit, struct strbuf *payload, struct strbuf *signature, const struct git_hash_algo *algop) @@ -1759,7 +1747,8 @@ int commit_tree_extended(const char *msg, size_t msg_len, oidcpy(&parent_buf[i++], &p->item->object.oid); write_commit_tree(&buffer, msg, msg_len, tree, parent_buf, nparents, author, committer, extra); - if (sign_commit && sign_commit_to_strbuf(&sig, &buffer, sign_commit)) { + if (sign_commit && sign_buffer(&buffer, &sig, sign_commit, + SIGN_BUFFER_USE_DEFAULT_KEY)) { result = -1; goto out; } @@ -1791,7 +1780,9 @@ int commit_tree_extended(const char *msg, size_t msg_len, free_commit_extra_headers(compat_extra); free(mapped_parents); - if (sign_commit && sign_commit_to_strbuf(&compat_sig, &compat_buffer, sign_commit)) { + if (sign_commit && sign_buffer(&compat_buffer, &compat_sig, + sign_commit, + SIGN_BUFFER_USE_DEFAULT_KEY)) { result = -1; goto out; } diff --git a/gpg-interface.c b/gpg-interface.c index 7e6a1520bd1535..27eacbb632628c 100644 --- a/gpg-interface.c +++ b/gpg-interface.c @@ -974,11 +974,20 @@ const char *gpg_trust_level_to_str(enum signature_trust_level level) return sigcheck_gpg_trust_level[level].display_key; } -int sign_buffer(struct strbuf *buffer, struct strbuf *signature, const char *signing_key) +int sign_buffer(struct strbuf *buffer, struct strbuf *signature, + const char *signing_key, enum sign_buffer_flags flags) { + char *keyid_to_free = NULL; + int ret = 0; + gpg_interface_lazy_init(); - return use_format->sign_buffer(buffer, signature, signing_key); + if ((flags & SIGN_BUFFER_USE_DEFAULT_KEY) && (!signing_key || !*signing_key)) + signing_key = keyid_to_free = get_signing_key(); + + ret = use_format->sign_buffer(buffer, signature, signing_key); + free(keyid_to_free); + return ret; } /* diff --git a/gpg-interface.h b/gpg-interface.h index 789d1ffac47850..37f3ac42dbb99a 100644 --- a/gpg-interface.h +++ b/gpg-interface.h @@ -74,6 +74,15 @@ int parse_signature(const char *buf, size_t size, struct strbuf *payload, struct */ size_t parse_signed_buffer(const char *buf, size_t size); +/* Flags for sign_buffer(). */ +enum sign_buffer_flags { + /* + * Use the default configured signing key as returned by `get_signing_key()` + * when the provided "signing_key" is NULL or empty. + */ + SIGN_BUFFER_USE_DEFAULT_KEY = (1 << 0), +}; + /* * Create a detached signature for the contents of "buffer" and append * it after "signature"; "buffer" and "signature" can be the same @@ -81,8 +90,7 @@ size_t parse_signed_buffer(const char *buf, size_t size); * at the end. Returns 0 on success, non-zero on failure. */ int sign_buffer(struct strbuf *buffer, struct strbuf *signature, - const char *signing_key); - + const char *signing_key, enum sign_buffer_flags flags); /* * Returns corresponding string in lowercase for a given member of diff --git a/send-pack.c b/send-pack.c index 67d6987b1ccd7e..07ecfae4de92a2 100644 --- a/send-pack.c +++ b/send-pack.c @@ -391,7 +391,7 @@ static int generate_push_cert(struct strbuf *req_buf, if (!update_seen) goto free_return; - if (sign_buffer(&cert, &cert, signing_key)) + if (sign_buffer(&cert, &cert, signing_key, 0)) die(_("failed to sign the push certificate")); packet_buf_write(req_buf, "push-cert%c%s", 0, cap_string); From ee66c793f84ef1c84ec3fe732bb26394ebefd257 Mon Sep 17 00:00:00 2001 From: Justin Tobler Date: Thu, 12 Mar 2026 20:39:38 -0500 Subject: [PATCH 13/33] fast-import: add mode to sign commits with invalid signatures With git-fast-import(1), handling of signed commits is controlled via the `--signed-commits=` option. When an invalid signature is encountered, a user may want the option to sign the commit again as opposed to just stripping the signature. To facilitate this, introduce a "sign-if-invalid" mode for the `--signed-commits` option. Optionally, a key ID may be explicitly provided in the form `sign-if-invalid[=]` to specify which signing key should be used when signing invalid commit signatures. Note that to properly support interoperability mode when signing commit signatures, the commit buffer must be created in both the repository and compatability object formats to generate the appropriate signatures accordingly. As currently implemented, the commit buffer for the compatability object format is not reconstructed and thus signing commits in interoperability mode is not yet supported. Support may be added in the future. Signed-off-by: Justin Tobler Signed-off-by: Junio C Hamano --- Documentation/git-fast-import.adoc | 4 + builtin/fast-export.c | 8 +- builtin/fast-import.c | 101 +++++++++++++++----- gpg-interface.c | 23 +++-- gpg-interface.h | 7 +- t/t9305-fast-import-signatures.sh | 144 ++++++++++++++++++----------- 6 files changed, 202 insertions(+), 85 deletions(-) diff --git a/Documentation/git-fast-import.adoc b/Documentation/git-fast-import.adoc index 479c4081da8f27..b3f42d46372a40 100644 --- a/Documentation/git-fast-import.adoc +++ b/Documentation/git-fast-import.adoc @@ -86,6 +86,10 @@ already trusted to run their own code. * `strip-if-invalid` will check signatures and, if they are invalid, will strip them and display a warning. The validation is performed in the same way as linkgit:git-verify-commit[1] does it. +* `sign-if-invalid[=]`, similar to `strip-if-invalid`, verifies + commit signatures and replaces invalid signatures with newly created ones. + Valid signatures are left unchanged. If `` is provided, that key is + used for signing; otherwise the configured default signing key is used. Options for Frontends ~~~~~~~~~~~~~~~~~~~~~ diff --git a/builtin/fast-export.c b/builtin/fast-export.c index 0c5d2386d81b92..13621b0d6a1552 100644 --- a/builtin/fast-export.c +++ b/builtin/fast-export.c @@ -64,7 +64,7 @@ static int parse_opt_sign_mode(const struct option *opt, if (unset) return 0; - if (parse_sign_mode(arg, val)) + if (parse_sign_mode(arg, val, NULL)) return error(_("unknown %s mode: %s"), opt->long_name, arg); return 0; @@ -825,6 +825,9 @@ static void handle_commit(struct commit *commit, struct rev_info *rev, case SIGN_STRIP_IF_INVALID: die(_("'strip-if-invalid' is not a valid mode for " "git fast-export with --signed-commits=")); + case SIGN_SIGN_IF_INVALID: + die(_("'sign-if-invalid' is not a valid mode for " + "git fast-export with --signed-commits=")); default: BUG("invalid signed_commit_mode value %d", signed_commit_mode); } @@ -970,6 +973,9 @@ static void handle_tag(const char *name, struct tag *tag) case SIGN_STRIP_IF_INVALID: die(_("'strip-if-invalid' is not a valid mode for " "git fast-export with --signed-tags=")); + case SIGN_SIGN_IF_INVALID: + die(_("'sign-if-invalid' is not a valid mode for " + "git fast-export with --signed-tags=")); default: BUG("invalid signed_commit_mode value %d", signed_commit_mode); } diff --git a/builtin/fast-import.c b/builtin/fast-import.c index b8a7757cfda943..935e688e33ac8a 100644 --- a/builtin/fast-import.c +++ b/builtin/fast-import.c @@ -190,6 +190,7 @@ static const char *global_prefix; static enum sign_mode signed_tag_mode = SIGN_VERBATIM; static enum sign_mode signed_commit_mode = SIGN_VERBATIM; +static const char *signed_commit_keyid; /* Memory pools */ static struct mem_pool fi_mem_pool = { @@ -2836,26 +2837,15 @@ static void finalize_commit_buffer(struct strbuf *new_data, strbuf_addbuf(new_data, msg); } -static void handle_strip_if_invalid(struct strbuf *new_data, - struct signature_data *sig_sha1, - struct signature_data *sig_sha256, - struct strbuf *msg) +static void warn_invalid_signature(struct signature_check *check, + const char *msg, enum sign_mode mode) { - struct strbuf tmp_buf = STRBUF_INIT; - struct signature_check signature_check = { 0 }; - int ret; - - /* Check signature in a temporary commit buffer */ - strbuf_addbuf(&tmp_buf, new_data); - finalize_commit_buffer(&tmp_buf, sig_sha1, sig_sha256, msg); - ret = verify_commit_buffer(tmp_buf.buf, tmp_buf.len, &signature_check); - - if (ret) { - const char *signer = signature_check.signer ? - signature_check.signer : _("unknown"); - const char *subject; - int subject_len = find_commit_subject(msg->buf, &subject); + const char *signer = check->signer ? check->signer : _("unknown"); + const char *subject; + int subject_len = find_commit_subject(msg, &subject); + switch (mode) { + case SIGN_STRIP_IF_INVALID: if (subject_len > 100) warning(_("stripping invalid signature for commit '%.100s...'\n" " allegedly by %s"), subject, signer); @@ -2865,6 +2855,67 @@ static void handle_strip_if_invalid(struct strbuf *new_data, else warning(_("stripping invalid signature for commit\n" " allegedly by %s"), signer); + break; + case SIGN_SIGN_IF_INVALID: + if (subject_len > 100) + warning(_("replacing invalid signature for commit '%.100s...'\n" + " allegedly by %s"), subject, signer); + else if (subject_len > 0) + warning(_("replacing invalid signature for commit '%.*s'\n" + " allegedly by %s"), subject_len, subject, signer); + else + warning(_("replacing invalid signature for commit\n" + " allegedly by %s"), signer); + break; + default: + BUG("unsupported signing mode"); + } +} + +static void handle_signature_if_invalid(struct strbuf *new_data, + struct signature_data *sig_sha1, + struct signature_data *sig_sha256, + struct strbuf *msg, + enum sign_mode mode) +{ + struct strbuf tmp_buf = STRBUF_INIT; + struct signature_check signature_check = { 0 }; + int ret; + + /* Check signature in a temporary commit buffer */ + strbuf_addbuf(&tmp_buf, new_data); + finalize_commit_buffer(&tmp_buf, sig_sha1, sig_sha256, msg); + ret = verify_commit_buffer(tmp_buf.buf, tmp_buf.len, &signature_check); + + if (ret) { + warn_invalid_signature(&signature_check, msg->buf, mode); + + if (mode == SIGN_SIGN_IF_INVALID) { + struct strbuf signature = STRBUF_INIT; + struct strbuf payload = STRBUF_INIT; + + /* + * NEEDSWORK: To properly support interoperability mode + * when signing commit signatures, the commit buffer + * must be provided in both the repository and + * compatibility object formats. As currently + * implemented, only the repository object format is + * considered meaning compatibility signatures cannot be + * generated. Thus, attempting to sign commit signatures + * in interoperability mode is currently unsupported. + */ + if (the_repository->compat_hash_algo) + die(_("signing commits in interoperability mode is unsupported")); + + strbuf_addstr(&payload, signature_check.payload); + if (sign_buffer(&payload, &signature, signed_commit_keyid, + SIGN_BUFFER_USE_DEFAULT_KEY)) + die(_("failed to sign commit object")); + add_header_signature(new_data, &signature, the_hash_algo); + + strbuf_release(&signature); + strbuf_release(&payload); + } finalize_commit_buffer(new_data, NULL, NULL, msg); } else { @@ -2927,6 +2978,7 @@ static void parse_new_commit(const char *arg) /* fallthru */ case SIGN_VERBATIM: case SIGN_STRIP_IF_INVALID: + case SIGN_SIGN_IF_INVALID: import_one_signature(&sig_sha1, &sig_sha256, v); break; @@ -3011,9 +3063,11 @@ static void parse_new_commit(const char *arg) "encoding %s\n", encoding); - if (signed_commit_mode == SIGN_STRIP_IF_INVALID && + if ((signed_commit_mode == SIGN_STRIP_IF_INVALID || + signed_commit_mode == SIGN_SIGN_IF_INVALID) && (sig_sha1.hash_algo || sig_sha256.hash_algo)) - handle_strip_if_invalid(&new_data, &sig_sha1, &sig_sha256, &msg); + handle_signature_if_invalid(&new_data, &sig_sha1, &sig_sha256, + &msg, signed_commit_mode); else finalize_commit_buffer(&new_data, &sig_sha1, &sig_sha256, &msg); @@ -3060,6 +3114,9 @@ static void handle_tag_signature(struct strbuf *msg, const char *name) case SIGN_STRIP_IF_INVALID: die(_("'strip-if-invalid' is not a valid mode for " "git fast-import with --signed-tags=")); + case SIGN_SIGN_IF_INVALID: + die(_("'sign-if-invalid' is not a valid mode for " + "git fast-import with --signed-tags=")); default: BUG("invalid signed_tag_mode value %d from tag '%s'", signed_tag_mode, name); @@ -3649,10 +3706,10 @@ static int parse_one_option(const char *option) } else if (skip_prefix(option, "export-pack-edges=", &option)) { option_export_pack_edges(option); } else if (skip_prefix(option, "signed-commits=", &option)) { - if (parse_sign_mode(option, &signed_commit_mode)) + if (parse_sign_mode(option, &signed_commit_mode, &signed_commit_keyid)) usagef(_("unknown --signed-commits mode '%s'"), option); } else if (skip_prefix(option, "signed-tags=", &option)) { - if (parse_sign_mode(option, &signed_tag_mode)) + if (parse_sign_mode(option, &signed_tag_mode, NULL)) usagef(_("unknown --signed-tags mode '%s'"), option); } else if (!strcmp(option, "quiet")) { show_stats = 0; diff --git a/gpg-interface.c b/gpg-interface.c index 27eacbb632628c..d517425034ee6c 100644 --- a/gpg-interface.c +++ b/gpg-interface.c @@ -1152,21 +1152,28 @@ static int sign_buffer_ssh(struct strbuf *buffer, struct strbuf *signature, return ret; } -int parse_sign_mode(const char *arg, enum sign_mode *mode) +int parse_sign_mode(const char *arg, enum sign_mode *mode, const char **keyid) { - if (!strcmp(arg, "abort")) + if (!strcmp(arg, "abort")) { *mode = SIGN_ABORT; - else if (!strcmp(arg, "verbatim") || !strcmp(arg, "ignore")) + } else if (!strcmp(arg, "verbatim") || !strcmp(arg, "ignore")) { *mode = SIGN_VERBATIM; - else if (!strcmp(arg, "warn-verbatim") || !strcmp(arg, "warn")) + } else if (!strcmp(arg, "warn-verbatim") || !strcmp(arg, "warn")) { *mode = SIGN_WARN_VERBATIM; - else if (!strcmp(arg, "warn-strip")) + } else if (!strcmp(arg, "warn-strip")) { *mode = SIGN_WARN_STRIP; - else if (!strcmp(arg, "strip")) + } else if (!strcmp(arg, "strip")) { *mode = SIGN_STRIP; - else if (!strcmp(arg, "strip-if-invalid")) + } else if (!strcmp(arg, "strip-if-invalid")) { *mode = SIGN_STRIP_IF_INVALID; - else + } else if (!strcmp(arg, "sign-if-invalid")) { + *mode = SIGN_SIGN_IF_INVALID; + } else if (skip_prefix(arg, "sign-if-invalid=", &arg)) { + *mode = SIGN_SIGN_IF_INVALID; + if (keyid) + *keyid = arg; + } else { return -1; + } return 0; } diff --git a/gpg-interface.h b/gpg-interface.h index 37f3ac42dbb99a..a365586ce1e755 100644 --- a/gpg-interface.h +++ b/gpg-interface.h @@ -120,12 +120,15 @@ enum sign_mode { SIGN_WARN_STRIP, SIGN_STRIP, SIGN_STRIP_IF_INVALID, + SIGN_SIGN_IF_INVALID, }; /* * Return 0 if `arg` can be parsed into an `enum sign_mode`. Return -1 - * otherwise. + * otherwise. If the parsed mode is SIGN_SIGN_IF_INVALID and GPG key provided in + * the arguments in the form `sign-if-invalid=`, the key-ID is parsed + * into `char **keyid`. */ -int parse_sign_mode(const char *arg, enum sign_mode *mode); +int parse_sign_mode(const char *arg, enum sign_mode *mode, const char **keyid); #endif diff --git a/t/t9305-fast-import-signatures.sh b/t/t9305-fast-import-signatures.sh index 022dae02e48177..ac4228127a9651 100755 --- a/t/t9305-fast-import-signatures.sh +++ b/t/t9305-fast-import-signatures.sh @@ -103,71 +103,111 @@ test_expect_success GPG 'strip both OpenPGP signatures with --signed-commits=war test_line_count = 2 out ' -test_expect_success GPG 'import commit with no signature with --signed-commits=strip-if-invalid' ' - git fast-export main >output && - git -C new fast-import --quiet --signed-commits=strip-if-invalid log 2>&1 && - test_must_be_empty log -' - -test_expect_success GPG 'keep valid OpenPGP signature with --signed-commits=strip-if-invalid' ' - rm -rf new && - git init new && - - git fast-export --signed-commits=verbatim openpgp-signing >output && - git -C new fast-import --quiet --signed-commits=strip-if-invalid log 2>&1 && - IMPORTED=$(git -C new rev-parse --verify refs/heads/openpgp-signing) && - test $OPENPGP_SIGNING = $IMPORTED && - git -C new cat-file commit "$IMPORTED" >actual && - test_grep -E "^gpgsig(-sha256)? " actual && - test_must_be_empty log -' - -test_expect_success GPG 'strip signature invalidated by message change with --signed-commits=strip-if-invalid' ' +for mode in strip-if-invalid sign-if-invalid +do + test_expect_success GPG "import commit with no signature with --signed-commits=$mode" ' + git fast-export main >output && + git -C new fast-import --quiet --signed-commits=$mode log 2>&1 && + test_must_be_empty log + ' + + test_expect_success GPG "keep valid OpenPGP signature with --signed-commits=$mode" ' + rm -rf new && + git init new && + + git fast-export --signed-commits=verbatim openpgp-signing >output && + git -C new fast-import --quiet --signed-commits=$mode log 2>&1 && + IMPORTED=$(git -C new rev-parse --verify refs/heads/openpgp-signing) && + test $OPENPGP_SIGNING = $IMPORTED && + git -C new cat-file commit "$IMPORTED" >actual && + test_grep -E "^gpgsig(-sha256)? " actual && + test_must_be_empty log + ' + + test_expect_success GPG "handle signature invalidated by message change with --signed-commits=$mode" ' + rm -rf new && + git init new && + + git fast-export --signed-commits=verbatim openpgp-signing >output && + + # Change the commit message, which invalidates the signature. + # The commit message length should not change though, otherwise the + # corresponding `data ` command would have to be changed too. + sed "s/OpenPGP signed commit/OpenPGP forged commit/" output >modified && + + git -C new fast-import --quiet --signed-commits=$mode log 2>&1 && + + IMPORTED=$(git -C new rev-parse --verify refs/heads/openpgp-signing) && + test $OPENPGP_SIGNING != $IMPORTED && + git -C new cat-file commit "$IMPORTED" >actual && + + if test "$mode" = strip-if-invalid + then + test_grep "stripping invalid signature" log && + test_grep ! -E "^gpgsig" actual + else + test_grep "replacing invalid signature" log && + test_grep -E "^gpgsig(-sha256)? " actual && + git -C new verify-commit "$IMPORTED" + fi + ' + + test_expect_success GPGSM "keep valid X.509 signature with --signed-commits=$mode" ' + rm -rf new && + git init new && + + git fast-export --signed-commits=verbatim x509-signing >output && + git -C new fast-import --quiet --signed-commits=$mode log 2>&1 && + IMPORTED=$(git -C new rev-parse --verify refs/heads/x509-signing) && + test $X509_SIGNING = $IMPORTED && + git -C new cat-file commit "$IMPORTED" >actual && + test_grep -E "^gpgsig(-sha256)? " actual && + test_must_be_empty log + ' + + test_expect_success GPGSSH "keep valid SSH signature with --signed-commits=$mode" ' + rm -rf new && + git init new && + + test_config -C new gpg.ssh.allowedSignersFile "${GPGSSH_ALLOWED_SIGNERS}" && + + git fast-export --signed-commits=verbatim ssh-signing >output && + git -C new fast-import --quiet --signed-commits=$mode log 2>&1 && + IMPORTED=$(git -C new rev-parse --verify refs/heads/ssh-signing) && + test $SSH_SIGNING = $IMPORTED && + git -C new cat-file commit "$IMPORTED" >actual && + test_grep -E "^gpgsig(-sha256)? " actual && + test_must_be_empty log + ' +done + +test_expect_success GPGSSH "sign invalid commit with explicit keyid" ' rm -rf new && git init new && - git fast-export --signed-commits=verbatim openpgp-signing >output && + git fast-export --signed-commits=verbatim ssh-signing >output && # Change the commit message, which invalidates the signature. # The commit message length should not change though, otherwise the # corresponding `data ` command would have to be changed too. - sed "s/OpenPGP signed commit/OpenPGP forged commit/" output >modified && - - git -C new fast-import --quiet --signed-commits=strip-if-invalid log 2>&1 && - - IMPORTED=$(git -C new rev-parse --verify refs/heads/openpgp-signing) && - test $OPENPGP_SIGNING != $IMPORTED && - git -C new cat-file commit "$IMPORTED" >actual && - test_grep ! -E "^gpgsig" actual && - test_grep "stripping invalid signature" log -' - -test_expect_success GPGSM 'keep valid X.509 signature with --signed-commits=strip-if-invalid' ' - rm -rf new && - git init new && - - git fast-export --signed-commits=verbatim x509-signing >output && - git -C new fast-import --quiet --signed-commits=strip-if-invalid log 2>&1 && - IMPORTED=$(git -C new rev-parse --verify refs/heads/x509-signing) && - test $X509_SIGNING = $IMPORTED && - git -C new cat-file commit "$IMPORTED" >actual && - test_grep -E "^gpgsig(-sha256)? " actual && - test_must_be_empty log -' - -test_expect_success GPGSSH 'keep valid SSH signature with --signed-commits=strip-if-invalid' ' - rm -rf new && - git init new && + sed "s/SSH signed commit/SSH forged commit/" output >modified && + # Configure the target repository with an invalid default signing key. + test_config -C new user.signingkey "not-a-real-key-id" && + test_config -C new gpg.format ssh && test_config -C new gpg.ssh.allowedSignersFile "${GPGSSH_ALLOWED_SIGNERS}" && + test_must_fail git -C new fast-import --quiet \ + --signed-commits=sign-if-invalid /dev/null 2>&1 && + + # Import using explicitly provided signing key. + git -C new fast-import --quiet \ + --signed-commits=sign-if-invalid="${GPGSSH_KEY_PRIMARY}" output && - git -C new fast-import --quiet --signed-commits=strip-if-invalid log 2>&1 && IMPORTED=$(git -C new rev-parse --verify refs/heads/ssh-signing) && - test $SSH_SIGNING = $IMPORTED && + test $SSH_SIGNING != $IMPORTED && git -C new cat-file commit "$IMPORTED" >actual && test_grep -E "^gpgsig(-sha256)? " actual && - test_must_be_empty log + git -C new verify-commit "$IMPORTED" ' test_done From d39cef3a1af5da30fc8db2f31b0cd23cfb45d05b Mon Sep 17 00:00:00 2001 From: Siddharth Shrimali Date: Fri, 13 Mar 2026 11:01:59 +0530 Subject: [PATCH 14/33] t0410: modernize delete_object helper The delete_object helper currently relies on a manual sed command to calculate object paths. This works, but it's a bit brittle and forces us to maintain shell logic that Git's own test suite can already handle more elegantly. Switch to 'test_oid_to_path' to let Git handle the path logic. This makes the helper hash independent, which is much cleaner than manual string manipulation. While at it, use 'local' to declare helper-specific variables and quote them to follow Git's coding style. This prevents them from leaking into global shell scope and avoids potential naming conflicts with other parts of the test suite. Helped-by: Pushkar Singh Suggested-by: Jeff King Signed-off-by: Siddharth Shrimali Signed-off-by: Junio C Hamano --- t/t0410-partial-clone.sh | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh index 2a5bdbeeb87f6e..52e19728a3fca0 100755 --- a/t/t0410-partial-clone.sh +++ b/t/t0410-partial-clone.sh @@ -11,7 +11,10 @@ test_description='partial clone' GIT_TEST_COMMIT_GRAPH=0 delete_object () { - rm $1/.git/objects/$(echo $2 | sed -e 's|^..|&/|') + local repo="$1" + local obj="$2" + local path="$repo/.git/objects/$(test_oid_to_path "$obj")" && + rm "$path" } pack_as_from_promisor () { From 9f4cdf590177eacf8d73f2295c07ad4d4e27ab19 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Fri, 13 Mar 2026 07:45:12 +0100 Subject: [PATCH 15/33] upload-pack: fix debug statement when flushing packfile data When git-upload-pack(1) writes packfile data to the client we have some logic in place that buffers some partial lines. When that buffer still contains data after git-pack-objects(1) has finished we flush the buffer so that all remaining bytes are sent out. Curiously, when we do so we also print the string "flushed." to stderr. This statement has been introduced in b1c71b7281 (upload-pack: avoid sending an incomplete pack upon failure, 2006-06-20), so quite a while ago. What's interesting though is that stderr is typically spliced through to the client-side, and consequently the client would see this message. Munging the way how we do the caching indeed confirms this: $ git clone file:///home/pks/Development/linux/ Cloning into bare repository 'linux.git'... remote: Enumerating objects: 12980346, done. remote: Counting objects: 100% (131820/131820), done. remote: Compressing objects: 100% (50290/50290), done. remote: Total 12980346 (delta 96319), reused 104500 (delta 81217), pack-reused 12848526 (from 1) Receiving objects: 100% (12980346/12980346), 3.23 GiB | 57.44 MiB/s, done. flushed. Resolving deltas: 100% (10676718/10676718), done. It's quite clear that this string shouldn't ever be visible to the client, so it rather feels like this is a left-over debug statement. The menitoned commit doesn't mention this line, either. Remove the debug output to prepare for a change in how we do the buffering in the next commit. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- upload-pack.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/upload-pack.c b/upload-pack.c index 2d2b70cbf2dd0b..c2643c0295bad8 100644 --- a/upload-pack.c +++ b/upload-pack.c @@ -457,11 +457,9 @@ static void create_pack_file(struct upload_pack_data *pack_data, } /* flush the data */ - if (output_state->used > 0) { + if (output_state->used > 0) send_client_data(1, output_state->buffer, output_state->used, pack_data->use_sideband); - fprintf(stderr, "flushed.\n"); - } free(output_state); if (pack_data->use_sideband) packet_flush(1); From 5d4b7ddce15950ed71ec60453aadfe13d9e633d5 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Fri, 13 Mar 2026 07:45:13 +0100 Subject: [PATCH 16/33] upload-pack: adapt keepalives based on buffering The function `create_pack_file()` is responsible for sending the packfile data to the client of git-upload-pack(1). As generating the bytes may take significant computing resources we also have a mechanism in place that optionally sends keepalive pktlines in case we haven't sent out any data. The keepalive logic is purely based poll(3p): we pass a timeout to that syscall, and if the call times out we send out the keepalive pktline. While reasonable, this logic isn't entirely sufficient: even if the call to poll(3p) ends because we have received data on any of the file descriptors we may not necessarily send data to the client. The most important edge case here happens in `relay_pack_data()`. When we haven't seen the initial "PACK" signature from git-pack-objects(1) yet we buffer incoming data. So in the worst case, if each of the bytes of that signature arrive shortly before the configured keepalive timeout, then we may not send out any data for a time period that is (almost) four times as long as the configured timeout. This edge case is rather unlikely to matter in practice. But in a subsequent commit we're going to adapt our buffering mechanism to become more aggressive, which makes it more likely that we don't send any data for an extended amount of time. Adapt the logic so that instead of using a fixed timeout on every call to poll(3p), we instead figure out how much time has passed since the last-sent data. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- upload-pack.c | 49 ++++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 40 insertions(+), 9 deletions(-) diff --git a/upload-pack.c b/upload-pack.c index c2643c0295bad8..04521e57c967f3 100644 --- a/upload-pack.c +++ b/upload-pack.c @@ -29,6 +29,7 @@ #include "commit-graph.h" #include "commit-reach.h" #include "shallow.h" +#include "trace.h" #include "write-or-die.h" #include "json-writer.h" #include "strmap.h" @@ -218,7 +219,8 @@ struct output_state { }; static int relay_pack_data(int pack_objects_out, struct output_state *os, - int use_sideband, int write_packfile_line) + int use_sideband, int write_packfile_line, + bool *did_send_data) { /* * We keep the last byte to ourselves @@ -232,6 +234,8 @@ static int relay_pack_data(int pack_objects_out, struct output_state *os, */ ssize_t readsz; + *did_send_data = false; + readsz = xread(pack_objects_out, os->buffer + os->used, sizeof(os->buffer) - os->used); if (readsz < 0) { @@ -247,6 +251,7 @@ static int relay_pack_data(int pack_objects_out, struct output_state *os, if (os->packfile_uris_started) packet_delim(1); packet_write_fmt(1, "\1packfile\n"); + *did_send_data = true; } break; } @@ -259,6 +264,7 @@ static int relay_pack_data(int pack_objects_out, struct output_state *os, } *p = '\0'; packet_write_fmt(1, "\1%s\n", os->buffer); + *did_send_data = true; os->used -= p - os->buffer + 1; memmove(os->buffer, p + 1, os->used); @@ -279,6 +285,7 @@ static int relay_pack_data(int pack_objects_out, struct output_state *os, os->used = 0; } + *did_send_data = true; return readsz; } @@ -290,6 +297,7 @@ static void create_pack_file(struct upload_pack_data *pack_data, char progress[128]; char abort_msg[] = "aborting due to possible repository " "corruption on the remote side."; + uint64_t last_sent_ms = 0; ssize_t sz; int i; FILE *pipe_fd; @@ -365,10 +373,14 @@ static void create_pack_file(struct upload_pack_data *pack_data, */ while (1) { + uint64_t now_ms = getnanotime() / 1000000; struct pollfd pfd[2]; - int pe, pu, pollsize, polltimeout; + int pe, pu, pollsize, polltimeout_ms; int ret; + if (!last_sent_ms) + last_sent_ms = now_ms; + reset_timeout(pack_data->timeout); pollsize = 0; @@ -390,11 +402,21 @@ static void create_pack_file(struct upload_pack_data *pack_data, if (!pollsize) break; - polltimeout = pack_data->keepalive < 0 - ? -1 - : 1000 * pack_data->keepalive; + if (pack_data->keepalive < 0) { + polltimeout_ms = -1; + } else { + /* + * The polling timeout needs to be adjusted based on + * the time we have sent our last package. The longer + * it's been in the past, the shorter the timeout + * becomes until we eventually don't block at all. + */ + polltimeout_ms = 1000 * pack_data->keepalive - (now_ms - last_sent_ms); + if (polltimeout_ms < 0) + polltimeout_ms = 0; + } - ret = poll(pfd, pollsize, polltimeout); + ret = poll(pfd, pollsize, polltimeout_ms); if (ret < 0) { if (errno != EINTR) { @@ -403,16 +425,18 @@ static void create_pack_file(struct upload_pack_data *pack_data, } continue; } + if (0 <= pe && (pfd[pe].revents & (POLLIN|POLLHUP))) { /* Status ready; we ship that in the side-band * or dump to the standard error. */ sz = xread(pack_objects.err, progress, sizeof(progress)); - if (0 < sz) + if (0 < sz) { send_client_data(2, progress, sz, pack_data->use_sideband); - else if (sz == 0) { + last_sent_ms = now_ms; + } else if (sz == 0) { close(pack_objects.err); pack_objects.err = -1; } @@ -421,11 +445,14 @@ static void create_pack_file(struct upload_pack_data *pack_data, /* give priority to status messages */ continue; } + if (0 <= pu && (pfd[pu].revents & (POLLIN|POLLHUP))) { + bool did_send_data; int result = relay_pack_data(pack_objects.out, output_state, pack_data->use_sideband, - !!uri_protocols); + !!uri_protocols, + &did_send_data); if (result == 0) { close(pack_objects.out); @@ -433,6 +460,9 @@ static void create_pack_file(struct upload_pack_data *pack_data, } else if (result < 0) { goto fail; } + + if (did_send_data) + last_sent_ms = now_ms; } /* @@ -448,6 +478,7 @@ static void create_pack_file(struct upload_pack_data *pack_data, if (!ret && pack_data->use_sideband) { static const char buf[] = "0005\1"; write_or_die(1, buf, 5); + last_sent_ms = now_ms; } } From c422ec886c92d27b702a2b8cf657cede54658d94 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Fri, 13 Mar 2026 07:45:14 +0100 Subject: [PATCH 17/33] upload-pack: prefer flushing data over sending keepalive When using the sideband in git-upload-pack(1) we know to send out keepalive packets in case generating the pack takes too long. These keepalives take the form of a simple empty pktline. In the preceding commit we have adapted git-upload-pack(1) to buffer data more aggressively before sending it to the client. This creates an obvious optimization opportunity: when we hit the keepalive timeout while we still hold on to some buffered data, then it makes more sense to flush out the data instead of sending the empty keepalive packet. This is overall not going to be a significant win. Most keepalives will come before the pack data starts, and once pack-objects starts producing data, it tends to do so pretty consistently. And of course we can't send data before we see the PACK header, because the whole point is to buffer the early bit waiting for packfile URIs. But the optimization is easy enough to realize. Do so and flush out data instead of sending an empty pktline. Suggested-by: Jeff King Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- upload-pack.c | 21 +++++++++++++++------ 1 file changed, 15 insertions(+), 6 deletions(-) diff --git a/upload-pack.c b/upload-pack.c index 04521e57c967f3..ef8f8189c1714f 100644 --- a/upload-pack.c +++ b/upload-pack.c @@ -466,18 +466,27 @@ static void create_pack_file(struct upload_pack_data *pack_data, } /* - * We hit the keepalive timeout without saying anything; send - * an empty message on the data sideband just to let the other - * side know we're still working on it, but don't have any data - * yet. + * We hit the keepalive timeout without saying anything. If we + * have pending data we flush it out to the caller now. + * Otherwise, we send an empty message on the data sideband + * just to let the other side know we're still working on it, + * but don't have any data yet. * * If we don't have a sideband channel, there's no room in the * protocol to say anything, so those clients are just out of * luck. */ if (!ret && pack_data->use_sideband) { - static const char buf[] = "0005\1"; - write_or_die(1, buf, 5); + if (output_state->packfile_started && output_state->used > 1) { + send_client_data(1, output_state->buffer, output_state->used - 1, + pack_data->use_sideband); + output_state->buffer[0] = output_state->buffer[output_state->used - 1]; + output_state->used = 1; + } else { + static const char buf[] = "0005\1"; + write_or_die(1, buf, 5); + } + last_sent_ms = now_ms; } } From a5816e4596891f1c7552b049053e94929c699b78 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Fri, 13 Mar 2026 07:45:15 +0100 Subject: [PATCH 18/33] upload-pack: reduce lock contention when writing packfile data MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit In our production systems we have recently observed write contention in git-upload-pack(1). The system in question was consistently streaming packfiles at a rate of dozens of gigabits per second, but curiously the system was neither bottlenecked on CPU, memory or IOPS. We eventually discovered that Git was spending 80% of its time in `pipe_write()`, out of which almost all of the time was spent in the `ep_poll_callback` function in the kernel. Quoting the reporter: This infrastructure is part of an event notification queue designed to allow for multiple producers to emit events, but that concurrency safety is guarded by 3 layers of locking. The layer we're hitting contention in uses a simple reader/writer lock mode (a.k.a. shared versus exclusive mode), where producers need shared-mode (read mode), and various other actions use exclusive (write) mode. The system in question generates workloads where we have hundreds of git-upload-pack(1) processes active at the same point in time. These processes end up contending around those locks, and the consequence is that the Git processes stall. Now git-upload-pack(1) already has the infrastructure in place to buffer some of the data it reads from git-pack-objects(1) before actually sending it out. We only use this infrastructure in very limited ways though, so we generally end up matching one read(3p) call with one write(3p) call. Even worse, when the sideband is enabled we end up matching one read with _two_ writes: one for the pkt-line length, and one for the packfile data. Extend our use of the buffering infrastructure so that we soak up bytes until the buffer is filled up at least 2/3rds of its capacity. The change is relatively simple to implement as we already know to flush the buffer in `create_pack_file()` after git-pack-objects(1) has finished. This significantly reduces the number of write(3p) syscalls we need to do. Before this change, cloning the Linux repository resulted in around 400,000 write(3p) syscalls. With the buffering in place we only do around 130,000 syscalls. Now we could of course go even further and make sure that we always fill up the whole buffer. But this might cause an increase in read(3p) syscalls, and some tests show that this only reduces the number of write(3p) syscalls from 130,000 to 100,000. So overall this doesn't seem worth it. Note that the issue could also be fixed by adapting the write buffer that we use in the downstream git-pack-objects(1) command, and such a change would have roughly the same result. But the command that generates the packfile data may not always be git-pack-objects(1) as it can be changed via "uploadpack.packObjectsHook", so such a fix would only help in _some_ cases. Regardless of that, we'll also adapt the write buffer size of git-pack-objects(1) in a subsequent commit. Helped-by: Matt Smiley Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- upload-pack.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/upload-pack.c b/upload-pack.c index ef8f8189c1714f..92e1ff3ba2ec6e 100644 --- a/upload-pack.c +++ b/upload-pack.c @@ -276,6 +276,13 @@ static int relay_pack_data(int pack_objects_out, struct output_state *os, } } + /* + * Make sure that we buffer some data before sending it to the client. + * This significantly reduces the number of write(3p) syscalls. + */ + if (readsz && os->used < (sizeof(os->buffer) * 2 / 3)) + return readsz; + if (os->used > 1) { send_client_data(1, os->buffer, os->used - 1, use_sideband); os->buffer[0] = os->buffer[os->used - 1]; From 3b9b2c2a29a1d529ca9884fa0a6529f6e2496abe Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Fri, 13 Mar 2026 07:45:16 +0100 Subject: [PATCH 19/33] compat/posix: introduce writev(3p) wrapper In a subsequent commit we're going to add the first caller to writev(3p). Introduce a compatibility wrapper for this syscall that we can use on systems that don't have this syscall. The syscall exists on modern Unixes like Linux and macOS, and seemingly even for NonStop according to [1]. It doesn't seem to exist on Windows though. [1]: http://nonstoptools.com/manuals/OSS-SystemCalls.pdf [2]: https://www.gnu.org/software/gnulib/manual/html_node/writev.html Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- Makefile | 4 ++++ compat/posix.h | 14 ++++++++++++++ compat/writev.c | 44 ++++++++++++++++++++++++++++++++++++++++++++ config.mak.uname | 2 ++ meson.build | 1 + 5 files changed, 65 insertions(+) create mode 100644 compat/writev.c diff --git a/Makefile b/Makefile index 8aa489f3b6812f..61c7dff942a0fe 100644 --- a/Makefile +++ b/Makefile @@ -2015,6 +2015,10 @@ ifdef NO_PREAD COMPAT_CFLAGS += -DNO_PREAD COMPAT_OBJS += compat/pread.o endif +ifdef NO_WRITEV + COMPAT_CFLAGS += -DNO_WRITEV + COMPAT_OBJS += compat/writev.o +endif ifdef NO_FAST_WORKING_DIRECTORY BASIC_CFLAGS += -DNO_FAST_WORKING_DIRECTORY endif diff --git a/compat/posix.h b/compat/posix.h index 245386fa4a9f4e..3c611d2736c47a 100644 --- a/compat/posix.h +++ b/compat/posix.h @@ -137,6 +137,9 @@ #include #include #include +#ifndef NO_WRITEV +#include +#endif #include #ifndef NO_SYS_SELECT_H #include @@ -323,6 +326,17 @@ int git_lstat(const char *, struct stat *); ssize_t git_pread(int fd, void *buf, size_t count, off_t offset); #endif +#ifdef NO_WRITEV +#define writev git_writev +#define iovec git_iovec +struct git_iovec { + void *iov_base; + size_t iov_len; +}; + +ssize_t git_writev(int fd, const struct iovec *iov, int iovcnt); +#endif + #ifdef NO_SETENV #define setenv gitsetenv int gitsetenv(const char *, const char *, int); diff --git a/compat/writev.c b/compat/writev.c new file mode 100644 index 00000000000000..3a94870a2f5855 --- /dev/null +++ b/compat/writev.c @@ -0,0 +1,44 @@ +#include "../git-compat-util.h" +#include "../wrapper.h" + +ssize_t git_writev(int fd, const struct iovec *iov, int iovcnt) +{ + size_t total_written = 0; + size_t sum = 0; + + /* + * According to writev(3p), the syscall shall error with EINVAL in case + * the sum of `iov_len` overflows `ssize_t`. + */ + for (int i = 0; i < iovcnt; i++) { + if (iov[i].iov_len > maximum_signed_value_of_type(ssize_t) || + iov[i].iov_len + sum > maximum_signed_value_of_type(ssize_t)) { + errno = EINVAL; + return -1; + } + + sum += iov[i].iov_len; + } + + for (int i = 0; i < iovcnt; i++) { + const char *bytes = iov[i].iov_base; + size_t iovec_written = 0; + + while (iovec_written < iov[i].iov_len) { + ssize_t bytes_written = xwrite(fd, bytes + iovec_written, + iov[i].iov_len - iovec_written); + if (bytes_written < 0) { + if (total_written) + goto out; + return bytes_written; + } + if (!bytes_written) + goto out; + iovec_written += bytes_written; + total_written += bytes_written; + } + } + +out: + return (ssize_t) total_written; +} diff --git a/config.mak.uname b/config.mak.uname index 3c35ae33a3c0c0..8e66a1cc0a04c2 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -457,6 +457,7 @@ ifeq ($(uname_S),Windows) SANE_TOOL_PATH ?= $(msvc_bin_dir_msys) HAVE_ALLOCA_H = YesPlease NO_PREAD = YesPlease + NO_WRITEV = YesPlease NEEDS_CRYPTO_WITH_SSL = YesPlease NO_LIBGEN_H = YesPlease NO_POLL = YesPlease @@ -672,6 +673,7 @@ ifeq ($(uname_S),MINGW) pathsep = ; HAVE_ALLOCA_H = YesPlease NO_PREAD = YesPlease + NO_WRITEV = YesPlease NEEDS_CRYPTO_WITH_SSL = YesPlease NO_LIBGEN_H = YesPlease NO_POLL = YesPlease diff --git a/meson.build b/meson.build index dd52efd1c87574..f1cd89efc7d38e 100644 --- a/meson.build +++ b/meson.build @@ -1405,6 +1405,7 @@ checkfuncs = { 'initgroups' : [], 'strtoumax' : ['strtoumax.c', 'strtoimax.c'], 'pread' : ['pread.c'], + 'writev' : ['writev.c'], } if host_machine.system() == 'windows' From 1970fcef93adcc5a35f6468d00a5a634d5af2b3c Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Fri, 13 Mar 2026 07:45:17 +0100 Subject: [PATCH 20/33] wrapper: introduce writev(3p) wrappers In the preceding commit we have added a compatibility wrapper for the writev(3p) syscall. Introduce some generic wrappers for this function that we nowadays take for granted in the Git codebase. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- wrapper.c | 41 +++++++++++++++++++++++++++++++++++++++++ wrapper.h | 9 +++++++++ write-or-die.c | 8 ++++++++ write-or-die.h | 1 + 4 files changed, 59 insertions(+) diff --git a/wrapper.c b/wrapper.c index b794fb20e71879..fa40f399b3452d 100644 --- a/wrapper.c +++ b/wrapper.c @@ -323,6 +323,47 @@ ssize_t write_in_full(int fd, const void *buf, size_t count) return total; } +ssize_t writev_in_full(int fd, struct iovec *iov, int iovcnt) +{ + ssize_t total_written = 0; + + while (iovcnt) { + ssize_t bytes_written = writev(fd, iov, iovcnt); + if (bytes_written < 0) { + if (errno == EINTR || errno == EAGAIN) + continue; + return -1; + } + if (!bytes_written) { + errno = ENOSPC; + return -1; + } + + total_written += bytes_written; + + /* + * We first need to discard any iovec entities that have been + * fully written. + */ + while (iovcnt && (size_t)bytes_written >= iov->iov_len) { + bytes_written -= iov->iov_len; + iov++; + iovcnt--; + } + + /* + * Finally, we need to adjust the last iovec in case we have + * performed a partial write. + */ + if (iovcnt && bytes_written) { + iov->iov_base = (char *) iov->iov_base + bytes_written; + iov->iov_len -= bytes_written; + } + } + + return total_written; +} + ssize_t pread_in_full(int fd, void *buf, size_t count, off_t offset) { char *p = buf; diff --git a/wrapper.h b/wrapper.h index 15ac3bab6e9748..27519b32d1782d 100644 --- a/wrapper.h +++ b/wrapper.h @@ -47,6 +47,15 @@ ssize_t read_in_full(int fd, void *buf, size_t count); ssize_t write_in_full(int fd, const void *buf, size_t count); ssize_t pread_in_full(int fd, void *buf, size_t count, off_t offset); +/* + * Try to write all iovecs. Returns -1 in case an error occurred with a proper + * errno set, the number of bytes written otherwise. + * + * Note that the iovec will be modified as a result of this call to adjust for + * partial writes! + */ +ssize_t writev_in_full(int fd, struct iovec *iov, int iovcnt); + static inline ssize_t write_str_in_full(int fd, const char *str) { return write_in_full(fd, str, strlen(str)); diff --git a/write-or-die.c b/write-or-die.c index 01a9a51fa2fcd7..5f522fb7287382 100644 --- a/write-or-die.c +++ b/write-or-die.c @@ -96,6 +96,14 @@ void write_or_die(int fd, const void *buf, size_t count) } } +void writev_or_die(int fd, struct iovec *iov, int iovlen) +{ + if (writev_in_full(fd, iov, iovlen) < 0) { + check_pipe(errno); + die_errno("writev error"); + } +} + void fwrite_or_die(FILE *f, const void *buf, size_t count) { if (fwrite(buf, 1, count, f) != count) diff --git a/write-or-die.h b/write-or-die.h index 65a5c42a47ac86..ae3d7d88b87699 100644 --- a/write-or-die.h +++ b/write-or-die.h @@ -7,6 +7,7 @@ void fprintf_or_die(FILE *, const char *fmt, ...); void fwrite_or_die(FILE *f, const void *buf, size_t count); void fflush_or_die(FILE *f); void write_or_die(int fd, const void *buf, size_t count); +void writev_or_die(int fd, struct iovec *iov, int iovlen); /* * These values are used to help identify parts of a repository to fsync. From 26986f4cbaf38d84a82b0b35da211389ce49552c Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Fri, 13 Mar 2026 07:45:18 +0100 Subject: [PATCH 21/33] sideband: use writev(3p) to send pktlines Every pktline that we send out via `send_sideband()` currently requires two syscalls: one to write the pktline's length, and one to send its data. This typically isn't all that much of a problem, but under extreme load the syscalls may cause contention in the kernel. Refactor the code to instead use the newly introduced writev(3p) infra so that we can send out the data with a single syscall. This reduces the number of syscalls from around 133,000 calls to write(3p) to around 67,000 calls to writev(3p). Suggested-by: Jeff King Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- sideband.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/sideband.c b/sideband.c index ea7c25211ef7e1..1ed6614eaf1baf 100644 --- a/sideband.c +++ b/sideband.c @@ -264,6 +264,7 @@ void send_sideband(int fd, int band, const char *data, ssize_t sz, int packet_ma const char *p = data; while (sz) { + struct iovec iov[2]; unsigned n; char hdr[5]; @@ -273,12 +274,19 @@ void send_sideband(int fd, int band, const char *data, ssize_t sz, int packet_ma if (0 <= band) { xsnprintf(hdr, sizeof(hdr), "%04x", n + 5); hdr[4] = band; - write_or_die(fd, hdr, 5); + iov[0].iov_base = hdr; + iov[0].iov_len = 5; } else { xsnprintf(hdr, sizeof(hdr), "%04x", n + 4); - write_or_die(fd, hdr, 4); + iov[0].iov_base = hdr; + iov[0].iov_len = 4; } - write_or_die(fd, p, n); + + iov[1].iov_base = (void *) p; + iov[1].iov_len = n; + + writev_or_die(fd, iov, ARRAY_SIZE(iov)); + p += n; sz -= n; } From a1118c0a44606e0b71e515b05112ff38fef989c0 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Fri, 13 Mar 2026 07:45:19 +0100 Subject: [PATCH 22/33] csum-file: introduce `hashfd_ext()` Introduce a new `hashfd_ext()` function that takes an options structure. This function will replace `hashd_throughput()` in the next commit. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- csum-file.c | 22 +++++++++++++--------- csum-file.h | 14 ++++++++++++++ 2 files changed, 27 insertions(+), 9 deletions(-) diff --git a/csum-file.c b/csum-file.c index 6e21e3cac8a636..a50416247edf25 100644 --- a/csum-file.c +++ b/csum-file.c @@ -161,17 +161,16 @@ struct hashfile *hashfd_check(const struct git_hash_algo *algop, return f; } -static struct hashfile *hashfd_internal(const struct git_hash_algo *algop, - int fd, const char *name, - struct progress *tp, - size_t buffer_len) +struct hashfile *hashfd_ext(const struct git_hash_algo *algop, + int fd, const char *name, + const struct hashfd_options *opts) { struct hashfile *f = xmalloc(sizeof(*f)); f->fd = fd; f->check_fd = -1; f->offset = 0; f->total = 0; - f->tp = tp; + f->tp = opts->progress; f->name = name; f->do_crc = 0; f->skip_hash = 0; @@ -179,8 +178,8 @@ static struct hashfile *hashfd_internal(const struct git_hash_algo *algop, f->algop = unsafe_hash_algo(algop); f->algop->init_fn(&f->ctx); - f->buffer_len = buffer_len; - f->buffer = xmalloc(buffer_len); + f->buffer_len = opts->buffer_len ? opts->buffer_len : 128 * 1024; + f->buffer = xmalloc(f->buffer_len); f->check_buffer = NULL; return f; @@ -194,7 +193,8 @@ struct hashfile *hashfd(const struct git_hash_algo *algop, * measure the rate of data passing through this hashfile, * use a larger buffer size to reduce fsync() calls. */ - return hashfd_internal(algop, fd, name, NULL, 128 * 1024); + struct hashfd_options opts = { 0 }; + return hashfd_ext(algop, fd, name, &opts); } struct hashfile *hashfd_throughput(const struct git_hash_algo *algop, @@ -206,7 +206,11 @@ struct hashfile *hashfd_throughput(const struct git_hash_algo *algop, * size so the progress indicators arrive at a more * frequent rate. */ - return hashfd_internal(algop, fd, name, tp, 8 * 1024); + struct hashfd_options opts = { + .progress = tp, + .buffer_len = 8 * 1024, + }; + return hashfd_ext(algop, fd, name, &opts); } void hashfile_checkpoint_init(struct hashfile *f, diff --git a/csum-file.h b/csum-file.h index 07ae11024afc34..a03b60120d85f1 100644 --- a/csum-file.h +++ b/csum-file.h @@ -45,6 +45,20 @@ int hashfile_truncate(struct hashfile *, struct hashfile_checkpoint *); #define CSUM_FSYNC 2 #define CSUM_HASH_IN_STREAM 4 +struct hashfd_options { + /* + * Throughput progress that counts the number of bytes that have been + * hashed. + */ + struct progress *progress; + + /* The length of the buffer that shall be used read read data. */ + size_t buffer_len; +}; + +struct hashfile *hashfd_ext(const struct git_hash_algo *algop, + int fd, const char *name, + const struct hashfd_options *opts); struct hashfile *hashfd(const struct git_hash_algo *algop, int fd, const char *name); struct hashfile *hashfd_check(const struct git_hash_algo *algop, From 2bf8f36ddb308b084912f8265ad6fd60df34a036 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Fri, 13 Mar 2026 07:45:20 +0100 Subject: [PATCH 23/33] csum-file: drop `hashfd_throughput()` The `hashfd_throughput()` function is used by a single callsite in git-pack-objects(1). In contrast to `hashfd()`, this function uses a progress meter to measure throughput and a smaller buffer length so that the progress meter can provide more granular metrics. We're going to change that caller in the next commit to be a bit more specific to packing objects. As such, `hashfd_throughput()` will be a somewhat unfitting mechanism for any potential new callers. Drop the function and replace it with a call to `hashfd_ext()`. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- builtin/pack-objects.c | 19 +++++++++++++++---- csum-file.c | 16 ---------------- csum-file.h | 2 -- 3 files changed, 15 insertions(+), 22 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 5846b6a293a27b..ba150a80ada2df 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -1331,11 +1331,22 @@ static void write_pack_file(void) unsigned char hash[GIT_MAX_RAWSZ]; char *pack_tmp_name = NULL; - if (pack_to_stdout) - f = hashfd_throughput(the_repository->hash_algo, 1, - "", progress_state); - else + if (pack_to_stdout) { + /* + * Since we are expecting to report progress of the + * write into this hashfile, use a smaller buffer + * size so the progress indicators arrive at a more + * frequent rate. + */ + struct hashfd_options opts = { + .progress = progress_state, + .buffer_len = 8 * 1024, + }; + f = hashfd_ext(the_repository->hash_algo, 1, + "", &opts); + } else { f = create_tmp_packfile(the_repository, &pack_tmp_name); + } offset = write_pack_header(f, nr_remaining); diff --git a/csum-file.c b/csum-file.c index a50416247edf25..5dfaca55435d8f 100644 --- a/csum-file.c +++ b/csum-file.c @@ -197,22 +197,6 @@ struct hashfile *hashfd(const struct git_hash_algo *algop, return hashfd_ext(algop, fd, name, &opts); } -struct hashfile *hashfd_throughput(const struct git_hash_algo *algop, - int fd, const char *name, struct progress *tp) -{ - /* - * Since we are expecting to report progress of the - * write into this hashfile, use a smaller buffer - * size so the progress indicators arrive at a more - * frequent rate. - */ - struct hashfd_options opts = { - .progress = tp, - .buffer_len = 8 * 1024, - }; - return hashfd_ext(algop, fd, name, &opts); -} - void hashfile_checkpoint_init(struct hashfile *f, struct hashfile_checkpoint *checkpoint) { diff --git a/csum-file.h b/csum-file.h index a03b60120d85f1..01472555c81e3b 100644 --- a/csum-file.h +++ b/csum-file.h @@ -63,8 +63,6 @@ struct hashfile *hashfd(const struct git_hash_algo *algop, int fd, const char *name); struct hashfile *hashfd_check(const struct git_hash_algo *algop, const char *name); -struct hashfile *hashfd_throughput(const struct git_hash_algo *algop, - int fd, const char *name, struct progress *tp); /* * Free the hashfile without flushing its contents to disk. This only From 835e0aaf6f0e07e9f9a393ed0e456db7c1be33ef Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Fri, 13 Mar 2026 07:45:21 +0100 Subject: [PATCH 24/33] builtin/pack-objects: reduce lock contention when writing packfile data When running `git pack-objects --stdout` we feed the data through `hashfd_ext()` with a progress meter and a smaller-than-usual buffer length of 8kB so that we can track throughput more granularly. But as packfiles tend to be on the larger side, this small buffer size may cause a ton of write(3p) syscalls. Originally, the buffer we used in `hashfd()` was 8kB for all use cases. This was changed though in 2ca245f8be (csum-file.h: increase hashfile buffer size, 2021-05-18) because we noticed that the number of writes can have an impact on performance. So the buffer size was increased to 128kB, which improved performance a bit for some use cases. But the commit didn't touch the buffer size for `hashd_throughput()`. The reasoning here was that callers expect the progress indicator to update frequently, and a larger buffer size would of course reduce the update frequency especially on slow networks. While that is of course true, there was (and still is, even though it's now a call to `hashfd_ext()`) only a single caller of this function in git-pack-objects(1). This command is responsible for writing packfiles, and those packfiles are often on the bigger side. So arguably: - The user won't care about increments of 8kB when packfiles tend to be megabytes or even gigabytes in size. - Reducing the number of syscalls would be even more valuable here than it would be for multi-pack indices, which was the benchmark done in the mentioned commit, as MIDXs are typically significantly smaller than packfiles. - Nowadays, many internet connections should be able to transfer data at a rate significantly higher than 8kB per second. Update the buffer to instead have a size of `LARGE_PACKET_DATA_MAX - 1`, which translates to ~64kB. This limit was chosen because `git pack-objects --stdout` is most often used when sending packfiles via git-upload-pack(1), where packfile data is chunked into pktlines when using the sideband. Furthermore, most internet connections should have a bandwidth signifcantly higher than 64kB/s, so we'd still be able to observe progress updates at a rate of at least once per second. This change significantly reduces the number of write(3p) syscalls from 355,000 to 44,000 when packing the Linux repository. While this results in a small performance improvement on an otherwise-unused system, this improvement is mostly negligible. More importantly though, it will reduce lock contention in the kernel on an extremely busy system where we have many processes writing data at once. Suggested-by: Jeff King Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- builtin/pack-objects.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index ba150a80ada2df..7301ed8c681dc5 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -41,6 +41,7 @@ #include "promisor-remote.h" #include "pack-mtimes.h" #include "parse-options.h" +#include "pkt-line.h" #include "blob.h" #include "tree.h" #include "path-walk.h" @@ -1333,14 +1334,17 @@ static void write_pack_file(void) if (pack_to_stdout) { /* - * Since we are expecting to report progress of the - * write into this hashfile, use a smaller buffer - * size so the progress indicators arrive at a more - * frequent rate. + * This command is most often invoked via + * git-upload-pack(1), which will typically chunk data + * into pktlines. As such, we use the maximum data + * length of them as buffer length. + * + * Note that we need to subtract one though to + * accomodate for the sideband byte. */ struct hashfd_options opts = { .progress = progress_state, - .buffer_len = 8 * 1024, + .buffer_len = LARGE_PACKET_DATA_MAX - 1, }; f = hashfd_ext(the_repository->hash_algo, 1, "", &opts); From 2f0503971716ac21f37a0822a39740324b89c054 Mon Sep 17 00:00:00 2001 From: Ritesh Singh Jadoun Date: Sun, 15 Mar 2026 13:40:32 +0530 Subject: [PATCH 25/33] t/pack-refs-tests: use test_path_is_missing The pack-refs tests previously used raw 'test -f' and 'test -e' checks with negation. Update them to use Git's standard helper function test_path_is_missing for consistency and clearer failure reporting. As suggested in review, replaced the negated 'test_path_exists' with test_path_is_missing to better reflect the expected absence of paths. Signed-off-by: Ritesh Singh Jadoun Signed-off-by: Junio C Hamano --- t/pack-refs-tests.sh | 28 ++++++++++++++-------------- 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/t/pack-refs-tests.sh b/t/pack-refs-tests.sh index 81086c369089b9..0b196ea511863f 100644 --- a/t/pack-refs-tests.sh +++ b/t/pack-refs-tests.sh @@ -61,13 +61,13 @@ test_expect_success 'see if a branch still exists after git ${pack_refs} --prune test_expect_success 'see if git ${pack_refs} --prune remove ref files' ' git branch f && git ${pack_refs} --all --prune && - ! test -f .git/refs/heads/f + test_path_is_missing .git/refs/heads/f ' test_expect_success 'see if git ${pack_refs} --prune removes empty dirs' ' git branch r/s/t && git ${pack_refs} --all --prune && - ! test -e .git/refs/heads/r + test_path_is_missing .git/refs/heads/r ' test_expect_success 'git branch g should work when git branch g/h has been deleted' ' @@ -111,43 +111,43 @@ test_expect_success 'test excluded refs are not packed' ' git branch dont_pack2 && git branch pack_this && git ${pack_refs} --all --exclude "refs/heads/dont_pack*" && - test -f .git/refs/heads/dont_pack1 && - test -f .git/refs/heads/dont_pack2 && - ! test -f .git/refs/heads/pack_this' + test_path_is_file .git/refs/heads/dont_pack1 && + test_path_is_file .git/refs/heads/dont_pack2 && + test_path_is_missing .git/refs/heads/pack_this' test_expect_success 'test --no-exclude refs clears excluded refs' ' git branch dont_pack3 && git branch dont_pack4 && git ${pack_refs} --all --exclude "refs/heads/dont_pack*" --no-exclude && - ! test -f .git/refs/heads/dont_pack3 && - ! test -f .git/refs/heads/dont_pack4' + test_path_is_missing .git/refs/heads/dont_pack3 && + test_path_is_missing .git/refs/heads/dont_pack4' test_expect_success 'test only included refs are packed' ' git branch pack_this1 && git branch pack_this2 && git tag dont_pack5 && git ${pack_refs} --include "refs/heads/pack_this*" && - test -f .git/refs/tags/dont_pack5 && - ! test -f .git/refs/heads/pack_this1 && - ! test -f .git/refs/heads/pack_this2' + test_path_is_file .git/refs/tags/dont_pack5 && + test_path_is_missing .git/refs/heads/pack_this1 && + test_path_is_missing .git/refs/heads/pack_this2' test_expect_success 'test --no-include refs clears included refs' ' git branch pack1 && git branch pack2 && git ${pack_refs} --include "refs/heads/pack*" --no-include && - test -f .git/refs/heads/pack1 && - test -f .git/refs/heads/pack2' + test_path_is_file .git/refs/heads/pack1 && + test_path_is_file .git/refs/heads/pack2' test_expect_success 'test --exclude takes precedence over --include' ' git branch dont_pack5 && git ${pack_refs} --include "refs/heads/pack*" --exclude "refs/heads/pack*" && - test -f .git/refs/heads/dont_pack5' + test_path_is_file .git/refs/heads/dont_pack5' test_expect_success 'see if up-to-date packed refs are preserved' ' git branch q && git ${pack_refs} --all --prune && git update-ref refs/heads/q refs/heads/q && - ! test -f .git/refs/heads/q + test_path_is_missing .git/refs/heads/q ' test_expect_success 'pack, prune and repack' ' From 65fec23b577d09122865d239ce454d7946691c2a Mon Sep 17 00:00:00 2001 From: Deveshi Dwivedi Date: Sun, 15 Mar 2026 09:44:43 +0000 Subject: [PATCH 26/33] coccinelle: detect struct strbuf passed by value Passing a struct strbuf by value to a function copies the struct but shares the underlying character array between caller and callee. If the callee causes a reallocation, the caller's copy becomes a dangling pointer, leading to a double-free when strbuf_release() is called. There is no coccinelle rule to catch this pattern. Jeff King suggested adding one during review of the write_worktree_linking_files() fix [1], and noted that a reporting rule using coccinelle's Python scripting extensions could emit a descriptive warning, but we do not currently require Python support in coccinelle. Add a transformation rule that rewrites a by-value strbuf parameter to a pointer. The detection is identical to what a Python-based reporting rule would catch; only the presentation differs. The resulting diff will not produce compilable code on its own (callers and the function body still need updating), but the spatch output alerts the developer that the signature needs attention. This is consistent with the other rules in strbuf.cocci, which also rewrite to the preferred form. [1] https://lore.kernel.org/git/20260309192600.GC309867@coredump.intra.peff.net/ Signed-off-by: Deveshi Dwivedi Signed-off-by: Junio C Hamano --- contrib/coccinelle/strbuf.cocci | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/contrib/coccinelle/strbuf.cocci b/contrib/coccinelle/strbuf.cocci index 5f06105df6db7b..83bd93be5fce8d 100644 --- a/contrib/coccinelle/strbuf.cocci +++ b/contrib/coccinelle/strbuf.cocci @@ -60,3 +60,14 @@ expression E1, E2; @@ - strbuf_addstr(E1, real_path(E2)); + strbuf_add_real_path(E1, E2); + +@@ +identifier fn, param; +@@ + fn(..., +- struct strbuf param ++ struct strbuf *param + ,...) + { + ... + } From 8f8e1b080701d4b02ca3732eed2706b1a5328c5d Mon Sep 17 00:00:00 2001 From: Deveshi Dwivedi Date: Sun, 15 Mar 2026 09:44:44 +0000 Subject: [PATCH 27/33] stash: do not pass strbuf by value save_untracked_files() takes its 'files' parameter as struct strbuf by value. Passing a strbuf by value copies the struct but shares the underlying buffer between caller and callee, risking a dangling pointer and double-free if the callee reallocates. The function needs both the buffer and its length for pipe_command(), so a plain const char * is not sufficient here. Switch the parameter to struct strbuf * and update the caller to pass a pointer. Signed-off-by: Deveshi Dwivedi Signed-off-by: Junio C Hamano --- builtin/stash.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/builtin/stash.c b/builtin/stash.c index e79d612e572e7c..472eebd6ed656d 100644 --- a/builtin/stash.c +++ b/builtin/stash.c @@ -1232,7 +1232,7 @@ static int check_changes(const struct pathspec *ps, int include_untracked, } static int save_untracked_files(struct stash_info *info, struct strbuf *msg, - struct strbuf files) + struct strbuf *files) { int ret = 0; struct strbuf untracked_msg = STRBUF_INIT; @@ -1246,7 +1246,7 @@ static int save_untracked_files(struct stash_info *info, struct strbuf *msg, stash_index_path.buf); strbuf_addf(&untracked_msg, "untracked files on %s\n", msg->buf); - if (pipe_command(&cp_upd_index, files.buf, files.len, NULL, 0, + if (pipe_command(&cp_upd_index, files->buf, files->len, NULL, 0, NULL, 0)) { ret = -1; goto done; @@ -1499,7 +1499,7 @@ static int do_create_stash(const struct pathspec *ps, struct strbuf *stash_msg_b parents = NULL; if (include_untracked) { - if (save_untracked_files(info, &msg, untracked_files)) { + if (save_untracked_files(info, &msg, &untracked_files)) { if (!quiet) fprintf_ln(stderr, _("Cannot save " "the untracked files")); From 3cf4024117f5c15361e68c5d9b0c3e9dd01ec105 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Mon, 16 Mar 2026 08:50:43 +0100 Subject: [PATCH 28/33] clar: update to fix compilation on platforms without PATH_MAX Update clar to e4172e3 (Merge pull request #134 from clar-test/ethomson/const, 2026-01-10). Besides some changes to "generate.py" which don't have any impact on us, this commit also fixes compilation on platforms that don't have PATH_MAX, like for example GNU/Hurd. Reported-by: Samuel Thibault Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- t/unit-tests/clar/clar.h | 4 +- t/unit-tests/clar/generate.py | 79 ++++++++++++++++++++++++++++------- 2 files changed, 68 insertions(+), 15 deletions(-) diff --git a/t/unit-tests/clar/clar.h b/t/unit-tests/clar/clar.h index f7e43630226434..9ea91d3d0e88c2 100644 --- a/t/unit-tests/clar/clar.h +++ b/t/unit-tests/clar/clar.h @@ -15,8 +15,10 @@ # define CLAR_MAX_PATH 4096 #elif defined(_WIN32) # define CLAR_MAX_PATH MAX_PATH -#else +#elif defined(PATH_MAX) # define CLAR_MAX_PATH PATH_MAX +#else +# define CLAR_MAX_PATH 4096 #endif #ifndef CLAR_SELFTEST diff --git a/t/unit-tests/clar/generate.py b/t/unit-tests/clar/generate.py index fd2f0ee83b55c5..2357b2d6d299dc 100755 --- a/t/unit-tests/clar/generate.py +++ b/t/unit-tests/clar/generate.py @@ -8,7 +8,7 @@ from __future__ import with_statement from string import Template -import re, fnmatch, os, sys, codecs, pickle +import re, fnmatch, os, sys, codecs, pickle, io class Module(object): class Template(object): @@ -147,7 +147,7 @@ def __init__(self, path, output): self.path = path self.output = output - def should_generate(self, path): + def maybe_generate(self, path): if not os.path.isfile(path): return True @@ -223,34 +223,85 @@ def callback_count(self): return sum(len(module.callbacks) for module in self.modules.values()) def write(self): - output = os.path.join(self.output, 'clar.suite') - os.makedirs(self.output, exist_ok=True) + if not os.path.exists(self.output): + os.makedirs(self.output) - if not self.should_generate(output): + wrote_suite = self.write_suite() + wrote_header = self.write_header() + + if wrote_suite or wrote_header: + self.save_cache() + return True + + return False + + def write_output(self, fn, data): + if not self.maybe_generate(fn): + return False + + current = None + + try: + with open(fn, 'r') as input: + current = input.read() + except OSError: + pass + except IOError: + pass + + if current == data: return False - with open(output, 'w') as data: + with open(fn, 'w') as output: + output.write(data) + + return True + + def write_suite(self): + suite_fn = os.path.join(self.output, 'clar.suite') + + with io.StringIO() as suite_file: modules = sorted(self.modules.values(), key=lambda module: module.name) for module in modules: t = Module.DeclarationTemplate(module) - data.write(t.render()) + suite_file.write(t.render()) for module in modules: t = Module.CallbacksTemplate(module) - data.write(t.render()) + suite_file.write(t.render()) suites = "static struct clar_suite _clar_suites[] = {" + ','.join( Module.InfoTemplate(module).render() for module in modules ) + "\n};\n" - data.write(suites) + suite_file.write(suites) - data.write("static const size_t _clar_suite_count = %d;\n" % self.suite_count()) - data.write("static const size_t _clar_callback_count = %d;\n" % self.callback_count()) + suite_file.write(u"static const size_t _clar_suite_count = %d;\n" % self.suite_count()) + suite_file.write(u"static const size_t _clar_callback_count = %d;\n" % self.callback_count()) - self.save_cache() - return True + return self.write_output(suite_fn, suite_file.getvalue()) + + return False + + def write_header(self): + header_fn = os.path.join(self.output, 'clar_suite.h') + + with io.StringIO() as header_file: + header_file.write(u"#ifndef _____clar_suite_h_____\n") + header_file.write(u"#define _____clar_suite_h_____\n") + + modules = sorted(self.modules.values(), key=lambda module: module.name) + + for module in modules: + t = Module.DeclarationTemplate(module) + header_file.write(t.render()) + + header_file.write(u"#endif\n") + + return self.write_output(header_fn, header_file.getvalue()) + + return False if __name__ == '__main__': from optparse import OptionParser @@ -275,4 +326,4 @@ def write(self): suite.load(options.force) suite.disable(options.excluded) if suite.write(): - print("Written `clar.suite` (%d tests in %d suites)" % (suite.callback_count(), suite.suite_count())) + print("Written `clar.suite`, `clar_suite.h` (%d tests in %d suites)" % (suite.callback_count(), suite.suite_count())) From 5514f146171349afd1965c13f92c786ed90f8dbe Mon Sep 17 00:00:00 2001 From: Guillaume Jacob Date: Mon, 16 Mar 2026 14:15:36 +0000 Subject: [PATCH 29/33] doc: fix git grep args order in Quick Reference The example provided has its arguments in the wrong order. The revision should follow the pattern, and not the other way around. Signed-off-by: Guillaume Jacob Signed-off-by: Junio C Hamano --- Documentation/user-manual.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/user-manual.adoc b/Documentation/user-manual.adoc index d2b478ad23221a..c6996a2141c222 100644 --- a/Documentation/user-manual.adoc +++ b/Documentation/user-manual.adoc @@ -4466,7 +4466,7 @@ $ git show # most recent commit $ git diff v2.6.15..v2.6.16 # diff between two tagged versions $ git diff v2.6.15..HEAD # diff with current head $ git grep "foo()" # search working directory for "foo()" -$ git grep v2.6.15 "foo()" # search old tree for "foo()" +$ git grep "foo()" v2.6.15 # search old tree for "foo()" $ git show v2.6.15:a.txt # look at old version of a.txt ----------------------------------------------- From 48430e44ace97055567294d412dde9bb8adc0fbb Mon Sep 17 00:00:00 2001 From: Mirko Faina Date: Mon, 16 Mar 2026 02:15:42 +0100 Subject: [PATCH 30/33] t0008: improve test cleanup to fix failing test The "large exclude file ignored in tree" test fails. This is due to an additional warning message that is generated in the test. "warning: unable to access 'subdir/.gitignore': Too many levels of symbolic links", the extra warning that is not supposed to be there, happens because of some leftover files left by previous tests. To fix this we improve cleanup on "symlinks not respected in-tree", and because the tests in t0008 in general have poor cleanup, at the start of "large exclude file ignored in tree" we search for any leftover .gitignore and remove them before starting the test. Improve post-test cleanup and add pre-test cleanup to make sure that we have a workable environment for the test. Signed-off-by: Mirko Faina Signed-off-by: Junio C Hamano --- t/t0008-ignores.sh | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/t/t0008-ignores.sh b/t/t0008-ignores.sh index db8bde280ecfc9..e716b5cdfa1bee 100755 --- a/t/t0008-ignores.sh +++ b/t/t0008-ignores.sh @@ -946,7 +946,7 @@ test_expect_success SYMLINKS 'symlinks respected in info/exclude' ' ' test_expect_success SYMLINKS 'symlinks not respected in-tree' ' - test_when_finished "rm .gitignore" && + test_when_finished "rm -rf subdir .gitignore err actual" && ln -s ignore .gitignore && mkdir subdir && ln -s ignore subdir/.gitignore && @@ -957,6 +957,7 @@ test_expect_success SYMLINKS 'symlinks not respected in-tree' ' test_expect_success EXPENSIVE 'large exclude file ignored in tree' ' test_when_finished "rm .gitignore" && + find . -name .gitignore -exec rm "{}" ";" && dd if=/dev/zero of=.gitignore bs=101M count=1 && git ls-files -o --exclude-standard 2>err && echo "warning: ignoring excessively large pattern file: .gitignore" >expect && From d893f3e7cc0a49ac62ff0ec16d7fdbf606399333 Mon Sep 17 00:00:00 2001 From: PRASHANT S BISHT Date: Mon, 16 Mar 2026 22:54:57 +0530 Subject: [PATCH 31/33] t4200: convert test -[df] checks to test_path_* helpers Replace old-style path existence checks in t4200-rerere.sh with the appropriate test_path_* helper functions. These helpers provide clearer diagnostic messages on failure than the raw shell test builtin. Signed-off-by: Prashant S Bisht Signed-off-by: Junio C Hamano --- t/t4200-rerere.sh | 26 +++++++++++++------------- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/t/t4200-rerere.sh b/t/t4200-rerere.sh index 204325f4d53df1..1717f407c80d86 100755 --- a/t/t4200-rerere.sh +++ b/t/t4200-rerere.sh @@ -72,7 +72,7 @@ test_expect_success 'nothing recorded without rerere' ' rm -rf .git/rr-cache && git config rerere.enabled false && test_must_fail git merge first && - ! test -d .git/rr-cache + test_path_is_missing .git/rr-cache ' test_expect_success 'activate rerere, old style (conflicting merge)' ' @@ -84,8 +84,8 @@ test_expect_success 'activate rerere, old style (conflicting merge)' ' sha1=$(sed "s/ .*//" .git/MERGE_RR) && rr=.git/rr-cache/$sha1 && grep "^=======\$" $rr/preimage && - ! test -f $rr/postimage && - ! test -f $rr/thisimage + test_path_is_missing $rr/postimage && + test_path_is_missing $rr/thisimage ' test_expect_success 'rerere.enabled works, too' ' @@ -110,8 +110,8 @@ test_expect_success 'set up rr-cache' ' test_expect_success 'rr-cache looks sane' ' # no postimage or thisimage yet - ! test -f $rr/postimage && - ! test -f $rr/thisimage && + test_path_is_missing $rr/postimage && + test_path_is_missing $rr/thisimage && # preimage has right number of lines cnt=$(sed -ne "/^<<<<<<>>>>>>/p" $rr/preimage | wc -l) && @@ -167,7 +167,7 @@ test_expect_success 'first postimage wins' ' git show first:a1 | sed "s/To die: t/To die! T/" >expect && git commit -q -a -m "prefer first over second" && - test -f $rr/postimage && + test_path_is_file $rr/postimage && oldmtimepost=$(test-tool chmtime --get -60 $rr/postimage) && @@ -190,14 +190,14 @@ test_expect_success 'rerere clear' ' mv $rr/postimage .git/post-saved && echo "$sha1 a1" | tr "\012" "\000" >.git/MERGE_RR && git rerere clear && - ! test -d $rr + test_path_is_missing $rr ' test_expect_success 'leftover directory' ' git reset --hard && mkdir -p $rr && test_must_fail git merge first && - test -f $rr/preimage + test_path_is_file $rr/preimage ' test_expect_success 'missing preimage' ' @@ -205,7 +205,7 @@ test_expect_success 'missing preimage' ' mkdir -p $rr && cp .git/post-saved $rr/postimage && test_must_fail git merge first && - test -f $rr/preimage + test_path_is_file $rr/preimage ' test_expect_success 'set up for garbage collection tests' ' @@ -230,16 +230,16 @@ test_expect_success 'set up for garbage collection tests' ' test_expect_success 'gc preserves young or recently used records' ' git rerere gc && - test -f $rr/preimage && - test -f $rr2/preimage + test_path_is_file $rr/preimage && + test_path_is_file $rr2/preimage ' test_expect_success 'old records rest in peace' ' test-tool chmtime =$just_over_60_days_ago $rr/postimage && test-tool chmtime =$just_over_15_days_ago $rr2/preimage && git rerere gc && - ! test -f $rr/preimage && - ! test -f $rr2/preimage + test_path_is_missing $rr/preimage && + test_path_is_missing $rr2/preimage ' rerere_gc_custom_expiry_test () { From 2594747ad1b52f5a4739de662d5ad14621c94738 Mon Sep 17 00:00:00 2001 From: Jeff King Date: Sat, 14 Mar 2026 12:08:14 -0400 Subject: [PATCH 32/33] transport: plug leaks in transport_color_config() We retrieve config values with repo_config_get_string(), which will allocate a new copy of the string for us. But we don't hold on to those strings, since they are just fed to git_config_colorbool() and color_parse(). But nor do we free them, which means they leak. We can fix this by using the "_tmp" form of repo_config_get_string(), which just hands us a pointer directly to the internal storage. This is OK for our purposes, since we don't need it to last for longer than our parsing calls. Two interesting side notes here: 1. Many types already have a repo_config_get_X() variant that handles this for us (e.g., repo_config_get_bool()). But neither colorbools nor colors themselves have such helpers. We might think about adding them, but converting all callers is a larger task, and out of scope for this fix. 2. As far as I can tell, this leak has been there since 960786e761 (push: colorize errors, 2018-04-21), but wasn't detected by LSan in our test suite. It started triggering when we applied dd3693eb08 (transport-helper, connect: use clean_on_exit to reap children on abnormal exit, 2026-03-12) which is mostly unrelated. Even weirder, it seems to trigger only with clang (and not gcc), and only with GIT_TEST_DEFAULT_REF_FORMAT=reftable. So I think this is another odd case where the pointers happened to be hanging around in stack memory, but changing the pattern of function calls in nearby code was enough for them to be incidentally overwritten. Signed-off-by: Jeff King Signed-off-by: Junio C Hamano --- transport.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/transport.c b/transport.c index e305d6bd228b45..7985b42a7431ca 100644 --- a/transport.c +++ b/transport.c @@ -47,21 +47,21 @@ static int transport_color_config(void) "color.transport.reset", "color.transport.rejected" }, *key = "color.transport"; - char *value; + const char *value; static int initialized; if (initialized) return 0; initialized = 1; - if (!repo_config_get_string(the_repository, key, &value)) + if (!repo_config_get_string_tmp(the_repository, key, &value)) transport_use_color = git_config_colorbool(key, value); if (!want_color_stderr(transport_use_color)) return 0; for (size_t i = 0; i < ARRAY_SIZE(keys); i++) - if (!repo_config_get_string(the_repository, keys[i], &value)) { + if (!repo_config_get_string_tmp(the_repository, keys[i], &value)) { if (!value) return config_error_nonbool(keys[i]); if (color_parse(value, transport_colors[i]) < 0) From ce74208c2fa13943fffa58f168ac27a76d0eb789 Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Tue, 24 Mar 2026 12:31:15 -0700 Subject: [PATCH 33/33] The 20th batch Signed-off-by: Junio C Hamano --- Documentation/RelNotes/2.54.0.adoc | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+) diff --git a/Documentation/RelNotes/2.54.0.adoc b/Documentation/RelNotes/2.54.0.adoc index 064827b6420a5d..7de9f7e7ecf6d2 100644 --- a/Documentation/RelNotes/2.54.0.adoc +++ b/Documentation/RelNotes/2.54.0.adoc @@ -91,6 +91,12 @@ UI, Workflows & Features * "git rebase" learns "--trailer" command to drive the interpret-trailers machinery. + * "git fast-import" learned to optionally replace signature on + commits whose signatures get invalidated due to replaying by + signing afresh. + + * "git history" learned the "split" subcommand. + Performance, Internal Implementation, Development Support etc. -------------------------------------------------------------- @@ -212,6 +218,15 @@ Performance, Internal Implementation, Development Support etc. * Adjust test-lint to allow "sed -E" to use ERE in the patterns. + * Clar (unit testing framework) update from the upstream. + + * Reduce system overhead "git upload-pack" spends on relaying "git + pack-objects" output to the "git fetch" running on the other end of + the connection. + + * Add a coccinelle rule to break the build when "struct strbuf" gets + passed by value. + Fixes since v2.53 ----------------- @@ -358,6 +373,17 @@ Fixes since v2.53 available since OpenSSL 1.1 since 2016 or so. (merge 6392a0b75d bb/imap-send-openssl-4.0-prep later to maint). + * Fix an example in the user-manual. + (merge 5514f14617 gj/user-manual-fix-grep-example later to maint). + + * The final clean-up phase of the diff output could turn the result of + histogram diff algorithm suboptimal, which has been corrected. + (merge e417277ae9 yc/histogram-hunk-shift-fix later to maint). + + * "git diff -U" was too lenient in its command line parsing and + took an empty string as a valid . + (merge 4f6a803aba ty/doc-diff-u-wo-number later to maint). + * Other code cleanup, docfix, build fix, etc. (merge d79fff4a11 jk/remote-tracking-ref-leakfix later to maint). (merge 7a747f972d dd/t5403-modernise later to maint). @@ -401,3 +427,7 @@ Fixes since v2.53 (merge 35f220b639 ss/submodule--helper-use-xmalloc later to maint). (merge 02cbae61df cf/constness-fixes later to maint). (merge 69efd53c81 ms/t7605-test-path-is-helpers later to maint). + (merge d39cef3a1a ss/t0410-delete-object-cleanup later to maint). + (merge 2f05039717 rj/pack-refs-tests-path-is-helpers later to maint). + (merge 2594747ad1 jk/transport-color-leakfix later to maint). + (merge 48430e44ac mf/t0008-cleanup later to maint).