Replace random generated "inventory ID"#4396
Replace random generated "inventory ID"#4396Jaisheesh-2006 wants to merge 11 commits intokptdev:mainfrom
Conversation
Replaces random UUIDs with SHA-1 hashes derived from namespace and name to prevent lost inventory bugs. Makes --name mandatory and adds DNS-1123 validation. Hides --inventory-id to favor deterministic generation. Fixes kptdev#4387 Signed-off-by: Jaisheesh-2006 <jaicodes2006@gmail.com>
✅ Deploy Preview for kptdocs ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
There was a problem hiding this comment.
Pull request overview
This PR addresses lost-inventory behavior by making kpt live init inventory IDs deterministic (derived from namespace + --name) instead of random UUID-based values, and updates related CLI guidance/tests/docs.
Changes:
- Make
--namemandatory forkpt live initand validate it against Kubernetes DNS-1123 naming rules. - Replace auto-generated inventory IDs with a deterministic, length-prefixed SHA-1 hash.
- Update migrate/init flows, generated docs, and e2e coverage to reflect the new required
--namebehavior.
Reviewed changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/lib/errors/resolver/live.go | Updates user-facing resolver messages to instruct using kpt live init --name=.... |
| internal/docs/generated/livedocs/docs.go | Updates generated CLI docs to reflect required --name and “advanced” inventory-id override semantics. |
| e2e/live/end-to-end-test.sh | Updates e2e invocations/assertions to pass --name and expect stable RG names. |
| commands/live/migrate/migratecmd.go | Adds guardrails in migration; however, introduces/retains error-handling issues in RG-file existence checks and malformed inventory handling. |
| commands/live/init/cmdliveinit_test.go | Reworks tests for name validation and deterministic hash expectations. |
| commands/live/init/cmdliveinit.go | Implements deterministic hash ID generation, mandatory --name, hides --inventory-id, and updates init behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
commands/live/migrate/migratecmd.go
Outdated
| if kf.Inventory.Name == "" { | ||
| return errors.E(op, types.UniquePath(dir), | ||
| fmt.Errorf("Kptfile inventory has empty name; re-run: kpt live init --name=<name>")) | ||
| } |
There was a problem hiding this comment.
This empty-name guard is likely unreachable: kptfileutil.ValidateInventory(kf.Inventory) earlier returns an error when Inventory.Name is empty, and the current code treats any validation error as “no inventory” and returns nil. To actually guard malformed inventories, handle the validation error explicitly (e.g., distinguish nil inventory vs missing fields) or move the check before the early return.
commands/live/init/cmdliveinit.go
Outdated
| timeStr := strconv.FormatInt(t.UTC().UnixNano(), 10) | ||
| return fmt.Sprintf("%s-%s", hashStr, timeStr), nil | ||
| h := sha1.New() | ||
| fmt.Fprintf(h, "%d:%s:%d:%s", len(namespace), namespace, len(name), name) |
There was a problem hiding this comment.
fmt.Fprintf's return values are ignored here. Since this repo enables the errcheck linter, this will fail CI. Capture and handle the returned error (or explicitly assign to _ and document why it can’t fail for a hash.Hash) so errcheck passes.
| fmt.Fprintf(h, "%d:%s:%d:%s", len(namespace), namespace, len(name), name) | |
| if _, err := fmt.Fprintf(h, "%d:%s:%d:%s", len(namespace), namespace, len(name), name); err != nil { | |
| return "", fmt.Errorf("failed to write hash input: %w", err) | |
| } |
| // Internal callers (e.g. migrate) may pass empty Name with an explicit | ||
| // InventoryID; derive a stable name from the package directory. | ||
| if c.Name == "" { | ||
| randomSuffix := common.RandomStr() | ||
| c.Name = fmt.Sprintf("%s-%s", defaultInventoryName, randomSuffix) | ||
| if c.InventoryID != "" { | ||
| c.Name = filepath.Base(c.Pkg.UniquePath.String()) | ||
| } else { | ||
| return errors.E(op, c.Pkg.UniquePath, errNameRequired) | ||
| } |
There was a problem hiding this comment.
When Name is empty but InventoryID is provided, this derives the ResourceGroup name from the package directory base without validating it. Directory names can be invalid for Kubernetes metadata.name (uppercase, underscore, etc.), which would create an invalid ResourceGroup YAML and break later operations. Consider validating/sanitizing the derived name (or returning an error that requires an explicit --name in the internal caller).
commands/live/migrate/migratecmd.go
Outdated
| @@ -390,6 +390,11 @@ func (mr *Runner) migrateKptfileToRG(args []string) error { | |||
| return errors.E(op, errors.IO, types.UniquePath(dir), err) | |||
There was a problem hiding this comment.
In the os.Stat error handling switch, the second case checks err (from earlier operations) instead of rgFileErr. This will silently ignore filesystem errors from os.Stat (e.g., permission errors) and can lead to migrating into an invalid state. Update the condition to use rgFileErr and return that error when it's not os.ErrNotExist.
Signed-off-by: Jaisheesh-2006 <jaicodes2006@gmail.com>
The quiet init test was changing the inventory name from inventory-18030002 back to rg-test-case-1a, causing the downstream 'status on symlink' assertion to fail. Signed-off-by: Jaisheesh-2006 <jaicodes2006@gmail.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 6 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
e2e/live/end-to-end-test.sh
Outdated
| local count | ||
| count=$(wc -l < $OUTPUT_DIR/invname | tr -d ' ') | ||
| if [ "$count" -ge 1 ]; then |
There was a problem hiding this comment.
assertRGInventory claims to verify exactly one ResourceGroup inventory object, but the check currently passes when any count >= 1. This can hide regressions where multiple inventories exist in the same namespace. Update the condition to assert the count is exactly 1 (and ideally include the matching rows in the error output for debugging).
commands/live/migrate/migratecmd.go
Outdated
| if kf.Inventory.Name == "" { | ||
| return errors.E(op, types.UniquePath(dir), | ||
| fmt.Errorf("kptfile inventory has empty name; re-run: kpt live init --name=<name>")) | ||
| } | ||
|
|
There was a problem hiding this comment.
This kf.Inventory.Name == "" guard is effectively unreachable because kptfileutil.ValidateInventory(kf.Inventory) above already rejects empty/whitespace names and returns early. If the intent is to surface a clearer error for malformed inventory, handle the specific ValidateInventory error (or move the check before the validation/early-return) so malformed inventory doesn't get silently treated as "migration not needed".
| if kf.Inventory.Name == "" { | |
| return errors.E(op, types.UniquePath(dir), | |
| fmt.Errorf("kptfile inventory has empty name; re-run: kpt live init --name=<name>")) | |
| } |
| if errs := validation.IsDNS1123Subdomain(trimmed); len(errs) > 0 { | ||
| return "", fmt.Errorf("--name %q is not a valid Kubernetes resource name: %s", | ||
| trimmed, strings.Join(errs, "; ")) |
There was a problem hiding this comment.
PR description says --name is validated as a DNS-1123 label, but the implementation uses validation.IsDNS1123Subdomain (which permits dots and longer names). Either update the PR description/docs to say “DNS-1123 subdomain” (which matches Kubernetes metadata.name rules) or switch to label validation if dots should be rejected.
07896b7 to
1e0c7f5
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 6 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| h := sha1.New() | ||
| fmt.Fprintf(h, "%d:%s:%d:%s", len(namespace), namespace, len(name), name) | ||
| return hex.EncodeToString(h.Sum(nil)), nil |
There was a problem hiding this comment.
Using SHA-1 for non-cryptographic purposes (generating deterministic IDs) is acceptable. However, consider adding a code comment explicitly noting that SHA-1 is used here for deterministic hashing, not for security, to prevent future confusion and potential linter warnings from security scanners like gosec.
| # assertRGInventory checks that exactly one ResourceGroup inventory object | ||
| # exists in the passed namespace (selected by the inventory-id label). |
There was a problem hiding this comment.
The comment says the function "checks that exactly one ResourceGroup inventory object exists," but the implementation uses ">= 1" which allows multiple ResourceGroup inventories. Consider either updating the comment to match the implementation ("at least one") or making the check stricter to truly verify exactly one inventory object exists.
- Replace IsDNS1123Subdomain with IsDNS1123Label for stricter name validation (63-char limit, no dots). - Capture fmt.Fprintf error in generateHash. - Validate directory-name fallback with IsDNS1123Label when --name is omitted by internal callers (e.g., migrate). - Fix wrong error variable (err to rgFileErr) in os.Stat switch in migratecmd.go. - Remove unreachable kf.Inventory.Name == '' guard in migratecmd.go. - Tighten assertRGInventory bash check from -ge 1 to -eq 1. - Remove dead generateID function and unused imports. - Add tests for directory-name fallback validation path. Fixes kptdev#4387 Signed-off-by: Jaisheesh-2006 <jaicodes2006@gmail.com>
- Add explanatory comment for SHA-1 usage to clarify it is not for cryptographic security. - Align assertRGInventory bash script comment with the strict -eq 1 implementation. Signed-off-by: Jaisheesh-2006 <jaicodes2006@gmail.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 6 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
commands/live/init/cmdliveinit.go
Outdated
| var errNameRequired = fmt.Errorf( | ||
| "--name is required: provide a stable deployment name " + | ||
| "(e.g. --name=my-app-staging) that remains consistent across re-initializations") |
There was a problem hiding this comment.
Consider using fmt.Errorf with formatting on a single line or breaking the string at natural word boundaries. The current concatenation with '+' operators makes the error message harder to read and modify. For example: fmt.Errorf(\"--name is required: provide a stable deployment name (e.g. --name=my-app-staging) that remains consistent across re-initializations\")
commands/live/init/cmdliveinit.go
Outdated
| // Note: SHA-1 is used here strictly for deterministic ID generation, | ||
| // not for cryptographic security. |
There was a problem hiding this comment.
While the comment correctly clarifies that SHA-1 is not used for cryptographic purposes, it would be helpful to add a brief explanation of why SHA-1 was chosen over alternatives (e.g., simplicity, sufficient collision resistance for this use case, or compatibility with existing systems).
| // Note: SHA-1 is used here strictly for deterministic ID generation, | |
| // not for cryptographic security. | |
| // Note: SHA-1 is used here strictly for deterministic ID generation, not for | |
| // cryptographic security. It is chosen for its simplicity, stable 40-character | |
| // hex output, and compatibility with existing inventory IDs; collision resistance | |
| // is sufficient for this constrained input space (namespace + name), and no | |
| // secrets or security-sensitive data are being protected. |
|
@Jaisheesh-2006 please address the Copilot comments. |
…ents - Formatted the errNameRequired error message into a single string to improve code readability. - Expanded the inline comment in cmdliveinit.go to clarify the rationale for using SHA-1 for deterministic ID generation. Signed-off-by: Jaisheesh-2006 <jaicodes2006@gmail.com>
…heesh-2006/kpt-open-source into fix/4387-loose-inventory-id
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 6 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
| if c.InventoryID != "" { | ||
| dirName := filepath.Base(c.Pkg.UniquePath.String()) | ||
| if errs := validation.IsDNS1123Label(dirName); len(errs) > 0 { | ||
| return errors.E(op, c.Pkg.UniquePath, | ||
| fmt.Errorf("directory name %q is not a valid Kubernetes resource name and --name was not provided: %s", | ||
| dirName, strings.Join(errs, "; "))) | ||
| } | ||
| c.Name = dirName |
There was a problem hiding this comment.
The directory-name fallback for internal callers also enforces IsDNS1123Label, which can unnecessarily block migrations for packages whose directory names are valid Kubernetes subdomains (e.g. contain dots) but not labels. Consider aligning this with the actual metadata.name constraints (typically DNS-1123 subdomain) or providing a deterministic transformation instead of hard-failing.
pkg/lib/errors/resolver/live.go
Outdated
| const ( | ||
| noInventoryObjErrorMsg = ` | ||
| Error: Package uninitialized. Please run "kpt live init" command. | ||
| Error: Package uninitialized. Please run "kpt live init --name=<deployment-name>" command. |
There was a problem hiding this comment.
The error message suggests copying --name=<deployment-name>, but the </> characters are shell metacharacters (input redirection) and can fail if copy/pasted. Consider using a safe placeholder like --name=DEPLOYMENT_NAME or a concrete example like --name=my-app.
| Error: Package uninitialized. Please run "kpt live init --name=<deployment-name>" command. | |
| Error: Package uninitialized. Please run "kpt live init --name=DEPLOYMENT_NAME" command. |
commands/live/init/cmdliveinit.go
Outdated
| h := sha1.New() | ||
| if _, err := h.Write([]byte(str)); err != nil { | ||
| return "", errors.E(op, err) | ||
| if errs := validation.IsDNS1123Label(trimmed); len(errs) > 0 { |
There was a problem hiding this comment.
validateName uses validation.IsDNS1123Label, which rejects valid Kubernetes object names such as subdomains containing dots and names up to 253 chars. Since this value becomes metadata.name for the ResourceGroup, consider validating with IsDNS1123Subdomain instead (or document why the stricter label rule is required).
| if errs := validation.IsDNS1123Label(trimmed); len(errs) > 0 { | |
| if errs := validation.IsDNS1123Subdomain(trimmed); len(errs) > 0 { |
add explicit CRD dependency for live apply CRD+CR e2e case update fn-render golden output for subpackage deletion behavior replace unsafe angle-bracket placeholders in user-facing command examples switch live init name validation from DNS1123 label to subdomain add tests for dotted Kubernetes names and directory fallback behavior Signed-off-by: Jaisheesh-2006 <jaicodes2006@gmail.com>
Signed-off-by: Jaisheesh-2006 <jaicodes2006@gmail.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 10 out of 11 changed files in this pull request and generated 5 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
commands/fn/render/cmdrender.go
Outdated
| "path to a directory to save function results") | ||
| c.Flags().StringVarP(&r.dest, "output", "o", "", | ||
| fmt.Sprintf("output resources are written to provided location. Allowed values: %s|%s|<OUT_DIR_PATH>", cmdutil.Stdout, cmdutil.Unwrap)) | ||
| fmt.Sprintf("output resources are written to provided location. Allowed values: %s|%s|OUT_DIR_PATH", cmdutil.Stdout, cmdutil.Unwrap)) |
There was a problem hiding this comment.
The --output help text now shows OUT_DIR_PATH without placeholder markers, but the generated docs (internal/docs/generated/fndocs/docs.go) still document this as <OUT_DIR_PATH>. This looks like a literal value rather than a placeholder and is inconsistent across user-facing docs. Consider keeping the placeholder form (e.g. <OUT_DIR_PATH>) here or updating the docs generator output to match.
| fmt.Sprintf("output resources are written to provided location. Allowed values: %s|%s|OUT_DIR_PATH", cmdutil.Stdout, cmdutil.Unwrap)) | |
| fmt.Sprintf("output resources are written to provided location. Allowed values: %s|%s|<OUT_DIR_PATH>", cmdutil.Stdout, cmdutil.Unwrap)) |
| r.Command = c | ||
| r.Command.Flags().StringVarP(&r.Dest, "output", "o", "", | ||
| fmt.Sprintf("output resources are written to provided location. Allowed values: %s|%s|<OUT_DIR_PATH>", cmdutil.Stdout, cmdutil.Unwrap)) | ||
| fmt.Sprintf("output resources are written to provided location. Allowed values: %s|%s|OUT_DIR_PATH", cmdutil.Stdout, cmdutil.Unwrap)) |
There was a problem hiding this comment.
The --output help text now shows OUT_DIR_PATH without placeholder markers, but other user-facing docs still use <OUT_DIR_PATH>. Without angle brackets (or similar), it reads like a literal allowed value rather than a placeholder directory path. Consider reverting to <OUT_DIR_PATH> (or updating all docs consistently).
| fmt.Sprintf("output resources are written to provided location. Allowed values: %s|%s|OUT_DIR_PATH", cmdutil.Stdout, cmdutil.Unwrap)) | |
| fmt.Sprintf("output resources are written to provided location. Allowed values: %s|%s|<OUT_DIR_PATH>", cmdutil.Stdout, cmdutil.Unwrap)) |
pkg/lib/errors/resolver/pkg.go
Outdated
| deprecatedv1alpha2KptfileError.Version == "v1alpha2" { | ||
| msg := fmt.Sprintf("Error: Kptfile at %q has an old version (%q) of the Kptfile schema.\n", path, deprecatedv1alpha2KptfileError.Version) | ||
| msg += "Please run \"kpt fn eval <PKG_PATH> -i ghcr.io/kptdev/krm-functions-catalog/fix:latest --include-meta-resources\" to upgrade the package and retry." | ||
| msg += "Please run \"kpt fn eval PKG_PATH -i ghcr.io/kptdev/krm-functions-catalog/fix:latest --include-meta-resources\" to upgrade the package and retry." |
There was a problem hiding this comment.
This error message now instructs users to run kpt fn eval PKG_PATH ..., which can be interpreted as a literal argument. Since you already have the failing path available, consider either using a clearly marked placeholder (e.g. <PKG_PATH>) or, better, interpolating the actual path into the suggested command so users can copy/paste it directly.
| msg += "Please run \"kpt fn eval PKG_PATH -i ghcr.io/kptdev/krm-functions-catalog/fix:latest --include-meta-resources\" to upgrade the package and retry." | |
| msg += fmt.Sprintf("Please run \"kpt fn eval %q -i ghcr.io/kptdev/krm-functions-catalog/fix:latest --include-meta-resources\" to upgrade the package and retry.", path) |
pkg/lib/errors/resolver/update.go
Outdated
| //nolint:lll | ||
| msg = fmt.Sprintf("Package %q is not within a git repository.", pkgNotGitRepoError.Path) | ||
| msg += " Please initialize a repository using 'git init' and then commit the changes using 'git commit -m \"<commit message>\"'." | ||
| msg += " Please initialize a repository using 'git init' and then commit the changes using 'git commit -m \"COMMIT_MESSAGE\"'." |
There was a problem hiding this comment.
The suggested git command uses COMMIT_MESSAGE as a placeholder, which can be mistaken for a literal message. Consider using a clearly marked placeholder (e.g. <commit message> or <COMMIT_MESSAGE>) to make it obvious users should replace it.
pkg/lib/errors/resolver/update.go
Outdated
| if errors.As(err, &pkgRepoDirtyError) { | ||
| msg = fmt.Sprintf("Package %q contains uncommitted changes.", pkgRepoDirtyError.Path) | ||
| msg += " Please commit the changes using 'git commit -m \"<commit message>\"'." | ||
| msg += " Please commit the changes using 'git commit -m \"COMMIT_MESSAGE\"'." |
There was a problem hiding this comment.
The suggested git command uses COMMIT_MESSAGE as a placeholder, which can be mistaken for a literal message. Consider using a clearly marked placeholder (e.g. <commit message> or <COMMIT_MESSAGE>) to make it obvious users should replace it.
Signed-off-by: Jaisheesh-2006 <jaicodes2006@gmail.com>
|
Hello @CsatariGergely. Thank you for bringing the Copilot suggestions to my attention. I have implemented all the requested fixes, and all tests are passing successfully. Please let me know if you need any further changes. |
Description
This PR replaces the generation of random UUIDs for inventory IDs with a deterministic SHA-1 hashing mechanism. By deriving the inventory ID from the package
namespaceand--name, we ensure that the same package configuration always maps to the same inventory object in the cluster.Key Changes:
generateHash(namespace, name)using length-prefixed SHA-1 to replacegoogle/uuid.--nameflag is now mandatory forkpt live initto ensure the hash can be consistently generated.--nameinput to ensure compatibility with Kubernetes resource naming conventions.--inventory-idflag from the help menu to encourage deterministic usage while maintaining backward compatibility for existing workflows.kpt live migrateto prevent operations on empty or malformed inventory names.Motivation
Previously, re-fetching or re-initializing a package would generate a new random UUID. This led to "lost inventory" bugs where
kptcould no longer track or prune resources previously applied to the cluster because the association (the ID) had changed. Deterministic hashing ensures that as long as the namespace and name remain constant, the inventory remains trackable across different environments and local clones.Fixes
Fixes #4387