Fix CLI token source --profile fallback with version detection #1605
mihaimitrea-db wants to merge 3 commits into main
The --profile flag is a global Cobra flag in the Databricks CLI. Old CLIs (< v0.207.1) silently accept it on `auth token` but fail with "cannot fetch credentials" instead of "unknown flag: --profile". This made the previous error-based fallback to --host dead code.
Replace the try-and-retry approach with version-based detection: run `databricks version` at init time and use the parsed semver to decide between --profile and --host. This also simplifies CliTokenSource to a single resolved command with no runtime probing.
Signed-off-by: Mihai Mitrea <mihai.mitrea@databricks.com>
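The version check described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `version` type and the `parseVersion`, `atLeast`, and `supportsProfile` names are hypothetical, and the sketch assumes the `databricks version` output has the form "Databricks CLI v0.207.1". Anything that does not parse is treated as "unknown", which selects the conservative --host path.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// version is a hypothetical stand-in for a parsed CLI version.
type version struct{ major, minor, patch int }

// parseVersion extracts MAJOR.MINOR.PATCH from output such as
// "Databricks CLI v0.207.1". The second result is false when the
// output does not match that assumed format.
func parseVersion(out string) (version, bool) {
	const prefix = "Databricks CLI v"
	s := strings.TrimSpace(out)
	if !strings.HasPrefix(s, prefix) {
		return version{}, false
	}
	parts := strings.Split(strings.TrimPrefix(s, prefix), ".")
	if len(parts) != 3 {
		return version{}, false
	}
	var nums [3]int
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return version{}, false
		}
		nums[i] = n
	}
	return version{nums[0], nums[1], nums[2]}, true
}

// atLeast reports whether v is greater than or equal to min.
func (v version) atLeast(min version) bool {
	if v.major != min.major {
		return v.major > min.major
	}
	if v.minor != min.minor {
		return v.minor > min.minor
	}
	return v.patch >= min.patch
}

// minProfile is the first CLI version that supports --profile on
// `auth token`, according to the PR description.
var minProfile = version{0, 207, 1}

// supportsProfile decides between --profile and --host from the raw
// `databricks version` output. Unparseable output yields false.
func supportsProfile(versionOutput string) bool {
	v, ok := parseVersion(versionOutput)
	return ok && v.atLeast(minProfile)
}

func main() {
	for _, out := range []string{
		"Databricks CLI v0.207.0",
		"Databricks CLI v0.207.1",
		"not a version",
	} {
		fmt.Printf("%q -> use --profile: %v\n", out, supportsProfile(out))
	}
}
```

Because the decision depends only on the version string, it can be made once at init time and needs no retry logic when a token is requested.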
// cliVersion represents a parsed Databricks CLI semver version.
type cliVersion struct {
Could we use golang.org/x/mod/semver instead? It's already part of the go.mod.
// --profile is a global Cobra flag; old CLIs accept it silently but
// fail with "cannot fetch credentials" instead of "unknown flag".
// We use version detection to decide --profile vs --host.
Suggested change:
// Flag --profile is a global CLI flag and is recognized for all commands, even
// the ones that do not support it. Only use flag --profile in CLI versions that
// are known to support it in command `auth token`.
if ver.AtLeast(cliVersionForProfile) {
	cmd = []string{cliPath, "auth", "token", "--profile", cfg.Profile}
} else {
	logger.Warnf(ctx, "Profile %q was specified but Databricks CLI %s does not support --profile (requires >= %s). Falling back to --host.", cfg.Profile, ver, cliVersionForProfile)
Suggested change:
	logger.Warnf(ctx, "Databricks CLI %s does not support --profile (requires >= %s). Falling back to --host.", ver, cliVersionForProfile)
switch cfg.HostType() {
case AccountHost:
	cmd = append(cmd, "--account-id", cfg.AccountID)
if cmd == nil && cfg.Host != "" {
This is equivalent but less bug-prone in case cmd is initialized as an empty slice.
Suggested change:
if len(cmd) == 0 && cfg.Host != "" {
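To illustrate the difference behind this suggestion, here is a small standalone sketch (the helper names `isUnsetNil` and `isUnsetLen` are made up for the example). A nil slice and an empty, non-nil slice both have length zero, but only the former compares equal to nil.

```go
package main

import "fmt"

// isUnsetNil mirrors the original check: it only recognizes a nil slice.
func isUnsetNil(cmd []string) bool { return cmd == nil }

// isUnsetLen mirrors the suggested check: it recognizes nil and empty slices.
func isUnsetLen(cmd []string) bool { return len(cmd) == 0 }

func main() {
	var nilCmd []string    // declared but never assigned: nil
	emptyCmd := []string{} // initialized, non-nil, zero elements

	fmt.Println(isUnsetNil(nilCmd), isUnsetLen(nilCmd))     // true true
	fmt.Println(isUnsetNil(emptyCmd), isUnsetLen(emptyCmd)) // false true
}
```

If a later refactor initializes `cmd` with `[]string{}` or `make([]string, 0)`, the nil comparison would silently stop matching, while the length check keeps working.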
// buildCliCommand constructs the CLI command for fetching an auth token.
// The CLI version determines which flags are used.
func buildCliCommand(ctx context.Context, cliPath string, cfg *Config, ver cliVersion) []string {
[optional] I believe a more readable and idiomatic way to write this code would be to organize it so that (i) the "normal branch" (i.e. use --profile) is on the lowest level of indentation, and (ii) each branch is terminal. Something like this:
func buildCliCommand(ctx context.Context, cliPath string, cfg *Config, ver cliVersion) []string {
	// No profile configured: use the host-based command.
	if cfg.Profile == "" {
		return buildCliHostCommand(cliPath, cfg)
	}
	// Profile configured, but the CLI is too old to support --profile.
	if !ver.AtLeast(cliVersionForProfile) {
		logger.Warnf(ctx, "CLI version XXX does not support --profile")
		return buildCliHostCommand(cliPath, cfg)
	}
	// Normal path: the CLI supports --profile.
	cmd := []string{cliPath, "auth", "token", "--profile", cfg.Profile}
	if cfg.HostType() == AccountHost {
		cmd = append(cmd, "--account-id", cfg.AccountID)
	}
	return cmd
}
func buildCliHostCommand(cliPath string, cfg *Config) []string {
	cmd := []string{cliPath, "auth", "token", "--host", cfg.Host}
	if cfg.HostType() == AccountHost {
		cmd = append(cmd, "--account-id", cfg.AccountID)
	}
	return cmd
}
It is a little longer but much easier to follow. Falling back to using --host is now clearly the exception path, and one does not need to understand how the host command is built to understand that.
// We intentionally discard exec.ExitError: the stderr text is the
// CLI's error contract; exit codes and process state are not useful.
Why aren't they useful? Naively, I'd imagine that one might want to access this information for debugging what is happening with the CLI.
I could understand that we discard it because we do not want the error to be part of the SDK API contract but then I wonder why we are not discarding it at line 155 too.
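For context on what is being discarded, the following sketch shows the information a `*exec.ExitError` carries when a command is run with `(*exec.Cmd).Output`. It is an illustration only and not the PR's code: `describeFailure` and `run` are hypothetical helpers, and the example assumes a POSIX `sh` is available to simulate a failing CLI.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
	"strings"
)

// run executes a command and returns only its error. Output captures
// stderr into ExitError.Stderr because cmd.Stderr is left unset.
func run(name string, args ...string) error {
	_, err := exec.Command(name, args...).Output()
	return err
}

// describeFailure summarizes an error returned by run. When the process
// started but exited non-zero, both the exit code and the captured
// stderr text are available on the *exec.ExitError.
func describeFailure(err error) string {
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		stderr := strings.TrimSpace(string(exitErr.Stderr))
		return fmt.Sprintf("exit code %d: %s", exitErr.ExitCode(), stderr)
	}
	// Any other error (for example, binary not found) has no stderr.
	return fmt.Sprintf("could not run command: %v", err)
}

func main() {
	// Simulate a CLI that writes an error to stderr and exits with code 3.
	err := run("sh", "-c", "echo 'cannot fetch credentials' >&2; exit 3")
	fmt.Println(describeFailure(err))
}
```

Whether the SDK should surface the exit code or wrap the original error is the design question raised in the comment above; the sketch only shows that both pieces of information are available at that point.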
cliName := "databricks"
func TestNewCliTokenSource(t *testing.T) {
Could we have this as table tests and merge them with the one above so that we have one comprehensive test suite for NewCliTokenSource?
- Replace custom cliVersion struct with golang.org/x/mod/semver for robust version parsing and comparison.
- Use displayVersion helper instead of a String() method on a struct, so the empty (unknown) case is handled explicitly.
- Clarify --profile global-flag comment to explain why version detection (not runtime probing) is needed.
- Tighten the --profile fallback warning and use len(cmd) == 0 instead of comparing against nil.
- Fix the misleading exec.ExitError comment so it describes why we prefer stderr over the wrapped error.
- Consolidate TestNewCliTokenSource subtests into a table-driven form consistent with the rest of the file.
Signed-off-by: Mihai Mitrea <mihai.mitrea@databricks.com>
Resolves conflict in NEXT_CHANGELOG.md: v0.127.0 and v0.128.0 have already shipped, so their entries have moved to CHANGELOG.md. Only the Layer 1 bug-fix entry remains under v0.129.0 > Bug Fixes. Signed-off-by: Mihai Mitrea <mihai.mitrea@databricks.com>
Summary
Fix the broken `--profile` fallback in `CliTokenSource` by replacing error-based detection with version-based CLI detection at init time.

Why
The `--profile` flag on `databricks auth token` is a global Cobra flag (defined as a persistent flag on the root command). Old CLIs (< v0.207.1) silently accept it: they don't report "unknown flag: --profile" but instead fail later with "cannot fetch credentials". This means the existing `isUnknownFlagError` check (config/cli_token_source.go:120) never matches, and the `--host` fallback is dead code.
This was verified by testing against CLI v0.207.0 vs v0.207.1:
- v0.207.0: `databricks auth token --profile workspace` → `Error: init: cannot fetch credentials` (not "unknown flag")
- v0.207.1: `databricks auth token --profile workspace` → returns a valid token

Approaches considered
Three approaches were evaluated for detecting whether the installed CLI supports `--profile`:
1. Error-based detection (try-and-retry): the current approach on `main`. Run `databricks auth token --profile <name>` and check whether the error contains "unknown flag: --profile". This is broken: because `--profile` is a global Cobra flag, old CLIs accept it silently and fail with a different error ("cannot fetch credentials"), so the fallback to `--host` never triggers.
2. `--help` flag parsing (`databricks auth token --help` + substring matching) was rejected because the `--help` output format is not a stable API. More importantly, `--profile` would appear in `--help` output even on old CLIs that don't actually implement profile-based token lookup; it shows up because it's a global persistent flag, not because the `auth token` subcommand uses it. This approach has the same fundamental flaw as error-based detection.
3. Version detection (`databricks version` + semver comparison): the approach taken here. Run `databricks version` at init time, parse the semver (e.g., "Databricks CLI v0.207.1"), and compare against known minimum versions for each flag. This is reliable because the version string is a stable output format, and the mapping between flags and CLI versions is well-defined (databricks/cli#855 for `--profile` in v0.207.1). If version detection fails, the SDK falls back to the most conservative command (`--host` only).

References
- `--profile` support added in CLI v0.207.1: databricks/cli#855 (Oct 2023)

What changed
Interface changes
None. `CliTokenSource` is not part of the public API surface.
`NewCliTokenSource` now takes `context.Context` as its first parameter, needed for `exec.CommandContext` when running `databricks version`. This is consistent with every `CredentialsStrategy.Configure` method in the codebase, and the single caller (`auth_u2m.go`) already has `ctx` in scope.

Behavioral changes
- When `cfg.Profile` is set but the CLI is too old (< v0.207.1), the SDK now correctly falls back to `--host`. Previously this fallback was dead code.
- A warning is logged when the `--profile` flag is not supported and the SDK falls back to `--host`.

Internal changes
- `cliVersion` type: semver parsing with `AtLeast()` comparison and `String()` formatting.
- `getCliVersion(ctx, cliPath)`: runs `databricks version` and parses the output.
- `parseCliVersion`: parses "Databricks CLI v0.207.1" → `cliVersion{0, 207, 1}`.
- `resolveCliCommand`: bridges version detection and command building. Falls back to zero version on detection failure.
- `buildCliCommand`: pure function that takes a version and returns a single resolved command. No exec calls, easy to test.
- `CliTokenSource` simplified to a single `cmd []string` field. No `hostCmd`, no runtime fallback. `Token()` is now one line: `return c.execCliCommand(ctx, c.cmd)`.
- Removed: `isUnknownFlagError`, `buildCliCommands` (plural), `buildHostCommand`, `hostCmd` field.

How is this tested?
Manual tests on versions 0.207.0 and 0.207.1.
Unit tests in `config/cli_token_source_test.go`:
- `TestParseCliVersion`: standard versions, patch versions, malformed output, empty string, missing prefix.
- `TestCliVersion_AtLeast`: equal, higher/lower patch/minor/major, zero vs zero, zero vs nonzero.
- `TestBuildCliCommand`: table-driven, version x config → expected command. Covers: host-only, account host, profile+new CLI (uses --profile), profile+old CLI (falls back to --host), profile-only+old CLI (nil), zero version (detection failed, falls back to --host), neither profile nor host (nil).
- `TestNewCliTokenSource`: success with host, success with profile, CLI not found, neither profile nor host.
- `TestCliTokenSource_Token`: success, CLI error, invalid JSON.
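The table-driven cases listed for `TestBuildCliCommand` can be pictured with a simplified, standalone sketch. Everything here is hypothetical: `config` is a trimmed-down stand-in for the SDK's `Config`, a non-empty `accountID` stands in for the account-host check, and a plain `supportsProfile` boolean replaces the version comparison. It is not the PR's implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// config is a hypothetical, trimmed-down stand-in for the SDK's Config.
type config struct {
	profile, host, accountID string
}

// buildCommand returns the argv used to fetch a token, or nil when no
// command can be built. supportsProfile stands in for the version check.
func buildCommand(cliPath string, cfg config, supportsProfile bool) []string {
	if cfg.profile != "" && supportsProfile {
		return []string{cliPath, "auth", "token", "--profile", cfg.profile}
	}
	if cfg.host == "" {
		return nil // nothing to fall back to
	}
	cmd := []string{cliPath, "auth", "token", "--host", cfg.host}
	if cfg.accountID != "" {
		cmd = append(cmd, "--account-id", cfg.accountID)
	}
	return cmd
}

// render formats a command for comparison; nil becomes "<nil>".
func render(cmd []string) string {
	if cmd == nil {
		return "<nil>"
	}
	return strings.Join(cmd, " ")
}

func main() {
	cases := []struct {
		name            string
		cfg             config
		supportsProfile bool
		want            string
	}{
		{"host only", config{host: "https://h"}, true, "databricks auth token --host https://h"},
		{"account host", config{host: "https://a", accountID: "123"}, true, "databricks auth token --host https://a --account-id 123"},
		{"profile, new CLI", config{profile: "p", host: "https://h"}, true, "databricks auth token --profile p"},
		{"profile, old CLI", config{profile: "p", host: "https://h"}, false, "databricks auth token --host https://h"},
		{"profile only, old CLI", config{profile: "p"}, false, "<nil>"},
		{"neither", config{}, true, "<nil>"},
	}
	for _, tc := range cases {
		got := render(buildCommand("databricks", tc.cfg, tc.supportsProfile))
		status := "ok"
		if got != tc.want {
			status = "MISMATCH"
		}
		fmt.Printf("%-22s %s: %s\n", tc.name, status, got)
	}
}
```

Keeping the builder free of exec calls is what makes this style of test possible: each row is just inputs and an expected argv.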