Skip to content

Add host_key_policy option to ComputeEngineSSHHook#66746

Open
potiuk wants to merge 1 commit into
apache:mainfrom
potiuk:add-host-key-policy-option-to-compute-engine-ssh-hook
Open

Add host_key_policy option to ComputeEngineSSHHook#66746
potiuk wants to merge 1 commit into
apache:mainfrom
potiuk:add-host-key-policy-option-to-compute-engine-ssh-hook

Conversation

@potiuk
Copy link
Copy Markdown
Member

@potiuk potiuk commented May 12, 2026

Summary

Expose paramiko's MissingHostKeyPolicy choice as a constructor argument on ComputeEngineSSHHook, so callers can opt into strict host-key verification on the SSH transport. The hook previously hard-coded paramiko.AutoAddPolicy, which means callers who wanted the remote host authenticated had no way to ask for it.

The new host_key_policy argument accepts:

  • the string aliases "auto_add", "reject" and "warning" — mapped to the matching paramiko policy classes;
  • any custom paramiko.MissingHostKeyPolicy instance — so a caller that wants to pin the remote host's key from GCE guest attributes / instance metadata can plug in a policy that loads it on the fly.

The default is "auto_add", preserving the historical behaviour of the hook; no migration is required for existing callers.

The previous inline comment that claimed the missing host-key check was unrelated to the local private key is removed — it conflated two different concerns — and replaced with a pointer to the new constructor argument.

Files changed

  • providers/google/src/airflow/providers/google/cloud/hooks/compute_ssh.py — new host_key_policy parameter, helper resolver, applied in _connect_to_instance. Misleading comment removed.
  • providers/google/tests/unit/google/cloud/hooks/test_compute_ssh.py — new TestHostKeyPolicyResolution class (4 cases: default, string aliases, custom instance, unknown-string error).

Test plan

  • uv run --project providers/google pytest providers/google/tests/unit/google/cloud/hooks/test_compute_ssh.py — 22 / 22 pass
  • prek run --from-ref upstream/main --stage pre-commit — clean

Migration

None. Callers that don't pass host_key_policy get the same paramiko AutoAddPolicy behaviour as before.

Was generative AI tooling used to co-author this PR?
  • Yes — Claude Opus 4.7 (1M context)

Generated-by: Claude Opus 4.7 (1M context) following the guidelines at https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions

Exposes paramiko's `MissingHostKeyPolicy` choice as a constructor
argument so callers can opt into strict host-key verification on the
SSH transport. The argument accepts the string aliases `"auto_add"`,
`"reject"` and `"warning"` (which map to the matching `paramiko`
policy classes) and also passes through any custom
`paramiko.MissingHostKeyPolicy` instance — so a caller that wants to
pin the remote host's key from GCE guest attributes / instance
metadata can plug in a policy that loads it on the fly.

The default is `"auto_add"`, preserving the historical behaviour of
this hook; no migration is required for existing callers. The
previous inline comment claiming the missing host-key check was
unrelated to the local private key is removed — it conflated two
different concerns and is replaced with a pointer to the new
constructor argument.

Generated-by: Claude Opus 4.7 (1M context) following the guidelines at
https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions
@potiuk potiuk requested a review from shahar1 as a code owner May 12, 2026 00:50
@boring-cyborg boring-cyborg Bot added area:providers provider:google Google (including GCP) related issues labels May 12, 2026
callers can plug in a custom policy (e.g. one that loads pinned
host keys from GCE guest attributes).

Any other value raises :class:`AirflowException`.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Any other value raises :class:`AirflowException`.
Any other value raises :class:`ValueError`.

"""Tests for the ``host_key_policy`` constructor argument."""

def test_default_is_auto_add(self):
import paramiko
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for making this import internal? (same goes for the other tests)

Comment on lines +197 to +201
raise ValueError(
f"Unknown host_key_policy {self.host_key_policy!r}. "
"Expected one of 'auto_add', 'reject', 'warning', "
"or an instance of paramiko.MissingHostKeyPolicy."
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider replacing with raise ValueError(...) from None as the KeyError is an implementation detail of the dict lookup

cmd_timeout: int | ArgNotSet = NOTSET,
max_retries: int = 10,
impersonation_chain: str | None = None,
host_key_policy: str | paramiko.MissingHostKeyPolicy = "auto_add",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please add a docstring?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:providers provider:google Google (including GCP) related issues

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants