feat: add Kubernetes Secret Manager for $secret://kubernetes/ references #13509
hiades-devops wants to merge 1 commit into
Implements the Kubernetes secret manager (apisix/secret/kubernetes.lua)
which allows APISIX to read Kubernetes Secrets directly from the cluster
using the pod's ServiceAccount token.
URI format:
$secret://kubernetes/{manager-id}/{namespace}/{secret-name}/{data-key}
Example:
$secret://kubernetes/my-k8s/default/api-creds/client_secret
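A minimal sketch of how a reference in this format could be split into its parts. The helper name `parse_k8s_secret_ref` and the returned field names are hypothetical and are not taken from the PR; the sketch assumes exactly four non-empty, slash-separated segments after the prefix.

```lua
-- Hypothetical helper (not from the PR): split a full reference into its parts.
local function parse_k8s_secret_ref(ref)
    local prefix = "$secret://kubernetes/"
    if type(ref) ~= "string" or ref:sub(1, #prefix) ~= prefix then
        return nil, "not a kubernetes secret reference"
    end

    local rest = ref:sub(#prefix + 1)
    -- Assumption: exactly four non-empty segments separated by "/".
    local manager_id, namespace, name, data_key =
        rest:match("^([^/]+)/([^/]+)/([^/]+)/([^/]+)$")
    if not manager_id then
        return nil, "expected {manager-id}/{namespace}/{secret-name}/{data-key}"
    end

    return {
        manager_id = manager_id,
        namespace = namespace,
        name = name,
        data_key = data_key,
    }
end

local parts = assert(parse_k8s_secret_ref(
    "$secret://kubernetes/my-k8s/default/api-creds/client_secret"))
print(parts.manager_id, parts.namespace, parts.name, parts.data_key)
```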
The implementation follows the same pattern as the existing Vault, AWS,
and GCP secret managers: a schema for Admin API configuration and a
get(conf, key) function called by apisix.secret.fetch_secrets() at
request time.
Key features:
- Uses pod ServiceAccount token (default in-cluster path) for auth
- Reads KUBERNETES_SERVICE_HOST/PORT from env if not explicitly configured
- Verifies TLS using the in-cluster CA bundle by default
- Decodes base64 Secret.data values automatically
- Clear error messages for auth failures, missing secrets, missing keys
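The decoding and missing-key handling listed above could look roughly like the following sketch. It is not the PR's code: `b64_decode` and `extract_value` are hypothetical names, and the pure-Lua decoder stands in for whatever the implementation actually uses (inside OpenResty, `ngx.decode_base64` would be the more likely choice). The only assumption about the input is the standard Kubernetes Secret shape, where `data` maps keys to base64-encoded strings.

```lua
local ALPHABET =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

-- Map each base64 character to its 6-bit value.
local LOOKUP = {}
for i = 1, #ALPHABET do
    LOOKUP[ALPHABET:sub(i, i)] = i - 1
end

-- Hypothetical pure-Lua decoder so the sketch runs outside OpenResty.
local function b64_decode(input)
    local out, acc, bits = {}, 0, 0
    for i = 1, #input do
        local c = input:sub(i, i)
        if c == "=" then
            break -- padding marks the end of the payload
        end
        local v = LOOKUP[c]
        if not v then
            return nil, "invalid base64 character at position " .. i
        end
        acc = acc * 64 + v
        bits = bits + 6
        if bits >= 8 then
            bits = bits - 8
            out[#out + 1] = string.char(math.floor(acc / 2 ^ bits))
            acc = acc % 2 ^ bits
        end
    end
    return table.concat(out)
end

-- Hypothetical helper: pick one key out of a decoded Secret object.
local function extract_value(secret, data_key)
    if type(secret) ~= "table" or type(secret.data) ~= "table" then
        return nil, "secret has no data field"
    end
    local encoded = secret.data[data_key]
    if encoded == nil then
        return nil, "key '" .. data_key .. "' not found in secret"
    end
    return b64_decode(encoded)
end

local value, err = extract_value(
    { data = { client_secret = "aGVsbG8=" } }, "client_secret")
print(value, err)
```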
Also fixes two bugs in authz-keycloak.lua discovered while using the
Kubernetes secret manager in production (fixes apache#13493):
1. client_id and client_secret maxLength was 100 characters. When APISIX
resolves a $secret:// reference, encrypts the value (encrypt_fields),
and stores it back to etcd, the AES-encrypted result can be 128-152
chars — exceeding the schema limit and causing load_full_data() to
fail on restart, dropping all authz-keycloak services with a 404.
Increased maxLength to 4096 for both fields.
2. The same maxLength = 100 constraint also blocked $secret://kubernetes/
references from being stored at all via the Admin API, since the
reference string itself (e.g.
$secret://kubernetes/my-k8s/my-ns/my-secret/client_secret) can exceed
100 characters.
Closes apache#13493
    }

    if ssl_verify then
        request_opts.ssl_trusted_certificate = DEFAULT_CA_FILE
ssl_trusted_certificate is not a supported request_uri option in lua-resty-http (the api7 fork APISIX uses only honors ssl_verify, ssl_server_name and ssl_send_status_req in http_connect.lua), so this line is silently ignored. With ssl_verify = true, the handshake verifies against the nginx-level lua_ssl_trusted_certificate, which defaults to the system CA bundle (apisix.ssl.ssl_trusted_certificate: system) — and the kube-apiserver cert is signed by the cluster CA, not a public one. So the default config will fail the TLS handshake in a real cluster, and the obvious workaround users will reach for is ssl_verify: false, which means sending the ServiceAccount token over an unverified connection.
I think this needs to either drop the no-op option and document that apisix.ssl.ssl_trusted_certificate must be set to /var/run/secrets/kubernetes.io/serviceaccount/ca.crt (the doc currently claims the in-cluster CA is used by default, which is not accurate), or find another way to actually load the cluster CA.
        or os.getenv("KUBERNETES_SERVICE_PORT")
        or "443"

    local uri = "https://" .. k8s_host .. ":" .. k8s_port
Hardcoding https:// makes the bundled tests impossible to pass: the mock locations in t/secret/kubernetes.t serve plain HTTP on port 1984, so TESTs 10-14 will fail at the TLS handshake before ever reaching the mock. gcp.lua handles this by letting the full endpoint be configured (entries_uri), which is exactly what its tests use to mock with http://127.0.0.1:1984. I would suggest the same here — e.g. an optional scheme/endpoint override defaulting to https — rather than rewriting the tests around a TLS mock.
Separately, TEST 14 will still fail after that: the manager registered in TEST 13 does not set service_account_file, so get() falls back to /var/run/secrets/kubernetes.io/serviceaccount/token, which does not exist in CI. Worth running the suite locally before the next push.
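A rough sketch of the kind of override suggested above, assuming a hypothetical optional `endpoint` field that takes precedence over the host/port lookup. The field names `endpoint`, `kubernetes_host` and `kubernetes_port`, and the injectable `getenv` parameter, are illustrative assumptions and not part of the PR; the path layout is the standard Kubernetes API path for a namespaced Secret.

```lua
-- Hypothetical: build the API base URI, letting tests point at a plain-HTTP mock.
local function build_base_uri(conf, getenv)
    getenv = getenv or os.getenv

    -- Assumed optional override, e.g. "http://127.0.0.1:1984" in tests.
    if conf.endpoint then
        -- Parentheses drop gsub's second return value (the match count).
        return (conf.endpoint:gsub("/+$", ""))
    end

    local host = conf.kubernetes_host or getenv("KUBERNETES_SERVICE_HOST")
    if not host then
        return nil, "kubernetes host is not configured"
    end
    local port = conf.kubernetes_port
        or getenv("KUBERNETES_SERVICE_PORT")
        or "443"

    -- Default stays https, as in the current implementation.
    return "https://" .. host .. ":" .. port
end

-- Standard Kubernetes API path for a namespaced Secret.
local function secret_uri(base, namespace, name)
    return base .. "/api/v1/namespaces/" .. namespace .. "/secrets/" .. name
end

local base = assert(build_base_uri({ endpoint = "http://127.0.0.1:1984/" }))
print(secret_uri(base, "default", "api-creds"))
```

With something like this, the mock locations could keep serving plain HTTP while production configs that omit the override continue to use https.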
-    client_id = {type = "string", minLength = 1, maxLength = 100},
-    client_secret = {type = "string", minLength = 1, maxLength = 100},
+    client_id = {type = "string", minLength = 1, maxLength = 4096},
+    client_secret = {type = "string", minLength = 1, maxLength = 4096},
The rationale for this bump does not hold on master. Since #13312, core.schema.check skips validation for $secret:// / $env:// references on string fields (skip_validation = secret.is_secret_ref in apisix/core/schema.lua), so a long reference string already passes the 100-char limit. And on reload, plugin_checker runs decrypt_conf before the schema check, so validation sees the original value, not the AES output — what #13493 describes is real on 3.16.0 but is already addressed here.
Raising the limit might still be worth doing for genuinely long client secrets, but then it should be justified on those grounds; otherwise I would drop this hunk from the PR and keep it focused on the new secret manager.
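To illustrate the first point, here is a simplified stand-in for the skip behavior described above. It is not the code in apisix/core/schema.lua: `is_ref` and `check_string` are hypothetical names, and the sketch only models a string length check that is bypassed for `$secret://` and `$env://` references.

```lua
-- Simplified illustration only; the real logic lives in apisix/core/schema.lua.
local function is_ref(value)
    return type(value) == "string"
        and (value:sub(1, 10) == "$secret://" or value:sub(1, 7) == "$env://")
end

-- Hypothetical length check that skips validation for references.
local function check_string(value, max_length)
    if is_ref(value) then
        return true -- resolved later, so the reference itself is not validated
    end
    if #value > max_length then
        return false, "string too long, expected at most " .. max_length
    end
    return true
end

local long_ref = "$secret://kubernetes/my-k8s/" .. string.rep("n", 60)
    .. "/" .. string.rep("s", 60) .. "/client_secret"
print(#long_ref > 100, check_string(long_ref, 100))
print(check_string(string.rep("x", 101), 100))
```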