Add --exclude-metrics-path to filter request paths from metrics#213
lewispb wants to merge 2 commits
Lets services opt specific paths (typically health checks from upstream load balancers or uptime monitors) out of the Prometheus request and in-flight metrics. Matches are exact, can be repeated, and only suppress metrics — request logs are still emitted. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The motivation for `--exclude-metrics-path` is that high-volume healthcheck traffic distorts aggregate metrics (request rate, latency percentiles, error rates) and inflates the metrics pipeline — not that the output is "noisy". Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pull request overview
Adds a per-service --exclude-metrics-path deploy option so specific request paths (e.g., health checks) can be excluded from Prometheus request counters/duration and the in-flight gauge, reducing metrics noise while keeping request logs intact.
Changes:
- Persist per-service excluded paths in `ServiceOptions` and detect exact matches against `r.URL.Path`.
- Skip Prometheus request tracking (counters/histogram) and in-flight gauge updates for excluded paths.
- Add CLI flag, documentation, and unit/integration-style tests for the new behavior.
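For readers skimming the review, here is a minimal sketch of what exact-match exclusion on `ServiceOptions` could look like. It is not the code from this PR: the field name `ExcludeMetricsPaths`, its JSON tag, and the helper `loadOptions` are assumptions made for illustration. Only `ServiceOptions`, `IsMetricsExcluded`, and the exact-match semantics come from the PR text.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ServiceOptions is a simplified stand-in for the per-service options.
// The field name and JSON tag are assumptions, not the PR's actual definition.
type ServiceOptions struct {
	ExcludeMetricsPaths []string `json:"exclude_metrics_paths"`
}

// IsMetricsExcluded reports whether path is exactly equal to one of the
// configured paths. There is no prefix, glob, or trailing-slash handling.
func (o ServiceOptions) IsMetricsExcluded(path string) bool {
	for _, p := range o.ExcludeMetricsPaths {
		if p == path {
			return true
		}
	}
	return false
}

// loadOptions (hypothetical helper) decodes persisted state. State written
// before the field existed simply leaves the slice nil, so nothing is excluded.
func loadOptions(data []byte) (ServiceOptions, error) {
	var opts ServiceOptions
	err := json.Unmarshal(data, &opts)
	return opts, err
}

func main() {
	opts := ServiceOptions{ExcludeMetricsPaths: []string{"/up", "/healthz"}}
	for _, path := range []string{"/up", "/up/", "/upload", "/healthz"} {
		fmt.Printf("%-9s excluded=%v\n", path, opts.IsMetricsExcluded(path))
	}

	legacy, err := loadOptions([]byte(`{}`))
	fmt.Println("legacy state excludes /up:", legacy.IsMetricsExcluded("/up"), "err:", err)
}
```

In this sketch `/up` and `/healthz` are excluded, while `/up/` and `/upload` are not, which is the practical consequence of matching exactly rather than by prefix.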
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| README.md | Documents the new per-service flag and exact-match behavior for excluding paths from metrics. |
| internal/server/service.go | Stores excluded paths and marks matching requests to skip metrics; avoids in-flight tracking for excluded requests. |
| internal/server/service_test.go | Adds coverage for IsMetricsExcluded and validates the request-context flag is set for excluded paths. |
| internal/server/logging_middleware.go | Conditionally skips TrackRequest when the request context indicates metrics should be excluded. |
| internal/cmd/deploy.go | Introduces the repeatable --exclude-metrics-path deploy flag wired into ServiceOptions. |
@lewispb we should probably always exclude health check requests from metrics. We could exclude anything where Would doing that cover your use case? Or do you have a real need to exclude multiple other paths too? (btw I'm still out for the next couple weeks so am not working on this; I just happened to see this PR in passing ;))
No rush @kevinmcconnell, we can chat about it at the meetup if you like. We could exclude some common health check endpoints automatically, but it's hard to know what any particular app may consider as a health check (we have several that we use internally for different reasons). It's also an area where opinions differ - health checks are real requests, and there are some legitimate cases for actually monitoring the performance of them. Happy to chat in person anyways.
Sounds good @lewispb! let’s chat when we meet 👍
Just to clarify that one part, the proxy already knows the app’s own healthcheck path from the deployment. We have some special handling on that path already (for maintenance mode; to ensure downstream LBs don’t mark an app in maintenance as unhealthy). So that’s the same endpoint that I would propose excluding, at least by default. We wouldn’t be matching on common paths, we’d be using the one that’s already defined.
Summary
Adds a per-service `--exclude-metrics-path` flag on `deploy` so apps can opt specific paths out of the Prometheus request counters and in-flight gauge. Matches are exact and the flag is repeatable.
Motivation
High-volume healthcheck traffic from upstream load balancers and uptime monitors both inflates the metrics pipeline (volume → cardinality, storage, scrape cost) and dominates aggregate measures like request rate, latency percentiles, and error rates, so the dashboards no longer reflect real user traffic.
Behavior
- Exact match against `r.URL.Path`, repeatable via `--exclude-metrics-path /up --exclude-metrics-path /healthz` (or comma-separated)
- Persisted in `ServiceOptions` alongside other per-service config; legacy state without the field deserializes fine
- For excluded paths, `kamal_proxy_http_requests_total`, `kamal_proxy_http_request_duration_seconds`, and `kamal_proxy_http_in_flight_requests` are skipped
Test plan
- `go test ./...` passes
- `go vet ./...` clean
- Unit coverage for `ServiceOptions.IsMetricsExcluded`
- `Service.ServeHTTP` flips the request-context flag for matching paths only
- Deploy with `--exclude-metrics-path /up`, hit `/up` and `/other`, confirm `/up` is absent from `/metrics` output
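To make the "repeatable or comma-separated" behavior concrete, here is a small sketch of how such a flag can be parsed. It uses the standard library `flag` package purely as a stand-in; the CLI library used in `internal/cmd/deploy.go` is not shown on this page, so the wiring below is an assumption. The names `pathList` and `parseExcludedPaths`, and the `/ping` path, are hypothetical.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// pathList implements flag.Value. Each occurrence of the flag appends to
// the list, and a single occurrence may carry several comma-separated paths.
type pathList []string

func (p *pathList) String() string {
	if p == nil {
		return ""
	}
	return strings.Join(*p, ",")
}

func (p *pathList) Set(value string) error {
	for _, part := range strings.Split(value, ",") {
		if part = strings.TrimSpace(part); part != "" {
			*p = append(*p, part)
		}
	}
	return nil
}

// parseExcludedPaths (hypothetical) parses deploy-style arguments and
// returns the collected paths in the order they were given.
func parseExcludedPaths(args []string) ([]string, error) {
	var paths pathList
	fs := flag.NewFlagSet("deploy", flag.ContinueOnError)
	fs.Var(&paths, "exclude-metrics-path", "path to exclude from metrics (repeatable)")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	return paths, nil
}

func main() {
	paths, err := parseExcludedPaths([]string{
		"--exclude-metrics-path", "/up",
		"--exclude-metrics-path", "/healthz,/ping",
	})
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(paths) // prints "[/up /healthz /ping]"
}
```

Both forms end up in one flat list, which can then be stored on the per-service options and compared against `r.URL.Path` exactly.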