[BugFux] Fix help text formatting for Sentry sample rates in argument parser t… by xiaoajie738 · Pull Request #946 · vllm-project/production-stack

xiaoajie738 · 2026-05-08T02:35:30Z

…o escape percentage signs.

before:
python3 app.py --help
Traceback (most recent call last):
File "/home/xiongjie/code/github.com/xiaoajie738/production-stack/src/vllm_router/app.py", line 450, in <module>
main()
File "/home/xiongjie/code/github.com/xiaoajie738/production-stack/src/vllm_router/app.py", line 378, in main
args = parse_args()
^^^^^^^^^^^^
File "/home/xiongjie/code/github.com/xiaoajie738/production-stack/src/vllm_router/parsers/parser.py", line 478, in parse_args
args = parser.parse_args()
^^^^^^^^^^^^^^^^^^^
File "/home/xiongjie/miniconda3/envs/py312/lib/python3.12/argparse.py", line 1904, in parse_args
args, argv = self.parse_known_args(args, namespace)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/xiongjie/miniconda3/envs/py312/lib/python3.12/argparse.py", line 1914, in parse_known_args
return self._parse_known_args2(args, namespace, intermixed=False)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/xiongjie/miniconda3/envs/py312/lib/python3.12/argparse.py", line 1943, in _parse_known_args2
namespace, args = self._parse_known_args(args, namespace, intermixed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/xiongjie/miniconda3/envs/py312/lib/python3.12/argparse.py", line 2184, in _parse_known_args
start_index = consume_optional(start_index)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/xiongjie/miniconda3/envs/py312/lib/python3.12/argparse.py", line 2113, in consume_optional
take_action(action, args, option_string)
File "/home/xiongjie/miniconda3/envs/py312/lib/python3.12/argparse.py", line 2018, in take_action
action(self, namespace, argument_values, option_string)
File "/home/xiongjie/miniconda3/envs/py312/lib/python3.12/argparse.py", line 1148, in __call__
parser.print_help()
File "/home/xiongjie/miniconda3/envs/py312/lib/python3.12/argparse.py", line 2621, in print_help
self._print_message(self.format_help(), file)
^^^^^^^^^^^^^^^^^^
File "/home/xiongjie/miniconda3/envs/py312/lib/python3.12/argparse.py", line 2605, in format_help
return formatter.format_help()
^^^^^^^^^^^^^^^^^^^^^^^
File "/home/xiongjie/miniconda3/envs/py312/lib/python3.12/argparse.py", line 286, in format_help
help = self._root_section.format_help()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/xiongjie/miniconda3/envs/py312/lib/python3.12/argparse.py", line 217, in format_help
item_help = join([func(*args) for func, args in self.items])
^^^^^^^^^^^
File "/home/xiongjie/miniconda3/envs/py312/lib/python3.12/argparse.py", line 217, in format_help
item_help = join([func(*args) for func, args in self.items])
^^^^^^^^^^^
File "/home/xiongjie/miniconda3/envs/py312/lib/python3.12/argparse.py", line 546, in _format_action
help_text = self._expand_help(action)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/xiongjie/miniconda3/envs/py312/lib/python3.12/argparse.py", line 640, in _expand_help
return self._get_help_string(action) % params
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~
ValueError: unsupported format character ')' (0x29) at index 54
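
The traceback ends in `self._get_help_string(action) % params`: argparse expands every help string with the `%` operator, so a bare `%` followed by `)` is read as a format specifier. A minimal sketch that reproduces this in isolation is shown here. The `demo` parser, the `--sample-rate` flag, and the `help_renders` helper are hypothetical names for illustration, not code from this PR.

```python
import argparse


def help_renders(help_text: str) -> bool:
    """Return True if argparse accepts and renders help_text, False on ValueError."""
    parser = argparse.ArgumentParser(prog="demo")
    try:
        # Both calls sit inside the try block so the check does not depend
        # on which of them a given Python version raises from.
        parser.add_argument("--sample-rate", type=float, default=0.1, help=help_text)
        parser.format_help()  # expands the help string with `help % params`
    except ValueError:
        return False
    return True


print(help_renders("Default is 0.1 (10%)"))   # False: '%)' is not a valid format specifier
print(help_renders("Default is 0.1 (10%%)"))  # True: '%%' renders as a literal '%'
```

The same check could serve as a smoke test for a real parser: building it and calling `format_help()` exercises every help string at once.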

after:
python3 app.py --help
usage: app.py [-h] [--host HOST] [--port PORT] [--root-path ROOT_PATH] [--service-discovery {static,k8s,external-only}] [--k8s-service-discovery-type {pod-ip,service-name}]
[--static-backends STATIC_BACKENDS] [--static-models STATIC_MODELS] [--static-aliases STATIC_ALIASES] [--static-model-types STATIC_MODEL_TYPES] [--static-model-labels STATIC_MODEL_LABELS]
[--static-backend-health-checks] [--static-backend-health-check-interval STATIC_BACKEND_HEALTH_CHECK_INTERVAL]
[--static-backend-health-check-timeout-seconds STATIC_BACKEND_HEALTH_CHECK_TIMEOUT_SECONDS] [--k8s-port K8S_PORT] [--k8s-namespace K8S_NAMESPACE] [--k8s-label-selector K8S_LABEL_SELECTOR]
[--k8s-watcher-timeout-seconds K8S_WATCHER_TIMEOUT_SECONDS] [--backend-health-check-timeout-seconds BACKEND_HEALTH_CHECK_TIMEOUT_SECONDS]
[--routing-logic {roundrobin,session,kvaware,prefixaware,disaggregated_prefill,disaggregated_prefill_orchestrated}] [--lmcache-controller-port LMCACHE_CONTROLLER_PORT]
[--lmcache-controller-reply-port LMCACHE_CONTROLLER_REPLY_PORT] [--lmcache-controller-heartbeat-port LMCACHE_CONTROLLER_HEARTBEAT_PORT] [--session-key SESSION_KEY] [--callbacks CALLBACKS]
[--request-rewriter {noop}] [--enable-batch-api] [--file-storage-class {local_file}] [--file-storage-path FILE_STORAGE_PATH] [--batch-processor {local}]
[--engine-stats-interval ENGINE_STATS_INTERVAL] [--request-stats-window REQUEST_STATS_WINDOW] [--log-stats] [--log-stats-interval LOG_STATS_INTERVAL]
[--dynamic-config-yaml DYNAMIC_CONFIG_YAML | --dynamic-config-json DYNAMIC_CONFIG_JSON] [--version] [--feature-gates FEATURE_GATES] [--log-level {critical,error,warning,info,debug,trace}]
[--log-format {text,json}] [--sentry-dsn SENTRY_DSN] [--sentry-traces-sample-rate SENTRY_TRACES_SAMPLE_RATE] [--sentry-profile-session-sample-rate SENTRY_PROFILE_SESSION_SAMPLE_RATE]
[--otel-endpoint OTEL_ENDPOINT] [--otel-service-name OTEL_SERVICE_NAME] [--otel-secure] [--prefill-model-labels PREFILL_MODEL_LABELS] [--decode-model-labels DECODE_MODEL_LABELS]
[--kv-aware-threshold KV_AWARE_THRESHOLD] [--max-instance-failover-reroute-attempts MAX_INSTANCE_FAILOVER_REROUTE_ATTEMPTS] [--lmcache-health-check-interval LMCACHE_HEALTH_CHECK_INTERVAL]
[--lmcache-worker-timeout LMCACHE_WORKER_TIMEOUT] [--external-providers-config EXTERNAL_PROVIDERS_CONFIG]

Run the FastAPI app.

options:
-h, --help show this help message and exit
--host HOST The host to run the server on.
--port PORT The port to run the server on.
--root-path ROOT_PATH
FastAPI root path for hosting under a subpath (e.g. /vllm).
--service-discovery {static,k8s,external-only}
The service discovery type. Use 'external-only' for deployments with no local vLLM backends.
--k8s-service-discovery-type {pod-ip,service-name}
The k8s service discovery type implementation only applies if service-discovery is specified as k8s.
--static-backends STATIC_BACKENDS
The URLs of static backends, separated by commas. E.g., http://localhost:8000,http://localhost:8001
--static-models STATIC_MODELS
The models of static backends, separated by commas. E.g., model1,model2
--static-aliases STATIC_ALIASES
The aliases of static backends, separated by commas. E.g., your-custom-model:llama3
--static-model-types STATIC_MODEL_TYPES
Specify the static model types of each model. This is used for the backend health check, separated by commas. E.g. chat,embeddings,rerank
--static-model-labels STATIC_MODEL_LABELS
The model labels of static backends, separated by commas. E.g., model1,model2
--static-backend-health-checks
Enable this flag to make vllm-router check periodically if the models work by sending dummy requests to their endpoints.
--static-backend-health-check-interval STATIC_BACKEND_HEALTH_CHECK_INTERVAL
Interval in seconds between static backend health checks (default: 60).
--static-backend-health-check-timeout-seconds STATIC_BACKEND_HEALTH_CHECK_TIMEOUT_SECONDS
Timeout in seconds for static backend health check requests (default: 10).
--k8s-port K8S_PORT The port of vLLM processes when using K8s service discovery.
--k8s-namespace K8S_NAMESPACE
The namespace of vLLM pods when using K8s service discovery.
--k8s-label-selector K8S_LABEL_SELECTOR
The label selector to filter vLLM pods when using K8s service discovery.
--k8s-watcher-timeout-seconds K8S_WATCHER_TIMEOUT_SECONDS
Timeout in seconds for Kubernetes watcher streams (default: 0).
--backend-health-check-timeout-seconds BACKEND_HEALTH_CHECK_TIMEOUT_SECONDS
Timeout in seconds for backend health check requests (default: 10).
--routing-logic {roundrobin,session,kvaware,prefixaware,disaggregated_prefill,disaggregated_prefill_orchestrated}
The routing logic to use
--lmcache-controller-port LMCACHE_CONTROLLER_PORT
The port of the LMCache controller (PULL socket).
--lmcache-controller-reply-port LMCACHE_CONTROLLER_REPLY_PORT
The port of the LMCache controller ROUTER socket for req/reply (e.g., worker registration). Disabled if not set.
--lmcache-controller-heartbeat-port LMCACHE_CONTROLLER_HEARTBEAT_PORT
The port of the LMCache controller ROUTER socket for worker heartbeats. Disabled if not set.
--session-key SESSION_KEY
The key (in the header) to identify a session.
--callbacks CALLBACKS
Path to the callback instance extending CustomCallbackHandler. Consists of <file path without .py ending>..
--request-rewriter {noop}
The request rewriter to use. Default is 'noop' (no rewriting).
--enable-batch-api Enable the batch API for processing files.
--file-storage-class {local_file}
The file storage class to use.
--file-storage-path FILE_STORAGE_PATH
The path to store files.
--batch-processor {local}
The batch processor to use.
--engine-stats-interval ENGINE_STATS_INTERVAL
The interval in seconds to scrape engine statistics.
--request-stats-window REQUEST_STATS_WINDOW
The sliding window in seconds to compute request statistics.
--log-stats Log statistics periodically.
--log-stats-interval LOG_STATS_INTERVAL
The interval in seconds to log statistics.
--version Show version and exit
--feature-gates FEATURE_GATES
Comma-separated list of feature gates (e.g., 'SemanticCache=true')
--log-level {critical,error,warning,info,debug,trace}
Log level for the router and uvicorn. Default is 'info'.
--log-format {text,json}
Log output format. 'text' for human-readable colored output, 'json' for structured JSON logging. Default is 'text'.
--sentry-dsn SENTRY_DSN
Enables Sentry Error Reporting to the specified Data Source Name
--sentry-traces-sample-rate SENTRY_TRACES_SAMPLE_RATE
The sample rate for Sentry traces. Default is 0.1 (10%)
--sentry-profile-session-sample-rate SENTRY_PROFILE_SESSION_SAMPLE_RATE
The sample rate for Sentry profiling sessions. Default is 1.0 (100%)
--otel-endpoint OTEL_ENDPOINT
OTLP endpoint for tracing (e.g., localhost:4317). Enables tracing when set.
--otel-service-name OTEL_SERVICE_NAME
Service name for OpenTelemetry tracing. Default is 'vllm-router'.
--otel-secure Use secure (TLS) connection for OTLP exporter. Default is insecure.
--prefill-model-labels PREFILL_MODEL_LABELS
The model labels of prefill backends, separated by commas. E.g., model1,model2
--decode-model-labels DECODE_MODEL_LABELS
The model labels of decode backends, separated by commas. E.g., model1,model2
--kv-aware-threshold KV_AWARE_THRESHOLD
The threshold for kv-aware routing.
--max-instance-failover-reroute-attempts MAX_INSTANCE_FAILOVER_REROUTE_ATTEMPTS
Number of reroute attempts per failed request
--lmcache-health-check-interval LMCACHE_HEALTH_CHECK_INTERVAL
Health check interval for LMCache worker (seconds)
--lmcache-worker-timeout LMCACHE_WORKER_TIMEOUT
Timeout for LMCache worker (seconds)
--external-providers-config EXTERNAL_PROVIDERS_CONFIG
Path to a YAML file defining external LLM provider configurations (startup-time only).

Dynamic config file:
Only one dynamic config file (YAML or JSON) can be provided

--dynamic-config-yaml DYNAMIC_CONFIG_YAML
The path to the YAML file containing the dynamic configuration, cannot be used with --dynamic-config-json.
--dynamic-config-json DYNAMIC_CONFIG_JSON
The path to the JSON file containing the dynamic configuration, cannot be used with --dynamic-config-yaml.

FIX #xxxx (link existing issues this PR will resolve)

BEFORE SUBMITTING, PLEASE READ THE CHECKLIST BELOW AND FILL IN THE DESCRIPTION ABOVE

Make sure the code changes pass the pre-commit checks.
Sign-off your commit by using -s when doing git commit
Try to classify PRs for easy understanding of the type of changes, such as [Bugfix], [Feat], and [CI].

Detailed Checklist (Click to Expand)

Thank you for your contribution to production-stack! Before submitting the pull request, please ensure the PR meets the following criteria. This helps us maintain the code quality and improve the efficiency of the review process.

PR Title and Classification

Please try to classify PRs for easy understanding of the type of changes. The PR title is prefixed appropriately to indicate the type of change. Please use one of the following:

[Bugfix] for bug fixes.
[CI/Build] for build or continuous integration improvements.
[Doc] for documentation fixes and improvements.
[Feat] for new features in the cluster (e.g., autoscaling, disaggregated prefill, etc.).
[Router] for changes to the vllm_router (e.g., routing algorithm, router observability, etc.).
[Misc] for PRs that do not fit the above categories. Please use this sparingly.

Note: If the PR spans more than one category, please include all relevant prefixes.

Code Quality

The PR needs to meet the following code quality standards:

Pass all linter checks. Please use pre-commit to format your code. See README.md for installation.
The code needs to be well documented so that future contributors can easily understand it.
Please include sufficient tests to ensure the change stays correct and robust. This includes both unit tests and integration tests.

DCO and Signed-off-by

When contributing changes to this project, you must agree to the DCO. Commits must include a Signed-off-by: header which certifies agreement with the terms of the DCO.

Using -s with git commit will automatically add this header.

What to Expect for the Reviews

We aim to address all PRs in a timely manner. If no one reviews your PR within 5 days, please @-mention one of YuhanLiu11, Shaoting-Feng or ApostaC.

gemini-code-assist

Code Review

This pull request escapes percentage signs in the help strings for Sentry-related CLI arguments to ensure correct formatting. The reviewer suggests using %(default)s instead of hardcoding default values in these help strings to ensure the help messages automatically update if the default values are modified.

gemini-code-assist · 2026-05-08T02:37:54Z

        type=float,
        default=1.0,
-        help="The sample rate for Sentry profiling sessions. Default is 1.0 (100%)",
+        help="The sample rate for Sentry profiling sessions. Default is 1.0 (100%%)",


Consider using %(default)s instead of hardcoding the default value 1.0 in the help string. This ensures the help message automatically updates if the default value is changed in the code. Note that the %% escape is still necessary for the literal percentage sign.

Suggested change

-        help="The sample rate for Sentry profiling sessions. Default is 1.0 (100%%)",
+        help="The sample rate for Sentry profiling sessions. Default is %(default)s (100%%)",
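
To illustrate the reviewer's suggestion, here is a minimal sketch of how `%(default)s` and `%%` behave together in an argparse help string. The `demo` parser and the `--rate` flag are hypothetical stand-ins, not the router's actual argument definitions.

```python
import argparse

parser = argparse.ArgumentParser(prog="demo")
parser.add_argument(
    "--rate",  # hypothetical flag standing in for the Sentry sample-rate options
    type=float,
    default=1.0,
    # %(default)s is filled in from `default`; %% becomes a literal percent sign
    help="Sample rate. Default is %(default)s (100%%)",
)

# Collapse whitespace so terminal-width line wrapping does not affect the result.
help_text = " ".join(parser.format_help().split())
print(help_text)
```

With this approach the numeric default tracks the `default=` argument, but the percentage in parentheses is still hardcoded and would need a manual update if the default changes.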

xiaoajie738 · 2026-05-11T03:11:37Z

@ruizhang0101 Hello, could you please take a look at this PR when you have a moment?

ruizhang0101

LGTM

gemini-code-assist Bot reviewed May 8, 2026

Fix help text formatting for Sentry sample rates in argument parser t…

1b0c97a

…o escape percentage signs. Signed-off-by: xiongjie <xiongjie@sensetime.com>

xiaoajie738 force-pushed the fix/argparse-help-percent-escape branch from 8f58292 to 1b0c97a Compare May 8, 2026 02:39

xiaoajie738 changed the title ~~Fix help text formatting for Sentry sample rates in argument parser t…~~ [BugFux] Fix help text formatting for Sentry sample rates in argument parser t… May 8, 2026

Merge branch 'main' into fix/argparse-help-percent-escape

98d2c22

ruizhang0101 approved these changes May 26, 2026

ruizhang0101 added 2 commits May 26, 2026 12:48

Merge branch 'main' into fix/argparse-help-percent-escape

f9f29ec

Merge branch 'main' into fix/argparse-help-percent-escape

4b625df

xiaoajie738 wants to merge 4 commits into vllm-project:main from xiaoajie738:fix/argparse-help-percent-escape