feat: include authenticated user identity in HTTP access log by ardentperf · Pull Request #9991 · pgadmin-org/pgadmin4

ardentperf · 2026-05-29T16:43:52Z

Set an X-Remote-User response header containing the authenticated username on every request. This allows the access log to be configured to include user identity via standard log format directives (%({x-remote-user}o)s in gunicorn, %{X-Remote-User}o in Apache) without requiring any changes to pgAdmin's session or auth behaviour.

Closes #9990

Default gunicorn access log format is at https://gunicorn.org/reference/settings/?h=access_log#access_log_format and https://github.com/benoitc/gunicorn/blob/9bc5891b4b06f25a8ce0e707053dcb2fb9bf638c/gunicorn/config.py#L1413 ; I confirmed that all other fields are default; this PR only changes the user name field.
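To illustrate how the new directive resolves, here is a minimal sketch that renders the default format and this PR's format with plain Python `%` formatting. `Atoms` and `build_atoms` are hypothetical, simplified stand-ins for the lookup gunicorn performs (lowercased `{header}o` keys, `-` for anything missing); they are not gunicorn's actual classes, and the sample request values are made up.

```python
# Simplified stand-in for gunicorn's access-log atoms (an assumption, not
# gunicorn's real implementation): missing atoms render as '-', and
# '{name}o' keys are looked up in lowercase.
class Atoms(dict):
    def __getitem__(self, key):
        if key.startswith('{'):
            key = key.lower()
        return dict.get(self, key, '-')


DEFAULT_FORMAT = (
    '%(h)s %(l)s %(u)s %(t)s '
    '"%(r)s" %(s)s %(b)s "%(f)s" "%(a)s"'
)
PR_FORMAT = (
    '%(h)s %(l)s %({x-remote-user}o)s %(t)s '
    '"%(r)s" %(s)s %(b)s "%(f)s" "%(a)s"'
)


def build_atoms(response_headers):
    """Build sample atoms; response headers become '{name}o' entries."""
    atoms = Atoms(
        h='10.0.0.1', l='-', u='-', t='[29/May/2026:16:43:52 +0000]',
        r='GET /browser/ HTTP/1.1', s='200', b='512', f='-', a='curl/8.0',
    )
    for name, value in response_headers.items():
        atoms['{%s}o' % name.lower()] = value
    return atoms


# No X-Remote-User header: the user field falls back to '-'.
anonymous = PR_FORMAT % build_atoms({})
# Header present: the username appears in the third field.
authenticated = PR_FORMAT % build_atoms({'X-Remote-User': 'admin@pgadmin.org'})
print(anonymous)
print(authenticated)
```

Only the third field differs between the two formats, which matches the statement that all other fields stay at their defaults.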

Performed an end-to-end test on kubernetes with KIND

With this PR:

Master Branch:

Summary by CodeRabbit

New Features
- Option to include the authenticated username in responses so HTTP access logs can record user identity, improving audit trails and monitoring.
- New configuration toggle (disabled by default) to enable or disable this behavior.
- When enabled and a user is authenticated, a response header carries a latin-1-safe username; when disabled or unauthenticated, no username is added.

Set an X-Remote-User response header containing the authenticated username on every request. This allows the access log to be configured to include user identity via standard log format directives (%({x-remote-user}o)s in gunicorn, %{X-Remote-User}o in Apache) without requiring any changes to pgAdmin's session or auth behaviour. Signed-off-by: Jeremy Schneider <schneider@ardentperf.com>

coderabbitai · 2026-05-29T16:44:08Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: bd02bed5-c71b-4ac4-96a9-e492a27d8600

📥 Commits

Reviewing files that changed from the base of the PR and between b232ead and 0f6f4d5.

📒 Files selected for processing (3)

pkg/docker/gunicorn_config.py
web/config.py
web/pgadmin/__init__.py

🚧 Files skipped from review as they are similar to previous changes (3)

web/pgadmin/__init__.py
web/config.py
pkg/docker/gunicorn_config.py

Walkthrough

Flask now sets an X-Remote-User response header for authenticated requests when enabled by config.LOG_AUTHENTICATED_USER; Gunicorn's access_log_format is configured to include that header so logs show the authenticated username (or '-' when absent).

Changes

User Identity in HTTP Access Logs

| Layer / File(s) | Summary |
| --- | --- |
| Config flag for header `web/config.py` | Adds `LOG_AUTHENTICATED_USER = False` to control emitting `X-Remote-User` in responses. |
| Flask response header for authenticated user `web/pgadmin/__init__.py` | `after_request` sets `X-Remote-User` to a latin-1-safe `current_user.username` when `LOG_AUTHENTICATED_USER` is enabled; removes the header if unauthenticated or empty. |
| Gunicorn access log format configuration `pkg/docker/gunicorn_config.py` | Sets `access_log_format` to include `%({x-remote-user}o)s`, logging '-' for missing values. |

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'feat: include authenticated user identity in HTTP access log' directly and concisely describes the main objective of adding user identity to access logs. |
| Linked Issues check | ✅ Passed | The PR implements the core requirement from `#9990`: exposing authenticated username in HTTP access logs via X-Remote-User header for gunicorn/Apache compatibility. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to the linked issue: config flag, response header setting, and gunicorn log format configuration. |


coderabbitai

🧹 Nitpick comments (2)

pkg/docker/gunicorn_config.py (2)
8-11: Consider privacy implications of logging user identities.

The access_log_format now includes authenticated usernames in HTTP access logs, which is the intended behavior per the PR objectives. However, deployments should be aware that:

- Usernames (e.g., email addresses like admin@pgadmin.org) constitute personally identifiable information (PII)
- Access logs may be subject to data retention policies under GDPR, CCPA, or other privacy regulations
- Log aggregation systems, backup procedures, and access controls should account for PII in logs

Consider documenting this change in deployment/administration guides so that operators can implement appropriate log handling policies for their regulatory environment.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/docker/gunicorn_config.py` around lines 8 - 11, The access_log_format in
pkg/docker/gunicorn_config.py now injects authenticated usernames via the
X-Remote-User header (access_log_format), which exposes PII; update
documentation and make the behavior configurable: add guidance in
deployment/administration docs describing the PII risk,
retention/aggregation/backup/access-control recommendations, and instructions to
disable or anonymize usernames (e.g., provide a deploy-time option or env var to
remove %({x-remote-user}o)s from access_log_format or enable masking) so
operators can comply with GDPR/CCPA and other policies.
8-11: ⚡ Quick win

Confirm Gunicorn access-log header lookup is case-insensitive (lowercase config is correct).

Gunicorn recommends using lowercase identifiers in access_log_format, and internally normalizes header lookups for %({header-name}o)s (response headers). So %({x-remote-user}o)s will correctly pick up X-Remote-User even though the actual header is capitalized.

Operational: logging the authenticated username can be sensitive (PII/auditing concerns); ensure retention/access controls align with your compliance requirements.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/docker/gunicorn_config.py` around lines 8 - 11, The access_log_format
line currently uses the lowercase header token %({x-remote-user}o)s; confirm
that this is correct and leave it lowercase (Gunicorn normalizes header lookups
so %({x-remote-user}o)s will match X-Remote-User), and add a short inline
comment or README note next to access_log_format to document that header lookup
is case-insensitive and that logging usernames may contain PII so
retention/access controls must be applied; reference the access_log_format
setting and the %({x-remote-user}o)s token when making these clarifications.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: fddec9fa-afa7-41a9-b850-a7da26aafb89

📥 Commits

Reviewing files that changed from the base of the PR and between 0d11dbc and beea339.

📒 Files selected for processing (2)

pkg/docker/gunicorn_config.py
web/pgadmin/__init__.py

dpage

Thanks for this — it's a clean, well-scoped change and a genuinely useful feature. One blocking concern and a couple of smaller points.

🔴 Must fix: non-Latin-1 usernames will 500 every authenticated request. See the inline comment — HTTP header values must encode to Latin-1, but current_user.username can legitimately contain characters outside that range (OAuth2 preferred_username/sub/OAUTH2_USERNAME_CLAIM, Kerberos principals, LDAP attributes). Since the header is set unconditionally for every authenticated user, such a user would hit UnicodeEncodeError during response serialization and be locked out entirely. Sanitizing the value before setting the header avoids this.

🟡 Tests: it would be good to add a small regression test asserting the header is present for an authenticated request and absent for an anonymous one — ideally exercising a non-ASCII username to guard the case above.

ℹ️ Notes (non-blocking):

- The access_log_format change only lands for Docker deployments; standard package/pip server installs running their own gunicorn won't pick up the username field without adding the directive themselves. Consistent with the PR's stated Docker/k8s focus; just flagging the scope.
- With JSON_LOGGER enabled the username ends up embedded inside the access-log message string rather than as a discrete JSON field, since the JSON formatter wraps gunicorn's rendered access line. Works, but operators expecting a structured field may be surprised.

dpage · 2026-06-01T09:03:25Z

    @app.after_request
    def after_request(response):
+        if current_user.is_authenticated:
+            response.headers['X-Remote-User'] = current_user.username


HTTP header values must encode to Latin-1, but current_user.username isn't guaranteed to be ASCII/Latin-1. OAuth2 (_resolve_username can return preferred_username, sub, or a configured OAUTH2_USERNAME_CLAIM), Kerberos principals, and LDAP-mapped usernames can all contain Cyrillic/CJK/accented characters.

Verified against the Werkzeug currently pinned in the project:

- Gorkov (Cyrillic) -> UnicodeEncodeError: 'latin-1' codec can't encode...
- alice\r\nX-Injected: 1 -> ValueError: Header values must not contain newline characters.

Because the header is set unconditionally for every authenticated user, a user with a non-Latin-1 username will raise during response serialization and get a 500 on every request — effectively locked out of pgAdmin. (The CR/LF case is already blocked by Werkzeug, so there's no header-injection vuln, but it would also 500.)

Suggest sanitizing before setting:

    if current_user.is_authenticated and current_user.username:
        # HTTP headers are latin-1 only; avoid 500s for unicode usernames
        safe = current_user.username.encode('latin-1', 'replace').decode('latin-1')
        response.headers['X-Remote-User'] = safe
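As a quick illustration of what the suggested encode/decode round trip does, here is a standalone sketch. The helper name `latin1_safe` and the sample usernames are hypothetical and not part of the PR.

```python
def latin1_safe(username):
    """Replace characters that cannot be encoded as Latin-1 with '?'."""
    return username.encode('latin-1', 'replace').decode('latin-1')


for name in ('alice', 'José', 'Иван'):
    safe = latin1_safe(name)
    # Would raise UnicodeEncodeError if the value were still not Latin-1.
    safe.encode('latin-1')
    print(repr(name), '->', repr(safe))
```

The replacement is lossy: every unencodable character becomes `?`, so distinct non-Latin-1 usernames of the same length can look identical in the log. CR/LF characters are valid Latin-1 and pass through this step unchanged; as noted above, Werkzeug rejects those separately.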

I did a little bit of poking around, but I'm not seeing an easy way to add tests for this (or for the change as a whole), since they depend on having a server like gunicorn in the loop :-/

I'm a little curious how you reproduced the error and what your setup is... pgAdmin wouldn't let me create accounts with special chars in the email, and I didn't have OAuth2 set up for testing.

I repro'd the error by directly calling gunicorn's to_bytestring() function, just to verify it.

dpage · 2026-06-01T09:07:47Z

Please also address the python style issue causing CI to fail.

Thanks!

mzabuawala · 2026-06-08T04:02:43Z


    @app.after_request
    def after_request(response):
+        if current_user.is_authenticated:


Since not everyone wants to send user information in plain text in every response, we must make it configurable.

LOG_AUTHENTICATED_USER=True/False

mzabuawala · 2026-06-08T04:05:44Z


    @app.after_request
    def after_request(response):
+        if current_user.is_authenticated:


Suggested change

-        if current_user.is_authenticated:
+        if current_user.is_authenticated:
+            response.headers['X-Remote-User'] = current_user.username
+        else:
+            # prevents any accidental reuse if middleware or future code sets the header earlier
+            response.headers.pop('X-Remote-User', None)

Making sure I understand this. I guess the main concern here is inconsistency - it would be a bit weird to overwrite the header sometimes and not others - and users wouldn't know who set the value they see in the field. If we're going to set a value in this field sometimes, then it's best to always control the contents of the field. That makes sense.

My understanding is that the concern is less about the current implementation and more about ensuring the header is always under pgAdmin's control. If we only overwrite it for authenticated users, then in other cases it's unclear whether the value came from pgAdmin itself or was set earlier by middleware or some future code path. Explicitly removing it when there is no authenticated user keeps the behavior deterministic and avoids any ambiguity about the source of the header value.
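The behaviour this thread converges on (opt-in flag, sanitized value for authenticated users, explicit removal otherwise) can be sketched on a plain dict, independent of Flask. The function name `manage_remote_user_header` and its signature are hypothetical; the real change lives in pgAdmin's `after_request` hook.

```python
def manage_remote_user_header(headers, enabled, is_authenticated, username):
    """Hypothetical helper mirroring the after_request logic discussed here."""
    if not enabled:
        # Feature off: leave the headers untouched.
        return headers
    if is_authenticated and username:
        # Latin-1-safe value so the header can always be serialized.
        headers['X-Remote-User'] = (
            username.encode('latin-1', 'replace').decode('latin-1')
        )
    else:
        # Drop any value set earlier by middleware or other code paths.
        headers.pop('X-Remote-User', None)
    return headers


print(manage_remote_user_header({}, True, True, 'alice'))
print(manage_remote_user_header({'X-Remote-User': 'stale'}, True, False, None))
print(manage_remote_user_header({'X-Remote-User': 'stale'}, False, False, None))
```

With the flag enabled, the header is either set from the authenticated user or removed, so its value never comes from anywhere else. With the flag disabled, nothing is touched.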

ardentperf · 2026-06-08T13:35:08Z

Apologies, this past week has been a bit crazy - haven’t forgotten and should get to it this week 🤞

dpage · 2026-06-09T13:54:37Z

A few additional notes to fold into your next pass, on top of the inline comments above:

1. The pep8/CI failure is just the new access_log_format line (96 chars > 79). # noqa won't help since we run pycodestyle directly, so please wrap it:

access_log_format = (
    '%(h)s %(l)s %({x-remote-user}o)s %(t)s '
    '"%(r)s" %(s)s %(b)s "%(f)s" "%(a)s"'
)

2. Default the config flag to off. Building on @mzabuawala's LOG_AUTHENTICATED_USER suggestion: X-Remote-User is sent on every response to the client, not just consumed by the log — so it's visible to proxies, CDNs, TLS middleboxes, browser extensions, etc. For a server-side logging feature that's broader exposure than needed, so the flag should default to False (opt-in), with a docs note that enabling it surfaces the username in response headers.

3. Optional design alternative. If cross-deployment portability (Apache %{X-Remote-User}o) isn't essential, you could stash the name in the WSGI environ instead of a response header — request.environ['x_remote_user'] = current_user.username — and log it via gunicorn's environ atom %({x_remote_user}e)s. That keeps the identity entirely server-side (never sent to the client) and sidesteps the Latin-1 header-encoding 500 altogether. The trade-off is it's gunicorn-specific, whereas the response-header approach also works under Apache — so it's a judgement call, not a request. (Worth confirming the {}e atom behaviour against the pinned gunicorn if you go that route.)

Thanks again for the contribution — no rush, just capturing these so they're in one place for your update.

ardentperf · 2026-06-09T15:16:28Z

3. Optional design alternative. If cross-deployment portability (Apache %{X-Remote-User}o) isn't essential, you could stash the name in the WSGI environ instead of a response header — request.environ['x_remote_user'] = current_user.username — and log it via gunicorn's environ atom %({x_remote_user}e)s. That keeps the identity entirely server-side (never sent to the client) and sidesteps the Latin-1 header-encoding 500 altogether. The trade-off is it's gunicorn-specific, whereas the response-header approach also works under Apache — so it's a judgement call, not a request. (Worth confirming the {}e atom behaviour against the pinned gunicorn if you go that route.)

The main use case I'm interested in right now is containerized deployments on kubernetes, so gunicorn covers the immediate case. But personally I think there's value in the overall idea for many pgAdmin users. My vote would be for the opt-in Apache-compatible approach, which makes it accessible to others who are also interested in using this outside kubernetes for a more complete audit picture (e.g., seeing who has downloaded data from which database and how many bytes were in the download request). I also don't think there should be any concerns about sensitivity (even in regulated environments) around simply knowing which authenticated user made which requests. It's fairly standard for an access log. That being said, I'm open to either option if someone feels there is a strong argument for keeping everything server-side.

…r X-Remote-User header

Addresses reviewer feedback on PR pgadmin-org#9991:
- Gate the X-Remote-User header behind LOG_AUTHENTICATED_USER (default False)
- Encode username as latin-1 with replacement to prevent gunicorn 500s for non-ASCII usernames
- Clear the header on unauthenticated requests when the feature is enabled

…mote-User header

Addresses reviewer feedback on PR pgadmin-org#9991:
- Gate the X-Remote-User header behind LOG_AUTHENTICATED_USER (default False)
- Encode username as latin-1 with replacement to prevent gunicorn 500s for non-ASCII usernames
- Clear the header on unauthenticated requests when the feature is enabled

Signed-off-by: Jeremy Schneider <schneider@ardentperf.com>

style: fix E501 line too long in after_request
style: fix E501 line too long in gunicorn_config.py

ardentperf · 2026-06-10T07:38:37Z

Pushed updates to the PR addressing the comments. The CI failures on Python style should now be resolved.

Default pgAdmin behavior is unchanged; X-Remote-User header is only managed if the user enables the LOG_AUTHENTICATED_USER configuration setting.

I tested with the helm chart on k8s and confirmed that there's no logging by default, and logging is enabled when I add this to my values.yaml file:

extraEnvVars:
  - name: PGADMIN_CONFIG_LOG_AUTHENTICATED_USER
    value: "True"

Is this a good name for the config setting? Technically it controls the header, and I've updated the default gunicorn logging to automatically pick it up. For users who consume this through docker and helm, the practical effect is that the config does enable logging.

ardentperf · 2026-06-10T07:55:41Z

After a bit more thought, I'm also testing the WSGI environ approach to see if it works.

ardentperf · 2026-06-10T08:10:24Z

I just tried testing request.environ['x_remote_user'] = current_user.username and %({x_remote_user}e)s, but gunicorn did not log usernames. Now I remember trying this before too, and I never did figure out why it doesn't work...

ardentperf · 2026-06-10T08:27:44Z

I think the culprit is here:

https://github.com/miguelgrinberg/flask-socketio/blob/v5.6.1/src/flask_socketio/__init__.py#L40

flask_socketio wraps app.wsgi_app with its own middleware (SocketioMiddleware). Its `__call__` method copies the environ before passing it to Flask. The copy at line 40 means Flask receives a new dict, a shallow copy of gunicorn's original environ. Any mutations Flask makes (including request.environ['x_remote_user'] = ... in after_request) go into the copy. Gunicorn's original environ dict is untouched. When gunicorn calls self.log.access(resp, req, environ, ...), it passes its original dict, which never has x_remote_user set. This was confirmed by logging id() of the environ in both after_request and gunicorn's handle_request; they were different on every request.
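The effect described above can be reproduced with a few lines of plain WSGI-style Python, without gunicorn, Flask, or flask_socketio. `CopyingMiddleware`, `inner_app`, and `serve_once` are hypothetical stand-ins written for this sketch; they only model the copy-then-delegate pattern and are not the libraries' real code.

```python
def inner_app(environ, start_response):
    # Stands in for Flask: mutates the environ it was handed.
    environ['x_remote_user'] = 'alice'
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'ok']


class CopyingMiddleware:
    """Hypothetical middleware that copies environ before delegating."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        environ = environ.copy()  # shallow copy; the caller's dict is not mutated
        return self.app(environ, start_response)


def serve_once(app):
    """Stands in for the server: owns the original environ and 'logs' it."""
    environ = {'REQUEST_METHOD': 'GET', 'PATH_INFO': '/'}
    body = app(environ, lambda status, headers: None)
    # The server only ever sees its own dict, like the access logger does.
    return environ.get('x_remote_user', '-'), b''.join(body)


print(serve_once(inner_app))                     # ('alice', b'ok')
print(serve_once(CopyingMiddleware(inner_app)))  # ('-', b'ok')
```

Called directly, the inner app's mutation is visible to the server. Once the copying middleware sits in between, the key lands in the copy and the server-side lookup falls back to `-`, which matches the missing usernames observed with the environ approach.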

coderabbitai Bot reviewed May 29, 2026

dpage reviewed Jun 1, 2026

asheshv assigned ardentperf Jun 6, 2026

mzabuawala reviewed Jun 8, 2026

ardentperf force-pushed the pr-loguserid branch from b232ead to 0f6f4d5 on June 10, 2026 07:32
