Problem
When a client requests a directory listing for a repository with a very large number of content units, the content app builds the entire listing in memory, causing an instant OOM kill of the pulp-content pod.
In production, requesting the directory listing for @rubygems/rubygems/fedora-43-x86_64 (266,934 RPM packages) causes an ~800MB-1.2GB memory spike in a single request, immediately OOM-killing the content pod (2 GiB memory limit). The pod has been OOM-killed repeatedly by this request pattern.
Root Cause
pulpcore/content/handler.py — the list_directory_blocking() method (line 606) and render_html() method (line 534) build the entire directory listing in memory:
- list_directory_blocking() iterates all ContentArtifact objects matching the path, building four in-memory collections (directory_list set, dates dict, content_to_find dict, sizes dict). For 267K packages, this loads ~267K Django ORM objects with select_related("artifact").
- It then iterates content_repo_ver._content_relationships() to update dates, loading another ~267K RepositoryContent objects.
- render_html() sorts the 267K entry set, then renders a Jinja template producing ~267K <a href> lines into a single HTML string (~27MB of HTML).
- The complete HTML string is returned via HTTPOk(text=...), holding the entire response in memory.
Total memory impact: ~267K ORM objects (~267MB) + 4 dicts/sets of 267K entries + sorted list + rendered HTML ≈ 800MB-1.2GB.
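As a rough sanity check on the figures above, the arithmetic can be written out. This is only a back-of-envelope sketch: the ~1 KB per ORM object is an assumption implied by "~267K ORM objects (~267MB)", not a measured value, and the helper name is hypothetical.

```python
PACKAGES = 266_934              # packages in the repository version (from this report)
HTML_TOTAL_BYTES = 27_000_000   # ~27MB of rendered HTML (from this report)


def estimate_orm_mb(objects: int, bytes_per_object: int = 1_000) -> float:
    """Estimate ORM memory in MB, assuming a fixed size per object (an assumption)."""
    return objects * bytes_per_object / 1_000_000


# ~1 KB per object is the size implied by the estimate, not a measurement.
orm_mb = estimate_orm_mb(PACKAGES)

# Average rendered size of one <a href> line in the listing.
html_bytes_per_entry = HTML_TOTAL_BYTES / PACKAGES

print(f"ORM objects: ~{orm_mb:.0f} MB, HTML: ~{html_bytes_per_entry:.0f} bytes per entry")
```

The ORM objects and the rendered HTML account for only part of the observed spike; the rest of the 800MB-1.2GB estimate comes from the four collections and the sorted list, which this sketch does not model.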
Evidence from Production
- Pod: pulp-content-7469c446f6-89jz5 (2 GiB memory limit)
- OOM kill 1: 2026-05-22 09:16 UTC, memory jumped from 521MB to 1365MB in one minute
- OOM kill 2: 2026-05-22 10:28 UTC, memory jumped from 477MB to 1640MB in one minute
- Repository: @rubygems/rubygems/fedora-43-x86_64, 266,934 RPM packages in latest version
- All other requests in the access logs were 302 redirects (not memory-intensive)
Suggested Approaches
- Stream the HTML response: Use StreamResponse to write the directory listing in chunks instead of building the entire HTML string in memory
- Paginate: Limit directory listings to a configurable maximum number of entries (e.g., 10,000) with pagination links
- Cap and warn: If the directory listing exceeds a threshold, return a truncated listing with a message indicating the listing is too large
- Lazy iteration: Use Django's .iterator() on the queryset and stream entries as they're fetched from the database, avoiding materializing all ORM objects at once
- Pre-generate at publish/version time: Generate the HTML directory listing pages when a publication or repository version is created, storing them as static artifacts. The content app would then serve pre-built pages instead of generating them on each request
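The streaming and lazy-iteration approaches could be combined roughly as follows. This is a minimal sketch, not the handler's actual code: iter_listing_chunks and its parameters are hypothetical names, and it assumes entry names arrive already sorted and de-duplicated (for example from an ordered queryset consumed with .iterator()), since a streamed listing cannot be sorted in memory first. In the content app, each yielded chunk would presumably be passed to aiohttp's StreamResponse.write() rather than collected.

```python
from html import escape
from typing import Iterable, Iterator
from urllib.parse import quote


def iter_listing_chunks(
    path: str, names: Iterable[str], chunk_size: int = 1000
) -> Iterator[bytes]:
    """Yield an HTML directory listing as encoded chunks of up to chunk_size entries.

    Only one chunk of links is held in memory at a time, so peak memory
    depends on chunk_size rather than on the total number of entries.
    """
    yield f"<html><body><h1>Index of {escape(path)}</h1><pre>\n".encode()
    buffer = []
    for name in names:  # names may be a lazy iterator; it is consumed only once
        buffer.append(f'<a href="{quote(name)}">{escape(name)}</a>\n')
        if len(buffer) >= chunk_size:
            yield "".join(buffer).encode()
            buffer.clear()
    if buffer:  # flush the final partial chunk
        yield "".join(buffer).encode()
    yield b"</pre></body></html>\n"


# Usage: a generator stands in for the database cursor.
chunks = list(iter_listing_chunks("/repo/", (f"pkg-{i}.rpm" for i in range(2500))))
```

Streaming bounds memory but not total generation time, so on its own it may not be enough for the largest repositories.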
Related
The content app also has a gradual memory leak (~7.3 MB per 1000 requests) that compounds this issue. Worker recycling via --max-requests is being enabled separately to address the leak.
Update: 504 timeout even when OOM is resolved
After increasing the content pod memory limit from 2Gi to 3Gi, the directory listing for @rubygems/rubygems/fedora-43-x86_64 (266,934 packages) no longer OOM-kills the pod — it peaked at 1532MB and survived. However, the request still fails with a 504 Gateway Timeout because building the directory listing takes longer than 30 seconds to complete.
This reinforces the need for an approach that avoids building the entire response on-the-fly at request time.
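One way to move the work out of the request path is to pre-generate paginated listing pages when the publication or repository version is created. The sketch below illustrates the idea under stated assumptions: iter_pages, page_name, and the index-N.html naming are hypothetical, the 10,000 page size is the example value from the pagination suggestion, and how each page would be stored as a static artifact is left out.

```python
from html import escape
from itertools import islice
from typing import Iterable, Iterator, Tuple
from urllib.parse import quote


def page_name(number: int) -> str:
    """Hypothetical naming scheme: index.html, index-2.html, index-3.html, ..."""
    return "index.html" if number == 1 else f"index-{number}.html"


def iter_pages(
    names: Iterable[str], page_size: int = 10_000
) -> Iterator[Tuple[str, str]]:
    """Yield (file name, HTML) pairs for a paginated listing.

    At most two pages of names are held in memory: the current page and
    one page of lookahead, used to decide whether a "next" link is needed.
    """
    it = iter(names)
    current = list(islice(it, page_size))
    number = 1
    while True:
        following = list(islice(it, page_size))
        lines = [f'<a href="{quote(n)}">{escape(n)}</a>' for n in current]
        if number > 1:
            lines.append(f'<a href="{page_name(number - 1)}">previous</a>')
        if following:
            lines.append(f'<a href="{page_name(number + 1)}">next</a>')
        html = "<html><body><pre>\n" + "\n".join(lines) + "\n</pre></body></html>\n"
        yield page_name(number), html
        if not following:
            break
        current, number = following, number + 1


# Usage: each page would be written out once, at publish time.
pages = list(iter_pages((f"pkg-{i:05d}.rpm" for i in range(25)), page_size=10))
```

With pages built ahead of time, a request only has to serve a bounded, pre-rendered file, which addresses both the memory spike and the 30-second timeout.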