in_podman_metrics: add per-container disk I/O metrics #11986

Open
stondo wants to merge 1 commit into fluent:master from stondo:in_podman_metrics-disk-io
stondo commented Jun 23, 2026

Summary

Add four per-container block-I/O counters to in_podman_metrics, read from the
cgroups v2 io.stat file and summed across block devices:

  • container_disk_read_bytes_total
  • container_disk_write_bytes_total
  • container_disk_reads_total
  • container_disk_writes_total

This complements the plugin's existing CPU, memory and network metrics so the
input reports a full CPU / memory / disk / network picture per container.

Implementation

  • New sum_io_stat_field() helper parses io.stat (lines like
    8:0 rbytes=.. wbytes=.. rios=.. wios=..) and sums one field across all
    devices. It returns UINT64_MAX (skipped by create_counter) when the file
    cannot be read, and a valid 0 for an existing-but-empty file.
  • Collected in the cgroups v2 path (fill_counters_with_sysfs_data_v2). On
    cgroups v1 the fields stay UINT64_MAX, so the counters are simply skipped.
  • New struct fields, counters and create_counter calls follow the existing
    CPU / memory / network pattern.
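
The per-field summing described above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: the names `sum_io_stat_field_sketch` and `sum_io_stat_field_path`, the 512-byte line buffer, and the `strtok`-based tokenizing are assumptions made for the example. Only the `MAJ:MIN key=value ...` line shape and the `UINT64_MAX` / `0` return convention come from the description.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch: sum one "key=value" field (for example "rbytes")
 * across every device line of an io.stat stream.
 * Returns UINT64_MAX when the stream is unusable, 0 for an empty file. */
static uint64_t sum_io_stat_field_sketch(FILE *fp, const char *key)
{
    char line[512];              /* assumed to be large enough for one line */
    uint64_t total = 0;
    size_t key_len = strlen(key);
    char *tok;

    if (fp == NULL) {
        return UINT64_MAX;       /* caller is expected to skip the counter */
    }

    while (fgets(line, sizeof(line), fp) != NULL) {
        tok = strtok(line, " \n");          /* first token is "MAJ:MIN" */
        if (tok == NULL) {
            continue;                        /* blank line */
        }
        while ((tok = strtok(NULL, " \n")) != NULL) {
            /* match "key=" exactly, so "rbytes" does not match "dbytes" */
            if (strncmp(tok, key, key_len) == 0 && tok[key_len] == '=') {
                total += strtoull(tok + key_len + 1, NULL, 10);
            }
        }
    }
    return total;
}

/* Hypothetical wrapper: open the file by path and map an open failure
 * to the UINT64_MAX "invalid" marker. */
static uint64_t sum_io_stat_field_path(const char *path, const char *key)
{
    FILE *fp = fopen(path, "r");
    uint64_t total;

    if (fp == NULL) {
        return UINT64_MAX;
    }
    total = sum_io_stat_field_sketch(fp, key);
    fclose(fp);
    return total;
}
```

Under these assumptions, a caller would invoke the wrapper once per metric with the container's io.stat path and one of the keys `rbytes`, `wbytes`, `rios`, `wios`, and skip any counter whose value comes back as `UINT64_MAX`.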

Example

container_disk_read_bytes_total{id="..",name="app",image=".."}  16498688
container_disk_write_bytes_total{id="..",name="app",image=".."} 24576
container_disk_reads_total{id="..",name="app",image=".."}       193
container_disk_writes_total{id="..",name="app",image=".."}      6

Testing

  • Builds cleanly: cmake --build build --target flb-plugin-in_podman_metrics
    completes with no warnings on the changed files.
  • Run against Podman pods on an aarch64 Linux device: all four counters emit
    correct per-container values (cross-checked against the containers' io.stat);
    unset on cgroups v1; existing CPU/memory/network output unchanged.

Documentation

  • A companion fluent-bit-docs update will list the new metrics.

Signed-off-by: Stefano Tondo <stondo@gmail.com>

Summary by CodeRabbit

  • New Features
    • Added cgroups v2 disk I/O metrics to container monitoring, reporting disk read bytes, disk write bytes, disk read operations, and disk write operations per container.
    • Metrics are integrated into the existing podman metrics output with the same per-container labeling for consistent visibility and comparison across containers.

stondo commented Jun 23, 2026

Author

Local checks

  • cmake -S . -B build -DFLB_CONFIG_YAML=Off -DFLB_TESTS_RUNTIME=Off -DFLB_TESTS_INTERNAL=Off -DFLB_EXAMPLES=Off
    then cmake --build build --target flb-plugin-in_podman_metrics: target built, no warnings on the changed files.
  • Commit-prefix linter (.github/scripts/commit_prefix_check.py) passes.

Device validation

Built into a Yocto firmware (Fluent Bit 5.0.2) and run on an aarch64 Linux
device with Podman pods. The four new counters emit correct per-container
values (Prometheus exporter, id/image labels elided):

container_disk_read_bytes_total{name="gw-cloud-connector"}  16498688
container_disk_read_bytes_total{name="example-app"}          8744960
container_disk_read_bytes_total{name="ti-logcreator"}         528384
container_disk_write_bytes_total{name="gw-cloud-connector"}    24576
container_disk_reads_total{name="gw-cloud-connector"}            193
container_disk_writes_total{name="gw-cloud-connector"}             6

Cross-checked against each container's cgroup io.stat
(rbytes/wbytes/rios/wios). With cgroups v1 the fields stay invalid and
the counters are skipped; existing CPU/memory/network output is unchanged.

A companion fluent-bit-docs PR will list the new metrics.

coderabbitai Bot commented Jun 23, 2026

Walkthrough

The podman metrics input plugin gains four disk I/O counters (disk_read_bytes, disk_write_bytes, disk_reads, disk_writes) from cgroups v2 io.stat. The config header adds the needed macros and fields, read_io_stat parses the sysfs data, and the plugin wiring registers and initializes the new counters.

Changes

Disk I/O metrics for podman metrics plugin

Layer / File(s) and summary:

  • Config macros and struct extensions (plugins/in_podman_metrics/podman_metrics_config.h): Adds io.stat parsing token macros, the V2_SYSFS_FILE_IO_STAT macro, disk metric name and description macros, and disk I/O fields in struct container and struct flb_in_metrics.
  • io.stat reader and v2 counter filling (plugins/in_podman_metrics/podman_metrics_data.h, plugins/in_podman_metrics/podman_metrics_data.c): Declares and implements read_io_stat, which opens a cgroups v2 io.stat sysfs file, scans lines for disk key tokens, and accumulates numeric values into the container fields. fill_counters_with_sysfs_data_v2 calls it for the new disk metrics.
  • Counter registration and plugin init (plugins/in_podman_metrics/podman_metrics.c): Initializes disk fields to UINT64_MAX in add_container_to_list, extends create_counters to register four per-container disk counters, and sets the four new counter handles to NULL in in_metrics_init.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐇 Hop hop, the disks now sing,
io.stat hums in bytes and spins.
Four new counters take the ring,
reads and writes now dance like pins.
A burrow full of metrics wins.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
  • Description check: ✅ Passed. Check skipped - CodeRabbit’s high-level summary is enabled.
  • Title check: ✅ Passed. The title clearly and concisely matches the main change: adding per-container disk I/O metrics to in_podman_metrics.
  • Docstring coverage: ✅ Passed. Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%.
  • Linked issues check: ✅ Passed. Check skipped because no linked issues were found for this pull request.
  • Out of scope changes check: ✅ Passed. Check skipped because no linked issues were found for this pull request.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
plugins/in_podman_metrics/podman_metrics_data.c (1)

414-417: 🚀 Performance & Scalability | 🔵 Trivial | ⚡ Quick win

Avoid reading io.stat four times per container scrape.

These four calls open and parse the same file repeatedly. A single-pass parser that extracts all four fields at once would cut sysfs I/O and reduce repeated failure warnings on missing files.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/in_podman_metrics/podman_metrics_data.c` around lines 414 - 417, The
four consecutive calls to sum_io_stat_field are each opening and parsing the
same V2_SYSFS_FILE_IO_STAT file separately, causing unnecessary sysfs I/O and
repeated error handling. Refactor this code to create a single-pass parser
function that reads the io.stat file once and extracts all four fields
(IO_STAT_KEY_READ_BYTES, IO_STAT_KEY_WRITE_BYTES, IO_STAT_KEY_READS,
IO_STAT_KEY_WRITES) in a single operation, then assign the extracted values to
cnt->disk_read_bytes, cnt->disk_write_bytes, cnt->disk_reads, and
cnt->disk_writes respectively.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 672718c9-ae70-4caa-b76f-bce98160e465

📥 Commits

Reviewing files that changed from the base of the PR and between 0481348 and 5958c56.

📒 Files selected for processing (4)
  • plugins/in_podman_metrics/podman_metrics.c
  • plugins/in_podman_metrics/podman_metrics_config.h
  • plugins/in_podman_metrics/podman_metrics_data.c
  • plugins/in_podman_metrics/podman_metrics_data.h

flb_free(line);
fclose(fp);

flb_plg_debug(ctx->ins, "%s [%s]: %lu", path, key, total);

🎯 Functional Correctness | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify format-specifier usage around uint64_t logging in this plugin file.
sed -n '1,40p' plugins/in_podman_metrics/podman_metrics_data.c
rg -n --type=c 'flb_plg_debug\(.*%l[udx]' plugins/in_podman_metrics/podman_metrics_data.c
rg -n --type=c '\buint64_t\b' plugins/in_podman_metrics/podman_metrics_data.c

Repository: fluent/fluent-bit

Length of output: 2543


🏁 Script executed:

# Check if inttypes.h is included anywhere in the file
rg -n 'inttypes\.h' plugins/in_podman_metrics/podman_metrics_data.c

# Read the context around line 175 to see the function and variable declarations
sed -n '140,180p' plugins/in_podman_metrics/podman_metrics_data.c

# Also check line 131 context
sed -n '120,135p' plugins/in_podman_metrics/podman_metrics_data.c

Repository: fluent/fluent-bit

Length of output: 1486


Use uint64_t-safe format specifiers in debug logging.

Lines 131 and 175 format uint64_t values (data and total) with %lu, which is not portable across ABIs. Use PRIu64 (or an explicit cast with a matching specifier).

Proposed fix
+#include <inttypes.h>
...
-    flb_plg_debug(ctx->ins, "%s: %lu", path, data);
+    flb_plg_debug(ctx->ins, "%s: %" PRIu64, path, data);
...
-    flb_plg_debug(ctx->ins, "%s [%s]: %lu", path, key, total);
+    flb_plg_debug(ctx->ins, "%s [%s]: %" PRIu64, path, key, total);
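
As a standalone illustration of the proposed specifier change, the sketch below formats a `uint64_t` with `PRIu64` through plain `snprintf`. The helper name `format_io_total` is hypothetical and `snprintf` stands in for `flb_plg_debug`; only the format string mirrors the proposed fix.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: build the same message as the debug line, using
 * PRIu64 so the specifier matches uint64_t on every ABI. With "%lu" the
 * argument type only matches where unsigned long is 64 bits wide. */
static int format_io_total(char *buf, size_t size, const char *path,
                           const char *key, uint64_t total)
{
    /* "%" PRIu64 is joined into one format string at compile time */
    return snprintf(buf, size, "%s [%s]: %" PRIu64, path, key, total);
}
```

`PRIu64` expands to the correct length modifier for the target platform, which is why the string literal has to be split around the macro.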
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/in_podman_metrics/podman_metrics_data.c` at line 175, The debug
logging statements are using the `%lu` format specifier to print `uint64_t`
values (the variables `data` and `total`), which is not portable across
different ABIs and systems. Replace the `%lu` format specifiers with `PRIu64`
macro in both the flb_plg_debug call containing the `data` variable and the
flb_plg_debug call containing the `total` variable to ensure portable formatting
of `uint64_t` values across different platforms.

stondo commented Jun 23, 2026

Author

Companion documentation PR: fluent/fluent-bit-docs#2608

Add four counters exposing per-container block I/O, read from the cgroups
v2 io.stat file and summed across block devices:

- container_disk_read_bytes_total
- container_disk_write_bytes_total
- container_disk_reads_total
- container_disk_writes_total

This complements the existing CPU, memory and network metrics. The values
are collected in the cgroups v2 path; on cgroups v1 hosts the counters are
reported as invalid and skipped.

Signed-off-by: Stefano Tondo <stondo@gmail.com>
@stondo stondo force-pushed the in_podman_metrics-disk-io branch from 5958c56 to 32fc804 Compare June 24, 2026 16:39

@coderabbitai coderabbitai Bot left a comment

♻️ Duplicate comments (1)
plugins/in_podman_metrics/podman_metrics_data.c (1)

197-199: 🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Use portable uint64_t formatting in the debug log.

%lu is not portable for uint64_t; use PRIu64 to avoid ABI-dependent output.

Proposed fix
+#include <inttypes.h>
...
-    flb_plg_debug(ctx->ins, "%s: rbytes=%lu wbytes=%lu rios=%lu wios=%lu", path,
+    flb_plg_debug(ctx->ins, "%s: rbytes=%" PRIu64 " wbytes=%" PRIu64 " rios=%" PRIu64 " wios=%" PRIu64, path,
                   cnt->disk_read_bytes, cnt->disk_write_bytes,
                   cnt->disk_reads, cnt->disk_writes);
#!/bin/bash
# Verify non-portable uint64_t printf usage and whether inttypes.h is present.
rg -n --type=c 'flb_plg_debug\(.*%l[udx]' plugins/in_podman_metrics/podman_metrics_data.c
rg -n --type=c '#include <inttypes.h>' plugins/in_podman_metrics/podman_metrics_data.c
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/in_podman_metrics/podman_metrics_data.c` around lines 197 - 199, The
debug log in podman_metrics_data.c uses non-portable %lu specifiers for uint64_t
counters, which can break on different ABIs. Update the flb_plg_debug call in
the disk metrics logging path to use PRIu64 for cnt->disk_read_bytes,
cnt->disk_write_bytes, cnt->disk_reads, and cnt->disk_writes, and make sure the
file includes inttypes.h so the format macros are available.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 36ad8236-cddb-4f2f-9350-530cbef9fdfb

📥 Commits

Reviewing files that changed from the base of the PR and between 5958c56 and 32fc804.

📒 Files selected for processing (4)
  • plugins/in_podman_metrics/podman_metrics.c
  • plugins/in_podman_metrics/podman_metrics_config.h
  • plugins/in_podman_metrics/podman_metrics_data.c
  • plugins/in_podman_metrics/podman_metrics_data.h
🚧 Files skipped from review as they are similar to previous changes (2)
  • plugins/in_podman_metrics/podman_metrics.c
  • plugins/in_podman_metrics/podman_metrics_config.h

stondo commented Jun 24, 2026

Author

Addressed in 32fc8046b:

  • Single-pass io.stat read: io.stat is now opened once per container scrape; a small key/accumulator table fills all four counters (rbytes/wbytes/rios/wios) in one pass via the new read_io_stat(), replacing the four sum_io_stat_field() calls. As a side benefit this emits a single warning instead of four when the file is absent (e.g. cgroups v1).
  • %lu for uint64_t: left as-is. podman_metrics_data.c already uses %lu for uint64_t in seven places (and PRIu64 only once, for a pid sprintf), so changing this one debug line would make the file inconsistent. Happy to convert the whole file to PRIu64 in a separate cleanup if a maintainer prefers.
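
For readers without the diff at hand, the single-pass approach with a key/accumulator table might look roughly like the sketch below. It is an illustration under assumptions, not the code in the commit: the `struct io_totals` type, the name `read_io_stat_sketch`, the `FILE *` parameter and the 512-byte buffer are all invented for the example. Only the four keys and the `UINT64_MAX` "invalid" marker come from the PR.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the four summed values. */
struct io_totals {
    uint64_t read_bytes;
    uint64_t write_bytes;
    uint64_t reads;
    uint64_t writes;
};

/* Hypothetical sketch: read io.stat once and fill all four totals.
 * Returns 0 on success, -1 (with every field set to UINT64_MAX) when
 * the stream is unusable. */
static int read_io_stat_sketch(FILE *fp, struct io_totals *out)
{
    /* key/accumulator table: each io.stat key points at its total */
    struct { const char *key; uint64_t *acc; } table[] = {
        { "rbytes", &out->read_bytes },
        { "wbytes", &out->write_bytes },
        { "rios",   &out->reads },
        { "wios",   &out->writes },
    };
    size_t n = sizeof(table) / sizeof(table[0]);
    char line[512];              /* assumed to be large enough for one line */
    char *tok;
    char *eq;
    size_t i;

    for (i = 0; i < n; i++) {
        *table[i].acc = (fp == NULL) ? UINT64_MAX : 0;
    }
    if (fp == NULL) {
        return -1;               /* one failure path instead of four */
    }

    while (fgets(line, sizeof(line), fp) != NULL) {
        tok = strtok(line, " \n");           /* skip the "MAJ:MIN" token */
        if (tok == NULL) {
            continue;
        }
        while ((tok = strtok(NULL, " \n")) != NULL) {
            eq = strchr(tok, '=');
            if (eq == NULL) {
                continue;
            }
            *eq = '\0';                      /* split "key=value" in place */
            for (i = 0; i < n; i++) {
                if (strcmp(tok, table[i].key) == 0) {
                    *table[i].acc += strtoull(eq + 1, NULL, 10);
                    break;
                }
            }
        }
    }
    return 0;
}
```

Because the failure case is handled once up front, a caller in this shape has a single place to log the missing-file warning, which matches the side benefit noted above.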
