Skip to content

validate msgpack record array length before indexing#11973

Open
saddamr3e wants to merge 3 commits into
fluent:masterfrom
saddamr3e:msgpack-record-array-bounds
Open

validate msgpack record array length before indexing#11973
saddamr3e wants to merge 3 commits into
fluent:masterfrom
saddamr3e:msgpack-record-array-bounds

Conversation

@saddamr3e

@saddamr3e saddamr3e commented Jun 21, 2026

Copy link
Copy Markdown
Contributor

flb_time_pop_from_msgpack() validates that the inner timestamp array has
two elements but never checks the outer record array before dereferencing
it:

if (upk->data.type != MSGPACK_OBJECT_ARRAY) {
    return -1;
}
obj = upk->data.via.array.ptr[0];        /* size unchecked */
...
*map = &upk->data.via.array.ptr[1];

A record array with fewer than two elements ([] or a lone timestamp)
reads ptr[0]/ptr[1] past the array. flb_metadata_pop_from_msgpack() and
the pack_print_fluent_record() helper have the same gap. The count is
validated in these callees because they sit behind the storage_backlog
reload path and the out_lib/library entrypoints.

Added a regression test under tests/internal/flb_time.c; before the fix
it pops a 1-element array and reads a garbage timestamp, after the fix
the short arrays are rejected. The internal suite passes under
-DSANITIZE_ADDRESS=On.


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change; please submit the following in a comment:

  • [N/A] Example configuration file for the change

  • [N/A] Debug log output from testing the change

  • [N/A] Attached Valgrind output that shows no leaks or memory corruption was found

  • [N/A] Run local packaging test showing all targets (including any new ones) build.

  • [N/A] Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • [N/A] Documentation required for this feature

Backporting

  • [N/A] Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

Bug Fixes

  • Enhanced validation of record data structure integrity to prevent potential processing errors.

Tests

  • Added regression test to verify proper rejection of invalid data structures during record processing.

Signed-off-by: saddamr3e <saddamr3e@gmail.com>
Signed-off-by: saddamr3e <saddamr3e@gmail.com>
Signed-off-by: saddamr3e <saddamr3e@gmail.com>
@coderabbitai

coderabbitai Bot commented Jun 21, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

Adds array-size validation guards to flb_time_pop_from_msgpack and flb_metadata_pop_from_msgpack/pack_print_fluent_record to prevent out-of-bounds access when MessagePack record arrays are shorter than expected. A regression test is added to verify the short-array rejection behavior in the time-pop path.

Changes

MsgPack Array Bounds Validation

Layer / File(s) Summary
Bounds check in flb_time_pop_from_msgpack with regression test
src/flb_time.c, tests/internal/flb_time.c
flb_time_pop_from_msgpack now returns -1 when the array has fewer than 2 elements. New helper pop_from_array and test test_pop_from_msgpack_short_array build zero-, one-, and two-element MsgPack arrays and assert rejection/success. TEST_LIST registers the new test.
Bounds checks in flb_metadata_pop_from_msgpack and pack_print_fluent_record
src/flb_pack.c
flb_metadata_pop_from_msgpack validates the top-level array element count and first nested array size before dereferencing. pack_print_fluent_record strengthens checks to require root array ≥ 2 elements and first element exactly 2 elements.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested labels

backport to v4.2.x

Suggested reviewers

  • edsiper
  • cosmo0920

Poem

🐇 Hop hop, I check the size,
Before I peek inside arrays!
Short buffers? No surprise—
Return minus one, I say.
Safe parsing wins the prize! 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 16.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly and specifically describes the main change: adding validation for msgpack record array length before indexing operations, which is the core fix across all modified files.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/internal/flb_time.c (1)

489-517: ⚡ Quick win

Add pack-path regression coverage alongside the time-path test.

This test validates the flb_time_pop_from_msgpack fix well, but the PR also hardens flb_metadata_pop_from_msgpack/pack_print_fluent_record and those paths are not regression-tested here. Adding short-array malformed-shape tests for the pack path would prevent partial regressions.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/internal/flb_time.c` around lines 489 - 517, The test function
test_pop_from_msgpack_short_array currently only validates the time-path through
pop_from_array calls but does not cover the pack-path functions
flb_metadata_pop_from_msgpack and pack_print_fluent_record that were also
hardened in the PR. Add additional test cases within this function using the
same short-array and malformed-shape msgpack structures (empty array, single
element, valid record) but call flb_metadata_pop_from_msgpack and
pack_print_fluent_record instead of pop_from_array to provide regression
coverage for the pack-path as well.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/flb_pack.c`:
- Around line 866-872: The shape validation checks for msgpack objects are in
place, but they do not guarantee successful decoding of the timestamp data,
particularly for malformed EXT payloads. The `tms` variable can remain
uninitialized if a pop helper function fails during timestamp decoding. Add
return value checks for the pop helper functions that decode the timestamp from
the array elements validated by the shape checks shown in the diff, and return
early with an error code if any pop operation fails before attempting to use the
decoded `tms` value in subsequent processing.

---

Nitpick comments:
In `@tests/internal/flb_time.c`:
- Around line 489-517: The test function test_pop_from_msgpack_short_array
currently only validates the time-path through pop_from_array calls but does not
cover the pack-path functions flb_metadata_pop_from_msgpack and
pack_print_fluent_record that were also hardened in the PR. Add additional test
cases within this function using the same short-array and malformed-shape
msgpack structures (empty array, single element, valid record) but call
flb_metadata_pop_from_msgpack and pack_print_fluent_record instead of
pop_from_array to provide regression coverage for the pack-path as well.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 540c26a9-ca81-40b6-8e79-1b457c2b0d3f

📥 Commits

Reviewing files that changed from the base of the PR and between 9bf9c1b and 91f2c19.

📒 Files selected for processing (3)
  • src/flb_pack.c
  • src/flb_time.c
  • tests/internal/flb_time.c

Comment thread src/flb_pack.c
Comment on lines +866 to 872
if (root.type != MSGPACK_OBJECT_ARRAY || root.via.array.size < 2) {
return -1;
}

o = root.via.array.ptr[0];
if (o.type != MSGPACK_OBJECT_ARRAY) {
if (o.type != MSGPACK_OBJECT_ARRAY || o.via.array.size != 2) {
return -1;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Gate record printing on parse success to avoid using invalid decode state.

The new shape checks are good, but they don’t guarantee timestamp decode success for all allowed types (notably malformed EXT payloads). If decode fails and output still proceeds, tms can be consumed in an invalid/uninitialized state. Please check pop helper return values and abort early on failure.

Proposed fix
 static int pack_print_fluent_record(size_t cnt, msgpack_unpacked result)
 {
     msgpack_object  *metadata;
     msgpack_object   root;
     msgpack_object  *obj;
     struct flb_time  tms;
     msgpack_object   o;
+    int              ret;
@@
-    flb_time_pop_from_msgpack(&tms, &result, &obj);
-    flb_metadata_pop_from_msgpack(&metadata, &result, &obj);
+    ret = flb_time_pop_from_msgpack(&tms, &result, &obj);
+    if (ret != 0) {
+        return -1;
+    }
+
+    ret = flb_metadata_pop_from_msgpack(&metadata, &result, &obj);
+    if (ret != 0) {
+        return -1;
+    }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/flb_pack.c` around lines 866 - 872, The shape validation checks for
msgpack objects are in place, but they do not guarantee successful decoding of
the timestamp data, particularly for malformed EXT payloads. The `tms` variable
can remain uninitialized if a pop helper function fails during timestamp
decoding. Add return value checks for the pop helper functions that decode the
timestamp from the array elements validated by the shape checks shown in the
diff, and return early with an error code if any pop operation fails before
attempting to use the decoded `tms` value in subsequent processing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant