Skip to content

☀️ Fix memory safety vulnerabilities in high-level and VFD code#6140

Open
brtnfld wants to merge 1 commit into
HDFGroup:developfrom
brtnfld:snyk
Open

☀️ Fix memory safety vulnerabilities in high-level and VFD code#6140
brtnfld wants to merge 1 commit into
HDFGroup:developfrom
brtnfld:snyk

Conversation

@brtnfld
Copy link
Copy Markdown
Collaborator

@brtnfld brtnfld commented Jan 2, 2026

Address multiple CWE-415 (double-free), CWE-416 (use-after-free), and CWE-122 (buffer overflow) vulnerabilities identified by static analysis:

  • hl/src/H5DS.c: Fix double-free in H5DSis_scale() by setting buf to NULL after free and adding NULL check in cleanup path

  • hl/src/H5LT.c: Fix multiple memory issues:

    • Set myinput to NULL after free in H5LTtext_to_dtype()
    • Add NULL check in realloc_and_append() to prevent use-after-free
    • Refactor duplicated stmp handling by creating H5LT_append_dtype_super_text() helper function, eliminating ~50 lines of repeated code across 4 case blocks
  • hl/src/H5TB.c: Replace unsafe strcpy() with strncpy() in H5TBget_field_info() using HLTB_MAX_FIELD_LEN constant to prevent buffer overflow

  • hl/src/H5TBpublic.h: Document buffer size requirements for field_names parameter

  • src/H5FDstdio.c: Fix inconsistent resource cleanup in H5FD_stdio_open() by using file->fp instead of f throughout error paths

  • src/H5VLnative.c: Add assert checks for obj and file parameters in H5VL_native_get_file_struct() following internal API conventions

SAFE project work.


Important

Fixes memory safety vulnerabilities in HDF5 codebase, addressing double-free, use-after-free, and buffer overflow issues across multiple files.

  • Memory Safety Fixes:
    • H5DS.c: Fix double-free in H5DSis_scale() by setting buf to NULL after free and adding NULL check in cleanup.
    • H5LT.c: Set myinput to NULL after free in H5LTtext_to_dtype(), add NULL check in realloc_and_append(), refactor H5LT_append_dtype_super_text() to reduce code duplication.
    • H5TB.c: Replace strcpy() with strncpy() in H5TBget_field_info() to prevent buffer overflow.
  • Documentation:
    • H5TBpublic.h: Document buffer size requirements for field_names parameter.
  • Resource Management:
    • H5FDstdio.c: Use file->fp consistently in H5FD_stdio_open() for error paths.
  • Assertions:
    • H5VLnative.c: Add assert checks for obj and file parameters in H5VL_native_get_file_struct().

This description was created by Ellipsis for 7b22833. You can customize this summary. It will automatically update as commits are pushed.

@brtnfld brtnfld added the Component - C Library Core C library issues (usually in the src directory) label Jan 2, 2026
@brtnfld brtnfld added the HDFG-internal Internally coded for use by the HDF Group label Jan 2, 2026
@github-project-automation github-project-automation Bot moved this to To be triaged in HDF5 - TRIAGE & TRACK Jan 2, 2026
Comment thread hl/src/H5DS.c Fixed
Comment thread hl/src/H5DS.c Outdated
Comment thread hl/src/H5TBpublic.h Outdated
Comment thread src/H5FDstdio.c

/* Use the value in the property list */
if (H5Pget_file_locking(fapl_id, &unused, &file->ignore_disabled_file_locks) < 0) {
fclose(file->fp);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What was the issue with these close calls? file->fp should be the same as f at this point

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Snyk flags it for clarity on resource ownership. The concern is that it's not explicit which pointer "owns" the resource after the assignment. Once you've assigned file->fp = f, the FILE* is conceptually owned by the file structure, and using file->fp in cleanup makes this ownership clear.

Comment thread hl/src/H5LT.c Outdated
Comment thread hl/src/H5TBprivate.h Outdated
Comment thread hl/src/H5LT.c
Comment thread hl/src/H5TBpublic.h
Comment thread hl/src/H5TBpublic.h Outdated
* by the identifier \p loc_id.
*
* \note When retrieving field names, callers should allocate each field_names[i]
* buffer with sufficient space for the expected field name length.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"for the expected field name."?

Comment thread hl/test/test_table.c Outdated
}
}

/* release */
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Release? (for consistency :)

Comment thread hl/test/test_table.c Outdated
memset(field_1000, 'D', 1000);
field_1000[1000] = '\0';

const char *boundary_field_names[5] = {field_253, field_254, field_255, field_1000, "ShortName"};
Copy link
Copy Markdown
Collaborator

@bmribler bmribler Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should "ShortName" be defined as a constant? Same for file names, i.e., "test_boundary.h5".
Definitely "5" should be a constant.

@github-project-automation github-project-automation Bot moved this from To be triaged to In progress in HDF5 - TRIAGE & TRACK Apr 17, 2026
brtnfld added a commit to brtnfld/hdf5 that referenced this pull request Apr 17, 2026
- H5LT.c: Guard against super_len == 0 after successful H5LTdtype_to_text probe call
- H5TBpublic.h: Fix 'size' -> 'sizes' in field_sizes param doc; drop 'length'
  from field name note for clarity
- test_table.c: Capitalize 'release' comments; replace magic literals in
  boundary test with BOUNDARY_NFIELDS, BOUNDARY_NRECORDS, BOUNDARY_FILE, and
  BOUNDARY_SHORT_NAME constants
@brtnfld
Copy link
Copy Markdown
Collaborator Author

brtnfld commented Apr 17, 2026

@bmribler

Added if (super_len == 0) return NULL; guard after the probe call
changed size of the fields" to "sizes of the fields"
H5TBpublic.h, changed "for the expected field name length." to "for the expected field name."
test_table.c: changed /* release / to / Release */ (both instances)
Defined BOUNDARY_NFIELDS, BOUNDARY_NRECORDS, BOUNDARY_FILE, BOUNDARY_SHORT_NAME and replaced all magic literals

@brtnfld brtnfld requested a review from epourmal as a code owner April 17, 2026 13:47
@brtnfld brtnfld requested a review from bmribler April 17, 2026 13:49
@brtnfld brtnfld requested a review from jhendersonHDF April 17, 2026 13:49
brtnfld added a commit to brtnfld/hdf5 that referenced this pull request Apr 17, 2026
…FIELD_LEN public

- H5TBget_field_info: replace unbounded strcpy with bounds-checked memcpy
  guarded by HLTB_MAX_FIELD_LEN; names >= 255 chars truncated safely (CWE-120)
- Move HLTB_MAX_FIELD_LEN from H5TBprivate.h to H5TBpublic.h with docs so
  callers can correctly size their field_names[] buffers
- test_table.c: add backward-compat (32-byte buffer) and boundary-length
  (253/254/255/1000-char names) tests for H5TBget_field_info
- CMakeTests.cmake: add test_boundary.h5 to cleanup list
- CHANGELOG.md: add H5TBget_field_info security fix and HLTB_MAX_FIELD_LEN
  entries; HDFGroup#6140's H5DSis_scale buf=NULL entry is superseded by our broader
  cleanup already on this branch
bmribler
bmribler previously approved these changes Apr 20, 2026
@fortnern fortnern self-assigned this Apr 21, 2026
Comment thread hl/src/H5LT.c Outdated
size_t size_str_to_add, size_str;

/* Precondition: buf must be allocated (verified in debug builds) */
assert(buf != NULL && "Buffer must not be NULL");
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If _no_user_buf is false and str_to_add is NULL, buf could be NULL since it's never touched. Based on the added header comment it seems like this may have been intentional behavior as well, but this function never "ensures space is available" in that case, it just returns buf (or NULL).

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's still a difference between debug and release builds here. debug builds will crash on the assertion failure, whereas release builds will return NULL

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed

Comment thread hl/src/H5TB.c Outdated

name_len = strlen(member_name);

/* Only enforce HLTB_MAX_FIELD_LEN limit when truncation is necessary.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this comment really makes sense, as it doesn't seem to have anything to do with backward compatibility. Regardless of whether memcpy() copies the whole string or HLTB_MAX_FIELD_LEN - 1 bytes, the NUL terminator needs to come at the end of the string either way, so the behavior should be the same. It's just that copying less than HLTB_MAX_FIELD_LEN - 1 bytes could be more efficient.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, the 'backward compatibility' rationale was wrong and has been removed. The comment now explains this is simply an efficiency optimization (copy only name_len + 1 bytes when the name fits, rather than the full HLTB_MAX_FIELD_LEN - 1). The H5TB.c changes have already landed in develop via PR #6371,

brtnfld added a commit to brtnfld/hdf5 that referenced this pull request May 20, 2026
…FIELD_LEN public

- H5TBget_field_info: replace unbounded strcpy with bounds-checked memcpy
  guarded by HLTB_MAX_FIELD_LEN; names >= 255 chars truncated safely (CWE-120)
- Move HLTB_MAX_FIELD_LEN from H5TBprivate.h to H5TBpublic.h with docs so
  callers can correctly size their field_names[] buffers
- test_table.c: add backward-compat (32-byte buffer) and boundary-length
  (253/254/255/1000-char names) tests for H5TBget_field_info
- CMakeTests.cmake: add test_boundary.h5 to cleanup list
- CHANGELOG.md: add H5TBget_field_info security fix and HLTB_MAX_FIELD_LEN
  entries; HDFGroup#6140's H5DSis_scale buf=NULL entry is superseded by our broader
  cleanup already on this branch
brtnfld added a commit to brtnfld/hdf5 that referenced this pull request May 22, 2026
H5LT.c:
- Push H5E_NOSPACE on strdup OOM in H5LTtext_to_dtype so the HDF5
  error stack is populated on allocation failure
- Rename static helper H5LT_append_dtype_super_text -> append_dtype_super_text
  to match the TU-local naming convention (no H5LT_ prefix for file-static
  helpers); update all four call sites
- Add comment on super_len == 0 guard explaining calloc(0,...) avoidance
- Push H5E_NOSPACE from calloc failure inside the helper

H5TB.c (H5TBget_field_info):
- Clarify else-branch comment: the two-branch structure is an efficiency
  optimization (copy only name_len+1 bytes rather than HLTB_MAX_FIELD_LEN-1),
  not a backward-compatibility concern (addresses jhendersonHDF review note)
@brtnfld brtnfld requested review from bmribler and jhendersonHDF May 22, 2026 17:05
H5FDstdio (src/H5FDstdio.c):
- Fix five error paths in H5FD_stdio_open() that called fclose(f) after
  free(file): the correct resource to close is file->fp.  Reorder to
  fclose before free to match standard cleanup idiom and prevent file
  descriptor leaks under memory-pressure failures.

H5VLnative (src/H5VLnative.c):
- Add assert(obj) and assert(file) to H5VL_native_get_file_struct() to
  catch NULL-pointer programming errors early in debug builds.

H5LT (hl/src/H5LT.c):
- Add NULL check after strdup() in H5LTtext_to_dtype(); push H5E_NOSPACE
  so the HDF5 error stack is populated on OOM.
- Fix H5Tclose(super) leak in H5T_ENUM, H5T_VLEN, H5T_ARRAY, H5T_COMPLEX
  branches of H5LT_dtype_to_text(): super was only closed on the success
  path; any failure between H5Tget_super and the final realloc_and_append
  leaked the type ID.  Super is now closed immediately after use.
- Refactor the repeated "get super-type text and append" pattern (four
  near-identical ~15-line blocks) into static helper append_dtype_super_text().
  Pushes H5E_NOSPACE on internal calloc failure.
- Rewrite realloc_and_append() doc comment to document the asymmetric
  ownership contract (callee frees buf on realloc failure in
  library-managed mode; no free in user-buf mode).
- Move the buf == NULL guard to before the _no_user_buf branch so both
  modes short-circuit identically.

H5TB (hl/src/H5TB.c):
- Clarify H5TBget_field_info() else-branch comment: the two-branch copy
  structure is an efficiency optimization (copy name_len+1 bytes rather
  than HLTB_MAX_FIELD_LEN-1), not a backward-compatibility concern.

CHANGELOG (release_docs/CHANGELOG.md):
- Add entries for the stdio VFD leak fix, VOL NULL checks, and H5LT
  memory-safety improvements.
@brtnfld
Copy link
Copy Markdown
Collaborator Author

brtnfld commented May 22, 2026

☀️ SHINES project-related.

@brtnfld brtnfld changed the title Fix memory safety vulnerabilities in high-level and VFD code ☀️ Fix memory safety vulnerabilities in high-level and VFD code May 22, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Component - C Library Core C library issues (usually in the src directory) HDFG-internal Internally coded for use by the HDF Group

Projects

Status: In progress

Development

Successfully merging this pull request may close these issues.

7 participants