Skip to content

gh-151830: Raise ValueError instead of SystemError for invalid code objects in marshal.loads()#151831

Open
tonghuaroot wants to merge 3 commits into
python:mainfrom
tonghuaroot:fix-marshal-invalid-code-valueerror
Open

gh-151830: Raise ValueError instead of SystemError for invalid code objects in marshal.loads()#151831
tonghuaroot wants to merge 3 commits into
python:mainfrom
tonghuaroot:fix-marshal-invalid-code-valueerror

Conversation

@tonghuaroot

@tonghuaroot tonghuaroot commented Jun 21, 2026

Copy link
Copy Markdown
Contributor

marshal.loads() already raises ValueError with a "bad marshal data (...)" message for malformed input, but a code object whose internal fields
are inconsistent (for example a localsplusnames tuple whose length differs
from the localspluskinds bytes) makes _PyCode_Validate() call
PyErr_BadInternalCall(), which surfaces as
SystemError: bad argument to internal function. This converts that one
case to ValueError: bad marshal data (invalid code object), so the
code-object path is consistent with marshal's existing bad-marshal-data
errors. The change is narrowly scoped to the SystemError from
PyErr_BadInternalCall() and leaves the other _PyCode_Validate() failures
(which already raise ValueError/OverflowError) unchanged.

A regression test was added to Lib/test/test_marshal.py and a NEWS entry
was added under Misc/NEWS.d/.

@sobolevn sobolevn left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(not a full review, just several comments about the test)

Comment thread Lib/test/test_marshal.py Outdated
kinds = blob[off + 5:off + 5 + n]
break
else:
self.fail("could not locate localspluskinds in marshal data")

@sobolevn sobolevn Jun 21, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this actually happen? If not, please use self.assertEqual(kinds, ...)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — replaced the self.fail with assertEqual(kinds, ...), using a no-arg function so the locals are all plain CO_FAST_LOCAL and the expected kinds is deterministic.

Comment thread Lib/test/test_marshal.py Outdated
# Rewrite it with one fewer kind byte than there are names.
corrupt = (blob[:off] + b's' + struct.pack('<i', n - 1)
+ kinds[:n - 1] + blob[off + 5 + n:])
with self.assertRaises(ValueError):

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
with self.assertRaises(ValueError):
with self.assertRaisesRegex(ValueError, 'bad marshal data, \(invalid code object\)'):

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Applied. I dropped the comma so the pattern matches the actual message — the existing bad marshal data (...) convention in marshal.c.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants