Symex: fork isolation, concrete-detection hardening, bulk-memory hook surface, explore_next#590
Merged
Merged
Conversation
… surface, explore_next
- _fork_child was not propagating findings / _region_at_suspension /
_lazy_regions_used to child paths; all three now copied on fork,
matching Path.clone(). Same fix in _init_seeded_path.
- Concrete writes in _make_default_mem_write now land in both shared
ConcreteMemory and the per-path shadow (as BitVecVal bytes), giving
copy-on-write isolation across forks. _shadow_range_is_concrete
distinguishes fully-concrete shadow ranges (unwrap to int) from
symbolic ones (return z3 expr), preserving analyst-visible shape.
- _concrete_int / _concrete_bool helpers call z3.simplify so a
structurally non-trivial but provably-constant z3 expression
(e.g. Concat(BVV, BVV) + 0 from a multi-byte shadow load) does
not force a spurious MemAddrSuspension or branch fork. Used in
extract_addr, _default_branch, is_true, and resolve_branch.
- All "little" / "big" string literals and str(engine.endian)
wrappers replaced with Endian.LITTLE / Endian.BIG.
- _memop_name derives bulk-op names from mx.ir.MemOp(int(op)).name
so the mapping stays in sync with the C++ enum automatically.
- New intercept.bulk_memory(op=...) hook axis. The C++ substrate
calls mem_bulk_op; Python dispatches to registered handlers then
to per-op default decomposers that re-fire memory_read /
memory_write per byte. Supported: memcpy/memmove/memset/bzero/
memcmp/memchr/strlen/strnlen/strcmp/strncmp/strchr/strrchr/
strcpy/stpcpy/strncpy/stpncpy/strcat/strncat.
- engine.explore(..., from_path=P) clones the prior path's per-path
state into a fresh initial path for the new target function.
path.explore_next(target) is the sugar. Tests in test_explore_next.
- Phase 15 no-default switch test corrected: a switch with no
default: label still has an implicit default IR edge (three
successors); expected return set is {10, 20, 0}, not {10, 20}.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
kumarak
approved these changes
May 6, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fork isolation —
_fork_childwas not propagating three per-path fields (findings,_region_at_suspension,_lazy_regions_used) to child paths; sibling paths shared the parent's mutable state. Fixed by propagating all three fields on fork, matching whatPath.clone()already did. Also fixed_init_seeded_pathfor the same fields.Per-path CoW for concrete writes —
_make_default_mem_writepreviously wrote concrete values only to sharedConcreteMemory, so any fork that wrote through the default handler would have its write visible to sibling paths. Concrete writes now also place per-byteBitVecValentries in the per-path shadow, giving copy-on-write isolation. Shadow reads use a new_shadow_range_is_concretehelper to distinguish fully-concrete shadow ranges (unwrap toint) from symbolic ones (return z3 expression), preserving analyst-visible symbolic shape.Concrete-detection hardening — narrow
isinstance(v, int)checks at terminal-use sites (address coercion, branch resolution,is_true) replaced with_concrete_int/_concrete_boolhelpers that callz3.simplifyso a structurally non-trivial but provably-constant z3 expression (e.g.Concat(BVV, BVV) + 0from a multi-byte shadow load) does not force a spuriousMemAddrSuspensionor branch fork.Endian enum — all
"little"/"big"string literals andstr(self._engine.endian)cargo-cult wrappers replaced withEndian.LITTLE/Endian.BIGfrom_types.py._memop_namehelper — bulk-op name derivation now drives offmx.ir.MemOp(int(op)).nameso the name list stays in sync with the C++ enum automatically instead of a hand-maintained dict.Bulk-memory hook surface — new
intercept.bulk_memory(op="memcpy")(and other op names) hook axis. The C++ substrate callsmem_bulk_op; Python dispatches to any registered analyst handlers and then to a per-op default decomposer that re-fires the existingmemory_read/memory_writehooks per byte. Supported ops:memcpy,memmove,memset,bzero,memcmp,memchr,strlen,strnlen,strcmp,strncmp,strchr,strrchr,strcpy,stpcpy,strncpy,stpncpy,strcat,strncat.explore_next/from_path=—engine.explore(..., from_path=P)clones the prior path's per-path state into a fresh initial path for the new target function.path.explore_next(target)is the sugar.Phase 15 test correction — a C
switchwith nodefault:label still has an implicit default IR edge (three successors total); the test assertion for that case was wrong. Corrected to expect three completed paths with return values{10, 20, 0}.🤖 Generated with Claude Code