trivial fixes, hardening and refactor#841

Open
haitaohuang wants to merge 11 commits into
intel:mainfrom
haitaohuang:upstream/pr1-trivial-fixes

@haitaohuang
Contributor

No description provided.

haitaohuang and others added 11 commits May 21, 2026 16:55
The VMM signals migration session cancellation by writing data_status
byte[0]=0x02, byte[1]=0x03 (0x302) to the shared buffer.

Previously, 0x302 was treated as a generic TdVmcallErr and reported as
SecureSessionError (6). Propagate it instead and report it as VmmCanceled
(10) in ReportStatus.
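
A minimal sketch of the status classification described above, not the code in this PR. The `PollError` type, the function name, and the choice to treat 0 as success are assumptions made for illustration; only the 0x302 value and its byte layout come from the commit message.

```rust
/// Hypothetical error type for this sketch; the real MigTD names may differ.
#[derive(Debug, PartialEq, Eq)]
pub enum PollError {
    /// The VMM canceled the migration session (data_status == 0x302).
    VmmCanceled,
    /// The buffer was too short to hold a status.
    Truncated,
    /// Any other non-success status, kept for logging.
    Other(u16),
}

/// Cancellation code: byte[0] = 0x02, byte[1] = 0x03, read little-endian.
const DATA_STATUS_CANCELED: u16 = 0x0302;

/// Classify the first two bytes of data_status.
/// Treating 0 as success is an assumption of this sketch.
pub fn classify_data_status(buf: &[u8]) -> Result<(), PollError> {
    let bytes = buf.get(..2).ok_or(PollError::Truncated)?;
    match u16::from_le_bytes([bytes[0], bytes[1]]) {
        0 => Ok(()),
        DATA_STATUS_CANCELED => Err(PollError::VmmCanceled),
        other => Err(PollError::Other(other)),
    }
}
```

Keeping the cancellation as its own variant is what lets a caller map it to a distinct report status instead of a generic failure.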

In the SPDM path, MigtdTransport's send/receive implement the
SpdmDeviceIo trait which cannot represent this error in its return types
(SpdmResult and Result<usize, usize>). Log a warning when
ConnectionAborted is detected so the VMM cancellation is visible in
logs, even though the typed error cannot propagate through the SPDM
library's trait boundary.

Signed-off-by: Haitao Huang <haitaohuang@microsoft.com>
…path

The receive path used an unneeded two-step indirection: the first
vmcall_raw_transport_dequeue issued a VMCALL, copied data from
shared memory into CONNECTION_PKT_QUEUES via recv_packet(), and
returned a placeholder vec. The second dequeue then popped the
real data from the queue. This design was probably copied from another
implementation with a packet-based stream, which does not apply to
vmcall-raw send/receive.

Simplify by having vmcall_service_migtd_receive return the actual
data directly. It already copies from SharedMemory (untrusted,
host-accessible) into a private Vec via .to_vec(), preserving
the security boundary.

Remove dead code: push_stream_queues, pop_stream_queues,
recv_packet, CONNECTION_PKT_QUEUES, add_stream_to_connection_map,
remove_stream_from_connection_map, and related imports.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Haitao Huang <haitaohuang@microsoft.com>
Extract the duplicated poll logic from vmcall_service_migtd_send
and vmcall_service_migtd_receive into poll_vmcall_completion().

The shared function handles: interrupt flag check, data_status
parsing, flag consumption after final status, and success/error
determination. Each caller only adds its operation-specific logic
(send returns data_length, receive copies payload to private Vec).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Haitao Huang <haitaohuang@microsoft.com>
A hostile VMM can complete a tdvmcall_migtd_receive with
status=VMM_SUCCESS but data_length=0. The previous implementation let
this propagate as Ok(0) through VmcallRaw::recv, which would stall the
caller's read loop indefinitely (no forward progress, no error).

Reject zero-length success inside the receive poll_fn closure with
VmcallRawError::Malformed, which surfaces to upstream callers as a
network error instead of an infinite spin.
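
The zero-length check can be sketched roughly as follows. This is an illustration under assumptions, not the closure from the PR: `take_received_payload` is a hypothetical helper, `VmcallRawError` is reduced to the one variant named above, and the extra bounds check on the reported length is an addition of this sketch.

```rust
/// Stand-in for the error type named in the commit message,
/// reduced to the single variant this sketch needs.
#[derive(Debug, PartialEq, Eq)]
pub enum VmcallRawError {
    Malformed,
}

/// Copy a completed receive out of the shared buffer into private memory.
/// `shared` stands in for the host-accessible payload area and
/// `data_length` for the length the VMM reported.
pub fn take_received_payload(
    shared: &[u8],
    data_length: u32,
) -> Result<Vec<u8>, VmcallRawError> {
    let len = data_length as usize;
    // A "successful" zero-length receive gives the caller no forward
    // progress, so it is rejected rather than returned as Ok(empty).
    if len == 0 {
        return Err(VmcallRawError::Malformed);
    }
    // Assumption of this sketch: a length beyond the buffer is also
    // treated as malformed instead of panicking on the slice.
    let payload = shared.get(..len).ok_or(VmcallRawError::Malformed)?;
    Ok(payload.to_vec())
}
```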

Also stop relying on the post-completion data_length on the send path
(per spec, data_length is owned by MigTD when status=0 and is not a
meaningful VMM-reported value after completion). The previously
returned value was already discarded by VmcallRaw::send, so this is a
no-op in behavior but makes the spec contract explicit. Document this
in the poll_vmcall_completion doc comment.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Haitao Huang <haitaohuang@microsoft.com>
Public connection state variant State::Establised was misspelled.
Rename to State::Established across vmcall_raw and vsock. No behavior
change.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Haitao Huang <haitaohuang@microsoft.com>
When the migration context has been destroyed (no entry in
VMCALL_MIG_CONTEXT_FLAGS for the request id), any interrupt injection
from the VMM should be rejected. Previously the else branch discarded
the error with `let _ = ...`, allowing execution to fall through and
treat DMA buffer contents as a valid VMM response.

Return an error immediately so the caller rejects the stale injection.
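
As a rough illustration of the early return, assuming the flags live in a map keyed by request id. The real type of VMCALL_MIG_CONTEXT_FLAGS is not shown in this commit, so the `BTreeMap<u64, bool>`, `note_interrupt`, and `InterruptError` below are hypothetical.

```rust
use std::collections::BTreeMap;

/// Hypothetical error for an interrupt that targets a destroyed context.
#[derive(Debug, PartialEq, Eq)]
pub enum InterruptError {
    NoContext,
}

/// Record an interrupt for `request_id`.
/// If no context exists for that id, return an error so the caller
/// never goes on to read the DMA buffer as a valid response.
pub fn note_interrupt(
    flags: &mut BTreeMap<u64, bool>,
    request_id: u64,
) -> Result<(), InterruptError> {
    match flags.get_mut(&request_id) {
        Some(flag) => {
            *flag = true;
            Ok(())
        }
        None => Err(InterruptError::NoContext),
    }
}
```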

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Haitao Huang <haitaohuang@microsoft.com>
align_up() computed `(size & !(PAGE_SIZE - 1)) + PAGE_SIZE` for
non-page-aligned input, which silently wraps to 0 near usize::MAX, and
vmcall_raw_transport_enqueue() blindly downcast buf.len() to u32 and
added the 12-byte header without overflow checks. A pathological caller
could end up allocating a 0-page SharedMemory and writing past it.

Switch align_up() to checked_add and propagate None to the caller as
VmcallRawError::Illegal; size the data buffer via checked_add(12) and
reject buf.len() that does not fit a u32 data_length. Add unit tests
for the boundary cases.
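
A sketch of the checked arithmetic, assuming a 4 KiB PAGE_SIZE and returning Option instead of VmcallRawError::Illegal to keep the example self-contained. `enqueue_sizes` is a hypothetical helper, not a function from the PR.

```rust
/// Assumed page size for this sketch.
const PAGE_SIZE: usize = 0x1000;
/// Header length stated in the commit message.
const HEADER_LEN: usize = 12;

/// Round `size` up to a multiple of PAGE_SIZE, or None on overflow.
pub fn align_up(size: usize) -> Option<usize> {
    if (size & (PAGE_SIZE - 1)) == 0 {
        Some(size)
    } else {
        (size & !(PAGE_SIZE - 1)).checked_add(PAGE_SIZE)
    }
}

/// Compute (data_length, pages to allocate) for a payload.
/// Returns None if the length does not fit a u32 or the sizes overflow.
pub fn enqueue_sizes(payload_len: usize) -> Option<(u32, usize)> {
    if payload_len > u32::MAX as usize {
        return None;
    }
    let data_length = payload_len as u32;
    let total = payload_len.checked_add(HEADER_LEN)?;
    let pages = align_up(total)? / PAGE_SIZE;
    Some((data_length, pages))
}
```

With these definitions `align_up(usize::MAX)` yields None rather than wrapping to 0, so the caller cannot end up with a zero-page allocation for a non-empty buffer.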

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Haitao Huang <haitaohuang@microsoft.com>
parse_events() indexed event_data[1..1 + desc_size] for
EV_EFI_PLATFORM_FIRMWARE_BLOB2 and event_data[..4] for EV_EVENT_TAG
without first validating length. A truncated or malformed event payload
caused a panic instead of being skipped.

Replace the raw slice indexing with .get(...) and propagate None, so a
short event is dropped rather than aborting the parse. Add unit tests
for both code paths.
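
The .get(...) pattern looks roughly like this. Both helper names are hypothetical, and two details are assumptions of the sketch rather than facts from the commit: that desc_size is read from byte 0 of the event data, and that the tag id is little-endian.

```rust
/// Descriptor bytes of a firmware-blob event, or None if the payload
/// is too short. Assumes byte 0 holds the descriptor length.
pub fn blob2_description(event_data: &[u8]) -> Option<&[u8]> {
    let desc_size = *event_data.first()? as usize;
    // .get() returns None instead of panicking on a short slice.
    event_data.get(1..1 + desc_size)
}

/// 4-byte tag id of a tagged event, or None if the payload is too short.
/// Little-endian decoding is an assumption of this sketch.
pub fn event_tag_id(event_data: &[u8]) -> Option<u32> {
    let bytes = event_data.get(..4)?;
    Some(u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]))
}
```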

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Haitao Huang <haitaohuang@microsoft.com>
HelloPacketPayload is serialized via as_bytes() (a raw pointer cast
over the struct) and parsed via read_from_bytes() with hard-coded
offsets, both of which assume C layout. Without #[repr(C)] the compiler
is free to reorder fields, breaking the wire format.

Annotate the struct to match the sibling PreSessionMessage.
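
To illustrate what #[repr(C)] guarantees, here is a sketch with a made-up struct. `DemoPayload` and its fields are hypothetical; the real HelloPacketPayload layout is not shown in this commit.

```rust
/// Hypothetical payload used only to demonstrate layout.
/// With #[repr(C)] the fields stay in declaration order, with C padding.
#[repr(C)]
pub struct DemoPayload {
    pub version: u8,
    pub length: u32,
    pub flags: u16,
}

/// Byte offset of each field from the start of the struct.
pub fn field_offsets(p: &DemoPayload) -> (usize, usize, usize) {
    let base = p as *const DemoPayload as usize;
    (
        &p.version as *const u8 as usize - base,
        &p.length as *const u32 as usize - base,
        &p.flags as *const u16 as usize - base,
    )
}
```

On targets where u32 is 4-byte aligned, the offsets are fixed at 0, 4, and 8, so hard-coded parse offsets and a raw byte view agree. Without the attribute, the compiler may choose a different field order.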

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Haitao Huang <haitaohuang@microsoft.com>
…om_quote

find(PEM_CERT_END) searched from offset 0 of the lossy-decoded quote
string, so if any byte sequence matching the END marker appeared before
the BEGIN marker (or BEGIN was absent), end_index < start_index and
the subsequent slice operation panicked.

Search for END strictly after the end of BEGIN, and return InvalidQuote
on malformed input. Real hardware quotes never trigger the bug, but the
parser should not panic on hostile input.
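
The marker search can be sketched as below. `first_pem_cert` is a hypothetical function that returns only the first certificate span, which may differ from what the real parser extracts. The marker strings are the standard PEM certificate delimiters, and `QuoteError` is a stand-in for the real error type.

```rust
const PEM_CERT_BEGIN: &str = "-----BEGIN CERTIFICATE-----";
const PEM_CERT_END: &str = "-----END CERTIFICATE-----";

/// Stand-in error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum QuoteError {
    InvalidQuote,
}

/// Return the text from the first BEGIN marker through the first END
/// marker that follows it. Any END marker before BEGIN is ignored.
pub fn first_pem_cert(text: &str) -> Result<&str, QuoteError> {
    let start = text.find(PEM_CERT_BEGIN).ok_or(QuoteError::InvalidQuote)?;
    let body_start = start + PEM_CERT_BEGIN.len();
    // Search only the tail after BEGIN, so end can never precede start.
    let rel_end = text[body_start..]
        .find(PEM_CERT_END)
        .ok_or(QuoteError::InvalidQuote)?;
    let end = body_start + rel_end + PEM_CERT_END.len();
    Ok(&text[start..end])
}
```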

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Haitao Huang <haitaohuang@microsoft.com>
get_event_log() returns &raw[..size + 1] as a workaround for an
upstream bug in cc_measurement::log::CcEvents::next() (uses `<` instead
of `<=`), which silently drops the last event when the slice ends
exactly on an event boundary. In MigTD that last event is the tagged
policy event, so removing the +1 has previously broken AzCVMEmu
integration tests.

Document the workaround in code next to the slice expression and add
last_event_visible_only_with_trailing_padding as a regression test
mirroring the upstream reproducer
(confidential-containers/td-shim#848). The
test fails loudly both if anyone re-applies the "cleanup" and once the
upstream iterator is fixed and the workaround can be removed.
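
The off-by-one can be shown with a toy model. This is not the cc_measurement code and not the regression test from this commit: the one-byte length prefix, the zero terminator, and `count_events` are all invented for the illustration.

```rust
/// Toy event log: each event is one length byte followed by that many
/// payload bytes; a zero length byte ends the log.
/// `strict` selects the `<` bound check, otherwise `<=` is used.
pub fn count_events(log: &[u8], strict: bool) -> usize {
    let mut offset = 0;
    let mut count = 0;
    while let Some(&len) = log.get(offset) {
        if len == 0 {
            break;
        }
        let end = offset + 1 + len as usize;
        let fits = if strict { end < log.len() } else { end <= log.len() };
        if !fits {
            break;
        }
        count += 1;
        offset = end;
    }
    count
}
```

In this model a log that ends exactly on an event boundary loses its last event under the strict check, and one byte of trailing padding makes it visible again, which mirrors the role of the `+ 1` in get_event_log().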

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Haitao Huang <haitaohuang@microsoft.com>
@haitaohuang haitaohuang requested review from jyao1 and sgrams as code owners May 21, 2026 18:44
@jyao1
Contributor

jyao1 commented May 22, 2026

Could you please split and describe each specific issue in a standalone PR?

It is hard for me to review all different topics together.
