Skip to content

ParentConnection.handle_message: missing destination_node_id routing + broken except_writer guard #2

@vinayakkankanwadi

Description

@vinayakkankanwadi

Two bugs in the same function, ParentConnection.handle_message in sapient_apex_server/connection.py, lines 567-574 (4.2.0 / current master). Both show up when Apex sits as a proxy between an Edge Node and an upstream Fusion Node, and the FN sends acks back through Apex.

1. No destination_node_id lookup

DMMConnection.handle_message (around line 492) looks up msg.parsed.destination_node_id in registered_sensors and routes to that child's writer. ParentConnection.handle_message doesn't. It only fans to dmm_writers and to other parents.

So when something upstream sends a downstream-bound message with destination_id set (registration_ack, task_ack, alert_ack), it never reaches the Child it was meant for. The Child times out waiting for its ack.

The workaround sendRegistrationAck: true (Apex synthesises the ack locally) hides this, but it means Apex isn't transparent. The Child gets Apex's middlewareId in the ack, not the real upstream node's ID.

2. except_writer compared to wrong type

The final line of the function:

self.shared_data.send_to_parent(msg, high_level=False, except_writer=self.writer)

self.writer is a bare WriterType function. But SharedData.send_to_parent iterates parent_all_writers: List[ParentWriter] (dataclass) and does:

for writer in writers:
    if writer != except_writer:
        writer.writer(msg, msg.sapient_version)

A dataclass instance is never equal to a bare function, so the "don't reflect to sender" guard always passes. The message gets sent back out to the same Parent it came in on.

If the upstream Parent is strict about which message types it accepts on which socket direction (some implementations are), the reflected ack triggers an "unsupported message type" error on that side. In milder implementations it's a no-op, just wasted bytes.

Fix (one diff covers both)

def handle_message(self, msg: MessageRecord, generator: IdGenerator):
    msg.updated_data_bytes = msg.received.data_bytes
    if msg.error is None:
        msg.forwarded_count = 0

        dest_id = msg.parsed.destination_node_id
        if dest_id is not None and dest_id in self.shared_data.registered_sensors:
            node_connection = self.shared_data.registered_sensors[dest_id]
            node_connection.writer(msg, msg.sapient_version)
            msg.forwarded_count += 1

        for writer in self.shared_data.dmm_writers:
            writer(msg, msg.sapient_version)
        msg.forwarded_count += len(self.shared_data.dmm_writers)

        self.shared_data.send_to_parent(
            msg, high_level=False, except_writer=self.parent_writer
        )

    _send_error_reply_if_necessary(
        self.writer, "Parent", msg, self.message_format,
        self.shared_data.middleware_node_id, None,
    )

The destination_id block is a no-op when dest_id is None or unknown, so it doesn't change behaviour for high-level broadcast traffic. The except_writer change just makes the existing guard work. self.parent_writer is already the dataclass that gets pushed into parent_all_writers in __init__.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions