Improve efficiency of DatastoreIndexingRouter#source_event_versions_in_index: leverage routing/timestamp #1077

@myronmarston

This method searches all shards and indices:

# Note: we intentionally search the entire index expression, not just an individual index based on a rollover timestamp.
# And we intentionally do NOT provide a routing value--we want to find the version, no matter what shard the document
# lives on.
#
# Since this `source_event_versions_in_index` is for handling malformed events, its possible that the
# rollover timestamp or routing value on the operation is wrong and that the correct document lives in
# a different shard and index than what the operation is targeted at. We want to search across all of them
# so that we will find it, regardless of where it lives.

This is quite inefficient, particularly when you have many malformed events--it can cause significant load problems on a cluster.

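The rationale given in that comment is speculative, and I suspect we've never actually had malformed documents with the wrong routing value or index rollover timestamp. We should make the lookup more efficient by using the routing value and rollover timestamp from the operation to narrow the search.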
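
One possible shape for this, as a rough sketch rather than a proposal for the actual implementation: search the single index and routing value the operation targets first, and only fall back to the broad search for ids that were not found. Every name below (`narrow_search_params`, `broad_search_params`, `find_versions`, the `search` callable, and the example index names) is hypothetical and not taken from the codebase. The only assumption about the datastore is that its search API accepts `index` and `routing` parameters and a `version: true` body option.

```ruby
# Hypothetical sketch: none of these method names come from the real codebase.

# Narrowed search: one concrete index and one routing value (so a single shard).
def narrow_search_params(index_name:, routing:, doc_ids:)
  {
    index: index_name,
    routing: routing,
    body: {query: {ids: {values: doc_ids}}, version: true, _source: false}
  }
end

# Current behavior: the entire index expression and no routing (all shards).
def broad_search_params(index_expression:, doc_ids:)
  {
    index: index_expression,
    body: {query: {ids: {values: doc_ids}}, version: true, _source: false}
  }
end

# Tries the narrow search first; falls back to the broad search only for ids
# that the narrow search did not find.
#
# `search` is any callable that takes a params hash and returns a hash of
# doc id => version (an assumed interface, for illustration only).
def find_versions(search:, index_name:, index_expression:, routing:, doc_ids:)
  found = search.call(
    narrow_search_params(index_name: index_name, routing: routing, doc_ids: doc_ids)
  )

  missing = doc_ids - found.keys
  return found if missing.empty?

  found.merge(
    search.call(broad_search_params(index_expression: index_expression, doc_ids: missing))
  )
end
```

With this shape, the common case costs one single-shard query, and the concern in the existing comment (a document living in a different shard or index than the operation targets) is still covered by the fallback. If that case truly never happens, the fallback could be dropped entirely.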
