[Optimization] Avoid disk I/O for STRLEN command on tiered storage strings #6893

@asherlaau

Description

While exploring the codebase for optimization opportunities, I noticed a TODO in src/server/string_family.cc:133 regarding the STRLEN command.

Currently, when a string is offloaded to tiered storage (SSD), the implementation triggers an asynchronous disk read (ReadTiered) just to determine the string length.

// src/server/string_family.cc:133
// TODO(vlad): Optimize to return co.Size() if no modify operations are present
// TODO(vlad): Omit decoding string to just query its length
if (const auto& co = it_res.value()->second; co.IsExternal()) {
  auto cb = [](string_view s) { return s.size(); };

  TieredStorage::TResult<size_t> fut = ReadTiered<size_t>(
      op_args.db_cntx.db_index, key, co, std::move(cb),
      op_args.shard->tiered_storage());
  return {std::move(fut)};
}

Proposed Optimization

For tiered storage objects, the in-memory CompactObject metadata already records the size of
the string. When the object is external and no modify operations are pending on it (the
condition named in the first TODO), STRLEN can return co.Size() directly and skip the disk
read entirely.
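To make the intended change concrete, here is a minimal, self-contained sketch of the two code paths. It does not use Dragonfly's real classes: MockObj, MockTieredStorage, StrlenViaRead, StrlenViaMetadata and the has_pending_modify flag are all hypothetical stand-ins. How pending modifications are actually tracked in the codebase is an assumption that would need to be checked before a PR.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <string_view>

// Hypothetical stand-in for the in-memory object metadata (not Dragonfly's class).
struct MockObj {
  bool external = false;     // true when the payload has been offloaded to SSD
  std::size_t size = 0;      // length recorded in the in-memory metadata
  std::string inline_value;  // payload while the value is still in memory

  bool IsExternal() const { return external; }
  std::size_t Size() const { return external ? size : inline_value.size(); }
};

// Hypothetical stand-in for the tiered store; it only counts simulated disk reads.
struct MockTieredStorage {
  std::string on_disk;
  int reads = 0;

  std::size_t Read(const std::function<std::size_t(std::string_view)>& cb) {
    assert(static_cast<bool>(cb));  // a callback must be supplied
    ++reads;                        // each call models one disk read
    return cb(on_disk);
  }
};

// Current behavior as described above: an external value always triggers a read.
std::size_t StrlenViaRead(const MockObj& co, MockTieredStorage& ts) {
  if (co.IsExternal())
    return ts.Read([](std::string_view s) { return s.size(); });
  return co.Size();
}

// Proposed behavior: answer from metadata unless a modification is pending,
// in which case fall back to the read path (assumed to be the safe default).
std::size_t StrlenViaMetadata(const MockObj& co, MockTieredStorage& ts,
                              bool has_pending_modify) {
  if (co.IsExternal() && has_pending_modify)
    return StrlenViaRead(co, ts);
  return co.Size();
}
```

In this model, calling StrlenViaMetadata on an external object with no pending modification leaves the read counter untouched, while StrlenViaRead increments it on every call. A regression test for the real change could assert the same property against whatever tiered storage read statistics are available.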

I'm interested in working on this!
I have already set up the development environment, built Dragonfly locally, and passed the
string_family_test. Please let me know if I should proceed with a Pull Request.
