Skip to content

MDEV-39834: Add TidesDB 9(v9.3.3) storage engine#5166

Open
guycipher wants to merge 8 commits into
MariaDB:11.4from
tidesdb:tidesdb-engine
Open

MDEV-39834: Add TidesDB 9(v9.3.3) storage engine#5166
guycipher wants to merge 8 commits into
MariaDB:11.4from
tidesdb:tidesdb-engine

Conversation

@guycipher
Copy link
Copy Markdown

The libtidesdb C library is vendored under storage/tidesdb/libtidesdb/ and compiled into a static archive that is statically linked into the loadable plugin module, mirroring storage/rocksdb, the engine is self-contained and needs no system-installed libtidesdb. include/providers is removed from the
engine's include dirs so the bundled zstd/lz4/snappy link directly instead of routing through MariaDB's compression-provider plugins.

For cross-version portability, spatial- and fulltext-index detection uses the HA_SPATIAL / HA_FULLTEXT key flags (with a HA_SPATIAL/HA_FULLTEXT -> *_legacy rename shim) rather than KEY::algorithm, which older servers leave
unset; my_global.h is included before handler.h so the server typedefs it needs are defined.

Adds the tidesdb mysql-test suite covering CRUD, MVCC and pessimistic locking, transactions/savepoints, isolation levels, compression, encryption, TTL, full-text, spatial, partitioning, object-store offload and crash/durability. The vector test auto-skips where the VECTOR type is unavailable (before 11.7).

Builds and full mysql-test suite shows green on MariaDB 11.4 (LTS) through to 13.0.2

The libtidesdb C library is vendored under storage/tidesdb/libtidesdb/ and
compiled into a static archive that is statically linked into the loadable
plugin module, mirroring storage/rocksdb, the engine is self-contained and
needs no system-installed libtidesdb. include/providers is removed from
the
engine's include dirs so the bundled zstd/lz4/snappy link directly instead
of routing through MariaDB's compression-provider plugins.

For cross-version portability, spatial- and fulltext-index detection uses
the HA_SPATIAL / HA_FULLTEXT key flags (with a HA_SPATIAL/HA_FULLTEXT ->
*_legacy rename shim) rather than KEY::algorithm, which older servers
leave
unset; my_global.h is included before handler.h so the server typedefs it
needs are defined.

Adds the tidesdb mysql-test suite covering CRUD, MVCC and
pessimistic locking, transactions/savepoints, isolation levels,
compression, encryption, TTL, full-text, spatial, partitioning, object-store offload
and crash/durability. The vector test auto-skips where the VECTOR type is
unavailable (before 11.7).

Builds and full mysql-test suite shows green on MariaDB 11.4 (LTS) through to
13.0.2
@CLAassistant
Copy link
Copy Markdown

CLAassistant commented Jun 2, 2026

CLA assistant check
All committers have signed the CLA.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request integrates the TidesDB storage engine into MariaDB, adding the core handler implementation in ha_tidesdb.h, build configuration in CMakeLists.txt, vendored libtidesdb source files, and an extensive test suite. The reviewer provided several constructive suggestions, including moving the public header copy operation in CMakeLists.txt from configure time to build time to prevent stale headers during incremental builds. Additionally, the reviewer recommended using C++11 in-class member initializers for various member variables in ha_tidesdb.h to avoid uninitialized memory issues, and explicitly marking the ha_tidesdb destructor with override.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread storage/tidesdb/CMakeLists.txt
Comment thread storage/tidesdb/ha_tidesdb.h
Comment thread storage/tidesdb/ha_tidesdb.h
Comment thread storage/tidesdb/ha_tidesdb.h
Comment thread storage/tidesdb/ha_tidesdb.h
Comment thread storage/tidesdb/ha_tidesdb.h Outdated
@guycipher
Copy link
Copy Markdown
Author

Base (11.4) >> 11.5 >> 12.0 >> … >> 12.x >> 13.0 >> main

Can propagate forward to every newer branch automatically.

@guycipher
Copy link
Copy Markdown
Author

Full TideSQL reference: https://tidesdb.com/reference/tidesql/

@guycipher
Copy link
Copy Markdown
Author

I see

/home/buildbot/storage/tidesdb/ha_tidesdb.cc: In member function 'virtual int ha_tidesdb::write_row(const uchar*)':
/home/buildbot/storage/tidesdb/ha_tidesdb.cc:6694:28: error: structured bindings only available with '-std=c++17' or '-std=gnu++17' [-Werror=c++17-extensions]
 6694 |                 for (auto &[term, tf] : tf_map)
      |                            ^
/home/buildbot/storage/tidesdb/ha_tidesdb.cc: In member function 'virtual int ha_tidesdb::update_row(const uchar*, const uchar*)':
/home/buildbot/storage/tidesdb/ha_tidesdb.cc:7898:32: error: structured bindings only available with '-std=c++17' or '-std=gnu++17' [-Werror=c++17-extensions]
 7898 |                     for (auto &[term, tf] : old_tf)
      |                                ^
/home/buildbot/storage/tidesdb/ha_tidesdb.cc:7906:32: error: structured bindings only available with '-std=c++17' or '-std=gnu++17' [-Werror=c++17-extensions]
 7906 |                     for (auto &[term, tf] : new_tf)
      |                                ^
/home/buildbot/storage/tidesdb/ha_tidesdb.cc:7926:32: error: structured bindings only available with '-std=c++17' or '-std=gnu++17' [-Werror=c++17-extensions]
 7926 |                     for (auto &[term, old_cnt] : old_tf)
      |                                ^
/home/buildbot/storage/tidesdb/ha_tidesdb.cc:7936:32: error: structured bindings only available with '-std=c++17' or '-std=gnu++17' [-Werror=c++17-extensions]
 7936 |                     for (auto &[term, new_cnt] : new_tf)
      |                                ^
/home/buildbot/storage/tidesdb/ha_tidesdb.cc: In member function 'virtual int ha_tidesdb::delete_row(const uchar*)':
/home/buildbot/storage/tidesdb/ha_tidesdb.cc:8206:28: error: structured bindings only available with '-std=c++17' or '-std=gnu++17' [-Werror=c++17-extensions]
 8206 |                 for (auto &[term, tf] : tf_map)
      |                            ^
/home/buildbot/storage/tidesdb/ha_tidesdb.cc: In member function 'virtual ha_rows ha_tidesdb::multi_range_read_info_const(uint, RANGE_SEQ_IF*, void*, uint, uint*, uint*, ha_rows, Cost_estimate*)':
/home/buildbot/storage/tidesdb/ha_tidesdb.cc:8609:25: error: 'struct TABLE' has no member named 'part_info'
 8609 |     if (table && table->part_info) return rows;
      |                         ^~~~~~~~~
/home/buildbot/storage/tidesdb/ha_tidesdb.cc: In member function 'virtual FT_INFO* ha_tidesdb::ft_init_ext(uint, uint, String*)':
/home/buildbot/storage/tidesdb/ha_tidesdb.cc:9850:16: error: structured bindings only available with '-std=c++17' or '-std=gnu++17' [-Werror=c++17-extensions]
 9850 |     for (auto &[pk_str, score] : doc_scores)
      |                ^
cc1plus: all warnings being treated as errors
make[2]: *** [storage/tidesdb/CMakeFiles/tidesdb.dir/build.make:79: storage/tidesdb/CMakeFiles/tidesdb.dir/ha_tidesdb.cc.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:10332: storage/tidesdb/CMakeFiles/tidesdb.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....

Having a look as I'm gonna guess we need to pass that wf.

@guycipher
Copy link
Copy Markdown
Author

C++11 is base I am going to assume, 13 seems to be C++17. I'll keep the engine at bay with C++11 as the library aligns with C11.

Build fixes for MariaDB 11.4-11.7 (C++11; buildbot builds with -Werror):
- Replace C++17 structured bindings with C++11 iterator access in the
  FTS write_row/update_row/delete_row and ft_init_ext paths.
- Guard TABLE::part_info with WITH_PARTITION_STORAGE_ENGINE, which only
  defines that member when partitioning is compiled in.

Initialization hardening (PR review feedback):
- Value-initialize TidesDB_share encryption members (encrypted,
  encryption_key_id, encryption_key_version) in the constructor.
- Mark ~ha_tidesdb() override.
@guycipher
Copy link
Copy Markdown
Author

guycipher commented Jun 2, 2026

Something to have a look at: https://buildbot.mariadb.org/#/builders/554/builds/22266 and https://buildbot.mariadb.org/#/builders/554/builds/22266/steps/3/logs/stdio not sure if CI or something I need to touch on at plugin level.

@guycipher
Copy link
Copy Markdown
Author

Caught it, had to create mariadb-plugin-tidesdb.install

@guycipher
Copy link
Copy Markdown
Author

https://buildbot.mariadb.org/#/builders/534/builds/39281/steps/6/logs/stdio buildbot/amd64-ubuntu-2204-debug-ps does not seem to be due to TidesDB implementation. v9.3.3 of TidesDB and v4.5.4 of TideSQL plugin implemented. Ready for review.

@dr-m
Copy link
Copy Markdown
Contributor

dr-m commented Jun 3, 2026

https://buildbot.mariadb.org/#/builders/534/builds/39281/steps/6/logs/stdio buildbot/amd64-ubuntu-2204-debug-ps does not seem to be due to TidesDB implementation.

Right, there is an open ticket MDEV-38195 about failures of the test events.events_restart.

@gkodinov gkodinov added the External Contribution All PRs from entities outside of MariaDB Foundation, Corporation, Codership agreements. label Jun 3, 2026
@gkodinov gkodinov self-assigned this Jun 3, 2026
Copy link
Copy Markdown
Member

@gkodinov gkodinov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution! This is a preliminary review.

I'm approving it as it is, as there're no major violations. Or, if there are any, they're in the competency of the final reviewer. Nevertheless, I'll mention what I see that can be improved.

First of all: please consider merging some of the commits. Ideally there should be one commit per distinct "feature". I'd guess that makes 1 in your case.

Next: this technically is a new feature. As such, I'd suggest considering rebasing to the main branch. Unless of course, there is some special consideration why the targed should be as is.

Finally, there should be a Jira for house-keeping. I'll start one for you, using the PR explanation. But this needs to be improved significantly I'd guess, until it evolves into a proper design specification that would help with the final review.

There's also one small thing I've noticed below.

Please stand by for the final review.

Comment thread storage/tidesdb/plug.in Outdated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think you need this file.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @gkodinov understood for 1. For 2, to be able to propagate forward to newer branches selectively or not and main with a floor of support for 11.4. For 3 understood, I appreciate the help with that and I can work off it from there.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think you need this file.

You're right it's a fossil.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @gkodinov understood for 1. For 2, to be able to propagate forward to newer branches selectively or not and main with a floor of support for 11.4. For 3 understood, I appreciate the help with that and I can work off it from there.

Have you tried just plaining atop of the main branch and re-compiling. You're not changing anything in the server itself. So it might work just as well. You could maybe need to alter some of the calls back to the server, but I do not think it has changed that much. Then it's a matter of running the regression tests successfully (and debugging any failures).
I (and the final reviewer) can guide you through that.

Copy link
Copy Markdown
Author

@guycipher guycipher Jun 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All conditionals are handled from 11.4 to 13.02, though verified sparingly there is a full list. Main sounds ok, if condenses is main, we can do that.

@gkodinov gkodinov changed the title Adds TidesDB 9(v9.3.3) storage engine MDEV-39834: Add TidesDB 9(v9.3.3) storage engine Jun 3, 2026
@gkodinov gkodinov requested a review from sanja-byelkin June 3, 2026 09:08
@gkodinov gkodinov assigned sanja-byelkin and unassigned gkodinov Jun 3, 2026
@sanja-byelkin sanja-byelkin requested a review from Copilot June 5, 2026 14:45
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot wasn't able to review this pull request because it exceeds the maximum number of lines (20,000). Try reducing the number of changed lines and requesting a review from Copilot again.

@sanja-byelkin
Copy link
Copy Markdown
Member

sanja-byelkin commented Jun 5, 2026

AFAIK 11.4 closed for the new features, why 11.4 not main?

Especially taking into account required dependencies zstd, lz4, and snappy

@sanja-byelkin
Copy link
Copy Markdown
Member

target_compile_options(tidesdb_embedded PRIVATE -w)

Why all warning supressed in so big code (we have enough problems with warnings with existing things why add more)

@guycipher
Copy link
Copy Markdown
Author

AFAIK 11.4 closed for the new features, why 11.4 not main?

Especially taking into account required dependencies zstd, lz4, and snappy

Well, I originally thought per getting it into a lower branch and merging upwards for visibility was a goal per discussion with foundation, I've brought up a few times. If the goal is newer versions only you guys can tell me and I can correct accordingly. TideSQL was written for 11.4 onwards to modern 13.02 in testing. Hitting many granular versions. We have a list if you require. These versions I've supported have conditionals a lot because things do move around quite a bit as you probably know. Thus we want I'm certain the same thing where a user and everyone is happy.

Cheers

@guycipher
Copy link
Copy Markdown
Author

guycipher commented Jun 5, 2026

target_compile_options(tidesdb_embedded PRIVATE -w)

Can you further explain? We build and develop with all sanitizers on and debug flags and running on 16 different platforms and x86 and x64 in continuous integration.

Please if you have a moment look at the TidesDB repository and the TideSQL main repositories.

If you have a concern specifically with this PRs build let me know how you'd prefer it because I am not following currently properly. The plugin itself I'm highly doubtful would have any warnings as we would catch them in engine testing and continuous integration and plugin testing and integration, but i cant remember right now why it's set the way it is, we can modify a compile option for inclusion? Is this what you're pointing out?

Thank you kindly

@guycipher
Copy link
Copy Markdown
Author

guycipher commented Jun 5, 2026

Hm so @sanja-byelkin do you prefer vendored thirdparty code to use not ignore compiler warnings? I can do

TARGET_COMPILE_OPTIONS(tidesdb_embedded PRIVATE -Wno-error)

Which is review friendly I believe. Do let me know, I shall make the change ASAP. Thank you for your review.

Cheers.

@guycipher
Copy link
Copy Markdown
Author

Upstream on the TidesDB library I am doing a scrub with what you've asked for to assure all is well. The flag you brought up is known to have false positives so I will take time with diligence there.

guycipher added 3 commits June 6, 2026 02:36
Refresh the vendored libtidesdb sources under storage/tidesdb/libtidesdb to
release 9.3.5 (the engine previously bundled 9.3.3). Release 9.3.5 also
resolves the compiler warnings for variables that may be used uninitialized,
so the bundled library now compiles clean under that GCC diagnostic.

Compression backends are now optional. compress.c includes zstd, lz4 and
snappy only when the matching TIDESDB_HAVE_ZSTD, TIDESDB_HAVE_LZ4 or
TIDESDB_HAVE_SNAPPY macro is defined. The CMake build now autodetects each
development library and enables the ones that are present, defining the
TIDESDB_HAVE macros and linking each found library, and a build with none
still works. This removes the earlier hard requirement on all three
compression libraries.

Sync ha_tidesdb.cc and the analyze, engine_status, index_stats and
status_vars tests from the upstream tidesql sources.

The full MTR suite passes on MariaDB 11.4 with 65 tests green and the vector
test skipped where the VECTOR type is unavailable.
Per review, replace the blanket -w on the vendored libtidesdb target with
targeted suppression of the three warning categories the third party C library
trips under MariaDB's stricter C flags -Wdeclaration-after-statement (the
library uses C99/C11 declare at use style while MariaDB compiles C as C90),
-Wframe-larger-than (a few functions use large on stack buffers), and
-Wdiscarded-qualifiers (one const qualifier drop). Any other warning still
surfaces and, in maintainer mode, fails the build, so genuine regressions are
no longer masked.
libtidesdb uses C11 stdatomic.h, which MSVC only enables under
/experimental:c11atomics; without it the Windows build fails with C1189
"C atomic support is not enabled". Add an MSVC branch to the tidesdb_embedded
target setting /std:c17 /experimental:c11atomics plus the same targeted /wd
warning disables the library uses upstream (the Windows build compiles with
/WX). The GCC and Clang warning handling is unchanged.
@guycipher
Copy link
Copy Markdown
Author

guycipher commented Jun 6, 2026

MSVC pthreads issue reviewing builtbot, we handle differently at library level. Investigating to see if another patch is required at library level to fit MariaDB.

edit: yeah I'm gonna have to adjust compat layer a bit to comply. Working on that now.

…ds backend

Re-vendor the libtidesdb 9.3.6 sources. 9.3.6 adds a native Win32 threading
backend in compat.h (SRWLOCK mutexes and rwlocks, CONDITION_VARIABLE condition
variables, _beginthreadex threads, and fiber-local storage so thread-local
destructors still run), so on MSVC it no longer includes pthread.h and needs
no pthreads-win32 library.

This fixes the MariaDB MSVC build, which has no pthreads-win32 and was failing
with C1083 "Cannot open include file pthread.h". POSIX, macOS and MinGW keep
using pthreads as before. Only compat.h, tidesdb.c and tidesdb.h change from
9.3.5; the C11 atomics flag added earlier (/experimental:c11atomics) is still
required on MSVC.
balagrivine pushed a commit to balagrivine/tidesdb that referenced this pull request Jun 7, 2026
decode_varint returns negative without writing its output when max_bytes is zero, which
happens on a truncated or corrupt block, and several indexed read and iterator decode sites
used the result without checking the return, so a handful of locals (ks vs vo seq_val
vlog_off) could be read uninitialized. zero initialize those locals so the failure path
yields a defined zero rather than garbage. also add a TIDESDB_WARN_MAYBE_UNINIT option, gcc
only and off by default, that turns the warning on for an optimized dev build where the
analysis actually runs.

analyzed for MariaDB engine addition MariaDB/server#5166 review

-

correct an iterator scan miss under reaper eviction and stop the cross cf atomicity test flaking on commit backpressure

iter_new pinned the level array once then skipped any sstable it could not
immediately ref treating it as dead when in fact the reaper had it in a
transient evicting window with the descriptor still live in the level. a
reader iterator snapshots only once so that skip dropped a live sstable for
the whole life of the scan and a key in it read back as not found. now we
spin on try_ref while the level array is unchanged and only bail to retry
when the array actually changed under us same shape as the point get reaper
evict skip we already correctd.

the cross cf atomicity test counted a transient busy commit as a hard error
which tripped the errors assert on slow loaded ci boxes. commit through
tdb_test_commit_with_retry like the sibling writers so a backpressure stall
retries instead of failing the run.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

External Contribution All PRs from entities outside of MariaDB Foundation, Corporation, Codership agreements.

Development

Successfully merging this pull request may close these issues.

6 participants