feat(hive): add HmsClient connection lifecycle and URI parsing #796
Open
MisterRaindrop wants to merge 2 commits into
Add the first iceberg_hive code that actually talks to a Hive Metastore
over Thrift, mirroring the structure of iceberg-rust's HmsCatalog::new
while keeping every Thrift type out of the public header via a pImpl.
* HmsEndpoint + ParseHmsUris(): tolerant URI parser that accepts
`thrift://host:port`, bare `host:port`, missing-port (defaults to
9083), and comma-separated lists for HA failover, with whitespace
around each segment stripped. Returns InvalidArgument for empty
hosts, non-numeric or out-of-range ports, and empty list segments.
* HmsClient::Connect(): wires TSocket -> TBufferedTransport /
TFramedTransport (selected by HiveCatalogProperties::ThriftTransport)
-> TBinaryProtocol -> ThriftHiveMetastoreClient. Connect / socket
timeouts come from the properties. Connection failures are caught
and translated into ErrorKind::kIOError so callers never see raw
Thrift exceptions. The dtor best-effort-closes the transport.
* HmsClient::Impl holds the Thrift state so that it is destroyed in
the order client -> protocol -> transport -> socket, which keeps
teardown clean.
* src/iceberg/test/hms_client_test.cc adds 15 GoogleTest cases:
11 covering ParseHmsUris (single, multi-HA, default port, scheme
prefix, whitespace, empty/bad host/port edges) and 4 covering
HmsClient::Connect's error paths (missing URI, bad URI, invalid
transport mode, unreachable HMS).
* src/iceberg/test/CMakeLists.txt gains an add_hive_iceberg_test()
helper (mirroring add_rest_iceberg_test) and the hive_catalog_test
target gated on ICEBERG_BUILD_HIVE.
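The parsing rules in the first bullet can be illustrated with a small standalone sketch. This is not the PR's implementation: `Endpoint`, `ParseOne`, and `ParseList` are hypothetical names, `std::optional` stands in for the project's result type, and rejecting port 0 and a trailing `host:` are assumptions the description does not settle.

```cpp
#include <charconv>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>
#include <system_error>
#include <utility>
#include <vector>

// Hypothetical stand-in for HmsEndpoint; the field names are assumed.
struct Endpoint {
  std::string host;
  uint16_t port;
};

constexpr uint16_t kDefaultHmsPort = 9083;

// Strip leading and trailing ASCII whitespace.
std::string_view Trim(std::string_view s) {
  constexpr std::string_view kWs = " \t\r\n";
  const auto first = s.find_first_not_of(kWs);
  if (first == std::string_view::npos) return {};
  const auto last = s.find_last_not_of(kWs);
  return s.substr(first, last - first + 1);
}

// Parse one segment of the form [thrift://]host[:port].
// Returns nullopt where the real parser would return InvalidArgument.
std::optional<Endpoint> ParseOne(std::string_view segment) {
  segment = Trim(segment);
  constexpr std::string_view kScheme = "thrift://";
  if (segment.substr(0, kScheme.size()) == kScheme) {
    segment.remove_prefix(kScheme.size());
  }
  if (segment.empty()) return std::nullopt;  // empty list segment

  const auto colon = segment.rfind(':');
  if (colon == std::string_view::npos) {
    return Endpoint{std::string(segment), kDefaultHmsPort};  // missing port
  }
  const std::string_view host = segment.substr(0, colon);
  const std::string_view port_str = segment.substr(colon + 1);
  // Assumption: "host:" with nothing after the colon is rejected.
  if (host.empty() || port_str.empty()) return std::nullopt;

  unsigned int port = 0;
  const char* end = port_str.data() + port_str.size();
  const auto [ptr, ec] = std::from_chars(port_str.data(), end, port);
  // Assumption: the valid range is 1..65535.
  if (ec != std::errc() || ptr != end || port == 0 || port > 65535) {
    return std::nullopt;  // non-numeric or out-of-range port
  }
  return Endpoint{std::string(host), static_cast<uint16_t>(port)};
}

// Parse a comma-separated list; any bad segment fails the whole list.
std::optional<std::vector<Endpoint>> ParseList(std::string_view uris) {
  std::vector<Endpoint> out;
  while (true) {
    const auto comma = uris.find(',');
    auto endpoint = ParseOne(uris.substr(0, comma));
    if (!endpoint) return std::nullopt;
    out.push_back(std::move(*endpoint));
    if (comma == std::string_view::npos) break;
    uris.remove_prefix(comma + 1);
  }
  return out;
}
```

The sketch splits on the last colon and does not handle bracketed IPv6 literals; whether the real parser does is not stated here.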
`ctest --test-dir build --output-on-failure` now reports 17/17 passing
(16 previous + 1 new hive_catalog_test with 15 cases inside).
Part of the iceberg-cpp HiveCatalog port (C06).
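For readers less familiar with the pImpl and teardown points above, here is a minimal sketch of how member declaration order yields the client -> protocol -> transport -> socket destruction order. It uses no Thrift types: `Layer`, `Client`, and `DestructionLog` are hypothetical stand-ins, and the actual layout of `HmsClient::Impl` may differ.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Records destructor calls so the teardown order can be inspected.
std::vector<std::string>& DestructionLog() {
  static std::vector<std::string> log;
  return log;
}

// Hypothetical stand-in for one layer of Thrift state.
struct Layer {
  explicit Layer(std::string n) : name(std::move(n)) {}
  ~Layer() { DestructionLog().push_back(name); }
  std::string name;
};

// "Public header" side: Impl is only forward-declared, so none of
// the implementation types leak to callers.
class Client {
 public:
  Client();
  ~Client();  // defined below, where Impl is a complete type

 private:
  struct Impl;
  std::unique_ptr<Impl> impl_;
};

// "Source file" side: the full definition of Impl.
struct Client::Impl {
  // C++ destroys non-static members in reverse declaration order,
  // so declaring bottom-up tears down top-down: client first,
  // socket last.
  Layer socket{"socket"};
  Layer transport{"transport"};
  Layer protocol{"protocol"};
  Layer client{"client"};
};

Client::Client() : impl_(std::make_unique<Impl>()) {}
Client::~Client() = default;
```

Destroying a `Client` in this sketch appends `client`, `protocol`, `transport`, `socket` to the log, in that order.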
Add an Ubuntu CI job that configures with -DICEBERG_BUILD_HIVE=ON (via ICEBERG_EXTRA_CMAKE_ARGS) so the iceberg_hive library and hive_catalog_test are compiled and run, and enable ICEBERG_BUILD_HIVE in the cpp-linter build so clang-tidy resolves hms_client.cc's Thrift include paths from the compilation database. Thrift comes from Arrow's bundled build (ICEBERG_BUNDLE_THRIFT=ON, the default), so no system Thrift install is required.
* `ParseHmsUris()`: tolerant HMS URI parser that accepts `thrift://host:port`, bare `host:port`, a missing port (defaults to 9083), and comma-separated HA lists. Rejects empty hosts and invalid ports.
* `HmsClient::Connect()`: sets up the `TSocket → TTransport → TBinaryProtocol → ThriftHiveMetastoreClient` chain (transport and timeouts from `HiveCatalogProperties`), translating Thrift exceptions to `kIOError`. Thrift types stay out of the public header via a pImpl.
* Ubuntu Hive job builds with `-DICEBERG_BUILD_HIVE=ON` (bundled Thrift) and runs `hive_catalog_test`.
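The exception-to-`kIOError` translation can be sketched in isolation. This is an assumed shape, not the PR's code: `Status`, `OpenTranslated`, and the `kOk` enumerator are hypothetical, and `std::runtime_error` stands in for a Thrift exception (Thrift's `TException` derives from `std::exception`, so a catch of the base type would cover it).

```cpp
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>

// Only kIOError is named in the PR; kOk is assumed for the sketch.
enum class ErrorKind { kOk, kIOError };

// Hypothetical status type standing in for the project's result type.
struct Status {
  ErrorKind kind = ErrorKind::kOk;
  std::string message;
  bool ok() const { return kind == ErrorKind::kOk; }
};

// Runs `open` (for example, opening a transport) and converts any
// std::exception into a kIOError status, so no exception escapes.
Status OpenTranslated(const std::function<void()>& open) {
  try {
    open();
    return {};
  } catch (const std::exception& e) {
    return {ErrorKind::kIOError,
            std::string("failed to connect: ") + e.what()};
  }
}
```

A caller would check `status.ok()` and read `status.message` rather than wrapping the call in its own `try` block.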