docs: How to deploy a replicated ClickHouse cluster with embedded ClickHouse Keeper #790

Open: zlcnju wants to merge 1 commit into main from ck-embedded-keeper-cluster

@zlcnju zlcnju commented Jun 12, 2026

Collaborator

Summary

Adds a How-To article: How to Deploy a Replicated ClickHouse Cluster with Embedded ClickHouse Keeper (docs/en/solutions/).

  • Coordination topology comparison: external ZooKeeper vs standalone Keeper vs embedded Keeper, with the embedded topology recommended for log storage scenarios.
  • Documents that the ClickHouse Operator shipped with the platform (Altinity 0.20.x base) does not include the upstream ClickHouseKeeperInstallation (CHK) controller — the CHK CRD is not installed — so Keeper must run embedded in the ClickHouse Pods or as a standalone StatefulSet.
  • Full ClickHouseInstallation manifest: 1 shard × 3 replicas forming a 3-member Keeper Raft quorum, static keeper_config.xml injected per shard, dynamic server_id/raft_configuration generated by an init container via include_from, anti-affinity, raft-port readiness probe, per-replica Service exposing 9181/9444.
  • Verification steps: mntr four-letter command, system.zookeeper, ReplicatedMergeTree smoke test across replicas, system.replicas health.
  • Recommended settings for log storage workloads (system-table TTLs, max_execution_time, retention TTLs) and the trade-offs/constraints of the embedded topology.
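
The `mntr` check listed in the verification steps can be sketched as a small filter. This is an illustrative helper, not taken from the article: the function name `keeper_role` is hypothetical, and the usage line assumes `nc` is available inside the ClickHouse Pod and that Keeper answers on client port 9181.

```bash
# Hypothetical helper: print the Raft role (leader, follower, ...) from the
# key/value text that the `mntr` four-letter command returns on stdin.
keeper_role() {
  awk '$1 == "zk_server_state" { print $2 }'
}

# Intended usage from inside a ClickHouse Pod (assumption: nc is installed
# and Keeper listens on client port 9181):
#   echo mntr | nc localhost 9181 | keeper_role
```

In a 3-member quorum, one Pod is expected to report `leader` and the other two `follower`.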

Validation

Validated on an ACP 4.2 environment (ClickHouse Operator v4.2.3 / chop 0.20.0, ClickHouse Server 25.x):

  • Confirmed the CHK API group returns 404 and only CHI/CHIT/CHOP CRDs are installed.
  • Manifest structure derived from the log-storage ClickHouse chart in production use, genericized with placeholders.
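
The CRD check above can be reproduced with a filter over the CRD list. This is a sketch rather than the procedure used for the validation: the function name `has_chk_crd` is hypothetical, and the CRD name is assumed to be the one used by the upstream Altinity Keeper resource.

```bash
# Hypothetical helper: succeeds if the upstream ClickHouseKeeperInstallation
# (CHK) CRD appears in a list of CRD names read from stdin.
# Assumption: the CRD is named
#   clickhousekeeperinstallations.clickhouse-keeper.altinity.com
has_chk_crd() {
  grep -q 'clickhousekeeperinstallations\.clickhouse-keeper\.altinity\.com'
}

# Intended usage against a live cluster (requires kubectl access):
#   kubectl get crd -o name | has_chk_crd \
#     && echo "CHK CRD installed" || echo "CHK CRD not installed"
```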

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Documentation
    • New guide for deploying a replicated ClickHouse cluster using embedded ClickHouse Keeper on Kubernetes platforms, including architecture overview, complete configuration examples, step-by-step verification commands, and operational guidance for cluster management, scaling, and maintenance.

…kHouse Keeper

Covers coordination topology choice (external ZooKeeper / standalone
Keeper / embedded Keeper), a full ClickHouseInstallation manifest with
init-container-generated raft configuration, quorum and replication
verification, and recommended settings for log storage workloads.

Notes that the platform ClickHouse Operator (0.20.x) does not include
the upstream ClickHouseKeeperInstallation controller, so embedded or
standalone Keeper is required.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@zlcnju zlcnju deployed to translate June 12, 2026 09:55 — with GitHub Actions Active

coderabbitai Bot commented Jun 12, 2026

Contributor

Walkthrough

This pull request adds a comprehensive how-to guide for deploying a replicated ClickHouse cluster using embedded ClickHouse Keeper on Kubernetes. The document covers architecture decisions, prerequisites, manifest examples with init-container configuration logic, step-by-step verification procedures, and operational constraints.

Changes

Embedded ClickHouse Keeper Deployment Guide

| Layer / File(s) | Summary |
|---|---|
| **Document Purpose & Topology Architecture**<br>`docs/en/solutions/How_to_Deploy_a_Replicated_ClickHouse_Cluster_with_Embedded_ClickHouse_Keeper.md` | Document scope and applicability defined. Explains coordination needs for replicated tables, contrasts three Kubernetes topologies (embedded, ZooKeeper, standalone Keeper), and details embedded topology quorum sizing and configuration injection patterns. |
| **Prerequisites & Keeper Client Service**<br>`docs/en/solutions/How_to_Deploy_a_Replicated_ClickHouse_Cluster_with_Embedded_ClickHouse_Keeper.md` | Lists prerequisites and required environment variables. Provides headless Service manifest that selects ClickHouse Pods via keeper role label and exposes port 9181 for Keeper quorum connectivity. |
| **ClickHouseInstallation Manifest & Deployment**<br>`docs/en/solutions/How_to_Deploy_a_Replicated_ClickHouse_Cluster_with_Embedded_ClickHouse_Keeper.md` | Complete ClickHouseInstallation manifest for 1-shard/3-replica layout with embedded Keeper, including static keeper_config.xml fragment, init-container that generates Pod-specific server_id and raft_configuration, readiness probes, and per-replica service ports. Includes key points and deployment instructions. |
| **Verification & Validation Steps**<br>`docs/en/solutions/How_to_Deploy_a_Replicated_ClickHouse_Cluster_with_Embedded_ClickHouse_Keeper.md` | Ordered verification sequence: waiting for installation completion, confirming Keeper leader/follower roles via mntr four-letter command, querying system tables for coordination connectivity, and smoke-testing replicated table creation, insertion, and replication across replicas. |
| **Operational Settings & Constraints**<br>`docs/en/solutions/How_to_Deploy_a_Replicated_ClickHouse_Cluster_with_Embedded_ClickHouse_Keeper.md` | Recommended operational settings for log storage workloads (TTL, query execution limits, Keeper read permissions, parallelism, retention/quota). Documents resource coupling, restart behavior constraints, and guidance to use standalone Keeper for divergent scaling needs. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 A new guide hops into docs so bright,
With Keeper embedded, keeping replicas tight,
Init-containers dance, quorums align,
And verification commands help systems shine! ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: adding documentation on deploying a replicated ClickHouse cluster with embedded ClickHouse Keeper. It is specific, concise, and directly reflects the new documentation file added. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@docs/en/solutions/How_to_Deploy_a_Replicated_ClickHouse_Cluster_with_Embedded_ClickHouse_Keeper.md`:
- Around line 286-287: Update the paragraph to explicitly state that scaling
shards requires updating both SHARDS_COUNT and REPLICAS_COUNT because the Raft
config generation loop uses SHARDS_COUNT as well as REPLICAS_COUNT; mention the
server_id formula `SHARD * REPLICAS_COUNT + REPLICA + 1`, the init container
that writes into the in-memory emptyDir and merges via `include_from`, and that
static fragments under `layout.shards[].files` remain unchanged—so when adding
shards you must increase SHARDS_COUNT (not just REPLICAS_COUNT) to avoid
truncating the generated member list.
- Around line 212-241: The init script uses DOMAIN=$(hostname -d) which returns
only the DNS search suffix, breaking the regex that expects the full pod FQDN;
change DOMAIN assignment to use the full hostname (DOMAIN=$(hostname -f) or
`hostname --fqdn`) so the existing regex that extracts DOMAIN_NAME and
DOMAIN_SUFFIX from the full FQDN works correctly; keep the current regex and
downstream variables (DOMAIN_NAME, DOMAIN_SUFFIX, MY_ID, KEEPER_ID) unchanged so
the generated Raft peer hostnames are correct.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6cef2b9e-dc4b-42d4-bc22-4ceea6233807

📥 Commits

Reviewing files that changed from the base of the PR and between 4d9cdd9 and f27dc33.

📒 Files selected for processing (1)
  • docs/en/solutions/How_to_Deploy_a_Replicated_ClickHouse_Cluster_with_Embedded_ClickHouse_Keeper.md

Comment on lines +212 to +241
HOST=$(hostname -s)
DOMAIN=$(hostname -d)
# StatefulSet Pod hostname: <chi>-<cluster>-<shard>-<replica>-<ordinal>
if [[ $HOST =~ (.*)-([0-9]+)-([0-9]+)-([0-9]+)$ ]]; then
  SHARD=${BASH_REMATCH[2]}
  REPLICA=${BASH_REMATCH[3]}
else
  echo "Failed to parse shard/replica from hostname $HOST"; exit 1
fi
# Pod FQDN domain: <chi>-<cluster>-<shard>-<replica>.<namespace>.svc.<zone>
if [[ $DOMAIN =~ ^(.*)-([0-9]+)-([0-9]+)\.(.*)$ ]]; then
  DOMAIN_NAME=${BASH_REMATCH[1]}
  DOMAIN_SUFFIX=.${BASH_REMATCH[4]}
else
  echo "Failed to parse domain $DOMAIN"; exit 1
fi

MY_ID=$((SHARD * REPLICAS_COUNT + REPLICA + 1))
KEEPER_ID=1
{
  echo "<clickhouse>"
  echo " <keeper_server>"
  echo " <server_id>${MY_ID}</server_id>"
  echo " <raft_configuration>"
  for (( i=0; i<SHARDS_COUNT; i++ )); do
    for (( j=0; j<REPLICAS_COUNT; j++ )); do
      echo " <server>"
      echo " <id>${KEEPER_ID}</id>"
      echo " <hostname>${DOMAIN_NAME}-${i}-${j}${DOMAIN_SUFFIX}</hostname>"
      echo " <port>${RAFT_PORT}</port>"

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use the full FQDN here, not hostname -d.

The regex below expects the full pod FQDN, but hostname -d only returns the DNS suffix. As written, the init container will fail to derive the correct DOMAIN_NAME / DOMAIN_SUFFIX, and the generated Raft peer list will be wrong.
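
One way to verify this finding before applying it is to run the script's domain regex against both candidate inputs. The sketch below is not part of the PR: the function name `parse_domain` and the sample names (`chi-demo-main-0-0-0`, namespace `ns`, zone `cluster.local`) are hypothetical, and it assumes the Pod FQDN has the form `<pod>.<per-replica service>.<namespace>.svc.<zone>`, so that the domain part is everything after the first dot.

```bash
# Apply the init script's domain regex to a candidate string and print the
# prefix and suffix it would derive. The regex is copied from the excerpt.
parse_domain() {
  local domain=$1
  if [[ $domain =~ ^(.*)-([0-9]+)-([0-9]+)\.(.*)$ ]]; then
    echo "name=${BASH_REMATCH[1]} suffix=.${BASH_REMATCH[4]}"
  else
    echo "no match"
    return 1
  fi
}

# Hypothetical sample values (assumptions, not taken from a real cluster).
# Domain part only, i.e. what follows the first dot of the FQDN:
parse_domain "chi-demo-main-0-0.ns.svc.cluster.local"
# Full FQDN, including the Pod hostname:
parse_domain "chi-demo-main-0-0-0.chi-demo-main-0-0.ns.svc.cluster.local"
```

With these sample values the domain-part input yields `name=chi-demo-main`, while the full-FQDN input yields `name=chi-demo-main-0-0-0.chi-demo-main`, a prefix that still contains the Pod hostname. Comparing the output of `hostname -d` and `hostname -f` on a running Pod shows which of the two inputs the regex actually needs.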

🛠️ Suggested fix
-                  DOMAIN=$(hostname -d)
+                  DOMAIN=$(hostname -f)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
HOST=$(hostname -s)
DOMAIN=$(hostname -f)
# StatefulSet Pod hostname: <chi>-<cluster>-<shard>-<replica>-<ordinal>
if [[ $HOST =~ (.*)-([0-9]+)-([0-9]+)-([0-9]+)$ ]]; then
  SHARD=${BASH_REMATCH[2]}
  REPLICA=${BASH_REMATCH[3]}
else
  echo "Failed to parse shard/replica from hostname $HOST"; exit 1
fi
# Pod FQDN domain: <chi>-<cluster>-<shard>-<replica>.<namespace>.svc.<zone>
if [[ $DOMAIN =~ ^(.*)-([0-9]+)-([0-9]+)\.(.*)$ ]]; then
  DOMAIN_NAME=${BASH_REMATCH[1]}
  DOMAIN_SUFFIX=.${BASH_REMATCH[4]}
else
  echo "Failed to parse domain $DOMAIN"; exit 1
fi
MY_ID=$((SHARD * REPLICAS_COUNT + REPLICA + 1))
KEEPER_ID=1
{
  echo "<clickhouse>"
  echo " <keeper_server>"
  echo " <server_id>${MY_ID}</server_id>"
  echo " <raft_configuration>"
  for (( i=0; i<SHARDS_COUNT; i++ )); do
    for (( j=0; j<REPLICAS_COUNT; j++ )); do
      echo " <server>"
      echo " <id>${KEEPER_ID}</id>"
      echo " <hostname>${DOMAIN_NAME}-${i}-${j}${DOMAIN_SUFFIX}</hostname>"
      echo " <port>${RAFT_PORT}</port>"

Comment on lines +286 to +287
- **Quorum and layout.** `shardsCount: 1` and `replicasCount: 3` produce 3 Pods, each a Keeper Raft member. Keep an odd member count; 3 members tolerate one failure. If you add shards, every replica of every shard joins the quorum, and the `server_id` formula `SHARD * REPLICAS_COUNT + REPLICA + 1` stays unique as long as `REPLICAS_COUNT` in the init container matches the real layout.
- **Static vs dynamic Keeper config.** The static part (`tcp_port`, data `path`, coordination settings) is injected per shard through `layout.shards[].files`. The identity-dependent part (`server_id`, `raft_configuration`) is generated by the init container into an in-memory `emptyDir` and merged via `include_from`. Replica changes therefore require only updating `REPLICAS_COUNT`, not editing the static fragment.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Clarify that shard scaling also requires SHARDS_COUNT.

The Raft config loop uses both SHARDS_COUNT and REPLICAS_COUNT. Calling out only REPLICAS_COUNT makes the “add shards” guidance incomplete and can leave the generated member list truncated.
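
The truncation can be illustrated by enumerating the members the nested loop would emit for a given pair of counts. This is a sketch under assumptions: the function name `list_members` is hypothetical, and since the excerpt is cut off before `KEEPER_ID` is advanced, the ids here are computed with the `MY_ID` formula, assuming the counter would follow the same order.

```bash
# Print "<server_id> <shard>-<replica>" for every member the nested loop in
# the init script would generate, given the two counts it reads.
list_members() {
  shards_count=$1
  replicas_count=$2
  i=0
  while [ "$i" -lt "$shards_count" ]; do
    j=0
    while [ "$j" -lt "$replicas_count" ]; do
      # Same formula as MY_ID: SHARD * REPLICAS_COUNT + REPLICA + 1
      echo "$(( i * replicas_count + j + 1 )) ${i}-${j}"
      j=$(( j + 1 ))
    done
    i=$(( i + 1 ))
  done
}

list_members 2 3   # both counts match a 2 x 3 layout: ids 1 to 6
list_members 1 3   # SHARDS_COUNT left at 1: only ids 1 to 3 are listed
```

In a two-shard layout with `SHARDS_COUNT` left at 1, the Pods of shard 1 would compute `server_id` values 4 to 6, but the generated member list stops at 3.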

✍️ Suggested wording change
-  - **Quorum and layout.** `shardsCount: 1` and `replicasCount: 3` produce 3 Pods, each a Keeper Raft member. Keep an odd member count; 3 members tolerate one failure. If you add shards, every replica of every shard joins the quorum, and the `server_id` formula `SHARD * REPLICAS_COUNT + REPLICA + 1` stays unique as long as `REPLICAS_COUNT` in the init container matches the real layout.
+  - **Quorum and layout.** `shardsCount: 1` and `replicasCount: 3` produce 3 Pods, each a Keeper Raft member. Keep an odd member count; 3 members tolerate one failure. If you add shards, update both `SHARDS_COUNT` and `REPLICAS_COUNT` in the init container; the `server_id` formula `SHARD * REPLICAS_COUNT + REPLICA + 1` stays unique as long as both values match the real layout.
