Configuring ClickHouse Keeper
ClickHouse Keeper is a built-in distributed coordination service for ClickHouse, used for coordinating replication and executing distributed DDL queries. Keeper is fully protocol-compatible with ZooKeeper but is implemented in C++ and uses the RAFT consensus algorithm, which provides linearizable writes and more predictable behavior during failures.
A separate ClickHouse Keeper setup is only required for cluster scenarios: when you have multiple ClickHouse nodes and use replicated tables (ReplicatedMergeTree) or distributed ON CLUSTER queries. In a single-server installation, Keeper is not mandatory and does not need to be deployed.
Keeper Deployment Options
ClickHouse Keeper can operate in two modes:
- As a separate service. It is installed via the clickhouse-keeper package and runs as a separate process with its own configuration files:
  - main configuration file: /etc/clickhouse-keeper/keeper_config.xml
  - additional files: /etc/clickhouse-keeper/keeper_config.d/*.xml or *.yaml
- As part of the clickhouse-server process. In this case, the <keeper_server> configuration block is added to the main server configuration:
  - main file: /etc/clickhouse-server/config.xml
  - or a separate file in /etc/clickhouse-server/config.d/keeper.xml
The recommended approach for production is to run ClickHouse Keeper on separate nodes with its own configuration files in /etc/clickhouse-keeper/, which allows the coordination cluster to be scaled and maintained independently.
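For the standalone variant, installation is a single package. The commands below are a minimal sketch for a Debian/Ubuntu host and assume the official ClickHouse apt repository has already been added; package names may differ in other distributions.
# Install the standalone Keeper package (assumes the ClickHouse apt repository is configured)
sudo apt-get update
sudo apt-get install -y clickhouse-keeper
# Edit the node-specific configuration before starting the service
sudo vim /etc/clickhouse-keeper/keeper_config.xml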
Basic ClickHouse Keeper Configuration Structure
The main configuration block for Keeper is the <keeper_server> element. Typically, it includes:
Keeper Configuration Block
<clickhouse>
    <logger>
        <level>trace</level>
        <log>/var/log/clickhouse-keeper/clickhouse-keeper.log</log>
        <errorlog>/var/log/clickhouse-keeper/clickhouse-keeper.err.log</errorlog>
        <size>1000M</size>
        <count>10</count>
    </logger>
    <max_connections>4096</max_connections>
    <listen_host>0.0.0.0</listen_host>
    <keeper_server>
        <!-- Port on which clients (ClickHouse servers or applications) connect to Keeper -->
        <tcp_port>9181</tcp_port>
        <!-- Unique identifier of the Keeper node in the cluster -->
        <server_id>1</server_id>
        <log_storage_path>/var/lib/clickhouse/coordination/logs</log_storage_path>
        <snapshot_storage_path>/var/lib/clickhouse/coordination/snapshots</snapshot_storage_path>
        <!-- Internal coordination settings -->
        <coordination_settings>
            <operation_timeout_ms>10000</operation_timeout_ms>
            <min_session_timeout_ms>10000</min_session_timeout_ms>
            <session_timeout_ms>100000</session_timeout_ms>
            <raft_logs_level>information</raft_logs_level>
            <compress_logs>false</compress_logs>
        </coordination_settings>
        <!-- Enable sanity hostname checks for cluster configuration (e.g. if localhost is used with remote endpoints) -->
        <hostname_checks_enabled>true</hostname_checks_enabled>
        <!-- Description of all Keeper nodes participating in the quorum -->
        <raft_configuration>
            <server>
                <id>1</id>
                <!-- Internal port and hostname -->
                <hostname>ch-keeper-01</hostname>
                <port>9234</port>
            </server>
            <server>
                <id>2</id>
                <!-- Internal port and hostname -->
                <hostname>ch-keeper-02</hostname>
                <port>9234</port>
            </server>
            <server>
                <id>3</id>
                <!-- Internal port and hostname -->
                <hostname>ch-keeper-03</hostname>
                <port>9234</port>
            </server>
            <!-- Add more servers here -->
        </raft_configuration>
    </keeper_server>
    <openSSL>
        <server>
            <!-- Used for secure tcp port -->
            <!-- openssl req -subj "/CN=localhost" -new -newkey rsa:2048 -days 365 -nodes -x509 -keyout /etc/clickhouse-server/server.key -out /etc/clickhouse-server/server.crt -->
            <!-- <certificateFile>/etc/clickhouse-keeper/server.crt</certificateFile> -->
            <!-- <privateKeyFile>/etc/clickhouse-keeper/server.key</privateKeyFile> -->
            <!-- dhparams are optional. You can delete the <dhParamsFile> element.
                 To generate dhparams, use the following command:
                 openssl dhparam -out /etc/clickhouse-keeper/dhparam.pem 4096
                 Only file format with BEGIN DH PARAMETERS is supported.
            -->
            <!-- <dhParamsFile>/etc/clickhouse-keeper/dhparam.pem</dhParamsFile> -->
            <verificationMode>none</verificationMode>
            <loadDefaultCAFile>true</loadDefaultCAFile>
            <cacheSessions>true</cacheSessions>
            <disableProtocols>sslv2,sslv3</disableProtocols>
            <preferServerCiphers>true</preferServerCiphers>
        </server>
    </openSSL>
</clickhouse>
Key parameters:
- tcp_port — Port for client connections (ClickHouse servers, clickhouse-keeper-client, utilities). The recommended value is 9181 (to avoid a conflict with the standard ZooKeeper port 2181)
- server_id — Unique numeric identifier of a Keeper node. Values must be unique across the cluster; a simple sequence like 1, 2, 3, ... is recommended
- log_storage_path — Path for RAFT coordination logs
- snapshot_storage_path — Directory for state snapshots (compressed state of the znode tree)
- coordination_settings — Detailed settings for timeouts, heartbeat frequency, snapshot, and log parameters. In most cases, the basic values from the example are sufficient
- raft_configuration — Description of all participants in the RAFT quorum
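If the package did not create the coordination directories referenced by log_storage_path and snapshot_storage_path, they can be prepared manually. This is a minimal sketch; the clickhouse user/group and the paths mirror the example above and may differ in your installation.
# Create the RAFT log and snapshot directories used in the example configuration
sudo mkdir -p /var/lib/clickhouse/coordination/logs /var/lib/clickhouse/coordination/snapshots
# In a standard package installation Keeper runs under the clickhouse user (assumption)
sudo chown -R clickhouse:clickhouse /var/lib/clickhouse/coordination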
A Keeper cluster should consist of an odd number of nodes (typically 3 or 5). A quorum requires a majority of nodes, so an odd-sized cluster gives the best fault tolerance for its size: 3 nodes tolerate the loss of one node, 5 nodes tolerate the loss of two.
Configuring the RAFT Quorum (<raft_configuration>)
The <raft_configuration> block describes all Keeper nodes that participate in the RAFT quorum:
<raft_configuration>
    <secure>false</secure>
    <server>
        <id>1</id>
        <hostname>ch-keeper-01</hostname>
        <port>9234</port>
    </server>
    <server>
        <id>2</id>
        <hostname>ch-keeper-02</hostname>
        <port>9234</port>
    </server>
    <server>
        <id>3</id>
        <hostname>ch-keeper-03</hostname>
        <port>9234</port>
    </server>
</raft_configuration>
For each <server>, the following are defined:
- id — The server's identifier within the RAFT quorum. It must match the server_id in the <keeper_server> block on the corresponding node
- hostname — The hostname through which other nodes can reach this Keeper. Using DNS names rather than IP addresses is recommended to maintain a stable mapping of server_id ↔ hostname
- port — The port for internal communication between Keeper nodes (inter-server RAFT port). This is different from the tcp_port used by clients
When replacing or migrating a Keeper node, it is crucial not to reuse an old server_id for a different physical server and not to "mix up" the server_id ↔ hostname mapping. This is critical for the correctness of the RAFT quorum.
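A quick way to confirm the mapping is to compare the server_id declared on every Keeper host against the <raft_configuration> block. The loop below is a hedged sketch: it assumes the example hostnames ch-keeper-01..03, SSH access to them, and the standalone configuration path.
# Print the server_id configured on each Keeper node (hostnames taken from the example)
for host in ch-keeper-01 ch-keeper-02 ch-keeper-03; do
    echo "== $host =="
    ssh "$host" "grep -E '<server_id>' /etc/clickhouse-keeper/keeper_config.xml"
done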
Internal Coordination Settings
The <coordination_settings> block controls timeouts and RAFT operation parameters:
<coordination_settings>
    <operation_timeout_ms>10000</operation_timeout_ms>
    <min_session_timeout_ms>10000</min_session_timeout_ms>
    <session_timeout_ms>100000</session_timeout_ms>
    <raft_logs_level>information</raft_logs_level>
    <compress_logs>false</compress_logs>
</coordination_settings>
In most cases, it is sufficient to use the recommended default values. Modifying them is only advisable in the event of specific issues (frequent leader re-elections, unstable network, very large metadata volume).
Configuration Placement and Keeper Startup
Separate clickhouse-keeper Service
In this scenario:
- main configuration file: /etc/clickhouse-keeper/keeper_config.xml
- additional files: /etc/clickhouse-keeper/keeper_config.d/*.xml or *.yaml
After configuring, execute the following commands:
sudo systemctl enable clickhouse-keeper
sudo systemctl start clickhouse-keeper
sudo systemctl status clickhouse-keeper
Alternatively, Keeper can be started directly:
clickhouse-keeper --config /etc/clickhouse-keeper/keeper_config.xml
# or
clickhouse keeper --config /etc/clickhouse-keeper/keeper_config.xml
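After the service starts, it is worth checking that Keeper is listening on both the client port (9181) and the inter-server RAFT port (9234). A minimal sketch, assuming the ports from the example configuration and that the ss and nc utilities are available:
# Confirm that the client and RAFT ports are open on this node
sudo ss -tlnp | grep -E '9181|9234'

# Optionally check reachability of the client port from another host
nc -vz ch-keeper-01 9181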
Keeper Embedded within the clickhouse-server Process
If Keeper runs as part of clickhouse-server, the <keeper_server> block is added to the server configuration:
<clickhouse>
    <!-- ... other ClickHouse settings ... -->
    <keeper_server> ... </keeper_server>
</clickhouse>
After modifying the configuration, simply restart the server:
sudo systemctl restart clickhouse-server
In this case, the ClickHouse and Keeper processes are combined. This approach is not recommended for production and is typically used only for small test or development setups.
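To confirm that the embedded Keeper is actually serving after the restart, you can probe its client port. This is a sketch assuming the tcp_port 9181 from the example; the four-letter commands used here are described in more detail below.
# The embedded Keeper should answer on its client port once clickhouse-server is back up
echo ruok | nc localhost 9181   # expected reply: imok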
Integrating ClickHouse Keeper with ClickHouse Server
From the perspective of ClickHouse servers, ClickHouse Keeper appears as a ZooKeeper-compatible service. On each ClickHouse node, the Keeper nodes must be specified in the <zookeeper> section:
<clickhouse>
    <!-- ... -->
    <zookeeper>
        <node>
            <host>ch-keeper-01</host>
            <port>9181</port>
        </node>
        <node>
            <host>ch-keeper-02</host>
            <port>9181</port>
        </node>
        <node>
            <host>ch-keeper-03</host>
            <port>9181</port>
        </node>
    </zookeeper>
    <!-- Macros for configuring replicated tables -->
    <macros>
        <cluster>cluster_name</cluster>
        <shard>01</shard>
        <replica>01</replica>
    </macros>
</clickhouse>
It is important that the list of nodes in <zookeeper> matches the actual Keeper cluster configuration (the same hostname and tcp_port defined in <keeper_server>).
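An end-to-end way to confirm the integration is to create a small replicated table whose metadata is stored in Keeper. The statement below is a hedged sketch: the table name and ZooKeeper path are illustrative, and it assumes the {shard} and {replica} macros from the example above.
# Create a test ReplicatedMergeTree table whose replication metadata lives in Keeper
clickhouse-client --query "
CREATE TABLE default.keeper_smoke_test
(
    id UInt64,
    value String
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/keeper_smoke_test', '{replica}')
ORDER BY id"
# If the statement succeeds, the corresponding znodes appear under /clickhouse/tables in Keeper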
The <listen_host> Parameter and IPv6 Considerations
By default, ClickHouse creates several listen_host entries to listen on the local interface for both IPv4 and IPv6 (127.0.0.1 and ::1). If you need to accept connections from other hosts, the following line is typically added:
<listen_host>0.0.0.0</listen_host>
This is a wildcard address for IPv4 only: the server will listen on all IPv4 interfaces of the node but will not open ports on IPv6. For systems using only IPv4, this option is preferable and additionally avoids several IPv6-related issues.
On some distributions, IPv6 is disabled or not configured in the kernel. In such cases, attempting to listen on the IPv6 wildcard address can cause errors:
<listen_host>::</listen_host>
If the server does not have actual IPv6 support, it is recommended to:
- avoid using <listen_host>::</listen_host>
- explicitly specify the IPv4 option: <listen_host>0.0.0.0</listen_host>
- if necessary, additionally disable IPv6 in the ClickHouse/Keeper configuration: <enable_ipv6>false</enable_ipv6>
This will force ClickHouse to listen only on IPv4 interfaces and eliminate errors related to the absence of IPv6.
Simultaneous IPv4 and IPv6 Support
If IPv6 is used in your infrastructure and the server must accept connections via both protocols, you can use the IPv6 wildcard address instead of 0.0.0.0:
<listen_host>::</listen_host>
In this case, ClickHouse will open ports on all IPv6 and IPv4 interfaces (provided IPv6 is enabled in the system). This mode should only be enabled in combination with proper firewall, network ACL, and user access policy configuration.
Verifying ClickHouse Keeper Functionality
Via ClickHouse
From the perspective of ClickHouse servers, the state of Keeper can be checked by querying the system.zookeeper table:
SELECT *
FROM system.zookeeper
WHERE path IN ('/', '/clickhouse');
The presence of the /clickhouse node and its service branches (e.g., /clickhouse/task_queue/ddl) indicates that Keeper is accessible and being used for coordination.
Via the clickhouse-keeper-client Utility
ClickHouse includes a console utility, clickhouse-keeper-client, which can work with Keeper using its native protocol:
clickhouse-keeper-client -h ch-keeper-01 -p 9181
In interactive mode, commands similar to the ZooKeeper client are available:
- ls / — list child znodes
- get '/clickhouse' — read the value of a node
- set '/clickhouse/test' 'value' — write a value
- exists '/clickhouse' — check if a node exists
This is a convenient way to verify Keeper availability and diagnose the contents of the znode tree.
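For scripted checks, the same commands can be run non-interactively. A minimal sketch, assuming your ClickHouse version of clickhouse-keeper-client supports the -q/--query option and the example hostname and port:
# List the root of the znode tree without entering interactive mode
clickhouse-keeper-client -h ch-keeper-01 -p 9181 -q "ls /"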
Four-letter Commands
Like ZooKeeper, ClickHouse Keeper supports a set of "four-letter" commands, which are sent to the client port via TCP, for example using nc:
echo mntr | nc ch-keeper-01 9181
- the mntr command outputs metric values: node status (leader/follower), connection count, latencies, etc.
- the stat command shows a brief summary of server and client status, and ruok checks service availability (returns imok if operational)
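A few more checks in the same style; they assume the example hostname and client port, that nc is installed, and that the mntr output uses the ZooKeeper-style field names (e.g. zk_server_state):
# Quick liveness probe: an operational node answers "imok"
echo ruok | nc ch-keeper-01 9181

# Determine whether this node is currently the RAFT leader or a follower
echo mntr | nc ch-keeper-01 9181 | grep zk_server_state

# Brief server and client summary
echo stat | nc ch-keeper-01 9181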
Operational Recommendations
- for production clusters, a minimum of 3 ClickHouse Keeper nodes on separate hosts or containers is recommended
- when modifying the cluster topology (adding/removing Keeper nodes), pay close attention to maintaining the uniqueness of server_id and the consistency of the server_id ↔ hostname mapping across all configurations