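This post documents how to build a 6-node clickhouse cluster (3 shards, 2 replicas each) from the clickhouse image in two scenarios: 1. running the whole cluster in containers on a single machine; 2. running the cluster in containers across six machines.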

The official image is used here; pull it with the following command:

docker pull clickhouse/clickhouse-server:21.11

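The first approach is covered first:

1. Get the configuration files

    The goal of this setup is to map the clickhouse configuration files out to the host. Because of how the official image is built, however, mounting an empty local directory over the configuration directory prevents the clickhouse container from starting. To work around this, I first start a throwaway clickhouse container and copy its configuration files to the host, as follows:

Start a clickhouse container: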

docker run -d --name clickhouse-server  clickhouse/clickhouse-server:21.11

Copy the configuration directory to the current directory on the host:

docker cp  clickhouse-server:/etc/clickhouse-server/ ./

2. Modify the configuration files

(1) In config.xml, uncomment line 178:

 167 
 168     <!-- Listen specified address.
 169          Use :: (wildcard IPv6 address), if you want to accept connections both with IPv4 and IPv6 from everywhere.
 170          Notes:
 171          If you open connections from wildcard address, make sure that at least one of the following measures applied:
 172          - server is protected by firewall and not accessible from untrusted networks;
 173          - all users are restricted to subset of network addresses (see users.xml);
 174          - all users have strong passwords, only secure (TLS) interfaces are accessible, or connections are only made via TLS interfaces.
 175          - users without password have readonly access.
 176          See also: https://www.shodan.io/search?query=clickhouse
 177       -->
 178     <!-- <listen_host>::</listen_host> --> # uncomment this line
 179 
 180     <!-- Same for hosts without support for IPv6: -->
 181     <!-- <listen_host>0.0.0.0</listen_host> -->
 182 
 183     <!-- Default values - try listen localhost on IPv4 and IPv6. -->
 184     <!--
 185     <listen_host>::1</listen_host>
 186     <listen_host>127.0.0.1</listen_host>
 187     -->
 188 
 189     <!-- Don't exit if IPv6 or IPv4 networks are unavailable while trying to listen. -->
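For anyone who prefers not to edit the file by hand, the uncommenting step can be sketched with sed. This is a minimal sketch, not part of the original procedure: the function name `uncomment_listen_host` is made up for illustration, and the example path assumes the directory produced by the `docker cp` step.

```shell
#!/bin/sh
# Uncomment the wildcard <listen_host>::</listen_host> line in a config.xml.
# The result is written to a temporary file first, so the same command works
# with both GNU sed (Linux) and BSD sed (macOS).
uncomment_listen_host() {
    cfg="$1"
    sed 's|<!-- *<listen_host>::</listen_host> *-->|<listen_host>::</listen_host>|' "$cfg" > "$cfg.tmp" &&
        mv "$cfg.tmp" "$cfg"
}

# Usage (the path is an assumption based on the docker cp step):
# uncomment_listen_host ./clickhouse-server/config.xml
```

Only the `::` line is touched; the commented `0.0.0.0` line below it stays as it is.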

(2) After line 614 of config.xml, add two lines and comment out one line, as shown below:

 612     <!-- Configuration of clusters that could be used in Distributed tables.
 613          https://clickhouse.com/docs/en/operations/table_engines/distributed/
 614       -->
 615     <include_from>/etc/clickhouse-server/metrika.xml</include_from> # added
 616     <remote_servers incl="clickhouse_remote_servers" > # added
 617     <!-- <remote_servers> --> # comment out or delete
 618         <!-- Test only shard config for testing distributed storage -->
 619         <test_shard_localhost>
 620             <!-- Inter-server per-cluster secret for Distributed queries
 621                  default: no secret (no authentication will be performed)
 622                  
 623                  If set, then Distributed queries will be validated on shards, so at least:
 624                  - such cluster should exist on the shard,
 625                  - such cluster should have the same secret.
 626 
 627                  And also (and which is more important), the initial_user will
 628                  be used as current user for the query.
 629 
 630                  Right now the protocol is pretty simple and it only takes into account:

(3) At line 755 of config.xml (the line number after the previous step), add the following (<zookeeper> and <macros>):

 748       -->
 749 
 750     <!-- ZooKeeper is used to store metadata about replicas, when using Replicated tables.
 751          Optional. If you don't use replicated tables, you could omit that.
 752 
 753          See https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/replication/
 754       -->
 755     <zookeeper incl="zookeeper-servers" optional="true" />
 756     <macros incl="macros" optional="true"/>
 757     <!--
 758     <zookeeper>
 759         <node>
 760             <host>example1</host>
 761             <port>2181</port>
 762         </node>
 763         <node>
 764             <host>example2</host>
 765             <port>2181</port>
 766         </node>
 767         <node>
 768             <host>example3</host>
 769             <port>2181</port>

(4) Edit users.xml and set a password for the default user:

 66                  In first line will be password and in second - corresponding double SHA1.
 67             -->
 68             <password>jiubugaosuni</password>
 69 
 70             <!-- List of networks with open access.
 71 
 72                  To open access from everywhere, specify:
 73                     <ip>::/0</ip>
 74 
 75                  To open access only from localhost, specify:
 76                     <ip>::1</ip>
 77                     <ip>127.0.0.1</ip>

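(5) Add a new configuration file, metrika.xml (the clickhouse cluster configuration). A clickhouse cluster depends on zookeeper, so a zookeeper container cluster is required here; see the separate write-up on configuring the zookeeper container cluster. The <macros> section of this file differs from node to node and is defined by which shard and replica each clickhouse server is. The metrika.xml for clickhouse-server01 is shown below: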

<yandex>
<clickhouse_remote_servers>
    <perftest_3shards_2replicas>
        <shard>
            <internal_replication>true</internal_replication>
            <replica>
                <host>clickhouse-server01</host>
                <port>9000</port>
                <user>default</user>
                <password>jiubugaosuni</password>
            </replica>
            <replica>
                <host>clickhouse-server02</host>
                <port>9000</port>
                <user>default</user>
                <password>jiubugaosuni</password>
            </replica>
        </shard>
        <shard>
            <internal_replication>true</internal_replication>
            <replica>
                <host>clickhouse-server03</host>
                <port>9000</port>
                <user>default</user>
                <password>jiubugaosuni</password>
            </replica>
            <replica>
                <host>clickhouse-server04</host>
                <port>9000</port>
                <user>default</user>
                <password>jiubugaosuni</password>
            </replica>
        </shard>
        <shard>
            <internal_replication>true</internal_replication>
            <replica>
                <host>clickhouse-server05</host>
                <port>9000</port>
                <user>default</user>
                <password>jiubugaosuni</password>
            </replica>
            <replica>
                <host>clickhouse-server06</host>
                <port>9000</port>
                <user>default</user>
                <password>jiubugaosuni</password>
            </replica>
        </shard>
    </perftest_3shards_2replicas>
</clickhouse_remote_servers>

<zookeeper-servers>
  <node index="1">
    <host>zookeeper1</host>
    <port>2181</port>
  </node>
  <node index="2">
    <host>zookeeper2</host>
    <port>2181</port>
  </node>
  <node index="3">
    <host>zookeeper3</host>
    <port>2181</port>
  </node>
</zookeeper-servers>

<macros>
    <layer>01</layer>
    <shard>01</shard>
    <replica>cluster01-01-01</replica>
</macros>


<networks>
   <ip>::/0</ip>
</networks>


<clickhouse_compression>
<case>
  <min_part_size>10000000000</min_part_size>
  <min_part_size_ratio>0.01</min_part_size_ratio>
  <method>lz4</method>
</case>
</clickhouse_compression>

</yandex>

Below is the <macros> section of metrika.xml for clickhouse-server02:

<macros>
    <layer>01</layer>
    <shard>01</shard>
    <replica>cluster02-01-02</replica>
</macros>

Below is the <macros> section of metrika.xml for clickhouse-server03:

<macros>
    <layer>02</layer>
    <shard>02</shard>
    <replica>cluster03-02-01</replica>
</macros>

Below is the <macros> section of metrika.xml for clickhouse-server04:

<macros>
    <layer>02</layer>
    <shard>02</shard>
    <replica>cluster04-02-02</replica>
</macros>

Below is the <macros> section of metrika.xml for clickhouse-server05:

<macros>
    <layer>03</layer>
    <shard>03</shard>
    <replica>cluster05-03-01</replica>
</macros>

Below is the <macros> section of metrika.xml for clickhouse-server06:

<macros>
    <layer>03</layer>
    <shard>03</shard>
    <replica>cluster06-03-02</replica>
</macros>
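The six <macros> blocks follow a regular pattern: nodes 1 and 2 form shard 01, nodes 3 and 4 shard 02, nodes 5 and 6 shard 03, and the replica name is cluster<node>-<shard>-<replica>. A small sketch that prints the block for each node from that pattern (the function name `macros_for_node` is made up for illustration; the output still has to be pasted into each node's metrika.xml):

```shell
#!/bin/sh
# Print the <macros> block for node N (1..6) in a 3-shard, 2-replica layout.
# Naming follows the pattern used in this article: cluster<node>-<shard>-<replica>.
macros_for_node() {
    node="$1"
    shard=$(( (node + 1) / 2 ))        # nodes 1,2 -> shard 1; 3,4 -> 2; 5,6 -> 3
    replica=$(( (node - 1) % 2 + 1 ))  # odd nodes -> replica 1; even nodes -> replica 2
    printf '<macros>\n'
    printf '    <layer>%02d</layer>\n' "$shard"
    printf '    <shard>%02d</shard>\n' "$shard"
    printf '    <replica>cluster%02d-%02d-%02d</replica>\n' "$node" "$shard" "$replica"
    printf '</macros>\n'
}

# Print all six blocks, one after another.
for n in 1 2 3 4 5 6; do
    macros_for_node "$n"
done
```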

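2. Create configuration directories for the 6 nodes and copy the configuration files into them. Note that metrika.xml differs for every clickhouse-server and must be edited individually!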

mkdir clickhouse-server01
mkdir clickhouse-server02
mkdir clickhouse-server03
mkdir clickhouse-server04
mkdir clickhouse-server05
mkdir clickhouse-server06

cp -r ./clickhouse-server/* ./clickhouse-server01/
cp -r ./clickhouse-server/* ./clickhouse-server02/
cp -r ./clickhouse-server/* ./clickhouse-server03/
cp -r ./clickhouse-server/* ./clickhouse-server04/
cp -r ./clickhouse-server/* ./clickhouse-server05/
cp -r ./clickhouse-server/* ./clickhouse-server06/
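The twelve commands above can also be written as a loop. A minimal sketch, assuming the template directory is the one produced by `docker cp`; the function name `prepare_node_dirs` is hypothetical:

```shell
#!/bin/sh
# Create clickhouse-server01..06 under a base directory and copy the
# template configuration directory into each of them.
prepare_node_dirs() {
    template="$1"   # e.g. ./clickhouse-server
    base="$2"       # e.g. .
    for i in 1 2 3 4 5 6; do
        dir=$(printf '%s/clickhouse-server%02d' "$base" "$i")
        # "template/." copies the directory contents, including hidden files.
        mkdir -p "$dir" && cp -r "$template"/. "$dir"/ || return 1
    done
}

# Usage: prepare_node_dirs ./clickhouse-server .
```

Each node's metrika.xml still needs its own <macros> section after the copy.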

3. Start the containers

docker run -d --name clickhouse-server01 --network=zk_network --ulimit nofile=262144:262144   -v /Users/ch/chetc/clickhouse-server01/:/etc/clickhouse-server/ -p 8123:8123 clickhouse/clickhouse-server:21.11
docker run -d --name clickhouse-server02 --network=zk_network --ulimit nofile=262144:262144   -v /Users/ch/chetc/clickhouse-server02/:/etc/clickhouse-server/ -p 8124:8123 clickhouse/clickhouse-server:21.11
docker run -d --name clickhouse-server03 --network=zk_network --ulimit nofile=262144:262144   -v /Users/ch/chetc/clickhouse-server03/:/etc/clickhouse-server/ -p 8125:8123 clickhouse/clickhouse-server:21.11
docker run -d --name clickhouse-server04 --network=zk_network --ulimit nofile=262144:262144   -v /Users/ch/chetc/clickhouse-server04/:/etc/clickhouse-server/ -p 8126:8123 clickhouse/clickhouse-server:21.11
docker run -d --name clickhouse-server05 --network=zk_network --ulimit nofile=262144:262144   -v /Users/ch/chetc/clickhouse-server05/:/etc/clickhouse-server/ -p 8127:8123 clickhouse/clickhouse-server:21.11
docker run -d --name clickhouse-server06 --network=zk_network --ulimit nofile=262144:262144   -v /Users/ch/chetc/clickhouse-server06/:/etc/clickhouse-server/ -p 8128:8123 clickhouse/clickhouse-server:21.11
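The six commands differ only in the node number and the host port (8123 for node 1 up to 8128 for node 6). A sketch that prints them from a loop, so they can be reviewed before being run; `CONF_BASE` and `run_cmd_for_node` are names introduced here for illustration, and the default path is the one used in the commands above:

```shell
#!/bin/sh
# Directory that holds clickhouse-server01..06 (assumed variable, override as needed).
CONF_BASE="${CONF_BASE:-/Users/ch/chetc}"

# Print the docker run command for node N; host port = 8122 + N.
run_cmd_for_node() {
    name=$(printf 'clickhouse-server%02d' "$1")
    port=$((8122 + $1))
    printf 'docker run -d --name %s --network=zk_network --ulimit nofile=262144:262144 -v %s/%s/:/etc/clickhouse-server/ -p %s:8123 clickhouse/clickhouse-server:21.11\n' \
        "$name" "$CONF_BASE" "$name" "$port"
}

# Dry run: only prints the commands. Pipe the output to sh to start the containers.
for n in 1 2 3 4 5 6; do
    run_cmd_for_node "$n"
done
```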

To also map the data files to the host, use a command like this:

docker run -d --name clickhouse-server02 --network=zk_network --ulimit nofile=262144:262144 --volume=$HOME/clickhouse-server02:/var/lib/clickhouse  -v $HOME/clickhouse/clickhouse-server02/:/etc/clickhouse-server/ -p 8124:8123 clickhouse/clickhouse-server:21.11

4. Verify the cluster

Run the following command to enter the clickhouse client, then run these two SQL queries to inspect the cluster:

% docker exec -it clickhouse-server01 clickhouse-client --user=default --password=jiubugaosuni

ClickHouse client version 21.11.11.1 (official build).
Connecting to localhost:9000 as user default.
Connected to ClickHouse server version 21.11.11 revision 54450.

6d65d19291ad :) SELECT * FROM system.clusters;
6d65d19291ad :) select * from system.zookeeper where path='/clickhouse';

If the queries return the IP address and other details of every node, the cluster should be working.
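The same check can be scripted instead of read by eye. The sketch below assumes the cluster name perftest_3shards_2replicas from metrika.xml and uses the `shard_num`, `replica_num` and `host_name` columns of `system.clusters`; the helper `check_layout` is a made-up name and only verifies that the rows describe 3 shards with 2 replicas each.

```shell
#!/bin/sh
# Read "shard_num<TAB>replica_num<TAB>host_name" rows on stdin and succeed
# only if they describe exactly 3 shards with 2 replicas each.
check_layout() {
    awk -F '\t' '
        { replicas[$1]++ }
        END {
            for (s in replicas) { shards++; if (replicas[s] != 2) bad = 1 }
            if (shards != 3 || bad) exit 1
        }'
}

# Usage (container name, user and password as used in this article):
# docker exec clickhouse-server01 clickhouse-client --user=default --password=jiubugaosuni \
#   --query "SELECT shard_num, replica_num, host_name FROM system.clusters WHERE cluster = 'perftest_3shards_2replicas' FORMAT TSV" \
#   | check_layout && echo "cluster layout OK"
```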

5. Create cluster tables

All of the following commands are run in the clickhouse client.

(1) Create a replicated table. Run on node 1:

% docker exec -it clickhouse-server01 clickhouse-client --user=default --password=jiubugaosuni

ClickHouse client version 21.11.11.1 (official build).
Connecting to localhost:9000 as user default.
Connected to ClickHouse server version 21.11.11 revision 54450.


6d65d19291ad :) CREATE TABLE t1(`id` Int32,`name` String) ENGINE = ReplicatedMergeTree('/clickhouse/tables/01-01/t1','cluster01-01-1') ORDER BY id ;

CREATE TABLE t1
(
    `id` Int32,
    `name` String
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/01-01/t1', 'cluster01-01-1')
ORDER BY id

Query id: a8dbc590-25ab-4a26-98cc-e2f7f8142d54

Ok.

0 rows in set. Elapsed: 0.369 sec. 

6d65d19291ad :) 

Run on node 2:

 % docker exec -it clickhouse-server02 clickhouse-client --user=default --password=jiubugaosuni

ClickHouse client version 21.11.11.1 (official build).
Connecting to localhost:9000 as user default.
Connected to ClickHouse server version 21.11.11 revision 54450.

7319dec54df9 :) CREATE TABLE t1(`id` Int32,`name` String) ENGINE = ReplicatedMergeTree('/clickhouse/tables/01-01/t1','cluster01-01-2') ORDER BY id ;

CREATE TABLE t1
(
    `id` Int32,
    `name` String
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/01-01/t1', 'cluster01-01-2')
ORDER BY id

Query id: 2b106332-2948-405b-be3e-8b71a33a0bf7

Ok.

0 rows in set. Elapsed: 0.235 sec. 

7319dec54df9 :) 

Next, insert data on node 2 and check on node 1 whether it has been replicated:

7319dec54df9 :) insert into t1 values(1,'aa'),(2,'bb'),(3,'cc');

INSERT INTO t1 FORMAT Values

Query id: d7498a8b-34e5-46e3-a1a9-37a9d44dd36d

Ok.

3 rows in set. Elapsed: 0.103 sec. 

7319dec54df9 :) exit
Bye.
 % docker exec -it clickhouse-server01 clickhouse-client --user=default --password=jiubugaosuni

ClickHouse client version 21.11.11.1 (official build).
Connecting to localhost:9000 as user default.
Connected to ClickHouse server version 21.11.11 revision 54450.

6d65d19291ad :) select * from t1;

SELECT *
FROM t1

Query id: 06c32ea1-6d35-4c44-95d9-4474ba0b3c58

┌─id─┬─name─┐
│  1 │ aa   │
│  2 │ bb   │
│  3 │ cc   │
└────┴──────┘

3 rows in set. Elapsed: 0.009 sec. 

6d65d19291ad :) 

The replicated table works!
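The statements above hard-code the ZooKeeper path and the replica name on each node. Since every node already defines {shard} and {replica} in its <macros> section, a single statement with ON CLUSTER could in principle create the table on all six nodes. This is a sketch under assumptions, not something run in this article: it assumes the default distributed_ddl section is still present in config.xml, and `t1_macro` is a hypothetical table name chosen so it does not clash with t1.

```shell
#!/bin/sh
# Print a CREATE TABLE statement that relies on the {shard} and {replica}
# macros from metrika.xml instead of hard-coded names.
replicated_ddl() {
    table="$1"
    cluster="$2"
    cat <<EOF
CREATE TABLE ${table} ON CLUSTER ${cluster}
(
    id Int32,
    name String
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/${table}', '{replica}')
ORDER BY id;
EOF
}

# t1_macro is a hypothetical table name; the cluster name comes from metrika.xml.
replicated_ddl t1_macro perftest_3shards_2replicas
```

The printed statement can be pasted into clickhouse-client on any one node. With these macros, the two replicas of a shard share the same ZooKeeper path and differ only in the replica name, which is what the manual statements above achieve by hand.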

(2) Create a distributed table:

To be continued
