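I previously wrote about configuring a scrapy-cluster deployment but skipped the underlying environment setup, so this post covers installing and configuring ZooKeeper and Kafka, starting with ZooKeeper.
Environment: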

[root@shulaibao4 ~]# more /etc/issue
CentOS release 6.2 (Final)
Kernel \r on an \m

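Installing Java
I used jdk-8u121-linux-x64.tar.gz.
After downloading, extract it: tar -zxvf jdk-8u121-linux-x64.tar.gz
Then move it under /usr/java/ (create the directory first, otherwise mv would simply rename the JDK directory to /usr/java): mkdir -p /usr/java && mv jdk1.8.0_121 /usr/java/
Next, configure the environment: vim /etc/profile
Add the following lines to the file: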

export JAVA_HOME=/usr/java/jdk1.8.0_121
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=$CLASSPATH:$JAVA_HOME/lib:.

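Save and exit, then run source /etc/profile on the command line to load the configuration.
Finally, run java -version to verify that Java is installed correctly; if it fails, check your configuration.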
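
If java -version fails, the usual cause is a JAVA_HOME that does not point at the unpacked JDK. The following is a minimal sketch of a sanity check; the function name check_java_home is hypothetical and not part of the JDK.

```shell
# check_java_home DIR: succeed only if DIR/bin/java exists and is executable.
# (Hypothetical helper, shown for illustration only.)
check_java_home() {
    dir=${1:-}
    if [ -z "$dir" ]; then
        echo "JAVA_HOME is not set" >&2
        return 1
    fi
    if [ ! -x "$dir/bin/java" ]; then
        echo "no executable java found under $dir/bin" >&2
        return 1
    fi
    echo "JAVA_HOME looks valid: $dir"
}
```

Typical use after source /etc/profile would be: check_java_home "$JAVA_HOME" && java -version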

Installing ZooKeeper, with a detailed walkthrough
Download:

[root@shulaibao4 ~]# cd /usr/lib
[root@shulaibao4 lib]# wget http://mirror.bit.edu.cn/apache/zookeeper/zookeeper-3.4.10/zookeeper-3.4.10.tar.gz

After the download completes, extract the archive and enter the directory:

[root@shulaibao4 lib]# tar -zxf zookeeper-3.4.10.tar.gz
[root@shulaibao4 lib]# cd zookeeper-3.4.10
[root@shulaibao4 zookeeper-3.4.10]# cd conf
[root@shulaibao4 conf]# cp zoo_sample.cfg zoo.cfg
[root@shulaibao4 conf]# vim zoo.cfg

The original contents of zoo.cfg are:

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial 
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between 
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just 
# example sakes.
dataDir=/tmp/zookeeper
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the 
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1

In standalone mode, zoo.cfg works as-is without any changes:

[root@shulaibao4 conf]# cd ..
[root@shulaibao4 zookeeper-3.4.10]# bin/zkServer.sh start 
ZooKeeper JMX enabled by default
Using config: /usr/lib/zookeeper-3.4.10/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@shulaibao4 zookeeper-3.4.10]# bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/lib/zookeeper-3.4.10/bin/../conf/zoo.cfg
Mode: standalone
[root@shulaibao4 zookeeper-3.4.10]# bin/zkServer.sh stop
ZooKeeper JMX enabled by default
Using config: /usr/lib/zookeeper-3.4.10/bin/../conf/zoo.cfg
Stopping zookeeper STOPPED
[root@shulaibao4 zookeeper-3.4.10]# 

Next, let's go through the configuration parameters:

tickTime=2000 
# Heartbeat interval in milliseconds, used to keep the client-server connection alive. The minimum session timeout is two ticks.

initLimit=10
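# Number of ticks within which other servers may connect to the leader and finish their initial data sync. Increase this value if ZooKeeper manages a large amount of data.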

syncLimit=5
# Number of ticks within which a follower must sync with the leader. A follower that falls too far behind is dropped.

#dataDir=/tmp/zookeeper
dataDir=/home/zookeeper/data
# Directory where snapshots of the in-memory database are stored. In a cluster, the myid file also lives in this directory.

dataLogDir=/home/zookeeper/log
# Separate directory for the transaction log. Keeping the transaction log apart avoids contention with snapshots and ordinary logging.

clientPort=2181
# Port on which the server listens for client connections.

#maxClientCnxns=60
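# Maximum number of concurrent connections that a single client (identified by IP address) may open to one server; it guards against certain DoS attacks. The default in the 3.4 line is 60; setting it to 0 removes the limit.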

#autopurge.snapRetainCount=3
# Number of snapshots to retain in dataDir.

#autopurge.purgeInterval=1
# Interval between purge runs, in hours. Set to "0" to disable automatic purging.

server.1=localhost:20881:30881
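# Lists every machine that takes part in the cluster.
# The format is server.A = B : C : D
# A is the server ID. It must match the value written to the myid file in the data directory and must be unique within the cluster.
# B is the hostname (or IP address) of the machine.
# C is the port followers use to communicate with the leader.
# D is the port used for leader election.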
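
Since initLimit and syncLimit are counted in ticks, the real time windows depend on tickTime. As a rough sketch, the arithmetic can be read straight out of a zoo.cfg; the helper names cfg_value and zk_timeouts are hypothetical, and the sketch assumes all three keys are present and uncommented.

```shell
# cfg_value FILE KEY: print the value of an uncommented KEY=value line.
# (Hypothetical helper; if the key appears more than once, the last one wins.)
cfg_value() {
    awk -F= -v k="$2" '$1 == k { gsub(/[ \t\r]/, "", $2); v = $2 } END { print v }' "$1"
}

# zk_timeouts FILE: convert the tick-based limits into milliseconds.
zk_timeouts() {
    tick=$(cfg_value "$1" tickTime)
    init=$(cfg_value "$1" initLimit)
    sync=$(cfg_value "$1" syncLimit)
    echo "initLimit window: $((tick * init)) ms"
    echo "syncLimit window: $((tick * sync)) ms"
    echo "minimum session timeout: $((tick * 2)) ms"
}
```

With the values used in this post (tickTime=2000, initLimit=10, syncLimit=5), zk_timeouts conf/zoo.cfg reports 20000 ms, 10000 ms and 4000 ms respectively.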

After finishing the configuration, save and exit, then create ZooKeeper's data and log directories (mine are under /home) and write the myid file:

[root@shulaibao4 zookeeper-3.4.10]# mkdir /home/zookeeper
[root@shulaibao4 zookeeper-3.4.10]# mkdir /home/zookeeper/data
[root@shulaibao4 zookeeper-3.4.10]# mkdir /home/zookeeper/log
[root@shulaibao4 zookeeper-3.4.10]# echo 1 > /home/zookeeper/data/myid
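
When the same steps have to be repeated on several machines, it can help to wrap them in a function that is safe to re-run. This is only a sketch: init_zk_dirs is a hypothetical name, and the 1 to 255 range check follows the range the ZooKeeper administrator's guide recommends for myid.

```shell
# init_zk_dirs BASE ID: create BASE/data and BASE/log, then write BASE/data/myid.
# (Hypothetical helper; mkdir -p makes it safe to run more than once.)
init_zk_dirs() {
    base=$1
    id=$2
    case $id in
        ''|*[!0-9]*) echo "myid must be an integer" >&2; return 1 ;;
    esac
    if [ "$id" -lt 1 ] || [ "$id" -gt 255 ]; then
        echo "myid must be between 1 and 255" >&2
        return 1
    fi
    mkdir -p "$base/data" "$base/log" || return 1
    echo "$id" > "$base/data/myid"
}
```

On the first machine this would be called as init_zk_dirs /home/zookeeper 1, matching the dataDir and dataLogDir set in zoo.cfg.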

Next, add ZooKeeper to the system environment variables: open /etc/profile with vim and add:

export PATH=/usr/lib/zookeeper-3.4.10/bin:$PATH

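Then run source /etc/profile.
Now let's look at the instance we just set up. Because zoo.cfg lists only one server entry, it still reports standalone mode: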

[root@shulaibao4 zookeeper-3.4.10]# cd 
[root@shulaibao4 ~]# zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /usr/lib/zookeeper-3.4.10/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@shulaibao4 ~]# zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/lib/zookeeper-3.4.10/bin/../conf/zoo.cfg
Mode: standalone
[root@shulaibao4 ~]# zkCli.sh
Connecting to localhost:2181
2017-05-27 16:45:05,671 [myid:] - INFO  [main:Environment@100] - Client environment:zookeeper.version=3.4.10-39d3a4f269333c922ed3db283be479f9deacaa0f, built on 03/23/2017 10:13 GMT
2017-05-27 16:45:05,678 [myid:] - INFO  [main:Environment@100] - Client environment:host.name=shulaibao4
2017-05-27 16:45:05,678 [myid:] - INFO  [main:Environment@100] - Client environment:java.version=jdk1.8.0_121
2017-05-27 16:45:05,683 [myid:] - INFO  [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
2017-05-27 16:45:05,683 [myid:] - INFO  [main:Environment@100] - Client environment:java.home=/usr/java/jdk1.8.0_121/jre
2017-05-27 16:45:05,684 [myid:] - INFO  [main:Environment@100] - Client environment:java.class.path=/usr/lib/zookeeper-3.4.10/bin/../build/classes:/usr/lib/zookeeper-3.4.10/bin/../build/lib/*.jar:/usr/lib/zookeeper-3.4.10/bin/../lib/slf4j-log4j12-1.6.1.jar:/usr/lib/zookeeper-3.4.10/bin/../lib/slf4j-api-1.6.1.jar:/usr/lib/zookeeper-3.4.10/bin/../lib/netty-3.10.5.Final.jar:/usr/lib/zookeeper-3.4.10/bin/../lib/log4j-1.2.16.jar:/usr/lib/zookeeper-3.4.10/bin/../lib/jline-0.9.94.jar:/usr/lib/zookeeper-3.4.10/bin/../zookeeper-3.4.10.jar:/usr/lib/zookeeper-3.4.10/bin/../src/java/lib/*.jar:/usr/lib/zookeeper-3.4.10/bin/../conf::/usr/java/jdk1.8.0_121/lib:.:/usr/java/jdk1.8.0_121/lib:.
2017-05-27 16:45:05,684 [myid:] - INFO  [main:Environment@100] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2017-05-27 16:45:05,684 [myid:] - INFO  [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2017-05-27 16:45:05,684 [myid:] - INFO  [main:Environment@100] - Client environment:java.compiler=<NA>
2017-05-27 16:45:05,685 [myid:] - INFO  [main:Environment@100] - Client environment:os.name=Linux
2017-05-27 16:45:05,685 [myid:] - INFO  [main:Environment@100] - Client environment:os.arch=amd64
2017-05-27 16:45:05,685 [myid:] - INFO  [main:Environment@100] - Client environment:os.version=2.6.32-220.el6.x86_64
2017-05-27 16:45:05,685 [myid:] - INFO  [main:Environment@100] - Client environment:user.name=root
2017-05-27 16:45:05,686 [myid:] - INFO  [main:Environment@100] - Client environment:user.home=/root
2017-05-27 16:45:05,686 [myid:] - INFO  [main:Environment@100] - Client environment:user.dir=/root
2017-05-27 16:45:05,689 [myid:] - INFO  [main:ZooKeeper@438] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@57dbe098
2017-05-27 16:45:05,742 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1032] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
Welcome to ZooKeeper!
2017-05-27 16:45:05,761 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@876] - Socket connection established to localhost/127.0.0.1:2181, initiating session
JLine support is enabled
2017-05-27 16:45:05,811 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1299] - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x15c4911d2d00000, negotiated timeout = 30000

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 0] ls

This is the configuration for a single machine. For multiple machines, you only need to edit zoo.cfg and add one entry per host:

server.A = B : C : D

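Follow the same rules, then scp it to the other machines. On each of them, do the following:
1. Check that Java is installed.
2. Create the corresponding data and log directories.
3. Write that machine's own myid value into the data directory.
4. Add the ZooKeeper path to /etc/profile.
Once this is done, the cluster is complete. Start ZooKeeper on every machine and check each one's state. I set up three machines here; after starting the cluster, their states are as follows: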
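
The server.A entries for several hosts can be generated rather than typed by hand. The sketch below assumes the same ports as the single-machine example (20881 and 30881) and numbers the hosts in the order given; gen_server_lines is a hypothetical helper, and which host receives which ID is an assumption rather than something this post specifies.

```shell
# gen_server_lines HOST...: print one server.N=HOST:20881:30881 line per host,
# numbering from 1 in argument order. (Hypothetical helper.)
gen_server_lines() {
    id=1
    for host in "$@"; do
        echo "server.$id=$host:20881:30881"
        id=$((id + 1))
    done
}

gen_server_lines shulaibao4 shulaibao3 workstation1
```

The printed lines would be appended to zoo.cfg on every machine, and the number after "server." is the value each host must have in its own myid file.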

Machine 1:

[root@shulaibao4 ~]# zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/lib/zookeeper-3.4.10/bin/../conf/zoo.cfg
Mode: leader
[root@shulaibao4 ~]# 

Machine 2:

[root@shulaibao3 ~]# zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/lib/zookeeper-3.4.10/bin/../conf/zoo.cfg
Mode: follower
[root@shulaibao3 ~]#

Machine 3:

[root@workstation1 ~]# zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /data4/zookeeper-3.4.10/bin/../conf/zoo.cfg
Mode: follower
[root@workstation1 ~]#
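
To check the role of each node from a script instead of reading the output by eye, the Mode line can be filtered out of the status output. This is a small sketch; zk_mode is a hypothetical name, and it assumes the "Mode: ..." line format shown in the transcripts above.

```shell
# zk_mode: read `zkServer.sh status` output on stdin and print only the mode
# (leader, follower or standalone). (Hypothetical helper.)
zk_mode() {
    sed -n 's/^Mode:[[:space:]]*//p'
}

# Demo on a captured sample; on a real host: zkServer.sh status | zk_mode
printf 'Using config: /usr/lib/zookeeper-3.4.10/bin/../conf/zoo.cfg\nMode: follower\n' | zk_mode
```

A healthy three-node cluster should show exactly one leader and two followers, as in the output above.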

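zkCli.sh (zkCli.cmd) in detail
zkCli.sh (zkCli.cmd) is the command-line tool that ships with ZooKeeper. It lets you inspect server state and create, modify, and delete data. zkCli.sh is the Linux script and zkCli.cmd is the Windows one; the examples below use Linux.
On a standalone server, just run zkCli.sh; when connecting to a cluster, specify one of the hosts in it: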

[root@shulaibao4 usr]# zkCli.sh -server shulaibao4:2181
Connecting to shulaibao4:2181
      ......  output omitted  ......
WatchedEvent state:SyncConnected type:None path:null
[zk: 172.31.22.13:2181(CONNECTED) 0] 
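
The -server option also accepts a comma-separated list of host:port pairs, so the client can use another node if the first one is unavailable. As a sketch, such a connection string can be assembled like this; zk_connect_string is a hypothetical helper, and it assumes every node uses the same client port.

```shell
# zk_connect_string PORT HOST...: join the hosts into "h1:PORT,h2:PORT,...".
# (Hypothetical helper.)
zk_connect_string() {
    port=$1
    shift
    out=
    for host in "$@"; do
        out="${out:+$out,}$host:$port"
    done
    echo "$out"
}

zk_connect_string 2181 shulaibao4 shulaibao3 workstation1
```

The result can be passed straight to the client: zkCli.sh -server "$(zk_connect_string 2181 shulaibao4 shulaibao3 workstation1)"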

If you have questions, join QQ group 526855734.
