前言:Zookeeper、Kafak的集群部署这里不过多说明。

版本:Kafka2.3。

一 、 kafak Connect的分布式配置文件

 

  1. ##
    # Licensed to the Apache Software Foundation (ASF) under one or more
    # contributor license agreements.  See the NOTICE file distributed with
    # this work for additional information regarding copyright ownership.
    # The ASF licenses this file to You under the Apache License, Version 2.0
    # (the "License"); you may not use this file except in compliance with
    # the License.  You may obtain a copy of the License at
    #
    #    http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.
    ##
    
    # This file contains some of the configurations for the Kafka Connect distributed worker. This file is intended
    # to be used with the examples, and some settings may differ from those used in a production system, especially
    # the `bootstrap.servers` and those specifying replication factors.
    
    # A list of host/port pairs to use for establishing the initial connection to the Kafka cluster.
    bootstrap.servers=heng-042:9092,heng-043:9092,heng-044:9092
    
    # unique name for the cluster, used in forming the Connect cluster group. Note that this must not conflict with consumer group IDs
    group.id=connect-cluster
    
    # The converters specify the format of data in Kafka and how to translate it into Connect data. Every Connect user will
    # need to configure these based on the format they want their data in when loaded from or stored into Kafka
    key.converter=org.apache.kafka.connect.json.JsonConverter
    value.converter=org.apache.kafka.connect.json.JsonConverter
    # Converter-specific settings can be passed in by prefixing the Converter's setting with the converter we want to apply
    # it to
    key.converter.schemas.enable=true
    value.converter.schemas.enable=true
    
    # Topic to use for storing offsets. This topic should have many partitions and be replicated and compacted.
    # Kafka Connect will attempt to create the topic automatically when needed, but you can always manually create
    # the topic before starting Kafka Connect if a specific topic configuration is needed.
    # Most users will want to use the built-in default replication factor of 3 or in some cases even specify a larger value.
    # Since this means there must be at least as many brokers as the maximum replication factor used, we'd like to be able
    # to run this example on a single-broker cluster and so here we instead set the replication factor to 1.
    offset.storage.topic=connect-offsets
    offset.storage.replication.factor=1
    #offset.storage.partitions=25
    
    # Topic to use for storing connector and task configurations; note that this should be a single partition, highly replicated,
    # and compacted topic. Kafka Connect will attempt to create the topic automatically when needed, but you can always manually create
    # the topic before starting Kafka Connect if a specific topic configuration is needed.
    # Most users will want to use the built-in default replication factor of 3 or in some cases even specify a larger value.
    # Since this means there must be at least as many brokers as the maximum replication factor used, we'd like to be able
    # to run this example on a single-broker cluster and so here we instead set the replication factor to 1.
    config.storage.topic=connect-configs
    config.storage.replication.factor=1
    
    # Topic to use for storing statuses. This topic can have multiple partitions and should be replicated and compacted.
    # Kafka Connect will attempt to create the topic automatically when needed, but you can always manually create
    # the topic before starting Kafka Connect if a specific topic configuration is needed.
    # Most users will want to use the built-in default replication factor of 3 or in some cases even specify a larger value.
    # Since this means there must be at least as many brokers as the maximum replication factor used, we'd like to be able
    # to run this example on a single-broker cluster and so here we instead set the replication factor to 1.
    status.storage.topic=connect-status
    status.storage.replication.factor=1
    #status.storage.partitions=5
    
    # Flush much faster than normal, which is useful for testing/debugging
    offset.flush.interval.ms=10000
    
    # These are provided to inform the user about the presence of the REST host and port configs
    # Hostname & Port for the REST API to listen on. If this is set, it will bind to the interface used to listen to requests.
    #rest.host.name=
    rest.port=8083
    
    # The Hostname & Port that will be given out to other workers to connect to i.e. URLs that are routable from other servers.
    # 此处的两个端口本人在测试时使用的是一直的在分布式下可以自动转发
    rest.advertised.host.name=heng-042
    rest.advertised.port=8083
    
    # Set to a list of filesystem paths separated by commas (,) to enable class loading isolation for plugins
    # (connectors, converters, transformations). The list should consist of top level directories that include
    # any combination of:
    # a) directories immediately containing jars with plugins and their dependencies
    # b) uber-jars with plugins and their dependencies
    # c) directories immediately containing the package directory structure of classes of plugins and their dependencies
    # Examples:
    # plugin.path=/usr/local/share/java,/usr/local/share/kafka/plugins,/opt/connectors,
    #plugin.path=
    

    注意: 要配置的work节点都修改该配置

  2. 启动Kafak Connect(在每台的work节点都要启动)

    connect-distributed.sh -daemon /data/kafka/config/connect-distributed.properties 
  3. 通过jps可以看到进程名为ConnectDistributed的进程

二 、Kafak Connect Rest Api 测试

  1. 添加一个连接器
    curl -i -k  -H "Content-type: application/json" -X POST -d '{"name":"test","config":{"topic":"test-topic","connector.class":"FileStreamSource","tasks.max":"10","file":"/data/log1.out"}}' http://localhost:8083/connectors

     

  2. 查看刚刚创建的连接器
    curl  http://localhost:8083/connectors
  3. 为了验证分布高可用性可以停止部分节点的rest服务。其他Api自行测试

 

 

 

Logo

Kafka开源项目指南提供详尽教程,助开发者掌握其架构、配置和使用,实现高效数据流管理和实时处理。它高性能、可扩展,适合日志收集和实时数据处理,通过持久化保障数据安全,是企业大数据生态系统的核心。

更多推荐