Download

SeaTunnel 2.3.1 source code

Directory structure in IntelliJ IDEA

Build

Build the code with Maven.

Build command

mvn clean package -pl seatunnel-dist -am -Dmaven.test.skip=true

Command to build a single module

mvn clean package -pl seatunnel-examples/seatunnel-engine-examples -am -Dmaven.test.skip=true -T 1C
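The two commands above differ only in the module passed to `-pl` and in the optional `-T 1C` flag. In standard Maven usage, `-pl` selects the modules to build, `-am` also builds the modules they depend on, `-Dmaven.test.skip=true` skips compiling and running tests, and `-T 1C` uses one build thread per CPU core. As a rough illustration, the small helper below assembles the same command lines; the function name `maven_build_command` and its parameters are hypothetical and not part of SeaTunnel or Maven.

```python
import shlex


def maven_build_command(modules, skip_tests=True, threads=None):
    """Assemble a 'mvn clean package' command for the given modules.

    Hypothetical helper for illustration only.
    """
    # -pl takes a comma-separated module list; -am also builds their dependencies.
    cmd = ["mvn", "clean", "package", "-pl", ",".join(modules), "-am"]
    if skip_tests:
        # Skip both test compilation and test execution.
        cmd.append("-Dmaven.test.skip=true")
    if threads:
        # For example "1C" means one thread per CPU core.
        cmd += ["-T", threads]
    return cmd


# Full distribution build, as in the first command above.
print(shlex.join(maven_build_command(["seatunnel-dist"])))

# Single-module build with a parallel build, as in the second command above.
print(shlex.join(maven_build_command(
    ["seatunnel-examples/seatunnel-engine-examples"], threads="1C")))
```

The returned list can be passed to `subprocess.run` from the root of the source tree if the build should be launched from a script.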

Run

After the build finishes, run the job through the SeaTunnelEngineExample class.

The job now runs successfully.

Sample configuration files

Kafka to Redis

env {
    execution.parallelism = 1
    job.mode = STREAMING
    checkpoint.interval = 20000
}
source {
    Kafka {
        bootstrap.servers = "xxx:9092,xxx:9092,xxx:9092"
        topic = "test_in"
        consumer.group = "1673212376113"
        format = "json"
        result_table_name = "kafka"
        schema = {
            fields {
                cont = "STRING"
            }
        }
    }
}
sink {
    Redis {
        host = "xxx.xxx.x.xxx"
        port = "6379"
        key = "test_20230507"
        data_type = list
        auth = "xxx"
    }
}

MySQL to Redis

env {
  execution.parallelism = 2
  job.mode = "BATCH"
}

source {
  Jdbc {
    url = "jdbc:mysql://xxxxxx:3306/xxxxx"
    driver = "com.mysql.cj.jdbc.Driver"
    connection_check_timeout_sec = 100
    user = "root"
    password = "xxxxx"
    query = "select * from test"
    # partition_column = "id"
    # partition_num = 10
  }
}

sink {
  Redis {
    host = xxxxxx
    port = 6379
    key = "seatunnel_jdbc"
    data_type = list
    auth = "xxxxxx"
  }
}
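Both samples share the same layout: an env block, followed by source and sink blocks that each contain one named plugin block (Kafka, Jdbc, Redis). The sketch below lists the plugin names a config file refers to, which is handy when deciding which connectors the example module needs. It is a simplified line-oriented scanner written for illustration, not a real HOCON parser; it assumes every block opens with `name {` on its own line and that no quoted value contains `#`, `{` or `}`. The function name `list_plugins` is hypothetical.

```python
import re

# A block opener such as "source {" or "Kafka {" on its own line.
BLOCK_OPEN = re.compile(r"^([A-Za-z_][\w.]*)\s*\{$")


def list_plugins(conf_text):
    """Map each top-level section to the plugin blocks directly inside it."""
    plugins = {}
    section = None
    depth = 0
    for raw in conf_text.splitlines():
        # Drop '#' comments (assumes '#' never appears inside a value).
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        match = BLOCK_OPEN.match(line)
        if match and depth == 0:
            # Top-level section: env, source, sink, ...
            section = match.group(1)
            plugins[section] = []
        elif match and depth == 1:
            # Plugin block directly inside a section.
            plugins[section].append(match.group(1))
        # Track nesting so deeper blocks (e.g. schema fields) are ignored.
        depth += line.count("{") - line.count("}")
    return plugins


SAMPLE = """
env {
  job.mode = "BATCH"
}
source {
  Jdbc {
    query = "select * from test"
    # partition_num = 10
  }
}
sink {
  Redis {
    key = "seatunnel_jdbc"
    data_type = list
  }
}
"""

print(list_plugins(SAMPLE))
# prints {'env': [], 'source': ['Jdbc'], 'sink': ['Redis']}
```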

Run result

Common problems

java.lang.RuntimeException: Plugin PluginIdentifier{engineType='seatunnel', pluginType='source', pluginName='XXXX'} not found.

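This error means the seatunnel-engine-examples module is missing a dependency on the connector named in the message. Add the corresponding connector to that module's pom and rebuild: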

<dependency>
    <groupId>org.apache.seatunnel</groupId>
    <artifactId>connector-rabbitmq</artifactId>
    <version>${project.version}</version>
</dependency>

mvn clean package -pl seatunnel-dist -am -Dmaven.test.skip=true
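Since a full rebuild takes a while, it can be worth checking the pom for the connectors a job needs before running Maven again. The following is a minimal sketch using only the Python standard library; the function name `missing_connectors` is hypothetical, and artifact names such as `connector-redis` are assumed to follow the same `connector-<name>` pattern as the `connector-rabbitmq` example above.

```python
import xml.etree.ElementTree as ET

# Maven pom files declare this default XML namespace.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def missing_connectors(pom_xml, required):
    """Return the artifactIds from 'required' that the pom does not declare."""
    root = ET.fromstring(pom_xml.strip())
    # Collect every <dependency><artifactId> in the pom. Note that this also
    # includes entries under <dependencyManagement>, if any exist.
    declared = {
        dep.findtext("m:artifactId", default="", namespaces=POM_NS).strip()
        for dep in root.findall(".//m:dependency", POM_NS)
    }
    return [name for name in required if name not in declared]


# Cut-down stand-in for the seatunnel-engine-examples pom.
SAMPLE_POM = """
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <dependencies>
    <dependency>
      <groupId>org.apache.seatunnel</groupId>
      <artifactId>connector-rabbitmq</artifactId>
      <version>${project.version}</version>
    </dependency>
  </dependencies>
</project>
"""

print(missing_connectors(SAMPLE_POM, ["connector-rabbitmq", "connector-redis"]))
# prints ['connector-redis']
```

To check the real module, read its pom.xml into a string and pass the artifactIds of the connectors used by the job config.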

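If Maven reports that it failed to download listenablefuture, as in the message below, the artifact can be downloaded manually through a link.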

Could not find artifact com.google.guava:listenablefuture:jar:sources:9999.0-empty-to-avoid-conflict-with-guava
