SeaTunnel 2.3.11 and Web 1.0.3 Docker Deployment: Syncing Kafka to Hive and Elasticsearch in Practice
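This article walks through the complete process of deploying the SeaTunnel 2.3.11 server and the SeaTunnel Web 1.0.3 management platform with Docker Compose. It covers the configuration of the supporting services (Hive Metastore, MySQL, PostgreSQL) and the setup of sync tasks from a Kafka source to Hive and Elasticsearch. It also covers dependency installation, configuration changes (Hazelcast, application.yml), and troubleshooting for common failures (containers exiting on startup, secret key configuration, Hive address resolution errors). It is aimed at getting a big-data integration setup running quickly.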
By 佛系玩家. Published 2026/4/6, updated 2026/5/20.
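This document explains how to deploy SeaTunnel 2.3.11 and SeaTunnel Web 1.0.3 with Docker, then works through a complete example: configuring data sources and Kafka virtual tables, and syncing Kafka to Hive and Elasticsearch.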
Installation Preparation
Directory structure
seatunnel-docker/
├── docker-compose.yml
├── hive/
│   ├── hive-site.xml
│   └── lib/
│       └── postgresql-42.5.1.jar
├── init-sql/
│   └── seatunnel_server_mysql.sql
├── seatunnel/
│   ├── Dockerfile
│   └── apache-seatunnel-2.3.11/
│       └── lib/
│           ├── hive-exec-3.1.3.jar
│           ├── hive-metastore-3.1.3.jar
│           ├── libfb303-0.9.3.jar
│           ├── mysql-connector-java-8.0.28.jar
│           └── seatunnel-hadoop3-3.1.4-uber.jar
└── seatunnel-web/
    ├── Dockerfile
    └── apache-seatunnel-web-1.0.3-bin/
        └── libs/
            └── mysql-connector-java-8.0.28.jar
Download SeaTunnel
Clone the seatunnel-web repository and build it:
git clone https://github.com/apache/seatunnel-web.git
cd seatunnel-web
sh build.sh
Download the dependency packages
https://jdbc.postgresql.org/download/postgresql-42.5.1.jar
https://repo1.maven.org/maven2/org/apache/hive/hive-exec/3.1.3/hive-exec-3.1.3.jar
https://repo1.maven.org/maven2/org/apache/hive/hive-metastore/3.1.3/hive-metastore-3.1.3.jar
https://repo.maven.apache.org/maven2/org/apache/thrift/libfb303/0.9.3/libfb303-0.9.3.jar
https://repo1.maven.org/maven2/org/apache/thrift/libthrift/0.12.0/libthrift-0.12.0.jar
https://repo1.maven.org/maven2/org/apache/hive/hive-common/3.1.3/hive-common-3.1.3.jar
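The downloads above can be scripted. The following is a minimal sketch, not part of the original setup: the `fetch` helper and the `DRY_RUN` switch are hypothetical names, and the target directories are taken from the directory structure shown earlier. That tree does not list `libthrift` or `hive-common`, so the sketch leaves those two jars out rather than guess where they belong. It assumes it is run from inside the seatunnel-docker directory.

```shell
#!/bin/sh
# Sketch: download a jar into a target directory.
# DRY_RUN (hypothetical switch) defaults to 1, so the script only prints
# the curl commands; set DRY_RUN=0 to actually download.
DRY_RUN=${DRY_RUN:-1}

fetch() {
  # $1 = jar URL, $2 = destination directory
  file=${1##*/}   # file name = last path segment of the URL
  if [ "$DRY_RUN" = "1" ]; then
    echo "curl -fL -o $2/$file $1"
  else
    mkdir -p "$2" && curl -fL -o "$2/$file" "$1"
  fi
}

# Destination directories as shown in the directory structure.
ST_LIB=seatunnel/apache-seatunnel-2.3.11/lib
fetch https://jdbc.postgresql.org/download/postgresql-42.5.1.jar hive/lib
fetch https://repo1.maven.org/maven2/org/apache/hive/hive-exec/3.1.3/hive-exec-3.1.3.jar "$ST_LIB"
fetch https://repo1.maven.org/maven2/org/apache/hive/hive-metastore/3.1.3/hive-metastore-3.1.3.jar "$ST_LIB"
fetch https://repo.maven.apache.org/maven2/org/apache/thrift/libfb303/0.9.3/libfb303-0.9.3.jar "$ST_LIB"
```

Printing the commands first makes it easy to confirm each jar lands in the intended directory before anything is downloaded.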
Create the project directory
Create the seatunnel-docker directory and place the prepared files in it:
mkdir seatunnel-docker
cd seatunnel-docker
Docker Deployment
docker-compose.yml configuration
version: '3.9'
networks:
  seatunnel-network:
    driver: bridge
    ipam:
      config:
        - subnet: 172.16.0.0/24
services:
  hive-metastore-db:
    image: postgres:15
    container_name: hive-metastore-db
    hostname: hive-metastore-db
    environment:
      POSTGRES_DB: metastore_db
      POSTGRES_USER: hive
      POSTGRES_PASSWORD: hive123456
    ports:
      - "5432:5432"
    volumes:
      - ./hive-metastore-db-data:/var/lib/postgresql/data
    networks:
      seatunnel-network:
        ipv4_address: 172.16.0.2
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U hive -d metastore_db"]
      interval: 5s
      timeout: 5s
      retries: 10
      start_period: 10s
  hive-metastore:
    image: apache/hive:4.0.0
    container_name: hive-metastore
    hostname: hive-metastore
    depends_on:
      hive-metastore-db:
        condition: service_healthy
    environment:
      SERVICE_NAME: metastore
      DB_DRIVER: postgres
      SERVICE_OPTS: >-
        -Djavax.jdo.option.ConnectionDriverName=org.postgresql.Driver
        -Djavax.jdo.option.ConnectionURL=jdbc:postgresql://hive-metastore-db:5432/metastore_db
        -Djavax.jdo.option.ConnectionUserName=hive
        -Djavax.jdo.option.ConnectionPassword=hive123456
    ports:
      - "9083:9083"
    volumes:
      - ./hive/lib/postgresql-42.5.1.jar:/opt/hive/lib/postgresql-42.5.1.jar
      - ./hive/hive-site.xml:/opt/hive/conf/hive-site.xml
      - ./hive-warehouse:/opt/hive/data/warehouse
    networks:
      seatunnel-network:
        ipv4_address: 172.16.0.3
  hive-server2:
    image: apache/hive:4.0.0
    container_name: hive-server2
    hostname: hive-server2
    depends_on:
      - hive-metastore
    environment:
      HIVE_SERVER2_THRIFT_PORT: 10000
      SERVICE_NAME: hiveserver2
      IS_RESUME: "true"
      SERVICE_OPTS: "-Dhive.metastore.uris=thrift://hive-metastore:9083"
    ports:
      - "10000:10000"
      - "10002:10002"
    volumes:
      - ./hive-warehouse:/opt/hive/data/warehouse
    networks:
      seatunnel-network:
        ipv4_address: 172.16.0.4
  mysql-seatunnel:
    image: mysql:8.0.42
    container_name: mysql-seatunnel
    hostname: mysql-seatunnel
    environment:
      MYSQL_ROOT_PASSWORD: root123456
      MYSQL_DATABASE: seatunnel
      MYSQL_ROOT_HOST: '%'
    ports:
      - "3806:3306"
    volumes:
      - ./mysql_data:/var/lib/mysql
      - ./init-sql:/docker-entrypoint-initdb.d
    networks:
      seatunnel-network:
        ipv4_address: 172.16.0.5
    command:
      --default-authentication-plugin=mysql_native_password
    healthcheck:
      test: ["CMD", "mysqladmin", "ping", "-h", "localhost"]
      interval: 10s
      timeout: 5s
      retries: 5
  seatunnel-master:
    build:
      context: ./seatunnel
      dockerfile: Dockerfile
    image: seatunnel:2.3.11
    container_name: seatunnel-master
    hostname: seatunnel-master
    extra_hosts:
      - "hive-metastore:172.16.0.3"
      - "hive-metastore-db:172.16.0.2"
    environment:
      - SEATUNNEL_HOME=/opt/seatunnel
    command: >
      sh -c "
      cd /opt/seatunnel &&
      exec bin/seatunnel-cluster.sh -r master
      "
    ports:
      - "5801:5801"
    volumes:
      - ./seatunnel/apache-seatunnel-2.3.11/:/opt/seatunnel/
      - ./logs/master:/opt/seatunnel/logs
      - ./hive-warehouse:/opt/hive/data/warehouse
    networks:
      seatunnel-network:
        ipv4_address: 172.16.0.10
  seatunnel-worker1:
    image: seatunnel:2.3.11
    container_name: seatunnel-worker1
    hostname: seatunnel-worker1
    extra_hosts:
      - "hive-metastore:172.16.0.3"
      - "hive-metastore-db:172.16.0.2"
    environment:
      - SEATUNNEL_HOME=/opt/seatunnel
    command: >
      sh -c "
      cd /opt/seatunnel &&
      exec bin/seatunnel-cluster.sh -r worker
      "
    volumes:
      - ./seatunnel/apache-seatunnel-2.3.11/:/opt/seatunnel/
      - ./logs/worker1:/opt/seatunnel/logs
      - ./hive-warehouse:/opt/hive/data/warehouse
    depends_on:
      - seatunnel-master
    networks:
      seatunnel-network:
        ipv4_address: 172.16.0.11
  seatunnel-worker2:
    image: seatunnel:2.3.11
    container_name: seatunnel-worker2
    hostname: seatunnel-worker2
    extra_hosts:
      - "hive-metastore:172.16.0.3"
      - "hive-metastore-db:172.16.0.2"
    environment:
      - SEATUNNEL_HOME=/opt/seatunnel
    command: >
      sh -c "
      cd /opt/seatunnel &&
      exec bin/seatunnel-cluster.sh -r worker
      "
    volumes:
      - ./seatunnel/apache-seatunnel-2.3.11/:/opt/seatunnel/
      - ./logs/worker2:/opt/seatunnel/logs
      - ./hive-warehouse:/opt/hive/data/warehouse
    depends_on:
      - seatunnel-master
    networks:
      seatunnel-network:
        ipv4_address: 172.16.0.12
  seatunnel-web:
    build:
      context: ./seatunnel-web
      dockerfile: Dockerfile
    image: seatunnel-web:1.0.3
    container_name: seatunnel-web
    hostname: seatunnel-web
    extra_hosts:
      - "hive-metastore:172.16.0.3"
      - "hive-metastore-db:172.16.0.2"
    environment:
      - SEATUNNEL_HOME=/opt/seatunnel
      - SEATUNNEL_WEB_HOME=/opt/seatunnel-web
    ports:
      - "8801:8801"
    volumes:
      - ./seatunnel/apache-seatunnel-2.3.11/:/opt/seatunnel/
      - ./seatunnel-web/apache-seatunnel-web-1.0.3-bin/:/opt/seatunnel-web/
      - ./logs/web:/opt/seatunnel-web/logs
      - ./hive-warehouse:/opt/hive/data/warehouse
    depends_on:
      - seatunnel-master
    networks:
      seatunnel-network:
        ipv4_address: 172.16.0.13
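Every service in this file pins a static address inside 172.16.0.0/24, and the extra_hosts entries depend on those addresses staying as written. Because the worker blocks are near copies of each other, an address pasted twice is an easy mistake to make. The sketch below is one possible pre-flight check; `duplicate_ips` is a hypothetical helper name, and the check is purely textual (it does not parse YAML).

```shell
#!/bin/sh
# Sketch: print every ipv4_address value that is assigned more than once.
duplicate_ips() {
  # $1 = path to docker-compose.yml
  grep -E '^[[:space:]]*ipv4_address:' "$1" \
    | sed 's/^[[:space:]]*ipv4_address:[[:space:]]*//' \
    | tr -d " \"'" \
    | sort | uniq -d
}

# Usage (from the seatunnel-docker directory):
#   duplicate_ips docker-compose.yml
# No output means every static address is unique.
```

For a syntax check of the whole file, `docker compose config` renders the resolved configuration and reports errors.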
SeaTunnel Configuration
Dockerfile
FROM eclipse-temurin:8-jdk-ubi9-minimal
WORKDIR /opt/seatunnel/
ENV SEATUNNEL_HOME=/opt/seatunnel
ENV PATH=$PATH:$SEATUNNEL_HOME/bin
EXPOSE 5801
CMD ["sh", "bin/seatunnel-cluster.sh", "-r", "master"]
hazelcast-client.yaml client configuration
Edit seatunnel/apache-seatunnel-2.3.11/config/hazelcast-client.yaml:
hazelcast-client:
  cluster-name: seatunnel
  properties:
    hazelcast.logging.type: log4j2
  connection-strategy:
    connection-retry:
      cluster-connect-timeout-millis: 3000
  network:
    cluster-members:
      - seatunnel-master:5801
hazelcast-master.yaml configuration
Edit seatunnel/apache-seatunnel-2.3.11/config/hazelcast-master.yaml:
hazelcast:
  cluster-name: seatunnel
  network:
    rest-api:
      enabled: false
      endpoint-groups:
        CLUSTER_WRITE:
          enabled: true
        DATA:
          enabled: true
    join:
      tcp-ip:
        enabled: true
        member-list:
          - seatunnel-master:5801
          - seatunnel-worker1:5802
          - seatunnel-worker2:5802
    port:
      auto-increment: false
      port: 5801
  properties:
    hazelcast.invocation.max.retry.count: 20
    hazelcast.tcp.join.port.try.count: 30
    hazelcast.logging.type: log4j2
    hazelcast.operation.generic.thread.count: 50
    hazelcast.heartbeat.failuredetector.type: phi-accrual
    hazelcast.heartbeat.interval.seconds: 2
    hazelcast.max.no.heartbeat.seconds: 180
    hazelcast.heartbeat.phiaccrual.failuredetector.threshold: 10
    hazelcast.heartbeat.phiaccrual.failuredetector.sample.size: 200
    hazelcast.heartbeat.phiaccrual.failuredetector.min.std.dev.millis: 100
hazelcast-worker.yaml configuration
Edit seatunnel/apache-seatunnel-2.3.11/config/hazelcast-worker.yaml:
hazelcast:
  cluster-name: seatunnel
  network:
    join:
      tcp-ip:
        enabled: true
        member-list:
          - seatunnel-master:5801
          - seatunnel-worker1:5802
          - seatunnel-worker2:5802
    port:
      auto-increment: false
      port: 5802
  properties:
    hazelcast.invocation.max.retry.count: 20
    hazelcast.tcp.join.port.try.count: 30
    hazelcast.logging.type: log4j2
    hazelcast.operation.generic.thread.count: 50
    hazelcast.heartbeat.failuredetector.type: phi-accrual
    hazelcast.heartbeat.interval.seconds: 2
    hazelcast.max.no.heartbeat.seconds: 180
    hazelcast.heartbeat.phiaccrual.failuredetector.threshold: 10
    hazelcast.heartbeat.phiaccrual.failuredetector.sample.size: 200
    hazelcast.heartbeat.phiaccrual.failuredetector.min.std.dev.millis: 100
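Install the connector dependencies
When configuring a sync task, the source-name dropdown of the Source component stays empty until the connector plugins are installed: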
cd seatunnel/apache-seatunnel-2.3.11/
sh bin/install-plugin.sh
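After the installer finishes, it can be worth confirming that the connectors this walkthrough needs are actually present. The sketch below is an assumption-laden check, not an official tool: `missing_connectors` is a hypothetical helper, and it assumes the jars are named `connector-<name>-<version>.jar` somewhere under the connectors directory.

```shell
#!/bin/sh
# Sketch: report which connector jars are missing after install-plugin.sh.
# Assumption: jars follow the pattern connector-<name>-<version>.jar and
# live somewhere under the given connectors directory.
missing_connectors() {
  # $1 = connectors directory, remaining arguments = connector names
  dir=$1
  shift
  for name in "$@"; do
    if [ -z "$(find "$dir" -name "connector-$name-*.jar" 2>/dev/null)" ]; then
      echo "$name"
    fi
  done
}

# Usage (from seatunnel/apache-seatunnel-2.3.11/):
#   missing_connectors connectors kafka hive elasticsearch
# No output means a jar was found for every listed connector.
```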
SeaTunnel Web Configuration
Dockerfile configuration
FROM eclipse-temurin:8-jdk-ubi9-minimal
WORKDIR /opt/seatunnel-web/
ENV SEATUNNEL_WEB_HOME=/opt/seatunnel-web
ENV SEATUNNEL_HOME=/opt/seatunnel
EXPOSE 8801
CMD ["sh", "bin/seatunnel-backend-daemon.sh", "start"]
application.yml configuration
Edit seatunnel-web/apache-seatunnel-web-1.0.3-bin/conf/application.yml:
server:
  port: 8801
spring:
  main:
    allow-circular-references: true
  application:
    name: seatunnel
  jackson:
    date-format: yyyy-MM-dd HH:mm:ss
  datasource:
    driver-class-name: com.mysql.cj.jdbc.Driver
    url: jdbc:mysql://mysql-seatunnel:3306/seatunnel?useSSL=false&useUnicode=true&characterEncoding=utf-8&allowMultiQueries=true&allowPublicKeyRetrieval=true
    username: root
    password: root123456
jwt:
  expireTime: 86400
  secretKey: a3f5c8d2e1b4098765432109abcdef1234567890abcdef
  algorithm: HS256
hazelcast-client.yaml client configuration
Edit seatunnel-web/apache-seatunnel-web-1.0.3-bin/conf/hazelcast-client.yaml:
hazelcast-client:
  cluster-name: seatunnel
  properties:
    hazelcast.logging.type: log4j2
  connection-strategy:
    connection-retry:
      cluster-connect-timeout-millis: 3000
  network:
    cluster-members:
      - seatunnel-master:5801
seatunnel-backend-daemon.sh
Edit seatunnel-web/apache-seatunnel-web-1.0.3-bin/bin/seatunnel-backend-daemon.sh so that the process runs in the foreground: remove nohup and the trailing &.
$JAVA_HOME/bin/java $JAVA_OPTS \
  -cp "$CLASSPATH" $SPRING_OPTS \
  org.apache.seatunnel.app.SeatunnelApplication >> "${LOGDIR}/seatunnel.out" 2>&1
echo "seatunnel-web started"
plugin-mapping.properties configuration
In testing, this step turned out to be optional. Copy seatunnel/apache-seatunnel-2.3.11/connectors/plugin-mapping.properties to seatunnel-web/apache-seatunnel-web-1.0.3-bin/conf/plugin-mapping.properties:
cd seatunnel-docker
cp seatunnel/apache-seatunnel-2.3.11/connectors/plugin-mapping.properties seatunnel-web/apache-seatunnel-web-1.0.3-bin/conf/plugin-mapping.properties
Hive Configuration
hive-site.xml configuration
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://hive-metastore:9083</value>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/opt/hive/data/warehouse</value>
  </property>
  <property>
    <name>metastore.metastore.event.db.notification.api.auth</name>
    <value>false</value>
  </property>
</configuration>
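lib directory
Dependency packages: place postgresql-42.5.1.jar in hive/lib, as shown in the directory structure.
MySQL Configuration
init-sql directory: initialization SQL script
Copy seatunnel-web/apache-seatunnel-web-1.0.3-bin/script/seatunnel_server_mysql.sql to init-sql/seatunnel_server_mysql.sql: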
cd seatunnel-docker
cp seatunnel-web/apache-seatunnel-web-1.0.3-bin/script/seatunnel_server_mysql.sql init-sql/seatunnel_server_mysql.sql
Start with Docker
docker compose up -d --build
open http://localhost:8801
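The web container may need some time after `docker compose up` before the page answers. The following is a small sketch of a generic retry loop that can wrap a readiness probe; `retry` is a hypothetical helper, and the probe in the usage comment assumes the UI returns a success status at its root URL.

```shell
#!/bin/sh
# Sketch: run a command repeatedly until it succeeds or attempts run out.
retry() {
  # $1 = maximum attempts, $2 = seconds between attempts, rest = command
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    i=$((i + 1))
    # Sleep only if another attempt will follow.
    [ "$i" -le "$attempts" ] && sleep "$delay"
  done
  return 1
}

# Usage: probe the web UI up to 60 times, 5 seconds apart.
#   retry 60 5 curl -fsS -o /dev/null http://localhost:8801
```

If the probe never succeeds, `docker compose logs seatunnel-web` is a reasonable next place to look.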
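Running the Example
Log in and set the language
Login page
Settings
Configure the language
Configure data sources
Kafka data source
Elasticsearch data source
Hive metastore local data source
Setting the metastore address to thrift://hive-metastore:9083 also works.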
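Configure virtual tables
Virtual table list
Steps to create a virtual table
Open the "Virtual Tables" menu and click "Create".
Select a data source and fill in the virtual table information.
Click "Next" and configure the field mapping.
Click "Next", confirm the information, and save.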
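Configure sync tasks
Kafka-to-Hive sync task
Task component configuration
Source component configuration
FieldMapper component configuration (model view)
Sink component configuration
Kafka-to-Elasticsearch sync task
Task component configuration
Source component configuration
FieldMapper component configuration (model view)
Sink component configuration
General steps to create a sync task
Go to "Tasks" → "Sync Task Definition" and click "Create".
Drag or select the Source, FieldMapper, and Sink components to build the task flow.
Double-click the Source component and configure the data source (select the Kafka data source configured earlier).
Double-click the FieldMapper component and click "Model" to configure the field mappings.
Double-click the Sink component and configure the target data source (Hive or Elasticsearch).
Save and start the task.
The job mode must be set. Otherwise the task cannot be saved and the error "job env can't be empty, please change config" is reported.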
Hive operations
Create the table
The two statements below are alternatives (TEXTFILE and PARQUET). Both target default.test_user_data3 with IF NOT EXISTS, so whichever runs first defines the table and the other has no effect.
docker exec -it hive-server2 beeline -u jdbc:hive2://localhost:10000 -e "
CREATE TABLE IF NOT EXISTS default.test_user_data3 (
  user_id STRING,
  type STRING,
  content STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;
"
docker exec -it hive-server2 beeline -u jdbc:hive2://localhost:10000 -e "
CREATE TABLE IF NOT EXISTS default.test_user_data3 (
  user_id STRING,
  type STRING,
  content STRING
)
STORED AS PARQUET;
"
Show the table structure
docker exec -it hive-server2 beeline -u jdbc:hive2://localhost:10000 -e "
SHOW TABLES IN default;
DESCRIBE default.test_user_data3;
"
Query the table data
docker exec -it hive-server2 beeline -u jdbc:hive2://localhost:10000 -e "
SELECT * FROM default.test_user_data3 LIMIT 10;
"
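The query above only returns rows once the Kafka topic has carried some records through the sync task. The sketch below generates a test record with the same three fields as test_user_data3. It assumes the Kafka source is configured to read JSON; `make_record` is a hypothetical helper, and the article does not specify a broker address or topic name, so those remain placeholders in the usage comment.

```shell
#!/bin/sh
# Sketch: emit one JSON record with the three columns of test_user_data3.
# Values are inserted verbatim, so they must not contain double quotes
# or backslashes.
make_record() {
  # $1 = user_id, $2 = type, $3 = content
  printf '{"user_id":"%s","type":"%s","content":"%s"}\n' "$1" "$2" "$3"
}

make_record u001 login "hello seatunnel"

# Usage with Kafka's console producer (broker and topic are placeholders):
#   make_record u001 login "hello seatunnel" \
#     | kafka-console-producer.sh --bootstrap-server <broker> --topic <topic>
```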
Notes
The seatunnel-web container exits right after starting
A container stops when its main process exits, so the backend must run in the foreground. Edit seatunnel-web/apache-seatunnel-web-1.0.3-bin/bin/seatunnel-backend-daemon.sh and remove nohup and the trailing &:
$JAVA_HOME/bin/java $JAVA_OPTS \
  -cp "$CLASSPATH" $SPRING_OPTS \
  org.apache.seatunnel.app.SeatunnelApplication >> "${LOGDIR}/seatunnel.out" 2>&1
echo "seatunnel-web started"
seatunnel-web starts, but the page reports: Unknown exception. secret key byte array cannot be null or empty
Set a non-empty jwt.secretKey in application.yml:
jwt:
  expireTime: 86400
  secretKey: a3f5c8d2e1b4098765432109abcdef1234567890abcdef
  algorithm: HS256
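Rather than reusing the sample key, a fresh random value can be generated. The sketch below prints 32 random bytes as 64 hexadecimal characters, which matches the 256-bit minimum key size that the JWT specifications require for HS256. `gen_secret` is a hypothetical helper, and it is an assumption here that SeaTunnel Web accepts any non-empty string as the key.

```shell
#!/bin/sh
# Sketch: print 32 random bytes as 64 lowercase hex characters.
gen_secret() {
  head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}

# Prints a different value on every run; paste it into jwt.secretKey.
gen_secret
```

Changing the key presumably invalidates tokens issued with the old one, so users may need to log in again afterwards.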
Hive address resolution error
The seatunnel and seatunnel-web logs show:
ERROR [qtp2135089262-20] [MetaStoreUtils.logAndThrowMetaException():166] - Got exception: java.net.URISyntaxException Illegal character in hostname at index 44: thrift:
In docker-compose.yml, add host-to-IP bindings to the affected containers:
extra_hosts:
  - "hive-metastore:172.16.0.3"
  - "hive-metastore-db:172.16.0.2"
Hive sync fails with java.lang.NoClassDefFoundError
Place the following dependency jars in seatunnel/apache-seatunnel-2.3.11/lib:
hive-exec-3.1.3.jar
hive-metastore-3.1.3.jar
libfb303-0.9.3.jar
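A quick way to confirm the three jars are in place is a file-existence check. This is a minimal sketch; `missing_jars` is a hypothetical helper, and the file names are exactly the ones listed above.

```shell
#!/bin/sh
# Sketch: list the Hive-related jars that are absent from a lib directory.
missing_jars() {
  # $1 = lib directory to inspect
  for jar in hive-exec-3.1.3.jar hive-metastore-3.1.3.jar libfb303-0.9.3.jar; do
    [ -f "$1/$jar" ] || echo "$jar"
  done
}

# Usage (from the seatunnel-docker directory):
#   missing_jars seatunnel/apache-seatunnel-2.3.11/lib
# No output means all three jars are present.
```

Because the lib directory is mounted into the running containers, restarting the master and workers after adding jars is likely needed for the JVMs to pick them up.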
The Hive sync task reports success, but no data is written
In docker-compose.yml, mount the local Hive warehouse directory into the relevant containers:
volumes:
  - ./hive-warehouse:/opt/hive/data/warehouse
View the task execution log
Search ./logs/master/seatunnel-engine-master.log for "will be executed on worker" to see which worker a task was assigned to:
Task [TaskGroupLocation{jobId=1080750681855361026, pipelineId=1, taskGroupId=2}] will be executed on worker [[seatunnel-worker2]:5801], slotID [2], resourceProfile [ResourceProfile{cpu=CPU{core=0}, heapMemory=Memory{bytes=0}}], sequence [db6b679c-67cc-43b8-b64a-acaa85c2a4c0], assigned [1080750681855361026]
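With several jobs in the log, filtering by job id helps. The sketch below extracts the worker addresses for one job; `workers_for_job` is a hypothetical helper, and the pattern assumes log lines shaped like the sample above.

```shell
#!/bin/sh
# Sketch: list the workers that received task groups of one job.
# Assumes the line format shown in the sample log entry.
workers_for_job() {
  # $1 = master log file, $2 = job id
  grep -F "will be executed on worker" "$1" \
    | grep -F "jobId=$2," \
    | sed -n 's/.*executed on worker \[\[\([^]]*\)\] *:\([0-9]*\)\].*/\1:\2/p' \
    | sort -u
}

# Usage (from the seatunnel-docker directory):
#   workers_for_job ./logs/master/seatunnel-engine-master.log 1080750681855361026
```

The matching worker's own log under ./logs/worker1 or ./logs/worker2 is then the place to look for the task's output.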