I. Quick Kafka Environment Setup

Built on Docker containers (for reference):

https://www.cnblogs.com/mindzone/p/15608984.html

The commands, in brief:

# Pull the ZooKeeper and Kafka images
docker pull wurstmeister/zookeeper
docker pull wurstmeister/kafka

# Create the ZooKeeper container
docker run -d --name zookeeper -p 2181:2181 -t wurstmeister/zookeeper

# Create the Kafka container (replace <LINUX_HOST_IP> with the IP of the Linux host)
docker run -d --name kafka \
-p 9092:9092 \
-e KAFKA_BROKER_ID=0 \
-e KAFKA_ZOOKEEPER_CONNECT=<LINUX_HOST_IP>:2181 \
-e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://<LINUX_HOST_IP>:9092 \
-e KAFKA_LISTENERS=PLAINTEXT://0.0.0.0:9092 \
-t wurstmeister/kafka

# Check that Kafka is running
docker ps
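
If docker ps shows the kafka container restarting or already exited, the container log is the quickest thing to check (a minimal sketch; the container name matches the run command above):

# Tail the broker log; a healthy start ends with a "started (kafka.server.KafkaServer)" line
docker logs --tail 50 kafka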

Test that messages can be produced to and consumed from a topic (note that both commands block, so open multiple terminal windows):

# Window 1: produce
[root@centos-linux ~]# docker exec -it kafka /bin/bash
bash-4.4# kafka-console-producer.sh --broker-list localhost:9092 --topic <topic-name>

# Window 2: consume
[root@centos-linux ~]# docker exec -it kafka /bin/bash
bash-4.4# kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic <topic-name> --from-beginning

# Example, using a topic named "producer"
bash-4.4# kafka-console-producer.sh --broker-list localhost:9092 --topic producer

bash-4.4# kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic producer --from-beginning
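
If the consumer prints nothing, it is worth confirming that the topic actually exists and looking at its partition layout. A quick check from inside the container (a sketch; --bootstrap-server works on the Kafka 2.8 image used here, and the --zookeeper form used later in this post works as well):

# List all topics known to the broker
kafka-topics.sh --bootstrap-server localhost:9092 --list

# Show partitions, leader and replicas for one topic
kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic producer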

II. Configuring Maxwell to Produce to Kafka

1. Option 1: start Maxwell with command-line parameters:

cd /usr/local/maxwell-1.29.2
./bin/maxwell \
--user='maxwell' \
--password='123456' \
--host='192.168.2.225' \
--port='3308' \
--producer=kafka \
--kafka.bootstrap.servers=localhost:9092 \
--kafka_topic=producer \
--jdbc_options='useSSL=false&serverTimezone=Asia/Shanghai'

Output when Maxwell starts successfully:

[root@localhost maxwell-1.29.2]# ./bin/maxwell \
> --user='maxwell' \
> --password='123456' \
> --host='192.168.2.225' \
> --port='3308' \
> --producer=kafka \
> --kafka.bootstrap.servers=localhost:9092 \
> --kafka_topic=producer \
> --jdbc_options='useSSL=false&serverTimezone=Asia/Shanghai'
Using kafka version: 1.0.0
14:13:50,533 INFO Maxwell - Starting Maxwell. maxMemory: 247332864 bufferMemoryUsage: 0.25
14:13:50,783 INFO ProducerConfig - ProducerConfig values:
acks = 1
batch.size = 16384
bootstrap.servers = [localhost:9092]
buffer.memory = 33554432
client.id =
compression.type = snappy
connections.max.idle.ms = 540000
enable.idempotence = false
interceptor.classes = null
key.serializer = class org.apache.kafka.common.serialization.StringSerializer
linger.ms = 0
max.block.ms = 60000
max.in.flight.requests.per.connection = 5
max.request.size = 1048576
metadata.max.age.ms = 300000
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
partitioner.class = class org.apache.kafka.clients.producer.internals.DefaultPartitioner
receive.buffer.bytes = 32768
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 30000
retries = 0
retry.backoff.ms = 100
sasl.jaas.config = null
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.min.time.before.relogin = 60000
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
sasl.kerberos.ticket.renew.window.factor = 0.8
sasl.mechanism = GSSAPI
security.protocol = PLAINTEXT
send.buffer.bytes = 131072
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
ssl.endpoint.identification.algorithm = null
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLS
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
transaction.timeout.ms = 60000
transactional.id = null
value.serializer = class org.apache.kafka.common.serialization.StringSerializer
14:13:50,847 INFO AppInfoParser - Kafka version : 1.0.0
14:13:50,847 INFO AppInfoParser - Kafka commitId : aaa7af6d4a11b29d
14:13:50,871 INFO Maxwell - Maxwell v1.29.2 is booting (MaxwellKafkaProducer), starting at Position[BinlogPosition[mysql-bin.000005:225424], lastHeartbeat=1642486284932]
14:13:51,040 INFO MysqlSavedSchema - Restoring schema id 1 (last modified at Position[BinlogPosition[mysql-bin.000005:16191], lastHeartbeat=0])
14:13:51,205 INFO BinlogConnectorReplicator - Setting initial binlog pos to: mysql-bin.000005:225424
14:13:51,235 INFO BinaryLogClient - Connected to 192.168.2.225:3308 at mysql-bin.000005/225424 (sid:6379, cid:215)
14:13:51,235 INFO BinlogConnectorReplicator - Binlog connected.

2. Option 2: put the settings in a config file:

cd /usr/local/maxwell-1.29.2
vim config.properties

Settings:

kafka_topic=maxwell
producer=kafka
kafka.bootstrap.servers=localhost:9092
host=192.168.2.225
user=maxwell
password=123456
port=3308

Start it:

cd /usr/local/maxwell-1.29.2

./bin/maxwell \
--config ./config.properties \
--jdbc_options='useSSL=false&serverTimezone=Asia/Shanghai'

III. Kafka Consumption Test

Once Kafka is the producer target, Maxwell no longer prints row data to its own console; it runs in the background and hands every change event to Kafka.
Whenever a non-query SQL statement (any DML) runs against the database, the Kafka consumer receives a message (a sample DML sketch follows the consumer output below).

Messages in the consumer terminal:

[root@localhost maxwell-1.29.2]# docker exec -it kafka /bin/bash
bash-5.1# kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic producer --from-beginning
[2022-01-18 06:09:16,853] WARN [Consumer clientId=consumer-console-consumer-5789-1, groupId=console-consumer-5789] Error while fetching metadata with correlation id 2 : {producer=LEADER_NOT_AVAILABLE} (org.apache.kafka.cli
[2022-01-18 06:09:16,987] WARN [Consumer clientId=consumer-console-consumer-5789-1, groupId=console-consumer-5789] Error while fetching metadata with correlation id 4 : {producer=LEADER_NOT_AVAILABLE} (org.apache.kafka.cli
hello
aaaaaaaaaaaaaaa
{"database":"test-db","table":"day_sale","type":"delete","ts":1642486851,"xid":71876,"commit":true,"data":{"ID":166,"PRODUCT":"产品C","CHANNEL":"淘宝","AMOUNT":2497.0000,"SALE_DATE":"2022-01-18 13:48:48"}}

IV. Kafka Partition Control

1. Purpose:

We want Kafka to work in parallel: with the default setup, every captured change lands in a single partition's queue, which is too slow.
To let Kafka send (and downstream consumers read) concurrently, create more partitions so messages are spread across them and handled at the same time.

2. Problem:

The tutorial I followed never explains how a database is mapped to a partition; it only shows that different databases can end up in different partitions.

3. Key point:

How do you configure Maxwell's Kafka partitioning?

See the Kafka-related notes in config.properties:

#       *** kafka ***

# list of kafka brokers
#kafka.bootstrap.servers=hosta:9092,hostb:9092

# kafka topic to write to
# this can be static, e.g. 'maxwell', or dynamic, e.g. namespace_%{database}_%{table}
# in the latter case 'database' and 'table' will be replaced with the values for the row being processed
#kafka_topic=maxwell

# alternative kafka topic to write DDL (alter/create/drop) to. Defaults to kafka_topic
#ddl_kafka_topic=maxwell_ddl

# The partitioning-related settings are the following:

#       *** partitioning ***

# What part of the data do we partition by?
# Options: database, table, primary_key, transaction_id, thread_id, column
#producer_partition_by=database # [database, table, primary_key, transaction_id, thread_id, column]

# specify what fields to partition by when using producer_partition_by=column
# (a comma-separated list of column names)
#producer_partition_columns=id,foo,bar

# when using producer_partition_by=column, partition by this when
# the specified column(s) don't exist, i.e. the fallback rule (here: fall back to the database name)
#producer_partition_by_fallback=database

#       *** kinesis ***

#kinesis_stream=maxwell

# AWS places a 256 unicode character limit on the max key length of a record
# http://docs.aws.amazon.com/kinesis/latest/APIReference/API_PutRecord.html
#
# Setting this option to true enables hashing the key with the md5 algorithm
# before we send it to kinesis so all the keys work within the key size limit.
# Values: true, false
# Default: false
#kinesis_md5_keys=true
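
As a concrete example, here is a config.properties sketch that partitions by table, with a column-based variant commented out (the option names come from the notes above; the broker address and topic are the ones used in this post):

producer=kafka
kafka.bootstrap.servers=localhost:9092
kafka_topic=maxwell

# spread rows across partitions by table name instead of database name
producer_partition_by=table

# or: partition by the value of specific columns, falling back to the database name
# for tables that do not have those columns
#producer_partition_by=column
#producer_partition_columns=ID
#producer_partition_by_fallback=database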

4. Partition test case:

- 1. Create a new topic with 6 partitions

# Enter the kafka container
docker exec -it kafka /bin/bash

# Create the topic and assign partitions (the replication-factor parameter must be supplied)
kafka-topics.sh --zookeeper 192.168.177.129:2181 --topic maxwell --create --replication-factor 1 --partitions 6

# Replica count: 1
#   --replication-factor 1
# Partition count: 6
#   --partitions 6
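
To confirm the layout after creation, the same script can describe the topic (the ZooKeeper address matches the create command; on Kafka 2.8, --bootstrap-server localhost:9092 also works):

# Show leader, replicas and ISR for each of the 6 partitions
kafka-topics.sh --zookeeper 192.168.177.129:2181 --describe --topic maxwell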

- 2. Update the Maxwell config (partitioning by column is rarely needed; partitioning by database is enough here)

# Kafka settings
producer=kafka
kafka.bootstrap.servers=localhost:9092

# Change the topic name
kafka_topic=maxwell

# Change the partitioning rule
producer_partition_by=database

- 3. Restart Maxwell

cd /usr/local/maxwell-1.29.2

./bin/maxwell \
--config ./config.properties \
--jdbc_options='useSSL=false&serverTimezone=Asia/Shanghai'

- 4. Write data into the database, then inspect the Kafka messages (using the Kafka Tool GUI)

The detailed steps are omitted here; any DML operation will do. The result can be inspected with the Kafka Tool GUI (Offset Explorer).
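
Without the GUI, the distribution across partitions can also be checked from inside the container. A sketch using the standard Kafka scripts (GetOffsetShell flag names may differ slightly between Kafka versions):

# Print the latest offset of every partition of the maxwell topic;
# partitions with a non-zero offset have received messages
kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic maxwell --time -1

# Or read a single partition to see which rows landed in it
kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic maxwell --partition 3 --from-beginning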

V. Additional Notes on Kafka Partition Commands

Kafka exposes its functionality through these command-line scripts:

[root@localhost maxwell-1.29.2]# docker exec -it kafka ls /opt/kafka_2.13-2.8.1/bin
connect-distributed.sh kafka-preferred-replica-election.sh
connect-mirror-maker.sh kafka-producer-perf-test.sh
connect-standalone.sh kafka-reassign-partitions.sh
kafka-acls.sh kafka-replica-verification.sh
kafka-broker-api-versions.sh kafka-run-class.sh
kafka-cluster.sh kafka-server-start.sh
kafka-configs.sh kafka-server-stop.sh
kafka-console-consumer.sh kafka-storage.sh
kafka-console-producer.sh kafka-streams-application-reset.sh
kafka-consumer-groups.sh kafka-topics.sh
kafka-consumer-perf-test.sh kafka-verifiable-consumer.sh
kafka-delegation-tokens.sh kafka-verifiable-producer.sh
kafka-delete-records.sh trogdor.sh
kafka-dump-log.sh windows
kafka-features.sh zookeeper-security-migration.sh
kafka-leader-election.sh zookeeper-server-start.sh
kafka-log-dirs.sh zookeeper-server-stop.sh
kafka-metadata-shell.sh zookeeper-shell.sh
kafka-mirror-maker.sh

The command fails with an error:

kafka-topics.sh --zookeeper 192.168.177.129:2181 --topic maxwell --create --replication-factor 2 --partitions 3
[2022-01-18 08:19:44,532] ERROR org.apache.kafka.common.errors.InvalidReplicationFactorException:
Replication factor: 4 larger than available brokers: 1.

Analyzing the error:

https://www.cnblogs.com/tyoutetu/p/10855283.html

# In other words a Kafka cluster is required: each Kafka instance is one broker,
# and the replication factor must be less than or equal to the number of brokers.
--replication-factor (must be <= the number of brokers; for a single broker, just use 1)
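
To see how many brokers the cluster actually has, the broker IDs registered in ZooKeeper can be listed (a sketch using zookeeper-shell.sh from the script list above; a single-broker setup prints [0]):

# List registered broker IDs
zookeeper-shell.sh 192.168.177.129:2181 ls /brokers/ids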

Why the partition count could not simply be changed:

# The number of partitions can only be increased, never decreased
bash-5.1# kafka-topics.sh --zookeeper 192.168.177.129:2181 -alter --partitions 3 --topic maxwell
WARNING: If partitions are increased for a topic that has a key, the partition logic or ordering of the messages will be affected
Error while executing topic command : The number of partitions for a topic can only be increased. Topic maxwell currently has 6 partitions, 3 would not be
[2022-01-18 08:28:42,743] ERROR org.apache.kafka.common.errors.InvalidPartitionsException: The number of partitions for a topic can only be increased. Topi
(kafka.admin.TopicCommand$)

Solution:

Delete the topic, then recreate it.
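
A sketch of both steps, following the addresses and counts used earlier (topic deletion assumes delete.topic.enable=true on the broker, which is the default in recent Kafka versions):

# Delete the existing topic
kafka-topics.sh --zookeeper 192.168.177.129:2181 --delete --topic maxwell

# Recreate it with the desired number of partitions
kafka-topics.sh --zookeeper 192.168.177.129:2181 --create --topic maxwell --replication-factor 1 --partitions 3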
