I. Versions:
canal.admin: 1.1.4
canal.deployer: 1.1.4
canal.adapter: 1.1.4
Doris: 0.14/0.15
III. Multi-source configuration:
1. Deploy Canal
Deployment is not covered in detail here; follow the official instructions at https://github.com/alibaba/canal
2. Deploy Doris
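Apache Doris is a modern MPP analytical database. It returns query results with sub-second response times, which makes it a good fit for real-time data analysis. Its distributed architecture is simple and easy to operate, and it supports very large datasets of more than 10 PB.
Apache Doris covers a range of analytical workloads, such as fixed historical reports, real-time data analysis, interactive analysis, and exploratory analysis, which keeps day-to-day analysis work simple and efficient.
For the deployment steps, see the official site: https://doris.apache.org/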
3. Deploy Canal-adapter
Edit the configuration of the extracted Canal-adapter package. The file is located under ${your_canal-adapter}/conf; modify application.yml as follows:
server:
  port: 8081
spring:
  jackson:
    date-format: yyyy-MM-dd HH:mm:ss
    time-zone: GMT+8
    default-property-inclusion: non_null
canal.conf:
  mode: tcp
  canalServerHost: 127.0.0.1:11111
  batchSize: 500
  syncBatchSize: 1000
  retries: 0
  timeout:
  accessKey:
  secretKey:
  srcDataSources:
    defaultDS:
      url: jdbc:mysql://172.000.000.00:3306/test?useUnicode=true
      username: root
      password: password
  canalAdapters:
  - instance: example
    groups:
    - groupId: g1
      outerAdapters:
      - name: rdb
        key: mysql1
        properties:
          jdbc.driverClassName: com.mysql.jdbc.Driver
          jdbc.url: jdbc:mysql://10.000.000.00:9030/test?useUnicode=true
          jdbc.username: root
          jdbc.password:
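4. Create the Doris table
Doris has three data models:
- Aggregate model: rows are aggregated by the specified key, and only the aggregated data can be queried.
- Uniq model: guarantees that each key is unique.
- Duplicate model: for multidimensional analysis scenarios where the data has neither a primary key nor any aggregation requirement.
This test uses the Duplicate model.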
CREATE TABLE `test_sink` (
    `id` int(32) NOT NULL COMMENT "id",
    `name` VARCHAR(26) NULL COMMENT "name"
) ENGINE=OLAP
DUPLICATE KEY(`id`)
COMMENT "test data table"
DISTRIBUTED BY HASH(`name`) BUCKETS 1
PROPERTIES (
    "replication_allocation" = "tag.location.default: 3",
    "in_memory" = "false",
    "storage_format" = "V2"
);
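To make the difference between the three data models concrete, here is a toy in-memory sketch in Python. It is not Doris code: the function names are hypothetical, each row is reduced to a (key, value) pair, and SUM is assumed as the aggregate function for the Aggregate model. It only illustrates what a query would see after the same rows are loaded under each model.

```python
def load_duplicate(rows):
    """Duplicate model: every loaded row is kept as-is."""
    return list(rows)


def load_uniq(rows):
    """Uniq model: one row per key, the most recently loaded value wins."""
    latest = {}
    for key, value in rows:
        latest[key] = value
    return sorted(latest.items())


def load_aggregate_sum(rows):
    """Aggregate model (SUM assumed): values sharing a key are summed."""
    totals = {}
    for key, value in rows:
        totals[key] = totals.get(key, 0) + value
    return sorted(totals.items())


# Three loaded rows; key 1 appears twice.
ROWS = [(1, 10), (2, 7), (1, 5)]

if __name__ == "__main__":
    print("duplicate:", load_duplicate(ROWS))
    print("uniq:     ", load_uniq(ROWS))
    print("aggregate:", load_aggregate_sum(ROWS))
```

With the Duplicate model used in this test, repeated keys coming from the source table are all retained in test_sink.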
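5. Configure the rdb mapping file
Canal-adapter syncs data to the target database by hot-loading the yml files under the rdb directory, so the yml file must match the data source and the canal instance configured earlier. The multi-source configuration looks like this: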
test_wd_test_sink.yml
dataSourceKey: defaultDS
destination: example
groupId: g1
outerAdapterKey: mysql1
concurrent: true
dbMapping:
  database: test
  table: test_sink
  targetTable: test.test_sink
  targetPk:
    id: id
  mapAll: true
  etlCondition: "{\"name\":\"zhangsan\"}"
  commitBatch: 10
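The mapping file refers back to application.yml through dataSourceKey, destination, groupId and outerAdapterKey. A minimal Python sketch of that cross-check follows. It assumes both files have already been parsed into dictionaries (the two dictionaries below are hand-written stand-ins that carry only the relevant fields), and the function name find_mapping_errors is hypothetical, not part of Canal. Treating all four keys as mandatory is also an assumption and may be stricter than the adapter itself.

```python
def find_mapping_errors(app_conf, mapping):
    """Return a list of mismatches between an rdb mapping and application.yml."""
    errors = []
    conf = app_conf["canal.conf"]

    # dataSourceKey must name an entry under srcDataSources.
    if mapping["dataSourceKey"] not in conf.get("srcDataSources", {}):
        errors.append("unknown dataSourceKey: %s" % mapping["dataSourceKey"])

    # destination must name a configured canal instance.
    instance = next(
        (a for a in conf.get("canalAdapters", [])
         if a["instance"] == mapping["destination"]),
        None,
    )
    if instance is None:
        errors.append("unknown destination: %s" % mapping["destination"])
        return errors

    # groupId must exist under that instance.
    group = next(
        (g for g in instance.get("groups", [])
         if g["groupId"] == mapping["groupId"]),
        None,
    )
    if group is None:
        errors.append("unknown groupId: %s" % mapping["groupId"])
        return errors

    # outerAdapterKey must match the key of one outer adapter in the group.
    keys = {a.get("key") for a in group.get("outerAdapters", [])}
    if mapping["outerAdapterKey"] not in keys:
        errors.append("unknown outerAdapterKey: %s" % mapping["outerAdapterKey"])
    return errors


# Stand-in for the parsed application.yml shown earlier.
APP_CONF = {
    "canal.conf": {
        "srcDataSources": {"defaultDS": {}},
        "canalAdapters": [
            {
                "instance": "example",
                "groups": [
                    {
                        "groupId": "g1",
                        "outerAdapters": [{"name": "rdb", "key": "mysql1"}],
                    }
                ],
            }
        ],
    }
}

# Stand-in for the parsed test_wd_test_sink.yml.
MAPPING = {
    "dataSourceKey": "defaultDS",
    "destination": "example",
    "groupId": "g1",
    "outerAdapterKey": "mysql1",
}

if __name__ == "__main__":
    print(find_mapping_errors(APP_CONF, MAPPING) or "mapping is consistent")
```

For a multi-source setup, the same check can be run once per mapping file, each with its own dataSourceKey and outerAdapterKey.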
6. Start the Canal-adapter instance
After startup, log output like the following indicates that Canal-adapter started successfully.
2022-03-09 15:10:32.528 [main] INFO c.a.otter.canal.client.adapter.rdb.config.ConfigLoader - ## Rdb mapping config loaded
2022-03-09 15:10:32.577 [Thread-3] INFO c.a.o.canal.adapter.launcher.loader.CanalAdapterWorker - =============> Start to subscribe destination: example <=============
2022-03-09 15:10:32.585 [Thread-3] INFO c.a.o.canal.adapter.launcher.loader.CanalAdapterWorker - =============> Subscribe destination: example succeed <=============
2022-03-09 15:10:32.793 [main] INFO com.alibaba.druid.pool.DruidDataSource - {dataSource-4} inited
2022-03-09 15:10:32.794 [main] INFO c.a.o.canal.adapter.launcher.loader.CanalAdapterLoader - Load canal adapter: rdb succeed
2022-03-09 15:10:32.795 [main] DEBUG o.s.beans.factory.support.DefaultListableBeanFactory - Returning cached instance of singleton bean 'syncSwitch'
2022-03-09 15:10:32.795 [main] INFO c.a.o.canal.adapter.launcher.loader.CanalAdapterLoader - Start adapter for canal instance: example1 succeed
2022-03-09 15:10:32.795 [main] INFO c.a.o.canal.adapter.launcher.loader.CanalAdapterService - ## the canal client adapters are running now ......
2022-03-09 15:10:32.795 [Thread-5] INFO c.a.o.canal.adapter.launcher.loader.CanalAdapterWorker - =============> Start to connect destination: example1 <=============
2022-03-09 15:10:32.795 [main] DEBUG o.s.beans.factory.support.DefaultListableBeanFactory - Finished creating instance of bean 'scopedTarget.canalAdapterService'
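Checking the log by eye works for a single instance; with several instances a small script can help. The Python sketch below scans log text for the messages seen in the sample output above. Which messages count as "started" is an assumption drawn from that sample, the function names are hypothetical, and the embedded SAMPLE_LOG is a shortened copy of the lines above rather than a real log file.

```python
import re

# Messages taken from the sample log; treating them as success markers
# is an assumption, not documented Canal behavior.
REQUIRED_MARKERS = (
    "Load canal adapter: rdb succeed",
    "the canal client adapters are running now",
)

SUBSCRIBED = re.compile(r"Subscribe destination: (\S+) succeed")


def adapter_started(log_text):
    """True if every required marker appears somewhere in the log text."""
    return all(marker in log_text for marker in REQUIRED_MARKERS)


def subscribed_destinations(log_text):
    """Return the destinations whose subscription was reported as successful."""
    return SUBSCRIBED.findall(log_text)


SAMPLE_LOG = """\
2022-03-09 15:10:32.585 [Thread-3] INFO c.a.o.canal.adapter.launcher.loader.CanalAdapterWorker - =============> Subscribe destination: example succeed <=============
2022-03-09 15:10:32.794 [main] INFO c.a.o.canal.adapter.launcher.loader.CanalAdapterLoader - Load canal adapter: rdb succeed
2022-03-09 15:10:32.795 [main] INFO c.a.o.canal.adapter.launcher.loader.CanalAdapterService - ## the canal client adapters are running now ......
"""

if __name__ == "__main__":
    print("started:", adapter_started(SAMPLE_LOG))
    print("subscribed:", subscribed_destinations(SAMPLE_LOG))
```

In practice the text would be read from the adapter's log file instead of SAMPLE_LOG; the path depends on the installation and is not assumed here.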