[Big Data] ShardingSphere: Sharded Query Solution


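This post covers query handling for a large table after it has been split horizontally into monthly tables.
Note: the approach below currently covers queries only.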

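Note: when consulting the official documentation, be sure to select the version that matches the one you use.
Official site: https://shardingsphere.apache.org/index_zh.html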

Conclusion: it works, but with limitations!

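Table setup

Example table: biz_trade_profit_detail; historical data: biz_trade_profit_detail_yyyymm
All DML operations run against biz_trade_profit_detail. IDs use MySQL auto-increment, and a script archives the data into monthly tables at the start of each month.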

table:
-- biz_trade_profit_detail
-- biz_trade_profit_detail_202206
-- biz_trade_profit_detail_202205
-- ...

Project changes

Stack: mybatis-plus + Druid
Configuration style: some parts are configured manually in code

Maven dependency:

<dependency>
    <groupId>org.apache.shardingsphere</groupId>
    <artifactId>shardingsphere-jdbc-core</artifactId>
    <version>5.1.1</version>
</dependency>

Configuration file:

dataSources:
  ds0:
    dataSourceClassName: com.alibaba.druid.pool.DruidDataSource
    driverClassName: com.mysql.cj.jdbc.Driver
    url: jdbc:mysql://192.168.0.1:3306/xiaohang?useUnicode=true&characterEncoding=utf8&zeroDateTimeBehavior=convertToNull&useSSL=true&serverTimezone=GMT%2B8
    username: xiaohang
    password: xiaohang

props:
  sql-show: true
  sql-simple: true

rules:
  - !SHARDING
    shardingAlgorithms:
      t-month-inline:
        type: INTERVAL
        props:
          datetime-pattern: 'yyyy-MM-dd HH:mm:ss'
          datetime-lower: '2022-02-01 00:00:00'
          datetime-upper: '2022-08-01 00:00:00'
          sharding-suffix-pattern: 'yyyyMM'
          datetime-interval-amount: 1
          datetime-interval-unit: 'months'
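    tables: # sharding rule configuration
      v_biz_trade_profit_detail: # logical table name; queries must use this name
        actualDataNodes: ds0.biz_trade_profit_detail,ds0.biz_trade_profit_detail_$->{202202..202206}
        # These must match tables that exist in the database; nonexistent tables are not allowed. Tables that do not follow the naming pattern can also be appended here as nodes.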
        tableStrategy:
          standard:
            shardingAlgorithmName: t-month-inline
            shardingColumn: create_time
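#          hint: # Hint sharding strategy
#            shardingAlgorithmName: # sharding algorithm name
#          keyGenerateStrategy: # distributed key generation strategy
#            column: # auto-increment column name; if omitted, no key generator is used
#            keyGeneratorName: # key generation algorithm name

# For what each of these means, see the official documentation.
# Note that properties-file and YAML configuration differ, and that parameter names vary between versions.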

Startup configuration

spring:
  main:
    allow-bean-definition-overriding: true

mybatis-plus data source change:

// File: MybatisPlusConfig.java
// Change:
...
DataSource dataSource = YamlShardingSphereDataSourceFactory.createDataSource(ResourceUtils.getFile("classpath:application-sharding-tables.yml"));
mybatisPlus.setDataSource(dataSource);
...

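Other issues

Querying the current month's data [solvable]

The current month's data lives in biz_trade_profit_detail, which is also the table that receives all writes. It has no date suffix, so the sharding configuration cannot map to it.

Solution 1: create a view for the current month. Empty views for the months of the coming year can be created in one go.

CREATE VIEW `biz_trade_profit_detail_202206` AS select * from `biz_trade_profit_detail`;  -- current month: 202206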

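Solution 2: if only the current month's data is involved and the table is only queried, not modified, simply change the node binding.

actualDataNodes: ds0.biz_trade_profit_detail,ds0.biz_trade_profit_detail_$->{202202..202205}
# Like this: append the table that does not follow the naming pattern, separated by a comma.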
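One caveat with the inline expression: `$->{202202..202205}` is a plain Groovy integer range, so a range that crosses a year boundary (for example 202211..202302) would also enumerate values such as 202213 that are not months. A minimal sketch of generating the explicit node list instead is shown here. The class and method names are hypothetical, not part of ShardingSphere, and the output is meant to be pasted into actualDataNodes.

```java
import java.time.YearMonth;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a ShardingSphere API): builds an explicit actualDataNodes value.
final class ActualDataNodes {
    private static final DateTimeFormatter SUFFIX = DateTimeFormatter.ofPattern("yyyyMM");

    /** Returns "ds.table,ds.table_yyyyMM,..." for every month from first to last, inclusive. */
    static String build(String dataSource, String baseTable, YearMonth first, YearMonth last) {
        List<String> nodes = new ArrayList<>();
        nodes.add(dataSource + "." + baseTable); // the unsuffixed live table
        for (YearMonth month = first; !month.isAfter(last); month = month.plusMonths(1)) {
            nodes.add(dataSource + "." + baseTable + "_" + month.format(SUFFIX));
        }
        return String.join(",", nodes);
    }

    public static void main(String[] args) {
        System.out.println(build("ds0", "biz_trade_profit_detail",
                YearMonth.of(2022, 11), YearMonth.of(2023, 2)));
    }
}
```

Stepping with YearMonth rather than with integers keeps the list limited to real months, whichever years the range spans.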

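Monthly restart and adjustment [when using the view approach]

╮(╯_╰)╭ No way around this one: every month the configuration and the views have to be changed, on top of the table-splitting job itself.

Because the actualDataNodes parameter in the configuration has to change, the project must be restarted.

Workaround: create empty views for a year or more in advance, then drop each view when its month's table is archived manually. No restart is needed.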
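Creating a year of views by hand is tedious, so the statements can be generated. The following is a minimal sketch that prints one CREATE VIEW statement per month, in the same form as the view statement used for the current month. The class name is hypothetical, and the output still has to be reviewed and run against the database manually.

```java
import java.time.YearMonth;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: generates CREATE VIEW statements for a run of consecutive months.
final class MonthlyViewSql {
    private static final DateTimeFormatter SUFFIX = DateTimeFormatter.ofPattern("yyyyMM");

    /** Returns one statement per month, starting at `first` and covering `months` months. */
    static List<String> createViews(String baseTable, YearMonth first, int months) {
        List<String> statements = new ArrayList<>();
        for (int i = 0; i < months; i++) {
            String view = baseTable + "_" + first.plusMonths(i).format(SUFFIX);
            statements.add("CREATE VIEW `" + view + "` AS SELECT * FROM `" + baseTable + "`;");
        }
        return statements;
    }

    public static void main(String[] args) {
        createViews("biz_trade_profit_detail", YearMonth.of(2022, 6), 12)
                .forEach(System.out::println);
    }
}
```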

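Query performance [does not appear to be a problem]

Queries for the current month go through the view. I have not tested whether indexes are used on large data volumes. [They appear to be used, but I have not confirmed it.]
A range query that includes the sharding column only hits the relevant shards; without it, every shard is scanned.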

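Custom sharding algorithm

I originally thought a custom sharding algorithm would solve the irregular shard node problem. It did not.

In my tests the sharding algorithm was never executed for queries. Queries follow the bound nodes, and the algorithm only came into play when table data was modified.

(ノ｀Д)ノ So it is configured, but of no use here.

The Hint sharding strategy, which I first thought would help, is actually for routing by manually supplied rules, that is, sharding driven by independent business logic rather than by table data. So it does not apply here either ε=(′ο｀*))) sigh

The configuration for this seems to differ a lot between versions, so take care.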

Configuration file notes:

# Data source configuration etc. is unchanged
...

rules:
  - !SHARDING
    shardingAlgorithms:
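      t-xiaohang-inline: # an arbitrary name chosen for the sharding algorithm
        type: CLASS_BASED  # custom (class-based) sharding algorithm type
        props:
          strategy: STANDARD # sharding strategy type; this is the standard one
          algorithmClassName: com.lookme.modules.config.MyStandardShardingAlgorithm # fully qualified class name of the algorithm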
    bindingTableGroups: biz_trade_profit_detail
    tables: # sharding rule configuration
      v_biz_trade_profit_detail: # logical table; queries must use this name
        actualDataNodes: ds0.biz_trade_profit_detail,ds0.biz_trade_profit_detail_$->{202202..202205} # must match tables that exist in the database; other tables can be appended with commas
        tableStrategy:
          standard:
            shardingAlgorithmName: t-xiaohang-inline   # algorithm name
            shardingColumn: create_time

Algorithm class:

package com.lookme.modules.config;

import lombok.extern.slf4j.Slf4j;
import org.apache.commons.lang.time.DateFormatUtils;
import org.apache.shardingsphere.sharding.api.sharding.standard.PreciseShardingValue;
import org.apache.shardingsphere.sharding.api.sharding.standard.RangeShardingValue;
import org.apache.shardingsphere.sharding.api.sharding.standard.StandardShardingAlgorithm;

import java.util.*;

@Slf4j
public class MyStandardShardingAlgorithm implements StandardShardingAlgorithm<Date> {
    // Purpose unclear
    private static final String DATABASES_TABLES_RELEVANCE = "databases-tables-relevance";

    // The getter and setter are required; without them the custom props cannot be read
//    @Getter
//    @Setter
//    private Properties props = new Properties();

    @Override
    public String getType() {
        // Corresponds to `type` in the yml: whatever is defined here is what goes there [does not seem to be used, though ╮(╯_╰)╭ ]
        return "xiaohang";
    }

    @Override
    public void init() {
    }

    @Override
    public String doSharding(Collection<String> collection, PreciseShardingValue<Date> shardingValue) {
        // Printed only when DEBUG logging is enabled for this class
        log.debug("=============================>collection: {}, shardingVal: {}", collection, shardingValue);
        // Resolve the date once; fall back to the current time if the value is null
        Date date = shardingValue.getValue();
        if (Objects.isNull(date)) {
            date = new Date();
        }
        String suffix = DateFormatUtils.format(date, "yyyyMM");
        // `collection` holds the available target nodes; the routing logic goes here, adapt as needed
        for (String each : collection) {
            if (each.endsWith(suffix)) {
                return each;
            }
        }
        // No node carries the matching yyyyMM suffix
        return null;
    }

    @Override
    public Collection<String> doSharding(Collection<String> collection, RangeShardingValue<Date> shardingValue) {
        // Printed only when DEBUG logging is enabled for this class
        log.debug("=============================2>collection: {}, shardingVal: {}", collection, shardingValue);
        // No range narrowing is implemented: return every available node rather than null, so a range query scans all shards
        return collection;
    }

}
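The range overload of doSharding above has no routing logic of its own. A minimal sketch of what month-based narrowing could look like is shown here, written against the JDK only so it runs without ShardingSphere. The class MonthRangeRouter is hypothetical; in a real algorithm the bounds would come from the RangeShardingValue. It also assumes that the unsuffixed live table must always be queried, because it holds the current month's rows.

```java
import java.time.LocalDateTime;
import java.time.YearMonth;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper: picks the monthly tables that overlap a create_time range.
final class MonthRangeRouter {
    /** Returns targets whose yyyyMM suffix lies within [lower, upper]; a null bound is open-ended. */
    static Collection<String> route(Collection<String> targets, LocalDateTime lower, LocalDateTime upper) {
        YearMonth from = lower == null ? null : YearMonth.from(lower);
        YearMonth to = upper == null ? null : YearMonth.from(upper);
        Set<String> result = new LinkedHashSet<>();
        for (String target : targets) {
            String suffix = target.substring(target.lastIndexOf('_') + 1);
            if (!suffix.matches("\\d{6}")) {
                result.add(target); // assumption: the unsuffixed live table is always queried
                continue;
            }
            YearMonth month = YearMonth.of(
                    Integer.parseInt(suffix.substring(0, 4)), Integer.parseInt(suffix.substring(4)));
            boolean notBeforeLower = from == null || !month.isBefore(from);
            boolean notAfterUpper = to == null || !month.isAfter(to);
            if (notBeforeLower && notAfterUpper) {
                result.add(target);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Collection<String> targets = Arrays.asList("biz_trade_profit_detail",
                "biz_trade_profit_detail_202204", "biz_trade_profit_detail_202205");
        System.out.println(route(targets, LocalDateTime.of(2022, 5, 1, 0, 0), null));
    }
}
```

Comparing at month granularity keeps a table in the result whenever any part of its month falls inside the range, which errs on the side of querying one table too many rather than missing rows.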

References:

Official: https://shardingsphere.apache.org/document/5.1.1/cn/user-manual/shardingsphere-jdbc/builtin-algorithm/sharding/
Custom algorithm: https://blog.csdn.net/zwjzone/article/details/124497107
https://www.csdn.net/tags/OtTaUg3sMjAyNjEtYmxvZwO0O0OO0O0O.html

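Unsupported SQL

Bottom line: adopt this early in a project; otherwise every feature of the project has to be retested.

Many data operations are not supported, for example statements such as INSERT INTO SELECT *. This seriously affects system functionality [unless the project already enforces strict rules on the SQL it uses].

For details on what is unsupported, see:

Official: https://shardingsphere.apache.org/document/5.1.1/cn/features/sharding/use-norms/sql/#%E4%B8%8D%E6%94%AF%E6%8C%81
Other blog: https://blog.csdn.net/a345203172/article/details/106956278/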

Added: 2022-06-21 21:28:50 Updated: 2022-06-21 21:30:03
