https://shardingsphere.apache.org/document/current/cn/features/sharding/
<dependency>
<groupId>org.apache.shardingsphere</groupId>
<artifactId>sharding-jdbc-spring-boot-starter</artifactId>
<version>4.0.0-RC1</version>
<version>4.0.1</version>
</dependency>
<!-- for spring namespace -->
<dependency>
<groupId>org.apache.shardingsphere</groupId>
<artifactId>sharding-jdbc-spring-namespace</artifactId>
<version>4.0.0-RC1</version>
<version>4.0.1</version>
</dependency>
下面示例是单数据源配置,可以配置多个数据源做主从、分库分表。
?spring.shardingsphere:
enabled: true
props.sql.show: true
datasource:
names: ds0
ds0:
type: com.zaxxer.hikari.HikariDataSource
driver-class-name: com.mysql.cj.jdbc.Driver
jdbc-url: jdbc:mysql://${noob.dburl}?useUnicode=true&characterEncoding=utf8&zeroDateTimeBehavior=convertToNull&useSSL=true&serverTimezone=GMT%2B8&autoReconnect=true&failOverReadOnly=false
username: ${noob.dbuser}
password: ${noob.dbpwd}
data-source-properties: #下面子属性只能为小驼峰格式,同使用数据源中的配置属性
maximumPoolSize: ${datasource.maximumPoolSize}
minimumIdle: ${datasource.minimumIdle}
maxLifetime: 300000
connectionTimeout: 30000
idleTimeout: 30000
connectionTestQuery: SELECT 1
sharding.tables:
limit_use:
actual-data-nodes: ds0.limit_use_0$->{0..9},ds0.limit_use_$->{10..31}
table-strategy.complex.sharding-columns: loan_no,contract_no
table-strategy.complex.algorithm-class-name: cn.noob.algorithm.sharding.MurmurConsistentHashFeatureCodeComplexShardingAlgorithm #扩展ComplexKeysShardingAlgorithm混合分片算法
loan:
actual-data-nodes: ds0.loan_0$->{0..9},ds0.loan_$->{10..31}
table-strategy.standard.sharding-column: loan_no
table-strategy.standard.precise-algorithm-class-name: cn.noob.algorithm.sharding.MurmurConsistentHashFeatureCodeShardingAlgorithm #扩展PreciseShardingAlgorithm标准精确分片算法
sharding.bindingTables[0]: limit_use, repayment_plan #綁定表
绑定表
避免多表关联场景产生笛卡尔积结果!?
sharding.bindingTables[0]: limit_use, repayment_plan
或者
sharding.bindingTables:
- limit_use, repayment_plan, loan, settle_detail
- customer, bank_account
广播表
? 所有的分片数据源中都存在的表结构及其数据均完全一致的表, 用于数据量不大且需要与海量数据的表进行关联查询的场景,例如:字典表。
spring.shardingsphere.sharding.broadcast-tables = t_dict
分页
通常sql语句需要按分表字段过滤,分页功能不能像常规那样指定分页数查询
常规做法下,翻页数太大导致limit 的offset?,所以对于后翻太多页的操作直接拒绝(权当无效操作,为什么不使用精确查找) ; 取每页数据的?lastId,只取不大于它的后面N条。
SELECT * FROM contract where id < #{lastId}
<where>
<if test="id != null and id != 0 "> id < #{id} </if>
</where>
order by id desc limit #{pageSize}
不支持项
SQL :: ShardingSphere
支持跨库关联查询,由 Federation 执行引擎提供支持。基于 XA 协议的两阶段事务和基于最终一致性的柔性事务支持 有限支持子查询, 子查询一定要有分表键!无论嵌套多少层,ShardingSphere都可以解析至第一个包含数据表的子查询,一旦在下层嵌套中再次找到包含数据表的子查询将直接抛出解析异常。简单来说,通过子查询进行非功能需求,在大部分情况下是可以支持的。比如分页、统计总数等;而通过子查询实现业务查询当前并不能支持 。 只能通过SQL字面提取用于分片的值,因此当分片键处于运算表达式或函数中时,ShardingSphere无法提前获取分片键位于数据库中的值,从而无法计算出真正的分片值 。当出现此类分片键处于运算表达式或函数中的SQL时,ShardingSphere将采用全路由的形式获取结果。 ?不能更新分表字段
路由
SELECT i.* FROM t_order o JOIN t_order_item i ON o.order_id=i.order_id WHERE o.order_id in (10, 11);
标准路由?
绑定关系表 指路由使用相同分片规则的主表和子表。绑定表之间的多表关联查询不会出现笛卡尔积关联,关联查询效率将大大提升。此时走的是标准路由。
在不配置绑定表关系时,假设分片键order_id
将数值10路由至第0片,将数值11路由至第1片,那么路由后的SQL应该为4条,它们呈现为笛卡尔积:
SELECT i.* FROM t_order_0 o JOIN t_order_item_0 i ON o.order_id=i.order_id WHERE o.order_id in (10, 11);
SELECT i.* FROM t_order_0 o JOIN t_order_item_1 i ON o.order_id=i.order_id WHERE o.order_id in (10, 11);
SELECT i.* FROM t_order_1 o JOIN t_order_item_0 i ON o.order_id=i.order_id WHERE o.order_id in (10, 11);
SELECT i.* FROM t_order_1 o JOIN t_order_item_1 i ON o.order_id=i.order_id WHERE o.order_id in (10, 11);
在显示申明配置绑定表关系 后,路由的SQL应该为2条:
SELECT i.* FROM t_order_0 o JOIN t_order_item_0 i ON o.order_id=i.order_id WHERE o.order_id in (10, 11);
SELECT i.* FROM t_order_1 o JOIN t_order_item_1 i ON o.order_id=i.order_id WHERE o.order_id in (10, 11);
其中t_order
在FROM的最左侧 ,ShardingSphere将会以它作为整个绑定表的主表。 所有路由计算将会只使用主表的策略 ,那么t_order_item
表的分片计算将会使用t_order
的条件。故绑定表之间的分区键最好是要相同。
笛卡尔路由
笛卡尔路由是最复杂的情况,它无法根据绑定表的关系定位分片规则,因此非绑定表之间的关联查询需要拆解为笛卡尔积组合执行。 如果上个示例中的SQL并未配置绑定表关系,那么路由的结果应为:
SELECT * FROM t_order_0 o JOIN t_order_item_0 i ON o.order_id=i.order_id WHERE order_id IN (1, 2);
SELECT * FROM t_order_0 o JOIN t_order_item_1 i ON o.order_id=i.order_id WHERE order_id IN (1, 2);
SELECT * FROM t_order_1 o JOIN t_order_item_0 i ON o.order_id=i.order_id WHERE order_id IN (1, 2);
SELECT * FROM t_order_1 o JOIN t_order_item_1 i ON o.order_id=i.order_id WHERE order_id IN (1, 2);
笛卡尔路由查询性能较低,需谨慎使用。
DataSourcePropertiesSetter
package org.apache.shardingsphere.spring.boot.datasource;
/**
* Hikari datasource properties setter.
*/
public final class HikariDataSourcePropertiesSetter implements DataSourcePropertiesSetter {
public void propertiesSet(final Environment environment, final String prefix, final String dataSourceName, final DataSource dataSource) {
Properties properties = new Properties();
String datasourcePropertiesKey = prefix + dataSourceName.trim() + ".data-source-properties";
if (PropertyUtil.containPropertyPrefix(environment, datasourcePropertiesKey)) {
Map datasourceProperties = PropertyUtil.handle(environment, datasourcePropertiesKey, Map.class);
properties.putAll(datasourceProperties);
Method method = dataSource.getClass().getMethod("setDataSourceProperties", Properties.class);
method.invoke(dataSource, properties);
}
}
@Override
public String getType() {
return "com.zaxxer.hikari.HikariDataSource";
}
}
通过实现接口 org.apache.shardingsphere.spring.boot.datasource.DataSourcePropertiesSetter 来反射方式给DataSource注入属性。??
分布式主键
在分片规则配置模块可配置每个表的主键生成策略,默认使用雪花算法(snowflake)生成 64bit 的长整型数据。
它保证了趋势递增 ,同时它也有缺陷:
1> 服务器时钟回拨会导致产生重复序列。
生成器需要记录上一次生成序列的时间,配置一个容忍阈值
?如果时钟回拨的时间超过最大容忍的毫秒数阈值,则程序报错; 如果在可容忍的范围内,默认分布式主键生成器会等待时钟同步到最后一次主键生成的时间后再继续工作。 每次发生时间回拨就切换一次workid。?
2> workId(中间10位)耗尽。 要规划好workId的分发及回收的管理策略!
?支持的分片策略
package org.apache.shardingsphere.core.strategy.route;
@NoArgsConstructor(access = AccessLevel.PRIVATE)
public final class ShardingStrategyFactory {
/**
* @param shardingStrategyConfig sharding strategy configuration
* @return sharding strategy instance
*/
public static ShardingStrategy newInstance(final ShardingStrategyConfiguration shardingStrategyConfig) {
if (shardingStrategyConfig instanceof StandardShardingStrategyConfiguration) {
return new StandardShardingStrategy((StandardShardingStrategyConfiguration) shardingStrategyConfig);
}
if (shardingStrategyConfig instanceof InlineShardingStrategyConfiguration) {
return new InlineShardingStrategy((InlineShardingStrategyConfiguration) shardingStrategyConfig);
}
if (shardingStrategyConfig instanceof ComplexShardingStrategyConfiguration) {
return new ComplexShardingStrategy((ComplexShardingStrategyConfiguration) shardingStrategyConfig);
}
if (shardingStrategyConfig instanceof HintShardingStrategyConfiguration) {
return new HintShardingStrategy((HintShardingStrategyConfiguration) shardingStrategyConfig);
}
return new NoneShardingStrategy();
}
}
标准分片StandardShardingStrategy:单一分片键。 精准分片PreciseShardingAlgorithm和范围分片RangeShardingAlgorithm
复合分片ComplexShardingStrategy:多个分片键关联逻辑处理ComplexKeysShardingAlgorithm Hint分片HintShardingStrategy:指定要查询的分库及分表HintShardingAlgorithm 。配置好使用hint策略的分表及分库算法后,HintManager里设置databaseSharding和tableSharding,使用 ThreadLocal 管理分片键值,仅在当前线程内生效。 (还可以通过 SQL 中的特殊注释的方式引用 Hint, 指定了强制分片路由的 SQL 将会无视原有的分片逻辑,直接路由至指定的真实数据节点。) 行表达式分片InlineShardingStrategy?:??在配置中使用 Groovy 表达式,提供对 SQL语句中的 = 和 IN 的分片操作支持,它只支持单分片健的精准规则 。 无实际的分片策略NoneShardingStrategy
源码简析
自动配置类名字居然是:?org.apache.shardingsphere.shardingjdbc.spring.boot.SpringBootConfiguration? !
根据具体的配置差异会自动选择实例化实际的DataSource类型。
ShardingRuleCondition -> ?ShardingDataSource ?->?ShardingPreparedStatement ShardingRuntimeContext -> ShardingRule ->?ShardingRouter#router: 按分片策略路由
MasterSlaveRuleCondition -> MasterSlaveDataSource ->MasterSlavePreparedStatement MasterSlaveRuntimeContext -> MasterSlaveRule ->??MasterSlaveRouter#router :??区分读写请求,写走主库,读按负载策略LoadBalanceAlgorithm路由从库。
下文以4.0.1版本的sharding模式为例 , 4.0.0-RC1 与?4.0.1 版本?在解析sq路由的变动还是挺大的。
2020-03-15 17:24:46.705 [main] INFO [trace=,span=,parent=] ShardingSphere-SQL - Rule Type: sharding
2020-03-15 17:24:46.705 [main] INFO [trace=,span=,parent=] ShardingSphere-SQL - Logic SQL: select * from limit_use u , repayment_plan t where t.loan_no ='LOAN683009630195941376' and u.loan_no = t.loan_no
2020-03-15 17:24:46.705 [main] INFO [trace=,span=,parent=] ShardingSphere-SQL - SQLStatement: SelectSQLStatementContext(super=CommonSQLStatementContext(sqlStatement=org.apache.shardingsphere.sql.parser.sql.statement.dml.SelectStatement@734c98ff, tablesContext=TablesContext(tables=[Table(name=limit_use, alias=Optional.of(u)), Table(name=repayment_plan, alias=Optional.of(t))], schema=Optional.absent())), projectionsContext=ProjectionsContext(startIndex=7, stopIndex=7, distinctRow=false, projections=[ShorthandProjection(owner=Optional.absent())], columnLabels=[id, flow_no, customer_id, customer_name, certificate_no, loan_no, mobile_phone, bank_account_id, bank_account_no, repay_bank_account_id, repay_bank_account_no, contract_no, use_amount, total_periods, term_type, term, channel, loan_type, loan_type_id, repayment_method, due_day, apply_date, accept_date, settle_date, interest_rate, loan_date, interest_start_date, use_status, purpose, trade_channel, trade_status, trade_no, trade_message, business_error_code, reason, version, merge_loan_flag, merge_loan_account, create_time, update_time, id, loan_no, period, should_repayment_date, should_repayment_principal, should_repayment_interest, should_repayment_penalty, should_repayment_fee, actual_repayment_principal, actual_repayment_interest, actual_repayment_penalty, actual_repayment_fee, interest_deduction_amount, penalty_deduction_amount, remaining_principal, is_overdue, actual_repayment_date, repayment_plan_status, grace_days, version, create_time, update_time]), groupByContext=org.apache.shardingsphere.sql.parser.relation.segment.select.groupby.GroupByContext@7a203af7, orderByContext=org.apache.shardingsphere.sql.parser.relation.segment.select.orderby.OrderByContext@360e9b23, paginationContext=org.apache.shardingsphere.sql.parser.relation.segment.select.pagination.PaginationContext@67ee3b1, containsSubquery=false)
2020-03-15 17:24:46.705 [main] INFO [trace=,span=,parent=] ShardingSphere-SQL - Actual SQL: ds0 ::: select * from limit_use_08 u , repayment_plan_08 t where t.loan_no ='LOAN683009630195941376' and u.loan_no = t.loan_no
RoutingResult:
[RoutingUnit(dataSourceName=ds0, masterSlaveLogicDataSourceName=ds0, tableUnits=[TableUnit(logicTableName=limit_use, actualTableName=limit_use_08), TableUnit(logicTableName=repayment_plan, actualTableName=repayment_plan_08)]),
RoutingUnit(dataSourceName=ds0, masterSlaveLogicDataSourceName=ds0, tableUnits=[TableUnit(logicTableName=limit_use, actualTableName=limit_use_00), TableUnit(logicTableName=repayment_plan, actualTableName=repayment_plan_08)])]
org.apache.shardingsphere.api.config.sharding.
ShardingRuleConfiguration
org.apache.shardingsphere.core.rule.
ShardingRule
org.apache.shardingsphere.core.rule.
TableRule
ShardingPreparedStatement#execute
从mybatis执行转入ShardingJdbc逻辑:ShardingDataSource -> ShardingConnection -> ShardingPreparedStatement。
分片入口:?ShardingPreparedStatement#execute() -> BaseShardingEngine#shard()??:
先 PreparedStatementRoutingEngine#route :路由结果SQLRouteResult ; 后 BaseShardingEngine#rewriteAndConvert: 依据上面得到的SQLRouteResult改写sql成真实的库表。? ? ? ? ?
PreparedStatementRoutingEngine#route
public SQLRouteResult route(final List<Object> parameters) {
if (null == sqlStatement) {
sqlStatement = shardingRouter.parse(logicSQL, true);
}
// 先? 分片路由? 再? 主从库路由。
return masterSlaveRouter.route(shardingRouter.route(logicSQL, parameters, sqlStatement));
}
SQLParseEngine.parse??解析sql成语法树? ->?ShardingRouter.route() ->?RoutingEngine.route() : 按分库分表条件路由真实的库表序号
public SQLRouteResult route(final String logicSQL, final List<Object> parameters, final SQLStatement sqlStatement) {
// ShardingStatementValidator.validate 对 insert、update 时 禁止对分表字段更新
Optional<ShardingStatementValidator> shardingStatementValidator = ShardingStatementValidatorFactory.newInstance(sqlStatement);
if (shardingStatementValidator.isPresent()) {
shardingStatementValidator.get().validate(shardingRule, sqlStatement, parameters);
}
SQLStatementContext sqlStatementContext = SQLStatementContextFactory.newInstance(metaData.getRelationMetas(), logicSQL, parameters, sqlStatement);
Optional<GeneratedKey> generatedKey = sqlStatement instanceof InsertStatement
? GeneratedKey.getGenerateKey(shardingRule, metaData.getTables(), parameters, (InsertStatement) sqlStatement) : Optional.<GeneratedKey>absent();
// 入参条件
ShardingConditions shardingConditions = getShardingConditions(parameters, sqlStatementContext, generatedKey.orNull(), metaData.getRelationMetas());
boolean needMergeShardingValues = isNeedMergeShardingValues(sqlStatementContext);
if (sqlStatementContext.getSqlStatement() instanceof DMLStatement && needMergeShardingValues) {
checkSubqueryShardingValues(sqlStatementContext, shardingConditions);
mergeShardingConditions(shardingConditions);
}
RoutingEngine routingEngine = RoutingEngineFactory.newInstance(shardingRule, metaData, sqlStatementContext, shardingConditions);
// 通过条件和分表规则 路由执行
RoutingResult routingResult = routingEngine.route();
if (needMergeShardingValues) {
// ?needMergeShardingValues == true 需要判定RoutingUnits 惟一
Preconditions.checkState(1 == routingResult.getRoutingUnits().size(), "Must have one sharding with subquery.");
}
SQLRouteResult result = new SQLRouteResult(sqlStatementContext, shardingConditions, generatedKey.orNull());
result.setRoutingResult(routingResult);
if (sqlStatementContext instanceof InsertSQLStatementContext) {
setGeneratedValues(result);
}
return result;
}
这里重点关注RoutingEngineFactory#newInstance 依据 分片规则和 sql执行上下文等创建的不同场景RoutingEngine
public static RoutingEngine newInstance(final ShardingRule shardingRule,
final ShardingSphereMetaData metaData, final SQLStatementContext sqlStatementContext, final ShardingConditions shardingConditions) {
SQLStatement sqlStatement = sqlStatementContext.getSqlStatement();
Collection<String> tableNames = sqlStatementContext.getTablesContext().getTableNames();
if (sqlStatement instanceof TCLStatement) {
return new DatabaseBroadcastRoutingEngine(shardingRule); //授权、角色控制等数据库控制语言: 全库路由, 基于每个DataSourceName构建一个RoutingUnit
}
if (sqlStatement instanceof DDLStatement) {
return new TableBroadcastRoutingEngine(shardingRule, metaData.getTables(), sqlStatementContext); //数据定义于燕燕:全库表路由
}
if (sqlStatement instanceof DALStatement) {
return getDALRoutingEngine(shardingRule, sqlStatement, tableNames);
}
if (sqlStatement instanceof DCLStatement) {
return getDCLRoutingEngine(shardingRule, sqlStatementContext, metaData);
}
if (shardingRule.isAllInDefaultDataSource(tableNames)) {
return new DefaultDatabaseRoutingEngine(shardingRule, tableNames); // 默认的数据库
}
if (shardingRule.isAllBroadcastTables(tableNames)) {
return sqlStatement instanceof SelectStatement ? new UnicastRoutingEngine(shardingRule, tableNames) : new DatabaseBroadcastRoutingEngine(shardingRule);
// UnicastRoutingEngine 代表单播路由,用于获取某一真实表信息的场景,它只需要从任意库中的任意真实表中获取数据即可。例如 DESCRIBE 语句就适合使用 UnicastRoutingEngine,因为每个真实表中的数据描述结构都是相同的。
}
if (sqlStatementContext.getSqlStatement() instanceof DMLStatement && tableNames.isEmpty() && shardingRule.hasDefaultDataSourceName()) {
return new DefaultDatabaseRoutingEngine(shardingRule, tableNames);
}
if (sqlStatementContext.getSqlStatement() instanceof DMLStatement && shardingConditions.isAlwaysFalse() || tableNames.isEmpty() || !shardingRule.tableRuleExists(tableNames)) {
return new UnicastRoutingEngine(shardingRule, tableNames);
}
return getShardingRoutingEngine(shardingRule, sqlStatementContext, shardingConditions, tableNames); // StandardRoutingEngine(单个表或者有绑定关系的多个表) 或 ComplexRoutingEngine?
}
对于不分片的表,会判定是否有默认的数据库配置ShardingDataSourceNames#getDefaultDataSourceName :
// 只有1个数据库,则它就是默认;否则取配置的default
public String getDefaultDataSourceName() {
return 1 == dataSourceNames.size() ? dataSourceNames.iterator().next() : shardingRuleConfig.getDefaultDataSourceName();
}
?StandardRoutingEngine#route()? -> #getDataNodes:
private Collection<DataNode> getDataNodes(final TableRule tableRule) {
if (isRoutingByHint(tableRule)) { // 先判定指定datebase和table都是Hint策略!!!
return routeByHint(tableRule);
}
if (isRoutingByShardingConditions(tableRule)) { // datebase和table都不是Hint策略
return routeByShardingConditions(tableRule);
}
return routeByMixedConditions(tableRule);
}
?#routeByShardingConditions-> #routeTables :? 根据配置的分片规则获取指定的算法ShardingStrategy来计算出路由的实际的分库及分表序列
?ShardingStrategy
StandardShardingStrategy#doSharding (单字段): 只取RouteValue 集合中的第一个值作为判定依据
ComplexShardingStrategy (混合字段): 当ListRouteValue传入的需要计算分片规则的数据值有N个就会计算N次分表,通过TreeSet去重?
public final class ListRouteValue<T extends Comparable<?>> implements RouteValue {
// 表字段
private final String columnName;
// 表
private final String tableName;
// 字段入参值
private final Collection<T> values;