MySQL COLLATE
COLLATE in MySQL
In MySQL, COLLATE most often appears in string column definitions and in table definitions, for example COLLATE utf8mb4_general_ci:
CREATE TABLE `play` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(20) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`age` int(11) DEFAULT '0',
`mobile` varchar(11) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;
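COLLATE and CHARSET
- CHARSET sets the character set (encoding) of a string. utf8 is the common choice, but for legacy reasons MySQL's utf8 stores at most 3 bytes per character, so 4-byte characters cannot be stored; those require utf8mb4.
- COLLATE is tied to CHARSET and defines how strings are sorted and compared. For example, utf8mb4_general_ci is a collation belonging to utf8mb4. The ci suffix means Case Insensitive; the corresponding cs suffix means Case Sensitive.
- Everything in the database that sorts or compares strings is affected by the collation.
- To list all character sets and collations available on the server: SHOW CHARSET; SHOW COLLATION;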
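As a small illustration of the points above, the following sketch narrows the collation list to utf8mb4 and compares two literals under a case-insensitive collation and under a binary one. It assumes a MySQL server with utf8mb4 available; utf8mb4_bin is used here only as a convenient stand-in for a case-sensitive comparison, and the column aliases are made up for the example.

```sql
-- List only the collations that belong to utf8mb4.
SHOW COLLATION WHERE Charset = 'utf8mb4';

-- Case-insensitive collation: 'a' and 'A' compare as equal (result 1).
SELECT _utf8mb4'a' COLLATE utf8mb4_general_ci = _utf8mb4'A' AS ci_equal;

-- Binary collation compares code points, so case matters (result 0).
SELECT _utf8mb4'a' COLLATE utf8mb4_bin = _utf8mb4'A' AS bin_equal;
```

The same difference shows up in ORDER BY, GROUP BY, DISTINCT, and unique indexes, since all of them rely on string comparison.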
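Levels at which COLLATE can be set
A collation can be specified at the instance, database, table, column, and SQL statement levels.
- Database level:
CREATE DATABASE <db_name> DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
- Table and column level:
CREATE TABLE tablename (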
`name` varchar(20) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
...
...
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;
- SQL statement level:
SELECT DISTINCT field1 COLLATE utf8mb4_general_ci FROM table1;
SELECT field1, field2 FROM table1 ORDER BY field1 COLLATE utf8mb4_general_ci;
select * from table1,table2 where table1.field = table2.field
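To see which collation is actually in effect at each of these levels, something like the following can be run. This is a sketch that assumes the `play` table from the beginning of the article exists in the currently selected database; substitute your own table name.

```sql
-- Instance level: server-wide defaults.
SELECT @@character_set_server, @@collation_server;

-- Database level: defaults of the currently selected database.
SELECT DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME
FROM information_schema.SCHEMATA
WHERE SCHEMA_NAME = DATABASE();

-- Table level: default collation of one table.
SELECT TABLE_NAME, TABLE_COLLATION
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = 'play';

-- Column level: the Collation column shows each string column's collation.
SHOW FULL COLUMNS FROM play;
```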
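Summary
- The collations most commonly used in China are utf8mb4_unicode_ci and utf8mb4_general_ci. For our purposes there is little difference between the two, and either can be used.
- When collations are configured at several levels, the precedence is: SQL statement > column > table > database > instance.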
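The precedence rule can be checked with a throwaway table. In this sketch the table `collate_demo` and its column names are hypothetical, invented only for the demonstration; the COLLATION() function reports which collation an expression ends up with.

```sql
-- Table default is utf8mb4_unicode_ci; one column overrides it.
CREATE TABLE collate_demo (
  inherited_col varchar(20),
  explicit_col  varchar(20) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

INSERT INTO collate_demo VALUES ('a', 'a');

SELECT
  COLLATION(inherited_col)                    AS from_table,      -- utf8mb4_unicode_ci
  COLLATION(explicit_col)                     AS from_column,     -- utf8mb4_general_ci
  COLLATION(explicit_col COLLATE utf8mb4_bin) AS from_statement   -- utf8mb4_bin
FROM collate_demo;

DROP TABLE collate_demo;
```

The column without its own COLLATE inherits the table default, the column with one keeps it, and a COLLATE clause in the statement overrides both.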
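A pitfall from our project
After our company's project had been maintained by many different people, conventions were not enforced strictly enough. The same column in different tables ended up with two different collations, utf8mb4_general_ci and utf8mb4_unicode_ci, so a join on that column could not compare the values: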
SELECT a.mobile from play a INNER JOIN user_t b
ON a.mobile = b.mobile
> 1267 - Illegal mix of collations (utf8mb4_general_ci,IMPLICIT) and (utf8mb4_unicode_ci,IMPLICIT) for operation '='
> Time: 0.003s
Specifying the collation explicitly in the statement makes the comparison work:
SELECT a.mobile from play a INNER JOIN user_t b
ON a.mobile = b.mobile COLLATE utf8mb4_general_ci
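The statement-level COLLATE is a per-query workaround, and converting an indexed column on the fly may keep the optimizer from using that index. A more durable option is to find the inconsistent columns and align them. The sketch below assumes `user_t.mobile` is a `varchar(11) DEFAULT NULL` column like the one in `play` (the article does not show its definition), so adjust the type to match the real schema and try it on a copy of the data first.

```sql
-- Find column names that appear with more than one collation
-- in the current database.
SELECT COLUMN_NAME,
       GROUP_CONCAT(DISTINCT COLLATION_NAME) AS collations
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND COLLATION_NAME IS NOT NULL
GROUP BY COLUMN_NAME
HAVING COUNT(DISTINCT COLLATION_NAME) > 1;

-- Align one column with the collation used in `play`
-- (column definition is an assumption, see above).
ALTER TABLE user_t
  MODIFY mobile varchar(11)
  CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL;
```

Once both columns share a collation, the original join runs without the COLLATE clause.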