[大数据] MySQL索引优化

开发: C++知识库 Java知识库 JavaScript Python PHP知识库人工智能区块链大数据移动开发嵌入式开发工具数据结构与算法开发测试游戏开发网络协议系统运维
教程: HTML教程 CSS教程 JavaScript教程 Go语言教程 JQuery教程 VUE教程 VUE3教程 Bootstrap教程 SQL数据库教程 C语言教程 C++教程 Java教程 Python教程 Python3教程 C#教程
数码: 电脑笔记本显卡显示器固态硬盘硬盘耳机手机 iphone vivo oppo 小米华为单反装机图拉丁

-> 大数据 -> MySQL索引优化 -> 正文阅读

[大数据]MySQL索引优化

文章很长你忍一下

1.索引分析

1.1 单表优化

1.1.1 案例

CREATE TABLE IF NOT EXISTS `article`(
`id` INT(10) UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT,
`author_id` INT (10) UNSIGNED NOT NULL,
`category_id` INT(10) UNSIGNED NOT NULL , 
`views` INT(10) UNSIGNED NOT NULL , 
`comments` INT(10) UNSIGNED NOT NULL,
`title` VARBINARY(255) NOT NULL,
`content` TEXT NOT NULL
);

insert into `article`(author_id,category_id,views,comments,title,content) values
(1,1,1,1,'1','1'),
(2,2,2,2,'2','2'),
(1,1,3,3,'3','3');

执行SQL查询语句

select id, author_id from article 
where category_id=1 and comments>1
order by views desc limit 1;

在这里插入图片描述
使用explain查看SQL性能

explain select id, author_id from article 
where category_id=1 and comments>1 
order by views desc limit 1;

在这里插入图片描述
因为where子句中使用到的字段没有加索引所以type是ALL全表扫描

1.1.2 优化

先给字段加索引使用到了三个字段分别是category_id、comments、views
先尝试添加联合索引并查看索引

create index idx_article_cv on article(category_id, comments, views);
show index from article;

在这里插入图片描述
在执行explain语句查看SQL语句的性能

explain 
select id, author_id from article 
where category_id=1 and comments>1 
order by views desc limit 1;

在这里插入图片描述
type是range 这是因为where子句中有>范围搜索这是可以接受的
但重要的是Extra 中有Using filesort 这个问题有些严重
我们建立的索引是联合索引(category_id, comments, views), 在comments>1范围搜索之后的views索引会失效 order by 子句不会使用索引排序而是使用了文件排序。

解决方法是建立（category_id, views）的联合索引
先将之前建立的索引drop一下在建立新的联合索引

drop index idx_article_cv on article;
create index idx_article_cv on article(category_id, views);

再次查看SQL性能

explain 
select id, author_id from article 
where category_id=1 and comments>1 
order by views desc limit 1;

在这里插入图片描述
Using filesort变成了Using where

1.2 两表优化

1.2.1 案例

建表语句

CREATE TABLE IF NOT EXISTS `class`(
`id` INT(10) UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT,
`card` INT (10) UNSIGNED NOT NULL
);
CREATE TABLE IF NOT EXISTS `book`(
`bookid` INT(10) UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT,
`card` INT (10) UNSIGNED NOT NULL
);

insert into class(card) values(floor(1+(rand()*20)));
insert into class(card) values(floor(1+(rand()*20)));
insert into class(card) values(floor(1+(rand()*20)));
insert into class(card) values(floor(1+(rand()*20)));
insert into class(card) values(floor(1+(rand()*20)));
insert into class(card) values(floor(1+(rand()*20)));
insert into class(card) values(floor(1+(rand()*20)));
insert into class(card) values(floor(1+(rand()*20)));
insert into class(card) values(floor(1+(rand()*20)));
insert into class(card) values(floor(1+(rand()*20)));
insert into class(card) values(floor(1+(rand()*20)));
insert into class(card) values(floor(1+(rand()*20)));
insert into class(card) values(floor(1+(rand()*20)));
insert into class(card) values(floor(1+(rand()*20)));
insert into class(card) values(floor(1+(rand()*20)));
insert into class(card) values(floor(1+(rand()*20)));
insert into class(card) values(floor(1+(rand()*20)));
insert into class(card) values(floor(1+(rand()*20)));
insert into class(card) values(floor(1+(rand()*20)));
insert into class(card) values(floor(1+(rand()*20)));

insert into book(card) values(floor(1+(rand()*20)));
insert into book(card) values(floor(1+(rand()*20)));
insert into book(card) values(floor(1+(rand()*20)));
insert into book(card) values(floor(1+(rand()*20)));
insert into book(card) values(floor(1+(rand()*20)));
insert into book(card) values(floor(1+(rand()*20)));
insert into book(card) values(floor(1+(rand()*20)));
insert into book(card) values(floor(1+(rand()*20)));
insert into book(card) values(floor(1+(rand()*20)));
insert into book(card) values(floor(1+(rand()*20)));
insert into book(card) values(floor(1+(rand()*20)));
insert into book(card) values(floor(1+(rand()*20)));
insert into book(card) values(floor(1+(rand()*20)));
insert into book(card) values(floor(1+(rand()*20)));
insert into book(card) values(floor(1+(rand()*20)));
insert into book(card) values(floor(1+(rand()*20)));
insert into book(card) values(floor(1+(rand()*20)));
insert into book(card) values(floor(1+(rand()*20)));
insert into book(card) values(floor(1+(rand()*20)));
insert into book(card) values(floor(1+(rand()*20)));

执行两表联查

select * from class left join book on class.card=book.card;

在这里插入图片描述
查看SQL性能

explain select * from class 
left join book on class.card=book.card;

在这里插入图片描述
因为两个表的card字段都没有建立索引，所以不出预料type都是ALL

1.2.2 优化

先给左表card加索引

alter table class add index Y(card);
show index from class;

在这里插入图片描述
查看SQL性能
尝试给右表card加索引
先drop左表的索引

drop index Y on class;

给右表加索引

alter table book add index Y(card);
show index from book;

在这里插入图片描述
查看SQL性能

因为是左连接 left join，左表一定是全表扫描的从右表搜索
所以左连接给右表字段加索引

下面是两表都加索引的性能结果：
在这里插入图片描述
虽然class的type是index 但是rows是20 依然是扫描全表
结论：左连接对右表加索引右连接对左表加索引

1.3 三表优化

CREATE TABLE IF NOT EXISTS `phone`(
`phoneid` INT(10) UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT,
`card` INT (10) UNSIGNED NOT NULL
)ENGINE = INNODB;

insert into phone(card) values(floor(1+(rand()*20)));
insert into phone(card) values(floor(1+(rand()*20)));
insert into phone(card) values(floor(1+(rand()*20)));
insert into phone(card) values(floor(1+(rand()*20)));
insert into phone(card) values(floor(1+(rand()*20)));
insert into phone(card) values(floor(1+(rand()*20)));
insert into phone(card) values(floor(1+(rand()*20)));
insert into phone(card) values(floor(1+(rand()*20)));
insert into phone(card) values(floor(1+(rand()*20)));
insert into phone(card) values(floor(1+(rand()*20)));
insert into phone(card) values(floor(1+(rand()*20)));
insert into phone(card) values(floor(1+(rand()*20)));
insert into phone(card) values(floor(1+(rand()*20)));
insert into phone(card) values(floor(1+(rand()*20)));
insert into phone(card) values(floor(1+(rand()*20)));
insert into phone(card) values(floor(1+(rand()*20)));
insert into phone(card) values(floor(1+(rand()*20)));
insert into phone(card) values(floor(1+(rand()*20)));
insert into phone(card) values(floor(1+(rand()*20)));
insert into phone(card) values(floor(1+(rand()*20)));

三表联查使用的是class、book 和 phone三张表
首先要将1.2中的class、book除了主键的其它索引全部drop掉

show index from class;
drop index YL on class;
show index from book;
drop index Y on book;

查看三表联查的性能

explain select * from class 
left join book on class.card=book.card 
left join phone on book.card=phone.card;

在这里插入图片描述
基于1.2的结论，索引应该建立在book 和 phone表上

alter table book add index Y(card);
alter table phone add index Z(card);

在这里插入图片描述
class表作为最外层表一定是全部扫描的 rows一定是20
其它两张表的rows被优化到了1行数据；表示通过索引查找到记录
不需要读取行数据遍历，优化了查询效率。
结论：索引需要建立在需要经常查询的字段上。

1.4 结论

尽可能减少Join语句中的NestedLoop的循环次数：永远用小的结果集驱动大的结果集；
比如在实际业务上class(书的类别)表的数据一定是小于book的数据数；使用class作为主表来驱动book表效率会比使用book作为主表驱动class表高得多。
优先优化NestedLoop语句的内层循环；
要先保证最里面的执行效率才能优化外层语句。
保证join语句中被驱动表上的join条件字段已经被索引
在无法保证被驱动表的join字段被索引且内存资源充足的前提下，不要太吝啬joinBuffer的设置。

2.索引失效(应该避免)

建表语句

CREATE TABLE staffs(
id INT PRIMARY KEY AUTO_INCREMENT,
`name` VARCHAR(24)NOT NULL DEFAULT'' COMMENT'姓名',
`age` INT NOT NULL DEFAULT 0 COMMENT'年龄',
`pos` VARCHAR(20) NOT NULL DEFAULT'' COMMENT'职位',
`add_time` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT'入职时间'
)CHARSET utf8 COMMENT'员工记录表';

insert into staffs(NAME,age,pos,add_time) values('z3',22,'manager',NOW());
insert into staffs(NAME,age,pos,add_time) values('July',23,'dev',NOW());
insert into staffs(NAME,age,pos,add_time) values('2000',23,'dev',NOW());


CREATE TABLE tbl_user(
`id` INT(11) NOT NULL AUTO_INCREMENT,
`name` VARCHAR(20) DEFAULT NULL,
`age`INT(11) DEFAULT NULL,
`email` VARCHAR(20) DEFAULT NULL,
PRIMARY KEY(`id`)
)ENGINE=INNODB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;

insert into tbl_user(NAME,age,email) values('1aa1',21,'b@163.com');
insert into tbl_user(NAME,age,email) values('2aa2',222,'a@163.com');
insert into tbl_user(NAME,age,email) values('3aa3',265,'c@163.com');
insert into tbl_user(NAME,age,email) values('4aa4',21,'d@163.com');


create table test03(
id int primary key not null auto_increment,
c1 char(10),
c2 char(10),
c3 char(10),
c4 char(10),
c5 char(10)
);

insert into test03(c1,c2,c3,c4,c5) values('a1','a2','a3','a4','a5');
insert into test03(c1,c2,c3,c4,c5) values('b1','b2','b3','b4','b5');
insert into test03(c1,c2,c3,c4,c5) values('c1','c2','c3','c4','c5');
insert into test03(c1,c2,c3,c4,c5) values('d1','d2','d3','d4','d5');
insert into test03(c1,c2,c3,c4,c5) values('e1','e2','e3','e4','e5');

2.1 对于联合索引全值匹配查询效率最高

案例：
对staffs表建立联合索引 并执行而小盘explain查看查询效率

alter table staffs add index idx_staffs_nameAgePos(name, age, pos);
explain select name, age, pos from staffs where name='July' and age=23 and pos='dev';

在这里插入图片描述
这句查询语句使用到了(name, age, pos) 三个字段的索引 key_len=140

explain select name, age, pos from staffs where name='July' and age=23;

在这里插入图片描述
这句查询只用到了name age两个字段的索引所以key_len=78

2.2 最佳左前缀法则

使用联合索引查询时不能跳过中间索引，否则之后的索引会失效

explain select * from staffs where age=23 and pos='dev';

在这里插入图片描述
跳过name索引，导致age和pos字段索引失效

explain select * from staffs where name='July' and pos='dev';

在这里插入图片描述
跳过了age字段，联合索引的pos字段失效。因为key_len时74 只用到了name字段的索引。

2.3. 在索引列上做操作

在索引列上做操作(计算、函数、自动或手动的类型转换)，会导致索引失效而转向全表扫描
对比以下两条语句的查询效率

explain select * from staffs where name='July';

在这里插入图片描述

explain select * from staffs where left(name, 4)='July';

在这里插入图片描述
对那么字段使用left函数索引失效转变为了全表扫描。

2.4 范围查找之后的索引会全部失效

explain select * from staffs where name='July' and age>20 and pos='dev';

在这里插入图片描述
key_len为78，只用到了name 和 age索引 age之后的pos索引失效

2.5 尽量使用覆盖索引尽量不使用select * 查询

explain select * from staffs where name='July' and age=23 and pos='dev';

在这里插入图片描述

explain select name, age, pos from staffs where name='July' and age=23 and pos='dev';

在这里插入图片描述
using index 表示使用了覆盖索引，Using where 是因为索引字段不是唯一的需要使用where 过滤。

2.6 使用不等于(!= <>)索引失效

MySQL使用不等于(!= <>)查询时索引失效会导致全表或全索引扫描

explain select * from staffs where name<>'July';

在这里插入图片描述

explain select name, age, pos from staffs where name<>'July';

在这里插入图片描述

2.7 is null 和 is not null无法使用索引

explain select * from staffs where name is null;

在这里插入图片描述

2.8 like通配符开头的mysql索引失效会变成全表扫描

%开头的索引会失效如 %July
%放在后面 July% 不会引起索引失效是range查询因为索引底层是根据字母顺序来排序的

explain select * from staffs where name like 'Jul%';

在这里插入图片描述
range 表示使用到了索引

explain select * from staffs where name like '%Jul%';

在这里插入图片描述
ALL表示遍历整个表索引失效

explain select name, age, pos from staffs where name like 'Jul%';

在这里插入图片描述
type是index表示遍历整个索引树但是不遍历行数据一般使用覆盖索引的场景出现

explain select name, age, pos from staffs where name like '%Jul%';

在这里插入图片描述
因为使用了覆盖索引，就算%放在前面type也是index 遍历索引不遍历整个表。

面试题：如何有优化需要将%放在模糊查询字符之前的SQL？
答：使用覆盖索引可以将type优化到index(注意覆盖索引是一种优化方式不是一个索引类型，覆盖索引通过建立联合索引实现)

2.9 自动类型转换会使索引失效

explain select * from staffs where name=2000;

在这里插入图片描述
本应该是ref 非唯一性的索引扫描因为自动的类型转换导致索引失效。

2.10 使用or连接会导致索引失效

explain select * from staffs where name='z3' or name='July';

在这里插入图片描述

大数据最新文章

实现Kafka至少消费一次

亚马逊云科技：还在苦于ETL？Zero ETL的时代

初探MapReduce

【SpringBoot框架篇】32.基于注解+redis实现

Elasticsearch：如何减少 Elasticsearch 集

Go redis操作

Redis面试题

专题五 Redis高并发场景

基于GBase8s和Calcite的多数据源查询

Redis——底层数据结构原理

加:2021-10-15 11:51:47 更:2021-10-15 11:52:15

360图书馆购物三丰科技阅读网日历万年历 2025年7日历

-2025/7/15 6:49:00-

图片自动播放器
↓图片自动播放器↓

TxT小说阅读器
↓语音阅读,小说下载,古典文学↓

一键清除垃圾
↓轻轻一点,清除系统垃圾↓

图片批量下载器
↓批量下载图片,美女图库↓

网站联系: qq:121756557 email:121756557@qq.com IT数码