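I worked through the LeetCode SQL problems once while learning SQL, but a single pass did not stick, and I soon forgot the specific approaches. Here I summarize five of them: Q176 (Second Highest Salary), Q177 (Nth Highest Salary), Q178 (Rank Scores), Q184 (Department Highest Salary), and Q185 (Department Top Three Salaries), to get a better grasp of ranking and group-wise filtering problems. My answers to the LeetCode SQL problems are in Github-LeetCode; stars and issues are welcome.

Taken together, the points these five LeetCode problems test can be extended in the following directions:

- Group-wise filtering (the maximum, the Nth value, the top N values)
- Ranking

The examples below use two tables, `empl` and `employee`: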
CREATE TABLE `empl` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`salary` int(11) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
CREATE TABLE `employee` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`salary` int(11) NOT NULL,
`deparment` varchar(64) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
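Filtering without grouping: getting the top N values

Self left join

The goal is to select the three highest salaries in the empl table. Assume the data set contains no duplicates. Then the largest value has no value greater than itself, the second largest has exactly one greater value, and the third largest has exactly two. Following this pattern, joining each value with the values greater than it and counting per group yields the top N values. The SELECT DISTINCT subqueries enforce the no-duplicates assumption. The SQL is as follows: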
SELECT
a.salary AS salary
FROM
(
SELECT DISTINCT
salary
FROM
empl
) a
LEFT JOIN (
SELECT DISTINCT
salary
FROM
empl
) b ON (a.salary < b.salary)
GROUP BY
a.salary
HAVING
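-- COUNT(b.salary) skips the NULL row that the LEFT JOIN produces for the
-- maximum, so the same condition also works for N = 1
COUNT(b.salary) < 3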
ORDER BY
salary DESC;
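To see why the counting works, it can help to look at the per-salary counts before the HAVING filter is applied. The sketch below assumes a few hypothetical sample rows (they are not part of the original post) and uses `COUNT(b.salary)`, which counts only matched rows, so the maximum salary shows 0 higher values. With `COUNT(*)`, the NULL-padded row produced by the LEFT JOIN would be counted as well and the maximum would show 1.

```sql
-- Hypothetical sample data, for illustration only
INSERT INTO empl (salary) VALUES (100), (200), (200), (300), (400);

-- For each distinct salary, count how many distinct salaries are higher
SELECT
    a.salary,
    COUNT(b.salary) AS higher_count
FROM (SELECT DISTINCT salary FROM empl) a
LEFT JOIN (SELECT DISTINCT salary FROM empl) b
    ON a.salary < b.salary
GROUP BY a.salary
ORDER BY a.salary DESC;
```

On an otherwise empty table this returns (400, 0), (300, 1), (200, 2), (100, 3), so keeping the rows whose count is below 3 leaves 400, 300, and 200.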
Rewriting the left join above as a correlated subquery gives the following SQL:
SELECT
salary
FROM
(
SELECT DISTINCT
salary
FROM
empl
) a
WHERE
3 > (
SELECT
count(*)
FROM
(
SELECT DISTINCT
salary
FROM
empl
) b
WHERE
a.salary < b.salary
)
ORDER BY
salary DESC;
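The same count can also pick out a single position instead of a range, which is the shape of Q176 and Q177: the Nth highest distinct salary is the one with exactly N - 1 distinct salaries above it. Below is a minimal sketch for N = 2, assuming the empl table above. Note that it returns an empty result, not a NULL row, when fewer than N distinct salaries exist.

```sql
-- Second highest distinct salary: exactly 1 distinct salary is higher.
-- Replace the literal 1 with N - 1 to get the Nth highest.
SELECT
    a.salary
FROM (SELECT DISTINCT salary FROM empl) a
WHERE 1 = (
    SELECT COUNT(*)
    FROM (SELECT DISTINCT salary FROM empl) b
    WHERE a.salary < b.salary
);
```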
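Filtering with grouping

Left join

The goal is to select the three highest salaries in each department of the employee table. This only requires adding a grouping condition to the approach above. As before, the data set must contain no duplicate elements, which the SELECT DISTINCT subqueries ensure: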
SELECT
a.salary,
a.deparment
FROM
(
SELECT DISTINCT
salary,
deparment
FROM
employee
) a
LEFT JOIN (
SELECT DISTINCT
salary,
deparment
FROM
employee
) b ON (
a.deparment = b.deparment
AND a.salary < b.salary
)
GROUP BY
a.deparment,
a.salary
HAVING
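-- COUNT(b.salary) skips the NULL row that the LEFT JOIN produces for each
-- department's maximum, so the same condition also works for N = 1
COUNT(b.salary) < 3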
ORDER BY
a.deparment,
a.salary DESC;
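The count from this join is also a ranking: the number of distinct higher salaries in the same department, plus one, is the salary's dense rank within that department. This is the idea behind ranking problems such as Q178. Here is a sketch under the same assumptions as the query above; the alias `salary_rank` is introduced for illustration only.

```sql
SELECT
    a.deparment,
    a.salary,
    COUNT(b.salary) + 1 AS salary_rank  -- 0 higher salaries means rank 1
FROM (SELECT DISTINCT salary, deparment FROM employee) a
LEFT JOIN (SELECT DISTINCT salary, deparment FROM employee) b
    ON a.deparment = b.deparment
    AND a.salary < b.salary
GROUP BY
    a.deparment,
    a.salary
ORDER BY
    a.deparment,
    salary_rank;
```

Filtering this result on `salary_rank <= 3` gives the top three per department, and `salary_rank = 1` gives each department's highest salary as in Q184.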
Rewritten as a correlated subquery:
SELECT
*
FROM
(
SELECT DISTINCT
deparment,
salary
FROM
employee
) a
WHERE
3 > (
SELECT
count(*)
FROM
(
SELECT DISTINCT
deparment,
salary
FROM
employee
) b
WHERE
a.deparment = b.deparment
AND a.salary < b.salary
)
ORDER BY
a.deparment,
a.salary DESC;
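On MySQL 8.0 or later, which supports window functions, the same result can be expressed without a self join. This is an alternative to the approach in this post, not a replacement for it. `DENSE_RANK()` gives tied salaries the same rank with no gaps, which matches counting distinct higher salaries, so it should return the same rows as long as `deparment` is not NULL. The aliases `rnk` and `ranked` are illustrative.

```sql
SELECT
    deparment,
    salary
FROM (
    -- DISTINCT collapses employees that share a department and salary;
    -- such duplicates always receive the same dense rank
    SELECT DISTINCT
        deparment,
        salary,
        DENSE_RANK() OVER (
            PARTITION BY deparment
            ORDER BY salary DESC
        ) AS rnk
    FROM employee
) ranked
WHERE rnk <= 3
ORDER BY
    deparment,
    salary DESC;
```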
The same pattern can be tried on another table, demo_user, defined and populated as follows:
CREATE TABLE `demo_user` (
`id` varchar(100) NOT NULL,
`name` varchar(100) NOT NULL,
`age` int DEFAULT NULL,
`address` varchar(100) DEFAULT NULL,
`create_time` timestamp NULL DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3;
INSERT INTO zhong_test_2.demo_user
(id, name, age, address, create_time)
VALUES('2', 'zhong', 1, 'gsgfsfgs', '2021-12-10 12:12:12');
INSERT INTO zhong_test_2.demo_user
(id, name, age, address, create_time)
VALUES('3', 'zhong', 12, 'gsgfsfgs', '2021-11-10 12:12:12');
INSERT INTO zhong_test_2.demo_user
(id, name, age, address, create_time)
VALUES('1', 'zhong', 12, 'gsgfsfgs', '2021-12-10 12:12:12');
INSERT INTO zhong_test_2.demo_user
(id, name, age, address, create_time)
VALUES('4', 'zhong', 15, 'gsgfsfgs', '2021-10-10 12:12:12');
INSERT INTO zhong_test_2.demo_user
(id, name, age, address, create_time)
VALUES('5', 'li', 12, 'gsgfsfgs', '2021-12-10 12:12:12');
INSERT INTO zhong_test_2.demo_user
(id, name, age, address, create_time)
VALUES('6', 'li', 1, 'gsgfsfgs', '2021-11-10 12:12:12');
INSERT INTO zhong_test_2.demo_user
(id, name, age, address, create_time)
VALUES('7', 'li', 2, 'gsgfsfgs', '2021-10-10 12:12:12');
INSERT INTO zhong_test_2.demo_user
(id, name, age, address, create_time)
VALUES('8', 'li', 120, 'gsgfsfgs', '2021-10-10 12:12:12');
INSERT INTO zhong_test_2.demo_user
(id, name, age, address, create_time)
VALUES('9', 'wang', 12, 'gsgfsfgs', '2021-12-10 12:12:12');
INSERT INTO zhong_test_2.demo_user
(id, name, age, address, create_time)
VALUES('10', 'wang', 3, 'gsgfsfgs', '2021-11-10 12:12:12');
INSERT INTO zhong_test_2.demo_user
(id, name, age, address, create_time)
VALUES('11', 'wang', 5, 'gsgfsfgs', '2021-12-10 12:12:12');
INSERT INTO zhong_test_2.demo_user
(id, name, age, address, create_time)
VALUES('12', 'wang', 3, 'gsgfsfgs', '2021-11-10 12:12:12');
INSERT INTO zhong_test_2.demo_user
(id, name, age, address, create_time)
VALUES('13', 'huang', 1, 'gsgfsfgs', '2021-12-10 12:12:12');
INSERT INTO zhong_test_2.demo_user
(id, name, age, address, create_time)
VALUES('14', 'huang', 12, 'gsgfsfgs', '2021-10-10 12:12:12');
select * from zhong_test_2.demo_user origin
where 2 > ( select count(*) from zhong_test_2.demo_user du where du.name = origin.name and du.age > origin.age)
order by origin.age desc;
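Explanation:
2 > : the number of rows to keep for each group, here 2.
du.name = origin.name : the grouping criterion; rows are grouped by name.
du.age > origin.age : the ordering direction, which decides whether the first or the last rows are kept. With > the query keeps the top rows (largest age); with < it keeps the bottom rows (smallest age).
Note that the subquery counts rows, not distinct values, so rows tied at the cutoff are all returned. For example, the group 'zhong' has ages 15, 12, 12, and 1, and the top-two query returns three rows for it (15, 12, 12).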
Top two per group:
select * from zhong_test_2.demo_user origin
where 2 > ( select count(*) from zhong_test_2.demo_user du where du.name = origin.name and du.age > origin.age)
order by origin.age desc;
Bottom two per group:
select * from zhong_test_2.demo_user origin
where 2 > ( select count(*) from zhong_test_2.demo_user du where du.name = origin.name and du.age < origin.age)
order by origin.age desc;
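Because the correlated subquery counts rows, ties at the cutoff can push a group past two rows. When exactly N rows per group are required, one option on MySQL 8.0 or later is `ROW_NUMBER()` with an explicit tie-breaker. This is a sketch, not part of the original solution; breaking ties by `id` and the aliases `rn` and `numbered` are assumptions made for illustration.

```sql
SELECT
    id, name, age, address, create_time
FROM (
    SELECT
        du.*,
        ROW_NUMBER() OVER (
            PARTITION BY du.name
            ORDER BY du.age DESC, du.id  -- id breaks ties between equal ages
        ) AS rn
    FROM zhong_test_2.demo_user du
) numbered
WHERE rn <= 2
ORDER BY name, age DESC;
```

With the test data above, the group 'zhong' now yields two rows (age 15 with id '4', and age 12 with id '1'), where the subquery version returns three.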