[大数据] LeetCode-SQL（七）

开发: C++知识库 Java知识库 JavaScript Python PHP知识库人工智能区块链大数据移动开发嵌入式开发工具数据结构与算法开发测试游戏开发网络协议系统运维
教程: HTML教程 CSS教程 JavaScript教程 Go语言教程 JQuery教程 VUE教程 VUE3教程 Bootstrap教程 SQL数据库教程 C语言教程 C++教程 Java教程 Python教程 Python3教程 C#教程
数码: 电脑笔记本显卡显示器固态硬盘硬盘耳机手机 iphone vivo oppo 小米华为单反装机图拉丁

-> 大数据 -> LeetCode-SQL（七） -> 正文阅读

[大数据]LeetCode-SQL（七）

以下题目均来自力扣

121、1549.每件商品的最新订单

难度：★★★☆☆

表: Customers

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| customer_id   | int     |
| name          | varchar |
+---------------+---------+
customer_id 是该表主键.
该表包含消费者的信息.

表: Orders

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| order_id      | int     |
| order_date    | date    |
| customer_id   | int     |
| product_id    | int     |
+---------------+---------+
order_id 是该表主键.
该表包含消费者customer_id产生的订单.
不会有商品被相同的用户在一天内下单超过一次.

表: Products

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| product_id    | int     |
| product_name  | varchar |
| price         | int     |
+---------------+---------+
product_id 是该表主键.
该表包含所有商品的信息.

写一个SQL 语句, 找到每件商品的最新订单(可能有多个).

返回的结果以 product_name 升序排列, 如果有排序相同, 再以 product_id 升序排列. 如果还有排序相同, 再以 order_id 升序排列.

查询结果格式如下例所示:

Customers
+-------------+-----------+
| customer_id | name      |
+-------------+-----------+
| 1           | Winston   |
| 2           | Jonathan  |
| 3           | Annabelle |
| 4           | Marwan    |
| 5           | Khaled    |
+-------------+-----------+

Orders
+----------+------------+-------------+------------+
| order_id | order_date | customer_id | product_id |
+----------+------------+-------------+------------+
| 1        | 2020-07-31 | 1           | 1          |
| 2        | 2020-07-30 | 2           | 2          |
| 3        | 2020-08-29 | 3           | 3          |
| 4        | 2020-07-29 | 4           | 1          |
| 5        | 2020-06-10 | 1           | 2          |
| 6        | 2020-08-01 | 2           | 1          |
| 7        | 2020-08-01 | 3           | 1          |
| 8        | 2020-08-03 | 1           | 2          |
| 9        | 2020-08-07 | 2           | 3          |
| 10       | 2020-07-15 | 1           | 2          |
+----------+------------+-------------+------------+

Products
+------------+--------------+-------+
| product_id | product_name | price |
+------------+--------------+-------+
| 1          | keyboard     | 120   |
| 2          | mouse        | 80    |
| 3          | screen       | 600   |
| 4          | hard disk    | 450   |
+------------+--------------+-------+

Result
+--------------+------------+----------+------------+
| product_name | product_id | order_id | order_date |
+--------------+------------+----------+------------+
| keyboard     | 1          | 6        | 2020-08-01 |
| keyboard     | 1          | 7        | 2020-08-01 |
| mouse        | 2          | 8        | 2020-08-03 |
| screen       | 3          | 3        | 2020-08-29 |
+--------------+------------+----------+------------+
keyboard 的最新订单在2020-08-01, 在这天有两次下单.
mouse 的最新订单在2020-08-03, 在这天只有一次下单.
screen 的最新订单在2020-08-29, 在这天只有一次下单.
hard disk 没有被下单, 我们不把它包含在结果表中.

解答：

# Write your MySQL query statement below
with tmp as(		# 临时表tmp
select
    order_id,
    order_date,
    p.product_id,
    p.product_name,
    p.price
from
    orders o
left join  				# 左连接products表
    products p
on
    o.product_id=p.product_id		# 链接条件
),		
tmp1 as(				# 临时表tmp1
select
    product_name,
    product_id,
    order_id,
    order_date,
    dense_rank() over(partition by product_name order by order_date desc) rk	# 排名
from 
    tmp
)
select
     product_name,
    product_id,
    order_id,
    order_date
from
    tmp1
where
    rk=1			# 条件取第一
order by
    product_name,product_id,order_id		# 排序条件
;

122、1555.银行账户概要

难度：★★★☆☆

用户表： Users

+--------------+---------+
| Column Name  | Type    |
+--------------+---------+
| user_id      | int     |
| user_name    | varchar |
| credit       | int     |
+--------------+---------+
user_id 是这个表的主键。
表中的每一列包含每一个用户当前的额度信息。

交易表：Transactions

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| trans_id      | int     |
| paid_by       | int     |
| paid_to       | int     |
| amount        | int     |
| transacted_on | date    |
+---------------+---------+
trans_id 是这个表的主键。
表中的每一列包含银行的交易信息。
ID 为 paid_by 的用户给 ID 为 paid_to 的用户转账。

力扣银行 (LCB) 帮助程序员们完成虚拟支付。我们的银行在表 Transaction 中记录每条交易信息，我们要查询每个用户的当前余额，并检查他们是否已透支（当前额度小于 0）。

写一条 SQL 语句，查询：

user_id 用户 ID
user_name 用户名
credit 完成交易后的余额
credit_limit_breached 检查是否透支（“Yes” 或 “No”）

以任意顺序返回结果表。

查询格式见如下所示。

示例 1:

输入：
Users 表：
+------------+--------------+-------------+
| user_id    | user_name    | credit      |
+------------+--------------+-------------+
| 1          | Moustafa     | 100         |
| 2          | Jonathan     | 200         |
| 3          | Winston      | 10000       |
| 4          | Luis         | 800         | 
+------------+--------------+-------------+

Transactions 表：
+------------+------------+------------+----------+---------------+
| trans_id   | paid_by    | paid_to    | amount   | transacted_on |
+------------+------------+------------+----------+---------------+
| 1          | 1          | 3          | 400      | 2020-08-01    |
| 2          | 3          | 2          | 500      | 2020-08-02    |
| 3          | 2          | 1          | 200      | 2020-08-03    |
+------------+------------+------------+----------+---------------+

输出：
+------------+------------+------------+-----------------------+
| user_id    | user_name  | credit     | credit_limit_breached |
+------------+------------+------------+-----------------------+
| 1          | Moustafa   | -100       | Yes                   | 
| 2          | Jonathan   | 500        | No                    |
| 3          | Winston    | 9900       | No                    |
| 4          | Luis       | 800        | No                    |
+------------+------------+------------+-----------------------+
Moustafa 在 "2020-08-01" 支付了 $400 并在 "2020-08-03" 收到了 $200 ，当前额度 (100 -400 +200) = -$100
Jonathan 在 "2020-08-02" 收到了 $500 并在 "2020-08-08" 支付了 $200 ，当前额度 (200 +500 -200) = $500
Winston 在 "2020-08-01" 收到了 $400 并在 "2020-08-03" 支付了 $500 ，当前额度 (10000 +400 -500) = $9900
Luis 未收到任何转账信息，额度 = $800

解答：

# Write your MySQL query statement below
with tmp1 as(			# 临时表tmp1，表示转账的id和金钱
select
    paid_by,
    sum(amount) amount
from
    transactions
group by
    paid_by
),
tmp2 as(				# 临时表tmp2，表示收账的id和金钱
select
    paid_to,
    sum(amount) amount
from
    transactions
group by
    paid_to
),
tmp3 as(				# 临时表tmp3
select
    user_id,
    user_name,
    ifnull(credit-tmp1.amount,credit) credit		# 减去转账的钱
from 
    Users
left join
    tmp1				# 左连接转账的临时表
on
    users.user_id=tmp1.paid_by		# 连接条件
),
tmp4 as(					# 临时表tmp4
select	
    user_id,
    user_name,
    ifnull(credit+tmp2.amount,credit) credit		# 加上收账的钱
from 
    tmp3
left join			# 左连接收账的临时表
    tmp2
on
    tmp3.user_id=tmp2.paid_to    		# 收账条件
)
select
	*,
	if(credit>0,'No','Yes') credit_limit_breached		# 判断是否透支的条件
from
	tmp4
;

123、1565.按月统计订单数和顾客数

难度：★★☆☆☆

表：Orders

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| order_id      | int     |
| order_date    | date    |
| customer_id   | int     |
| invoice       | int     |
+---------------+---------+
order_id 是 Orders 表的主键。
这张表包含顾客(customer_id)所下订单的信息。

写一个查询语句来按月统计金额（invoice）大于 $20 的唯一 订单数 和唯一 顾客数。

查询结果无排序要求。

查询结果格式如下所示。

示例 1：

输入：
Orders
+----------+------------+-------------+------------+
| order_id | order_date | customer_id | invoice    |
+----------+------------+-------------+------------+
| 1        | 2020-09-15 | 1           | 30         |
| 2        | 2020-09-17 | 2           | 90         |
| 3        | 2020-10-06 | 3           | 20         |
| 4        | 2020-10-20 | 3           | 21         |
| 5        | 2020-11-10 | 1           | 10         |
| 6        | 2020-11-21 | 2           | 15         |
| 7        | 2020-12-01 | 4           | 55         |
| 8        | 2020-12-03 | 4           | 77         |
| 9        | 2021-01-07 | 3           | 31         |
| 10       | 2021-01-15 | 2           | 20         |
+----------+------------+-------------+------------+
输出：
+---------+-------------+----------------+
| month   | order_count | customer_count |
+---------+-------------+----------------+
| 2020-09 | 2           | 2              |
| 2020-10 | 1           | 1              |
| 2020-12 | 2           | 1              |
| 2021-01 | 1           | 1              |
+---------+-------------+----------------+
解释：
在 2020 年 09 月，有 2 份来自 2 位不同顾客的金额大于 $20 的订单。
在 2020 年 10 月，有 2 份来自 1 位顾客的订单，并且只有其中的 1 份订单金额大于 $20 。
在 2020 年 11 月，有 2 份来自 2 位不同顾客的订单，但由于金额都小于 $20 ，所以我们的查询结果中不包含这个月的数据。
在 2020 年 12 月，有 2 份来自 1 位顾客的订单，且 2 份订单金额都大于 $20 。
在 2021 年 01 月，有 2 份来自 2 位不同顾客的订单，但只有其中一份订单金额大于 $20 。

解答：

# Write your MySQL query statement below
select
    date_format(order_date,'%Y-%m') month,			
    count(*) order_count,					# 直接count计数，多少是大于20的
    count(distinct customer_id) customer_count			# 去重计数
from
    orders
where 
    invoice>20			# 先取出小于20的
group by
    date_format(order_date,'%Y-%m')			# 根据日期和月份进行分组
;

124、1571.仓库经理

难度：★★☆☆☆

表: Warehouse

+--------------+---------+
| Column Name  | Type    |
+--------------+---------+
| name         | varchar |
| product_id   | int     |
| units        | int     |
+--------------+---------+
(name, product_id) 是该表主键.
该表的行包含了每个仓库的所有商品信息.

表: Products

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| product_id    | int     |
| product_name  | varchar |
| Width         | int     |
| Length        | int     |
| Height        | int     |
+---------------+---------+
product_id 是该表主键.
该表的行包含了每件商品以英尺为单位的尺寸(宽度, 长度和高度)信息.

写一个 SQL 查询来报告, 每个仓库的存货量是多少立方英尺.

返回结果没有顺序要求.

查询结果如下例所示.

示例 1:

输入：
Warehouse 表:
+------------+--------------+-------------+
| name       | product_id   | units       |
+------------+--------------+-------------+
| LCHouse1   | 1            | 1           |
| LCHouse1   | 2            | 10          |
| LCHouse1   | 3            | 5           |
| LCHouse2   | 1            | 2           |
| LCHouse2   | 2            | 2           |
| LCHouse3   | 4            | 1           |
+------------+--------------+-------------+
Products 表:
+------------+--------------+------------+----------+-----------+
| product_id | product_name | Width      | Length   | Height    |
+------------+--------------+------------+----------+-----------+
| 1          | LC-TV        | 5          | 50       | 40        |
| 2          | LC-KeyChain  | 5          | 5        | 5         |
| 3          | LC-Phone     | 2          | 10       | 10        |
| 4          | LC-T-Shirt   | 4          | 10       | 20        |
+------------+--------------+------------+----------+-----------+
输出：
+----------------+------------+
| WAREHOUSE_NAME | VOLUME     | 
+----------------+------------+
| LCHouse1       | 12250      | 
| LCHouse2       | 20250      |
| LCHouse3       | 800        |
+----------------+------------+
解释：
Id为1的商品(LC-TV)的存货量为 5x50x40 = 10000
Id为2的商品(LC-KeyChain)的存货量为 5x5x5 = 125 
Id为3的商品(LC-Phone)的存货量为 2x10x10 = 200
Id为4的商品(LC-T-Shirt)的存货量为 4x10x20 = 800
仓库LCHouse1: 1个单位的LC-TV + 10个单位的LC-KeyChain + 5个单位的LC-Phone.
          总存货量为: 1*10000 + 10*125  + 5*200 = 12250 立方英尺
仓库LCHouse2: 2个单位的LC-TV + 2个单位的LC-KeyChain.
          总存货量为: 2*10000 + 2*125 = 20250 立方英尺
仓库LCHouse3: 1个单位的LC-T-Shirt.
          总存货量为: 1*800 = 800 立方英尺.

解答：

# Write your MySQL query statement below
select
    name warehouse_name,
    sum((p.width*length*height)*units) volume		# 计算总共的大小
from
    warehouse w
left join				# 用warehouse表左连接products表	
    products p
on
    w.product_id=p.product_id		# 连接条件
group by
    name				# 用name进行分组
;

125、1581.进店却未进行过交易的顾客

难度：★★☆☆☆

表：Visits

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| visit_id    | int     |
| customer_id | int     |
+-------------+---------+
visit_id 是该表的主键。
该表包含有关光临过购物中心的顾客的信息。

表：Transactions

+----------------+---------+
| Column Name    | Type    |
+----------------+---------+
| transaction_id | int     |
| visit_id       | int     |
| amount         | int     |
+----------------+---------+
transaction_id 是此表的主键。
此表包含 visit_id 期间进行的交易的信息。

有一些顾客可能光顾了购物中心但没有进行交易。请你编写一个 SQL 查询，来查找这些顾客的 ID ，以及他们只光顾不交易的次数。

返回以任何顺序排序的结果表。

查询结果格式如下例所示：

Visits
+----------+-------------+
| visit_id | customer_id |
+----------+-------------+
| 1        | 23          |
| 2        | 9           |
| 4        | 30          |
| 5        | 54          |
| 6        | 96          |
| 7        | 54          |
| 8        | 54          |
+----------+-------------+

Transactions
+----------------+----------+--------+
| transaction_id | visit_id | amount |
+----------------+----------+--------+
| 2              | 5        | 310    |
| 3              | 5        | 300    |
| 9              | 5        | 200    |
| 12             | 1        | 910    |
| 13             | 2        | 970    |
+----------------+----------+--------+

Result 表：
+-------------+----------------+
| customer_id | count_no_trans |
+-------------+----------------+
| 54          | 2              |
| 30          | 1              |
| 96          | 1              |
+-------------+----------------+
ID = 23 的顾客曾经逛过一次购物中心，并在 ID = 12 的访问期间进行了一笔交易。
ID = 9 的顾客曾经逛过一次购物中心，并在 ID = 13 的访问期间进行了一笔交易。
ID = 30 的顾客曾经去过购物中心，并且没有进行任何交易。
ID = 54 的顾客三度造访了购物中心。在 2 次访问中，他们没有进行任何交易，在 1 次访问中，他们进行了 3 次交易。
ID = 96 的顾客曾经去过购物中心，并且没有进行任何交易。
如我们所见，ID 为 30 和 96 的顾客一次没有进行任何交易就去了购物中心。顾客 54 也两次访问了购物中心并且没有进行任何交易。

解答：

# Write your MySQL query statement below
select
   customer_id,
   count(*) count_no_trans			④# 求数量
from
    visits
where 						①
    visit_id 
not in
    (select visit_id from transactions)		②# 子查询排序交易过得顾客
group by
    customer_id			③# 按照顾客id进行分组
;

126、1587.银行账户概要II

难度：★★☆☆☆

表: Users

+--------------+---------+
| Column Name  | Type    |
+--------------+---------+
| account      | int     |
| name         | varchar |
+--------------+---------+
account 是该表的主键.
表中的每一行包含银行里中每一个用户的账号.

表: Transactions

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| trans_id      | int     |
| account       | int     |
| amount        | int     |
| transacted_on | date    |
+---------------+---------+
trans_id 是该表主键.
该表的每一行包含了所有账户的交易改变情况.
如果用户收到了钱, 那么金额是正的; 如果用户转了钱, 那么金额是负的.
所有账户的起始余额为 0.

写一个 SQL, 报告余额高于 10000 的所有用户的名字和余额. 账户的余额等于包含该账户的所有交易的总和.

返回结果表单没有顺序要求.

查询结果格式如下例所示.

Users table:
+------------+--------------+
| account    | name         |
+------------+--------------+
| 900001     | Alice        |
| 900002     | Bob          |
| 900003     | Charlie      |
+------------+--------------+

Transactions table:
+------------+------------+------------+---------------+
| trans_id   | account    | amount     | transacted_on |
+------------+------------+------------+---------------+
| 1          | 900001     | 7000       |  2020-08-01   |
| 2          | 900001     | 7000       |  2020-09-01   |
| 3          | 900001     | -3000      |  2020-09-02   |
| 4          | 900002     | 1000       |  2020-09-12   |
| 5          | 900003     | 6000       |  2020-08-07   |
| 6          | 900003     | 6000       |  2020-09-07   |
| 7          | 900003     | -4000      |  2020-09-11   |
+------------+------------+------------+---------------+

Result table:
+------------+------------+
| name       | balance    |
+------------+------------+
| Alice      | 11000      |
+------------+------------+
Alice 的余额为(7000 + 7000 - 3000) = 11000.
Bob 的余额为1000.
Charlie 的余额为(6000 + 6000 - 4000) = 8000.

解答：

# Write your MySQL query statement below
select
    u.name,
    sum(amount) balance	⑤# 取其余额
from
    transactions t
left join				①# transactions表和users表进行左连接
    users u		
on
    t.account=u.account	②# 连接条件
group by
    t.account			③# 按account进行分组
having
    sum(amount)>10000	④# 取余额大于10000的
;

127、1596.每位顾客最经常订购的商品

难度：★★★☆☆

表：Customers

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| customer_id   | int     |
| name          | varchar |
+---------------+---------+
customer_id 是该表主键
该表包含所有顾客的信息

表：Orders

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| order_id      | int     |
| order_date    | date    |
| customer_id   | int     |
| product_id    | int     |
+---------------+---------+
order_id 是该表主键
该表包含顾客 customer_id 的订单信息
没有顾客会在一天内订购相同的商品 多于一次

表：Products

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| product_id    | int     |
| product_name  | varchar |
| price         | int     |
+---------------+---------+
product_id 是该表主键
该表包含了所有商品的信息

写一个 SQL 语句，找到每一个顾客最经常订购的商品。

结果表单应该有每一位至少下过一次单的顾客 customer_id , 他最经常订购的商品的 product_id 和 product_name。

返回结果 没有顺序要求。

查询结果格式如下例所示：

Customers
+-------------+-------+
| customer_id | name  |
+-------------+-------+
| 1           | Alice |
| 2           | Bob   |
| 3           | Tom   |
| 4           | Jerry |
| 5           | John  |
+-------------+-------+

Orders
+----------+------------+-------------+------------+
| order_id | order_date | customer_id | product_id |
+----------+------------+-------------+------------+
| 1        | 2020-07-31 | 1           | 1          |
| 2        | 2020-07-30 | 2           | 2          |
| 3        | 2020-08-29 | 3           | 3          |
| 4        | 2020-07-29 | 4           | 1          |
| 5        | 2020-06-10 | 1           | 2          |
| 6        | 2020-08-01 | 2           | 1          |
| 7        | 2020-08-01 | 3           | 3          |
| 8        | 2020-08-03 | 1           | 2          |
| 9        | 2020-08-07 | 2           | 3          |
| 10       | 2020-07-15 | 1           | 2          |
+----------+------------+-------------+------------+

Products
+------------+--------------+-------+
| product_id | product_name | price |
+------------+--------------+-------+
| 1          | keyboard     | 120   |
| 2          | mouse        | 80    |
| 3          | screen       | 600   |
| 4          | hard disk    | 450   |
+------------+--------------+-------+
Result 表：
+-------------+------------+--------------+
| customer_id | product_id | product_name |
+-------------+------------+--------------+
| 1           | 2          | mouse        |
| 2           | 1          | keyboard     |
| 2           | 2          | mouse        |
| 2           | 3          | screen       |
| 3           | 3          | screen       |
| 4           | 1          | keyboard     |
+-------------+------------+--------------+

Alice (customer 1) 三次订购鼠标, 一次订购键盘, 所以鼠标是 Alice 最经常订购的商品.
Bob (customer 2) 一次订购键盘, 一次订购鼠标, 一次订购显示器, 所以这些都是 Bob 最经常订购的商品.
Tom (customer 3) 只两次订购显示器, 所以显示器是 Tom 最经常订购的商品.
Jerry (customer 4) 只一次订购键盘, 所以键盘是 Jerry 最经常订购的商品.
John (customer 5) 没有订购过商品, 所以我们并没有把 John 包含在结果表中.

解答：

# Write your MySQL query statement below
with tmp as(				# 临时表tmp
select
   customer_id,
   o.product_id,
   product_name,
   count(*) cn				# 求出数量
from    
    orders o
left join					# orders表左连接products表
    products p
on
    o.product_id=p.product_id		# 连接条件
group by
    customer_id,o.product_id		# 进行分组
),
tmp1 as(				# 临时表tmp1
select
    customer_id,
    product_id,
    product_name,
    cn,
    dense_rank() over(partition by customer_id order by cn desc) rk			# 排序
from
    tmp
)
select
    customer_id,
    product_id,
    product_name
from
    tmp1
where 
    rk=1		# 取rk=1，即最近常卖的
;

128、1607.没有卖出的卖家

难度：★★☆☆☆

表: Customer

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| customer_id   | int     |
| customer_name | varchar |
+---------------+---------+
customer_id 是该表主键.
该表的每行包含网上商城的每一位顾客的信息.

表: Orders

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| order_id      | int     |
| sale_date     | date    |
| order_cost    | int     |
| customer_id   | int     |
| seller_id     | int     |
+---------------+---------+
order_id 是该表主键.
该表的每行包含网上商城的所有订单的信息.
sale_date 是顾客customer_id和卖家seller_id之间交易的日期.

表: Seller

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| seller_id     | int     |
| seller_name   | varchar |
+---------------+---------+
seller_id 是该表主键.
该表的每行包含每一位卖家的信息.

写一个SQL语句, 报告所有在2020年度没有任何卖出的卖家的名字.

返回结果按照 seller_name 升序排列.

查询结果格式如下例所示.

示例 1:

输入：
Customer 表:
+--------------+---------------+
| customer_id  | customer_name |
+--------------+---------------+
| 101          | Alice         |
| 102          | Bob           |
| 103          | Charlie       |
+--------------+---------------+
Orders 表:
+-------------+------------+--------------+-------------+-------------+
| order_id    | sale_date  | order_cost   | customer_id | seller_id   |
+-------------+------------+--------------+-------------+-------------+
| 1           | 2020-03-01 | 1500         | 101         | 1           |
| 2           | 2020-05-25 | 2400         | 102         | 2           |
| 3           | 2019-05-25 | 800          | 101         | 3           |
| 4           | 2020-09-13 | 1000         | 103         | 2           |
| 5           | 2019-02-11 | 700          | 101         | 2           |
+-------------+------------+--------------+-------------+-------------+
Seller 表:
+-------------+-------------+
| seller_id   | seller_name |
+-------------+-------------+
| 1           | Daniel      |
| 2           | Elizabeth   |
| 3           | Frank       |
+-------------+-------------+
输出：
+-------------+
| seller_name |
+-------------+
| Frank       |
+-------------+
解释：
Daniel在2020年3月卖出1次.
Elizabeth在2020年卖出2次, 在2019年卖出1次.
Frank在2019年卖出1次, 在2020年没有卖出.

解答：

# Write your MySQL query statement below
select
    seller_name
from
    seller s
left join							# 左连接子查询表
    (select							# 子查询
        distinct seller_id			# 去重求出seller_id
    from
        orders
    where
        date_format(sale_date,'%Y')='2020'		# 条件为2020年度的
    ) o
on
    s.seller_id=o.seller_id				# 左连接条件
where   
    o.seller_id is null					# 取值条件，连接后位null的即2020年没有卖出的卖家
order by
    seller_name							# 按照题目要求seller_name升序
;

129、1613.找到遗失的ID

难度：★★★☆☆

表: Customers

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| customer_id   | int     |
| customer_name | varchar |
+---------------+---------+
customer_id 是该表主键.
该表第一行包含了顾客的名字和id.

写一个 SQL 语句, 找到所有遗失的顾客id. 遗失的顾客id是指那些不在 Customers 表中, 值却处于 1 和表中最大 customer_id 之间的id.

注意: 最大的 customer_id 值不会超过 100.

返回结果按 ids 升序排列

查询结果格式如下例所示.

Customers 表:
+-------------+---------------+
| customer_id | customer_name |
+-------------+---------------+
| 1           | Alice         |
| 4           | Bob           |
| 5           | Charlie       |
+-------------+---------------+

Result 表:
+-----+
| ids |
+-----+
| 2   |
| 3   |
+-----+
表中最大的customer_id是5, 所以在范围[1,5]内, ID2和3从表中遗失.

解答：

# Write your MySQL query statement below
with recursive a as(			# 临时表a
    select 1 as n
    union all
    select 
    	n+1 
    from 
    	a 
    where 
    	n<(select max(customer_id) from customers)		# 递增到最大值
)
select 
	n ids 
from 
	a 
left join 			# 两表左连接
	customers 
on 
	a.n=customers.customer_id 		# 连接条件
where 	
	customer_id is null			# 为null的说明是遗失的
;

130、1623.三人国家代表队

难度：★★☆☆☆

表: SchoolA

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| student_id    | int     |
| student_name  | varchar |
+---------------+---------+
student_id 是表的主键
表中的每一行包含了学校A中每一个学生的名字和ID
所有student_name在表中都是独一无二的

表: SchoolB

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| student_id    | int     |
| student_name  | varchar |
+---------------+---------+
student_id 是表的主键
表中的每一行包含了学校B中每一个学生的名字和ID
所有student_name在表中都是独一无二的

表: SchoolC

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| student_id    | int     |
| student_name  | varchar |
+---------------+---------+
student_id 是表的主键
表中的每一行包含了学校C中每一个学生的名字和ID
所有student_name在表中都是独一无二的

有一个国家只有三所学校，这个国家的每一个学生只会注册一所学校。

这个国家正在参加一个竞赛，他们希望从这三所学校中各选出一个学生来组建一支三人的代表队。

例如：

member_A是从 SchoolA中选出的
member_B是从 SchoolB中选出的
member_C是从 SchoolC中选出的
被选中的学生具有不同的名字和ID（没有任何两个学生拥有相同的名字、没有任何两个学生拥有相同的ID）

使用上述条件，编写SQL查询语句来找到所有可能的三人国家代表队组合。

查询结果接受任何顺序。

查询结果格式样例：

SchoolA table:
+------------+--------------+
| student_id | student_name |
+------------+--------------+
| 1          | Alice        |
| 2          | Bob          |
+------------+--------------+

SchoolB table:
+------------+--------------+
| student_id | student_name |
+------------+--------------+
| 3          | Tom          |
+------------+--------------+

SchoolC table:
+------------+--------------+
| student_id | student_name |
+------------+--------------+
| 3          | Tom          |
| 2          | Jerry        |
| 10         | Alice        |
+------------+--------------+

预期结果:
+----------+----------+----------+
| member_A | member_B | member_C |
+----------+----------+----------+
| Alice    | Tom      | Jerry    |
| Bob      | Tom      | Alice    |
+----------+----------+----------+

让我们看看有哪些可能的组合：
- (Alice, Tom, Tom) --> 不适用，因为member_B（Tom）和member_C（Tom）有相同的名字和ID
- (Alice, Tom, Jerry) --> 可能的组合
- (Alice, Tom, Alice) --> 不适用，因为member_A和member_C有相同的名字
- (Bob, Tom, Tom) --> 不适用，因为member_B和member_C有相同的名字和ID
- (Bob, Tom, Jerry) --> 不适用，因为member_A和member_C有相同的ID
- (Bob, Tom, Alice) --> 可能的组合.

解答：

select 
    a.student_name as member_A, 
    b.student_name as member_B, 
    c.student_name as member_C
from 
	schoola a, schoolb b, schoolc c
where
    a.student_id<>b.student_id and 
    b.student_id<>c.student_id and 
    a.student_id<>c.student_id

131、1633.各赛事的用户注册率

难度：★★☆☆☆

用户表： Users

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| user_id     | int     |
| user_name   | varchar |
+-------------+---------+
user_id 是该表的主键。
该表中的每行包括用户 ID 和用户名。

注册表： Register

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| contest_id  | int     |
| user_id     | int     |
+-------------+---------+
(contest_id, user_id) 是该表的主键。
该表中的每行包含用户的 ID 和他们注册的赛事。

写一条 SQL 语句，查询各赛事的用户注册百分率，保留两位小数。

返回的结果表按 percentage 的降序排序，若相同则按 contest_id 的升序排序。

查询结果如下示例所示：

Users 表：
+---------+-----------+
| user_id | user_name |
+---------+-----------+
| 6       | Alice     |
| 2       | Bob       |
| 7       | Alex      |
+---------+-----------+

Register 表：
+------------+---------+
| contest_id | user_id |
+------------+---------+
| 215        | 6       |
| 209        | 2       |
| 208        | 2       |
| 210        | 6       |
| 208        | 6       |
| 209        | 7       |
| 209        | 6       |
| 215        | 7       |
| 208        | 7       |
| 210        | 2       |
| 207        | 2       |
| 210        | 7       |
+------------+---------+

结果表：
+------------+------------+
| contest_id | percentage |
+------------+------------+
| 208        | 100.0      |
| 209        | 100.0      |
| 210        | 100.0      |
| 215        | 66.67      |
| 207        | 33.33      |
+------------+------------+
所有用户都注册了 208、209 和 210 赛事，因此这些赛事的注册率为 100% ，我们按 contest_id 的降序排序加入结果表中。
Alice 和 Alex 注册了 215 赛事，注册率为 ((2/3) * 100) = 66.67%
Bob 注册了 207 赛事，注册率为 ((1/3) * 100) = 33.33%

解答：

# Write your MySQL query statement below
with tmp as(		# 临时表tmp
select
    contest_id,
    count(*) n		# 计算数量
from
    register
group by				# 进行分组
    contest_id
)
select
    contest_id,
    round(n/(select count(*) from users)*100,2) percentage			# 算百分比
from 
    tmp
order by
    percentage desc,contest_id				# 排序条件
;

132、1635.Hopper公司查询I

难度：★★★★★

表: Drivers

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| driver_id   | int     |
| join_date   | date    |
+-------------+---------+
driver_id是该表的主键。
该表的每一行均包含驾驶员的ID以及他们加入Hopper公司的日期。

表: Rides

+--------------+---------+
| Column Name  | Type    |
+--------------+---------+
| ride_id      | int     |
| user_id      | int     |
| requested_at | date    |
+--------------+---------+
ride_id是该表的主键。
该表的每一行均包含行程ID(ride_id)，用户ID(user_id)以及该行程的日期(requested_at)。
该表中可能有一些不被接受的乘车请求。

表: AcceptedRides

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| ride_id       | int     |
| driver_id     | int     |
| ride_distance | int     |
| ride_duration | int     |
+---------------+---------+
ride_id是该表的主键。
该表的每一行都包含已接受的行程信息。
表中的行程信息都在“Rides”表中存在。

编写SQL查询以报告2020年每个月的以下统计信息：

截至某月底，当前在Hopper公司工作的驾驶员数量（active_drivers）。
该月接受的乘车次数（accepted_rides）。

返回按month 升序排列的结果表，其中month 是月份的数字（一月是1，二月是2，依此类推）。

查询结果格式如下例所示。

表 Drivers:
+-----------+------------+
| driver_id | join_date  |
+-----------+------------+
| 10        | 2019-12-10 |
| 8         | 2020-1-13  |
| 5         | 2020-2-16  |
| 7         | 2020-3-8   |
| 4         | 2020-5-17  |
| 1         | 2020-10-24 |
| 6         | 2021-1-5   |
+-----------+------------+

表 Rides:
+---------+---------+--------------+
| ride_id | user_id | requested_at |
+---------+---------+--------------+
| 6       | 75      | 2019-12-9    |
| 1       | 54      | 2020-2-9     |
| 10      | 63      | 2020-3-4     |
| 19      | 39      | 2020-4-6     |
| 3       | 41      | 2020-6-3     |
| 13      | 52      | 2020-6-22    |
| 7       | 69      | 2020-7-16    |
| 17      | 70      | 2020-8-25    |
| 20      | 81      | 2020-11-2    |
| 5       | 57      | 2020-11-9    |
| 2       | 42      | 2020-12-9    |
| 11      | 68      | 2021-1-11    |
| 15      | 32      | 2021-1-17    |
| 12      | 11      | 2021-1-19    |
| 14      | 18      | 2021-1-27    |
+---------+---------+--------------+

表 AcceptedRides:
+---------+-----------+---------------+---------------+
| ride_id | driver_id | ride_distance | ride_duration |
+---------+-----------+---------------+---------------+
| 10      | 10        | 63            | 38            |
| 13      | 10        | 73            | 96            |
| 7       | 8         | 100           | 28            |
| 17      | 7         | 119           | 68            |
| 20      | 1         | 121           | 92            |
| 5       | 7         | 42            | 101           |
| 2       | 4         | 6             | 38            |
| 11      | 8         | 37            | 43            |
| 15      | 8         | 108           | 82            |
| 12      | 8         | 38            | 34            |
| 14      | 1         | 90            | 74            |
+---------+-----------+---------------+---------------+

结果表:
+-------+----------------+----------------+
| month | active_drivers | accepted_rides |
+-------+----------------+----------------+
| 1     | 2              | 0              |
| 2     | 3              | 0              |
| 3     | 4              | 1              |
| 4     | 4              | 0              |
| 5     | 5              | 0              |
| 6     | 5              | 1              |
| 7     | 5              | 1              |
| 8     | 5              | 1              |
| 9     | 5              | 0              |
| 10    | 6              | 0              |
| 11    | 6              | 2              |
| 12    | 6              | 1              |
+-------+----------------+----------------+

截至1月底->两个活跃的驾驶员（10,8），没有被接受的行程。
截至2月底->三个活跃的驾驶员（10,8,5），没有被接受的行程。
截至3月底->四个活跃的驾驶员（10,8,5,7），一个被接受的行程（10）。
截至4月底->四个活跃的驾驶员（10,8,5,7），没有被接受的行程。
截至5月底->五个活跃的驾驶员（10,8,5,7,4），没有被接受的行程。
截至6月底->五个活跃的驾驶员（10,8,5,7,4），一个被接受的行程（13）。
截至7月底->五个活跃的驾驶员（10,8,5,7,4），一个被接受的行程（7）。
截至8月底->五个活跃的驾驶员（10,8,5,7,4），一位接受的行程（17）。
截至9月底->五个活跃的驾驶员（10,8,5,7,4），没有被接受的行程。
截至10月底->六个活跃的驾驶员（10,8,5,7,4,1），没有被接受的行程。
截至11月底->六个活跃的驾驶员（10,8,5,7,4,1），两个被接受的行程（20,5）。
截至12月底->六个活跃的驾驶员（10,8,5,7,4,1），一个被接受的行程（2）。

解答：

# Write your MySQL query statement below
#不要试图三个合并 然后直接求两个统计量，还是乖乖两两合并 分别统计

#1.构建连续月份数列
with recursive m(n) as(
    select 1 
    union all 
    select n+1 from m
    where n<12
)

select n as month,active_drivers,accepted_rides
from
m left join
#2.表1统计active_drivers
(select m.n,
    sum(year(join_date)<=2019)+
    sum(year(join_date)=2020 and month(join_date)<=m.n) as active_drivers
from drivers d join m 
group by m.n) temp1
using(n)
left join 
#3.表2统计accepted_rides
(select m.n,
    sum(year(requested_at)=2020 and month(requested_at)=m.n and ride_distance is not null) accepted_rides
from rides r left join acceptedrides a 
using(ride_id)
join m
group by m.n) temp2
using(n)
#4.将表1 表2 leftjoin到完整月份数列上

133、1645.Hopper Company Queries II

难度：★★★★★

Table: Drivers

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| driver_id   | int     |
| join_date   | date    |
+-------------+---------+
driver_id is the primary key for this table.
Each row of this table contains the driver's ID and the date they joined the Hopper company.

Table: Rides

+--------------+---------+
| Column Name  | Type    |
+--------------+---------+
| ride_id      | int     |
| user_id      | int     |
| requested_at | date    |
+--------------+---------+
ride_id is the primary key for this table.
Each row of this table contains the ID of a ride, the user's ID that requested it, and the day they requested it.
There may be some ride requests in this table that were not accepted.

Table: AcceptedRides

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| ride_id       | int     |
| driver_id     | int     |
| ride_distance | int     |
| ride_duration | int     |
+---------------+---------+
ride_id is the primary key for this table.
Each row of this table contains some information about an accepted ride.
It is guaranteed that each accepted ride exists in the Rides table.

Write an SQL query to report the percentage of working drivers (working_percentage) for each month of 2020 where:

Note that if the number of available drivers during a month is zero, we consider the working_percentage to be 0.

Return the result table ordered by month in ascending order, where month is the month’s number (January is 1, February is 2, etc.). Round working_percentage to the nearest 2 decimal places.

The query result format is in the following example.

Example 1:

Input: 
Drivers table:
+-----------+------------+
| driver_id | join_date  |
+-----------+------------+
| 10        | 2019-12-10 |
| 8         | 2020-1-13  |
| 5         | 2020-2-16  |
| 7         | 2020-3-8   |
| 4         | 2020-5-17  |
| 1         | 2020-10-24 |
| 6         | 2021-1-5   |
+-----------+------------+
Rides table:
+---------+---------+--------------+
| ride_id | user_id | requested_at |
+---------+---------+--------------+
| 6       | 75      | 2019-12-9    |
| 1       | 54      | 2020-2-9     |
| 10      | 63      | 2020-3-4     |
| 19      | 39      | 2020-4-6     |
| 3       | 41      | 2020-6-3     |
| 13      | 52      | 2020-6-22    |
| 7       | 69      | 2020-7-16    |
| 17      | 70      | 2020-8-25    |
| 20      | 81      | 2020-11-2    |
| 5       | 57      | 2020-11-9    |
| 2       | 42      | 2020-12-9    |
| 11      | 68      | 2021-1-11    |
| 15      | 32      | 2021-1-17    |
| 12      | 11      | 2021-1-19    |
| 14      | 18      | 2021-1-27    |
+---------+---------+--------------+
AcceptedRides table:
+---------+-----------+---------------+---------------+
| ride_id | driver_id | ride_distance | ride_duration |
+---------+-----------+---------------+---------------+
| 10      | 10        | 63            | 38            |
| 13      | 10        | 73            | 96            |
| 7       | 8         | 100           | 28            |
| 17      | 7         | 119           | 68            |
| 20      | 1         | 121           | 92            |
| 5       | 7         | 42            | 101           |
| 2       | 4         | 6             | 38            |
| 11      | 8         | 37            | 43            |
| 15      | 8         | 108           | 82            |
| 12      | 8         | 38            | 34            |
| 14      | 1         | 90            | 74            |
+---------+-----------+---------------+---------------+
Output: 
+-------+--------------------+
| month | working_percentage |
+-------+--------------------+
| 1     | 0.00               |
| 2     | 0.00               |
| 3     | 25.00              |
| 4     | 0.00               |
| 5     | 0.00               |
| 6     | 20.00              |
| 7     | 20.00              |
| 8     | 20.00              |
| 9     | 0.00               |
| 10    | 0.00               |
| 11    | 33.33              |
| 12    | 16.67              |
+-------+--------------------+
Explanation: 
By the end of January --> two active drivers (10, 8) and no accepted rides. The percentage is 0%.
By the end of February --> three active drivers (10, 8, 5) and no accepted rides. The percentage is 0%.
By the end of March --> four active drivers (10, 8, 5, 7) and one accepted ride by driver (10). The percentage is (1 / 4) * 100 = 25%.
By the end of April --> four active drivers (10, 8, 5, 7) and no accepted rides. The percentage is 0%.
By the end of May --> five active drivers (10, 8, 5, 7, 4) and no accepted rides. The percentage is 0%.
By the end of June --> five active drivers (10, 8, 5, 7, 4) and one accepted ride by driver (10). The percentage is (1 / 5) * 100 = 20%.
By the end of July --> five active drivers (10, 8, 5, 7, 4) and one accepted ride by driver (8). The percentage is (1 / 5) * 100 = 20%.
By the end of August --> five active drivers (10, 8, 5, 7, 4) and one accepted ride by driver (7). The percentage is (1 / 5) * 100 = 20%.
By the end of September --> five active drivers (10, 8, 5, 7, 4) and no accepted rides. The percentage is 0%.
By the end of October --> six active drivers (10, 8, 5, 7, 4, 1) and no accepted rides. The percentage is 0%.
By the end of November --> six active drivers (10, 8, 5, 7, 4, 1) and two accepted rides by two different drivers (1, 7). The percentage is (2 / 6) * 100 = 33.33%.
By the end of December --> six active drivers (10, 8, 5, 7, 4, 1) and one accepted ride by driver (4). The percentage is (1 / 6) * 100 = 16.67%.

解答：

# Write your MySQL query statement below
# 构建连续月份数列
with recursive m(n) as(
    select 1
    union all 
    select n+1 from m 
    where n<12
)
# 将两表按month merge，做计算
select n month,
ifnull(round(acceptDrivers/availDrivers*100,2),0) as working_percentage
from(
# 表1——统计每月available的司机
select m.n,
sum(year(join_date)<=2019)+
sum(year(join_date)=2020 and month(join_date)<=m.n) as availDrivers
from m join drivers d 
group by m.n) temp1 
left join
# 表2——统计每月有accept的不同司机id数量
(select m.n,
count(distinct driver_id) as acceptDrivers
from rides r left join acceptedrides a 
using(ride_id)
right join m 
on year(requested_at)=2020
and month(requested_at) = m.n
group by m.n) temp2 
using(n)
order by n
;

134、1651.Hopper Company Queries III

难度：★★★★★

Table: Drivers

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| driver_id   | int     |
| join_date   | date    |
+-------------+---------+
driver_id is the primary key for this table.
Each row of this table contains the driver's ID and the date they joined the Hopper company.

Table: Rides

+--------------+---------+
| Column Name  | Type    |
+--------------+---------+
| ride_id      | int     |
| user_id      | int     |
| requested_at | date    |
+--------------+---------+
ride_id is the primary key for this table.
Each row of this table contains the ID of a ride, the user's ID that requested it, and the day they requested it.
There may be some ride requests in this table that were not accepted.

Table: AcceptedRides

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| ride_id       | int     |
| driver_id     | int     |
| ride_distance | int     |
| ride_duration | int     |
+---------------+---------+
ride_id is the primary key for this table.
Each row of this table contains some information about an accepted ride.
It is guaranteed that each accepted ride exists in the Rides table.

Write an SQL query to compute the average_ride_distance and average_ride_duration of every 3-month window starting from January - March 2020 to October - December 2020. Round average_ride_distance and average_ride_duration to the nearest two decimal places.

The average_ride_distance is calculated by summing up the total ride_distance values from the three months and dividing it by 3. The average_ride_duration is calculated in a similar way.

Return the result table ordered by month in ascending order, where month is the starting month’s number (January is 1, February is 2, etc.).

The query result format is in the following example.

Example 1:

Input: 
Drivers table:
+-----------+------------+
| driver_id | join_date  |
+-----------+------------+
| 10        | 2019-12-10 |
| 8         | 2020-1-13  |
| 5         | 2020-2-16  |
| 7         | 2020-3-8   |
| 4         | 2020-5-17  |
| 1         | 2020-10-24 |
| 6         | 2021-1-5   |
+-----------+------------+
Rides table:
+---------+---------+--------------+
| ride_id | user_id | requested_at |
+---------+---------+--------------+
| 6       | 75      | 2019-12-9    |
| 1       | 54      | 2020-2-9     |
| 10      | 63      | 2020-3-4     |
| 19      | 39      | 2020-4-6     |
| 3       | 41      | 2020-6-3     |
| 13      | 52      | 2020-6-22    |
| 7       | 69      | 2020-7-16    |
| 17      | 70      | 2020-8-25    |
| 20      | 81      | 2020-11-2    |
| 5       | 57      | 2020-11-9    |
| 2       | 42      | 2020-12-9    |
| 11      | 68      | 2021-1-11    |
| 15      | 32      | 2021-1-17    |
| 12      | 11      | 2021-1-19    |
| 14      | 18      | 2021-1-27    |
+---------+---------+--------------+
AcceptedRides table:
+---------+-----------+---------------+---------------+
| ride_id | driver_id | ride_distance | ride_duration |
+---------+-----------+---------------+---------------+
| 10      | 10        | 63            | 38            |
| 13      | 10        | 73            | 96            |
| 7       | 8         | 100           | 28            |
| 17      | 7         | 119           | 68            |
| 20      | 1         | 121           | 92            |
| 5       | 7         | 42            | 101           |
| 2       | 4         | 6             | 38            |
| 11      | 8         | 37            | 43            |
| 15      | 8         | 108           | 82            |
| 12      | 8         | 38            | 34            |
| 14      | 1         | 90            | 74            |
+---------+-----------+---------------+---------------+
Output: 
+-------+-----------------------+-----------------------+
| month | average_ride_distance | average_ride_duration |
+-------+-----------------------+-----------------------+
| 1     | 21.00                 | 12.67                 |
| 2     | 21.00                 | 12.67                 |
| 3     | 21.00                 | 12.67                 |
| 4     | 24.33                 | 32.00                 |
| 5     | 57.67                 | 41.33                 |
| 6     | 97.33                 | 64.00                 |
| 7     | 73.00                 | 32.00                 |
| 8     | 39.67                 | 22.67                 |
| 9     | 54.33                 | 64.33                 |
| 10    | 56.33                 | 77.00                 |
+-------+-----------------------+-----------------------+
Explanation: 
By the end of January --> average_ride_distance = (0+0+63)/3=21, average_ride_duration = (0+0+38)/3=12.67
By the end of February --> average_ride_distance = (0+63+0)/3=21, average_ride_duration = (0+38+0)/3=12.67
By the end of March --> average_ride_distance = (63+0+0)/3=21, average_ride_duration = (38+0+0)/3=12.67
By the end of April --> average_ride_distance = (0+0+73)/3=24.33, average_ride_duration = (0+0+96)/3=32.00
By the end of May --> average_ride_distance = (0+73+100)/3=57.67, average_ride_duration = (0+96+28)/3=41.33
By the end of June --> average_ride_distance = (73+100+119)/3=97.33, average_ride_duration = (96+28+68)/3=64.00
By the end of July --> average_ride_distance = (100+119+0)/3=73.00, average_ride_duration = (28+68+0)/3=32.00
By the end of August --> average_ride_distance = (119+0+0)/3=39.67, average_ride_duration = (68+0+0)/3=22.67
By the end of Septemeber --> average_ride_distance = (0+0+163)/3=54.33, average_ride_duration = (0+0+193)/3=64.33
By the end of October --> average_ride_distance = (0+163+6)/3=56.33, average_ride_duration = (0+193+38)/3=77.00

解答：

# Write your MySQL query statement below
with recursive m(n) as(
select 1 
union all 
select n+1 from m 
where n<12
)

select m.n month,
ifnull(round(sum(ride_distance)/3,2),0) as average_ride_distance,
ifnull(round(sum(ride_duration)/3,2),0) as average_ride_duration
from acceptedrides a left join rides r 
using(ride_id)
right join m 
on year(requested_at)=2020 
and month(requested_at) between m.n and m.n+2
group by m.n
order by m.n
limit 10 #用limit控制
;

135、1661.每台机器的进程平均运行时间

难度：★★☆☆☆

表: Activity

+----------------+---------+
| Column Name    | Type    |
+----------------+---------+
| machine_id     | int     |
| process_id     | int     |
| activity_type  | enum    |
| timestamp      | float   |
+----------------+---------+
该表展示了一家工厂网站的用户活动.
(machine_id, process_id, activity_type) 是当前表的主键.
machine_id 是一台机器的ID号.
process_id 是运行在各机器上的进程ID号.
activity_type 是枚举类型 ('start', 'end').
timestamp 是浮点类型,代表当前时间(以秒为单位).
'start' 代表该进程在这台机器上的开始运行时间戳 , 'end' 代表该进程在这台机器上的终止运行时间戳.
同一台机器，同一个进程都有一对开始时间戳和结束时间戳，而且开始时间戳永远在结束时间戳前面.

现在有一个工厂网站由几台机器运行，每台机器上运行着相同数量的进程. 请写出一条SQL计算每台机器各自完成一个进程任务的平均耗时.

完成一个进程任务的时间指进程的'end' 时间戳 减去 'start' 时间戳. 平均耗时通过计算每台机器上所有进程任务的总耗费时间除以机器上的总进程数量获得.

结果表必须包含machine_id（机器ID） 和对应的 average time（平均耗时） 别名 processing_time, 且四舍五入保留3位小数.

具体参考例子如下:

Activity table:
+------------+------------+---------------+-----------+
| machine_id | process_id | activity_type | timestamp |
+------------+------------+---------------+-----------+
| 0          | 0          | start         | 0.712     |
| 0          | 0          | end           | 1.520     |
| 0          | 1          | start         | 3.140     |
| 0          | 1          | end           | 4.120     |
| 1          | 0          | start         | 0.550     |
| 1          | 0          | end           | 1.550     |
| 1          | 1          | start         | 0.430     |
| 1          | 1          | end           | 1.420     |
| 2          | 0          | start         | 4.100     |
| 2          | 0          | end           | 4.512     |
| 2          | 1          | start         | 2.500     |
| 2          | 1          | end           | 5.000     |
+------------+------------+---------------+-----------+

Result table:
+------------+-----------------+
| machine_id | processing_time |
+------------+-----------------+
| 0          | 0.894           |
| 1          | 0.995           |
| 2          | 1.456           |
+------------+-----------------+

一共有3台机器,每台机器运行着两个进程.
机器 0 的平均耗时: ((1.520 - 0.712) + (4.120 - 3.140)) / 2 = 0.894
机器 1 的平均耗时: ((1.550 - 0.550) + (1.420 - 0.430)) / 2 = 0.995
机器 2 的平均耗时: ((4.512 - 4.100) + (5.000 - 2.500)) / 2 = 1.456

解答：

# Write your MySQL query statement below
with tmp1 as(					# 临时表tmp1
select 
    machine_id,
    count(process_id) cn_id,			# 求id数
    activity_type,
    sum(`timestamp`) `timestamp`		# 求总时间
from
    activity
group by
    machine_id,activity_type		# 按照machine_id,activity_type进行分区
), 
tmp2 as(						# 临时表tmp2
select
    machine_id,
    max(cn_id) cn,					# 转换为一行
    max(case when activity_type='start' then `timestamp` end) st,			# 转换为一行
    max(case when activity_type='end' then `timestamp` end) en				# 转换为一行
from		
    tmp1
group by    
    machine_id				# 按照machine_id进行分组
)
select
    machine_id,
    round((en-st)/cn,3) processing_time			# 直接进行计算
from
    tmp2
;

136、1667.修复表中的名字

难度：★★☆☆☆

表： Users

+----------------+---------+
| Column Name    | Type    |
+----------------+---------+
| user_id        | int     |
| name           | varchar |
+----------------+---------+
user_id 是该表的主键。
该表包含用户的 ID 和名字。名字仅由小写和大写字符组成。

编写一个 SQL 查询来修复名字，使得只有第一个字符是大写的，其余都是小写的。

返回按 user_id 排序的结果表。

查询结果格式示例如下：

Users table:
+---------+-------+
| user_id | name  |
+---------+-------+
| 1       | aLice |
| 2       | bOB   |
+---------+-------+

Result table:
+---------+-------+
| user_id | name  |
+---------+-------+
| 1       | Alice |
| 2       | Bob   |
+---------+-------+

解答：

# Write your MySQL query statement below
select
    user_id,
    concat(upper(left(name,1)),lower(substring(name,2))) name	# 组合字符串
from
    users
order by
    user_id
;

扩展：

1、left(s,n):返回字符串最左边的n个字符
2、right(s,n):返回字符串最右边的n个字符
3、upper(s):将字符串s全部转换为大写
4、lower(s):将字符串s全部转换为小写
5、substring(s,index,len):返回从字符串s的index位置其len个字符，其中index是从1开始算，len不写则为后面全部
6、concat(s1,s2,...,sn):连接s1、s2、...、sn为一个字符串

137、1677.发票中的产品金额

难度：★★☆☆☆

Product 表

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| product_id  | int     |
| name        | varchar |
+-------------+---------+
product_id 是这张表的主键
表中含有产品 id 、产品名称。产品名称都是小写的英文字母，产品名称都是唯一的

Invoice表：

+-------------+------+
| Column Name | Type |
+-------------+------+
| invoice_id  | int  |
| product_id  | int  |
| rest        | int  |
| paid        | int  |
| canceled    | int  |
| refunded    | int  |
+-------------+------+
invoice_id 发票 id ，是这张表的主键
product_id 产品 id
rest 应缴款项
paid 已支付金额
canceled 已取消金额
refunded 已退款金额

要求写一个SQL查询，对于所有产品，返回每个产品的产品名称，以及全部发票累计的总应缴款项、总已支付金额、总已取消金额、总已退款金额。

查询结果按 product_name 排序

示例：

Product 表：
+------------+-------+
| product_id | name  |
+------------+-------+
| 0          | ham   |
| 1          | bacon |
+------------+-------+
Invoice table:
+------------+------------+------+------+----------+----------+
| invoice_id | product_id | rest | paid | canceled | refunded |
+------------+------------+------+------+----------+----------+
| 23         | 0          | 2    | 0    | 5        | 0        |
| 12         | 0          | 0    | 4    | 0        | 3        |
| 1          | 1          | 1    | 1    | 0        | 1        |
| 2          | 1          | 1    | 0    | 1        | 1        |
| 3          | 1          | 0    | 1    | 1        | 1        |
| 4          | 1          | 1    | 1    | 1        | 0        |
+------------+------------+------+------+----------+----------+
Result 表：
+-------+------+------+----------+----------+
| name  | rest | paid | canceled | refunded |
+-------+------+------+----------+----------+
| bacon | 3    | 3    | 3        | 3        |
| ham   | 2    | 4    | 5        | 3        |
+-------+------+------+----------+----------+
- bacon 的总应缴款项为 1 + 1 + 0 + 1 = 3
- bacon 的总已支付金额为 1 + 0 + 1 + 1 = 3
- bacon 的总已取消金额为 0 + 1 + 1 + 1 = 3
- bacon 的总已退款金额为 1 + 1 + 1 + 0 = 3
- ham 的总应缴款项为 2 + 0 = 2
- ham 的总已支付金额为 0 + 4 = 4
- ham 的总已取消金额为 5 + 0 = 5
- ham 的总已退款金额为 0 + 3 = 3

解答：

# Write your MySQL query statement below
select
    name,
    ifnull(sum(rest),0) rest,					# 求和，null为0
    ifnull(sum(paid),0) paid,					# 求和，null为0
    ifnull(sum(canceled),0) canceled,			# 求和，null为0
    ifnull(sum(refunded),0) refunded			# 求和，null为0
from
    product p
left join				# product和invoice进行左连接，因为需要的是所有产品的，所以以product为主
    invoice i
on
    p.product_id=i.product_id		# 连接条件
group by
    name				# 分组条件
order by
    name				# 排序条件
;

138、1683.无效的推文

难度：★★☆☆☆

表：Tweets

+----------------+---------+
| Column Name    | Type    |
+----------------+---------+
| tweet_id       | int     |
| content        | varchar |
+----------------+---------+
tweet_id 是这个表的主键。
这个表包含某社交媒体 App 中所有的推文。

写一条 SQL 语句，查询所有无效推文的编号（ID）。当推文内容中的字符数严格大于 15 时，该推文是无效的。

以任意顺序返回结果表。

查询结果格式如下示例所示：

Tweets 表：
+----------+----------------------------------+
| tweet_id | content                          |
+----------+----------------------------------+
| 1        | Vote for Biden                   |
| 2        | Let us make America great again! |
+----------+----------------------------------+

结果表：
+----------+
| tweet_id |
+----------+
| 2        |
+----------+
推文 1 的长度 length = 14。该推文是有效的。
推文 2 的长度 length = 32。该推文是无效的。

解答：

# Write your MySQL query statement below
select
    tweet_id
from
    tweets
where 
    char_length(content)>15			# 判断条件
;

扩展：

# 1、char_length(str):返回字符串s的字符数

139、1693.每天的领导人和合伙人

难度：★★☆☆☆

表：DailySales

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| date_id     | date    |
| make_name   | varchar |
| lead_id     | int     |
| partner_id  | int     |
+-------------+---------+
该表没有主键。
该表包含日期、产品的名称，以及售给的领导和合伙人的编号。
名称只包含小写英文字母。

写一条 SQL 语句，使得对于每一个 date_id 和 make_name，返回不同的 lead_id 以及不同的 partner_id 的数量。

按任意顺序返回结果表。

查询结果格式如下示例所示：

DailySales 表：
+-----------+-----------+---------+------------+
| date_id   | make_name | lead_id | partner_id |
+-----------+-----------+---------+------------+
| 2020-12-8 | toyota    | 0       | 1          |
| 2020-12-8 | toyota    | 1       | 0          |
| 2020-12-8 | toyota    | 1       | 2          |
| 2020-12-7 | toyota    | 0       | 2          |
| 2020-12-7 | toyota    | 0       | 1          |
| 2020-12-8 | honda     | 1       | 2          |
| 2020-12-8 | honda     | 2       | 1          |
| 2020-12-7 | honda     | 0       | 1          |
| 2020-12-7 | honda     | 1       | 2          |
| 2020-12-7 | honda     | 2       | 1          |
+-----------+-----------+---------+------------+
结果表：
+-----------+-----------+--------------+-----------------+
| date_id   | make_name | unique_leads | unique_partners |
+-----------+-----------+--------------+-----------------+
| 2020-12-8 | toyota    | 2            | 3               |
| 2020-12-7 | toyota    | 1            | 2               |
| 2020-12-8 | honda     | 2            | 2               |
| 2020-12-7 | honda     | 3            | 2               |
+-----------+-----------+--------------+-----------------+
在 2020-12-8，丰田（toyota）有领导者 = [0, 1] 和合伙人 = [0, 1, 2] ，同时本田（honda）有领导者 = [1, 2] 和合伙人 = [1, 2]。
在 2020-12-7，丰田（toyota）有领导者 = [0] 和合伙人 = [1, 2] ，同时本田（honda）有领导者 = [0, 1, 2] 和合伙人 = [1, 2]。

解答：

# Write your MySQL query statement below

select
    date_id,
    make_name,
    count(distinct lead_id) unique_leads,			# 去重求数
    count(distinct partner_id) unique_partners		# 去重求数
from
    dailysales
group by
    date_id,make_name			# 按照date_id,make_name进行分组
;

140、1699.两人之间的通话次数

表： Calls

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| from_id     | int     |
| to_id       | int     |
| duration    | int     |
+-------------+---------+
该表没有主键，可能存在重复项。
该表包含 from_id 与 to_id 间的一次电话的时长。
from_id != to_id

编写 SQL 语句，查询每一对用户 (person1, person2) 之间的通话次数和通话总时长，其中 person1 < person2 。

以任意顺序返回结果表。

查询结果格式如下示例所示：

Calls 表：
+---------+-------+----------+
| from_id | to_id | duration |
+---------+-------+----------+
| 1       | 2     | 59       |
| 2       | 1     | 11       |
| 1       | 3     | 20       |
| 3       | 4     | 100      |
| 3       | 4     | 200      |
| 3       | 4     | 200      |
| 4       | 3     | 499      |
+---------+-------+----------+

结果表：
+---------+---------+------------+----------------+
| person1 | person2 | call_count | total_duration |
+---------+---------+------------+----------------+
| 1       | 2       | 2          | 70             |
| 1       | 3       | 1          | 20             |
| 3       | 4       | 4          | 999            |
+---------+---------+------------+----------------+
用户 1 和 2 打过 2 次电话，总时长为 70 (59 + 11)。
用户 1 和 3 打过 1 次电话，总时长为 20。
用户 3 和 4 打过 4 次电话，总时长为 999 (100 + 200 + 200 + 499)。

解答：

# Write your MySQL query statement below
with tmp as(			# 临时表tmp
select
    *
from
    calls
where 
    from_id<to_id		# 取from_id<to_id的
union all				# union all进行结合
select
    to_id from_id,		# 反转，把小的放前面
    from_id to_id,		# 反转，把大的放前面
    duration
from 
    calls
where   
    from_id>to_id		# 取from_id>to_id
)
select 
    from_id person1,
    to_id person2,
    count(*) call_count,
    sum(duration) total_duration
from
    tmp
group by
    from_id,to_id
;

大数据最新文章

实现Kafka至少消费一次

亚马逊云科技：还在苦于ETL？Zero ETL的时代

初探MapReduce

【SpringBoot框架篇】32.基于注解+redis实现

Elasticsearch：如何减少 Elasticsearch 集

Go redis操作

Redis面试题

专题五 Redis高并发场景

基于GBase8s和Calcite的多数据源查询

Redis——底层数据结构原理

加:2022-03-06 13:08:12 更:2022-03-06 13:10:04

360图书馆购物三丰科技阅读网日历万年历 2025年12日历

-2025/12/28 1:37:00-

图片自动播放器
↓图片自动播放器↓

TxT小说阅读器
↓语音阅读,小说下载,古典文学↓

一键清除垃圾
↓轻轻一点,清除系统垃圾↓

图片批量下载器
↓批量下载图片,美女图库↓

网站联系: qq:121756557 email:121756557@qq.com IT数码