一、题目
Table: Activity
+
| Column Name | Type |
+
| player_id | int |
| device_id | int |
| event_date | date |
| games_played | int |
+
(player_id,event_date)是此表的主键。
这张表显示了某些游戏的玩家的活动情况。
每一行是一个玩家的记录,他在某一天使用某个设备注销之前登录并玩了很多游戏(可能是 0)。
编写一个 SQL 查询,报告在首次登录的第二天再次登录的玩家的比率,四舍五入到小数点后两位。换句话说,您需要计算从首次登录日期开始至少连续两天登录的玩家的数量,然后除以玩家总数。
查询结果格式如下所示:
Activity table:
+
| player_id | device_id | event_date | games_played |
+
| 1 | 2 | 2016-03-01 | 5 |
| 1 | 2 | 2016-03-02 | 6 |
| 2 | 3 | 2017-06-25 | 1 |
| 3 | 1 | 2016-03-02 | 0 |
| 3 | 4 | 2018-07-03 | 5 |
+
Result table:
+
| fraction |
+
| 0.33 |
+
只有 ID 为 1 的玩家在第一天登录后才重新登录,所以答案是 1/3 = 0.33
二、解决
1、lag+rank() over+datediff
思路: 略。 代码:
select round(select count(player_id) from (
select
player_id,
event_date,
lag(event_date)
over(partition by player_id order by event_date) as next_date,
rank() over(partition by player_id order by event_date) as login_times
from activity
) as temp where datediff(event_date, next_date) = 1 and login_times = 2
/
select count(player_id) from (
select player_id from activity group by player_id
) as temp, 2) as fraction;
select round((select count(player_id) from (
select
player_id,
event_date,
lag(event_date)
over(partition by player_id order by event_date) as next_date,
rank() over(partition by player_id order by event_date) as login_times
from activity
) as temp where datediff(event_date, next_date) = 1 and login_times = 2)
/
(select count(player_id) from (
select player_id from activity group by player_id
) as temp), 2) as fraction;
补充:
select
player_id,
event_date,
lag(event_date) over(partition by player_id order by event_date) as next_date,
rank() over(partition by player_id order by event_date) as login_times
from activity
select count(player_id) from (
select
player_id,
event_date,
lag(event_date)
over(partition by player_id order by event_date) as next_date,
rank() over(partition by player_id order by event_date) as login_times
from activity
) as temp where datediff(event_date, next_date) = 1 and login_times = 2
2、datediff()+min
思路: 略。 代码-版本1:
select round((
(select count(player_id) from (
select
player_id,
datediff(event_date, min(event_date) over(partition by player_id)) as diff
from activity
) as temp where diff = 1) / (select count(distinct player_id) from activity)
), 2) as fraction;
代码-版本2:
with temp as (
select
player_id,
datediff(event_date, min(event_date) over(partition by player_id)) as diff
from activity
) select round(
sum(case diff when 1 then 1 else 0 end) /
count(distinct player_id),
2) as fraction from temp;
3、group by+left join+avg()
思路: 略。 代码:
select round(avg(event_date is not null), 2) as fraction from (
select player_id, min(event_date) as first_login from activity
group by player_id
) temp left join activity
on temp.player_id = activity.player_id
and datediff(event_date, first_login) = 1;
三、参考
1、从小白视角用 4 种方法详细介绍 2、sql中的窗口函数:lead,lag
|