方法一 :当面对全量表,使用自关联 --笛卡尔积 先看一下笛卡尔积的结果
首先看一下全量表的明细数据
对明细数据进行聚合
select
*
from
( select count(*) as cnt,app_id,down_date
from dwd_hispace_down_ds
group by app_id,down_date) a
left join
( select count(*) as cnt,app_id,down_date
from dwd_hispace_down_ds
group by app_id,down_date) b
on a.app_id=b.app_id
select
a.down_date
,a.app_id
,sum(case when a.down_date - b.down_date=7
then a.cnt/b.cnt else 0 end) as seven_tb
,sum(case when add_months(a.down_date,-1) = b.down_date
then a.cnt/b.cnt else 0 end) as month_tb
from
( select count(*) as cnt,app_id,down_date
from dwd_hispace_down_ds
group by app_id,down_date) a
left join
( select count(*) as cnt,app_id,down_date
from dwd_hispace_down_ds
group by app_id,down_date) b
on a.app_id=b.app_id
group by
a.down_date
,a.app_id
方法二:面对每天的增量表,假如是hive大数据的情景下,且数据量特别大,一次只能扫描一天的分区。如要计算20211201当天的周月同比。先看一下所有分区的数据
select
app_id
,nvl(max(cnt)/max(seven_cnt),0) as seven_tb
,nvl(max(cnt)/max(month_cnt),0) as month_tb
from
(
select
app_id
,count(*) as cnt
,null as seven_cnt
,null as month_cnt
from DWD_HISPACE_DOWN_DM
where pt_d=20211201 group by app_id
union all
select
app_id
,null as cnt
,count(*) as seven_cnt
,null as month_cnt
from DWD_HISPACE_DOWN_DM
where pt_d=to_char(to_date('20211201','yyyy-mm-dd')-7,'yyyymmdd') group by app_id
union all
select
app_id
,null as cnt
,null as seven_cnt
,count(*) as month_cnt
from DWD_HISPACE_DOWN_DM
where pt_d=to_char(add_months(to_date('20211201','yyyy-mm-dd'),-1),'yyyymmdd') group by app_id
)
group by app_id
方法三 关联时间维度表
select
a1.app_id
,sum(a1.cnt)/sum(b1.cnt) as week_tb
from
(
select
app_id
,cnt
,b.week_ago
from
( select count(*) as cnt,app_id,down_date
from dwd_hispace_down_ds
group by app_id,down_date) a
left join dim_time b
on a.down_date=tO_date(b.today,'yyyymmdd')
) a1
inner join
(
select count(*) as cnt,app_id,down_date
from dwd_hispace_down_ds
group by app_id,down_date
) b1
on tO_date(a1.week_ago,'yyyymmdd')=b1.down_date
and a1.app_id=b1.app_id
group by a1.app_id
|