1、行转列(一)
主要使用:
CONCAT(string A/col, string B/col…):返回输入字符串连接后的结果,支持任意个输入字符串; CONCAT_WS(separator, str1, str2,…):它是一个特殊形式的 CONCAT()。第一个参数是剩余其他参数间的分隔符。 COLLECT_SET(col):函数只接受基本数据类型,它的主要作用是将某字段的值进行去重汇总,产生array类型字段。
示例
原始数据:
name | constellation | blood_type |
---|
宋江 | 白羊座 | A | 鲁智深 | 射手 | A | 武松 | 白羊座 | B | 潘金莲 | 白羊座 | A | 西门庆 | 射手 | A |
期望输出结果:
constell_blood | name_list |
---|
射手座,A | 鲁智深&西门庆 | 白羊座,A | 宋江&潘金莲 | 白羊座,B | 武松 |
实现:
select
t1.base,
concat_ws('&', collect_set(t1.name)) name
from
(select
name,
concat(constellation, ",", blood_type) constell_blood
from
person_info) t1
group by
t1.constell_blood;
2、列转行(一)
主要使用 EXPLODE(col):将hive一列中复杂的array或者map结构拆分成多行。 LATERAL VIEW 用法:LATERAL VIEW udtf(expression) tableAlias AS columnAlias 解释:用于和split, explode等UDTF一起使用,它能够将一列数据拆成多行数据,在此基础上可以对拆分后的数据进行聚合。
示例
原始数据:
movie | category |
---|
《疑犯追踪》 | 悬疑&动作 | 《Lie to me》 | 悬疑&警匪 |
期望输出结果:
movie | category_name |
---|
《疑犯追踪》 | 悬疑 | 《疑犯追踪》 | 动作 | 《Lie to me》 | 悬疑 | 《Lie to me》 | 警匪 |
实现:
select
movie,
category_name
from
movie_info
lateral view explode(split(category, "\\&")) table_tmp as category_name;
3、行转列(二)
主要使用sum(case when )
示例
原始数据:
stu_id | name | course | score |
---|
01 | zhangsan | math | 90 | 01 | zhangsan | chinese | 88 | 01 | zhangsan | english | 88 | 02 | lisi | math | 66 | 02 | lisi | chinese | 77 | 02 | lisi | english | 80 |
期望输出结果:
stu_id | name | mat_score | chi_score | eng_score |
---|
01 | zhangsan | 90 | 88 | 88 | 02 | lisi | 66 | 77 | 80 |
实现1:
select
stu_id,
name,
sum(case when course='math' then score else 0 end) mat_score,
sum(case when course='chinese' then score else 0 end) chi_score,
sum(case when course='english' then score else 0 end) eng_score
from stu001
group by stu_id,name
实现2:
select
a.stu_id,
a.name,
(select score from stu001 m where course='math' and a.stu_id=m.stu_id and m.name=a.name ) mat_score,
(select score from stu001 m where course='chinese' and a.stu_id=m.stu_id and m.name=a.name ) chi_score,
(select score from stu001 m where course='english' and a.stu_id=m.stu_id and m.name=a.name ) eng_score
from stu001 a
group by stu_id,name;
4、列转行(二)
示例
主要使用union all 原始数据:
stu_id | name | mat_score | chi_score | eng_score |
---|
01 | zhangsan | 90 | 88 | 88 | 02 | lisi | 66 | 77 | 80 |
期望输出结果:
stu_id | name | course | score |
---|
01 | zhangsan | math | 90 | 01 | zhangsan | chinese | 88 | 01 | zhangsan | english | 88 | 02 | lisi | math | 66 | 02 | lisi | chinese | 77 | 02 | lisi | english | 80 |
实现:
select
stu_id,
name,
if(mat_score is not null ,'match',null) as course,
mat_score as score
from
(select
a.stu_id,
a.name,
(select score from stu001 m where course='math' and a.stu_id=m.stu_id and m.name=a.name ) mat_score,
(select score from stu001 m where course='chinese' and a.stu_id=m.stu_id and m.name=a.name ) chi_score,
(select score from stu001 m where course='english' and a.stu_id=m.stu_id and m.name=a.name ) eng_score
from stu001 a
group by stu_id,name) tmp0
union all
select
stu_id,
name,
if(chi_score is not null ,'chinese',null) as course,
chi_score as score
from
(select
a.stu_id,
a.name,
(select score from stu001 m where course='math' and a.stu_id=m.stu_id and m.name=a.name ) mat_score,
(select score from stu001 m where course='chinese' and a.stu_id=m.stu_id and m.name=a.name ) chi_score,
(select score from stu001 m where course='english' and a.stu_id=m.stu_id and m.name=a.name ) eng_score
from stu001 a
group by stu_id,name) tmp1
union all
select
stu_id,
name,
if(eng_score is not null ,'english',null) as course,
eng_score as score
from
(select
a.stu_id,
a.name,
(select score from stu001 m where course='math' and a.stu_id=m.stu_id and m.name=a.name ) mat_score,
(select score from stu001 m where course='chinese' and a.stu_id=m.stu_id and m.name=a.name ) chi_score,
(select score from stu001 m where course='english' and a.stu_id=m.stu_id and m.name=a.name ) eng_score
from stu001 a
group by stu_id,name) tmp2
order by stu_id,name;
|