[大数据] Hive复杂数据类型之Map

开发: C++知识库 Java知识库 JavaScript Python PHP知识库人工智能区块链大数据移动开发嵌入式开发工具数据结构与算法开发测试游戏开发网络协议系统运维
教程: HTML教程 CSS教程 JavaScript教程 Go语言教程 JQuery教程 VUE教程 VUE3教程 Bootstrap教程 SQL数据库教程 C语言教程 C++教程 Java教程 Python教程 Python3教程 C#教程
数码: 电脑笔记本显卡显示器固态硬盘硬盘耳机手机 iphone vivo oppo 小米华为单反装机图拉丁

-> 大数据 -> Hive复杂数据类型之Map -> 正文阅读

[大数据]Hive复杂数据类型之Map

建库

hive> create database test_data;
hive>
    >
    > use test_data;

建表

hive> create table if not exists grade(
    > name string,
    > score map<string,int>
    > )
    > row format delimited
    > fields terminated by '\t'
    > collection items terminated by ','
    > map keys terminated by ':'
    > ;

在linux本地文件系统上准备数据文件

[root@hadoop01 test_data]# pwd
/usr/local/wyh/test_data
[root@hadoop01 test_data]# vi grade.txt
[root@hadoop01 test_data]# cat grade.txt
Lily    chinese:95,math:86,english:69,nature:72
Bob     chinese:62,math:43,english:79,nature:0
Tom     chinese:83,math:39

导入数据

hive> load data local inpath '/usr/local/wyh/test_data/grade.txt' into table grade;

查询数据

hive> select * from grade;
OK
Lily    {"chinese":95,"math":86,"english":69,"nature":72}
Bob     {"chinese":62,"math":43,"english":79,"nature":0}
Tom     {"chinese":83,"math":39}

查看语文成绩大于80的学生的英语和自然成绩

hive> select
    > g.name,
    > g.score['english'],
    > g.score['nature']
    > from grade g
    > where g.score['chinese']>80
    > ;
OK
Lily    69      72
Tom     NULL    NULL

查询每位学生的语文和数学成绩总和

hive> select
    > g.name,
    > g.score['chinese']+g.score['math']
    > from grade g
    > ;
OK
Lily    181
Bob     105
Tom     122

使用explode展开数据

hive> select explode(score) from grade;
OK
chinese 95
math    86
english 69
nature  72
chinese 62
math    43
english 79
nature  0
chinese 83
math    39


#这样就会把所有记录中的map中的key:value全部展开

hive> select explode(score) as (grade_subject,grade_score) from grade;
OK
chinese 95
math    86
english 69
nature  72
chinese 62
math    43
english 79
nature  0
chinese 83
math    39

#由于展开之后是两列，所以as后面的第一个参数是我们要指定的展开后的第一列的列名，第二个参数是展开后的第二列的列名

hive> select name,grade_subject,grade_score from grade lateral view explode(score) score_view as grade_subject,grade_score;
OK
Lily    chinese 95
Lily    math    86
Lily    english 69
Lily    nature  72
Bob     chinese 62
Bob     math    43
Bob     english 79
Bob     nature  0
Tom     chinese 83
Tom     math    39


#score_view是展开之后的虚拟表的名字，as后面跟的是展开之后的列名

查询每个学生的总成绩

hive> select name,sum(grade_score) from grade lateral view explode(score) score_view as grade_subject,grade_score group by name;

Bob     184
Lily    322
Tom     122

上面的案例都是将key:value形式的map数据进行展开查询，下面来实现一下将展开后的数据map成key:value形式。

新建表并导入数据

hive> create table map_grade as select name,grade_subject,grade_score from grade lateral view explode(score) score_view as grade_subject,grade_score;

查看表结构

hive> desc map_grade;
OK
name                    string
grade_subject           string
grade_score             int

查询数据

hive> select * from map_grade;
OK
Lily    chinese 95
Lily    math    86
Lily    english 69
Lily    nature  72
Bob     chinese 62
Bob     math    43
Bob     english 79
Bob     nature  0
Tom     chinese 83
Tom     math    39

使用concat函数将科目和成绩拼接在一起

hive> select name,concat(grade_subject,':',grade_score) as score from map_grade;
OK
Lily    chinese:95
Lily    math:86
Lily    english:69
Lily    nature:72
Bob     chinese:62
Bob     math:43
Bob     english:79
Bob     nature:0
Tom     chinese:83
Tom     math:39

使用collect_set函数将每个学生的各科成绩拼接在一个集合里

hive> select name,collect_set(concat(grade_subject,':',grade_score)) from map_grade group by name;

Bob     ["chinese:62","math:43","english:79","nature:0"]
Lily    ["chinese:95","math:86","english:69","nature:72"]
Tom     ["chinese:83","math:39"]

使用concat_ws函数将集合变成字符串

hive> select name,concat_ws(",",collect_set(concat(grade_subject,':',grade_score))) from map_grade group by name;

Bob     chinese:62,math:43,english:79,nature:0
Lily    chinese:95,math:86,english:69,nature:72
Tom     chinese:83,math:39

#concat_ws函数中的第一个参数表示用逗号来拼接每一个key:value

使用str_to_map函数将字符串转成map

hive> select name,str_to_map(concat_ws(",",collect_set(concat(grade_subject,':',grade_score))),',',':') from map_grade group by name;

Bob     {"chinese":"62","math":"43","english":"79","nature":"0"}
Lily    {"chinese":"95","math":"86","english":"69","nature":"72"}
Tom     {"chinese":"83","math":"39"}

#str_to_map函数中的参数一表示将哪个字符串进行转换，参数二表示用什么符号来对key:value与key:value之间进行分割，参数三表示key与value之间用什么符号来分割。

以上就是hive中map的正向和逆向的简单使用案例。