[Big Data] Hive DQL: Table Joins

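-- Full join: returns rows present in both tables, rows present only in the left table, and rows present only in the right table. Every row of both tables is shown, with NULL filling the side that has no match.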
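To make the four join types concrete, here is a minimal Python sketch (not Hive code) that models them on the u1 and u2 rows visible in the query output of this post. The `join` helper and `NULL_ROW` are illustrative names, not part of Hive, and the row order it produces is not meant to match what Hive prints.

```python
# Rows as they appear in the query output of this post: (id, name).
u1 = [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e"), (6, "f")]
u2 = [(4, "d"), (5, "e"), (6, "f"), (7, "g"), (8, "h"), (9, "i")]

NULL_ROW = (None, None)  # stands in for the NULL, NULL padding Hive prints


def join(left, right, how="inner"):
    """Equi-join on id (field 0). how is 'inner', 'left', 'right' or 'full'."""
    rows = []
    matched_right = set()
    for l in left:
        hits = [r for r in right if r[0] == l[0]]
        for r in hits:
            rows.append(l + r)
            matched_right.add(r)
        # Left-only rows survive only in left and full joins.
        if not hits and how in ("left", "full"):
            rows.append(l + NULL_ROW)
    # Right-only rows survive only in right and full joins.
    if how in ("right", "full"):
        for r in right:
            if r not in matched_right:
                rows.append(NULL_ROW + r)
    return rows


for how in ("inner", "left", "right", "full"):
    print(how, len(join(u1, u2, how)))  # inner 3, left 6, right 6, full 9
```

The row counts agree with the "Fetched: N row(s)" lines in the session: 3 for the inner join, 6 for the left and right joins, and 9 for the full join.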




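-- Cartesian product: every row of one table is joined with every row of the other, so the result size grows alarmingly fast. Here 6 * 6 = 36 rows.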
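The 6 * 6 = 36 arithmetic can be checked with a short Python sketch (again a model of the semantics, not Hive code). It uses `itertools.product` on the same u1 and u2 rows that appear in the query output of this post; the variable name `cross` is illustrative.

```python
from itertools import product

# Rows as they appear in the query output of this post: (id, name).
u1 = [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e"), (6, "f")]
u2 = [(4, "d"), (5, "e"), (6, "f"), (7, "g"), (8, "h"), (9, "i")]

# A Cartesian product pairs every u1 row with every u2 row; no join key is involved.
cross = [l + r for l, r in product(u1, u2)]

print(len(cross))  # 36
```

Because the result size is the product of the two row counts, two tables of a million rows each would already yield 10^12 rows, which is why Hive refuses the query until `hive.strict.checks.cartesian.product` is set to false, as the session shows.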


-------------------------------------------------


hive (mydb)> select * from u1 join u2 on u1.id = u2.id;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = root_20220112222831_c8641f66-8d56-4cb7-bf33-93e2b1e70128
Total jobs = 1
2022-01-12 22:28:39     Starting to launch local task to process map join;      maximum memory = 518979584
2022-01-12 22:28:40     Dump the side-table for tag: 0 with group count: 6 into file: file:/tmp/root/d6200ad0-564c-4cd8-8a3a-2aa6255ab21d/hive_2022-01-12_22-28-31_054_4724095481681766367-1/-local-10004/HashTable-Stage-3/MapJoin-mapfile00--.hashtable
2022-01-12 22:28:40     Uploaded 1 File to: file:/tmp/root/d6200ad0-564c-4cd8-8a3a-2aa6255ab21d/hive_2022-01-12_22-28-31_054_4724095481681766367-1/-local-10004/HashTable-Stage-3/MapJoin-mapfile00--.hashtable (386 bytes)
2022-01-12 22:28:40     End of local task; Time Taken: 1.147 sec.
Execution completed successfully
MapredLocal task succeeded
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1641995564638_0001, Tracking URL = http://linux123:8088/proxy/application_1641995564638_0001/
Kill Command = /opt/lagou/servers/hadoop-2.9.2/bin/hadoop job  -kill job_1641995564638_0001
Hadoop job information for Stage-3: number of mappers: 1; number of reducers: 0
2022-01-12 22:28:51,469 Stage-3 map = 0%,  reduce = 0%
2022-01-12 22:28:59,717 Stage-3 map = 100%,  reduce = 0%, Cumulative CPU 1.18 sec
MapReduce Total cumulative CPU time: 1 seconds 180 msec
Ended Job = job_1641995564638_0001
MapReduce Jobs Launched:
Stage-Stage-3: Map: 1   Cumulative CPU: 1.18 sec   HDFS Read: 6092 HDFS Write: 147 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 180 msec
OK
u1.id   u1.name u2.id   u2.name
4       d       4       d
5       e       5       e
6       f       6       f
Time taken: 29.726 seconds, Fetched: 3 row(s)
hive (mydb)> select * from u1 left join u2 on u1.id = u2.id;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = root_20220112223213_e0369e9c-ff56-4164-a46e-1fbd05c7a21b
Total jobs = 1
2022-01-12 22:32:20     Starting to launch local task to process map join;      maximum memory = 518979584
2022-01-12 22:32:21     Dump the side-table for tag: 1 with group count: 6 into file: file:/tmp/root/d6200ad0-564c-4cd8-8a3a-2aa6255ab21d/hive_2022-01-12_22-32-13_210_5733734088434832898-1/-local-10004/HashTable-Stage-3/MapJoin-mapfile11--.hashtable
2022-01-12 22:32:21     Uploaded 1 File to: file:/tmp/root/d6200ad0-564c-4cd8-8a3a-2aa6255ab21d/hive_2022-01-12_22-32-13_210_5733734088434832898-1/-local-10004/HashTable-Stage-3/MapJoin-mapfile11--.hashtable (386 bytes)
2022-01-12 22:32:21     End of local task; Time Taken: 1.054 sec.
Execution completed successfully
MapredLocal task succeeded
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1641995564638_0002, Tracking URL = http://linux123:8088/proxy/application_1641995564638_0002/
Kill Command = /opt/lagou/servers/hadoop-2.9.2/bin/hadoop job  -kill job_1641995564638_0002
Hadoop job information for Stage-3: number of mappers: 1; number of reducers: 0
2022-01-12 22:32:31,981 Stage-3 map = 0%,  reduce = 0%
2022-01-12 22:32:37,151 Stage-3 map = 100%,  reduce = 0%, Cumulative CPU 0.74 sec
MapReduce Total cumulative CPU time: 740 msec
Ended Job = job_1641995564638_0002
MapReduce Jobs Launched:
Stage-Stage-3: Map: 1   Cumulative CPU: 0.74 sec   HDFS Read: 5770 HDFS Write: 213 SUCCESS
Total MapReduce CPU Time Spent: 740 msec
OK
u1.id   u1.name u2.id   u2.name
1       a       NULL    NULL
2       b       NULL    NULL
3       c       NULL    NULL
4       d       4       d
5       e       5       e
6       f       6       f
Time taken: 25.01 seconds, Fetched: 6 row(s)
hive (mydb)> select * from u1 right join u2 on u1.id = u2.id;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = root_20220112223714_8fb055ba-3d36-4ea7-a252-b9ff2bb2ec54
Total jobs = 1
2022-01-12 22:37:21     Starting to launch local task to process map join;      maximum memory = 518979584
2022-01-12 22:37:22     Dump the side-table for tag: 0 with group count: 6 into file: file:/tmp/root/d6200ad0-564c-4cd8-8a3a-2aa6255ab21d/hive_2022-01-12_22-37-14_046_7767994758097546401-1/-local-10004/HashTable-Stage-3/MapJoin-mapfile20--.hashtable
2022-01-12 22:37:22     Uploaded 1 File to: file:/tmp/root/d6200ad0-564c-4cd8-8a3a-2aa6255ab21d/hive_2022-01-12_22-37-14_046_7767994758097546401-1/-local-10004/HashTable-Stage-3/MapJoin-mapfile20--.hashtable (386 bytes)
2022-01-12 22:37:22     End of local task; Time Taken: 1.094 sec.
Execution completed successfully
MapredLocal task succeeded
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1641995564638_0003, Tracking URL = http://linux123:8088/proxy/application_1641995564638_0003/
Kill Command = /opt/lagou/servers/hadoop-2.9.2/bin/hadoop job  -kill job_1641995564638_0003
Hadoop job information for Stage-3: number of mappers: 1; number of reducers: 0
2022-01-12 22:37:32,594 Stage-3 map = 0%,  reduce = 0%
2022-01-12 22:37:37,701 Stage-3 map = 100%,  reduce = 0%, Cumulative CPU 0.96 sec
MapReduce Total cumulative CPU time: 960 msec
Ended Job = job_1641995564638_0003
MapReduce Jobs Launched:
Stage-Stage-3: Map: 1   Cumulative CPU: 0.96 sec   HDFS Read: 5770 HDFS Write: 213 SUCCESS
Total MapReduce CPU Time Spent: 960 msec
OK
u1.id   u1.name u2.id   u2.name
4       d       4       d
5       e       5       e
6       f       6       f
NULL    NULL    7       g
NULL    NULL    8       h
NULL    NULL    9       i
Time taken: 24.754 seconds, Fetched: 6 row(s)
hive (mydb)> select * from u1 full join u2 on u1.id = u2.id;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = root_20220112223920_77ef5961-e704-43d2-9a6b-2a56e7064e42
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Defaulting to jobconf value of: 4
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1641995564638_0004, Tracking URL = http://linux123:8088/proxy/application_1641995564638_0004/
Kill Command = /opt/lagou/servers/hadoop-2.9.2/bin/hadoop job  -kill job_1641995564638_0004
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 4
2022-01-12 22:39:28,499 Stage-1 map = 0%,  reduce = 0%
2022-01-12 22:39:40,692 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.7 sec
2022-01-12 22:39:48,647 Stage-1 map = 100%,  reduce = 50%, Cumulative CPU 5.34 sec
2022-01-12 22:39:51,866 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 7.94 sec
MapReduce Total cumulative CPU time: 7 seconds 940 msec
Ended Job = job_1641995564638_0004
MapReduce Jobs Launched:
Stage-Stage-1: Map: 2  Reduce: 4   Cumulative CPU: 7.94 sec   HDFS Read: 27544 HDFS Write: 540 SUCCESS
Total MapReduce CPU Time Spent: 7 seconds 940 msec
OK
u1.id   u1.name u2.id   u2.name
4       d       4       d
NULL    NULL    8       h
1       a       NULL    NULL
5       e       5       e
NULL    NULL    9       i
2       b       NULL    NULL
6       f       6       f
3       c       NULL    NULL
NULL    NULL    7       g
Time taken: 32.148 seconds, Fetched: 9 row(s)
hive (mydb)> select * from u1,u2;
FAILED: SemanticException Cartesian products are disabled for safety reasons. If you know what you are doing, please sethive.strict.checks.cartesian.product to false and that hive.mapred.mode is not set to 'strict' to proceed. Note that if you may get errors or incorrect results if you make a mistake while using some of the unsafe features.
hive (mydb)> set hive.strict.checks.cartesian.product;
hive.strict.checks.cartesian.product=true
hive (mydb)> set hive.strict.checks.cartesian.product=false;
hive (mydb)> select * from u1,u2;
Warning: Map Join MAPJOIN[9][bigTable=?] in task 'Stage-3:MAPRED' is a cross product
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = root_20220112225121_2784f4ef-a0f1-4e44-95f5-d090246f17cc
Total jobs = 1
2022-01-12 22:51:29     Starting to launch local task to process map join;      maximum memory = 518979584
2022-01-12 22:51:30     Dump the side-table for tag: 0 with group count: 1 into file: file:/tmp/root/d6200ad0-564c-4cd8-8a3a-2aa6255ab21d/hive_2022-01-12_22-51-21_357_175171533818054252-1/-local-10004/HashTable-Stage-3/MapJoin-mapfile30--.hashtable
2022-01-12 22:51:30     Uploaded 1 File to: file:/tmp/root/d6200ad0-564c-4cd8-8a3a-2aa6255ab21d/hive_2022-01-12_22-51-21_357_175171533818054252-1/-local-10004/HashTable-Stage-3/MapJoin-mapfile30--.hashtable (320 bytes)
2022-01-12 22:51:30     End of local task; Time Taken: 1.185 sec.
Execution completed successfully
MapredLocal task succeeded
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1641995564638_0005, Tracking URL = http://linux123:8088/proxy/application_1641995564638_0005/
Kill Command = /opt/lagou/servers/hadoop-2.9.2/bin/hadoop job  -kill job_1641995564638_0005
Hadoop job information for Stage-3: number of mappers: 1; number of reducers: 0
2022-01-12 22:51:37,443 Stage-3 map = 0%,  reduce = 0%
2022-01-12 22:51:43,602 Stage-3 map = 100%,  reduce = 0%, Cumulative CPU 1.26 sec
MapReduce Total cumulative CPU time: 1 seconds 260 msec
Ended Job = job_1641995564638_0005
MapReduce Jobs Launched:
Stage-Stage-3: Map: 1   Cumulative CPU: 1.26 sec   HDFS Read: 5721 HDFS Write: 807 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 260 msec
OK
u1.id   u1.name u2.id   u2.name
1       a       4       d
2       b       4       d
3       c       4       d
4       d       4       d
5       e       4       d
6       f       4       d
1       a       5       e
2       b       5       e
3       c       5       e
4       d       5       e
5       e       5       e
6       f       5       e
1       a       6       f
2       b       6       f
3       c       6       f
4       d       6       f
5       e       6       f
6       f       6       f
1       a       7       g
2       b       7       g
3       c       7       g
4       d       7       g
5       e       7       g
6       f       7       g
1       a       8       h
2       b       8       h
3       c       8       h
4       d       8       h
5       e       8       h
6       f       8       h
1       a       9       i
2       b       9       i
3       c       9       i
4       d       9       i
5       e       9       i
6       f       9       i
Time taken: 24.318 seconds, Fetched: 36 row(s)
hive (mydb)>


Added: 2022-01-14 02:03:13  Updated: 2022-01-14 02:04:40
