[Big Data] Hive DQL: Table Joins

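-- Full join: returns the rows that match in both tables, the rows found only in the left table, and the rows found only in the right table, so every row of both tables appears.

-- Cartesian product: every row of one table is joined with every row of the other, so the result grows alarmingly fast. Here 6 * 6 = 36 rows.

-------------------------------------------------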
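
To make the join types concrete, here is a minimal Python sketch that mimics them on in-memory lists. It only models the semantics and is not how Hive executes a join. The contents of u1 (ids 1 to 6) and u2 (ids 4 to 9) are inferred from the query results in the session below, the helper name `join` is made up for this illustration, and `None` stands in for NULL.

```python
# Illustration only: models join semantics in plain Python, not Hive's execution.
# Table contents are inferred from the query output in this post.
u1 = [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e"), (6, "f")]
u2 = [(4, "d"), (5, "e"), (6, "f"), (7, "g"), (8, "h"), (9, "i")]

NULL_ROW = (None, None)  # stands in for a row of NULLs


def join(left, right, how="inner"):
    """Join two lists of (id, name) rows on id; how is inner/left/right/full."""
    rows = []
    matched_right = set()
    for l in left:
        hits = [(i, r) for i, r in enumerate(right) if r[0] == l[0]]
        for i, r in hits:
            rows.append(l + r)
            matched_right.add(i)
        # Left and full joins keep unmatched left rows, padded with NULLs.
        if not hits and how in ("left", "full"):
            rows.append(l + NULL_ROW)
    # Right and full joins keep unmatched right rows, padded with NULLs.
    if how in ("right", "full"):
        for i, r in enumerate(right):
            if i not in matched_right:
                rows.append(NULL_ROW + r)
    return rows


for how in ("inner", "left", "right", "full"):
    print(how, len(join(u1, u2, how)))
```

The printed row counts (3, 6, 6, 9) agree with the "Fetched" counts in the session. Row order may differ from Hive's output, since Hive guarantees no ordering without ORDER BY.
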


hive (mydb)> select * from u1 join u2 on u1.id = u2.id;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = root_20220112222831_c8641f66-8d56-4cb7-bf33-93e2b1e70128
Total jobs = 1
2022-01-12 22:28:39     Starting to launch local task to process map join;      maximum memory = 518979584
2022-01-12 22:28:40     Dump the side-table for tag: 0 with group count: 6 into file: file:/tmp/root/d6200ad0-564c-4cd8-8a3a-2aa6255ab21d/hive_2022-01-12_22-28-31_054_4724095481681766367-1/-local-10004/HashTable-Stage-3/MapJoin-mapfile00--.hashtable
2022-01-12 22:28:40     Uploaded 1 File to: file:/tmp/root/d6200ad0-564c-4cd8-8a3a-2aa6255ab21d/hive_2022-01-12_22-28-31_054_4724095481681766367-1/-local-10004/HashTable-Stage-3/MapJoin-mapfile00--.hashtable (386 bytes)
2022-01-12 22:28:40     End of local task; Time Taken: 1.147 sec.
Execution completed successfully
MapredLocal task succeeded
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1641995564638_0001, Tracking URL = http://linux123:8088/proxy/application_1641995564638_0001/
Kill Command = /opt/lagou/servers/hadoop-2.9.2/bin/hadoop job  -kill job_1641995564638_0001
Hadoop job information for Stage-3: number of mappers: 1; number of reducers: 0
2022-01-12 22:28:51,469 Stage-3 map = 0%,  reduce = 0%
2022-01-12 22:28:59,717 Stage-3 map = 100%,  reduce = 0%, Cumulative CPU 1.18 sec
MapReduce Total cumulative CPU time: 1 seconds 180 msec
Ended Job = job_1641995564638_0001
MapReduce Jobs Launched:
Stage-Stage-3: Map: 1   Cumulative CPU: 1.18 sec   HDFS Read: 6092 HDFS Write: 147 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 180 msec
OK
u1.id   u1.name u2.id   u2.name
4       d       4       d
5       e       5       e
6       f       6       f
Time taken: 29.726 seconds, Fetched: 3 row(s)
hive (mydb)> select * from u1 left join u2 on u1.id = u2.id;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = root_20220112223213_e0369e9c-ff56-4164-a46e-1fbd05c7a21b
Total jobs = 1
2022-01-12 22:32:20     Starting to launch local task to process map join;      maximum memory = 518979584
2022-01-12 22:32:21     Dump the side-table for tag: 1 with group count: 6 into file: file:/tmp/root/d6200ad0-564c-4cd8-8a3a-2aa6255ab21d/hive_2022-01-12_22-32-13_210_5733734088434832898-1/-local-10004/HashTable-Stage-3/MapJoin-mapfile11--.hashtable
2022-01-12 22:32:21     Uploaded 1 File to: file:/tmp/root/d6200ad0-564c-4cd8-8a3a-2aa6255ab21d/hive_2022-01-12_22-32-13_210_5733734088434832898-1/-local-10004/HashTable-Stage-3/MapJoin-mapfile11--.hashtable (386 bytes)
2022-01-12 22:32:21     End of local task; Time Taken: 1.054 sec.
Execution completed successfully
MapredLocal task succeeded
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1641995564638_0002, Tracking URL = http://linux123:8088/proxy/application_1641995564638_0002/
Kill Command = /opt/lagou/servers/hadoop-2.9.2/bin/hadoop job  -kill job_1641995564638_0002
Hadoop job information for Stage-3: number of mappers: 1; number of reducers: 0
2022-01-12 22:32:31,981 Stage-3 map = 0%,  reduce = 0%
2022-01-12 22:32:37,151 Stage-3 map = 100%,  reduce = 0%, Cumulative CPU 0.74 sec
MapReduce Total cumulative CPU time: 740 msec
Ended Job = job_1641995564638_0002
MapReduce Jobs Launched:
Stage-Stage-3: Map: 1   Cumulative CPU: 0.74 sec   HDFS Read: 5770 HDFS Write: 213 SUCCESS
Total MapReduce CPU Time Spent: 740 msec
OK
u1.id   u1.name u2.id   u2.name
1       a       NULL    NULL
2       b       NULL    NULL
3       c       NULL    NULL
4       d       4       d
5       e       5       e
6       f       6       f
Time taken: 25.01 seconds, Fetched: 6 row(s)
hive (mydb)> select * from u1 right join u2 on u1.id = u2.id;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = root_20220112223714_8fb055ba-3d36-4ea7-a252-b9ff2bb2ec54
Total jobs = 1
2022-01-12 22:37:21     Starting to launch local task to process map join;      maximum memory = 518979584
2022-01-12 22:37:22     Dump the side-table for tag: 0 with group count: 6 into file: file:/tmp/root/d6200ad0-564c-4cd8-8a3a-2aa6255ab21d/hive_2022-01-12_22-37-14_046_7767994758097546401-1/-local-10004/HashTable-Stage-3/MapJoin-mapfile20--.hashtable
2022-01-12 22:37:22     Uploaded 1 File to: file:/tmp/root/d6200ad0-564c-4cd8-8a3a-2aa6255ab21d/hive_2022-01-12_22-37-14_046_7767994758097546401-1/-local-10004/HashTable-Stage-3/MapJoin-mapfile20--.hashtable (386 bytes)
2022-01-12 22:37:22     End of local task; Time Taken: 1.094 sec.
Execution completed successfully
MapredLocal task succeeded
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1641995564638_0003, Tracking URL = http://linux123:8088/proxy/application_1641995564638_0003/
Kill Command = /opt/lagou/servers/hadoop-2.9.2/bin/hadoop job  -kill job_1641995564638_0003
Hadoop job information for Stage-3: number of mappers: 1; number of reducers: 0
2022-01-12 22:37:32,594 Stage-3 map = 0%,  reduce = 0%
2022-01-12 22:37:37,701 Stage-3 map = 100%,  reduce = 0%, Cumulative CPU 0.96 sec
MapReduce Total cumulative CPU time: 960 msec
Ended Job = job_1641995564638_0003
MapReduce Jobs Launched:
Stage-Stage-3: Map: 1   Cumulative CPU: 0.96 sec   HDFS Read: 5770 HDFS Write: 213 SUCCESS
Total MapReduce CPU Time Spent: 960 msec
OK
u1.id   u1.name u2.id   u2.name
4       d       4       d
5       e       5       e
6       f       6       f
NULL    NULL    7       g
NULL    NULL    8       h
NULL    NULL    9       i
Time taken: 24.754 seconds, Fetched: 6 row(s)
hive (mydb)> select * from u1 full join u2 on u1.id = u2.id;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = root_20220112223920_77ef5961-e704-43d2-9a6b-2a56e7064e42
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Defaulting to jobconf value of: 4
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1641995564638_0004, Tracking URL = http://linux123:8088/proxy/application_1641995564638_0004/
Kill Command = /opt/lagou/servers/hadoop-2.9.2/bin/hadoop job  -kill job_1641995564638_0004
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 4
2022-01-12 22:39:28,499 Stage-1 map = 0%,  reduce = 0%
2022-01-12 22:39:40,692 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.7 sec
2022-01-12 22:39:48,647 Stage-1 map = 100%,  reduce = 50%, Cumulative CPU 5.34 sec
2022-01-12 22:39:51,866 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 7.94 sec
MapReduce Total cumulative CPU time: 7 seconds 940 msec
Ended Job = job_1641995564638_0004
MapReduce Jobs Launched:
Stage-Stage-1: Map: 2  Reduce: 4   Cumulative CPU: 7.94 sec   HDFS Read: 27544 HDFS Write: 540 SUCCESS
Total MapReduce CPU Time Spent: 7 seconds 940 msec
OK
u1.id   u1.name u2.id   u2.name
4       d       4       d
NULL    NULL    8       h
1       a       NULL    NULL
5       e       5       e
NULL    NULL    9       i
2       b       NULL    NULL
6       f       6       f
3       c       NULL    NULL
NULL    NULL    7       g
Time taken: 32.148 seconds, Fetched: 9 row(s)
hive (mydb)> select * from u1,u2;
FAILED: SemanticException Cartesian products are disabled for safety reasons. If you know what you are doing, please sethive.strict.checks.cartesian.product to false and that hive.mapred.mode is not set to 'strict' to proceed. Note that if you may get errors or incorrect results if you make a mistake while using some of the unsafe features.
hive (mydb)> set hive.strict.checks.cartesian.product;
hive.strict.checks.cartesian.product=true
hive (mydb)> set hive.strict.checks.cartesian.product=false;
hive (mydb)> select * from u1,u2;
Warning: Map Join MAPJOIN[9][bigTable=?] in task 'Stage-3:MAPRED' is a cross product
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = root_20220112225121_2784f4ef-a0f1-4e44-95f5-d090246f17cc
Total jobs = 1
2022-01-12 22:51:29     Starting to launch local task to process map join;      maximum memory = 518979584
2022-01-12 22:51:30     Dump the side-table for tag: 0 with group count: 1 into file: file:/tmp/root/d6200ad0-564c-4cd8-8a3a-2aa6255ab21d/hive_2022-01-12_22-51-21_357_175171533818054252-1/-local-10004/HashTable-Stage-3/MapJoin-mapfile30--.hashtable
2022-01-12 22:51:30     Uploaded 1 File to: file:/tmp/root/d6200ad0-564c-4cd8-8a3a-2aa6255ab21d/hive_2022-01-12_22-51-21_357_175171533818054252-1/-local-10004/HashTable-Stage-3/MapJoin-mapfile30--.hashtable (320 bytes)
2022-01-12 22:51:30     End of local task; Time Taken: 1.185 sec.
Execution completed successfully
MapredLocal task succeeded
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1641995564638_0005, Tracking URL = http://linux123:8088/proxy/application_1641995564638_0005/
Kill Command = /opt/lagou/servers/hadoop-2.9.2/bin/hadoop job  -kill job_1641995564638_0005
Hadoop job information for Stage-3: number of mappers: 1; number of reducers: 0
2022-01-12 22:51:37,443 Stage-3 map = 0%,  reduce = 0%
2022-01-12 22:51:43,602 Stage-3 map = 100%,  reduce = 0%, Cumulative CPU 1.26 sec
MapReduce Total cumulative CPU time: 1 seconds 260 msec
Ended Job = job_1641995564638_0005
MapReduce Jobs Launched:
Stage-Stage-3: Map: 1   Cumulative CPU: 1.26 sec   HDFS Read: 5721 HDFS Write: 807 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 260 msec
OK
u1.id   u1.name u2.id   u2.name
1       a       4       d
2       b       4       d
3       c       4       d
4       d       4       d
5       e       4       d
6       f       4       d
1       a       5       e
2       b       5       e
3       c       5       e
4       d       5       e
5       e       5       e
6       f       5       e
1       a       6       f
2       b       6       f
3       c       6       f
4       d       6       f
5       e       6       f
6       f       6       f
1       a       7       g
2       b       7       g
3       c       7       g
4       d       7       g
5       e       7       g
6       f       7       g
1       a       8       h
2       b       8       h
3       c       8       h
4       d       8       h
5       e       8       h
6       f       8       h
1       a       9       i
2       b       9       i
3       c       9       i
4       d       9       i
5       e       9       i
6       f       9       i
Time taken: 24.318 seconds, Fetched: 36 row(s)
hive (mydb)>
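
The last query pairs every row of u1 with every row of u2. The same pairing can be sketched in Python with the standard library's `itertools.product`. As before, the table contents are inferred from the results above, and the code only models the result set, not Hive's execution.

```python
from itertools import product

# Table contents inferred from the query output in this post.
u1 = [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e"), (6, "f")]
u2 = [(4, "d"), (5, "e"), (6, "f"), (7, "g"), (8, "h"), (9, "i")]

# u2 is the outer (slowest-varying) side only to mirror the row order shown above.
cross = [l + r for r, l in product(u2, u1)]

print(len(cross))  # 36
print(cross[0])    # (1, 'a', 4, 'd')
```

The result size is the product of the two table sizes, which is why the strict check hive.strict.checks.cartesian.product (true in this session until it was switched off) rejects such a query.
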


Added: 2022-01-14 02:03:13  Updated: 2022-01-14 02:04:40
 