I. Installing Sqoop
1.1 Download and extract
1) Download URL: http://mirrors.hust.edu.cn/apache/sqoop/1.4.6/
wget http://mirrors.hust.edu.cn/apache/sqoop/1.4.6/sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz
mkdir -p /usr/local/sqoop
tar -zxvf sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz -C /usr/local/sqoop --strip-components=1
1.2 Edit the configuration files
1) Rename the configuration file
$ mv /usr/local/sqoop/conf/sqoop-env-template.sh /usr/local/sqoop/conf/sqoop-env.sh
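After the rename, sqoop-env.sh usually still needs to be told where Hadoop and Hive live. The lines below are a minimal sketch of what could be added to that file. The variable names come from the Sqoop template, but the paths are assumptions (/usr/local/hadoop in particular is hypothetical), so adjust them to your own layout.

```shell
# Sketch of entries for /usr/local/sqoop/conf/sqoop-env.sh.
# The install locations below are assumed; replace them with your own.
export HADOOP_COMMON_HOME=/usr/local/hadoop
export HADOOP_MAPRED_HOME=/usr/local/hadoop
export HIVE_HOME=/usr/local/hive
```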
2) Add environment variables
vi /etc/profile
SQOOP_HOME=/usr/local/sqoop
PATH=$PATH:$SQOOP_HOME/bin
export PATH SQOOP_HOME
source /etc/profile
1.3 Copy the JDBC driver
cp /usr/local/hive/lib/mysql-connector-java-8.0.15.jar /usr/local/sqoop/lib/
1.4 Verify Sqoop
Run a sqoop command to verify the installation:
sqoop help
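II. Basic Sqoop Usage
2.1 Importing data
In Sqoop, "import" means transferring data from a non-big-data system (an RDBMS) into the big data cluster (HDFS, Hive, HBase). It is done with the import keyword.
2.1.1 From RDBMS to HDFS
1) Import an entire table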
$ bin/sqoop import \
--connect jdbc:mysql://hostname:3306/test \
--username root \
--password password \
--table staff \
--target-dir /user/test \
--delete-target-dir \
--num-mappers 1 \
--fields-terminated-by "\t"
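The command above passes the database password in plain text on the command line, where it ends up in the shell history and the process list. Sqoop also accepts a --password-file option (or -P to prompt on the console). The sketch below prepares such a file; the file name and location are hypothetical, and the password value is the placeholder used in this article.

```shell
# Hypothetical location for the password file; pick any path you prefer.
PWD_FILE="$HOME/.sqoop-password"

# Remove a previous copy first, because a read-only file cannot be overwritten.
rm -f "$PWD_FILE"

# printf '%s' writes the password without a trailing newline,
# which would otherwise become part of the password.
printf '%s' 'password' > "$PWD_FILE"

# Restrict the file to its owner.
chmod 400 "$PWD_FILE"

# The import can then replace "--password password" with (not executed here):
#   --password-file "file://$PWD_FILE"
```

The file:// prefix points Sqoop at the local file system. Without it the path is resolved against the default file system, which is normally HDFS.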
2) Import the result of a query
$ bin/sqoop import \
--connect jdbc:mysql://hostname:3306/test \
--username root \
--password password \
--target-dir /user/test \
--delete-target-dir \
--num-mappers 1 \
--fields-terminated-by "\t" \
--query 'select name,sex from staff where id <=1 and $CONDITIONS;'
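The query must contain the literal token $CONDITIONS, which Sqoop replaces with its own split predicate. That is why the query above is wrapped in single quotes. If double quotes are used instead, the dollar sign has to be escaped so the shell does not expand it to an empty string. The following sketch, which only uses local shell variables named q1 and q2 for illustration, shows that both spellings hand the same text to Sqoop.

```shell
# Single quotes: the shell passes $CONDITIONS through unchanged.
q1='select name,sex from staff where id <= 1 and $CONDITIONS'

# Double quotes: escape the dollar sign to keep the token literal.
q2="select name,sex from staff where id <= 1 and \$CONDITIONS"

# Print both strings to compare what Sqoop would receive.
printf '%s\n' "$q1"
printf '%s\n' "$q2"
```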
3) Import specific columns
$ bin/sqoop import \
--connect jdbc:mysql://hostname:3306/test \
--username root \
--password password \
--target-dir /user/test \
--delete-target-dir \
--num-mappers 1 \
--fields-terminated-by "\t" \
--columns id,sex \
--table staff
4) Filter the imported rows with the Sqoop --where keyword
$ bin/sqoop import \
--connect jdbc:mysql://hostname:3306/test \
--username root \
--password password \
--target-dir /user/test \
--delete-target-dir \
--num-mappers 1 \
--fields-terminated-by "\t" \
--table staff \
--where "id=1"
2.2 Exporting data
1) From Hive/HDFS to RDBMS
$ bin/sqoop export \
--connect jdbc:mysql://hostname:3306/test \
--username root \
--password password \
--table staff \
--num-mappers 1 \
--export-dir /user/staff_hive \
--input-fields-terminated-by "\t"