2022-05-18 20:19:55.197 INFO 4772 --- [ Thread-70] org.apache.hadoop.hdfs.DFSClient : Exception in createBlockOutputStream
java.net.ConnectException: Connection timed out: no further information
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[na:1.8.0_212]
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[na:1.8.0_212]
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) ~[hadoop-common-2.7.4.jar:na]
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531) ~[hadoop-common-2.7.4.jar:na]
at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1533) ~[hadoop-hdfs-2.7.3.jar:na]
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1309) [hadoop-hdfs-2.7.3.jar:na]
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1262) [hadoop-hdfs-2.7.3.jar:na]
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:448) [hadoop-hdfs-2.7.3.jar:na]
2022-05-18 20:19:55.199 INFO 4772 --- [ Thread-70] org.apache.hadoop.hdfs.DFSClient : Abandoning BP-754408308-172.17.9.73-1431769234492:blk_1073741856_1032
2022-05-18 20:19:55.234 INFO 4772 --- [ Thread-70] org.apache.hadoop.hdfs.DFSClient : Excluding datanode DatanodeInfoWithStorage[172.17.0.2:50010,DS-ded577c3-21b1-4374-bae5-31e7bcd3ca07,DISK]
2022-05-18 20:19:55.263 WARN 4772 --- [ Thread-70] org.apache.hadoop.hdfs.DFSClient : DataStreamer Exception
org.apache.hadoop.ipc.RemoteException: File /test/video could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1550)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3067)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:722)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:492)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
at org.apache.hadoop.ipc.Client.call(Client.java:1476) ~[hadoop-common-2.7.4.jar:na]
at org.apache.hadoop.ipc.Client.call(Client.java:1413) ~[hadoop-common-2.7.4.jar:na]
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229) ~[hadoop-common-2.7.4.jar:na]
at com.sun.proxy.$Proxy119.addBlock(Unknown Source) ~[na:na]
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418) ~[hadoop-hdfs-2.7.3.jar:na]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_212]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_212]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_212]
at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_212]
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191) ~[hadoop-common-2.7.4.jar:na]
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) ~[hadoop-common-2.7.4.jar:na]
at com.sun.proxy.$Proxy120.addBlock(Unknown Source) ~[na:na]
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1455) ~[hadoop-hdfs-2.7.3.jar:na]
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1251) ~[hadoop-hdfs-2.7.3.jar:na]
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:448) ~[hadoop-hdfs-2.7.3.jar:na]
2022-05-18 20:19:55.287 ERROR 4772 --- [nio-8080-exec-3] o.a.c.c.C.[.[.[/].[dispatcherServlet] : Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception
org.apache.hadoop.ipc.RemoteException: File /test/video could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
... (stack trace identical to the DataStreamer exception above)
The problem
I recently used Docker to deploy a Hadoop setup for development and testing. While using it, HDFS uploads kept failing with the errors shown above.
Reading the errors, the first is a connection timeout.
The second is:
could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
The write needs at least one datanode, but none is available: the only running datanode has been excluded after the connection timeout.
Inspecting the test file itself turned up nothing wrong.
Looking at the log again:
org.apache.hadoop.hdfs.DFSClient : Excluding datanode DatanodeInfoWithStorage[172.17.0.2:50010,DS-ded577c3-21b1-4374-bae5-31e7bcd3ca07,DISK]
The datanode address the client tries to connect to is wrong. Some digging showed that 172.17.0.2 is an IP Docker assigned to the container automatically at startup: the NameNode hands this internal bridge address back to the HDFS client, which cannot reach it from outside the Docker host. The fix is to take control of the container's IP.
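You can confirm where that address comes from by inspecting the container on the Docker host (a quick check; the container name hadoop is illustrative, the Go template is standard docker inspect syntax):
docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' hadoop
# prints the bridge IP, e.g. 172.17.0.2 -- the address the NameNode reports to HDFS clients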
Method 1
Use Docker's host network mode. In host mode the container shares the host's network stack and IP, so it is reachable on the LAN much like a VM in VMware's bridged mode.
Just add --net=host to the docker run command (the -p mappings become unnecessary, since the container already uses the host's ports directly):
docker run -itd --privileged=true --net=host --hostname node1 --add-host node1:192.168.41.3 sequenceiq/hadoop-docker:2.7.0 /etc/bootstrap.sh -bash
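To verify the fix, you can ask the NameNode how the datanode registered (a quick check; run it inside the container or from any configured Hadoop client):
hdfs dfsadmin -report | grep 'Name:'
# should now show the host's address, e.g. 192.168.41.3:50010, instead of 172.17.0.2:50010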
Method 2
Use bridge mode with a user-defined network.
Docker 1.9 and later support this. (Pitfall: do not give the new network the same subnet as the host's LAN. Creating it brings up a virtual interface, and an overlapping subnet breaks the host's existing connectivity, forcing you to delete the network and restart the NIC or the machine.) Create the network first, as sketched below.
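The create command did not make it into the original notes; a minimal sketch, assuming the network_my name and the 192.168.42.0/24 subnet implied by the --ip used below:
docker network create --driver bridge --subnet 192.168.42.0/24 network_my
docker network ls then lists the new bridge: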
NETWORK ID NAME DRIVER SCOPE
1fb22da7d8a3 bridge bridge local
fe259334b842 host host local
8c5971ff48d8 network_my bridge local
3aaf0356c19c none null local
Add these options at startup:
--hostname: set the container's hostname
--net: select the network
--ip: assign a fixed IP
--add-host: append an entry to the container's /etc/hosts
docker run -itd --privileged=true -p 50070:50070 -p 8088:8088 -p 50075:50075 -p 9000:9000 -p 50010:50010 -p 50020:50020 --net network_my --ip 192.168.42.3 --hostname node1 --add-host node1:192.168.42.3 sequenceiq/hadoop-docker:2.7.0 /etc/bootstrap.sh -bash
The benefit is that when I add Zookeeper, HBase, and so on, each can run in its own container on this network and resolve the others through the configured hosts entries; a sketch follows below. The drawback is that in this mode the container has no public IP: it is not on the same segment as the host's eth0, so machines other than the Docker host cannot reach it directly.
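For example, a Zookeeper container could join the same network (the image tag, zk1 hostname, and .4 address are illustrative, not from the original):
docker run -d --net network_my --ip 192.168.42.4 --hostname zk1 --add-host node1:192.168.42.3 zookeeper:3.4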
If the Java application and Hadoop are not on the same machine, the error above will therefore still occur. The solution is to add a route between the two hosts.
Adding a route
ip route add <subnet of the remote containers>/<prefix> via <IP of the remote Docker host> dev <interface to route through>
ip route add 192.168.42.0/24 via 192.168.41.3 dev eno0
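Run this on the machine hosting the Java application; the Docker host 192.168.41.3 then forwards traffic into its 192.168.42.0/24 container network. A quick verification (addresses follow the example above; note that an ip route add does not survive a reboot unless persisted in the distro's network configuration):
ip route show | grep 192.168.42.0   # the route is installed
ping -c 3 192.168.42.3              # the datanode container answers from the app host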