Hive

Q: Hive execution reports an error:

'org.apache.hadoop.yarn.exceptions.YarnRuntimeException(java.lang.InterruptedException: sleep interrupted)'

org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.InterruptedException: sleep interrupted

       at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:339)

       at org.apache.hadoop.mapred.ClientServiceDelegate.getTaskReports(ClientServiceDelegate.java:444)

       at org.apache.hadoop.mapred.YARNRunner.getTaskReports(YARNRunner.java:572)

       at org.apache.hadoop.mapreduce.Job$3.run(Job.java:543)

       at org.apache.hadoop.mapreduce.Job$3.run(Job.java:541)

       at java.security.AccessController.doPrivileged(Native Method)

       at javax.security.auth.Subject.doAs(Subject.java:415)

       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)

       at org.apache.hadoop.mapreduce.Job.getTaskReports(Job.java:541)

       at org.apache.hadoop.mapred.JobClient.getTaskReports(JobClient.java:639)

       at org.apache.hadoop.mapred.JobClient.getMapTaskReports(JobClient.java:629)

       at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:259)

       at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:547)

       at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:426)

       at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:136)

       at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)

       at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)

       at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:72)

// Error stack trace 2

java.io.IOException: Broken pipe

       at sun.nio.ch.FileDispatcherImpl.write0(Native Method)

       at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)

       at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)

       at sun.nio.ch.IOUtil.write(IOUtil.java:65)

       at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487)

       at org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:63)

       at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)

       at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159)

       at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117)

       at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)

       at java.io.DataOutputStream.write(DataOutputStream.java:107)

       at org.apache.hadoop.hdfs.DFSOutputStream$Packet.writeTo(DFSOutputStream.java:285)

       at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:591)

2015-08-10 02:46:18,127 WARN  [Thread-13]: hdfs.DFSClient (DFSOutputStream.java:waitForAckedSeqno(2074)) - Slow waitForAckedSeqno took 73275ms (threshold=30000ms)

2015-08-10 02:46:18,140 WARN  [Thread-10]: hdfs.DFSClient (DFSOutputStream.java:waitForAckedSeqno(2074)) - Slow waitForAckedSeqno took 73884ms (threshold=30000ms)

2015-08-10 02:46:18,190 WARN  [DataStreamer for file /tmp/hadoop-yarn/staging/hhive/.staging/job_1439027917379_19257/job.jar block BP-1797264656-192.168.4.128-1431244532842:blk_1094532259_20796961]: hdfs.DFSClient (DFSOutputStream.java:run(639)) - DataStreamer Exception

A: Problems like this are usually caused by the DataNode (DN) or NameNode (NN) being under too much load to respond in time. They can be mitigated by increasing the following parameters (in hdfs-site.xml):

dfs.datanode.handler.count  # increase: raises the number of DN server threads, improving the DN's capacity to accept requests and process commands
dfs.namenode.handler.count  # increase: raises the number of NN server threads, improving its capacity to handle RPC requests
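A minimal hdfs-site.xml sketch of the two settings above; the values shown are illustrative starting points (the defaults for both are 10), not recommendations from the original post, and should be tuned to cluster size:

```xml
<!-- hdfs-site.xml: illustrative values; tune for your cluster -->
<property>
  <name>dfs.datanode.handler.count</name>
  <value>20</value> <!-- default 10; more DN server threads -->
</property>
<property>
  <name>dfs.namenode.handler.count</name>
  <value>64</value> <!-- default 10; more NN RPC handler threads -->
</property>
```

Both daemons must be restarted for the new values to take effect.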

Q: File operation fails

java.io.EOFException: Premature EOF: no length prefix available

       at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2109)

       at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:176)

A: The file's lease expired during the operation, i.e. the file was deleted while the DataStreamer was still working on it. This can be mitigated by tuning the following hdfs-site.xml parameter:

dfs.datanode.max.transfer.threads  # increase: raises the number of concurrent data-transfer threads a DN can serve
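A sketch of the corresponding hdfs-site.xml entry; 8192 is an illustrative value (the default is 4096), not a figure from the original post:

```xml
<property>
  <name>dfs.datanode.max.transfer.threads</name>
  <value>8192</value> <!-- default 4096; max concurrent transfer threads per DN -->
</property>
```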

Q: Connection reset by peer

java.io.IOException: Connection reset by peer

       at sun.nio.ch.FileDispatcherImpl.write0(Native Method)

       at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)

       at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)

       at sun.nio.ch.IOUtil.write(IOUtil.java:65)

       at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487)

       at org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:63)

       at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)

       at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159)

       at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)

       at java.io.DataOutputStream.flush(DataOutputStream.java:123)

       at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.sendAckUpstreamUnprotected(BlockReceiver.java:1396)

       at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.sendAckUpstream(BlockReceiver.java:1335)

       at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:1256)

       at java.lang.Thread.run(Thread.java:745)

A: The fix is to improve NN performance by tuning the following hdfs-site.xml parameters:

dfs.namenode.handler.count  # increase: number of NN server threads for handling RPC requests
dfs.namenode.replication.interval  # decrease: interval, in seconds, at which the NN periodically recomputes DN replication status
dfs.client.failover.connection.retries  # consider increasing: expert setting; number of IPC client connection retries; raise it on unstable networks
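The three settings above can be sketched in hdfs-site.xml as follows; the values are illustrative assumptions (defaults: 10, 3 s, and 0 respectively), not figures from the original post:

```xml
<property>
  <name>dfs.namenode.handler.count</name>
  <value>64</value> <!-- default 10; NN RPC handler threads -->
</property>
<property>
  <name>dfs.namenode.replication.interval</name>
  <value>1</value> <!-- seconds; default 3; recompute DN replication more often -->
</property>
<property>
  <name>dfs.client.failover.connection.retries</name>
  <value>3</value> <!-- default 0; IPC client connection retries on failover -->
</property>
```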

Q: Socket connection timeout

java.io.IOException: Bad response ERROR for block BP-1797264656-192.168.4.128-1431244532842:blk_1094409843_20674430 from datanode 192.168.4.118:50010

       at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:840)

java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/192.168.16.70:50010 remote=/192.168.4.143:52416]

       at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)

       at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)

       at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)

       at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)

       at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)

       at java.io.BufferedInputStream.read(BufferedInputStream.java:334)

       at java.io.DataInputStream.read(DataInputStream.java:149)

       at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:192)

       at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)

       at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)

       at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)

       at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:468)

       at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:772)

       at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:724)

       at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:126)

       at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:72)

       at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226)

       at java.lang.Thread.run(Thread.java:745)

A: Mitigate by tuning the following DN-related parameters:

dfs.datanode.socket.write.timeout  # increase: timeout for writing data to a DataNode
dfs.client.socket-timeout  # increase: timeout for network communication between DFS clients and the cluster
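A sketch of the two timeouts in hdfs-site.xml; the values are illustrative (the default client timeout of 60000 ms matches the "60000 millis timeout" in the log above, and the default write timeout is 480000 ms):

```xml
<property>
  <name>dfs.client.socket-timeout</name>
  <value>120000</value> <!-- milliseconds; default 60000 -->
</property>
<property>
  <name>dfs.datanode.socket.write.timeout</name>
  <value>960000</value> <!-- milliseconds; default 480000 -->
</property>
```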

Author: Semon
License: Unless otherwise noted, all posts on this blog are licensed under CC BY 4.0. Please credit Semon when reposting.