[SUPPORT] HBase connection closed exception #6509

@xicm

Describe the problem you faced

I described this problem in https://issues.apache.org/jira/browse/HUDI-3983.

I get a connection closed exception when using the HBase index. The Spark bundle uses class relocation; when I remove the relocations, the job succeeds.

I have spent a long time debugging the differences between the builds with and without relocation, but found nothing.

To Reproduce

Steps to reproduce the behavior:

Because the Spark bundle uses relocation, the following conf causes a ClassNotFoundException,
so comment out the listener class in hudi-common/src/main/resources/hbase-site.xml.

  <property>
    <name>hbase.status.listener.class</name>
    <value>org.apache.hadoop.hbase.client.ClusterStatusListener$MulticastListener</value>
    <description>
      Implementation of the status listener with a multicast message.
    </description>
  </property>

Add this conf so that the error message appears quickly, without waiting for retries.

  <property>
    <name>hbase.client.retries.number</name>
    <value>0</value>
  </property>

  1. Add org.apache.hbase.thirdparty:hbase-shaded-gson to packaging/hudi-spark-bundle/pom.xml.
  2. Build the bundle: mvn clean package -DskipTests -Dspark3.1 -Dscala-2.12 -Dhadoop.version=3.3.0 -Dhive.version=3.1.2
  3. Write data with the HBase index:
    df.write.format("org.apache.hudi").
      options(getQuickstartWriteConfigs).
      option(PRECOMBINE_FIELD.key, "ts").
      option(RECORDKEY_FIELD.key, "uuid").
      option(PARTITIONPATH_FIELD.key, "partitionpath").
      option(TBL_NAME.key, tableName).
      option(TABLENAME.key(), tableName).
      option(INDEX_TYPE.key, "HBASE").
      option(ZKQUORUM.key, "${hbase.zookeeper.quorum}").
      option(ZKPORT.key, "2181").
      option(ZK_NODE_PATH.key, "${zookeeper.znode.parent}").
      option("hoodie.metadata.index.column.stats.enable", "true").
      option("hoodie.embed.timeline.server", "false").
      mode(Overwrite).
      save(tablePath)
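
Step 1 above amounts to adding the artifact to the bundle's shade include list. A minimal sketch, assuming the bundle pom lists shaded artifacts under the `<artifactSet><includes>` element of maven-shade-plugin; the surrounding elements are shown only for orientation and may differ in your checkout:

```xml
<!-- packaging/hudi-spark-bundle/pom.xml (sketch; surrounding structure is assumed) -->
<artifactSet>
  <includes>
    <!-- ... existing includes stay unchanged ... -->
    <include>org.apache.hbase.thirdparty:hbase-shaded-gson</include>
  </includes>
</artifactSet>
```

Only the `<include>` line comes from the step itself; no relocation rules are changed by this fragment.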

Environment Description

  • Hudi version :

  • Spark version : 3.1.1

  • Hive version : 3.1.2

  • Hadoop version : 3.3.0

  • Storage (HDFS/S3/GCS..) : HDFS

  • Running on Docker? (yes/no) : no

Stacktrace

org.apache.hudi.org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=1, exceptions:
2022-08-26T07:12:57.603Z, RpcRetryingCaller{globalStartTime=2022-08-26T07:12:56.651Z, pause=100, maxAttempts=1}, org.apache.hudi.org.apache.hadoop.hbase.exceptions.ConnectionClosedException: Call to address=x.x.x.x/x.x.x.x:16020 failed on local exception: org.apache.hudi.org.apache.hadoop.hbase.exceptions.ConnectionClosedException: Connection closed
    at org.apache.hudi.org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:146)
    at org.apache.hudi.org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:80)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hudi.org.apache.hadoop.hbase.exceptions.ConnectionClosedException: Call to address=x.x.x.x/x.x.x.x:16020 failed on local exception: org.apache.hudi.org.apache.hadoop.hbase.exceptions.ConnectionClosedException: Connection closed
    at org.apache.hudi.org.apache.hadoop.hbase.ipc.IPCUtil.wrapException(IPCUtil.java:214)
    at org.apache.hudi.org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:384)
    at org.apache.hudi.org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$100(AbstractRpcClient.java:89)
    at org.apache.hudi.org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:415)
    at org.apache.hudi.org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:411)
    at org.apache.hudi.org.apache.hadoop.hbase.ipc.Call.callComplete(Call.java:118)
    at org.apache.hudi.org.apache.hadoop.hbase.ipc.Call.setException(Call.java:133)
    at org.apache.hudi.org.apache.hadoop.hbase.ipc.NettyRpcDuplexHandler.cleanupCalls(NettyRpcDuplexHandler.java:203)
    at org.apache.hudi.org.apache.hadoop.hbase.ipc.NettyRpcDuplexHandler.channelInactive(NettyRpcDuplexHandler.java:211)
    at org.apache.hudi.org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262)
    at org.apache.hudi.org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248)
    at org.apache.hudi.org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241)
    at org.apache.hudi.org.apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:389)
    at org.apache.hudi.org.apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:354)
    at org.apache.hudi.org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262)
    at org.apache.hudi.org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248)
    at org.apache.hudi.org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241)
    at org.apache.hudi.org.apache.hbase.thirdparty.io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:81)
    at org.apache.hudi.org.apache.hbase.thirdparty.io.netty.handler.timeout.IdleStateHandler.channelInactive(IdleStateHandler.java:277)
    at org.apache.hudi.org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262)
    at org.apache.hudi.org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248)
    at org.apache.hudi.org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241)
    at org.apache.hudi.org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1405)
    at org.apache.hudi.org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262)
    at org.apache.hudi.org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248)
    at org.apache.hudi.org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:901)
    at org.apache.hudi.org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:831)
    at org.apache.hudi.org.apache.hbase.thirdparty.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
    at org.apache.hudi.org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
    at org.apache.hudi.org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:497)
    at org.apache.hudi.org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
    at org.apache.hudi.org.apache.hbase.thirdparty.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at org.apache.hudi.org.apache.hbase.thirdparty.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    ... 1 more
Caused by: org.apache.hudi.org.apache.hadoop.hbase.exceptions.ConnectionClosedException: Connection closed
    ... 26 more
