```diff
@@ -171,13 +171,15 @@ public static void main(String[] args) {
       System.exit(1);
     }
     final JavaSparkContext jsc = UtilHelpers.buildSparkContext("compactor-" + cfg.tableName, cfg.sparkMaster, cfg.sparkMemory);
+    int ret = 0;
     try {
       HoodieCompactor compactor = new HoodieCompactor(jsc, cfg);
-      compactor.compact(cfg.retry);
+      ret = compactor.compact(cfg.retry);
     } catch (Throwable throwable) {
       LOG.error("Fail to run compaction for " + cfg.tableName, throwable);
     } finally {
       jsc.stop();
+      System.exit(ret);
```

Contributor

Why do we need to call System.exit here?

Contributor Author

Thank you for your review. Here you can find out why: https://issues.apache.org/jira/browse/HUDI-3945
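To illustrate the kind of hang HUDI-3945 describes, here is a minimal standalone sketch (a hypothetical `ExitDemo`, not Hudi code); it assumes the JVM is kept alive by a lingering non-daemon thread, which is only an assumption about the root cause:

```java
// Hypothetical, self-contained illustration: a non-daemon thread keeps the
// JVM alive after main() returns, so the process never exits on its own.
public class ExitDemo {
  public static void main(String[] args) {
    Thread lingering = new Thread(() -> {
      try {
        Thread.sleep(Long.MAX_VALUE); // stands in for a leftover worker thread
      } catch (InterruptedException ignored) {
        // nothing to clean up in this sketch
      }
    });
    // lingering.setDaemon(true) would let the JVM exit normally instead.
    lingering.start();

    System.out.println("work done");
    // Without an explicit exit, this process hangs because the non-daemon
    // thread above is still alive; System.exit(0) forces termination.
    System.exit(0);
  }
}
```

That is the situation `System.exit(ret)` is meant to cover: force the driver JVM to terminate once compaction has finished, and surface the compaction status as the process exit code.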

Contributor

Hi,
We recently hit an issue in HoodieCompactor which might be related to this PR.

This is the exception log from our application; it shows `ApplicationMaster: Final app status: FAILED, exitCode: 16, (reason: Shutdown hook called before final status was reported.)`:

```
...
22/12/19 02:29:49 INFO EmbeddedTimelineService: Closed Timeline server
22/12/19 02:29:49 INFO AbstractConnector: Stopped Spark@24872375{HTTP/1.1, (http/1.1)}{0.0.0.0:8090}
22/12/19 02:29:49 INFO SparkUI: Stopped Spark web UI at http://xxxxx:8090
22/12/19 02:29:49 INFO YarnClusterSchedulerBackend: Shutting down all executors
22/12/19 02:29:49 INFO YarnSchedulerBackend$YarnDriverEndpoint: Asking each executor to shut down
22/12/19 02:29:49 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
22/12/19 02:29:49 INFO BlockManager: BlockManager stopped
22/12/19 02:29:49 INFO BlockManagerMaster: BlockManagerMaster stopped
22/12/19 02:29:49 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
22/12/19 02:29:49 INFO SparkContext: Successfully stopped SparkContext
22/12/19 02:29:49 INFO ApplicationMaster: Final app status: FAILED, exitCode: 16, (reason: Shutdown hook called before final status was reported.)
22/12/19 02:29:49 INFO ApplicationMaster: Unregistering ApplicationMaster with FAILED (diag message: Shutdown hook called before final status was reported.)
22/12/19 02:29:49 INFO Metrics: Stopping the metrics reporter...
22/12/19 02:29:50 INFO AMRMClientImpl: Waiting for application to be successfully unregistered.
22/12/19 02:29:50 INFO ApplicationMaster: Deleting staging directory hdfs://xxxxx
22/12/19 02:29:50 INFO ShutdownHookManager: Shutdown hook called
22/12/19 02:29:50 INFO ShutdownHookManager: Deleting directory /xxxxx
```

After troubleshooting, I think this error is caused by this line, which is an inappropriate way to shut down a Spark application in YARN cluster mode (references in other repos: broadinstitute/gatk#3400, sparklyr/sparklyr#1903).

I have also tried to reproduce https://issues.apache.org/jira/browse/HUDI-3945 using the approach mentioned in the ticket description (it says HoodieCompactor can't exit after compaction), but I can't reproduce it in my environment.

May I ask if there is a clear way to reproduce HUDI-3945? If it can't be reproduced, I think we can remove this line.
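
If the hang cannot be reproduced, one possible middle ground (just a sketch, assuming `compact(...)` returns an int status as in this PR) would be to keep the exit-code propagation but only force an exit on failure, so a successful run lets the driver JVM shut down normally and the ApplicationMaster can report its final status before any shutdown hook runs:

```java
// Sketch only: avoid System.exit(0) on the success path.
int ret = 0;
try {
  HoodieCompactor compactor = new HoodieCompactor(jsc, cfg);
  ret = compactor.compact(cfg.retry);
} catch (Throwable throwable) {
  LOG.error("Fail to run compaction for " + cfg.tableName, throwable);
  ret = 1;
} finally {
  jsc.stop();
  if (ret != 0) {
    // Only force termination when compaction failed; on success, let the JVM
    // exit normally so YARN can record the final status first.
    System.exit(ret);
  }
}
```

Note this would still not address a success-path hang like the one HUDI-3945 describes, so it only helps if that hang indeed cannot be reproduced.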

Collaborator

Seeing a similar case to the above exception as well.

```diff
     }
   }
```