Skip to content

[Fix][Connector-Jdbc]prevent duplicate XA XID in exactly-once writer and rollback prepared tx on begin failure#10459

Open
yzeng1618 wants to merge 3 commits intoapache:devfrom
yzeng1618:dev-XA-lock
Open

[Fix][Connector-Jdbc]prevent duplicate XA XID in exactly-once writer and rollback prepared tx on begin failure#10459
yzeng1618 wants to merge 3 commits intoapache:devfrom
yzeng1618:dev-XA-lock

Conversation

@yzeng1618
Copy link
Collaborator

Purpose of this pull request

Fixes #10449
This PR fixes XA transaction stability issues in JDBC exactly-once writer and a related multi-table abort routing issue.

Does this PR introduce any user-facing change?

Yes.

Previous behavior:

  • In JDBC exactly-once + multi-table scenarios, rapid XA transaction transitions could trigger duplicate XID errors (e.g. XAER_DUPID).
  • When beginTx failed after a successful prepare, prepared XA transactions could remain and hold locks.
  • Multi-table abort routing could miss per-table abort info due to key type mismatch.

New behavior:

  • Duplicate XID caused by rapid same-writer transaction creation is prevented via monotonic tx-id allocation.
  • Prepared XA transaction is rolled back if opening next XA transaction fails.
  • Abort routing for multi-table committer correctly dispatches per-table commit info.

How was this patch tested?

Added tests:

  • JdbcExactlyOnceSinkWriterTest
    • testPrepareCommitWithSameCheckpointGeneratesMonotonicTxIds
    • testPrepareCommitRollbackPreparedXidWhenStartNextTxFailed
    • testAbortPrepareRollbackPreparedAndCurrentTransaction
    • testCloseRollbackPreparedAndCurrentTransaction
  • MultiTableSinkCommitterTest
    • testRouteByTableIdentifierForCommitAndAbort

Check list

@DanielCarter-stack
Copy link

Issue 1: Test constructor visibility issue

Location: JdbcExactlyOnceSinkWriter.java:108-124

JdbcExactlyOnceSinkWriter(
        SinkWriter.Context sinkcontext,
        JobContext context,
        List<JdbcSinkState> states,
        XaFacade xaFacade,
        XaGroupOps xaGroupOps,
        XidGenerator xidGenerator,
        JdbcOutputFormat<SeaTunnelRow, JdbcBatchStatementExecutor<SeaTunnelRow>> outputFormat) {
    // ... initialization code
}

Issue Description:

  • The newly added package-visible constructor is intended to facilitate unit testing, which is reasonable.
  • However, this constructor does not call xidGenerator.open(), which may cause some logic dependent on open() initialization to fail.

Potential Risks:

  • If the open() method of SemanticXidGenerator initializes critical resources (such as gtridBuffer and bqualBuffer), using this constructor will leave these resources uninitialized.
  • The tests use a mocked XidGenerator, so the tests pass, but there will be issues if used in production code.

Impact Scope:

  • Only affects the constructor used for unit testing.
  • Production code uses the public constructor and is not affected.

Severity: MINOR

Suggestion:

JdbcExactlyOnceSinkWriter(
        SinkWriter.Context sinkcontext,
        JobContext context,
        List<JdbcSinkState> states,
        XaFacade xaFacade,
        XaGroupOps xaGroupOps,
        XidGenerator xidGenerator,
        JdbcOutputFormat<SeaTunnelRow, JdbcBatchStatementExecutor<SeaTunnelRow>> outputFormat) {
    this.sinkcontext = sinkcontext;
    this.context = context;
    this.recoverStates = states;
    this.connectionProvider = xaFacade;
    this.xaFacade = xaFacade;
    this.xaGroupOps = xaGroupOps;
    this.xidGenerator = xidGenerator;
    this.outputFormat = outputFormat;
    // Add this line to ensure initialization
    this.xidGenerator.open();
}

Rationale: Ensure consistent behavior between the test constructor and production constructor.

Issue 2: Initialization value of lastGeneratedTxId may cause early ID range waste

Location: JdbcExactlyOnceSinkWriter.java:72

private transient long lastGeneratedTxId = Long.MIN_VALUE;

Issue Description:

  • Using Long.MIN_VALUE as the initial value, the first call to nextTxId will set it to txIdHint (usually the current timestamp).
  • If txIdHint is less than Long.MIN_VALUE (almost impossible), it will trigger the overflow check.

Potential Risks:

  • Not an actual bug, but code readability is slightly poor.
  • Using 0 or -1 as the initial value would be more intuitive.

Impact Scope:

  • Code readability
  • Does not affect functional correctness

Severity: MINOR

Suggestion:

private transient long lastGeneratedTxId = -1;  // Use -1 to indicate uninitialized

Rationale: -1 is more intuitive than Long.MIN_VALUE, and equally ensures the first txId is used.

Issue 3: MultiTableSinkCommitter's commit method has the same issue

Location: MultiTableSinkCommitter.java:37-60

Issue Description:

  • The PR description mentions fixing the routing issue in the abort method, but the commit method in the original code actually has the same issue (using wrong type matching).
  • This PR also fixes the commit method, but the PR description does not explicitly mention it.

Potential Risks:

  • If only reading the PR description, one might overlook the fix in the commit method.
  • However, the code is actually correct, only the description is not complete enough.

Impact Scope:

  • Documentation completeness
  • Does not affect functional correctness

Severity: MINOR

Suggestion:
Explicitly mention in the PR description that the commit method also fixes the same issue.

Issue 4: Missing integration tests for empty transaction scenarios

Location: JdbcExactlyOnceSinkWriterTest.java

Issue Description:

  • Existing tests cover normal transaction prepare and rollback scenarios.
  • However, there is a lack of dedicated tests for EmptyXaTransactionException scenarios (i.e., the emptyXaTransaction = true branch in prepareCommit).

Potential Risks:

  • The empty transaction branch has relatively complex logic and should have dedicated test coverage.
  • In current tests, the test where beginTx fails after prepareCommit succeeds does not cover empty transaction cases.

Impact Scope:

  • Test coverage
  • Potential edge case bugs

Severity: MINOR

Suggestion:
Add test case:

@Test
void testPrepareCommitWithEmptyTransactionDontRollback() throws Exception {
    TestContext context = createWriter();
    
    // Simulate prepare returning empty transaction
    when(context.xaFacade.endAndPrepare(any()))
        .thenThrow(new XaFacade.EmptyXaTransactionException(mock(Xid.class)));
    
    // Simulate beginTx failure
    doThrow(new RuntimeException("start tx failed"))
        .when(context.xaFacade)
        .start(any());
    
    Assertions.assertThrows(
        JdbcConnectorException.class, () -> context.writer.prepareCommit(10L));
    
    // Verify rollback is not called (because it's an empty transaction)
    verify(context.xaFacade, never()).rollback(any());
}

@yzeng1618
Copy link
Collaborator Author

The relevant content has been optimized in accordance with the suggestions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] [ORACLE-CDC] when mutisink oracle tables with more than one schema to sink to mysql ,mysql get XA lock

2 participants