[Improve][connector-elasticsearchv2]Optimize Elasticsearch source che… by CosmosNi · Pull Request #10446 · apache/seatunnel

CosmosNi · 2026-02-04T09:26:48Z

This change narrows the checkpoint lock scope in the Elasticsearch source reader so ES I/O, parsing, and sleeps happen outside the lock, reducing checkpoint delay while keeping collect atomic. It also adds a best‑effort SQL cursor close to release ES server‑side resources when SQL pagination is interrupted or completes.

Purpose of this pull request

Does this PR introduce any user-facing change?

How was this patch tested?

Check list

If any new Jar binary package adding in your PR, please add License Notice according
New License Guide
If necessary, please update the documentation to describe the new feature. https://github.com/apache/seatunnel/tree/dev/docs
If necessary, please update incompatible-changes.md to describe the incompatibility caused by this PR.
If you are contributing the connector code, please check that the following files are updated:
1. Update plugin-mapping.properties and add new connector information in it
2. Update the pom file of seatunnel-dist
3. Add ci label in label-scope-conf
4. Add e2e testcase in seatunnel-e2e
5. Update connector plugin_config

…ckpointing and SQL cursor cleanup

DanielCarter-stack · 2026-02-04T10:10:30Z

Issue 1: PIT hasMore logic change may cause infinite loop or additional queries

Location: EsRestClient.java:1072

// Before modification
boolean hasMore = docs.size() > 0 && totalHits > 0 && docs.size() < totalHits;

// After modification
boolean hasMore = !docs.isEmpty();

Related Context:

Caller: ElasticsearchSourceReader.java:205 (while (pitResult.isHasMore()))
ES official documentation: PIT pagination should determine if there is more data through docs.size() < totalHits

Problem Description:
The original logic considered pagination semantics: when the number of returned documents is less than the total, it indicates there is more data. The new logic simplifies to "continue as long as there is data," which will lead to additional invalid queries when processing the last batch of data.

Potential Risks:

Risk 1: When querying the last batch of data, !docs.isEmpty() returns true, causing the loop to execute one more time, and ES returns empty results
Risk 2: Increases unnecessary network overhead and latency
Risk 3: If ES throws an exception instead of returning empty results when there is no data, it may cause task failure

Scope of Impact:

Direct Impact: All Source tasks using PIT API
Impact Area: Single Connector (Elasticsearch)

Severity: MAJOR

Improvement Suggestions:

// Restore the original logic, or if simplification is indeed necessary, explain the reason in the PR description
boolean hasMore = !docs.isEmpty() && docs.size() < totalHits;
// If totalHits is unreliable, other judgment methods can be used, but thorough testing is required

Rationale:

The original logic is the correct pagination judgment
If the original logic has problems, it should be explained in the PR description with test evidence
There is currently no evidence that the original logic has a bug

Issue 2: PR description does not match actual changes

Location: PR description

"This change narrows the checkpoint lock scope in the Elasticsearch source reader"

Related Context:

ElasticsearchSourceReader.java:78-91: The synchronized block scope of the pollNext() method was not changed
All changes in the PR are inside the scrollSearchResult() method

Problem Description:
The PR title and description claim to "optimize checkpoint lock scope," but the actual code does not modify the scope of the checkpoint lock at all. All operations (including ES I/O, parsing, sleep) are still inside the synchronized (output.getCheckpointLock()) block.

Potential Risks:

Risk 1: Misleads reviewers and users into thinking performance issues have been resolved
Risk 2: Actual checkpoint performance issues remain unfixed
Risk 3: If users do not report performance issues because of this PR, the problem will be hidden

Scope of Impact:

Direct Impact: All tasks using Elasticsearch Source
Indirect Impact: User experience and community trust

Severity: MAJOR

Improvement Suggestions:

Correct PR description: Change the title to "[Improve][connector-elasticsearchv2] Add SQL cursor cleanup for Elasticsearch source"
Or truly optimize checkpoint lock: If you really want to optimize the lock scope, you should modify it like this:

@Override
public void pollNext(Collector<SeaTunnelRow> output) throws Exception {
    ElasticsearchSourceSplit split;
    synchronized (output.getCheckpointLock()) {
        split = splits.poll();  // Only perform state reads inside the lock
    }
    
    if (split != null) {
        // Execute I/O outside the lock
        SeaTunnelRowType seaTunnelRowType = split.getSeaTunnelRowType();
        ElasticsearchConfig sourceIndexInfo = split.getElasticsearchConfig();
        scrollSearchResult(seaTunnelRowType, sourceIndexInfo, output);
    } else if (noMoreSplit) {
        synchronized (output.getCheckpointLock()) {
            log.info("Closed the bounded ELasticsearch source");
            context.signalNoMoreElement();
        }
    } else {
        Thread.sleep(pollNextWaitTime);  // Sleep outside the lock
    }
}

However, this change requires more careful concurrent safety review, because output.collect() needs to be called inside the lock

Rationale:

Code review must be based on actual changes, not PR descriptions
Mismatch between description and implementation can cause serious issues to be overlooked

Issue 3: Timing of SQL cursor closure may cause resources not to be released

Location: ElasticsearchSourceReader.java:119-127

} finally {
    if (StringUtils.isNotEmpty(cursor)) {
        try {
            esRestClient.closeSqlCursor(cursor);
        } catch (Exception e) {
            log.warn("Failed to close SQL cursor: " + cursor, e);
        }
    }
}

Related Context:

ES SQL API documentation: cursor indicates completion when the last query returns an empty string
EsRestClient.java:421-449: In the getDocsFromSqlResponse() method, scrollId is only set when the response contains the "cursor" field

Problem Description:
Under normal circumstances, ES SQL pagination will set the cursor to an empty string (or not return the cursor field) on the last return, at which point the while loop exits. The finally block will attempt to close the last non-empty cursor, which is correct.

However, there is an edge case: if searchBySql() returns empty results on the first call (for example, the queried table is empty), the cursor is null at this time and the close operation will not be executed. But if the first query returns a cursor but no rows (unlikely, but theoretically possible), the cursor will be closed, which is the correct behavior.

Potential Risks:

Risk 1: If ES returns a cursor but no data under abnormal circumstances, the finally block will correctly close it (this is good)
Risk 2: If outputFromScrollResult() throws an exception, the cursor will be correctly closed (this is good)
Actual Verification: This logic is correct and there is no problem

Scope of Impact:

Direct Impact: SQL query scenarios
Actual Impact: Positive (fixed resource leak)

Severity: MINOR (not a problem, but needs verification)

Improvement Suggestions:
It is recommended to add test cases to verify the following scenarios:

Cursor closure upon normal completion
Cursor closure upon abnormal interruption
First query returns empty results

Rationale:

Although the logic appears correct, it needs to be verified through tests to verify the actual behavior of the ES API

Issue 4: Missing unit tests and integration tests

Location: Entire PR

Problem Description:
The PR does not contain any test code, but this is an important modification involving resource management and concurrent safety.

Potential Risks:

Risk 1: PIT hasMore logic changes are not covered by tests and may fail in certain edge cases
Risk 2: SQL cursor closure logic is not tested and cannot be verified whether it is correct under various abnormal circumstances
Risk 3: Future refactoring may break this logic

Scope of Impact:

Direct Impact: Code quality and maintainability
Indirect Impact: Users may encounter problems in production environments

Severity: MAJOR

Improvement Suggestions:
Add the following tests:

@Test
public void testSqlCursorClosedOnNormalCompletion() {
    // Test that cursor is closed when test completes normally
}

@Test
public void testSqlCursorClosedOnException() {
    // Test that cursor is closed when test throws exception
}

@Test
public void testPitHasMoreLogic() {
    // Test PIT hasMore return values in various scenarios
    // - Empty results
    // - Intermediate batch
    // - Last batch
    // - Total count is 0
}

Rationale:

Resource management logic must have test coverage
Behavior changes (PIT hasMore) must have test verification

LiJie20190102 · 2026-02-04T14:13:57Z

...c/main/java/org/apache/seatunnel/connectors/seatunnel/elasticsearch/client/EsRestClient.java

+     */
+    public boolean closeSqlCursor(String cursor) {
+        if (StringUtils.isEmpty(cursor)) {
+            log.warn("Attempted to close SQL cursor with empty cursor");


Thank you for your contribution. I don't quite understand the log printing here. You can say that an empty cursor doesn't need to be closed, What do you think?

LiJie20190102 · 2026-02-04T14:14:06Z

...c/main/java/org/apache/seatunnel/connectors/seatunnel/elasticsearch/client/EsRestClient.java

+                return false;
+            }
+        } catch (Exception ex) {
+            log.warn("Failed to close SQL cursor: " + cursor, ex);


When printing logs, it is best to standardize the format and use placeholders {}

LiJie20190102 · 2026-02-04T14:14:22Z

...rg/apache/seatunnel/connectors/seatunnel/elasticsearch/source/ElasticsearchSourceReader.java

+                    try {
+                        esRestClient.closeSqlCursor(cursor);
+                    } catch (Exception e) {
+                        log.warn("Failed to close SQL cursor: " + cursor, e);


…ckpointing and SQL cursor cleanup

zhangshenghang · 2026-02-09T13:23:29Z

Please add some tests

[Improve][connector-elasticsearchv2]Optimize Elasticsearch source che…

69fa63a

…ckpointing and SQL cursor cleanup

github-actions bot added connectors-v2 elasticsearch labels Feb 4, 2026

LiJie20190102 reviewed Feb 4, 2026

View reviewed changes

LiJie20190102 assigned CosmosNi Feb 4, 2026

[Improve][connector-elasticsearchv2]Optimize Elasticsearch source che…

b8a79df

…ckpointing and SQL cursor cleanup

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Improve][connector-elasticsearchv2]Optimize Elasticsearch source che…#10446

[Improve][connector-elasticsearchv2]Optimize Elasticsearch source che…#10446
CosmosNi wants to merge 2 commits intoapache:devfrom
CosmosNi:improve-es-code

CosmosNi commented Feb 4, 2026

Uh oh!

DanielCarter-stack commented Feb 4, 2026

Uh oh!

LiJie20190102 Feb 4, 2026

Uh oh!

LiJie20190102 Feb 4, 2026

Uh oh!

CosmosNi Feb 5, 2026

Uh oh!

LiJie20190102 Feb 4, 2026

Uh oh!

zhangshenghang commented Feb 9, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

CosmosNi commented Feb 4, 2026

Purpose of this pull request

Does this PR introduce any user-facing change?

How was this patch tested?

Check list

Uh oh!

DanielCarter-stack commented Feb 4, 2026

Issue 1: PIT hasMore logic change may cause infinite loop or additional queries

Issue 2: PR description does not match actual changes

Issue 3: Timing of SQL cursor closure may cause resources not to be released

Issue 4: Missing unit tests and integration tests

Uh oh!

LiJie20190102 Feb 4, 2026

Choose a reason for hiding this comment

Uh oh!

LiJie20190102 Feb 4, 2026

Choose a reason for hiding this comment

Uh oh!

CosmosNi Feb 5, 2026

Choose a reason for hiding this comment

Uh oh!

LiJie20190102 Feb 4, 2026

Choose a reason for hiding this comment

Uh oh!

zhangshenghang commented Feb 9, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants