[FLINK-37278] Optimize regular schema evolution topology's performance #3912
Would @hiliuxg like to take a look?
Shawn-Hx
left a comment
It seems that SchemaChangeResponse#ResponseCode can only be SUCCESS now. Can we remove SchemaChangeResponse#ResponseCode and simplify the logic in SchemaOperator#handleSchemaChangeEvent?
Force-pushed from 2fa29b6 to a0d9b68.
Thanks for Shawn's kind review; comments addressed.
Shawn-Hx
left a comment
LGTM
@yuxiqian @Shawn-Hx
Thanks for @gongzexin's report. IIUC, the root cause of this problem is [...] I wonder if we can handle it in another PR, and focus on modifying the schema evolution request queueing logic here?
lvyanquan
left a comment
LGTM.
Hi, @leonardBang @ruanhang1993, could you take a look at this?
…'s performance This closes apache#3912.
@yuxiqian I have a table that I want to delete and resynchronize with both full and incremental data. The schema recorded by the Schema Manager is not automatically deleted when the table is deleted, so the new table creation statement is skipped. What should I do?
This closes FLINK-37278.
Currently, the regular SE (schema evolution) topology uses the following process to drain existing DataChangeEvents in the pipeline: emit a FlushEvent to downstream, then poll until the FlushSuccessEvent notifications from the Sink have arrived. As a result, every schema change request takes at least 1 second to finish, because it has to wait for at least one polling interval.
This PR replaces the polling code with a queue of pending schema change requests, which lets SchemaCoordinator manage all pending clients and effectively block them from handling upstream events. The schema evolution process can start immediately after a FlushSuccessEvent is reported, without waiting for polling requests from clients.
With this change, the run time of the testRegularTablesSourceInMultipleParallelism test case drops from ~6 minutes to ~50 seconds.
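The idea of replacing polling with a pending request queue can be illustrated with a minimal, self-contained sketch. This is not the code from this PR: the class PendingRequestSketch, its methods submit and onFlushSuccess, and the response string are all hypothetical names chosen for illustration, and the real SchemaCoordinator likely handles many more cases (failures, multiple rounds, deduplication). The sketch only shows that a parked request can be completed the moment the last flush acknowledgement arrives, with no polling interval in between.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.CompletableFuture;

/** Hypothetical sketch of a coordinator that queues requests instead of being polled. */
class PendingRequestSketch {

    /** A parked request: the change description plus the future its client waits on. */
    private static final class Pending {
        final String schemaChange;
        final CompletableFuture<String> response;

        Pending(String schemaChange, CompletableFuture<String> response) {
            this.schemaChange = schemaChange;
            this.response = response;
        }
    }

    private final int sinkParallelism;
    private final Set<Integer> flushedSinks = new HashSet<>();
    private final Queue<Pending> pending = new ArrayDeque<>();

    PendingRequestSketch(int sinkParallelism) {
        this.sinkParallelism = sinkParallelism;
    }

    /** A client submits a request and receives a future; it blocks on that future instead of polling. */
    synchronized CompletableFuture<String> submit(String schemaChange) {
        CompletableFuture<String> response = new CompletableFuture<>();
        pending.add(new Pending(schemaChange, response));
        return response;
    }

    /** Called once per sink subtask; the last acknowledgement releases all parked requests at once. */
    synchronized void onFlushSuccess(int sinkSubtask) {
        flushedSinks.add(sinkSubtask);
        if (flushedSinks.size() < sinkParallelism) {
            return; // still waiting for other sink subtasks
        }
        flushedSinks.clear();
        Pending p;
        while ((p = pending.poll()) != null) {
            // Assumption: "applying" the change is reduced to building a response string here.
            p.response.complete("applied: " + p.schemaChange);
        }
    }

    public static void main(String[] args) {
        PendingRequestSketch coordinator = new PendingRequestSketch(2);
        CompletableFuture<String> response = coordinator.submit("ADD COLUMN age");
        coordinator.onFlushSuccess(0);
        System.out.println(response.isDone()); // false: one sink subtask has not flushed yet
        coordinator.onFlushSuccess(1);
        System.out.println(response.join());   // applied: ADD COLUMN age
    }
}
```

In this shape the client's wait time is bounded by how long the sinks take to flush rather than by a fixed polling interval, which is the effect the description above attributes to the change.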