Commit 6dc28a9

[FLINK-37120][cdc-connector] add ending split chunk first to avoid TaskManager oom
1 parent 858371c commit 6dc28a9

69 files changed, +710 -86 lines changed

docs/content.zh/docs/connectors/flink-sources/db2-cdc.md

Lines changed: 11 additions & 0 deletions
@@ -264,6 +264,17 @@ Db2 server.
 so it does not need to be explicitly configured 'execution.checkpointing.checkpoints-after-tasks-finish.enabled' = 'true'
 </td>
 </tr>
+<tr>
+<td>scan.incremental.snapshot.assign-ending-chunk-first.enabled</td>
+<td>optional</td>
+<td style="word-wrap: break-word;">false</td>
+<td>Boolean</td>
+<td>
+Whether to assign the ending chunk first during snapshot reading phase.<br>
+This might help reduce the risk of the TaskManager experiencing an out-of-memory (OOM) error when taking a snapshot of the largest unbounded chunk.<br>
+Experimental option, defaults to false.
+</td>
+</tr>
 </tbody>
 </table>
 </div>
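
The option description above can be illustrated with a small, self-contained sketch. It is not the connector's implementation: `Chunk`, `split_into_chunks`, and `order_for_assignment` are hypothetical names, and integer keys with a fixed chunk size are simplifying assumptions. It only shows the idea that the ending chunk has no upper bound and that, when the flag is set, this chunk is moved to the front of the assignment order. One plausible motivation (an assumption, not stated in the docs) is that rows written while the snapshot runs land in that open-ended chunk, so reading it early keeps it smaller.

```python
from typing import List, NamedTuple, Optional


class Chunk(NamedTuple):
    """Hypothetical snapshot chunk covering keys in [start, end).

    A bound of None means the chunk is open-ended on that side.
    """
    start: Optional[int]
    end: Optional[int]


def split_into_chunks(min_key: int, max_key: int, chunk_size: int) -> List[Chunk]:
    """Split a key range into chunks; the first and last chunks are unbounded."""
    boundaries = range(min_key + chunk_size, max_key + 1, chunk_size)
    chunks: List[Chunk] = []
    start: Optional[int] = None
    for boundary in boundaries:
        chunks.append(Chunk(start, boundary))
        start = boundary
    # Ending chunk: no upper bound, so it also captures keys added later.
    chunks.append(Chunk(start, None))
    return chunks


def order_for_assignment(chunks: List[Chunk], assign_ending_chunk_first: bool) -> List[Chunk]:
    """Return the order in which chunks would be handed to readers.

    With the flag enabled, the ending (unbounded) chunk is assigned first;
    otherwise the original order is kept.
    """
    if assign_ending_chunk_first and len(chunks) > 1:
        return [chunks[-1]] + chunks[:-1]
    return list(chunks)


if __name__ == "__main__":
    chunks = split_into_chunks(min_key=0, max_key=10, chunk_size=4)
    for chunk in order_for_assignment(chunks, assign_ending_chunk_first=True):
        print(chunk)
```

In an actual job the behaviour would be switched on through the connector option `scan.incremental.snapshot.assign-ending-chunk-first.enabled`, which the documentation marks as experimental and which defaults to `false`.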

docs/content.zh/docs/connectors/flink-sources/mongodb-cdc.md

Lines changed: 11 additions & 0 deletions
@@ -332,6 +332,17 @@ MongoDB change event records do not contain the message state before the update. Therefore, we can only
 <td>TIMESTAMP_LTZ(3) NOT NULL</td>
 <td>It indicates the time at which the change was made in the database. <br>If the record is read from a snapshot of the table rather than from the change stream, the value is always 0.</td>
 </tr>
+<tr>
+<td>scan.incremental.snapshot.assign-ending-chunk-first.enabled</td>
+<td>optional</td>
+<td style="word-wrap: break-word;">false</td>
+<td>Boolean</td>
+<td>
+Whether to assign the EndingChunk first during the snapshot reading phase.<br>
+This helps reduce the risk of the TaskManager running out of memory (OOM) when synchronizing the last chunk during the snapshot phase.<br>
+This is an experimental feature; it defaults to false.
+</td>
+</tr>
 </tbody>
 </table>

docs/content.zh/docs/connectors/flink-sources/mysql-cdc.md

Lines changed: 11 additions & 0 deletions
@@ -389,6 +389,17 @@ Flink SQL> SELECT * FROM orders;
 This is an experimental feature.
 </td>
 </tr>
+<tr>
+<td>scan.incremental.snapshot.assign-ending-chunk-first.enabled</td>
+<td>optional</td>
+<td style="word-wrap: break-word;">false</td>
+<td>Boolean</td>
+<td>
+Whether to assign the EndingChunk first during the snapshot reading phase.<br>
+This helps reduce the risk of the TaskManager running out of memory (OOM) when synchronizing the last chunk during the snapshot phase.<br>
+This is an experimental feature; it defaults to false.
+</td>
+</tr>
 </tbody>
 </table>
 </div>

docs/content.zh/docs/connectors/flink-sources/oracle-cdc.md

Lines changed: 11 additions & 0 deletions
@@ -422,6 +422,17 @@ Connector Options
 <td>The chunk key of table snapshot, captured tables are split into multiple chunks by a chunk key when read the snapshot of table.
 By default, the chunk key is 'ROWID'. This column must be a column of the primary key.</td>
 </tr>
+<tr>
+<td>scan.incremental.snapshot.assign-ending-chunk-first.enabled</td>
+<td>optional</td>
+<td style="word-wrap: break-word;">false</td>
+<td>Boolean</td>
+<td>
+Whether to assign the ending chunk first during snapshot reading phase.<br>
+This might help reduce the risk of the TaskManager experiencing an out-of-memory (OOM) error when taking a snapshot of the largest unbounded chunk.<br>
+Experimental option, defaults to false.
+</td>
+</tr>
 </tbody>
 </table>
 </div>

docs/content.zh/docs/connectors/flink-sources/postgres-cdc.md

Lines changed: 11 additions & 0 deletions
@@ -245,6 +245,17 @@ Connector Options
 The checkpoint LSN offsets will be committed in rolling fashion, the earliest checkpoint identifier will be committed first from the delayed checkpoints.
 </td>
 </tr>
+<tr>
+<td>scan.incremental.snapshot.assign-ending-chunk-first.enabled</td>
+<td>optional</td>
+<td style="word-wrap: break-word;">false</td>
+<td>Boolean</td>
+<td>
+Whether to assign the ending chunk first during snapshot reading phase.<br>
+This might help reduce the risk of the TaskManager experiencing an out-of-memory (OOM) error when taking a snapshot of the largest unbounded chunk.<br>
+Experimental option, defaults to false.
+</td>
+</tr>
 </tbody>
 </table>
 </div>

docs/content.zh/docs/connectors/flink-sources/sqlserver-cdc.md

Lines changed: 11 additions & 0 deletions
@@ -238,6 +238,17 @@ Connector Options
 <td>The chunk key of table snapshot, captured tables are split into multiple chunks by a chunk key when read the snapshot of table.
 By default, the chunk key is the first column of the primary key. This column must be a column of the primary key.</td>
 </tr>
+<tr>
+<td>scan.incremental.snapshot.assign-ending-chunk-first.enabled</td>
+<td>optional</td>
+<td style="word-wrap: break-word;">false</td>
+<td>Boolean</td>
+<td>
+Whether to assign the ending chunk first during snapshot reading phase.<br>
+This might help reduce the risk of the TaskManager experiencing an out-of-memory (OOM) error when taking a snapshot of the largest unbounded chunk.<br>
+Experimental option, defaults to false.
+</td>
+</tr>
 </tbody>
 </table>
 </div>

docs/content.zh/docs/connectors/pipeline-connectors/mysql.md

Lines changed: 11 additions & 0 deletions
@@ -312,6 +312,17 @@ pipeline:
 <td>Boolean</td>
 <td>Whether to treat the TINYINT(1) type as the Boolean type; defaults to true.</td>
 </tr>
+<tr>
+<td>scan.incremental.snapshot.assign-ending-chunk-first.enabled</td>
+<td>optional</td>
+<td style="word-wrap: break-word;">false</td>
+<td>Boolean</td>
+<td>
+Whether to assign the EndingChunk first during the snapshot reading phase.<br>
+This helps reduce the risk of the TaskManager running out of memory (OOM) when synchronizing the last chunk during the snapshot phase.<br>
+This is an experimental feature; it defaults to false.
+</td>
+</tr>
 </tbody>
 </table>
 </div>

docs/content/docs/connectors/flink-sources/db2-cdc.md

Lines changed: 11 additions & 0 deletions
@@ -263,6 +263,17 @@ Db2 server.
 If the flink version is greater than or equal to 1.15, the default value of 'execution.checkpointing.checkpoints-after-tasks-finish.enabled' has been changed to true,
 so it does not need to be explicitly configured 'execution.checkpointing.checkpoints-after-tasks-finish.enabled' = 'true'
 </td>
+</tr>
+<tr>
+<td>scan.incremental.snapshot.assign-ending-chunk-first.enabled</td>
+<td>optional</td>
+<td style="word-wrap: break-word;">false</td>
+<td>Boolean</td>
+<td>
+Whether to assign the ending chunk first during snapshot reading phase.<br>
+This might help reduce the risk of the TaskManager experiencing an out-of-memory (OOM) error when taking a snapshot of the largest unbounded chunk.<br>
+Experimental option, defaults to false.
+</td>
 </tr>
 </tbody>
 </table>

docs/content/docs/connectors/flink-sources/mongodb-cdc.md

Lines changed: 11 additions & 0 deletions
@@ -320,6 +320,17 @@ Connector Options
 <td style="word-wrap: break-word;">true</td>
 <td>Boolean</td>
 <td>MongoDB server normally times out idle cursors after an inactivity period (10 minutes) to prevent excess memory use. Set this option to true to prevent that. Only available when parallelism snapshot is enabled.</td>
+</tr>
+<tr>
+<td>scan.incremental.snapshot.assign-ending-chunk-first.enabled</td>
+<td>optional</td>
+<td style="word-wrap: break-word;">false</td>
+<td>Boolean</td>
+<td>
+Whether to assign the ending chunk first during snapshot reading phase.<br>
+This might help reduce the risk of the TaskManager experiencing an out-of-memory (OOM) error when taking a snapshot of the largest unbounded chunk.<br>
+Experimental option, defaults to false.
+</td>
 </tr>
 </tbody>
 </table>

docs/content/docs/connectors/flink-sources/mysql-cdc.md

Lines changed: 11 additions & 0 deletions
@@ -415,6 +415,17 @@ During a snapshot operation, the connector will query each included table to pro
 When 'use.legacy.json.format' = 'false', the data would be converted to {"key1": "value1", "key2": "value2"}, with whitespace before values and after commas preserved.
 </td>
 </tr>
+<tr>
+<td>scan.incremental.snapshot.assign-ending-chunk-first.enabled</td>
+<td>optional</td>
+<td style="word-wrap: break-word;">false</td>
+<td>Boolean</td>
+<td>
+Whether to assign the ending chunk first during snapshot reading phase.<br>
+This might help reduce the risk of the TaskManager experiencing an out-of-memory (OOM) error when taking a snapshot of the largest unbounded chunk.<br>
+Experimental option, defaults to false.
+</td>
+</tr>
 </tbody>
 </table>
 </div>
