Commit 3b3f797

author
yuezhang
committed
review
1 parent b57acdd commit 3b3f797

1 file changed

Lines changed: 12 additions & 13 deletions

rfc/rfc-56/rfc-56.md

@@ -102,18 +102,18 @@ As we know, hoodie has two ways to create and maintain markers:
102102
2. TimelineServerBasedWriteMarkers: marker operations are all handled at the timeline service which serves as a proxy
103103

104104
Therefore, for each type of marker, we must implement the corresponding marker conflict checking logic.
105-
Here we expand the existing `ConflictResolutionStrategy` interface to ensure the scalability of checking marker conflict.
105+
Here we design a new interface `HoodieEarlyConflictDetectionStrategy` to keep marker conflict checking extensible.
106106

107107
![](flow1.png)
108108

109-
In this design, we provide `DirectMarkerWithTransactionConflictResolutionStrategy` and
110-
`SimpleDirectMarkerConflictResolutionStrategy` for DirectWriteMarkers to perform corresponding conflict detection and
111-
conflict resolution. And we provide `AsyncTimelineMarkerConflictResolutionStrategy` for TimelineServerBasedWriteMarkers
112-
to perform corresponding conflict detection and conflict resolution
109+
In this design, we provide `SimpleTransactionDirectMarkerBasedEarlyConflictDetectionStrategy` and
110+
`SimpleDirectMarkerBasedEarlyConflictDetectionStrategy` for DirectWriteMarkers to perform corresponding conflict detection and
111+
conflict resolution. And we provide `AsyncTimelineMarkerEarlyConflictDetectionStrategy`
112+
for TimelineServerBasedWriteMarkers to perform corresponding conflict detection and conflict resolution.
113113

114114
#### DirectWriteMarkers related strategy
115115

116-
##### DirectMarkerWithTransactionConflictResolutionStrategy
116+
##### SimpleTransactionDirectMarkerBasedEarlyConflictDetectionStrategy
117117

118118
![](figure2.png)
119119

@@ -133,9 +133,9 @@ are going to create maker file based on partition path 2022/07/01, and fileID ff
133133
During marker conflict checking, we do not need to list all the partitions we have under ".temp". Just list marker files under
134134
$BasePath/.hoodie/.temp/instantTime/2022/07/01 and check if fileID ff26cb9e-e034-4931-9c59-71bac578f7a0-0 existed or not.
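
The partition-scoped check described above can be sketched roughly as follows. This is a minimal illustration on a local file system, not Hudi's implementation: the class name, the method name, and the assumption that a marker file name starts with the fileID are all hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Illustrative sketch only; names here are hypothetical, not Hudi APIs.
class MarkerConflictCheckSketch {

    /**
     * Lists only the marker files under one partition directory, e.g.
     * $BasePath/.hoodie/.temp/instantTime/2022/07/01, and reports whether
     * any of them belongs to the given fileId.
     * Assumption: a marker file name starts with the fileId.
     */
    static boolean markerExistsForFileId(Path partitionMarkerDir, String fileId)
            throws IOException {
        if (!Files.isDirectory(partitionMarkerDir)) {
            return false; // no markers were written for this partition
        }
        try (Stream<Path> markers = Files.list(partitionMarkerDir)) {
            return markers.anyMatch(
                m -> m.getFileName().toString().startsWith(fileId));
        }
    }
}
```

Because only one partition directory is listed, the cost of the check depends on the number of markers in that partition rather than on the number of partitions under ".temp".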
135135

136-
##### SimpleDirectMarkerConflictResolutionStrategy
136+
##### SimpleDirectMarkerBasedEarlyConflictDetectionStrategy
137137

138-
Compared with `DirectMarkerWithTransactionConflictResolutionStrategy`, the current strategy drops the steps of
138+
Compared with `SimpleTransactionDirectMarkerBasedEarlyConflictDetectionStrategy`, the current strategy drops the steps of
139139
transaction including new transaction, begin transaction and end transaction. The advantages are that the checking
140140
speed is faster, and multiple writers will not affect each other. But the downside is that the conflict detection may be
141141
delayed.
@@ -147,12 +147,11 @@ If so, the conflict can only be found in the pre-commit conflict checking and fa
147147
This wastes resources, but does not compromise the correctness of the data.
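
The trade-off between the two direct-marker strategies can be sketched as below. This is a simplified illustration under stated assumptions: a `ReentrantLock` stands in for the transaction, the check and the marker creation are passed in as callbacks, and all names are hypothetical rather than taken from Hudi.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.BooleanSupplier;

// Illustrative sketch only; names here are hypothetical, not Hudi APIs.
class EarlyDetectionModesSketch {

    // Stand-in for the transaction (assumption: a single shared lock).
    private static final ReentrantLock TRANSACTION = new ReentrantLock();

    /** Transaction variant: check and create run as one guarded step. */
    static boolean checkAndCreateWithTransaction(BooleanSupplier hasConflict,
                                                 Runnable createMarker) {
        TRANSACTION.lock(); // begin transaction
        try {
            return checkAndCreate(hasConflict, createMarker);
        } finally {
            TRANSACTION.unlock(); // end transaction
        }
    }

    /**
     * Simple variant: no transaction, so writers do not block each other,
     * but two writers may both pass the check before either marker exists.
     */
    static boolean checkAndCreateWithoutTransaction(BooleanSupplier hasConflict,
                                                    Runnable createMarker) {
        return checkAndCreate(hasConflict, createMarker);
    }

    // Returns true if the marker was created, false if a conflict was found.
    private static boolean checkAndCreate(BooleanSupplier hasConflict,
                                          Runnable createMarker) {
        if (hasConflict.getAsBoolean()) {
            return false;
        }
        createMarker.run();
        return true;
    }
}
```

In the variant without a transaction, a conflict missed by the early check is still caught by the pre-commit conflict check, which is why only resources, not correctness, are at stake.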
148148

149149

150-
As we can see, not only DirectMarkerWithTransactionConflictResolutionStrategy but also SimpleDirectMarkerConflictResolutionStrategy
151-
have extra fs calling for ealy conflict detection.
150+
As we can see, all of these direct-marker-based early conflict detection strategies need extra file system calls.
152151

153152
#### TimelineServerBasedWriteMarkers related strategy
154153

155-
##### AsyncTimelineMarkerConflictResolutionStrategy
154+
##### AsyncTimelineMarkerEarlyConflictDetectionStrategy
156155

157156
This design expands the create marker api on timeline server.
158157

@@ -220,8 +219,8 @@ At this time, this behavior is consistent with the existing OCC based conflict d
220219
This RFC adds four new configs to control the behavior of early conflict detection
221220

222221
1. `hoodie.write.lock.early.conflict.detection.enable` default false. Enables early conflict detection based on markers. It tries to detect write conflicts before creating markers and fails fast, which releases cluster resources as soon as possible.
223-
2. `hoodie.write.lock.early.conflict.async.checker.batch.interval` default 30000L. Used for timeline based marker AsyncTimelineMarkerConflictResolutionStrategy. The time to delay first async marker conflict checking.
224-
3. `hoodie.write.lock.early.conflict.async.checker.period` default 30000L. Used for timeline based marker AsyncTimelineMarkerConflictResolutionStrategy. The period between each marker conflict checking.
222+
2. `hoodie.write.lock.early.conflict.async.checker.batch.interval` default 30000L. Used by the timeline-server-based strategy AsyncTimelineMarkerEarlyConflictDetectionStrategy. The delay before the first async marker conflict check.
223+
3. `hoodie.write.lock.early.conflict.async.checker.period` default 30000L. Used by the timeline-server-based strategy AsyncTimelineMarkerEarlyConflictDetectionStrategy. The period between consecutive marker conflict checks.
225224
4. `hoodie.write.lock.early.conflict.detection.strategy` default AsyncTimelineMarkerEarlyConflictDetectionStrategy. Early conflict detection class name; this should be a subclass of org.apache.hudi.common.model.HoodieEarlyConflictDetectionStrategy
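
As a rough illustration, the configs listed above could be set and consumed as in the sketch below. The keys and default values come from the list; the class and method names are hypothetical, the strategy value is the short name as written in this RFC, and treating the two numeric values as milliseconds is an assumption, since the RFC does not state the unit.

```java
import java.util.Properties;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Illustrative sketch only; class and method names are hypothetical.
class EarlyConflictDetectionConfigSketch {

    static final String ENABLE =
        "hoodie.write.lock.early.conflict.detection.enable";
    static final String BATCH_INTERVAL =
        "hoodie.write.lock.early.conflict.async.checker.batch.interval";
    static final String PERIOD =
        "hoodie.write.lock.early.conflict.async.checker.period";
    static final String STRATEGY =
        "hoodie.write.lock.early.conflict.detection.strategy";

    /** Builds writer properties with early conflict detection switched on. */
    static Properties earlyConflictDetectionProps() {
        Properties props = new Properties();
        props.setProperty(ENABLE, "true"); // default is false
        props.setProperty(BATCH_INTERVAL, "30000");
        props.setProperty(PERIOD, "30000");
        // Short class name as written in the RFC text.
        props.setProperty(STRATEGY,
            "AsyncTimelineMarkerEarlyConflictDetectionStrategy");
        return props;
    }

    /**
     * Schedules a periodic check: the batch interval is the delay before the
     * first run, the period is the gap between runs.
     * Assumption: both values are milliseconds.
     */
    static ScheduledFuture<?> scheduleChecker(ScheduledExecutorService executor,
                                              Properties props,
                                              Runnable check) {
        long initialDelay = Long.parseLong(props.getProperty(BATCH_INTERVAL));
        long period = Long.parseLong(props.getProperty(PERIOD));
        return executor.scheduleAtFixedRate(
            check, initialDelay, period, TimeUnit.MILLISECONDS);
    }
}
```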
226225

227226
