Skip to content

Commit 56af8e8

Browse files
yanboliangjkbradley
authored andcommitted
[SPARK-14298][ML][MLLIB] LDA should support disable checkpoint
## What changes were proposed in this pull request? In the doc of [```checkpointInterval```](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/param/shared/sharedParams.scala#L241), we told users that they can disable checkpoint by setting ```checkpointInterval = -1```. But we did not handle this situation for LDA actually, we should fix this bug. ## How was this patch tested? Existing tests. cc jkbradley Author: Yanbo Liang <ybliang8@gmail.com> Closes #12089 from yanboliang/spark-14298.
1 parent 94ac58b commit 56af8e8

2 files changed

Lines changed: 6 additions & 3 deletions

File tree

mllib/src/main/scala/org/apache/spark/mllib/impl/PeriodicCheckpointer.scala

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -52,7 +52,8 @@ import org.apache.spark.storage.StorageLevel
5252
* - This class removes checkpoint files once later Datasets have been checkpointed.
5353
* However, references to the older Datasets will still return isCheckpointed = true.
5454
*
55-
* @param checkpointInterval Datasets will be checkpointed at this interval
55+
* @param checkpointInterval Datasets will be checkpointed at this interval.
56+
* If this interval was set as -1, then checkpointing will be disabled.
5657
* @param sc SparkContext for the Datasets given to this checkpointer
5758
* @tparam T Dataset type, such as RDD[Double]
5859
*/
@@ -89,7 +90,8 @@ private[mllib] abstract class PeriodicCheckpointer[T](
8990
updateCount += 1
9091

9192
// Handle checkpointing (after persisting)
92-
if ((updateCount % checkpointInterval) == 0 && sc.getCheckpointDir.nonEmpty) {
93+
if (checkpointInterval != -1 && (updateCount % checkpointInterval) == 0
94+
&& sc.getCheckpointDir.nonEmpty) {
9395
// Add new checkpoint before removing old checkpoints.
9496
checkpoint(newData)
9597
checkpointQueue.enqueue(newData)

mllib/src/main/scala/org/apache/spark/mllib/impl/PeriodicGraphCheckpointer.scala

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -69,7 +69,8 @@ import org.apache.spark.storage.StorageLevel
6969
* // checkpointed: graph4
7070
* }}}
7171
*
72-
* @param checkpointInterval Graphs will be checkpointed at this interval
72+
* @param checkpointInterval Graphs will be checkpointed at this interval.
73+
* If this interval was set as -1, then checkpointing will be disabled.
7374
* @tparam VD Vertex descriptor type
7475
* @tparam ED Edge descriptor type
7576
*

0 commit comments

Comments
 (0)