@yunzoud
Contributor

@yunzoud yunzoud commented Jul 31, 2019

What changes were proposed in this pull request?

Add the configuration spark.scheduler.listenerbus.eventqueue.${name}.capacity so that each event queue can be given its own size.
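
The lookup order this setting implies can be sketched roughly as follows. This is a minimal sketch, not the patch itself: a plain Map[String, String] stands in for SparkConf, the names QueueCapacitySketch and resolveCapacity are hypothetical, and the global key and the fallback of 10000 are assumptions about what LISTENER_BUS_EVENT_QUEUE_CAPACITY resolves to.

```scala
object QueueCapacitySketch {
  // Assumed key behind LISTENER_BUS_EVENT_QUEUE_CAPACITY (the global setting).
  val GlobalKey = "spark.scheduler.listenerbus.eventqueue.capacity"

  // Assumed default, used only when neither key is present.
  val AssumedDefault = 10000

  // Per-queue value first, then the global value, then the assumed default.
  def resolveCapacity(conf: Map[String, String], name: String): Int =
    conf.get(s"spark.scheduler.listenerbus.eventqueue.$name.capacity")
      .orElse(conf.get(GlobalKey))
      .map(_.toInt)
      .getOrElse(AssumedDefault)
}
```

With the global key set to 5 and only the "shared" queue overridden to 1, a lookup for "shared" would yield 1 while any other queue name falls back to 5.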

How was this patch tested?

Unit test in core/src/test/scala/org/apache/spark/scheduler/SparkListenerSuite.scala

@yunzoud yunzoud changed the title from "[SPARK-28574] Allow to config different sizes for event queues" to "[SPARK-28574][CORE] Allow to config different sizes for event queues" Jul 31, 2019
@jiangxb1987
Contributor

add to whitelist

@jiangxb1987
Contributor

ok to test

Contributor

@jiangxb1987 jiangxb1987 left a comment

Looks good, only some nits.

test("event queue size can be configured through spark conf") {
  val conf = new SparkConf(false)
    .set(LISTENER_BUS_EVENT_QUEUE_CAPACITY, 5)
    .set("spark.scheduler.listenerbus.eventqueue.shared.capacity", "1")
Contributor

nit: I would use s"spark.scheduler.listenerbus.eventqueue.${SHARED_QUEUE}.capacity"

Contributor Author

fixed

val conf = new SparkConf(false)
  .set(LISTENER_BUS_EVENT_QUEUE_CAPACITY, 5)
  .set("spark.scheduler.listenerbus.eventqueue.shared.capacity", "1")
  .set("spark.scheduler.listenerbus.eventqueue.eventLog.capacity", "2")
Contributor

Similarly, I would use s"spark.scheduler.listenerbus.eventqueue.${EVENT_LOG_QUEUE}.capacity"

Contributor Author

fixed

val counter2 = new BasicJobCounter()
val counter3 = new BasicJobCounter()

bus.addToSharedQueue(counter1)
Contributor

Maybe add a comment explaining that this is just to trigger adding a new queue.

Contributor Author

added comments

// if no such conf is specified, use the value specified in
// LISTENER_BUS_EVENT_QUEUE_CAPACITY
protected def capacity: Int = conf.getInt(
  s"spark.scheduler.listenerbus.eventqueue.${name}.capacity",
Contributor

We should assert that the capacity is > 0, and add a test to cover that case.

Contributor Author

Added the assertion.
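
A positive-capacity check of the kind requested here could be sketched as below. The helper name validatedCapacity and the object CapacityCheckSketch are hypothetical, and the sketch uses require (which throws IllegalArgumentException) rather than the assert used in the patch, purely for illustration.

```scala
object CapacityCheckSketch {
  // Returns the capacity unchanged when it is positive;
  // otherwise require throws IllegalArgumentException with the message below.
  def validatedCapacity(name: String, capacity: Int): Int = {
    require(capacity > 0,
      s"capacity for event queue $name must be greater than 0, but $capacity is configured.")
    capacity
  }
}
```

A test for the failing case would then only need to check that a capacity of 0 or less is rejected.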


// For testing only.
private[scheduler] def getQueueCapacity(name: String): Int = {
  queues.asScala.find(_.name == name) match {
Contributor

maybe queues.asScala.find(_.name == name).map(_.capacity).getOrElse(-1) ?

Contributor Author

changed

@yaooqinn
Member

The $name comes from a private class. Consider adding some instructions for each queue to the docs (configurations.md)?

@zsxwing
Member

zsxwing commented Jul 31, 2019

add to whitelist

@zsxwing
Member

zsxwing commented Jul 31, 2019

ok to test

@SparkQA

SparkQA commented Jul 31, 2019

Test build #108483 has finished for PR 25307 at commit 36995f1.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 31, 2019

Test build #108485 has finished for PR 25307 at commit 4ab040c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

@zsxwing zsxwing left a comment

Looks good overall. Left some nits.

}

// For testing only.
private[scheduler] def getQueueCapacity(name: String): Int = {
Member

nit: could you change the return type to Option[Int] and use None to indicate an unknown queue rather than -1?

Contributor Author

changed
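
The Option[Int] shape suggested in this thread could look roughly like the following. QueueInfo is a hypothetical stand-in for the private queue class, and the queues are passed in as a plain Seq so the sketch stays self-contained.

```scala
object QueueLookupSketch {
  // Hypothetical stand-in for the private queue class: just a name and a capacity.
  final case class QueueInfo(name: String, capacity: Int)

  // None signals an unknown queue, so callers need no sentinel such as -1.
  def getQueueCapacity(queues: Seq[QueueInfo], name: String): Option[Int] =
    queues.find(_.name == name).map(_.capacity)
}
```

A caller can then pattern match on Some and None instead of comparing against a magic number.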

.set(s"spark.scheduler.listenerbus.eventqueue.${SHARED_QUEUE}.capacity", "1")
.set(s"spark.scheduler.listenerbus.eventqueue.${EVENT_LOG_QUEUE}.capacity", "2")

val bus = new LiveListenerBus(conf)
Member

nit: bus.stop() is missing. It should be called in finally to stop the internal thread.

Contributor Author

The bus is actually never started, so no need to stop.
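
For tests that do start a bus, the stop-in-finally pattern raised above might be sketched as follows. FakeBus and withStartedBus are hypothetical stand-ins written for this illustration, not Spark APIs.

```scala
object StopInFinallySketch {
  // Hypothetical stand-in for a bus with a lifecycle; not the Spark class.
  final class FakeBus {
    private var running = false
    def start(): Unit = { running = true }
    def stop(): Unit = { running = false }
    def isRunning: Boolean = running
  }

  // Starts a bus, runs `body` against it, and always stops it afterwards,
  // even if `body` throws.
  def withStartedBus[A](body: FakeBus => A): A = {
    val bus = new FakeBus
    bus.start()
    try body(bus)
    finally bus.stop()
  }
}
```

When the bus is only constructed and never started, as in the test under review, there is no background thread to clean up and the wrapper is unnecessary.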

private[scheduler] def capacity: Int = {
val queuesize = conf.getInt(s"spark.scheduler.listenerbus.eventqueue.${name}.capacity",
conf.get(LISTENER_BUS_EVENT_QUEUE_CAPACITY))
assert(queuesize > 0, s"capacity for event queue $name must be greater than 0," +
Member

nit: missing space after ,

Contributor Author

fixed

@zsxwing
Member

zsxwing commented Aug 1, 2019

In reply to @yaooqinn ("the $name is from private class, considering add some instructions for each in the doc configurations.md?"):

Since names of the queues are private, it's fine to not document them to keep them internal right now.

@zsxwing
Member

zsxwing commented Aug 2, 2019

LGTM pending tests

@SparkQA

SparkQA commented Aug 2, 2019

Test build #108574 has finished for PR 25307 at commit 261beb0.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Aug 2, 2019

Test build #108576 has finished for PR 25307 at commit 1c6f69a.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zsxwing
Member

zsxwing commented Aug 2, 2019

retest this please

@jiangxb1987
Contributor

LGTM

@SparkQA

SparkQA commented Aug 2, 2019

Test build #108579 has finished for PR 25307 at commit 1c6f69a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zsxwing
Member

zsxwing commented Aug 2, 2019

Thanks! Merging to master.

// LISTENER_BUS_EVENT_QUEUE_CAPACITY
private[scheduler] def capacity: Int = {
val queuesize = conf.getInt(s"spark.scheduler.listenerbus.eventqueue.${name}.capacity",
conf.get(LISTENER_BUS_EVENT_QUEUE_CAPACITY))
Member

Nit: indent.

conf.get(LISTENER_BUS_EVENT_QUEUE_CAPACITY))
// The capacity can be configured by spark.scheduler.listenerbus.eventqueue.${name}.capacity,
// if no such conf is specified, use the value specified in
// LISTENER_BUS_EVENT_QUEUE_CAPACITY
Member

We need to update the conf description of LISTENER_BUS_EVENT_QUEUE_CAPACITY.

// if no such conf is specified, use the value specified in
// LISTENER_BUS_EVENT_QUEUE_CAPACITY
private[scheduler] def capacity: Int = {
val queuesize = conf.getInt(s"spark.scheduler.listenerbus.eventqueue.${name}.capacity",
Member

Instead of hard-coding it here, can we define it in core/src/main/scala/org/apache/spark/internal/config/package.scala?
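
One way to centralize the key, sketched under assumptions: the object EventQueueKeysSketch and its members are hypothetical names chosen for this illustration and are not taken from Spark's config package.

```scala
object EventQueueKeysSketch {
  // Shared prefix, defined once so call sites do not repeat the string.
  val Prefix = "spark.scheduler.listenerbus.eventqueue"

  // Global capacity key (assumed to be the key LISTENER_BUS_EVENT_QUEUE_CAPACITY names).
  val GlobalCapacityKey = s"$Prefix.capacity"

  // Per-queue capacity key for a queue such as "shared" or "eventLog".
  def capacityKeyFor(queueName: String): String = s"$Prefix.$queueName.capacity"
}
```

Both the queue implementation and its tests could then build keys through the same helper instead of repeating the literal.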

@gatorsmile
Member

cc @jiangxb1987 @Ngone51
