@zhzhan
Contributor

@zhzhan zhzhan commented Mar 6, 2017

What changes were proposed in this pull request?

When BytesToBytesMap spills, its longArray should be released. Otherwise, it may not be released until the task completes. This array may take a significant amount of memory, which then cannot be used by later operators such as UnsafeShuffleExternalSorter, resulting in more frequent spills in the sorter. This patch releases the array, since the destructive iterator will not use it anymore.

How was this patch tested?

Manually tested in production.

@srowen
Member

srowen commented Mar 6, 2017

Probably one for @JoshRosen

@SparkQA

SparkQA commented Mar 6, 2017

Test build #74036 has finished for PR 17180 at commit e4be325.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

nit: i would do this after the memory is cleaned up

@tejasapatil
Contributor

ping @JoshRosen @davies @sameeragarwal

Member

This line seems to explain the behavior of current code (before applying this PR). It would be good to explain the behavior of the new code.

Member

+1

Member

Out of curiosity, what does this short-circuiting achieve?

Contributor

AFAIK, spill is called when other consumers want to allocate in-memory pages but can't get any memory. If we can release enough memory by freeing the longArray, then the purpose of the spill is achieved. Hence the short circuit.
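
The short circuit described above can be sketched with a toy model. This is only an illustration under stated assumptions: `ToySpillableMap`, its fields, and the byte accounting are hypothetical and are not Spark's `BytesToBytesMap` or its memory manager. The sketch frees the pointer array first and returns early if that alone covers the request; only otherwise does it go on to spill data pages.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model only. ToySpillableMap and its members are hypothetical names,
// not Spark's BytesToBytesMap or its memory manager.
final class ToySpillableMap {
    private long[] pointerArray;                                  // stands in for longArray
    private final Deque<byte[]> dataPages = new ArrayDeque<>();  // stands in for data pages
    private int spilledPages = 0;

    ToySpillableMap(int slots, int pages, int pageSize) {
        pointerArray = new long[slots];
        for (int i = 0; i < pages; i++) {
            dataPages.add(new byte[pageSize]);
        }
    }

    /** Tries to release at least `required` bytes and returns the bytes released. */
    long spill(long required) {
        long released = 0L;
        if (pointerArray != null) {
            // Assumption: we are in a destructive iteration, so the array is unused.
            released += 8L * pointerArray.length;
            pointerArray = null;
            if (released >= required) {
                return released;                    // short circuit: no page is spilled
            }
        }
        while (released < required && !dataPages.isEmpty()) {
            byte[] page = dataPages.removeLast();   // pretend the page is written to disk
            released += page.length;
            spilledPages++;
        }
        return released;
    }

    int spilledPages() {
        return spilledPages;
    }

    public static void main(String[] args) {
        ToySpillableMap map = new ToySpillableMap(1024, 4, 4096);
        System.out.println(map.spill(4096) + " bytes, pages spilled: " + map.spilledPages());
        System.out.println(map.spill(5000) + " bytes, pages spilled: " + map.spilledPages());
    }
}
```

In this toy, the first request is satisfied by the 8 KiB pointer array alone, so no page is touched; the second request finds the array already gone and has to spill pages.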

Contributor

IIUC we can only free this pointer array if we spill its related data pages. Is this short circuit safe to do?

Contributor

@cloud-fan, I was wondering this, too: I think that this is okay because we're in a destructive iterator over the map, so we can just walk through the data pages without needing to worry about hash-style lookups. If this is the case, though, then we should just destroy the pointer array earlier (as I commented downthread on the PR).

@JoshRosen
Contributor

If we truly don't need the longArray here then why not just pre-emptively destroy it in the constructor of MapIterator when destructive = true?
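
A minimal sketch of that suggestion, as a toy model rather than Spark's actual code: `ToyRecordMap`, `holdsPointerArray`, and the record layout are hypothetical names chosen for illustration. The point it shows is that a destructive iterator only walks the stored records in order, so the pointer array can be dropped at the moment the iterator is created.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Toy model only. ToyRecordMap is a hypothetical class, not Spark's BytesToBytesMap.
final class ToyRecordMap {
    private final Deque<String> records = new ArrayDeque<>(); // stands in for data pages
    private long[] pointerArray = new long[4];                 // stands in for longArray
    private int size = 0;

    void append(String record) {
        if (pointerArray == null) {
            throw new IllegalStateException("map is being consumed destructively");
        }
        if (size == pointerArray.length) {
            pointerArray = Arrays.copyOf(pointerArray, size * 2);
        }
        pointerArray[size++] = record.hashCode();
        records.addLast(record);
    }

    boolean holdsPointerArray() {
        return pointerArray != null;
    }

    Iterator<String> destructiveIterator() {
        pointerArray = null; // released up front, before any record is read
        return new Iterator<String>() {
            @Override
            public boolean hasNext() {
                return !records.isEmpty();
            }

            @Override
            public String next() {
                String record = records.pollFirst(); // each record is dropped once read
                if (record == null) {
                    throw new NoSuchElementException();
                }
                return record;
            }
        };
    }

    public static void main(String[] args) {
        ToyRecordMap map = new ToyRecordMap();
        map.append("a");
        map.append("b");
        Iterator<String> it = map.destructiveIterator();
        System.out.println("pointer array held: " + map.holdsPointerArray());
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```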

@HyukjinKwon
Member

Hi @zhzhan, is this PR still active? If so, could you answer or address the review comments?

@zhzhan
Contributor Author

zhzhan commented Jul 24, 2017

Per the review comments, the longArray is now released when the destructive iterator is created.

@SparkQA

SparkQA commented Jul 25, 2017

Test build #79918 has finished for PR 17180 at commit d315c60.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zhzhan
Contributor Author

zhzhan commented Jul 25, 2017

The test failure is caused by calling a method on the map after destructiveIterator() has been called.
That is illegal by definition; see:
https://github.com/apache/spark/blob/master/core/src/main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java#L417

We should remove this test case, as it does not follow that restriction. Please let me know what you think.

Stack trace of the failure.
sbt.ForkMain$ForkError: java.lang.AssertionError: null
at org.apache.spark.unsafe.map.BytesToBytesMap.safeLookup(BytesToBytesMap.java:463)
at org.apache.spark.unsafe.map.BytesToBytesMap.lookup(BytesToBytesMap.java:453)
at org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMap.getAggregationBufferFromUnsafeRow(UnsafeFixedWidthAggregationMap.java:125)
at org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMap.getAggregationBufferFromUnsafeRow(UnsafeFixedWidthAggregationMap.java:120)
at org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMap.getAggregationBuffer(UnsafeFixedWidthAggregationMap.java:116)
at org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMapSuite$$anonfun$3.apply$mcV$sp(UnsafeFixedWidthAggregationMapSuite.scala:141)
at org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMapSuite$$anonfun$testWithMemoryLeakDetection$1.apply$mcV$sp(UnsafeFixedWidthAggregationMapSuite.scala:80)
at org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMapSuite$$anonfun$testWithMemoryLeakDetection$1.apply(UnsafeFixedWidthAggregationMapSuite.scala:65)
at org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMapSuite$$anonfun$testWithMemoryLeakDetection$1.apply(UnsafeFixedWidthAggregationMapSuite.scala:65)
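
The ordering constraint behind this failure can be illustrated with a toy model. Everything here is an assumption made for illustration: `ToyAggregationMap` and `requireIndex` are hypothetical, and the sketch uses an explicit `IllegalStateException` where the real code appears to rely on an internal assertion. It shows that lookups work before the destructive iterator is created and fail afterwards, so a test has to finish its lookups first.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Toy model only. ToyAggregationMap is hypothetical, not Spark's
// UnsafeFixedWidthAggregationMap or BytesToBytesMap.
final class ToyAggregationMap {
    private Map<String, Long> index = new HashMap<>(); // stands in for the lookup structure

    private void requireIndex() {
        if (index == null) {
            throw new IllegalStateException("lookup after destructiveIterator()");
        }
    }

    void put(String key, long value) {
        requireIndex();
        index.put(key, value);
    }

    Long lookup(String key) {
        requireIndex();
        return index.get(key);
    }

    Iterator<Map.Entry<String, Long>> destructiveIterator() {
        Iterator<Map.Entry<String, Long>> it = index.entrySet().iterator();
        index = null; // the map gives up its lookup structure; only the iterator remains usable
        return it;
    }

    public static void main(String[] args) {
        ToyAggregationMap map = new ToyAggregationMap();
        map.put("a", 1L);
        System.out.println("before: " + map.lookup("a"));     // legal: iterator not yet created
        Iterator<Map.Entry<String, Long>> it = map.destructiveIterator();
        System.out.println("iterating: " + it.next().getKey());
        try {
            map.lookup("a");                                   // illegal ordering
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```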

@kiszk
Member

kiszk commented Jul 26, 2017

Would it be better to fix this test instead of removing it?

@zhzhan
Contributor Author

zhzhan commented Jul 27, 2017

Will fix the unit test.

@SparkQA

SparkQA commented Jul 27, 2017

Test build #79989 has finished for PR 17180 at commit 2403c30.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zhzhan
Contributor Author

zhzhan commented Jul 27, 2017

retest it please.

@gatorsmile
Member

retest this please

@SparkQA

SparkQA commented Jul 29, 2017

Test build #80048 has finished for PR 17180 at commit 2403c30.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Member

LGTM. Waiting for @JoshRosen for final sign-off.

@JoshRosen
Contributor

JoshRosen commented Jul 31, 2017 via email

@gatorsmile
Member

Thanks! Merging to master.

@zhzhan Could you address the comments about the test case in the follow-up PR? Thanks!

@asfgit asfgit closed this in 44e501a Jul 31, 2017