Skip to content
Closed
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -279,11 +279,13 @@ class CodegenContext {
inlineToOuterClass: Boolean = false): String = {
// The number of named constants that can exist in the class is limited by the Constant Pool
// limit, 65,536. We cannot know how many constants will be inserted for a class, so we use a
// threshold of 1600k bytes to determine when a function should be inlined to a private, nested
// sub-class.
// threshold to determine when a function should be inlined to a private, nested sub-class
val generatedClassLengthThreshold = SparkEnv.get.conf.getInt(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you ensure SparkEnv.get is always true?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be empty? In

there is no check about it being None...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code that you pointed out is not in CodeGenerator.scala. In my case at CodeGenerator.scala, there are some cases that SparkEnv.get is null. If you have not seen this null in you case, it would be good.
@gatorsmile will implement a better approach to get this conf soon.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SQLConf.GENERATED_CLASS_LENGTH_THRESHOLD.key,
SQLConf.GENERATED_CLASS_LENGTH_THRESHOLD.defaultValue.get)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you simplify the above three lines into the following?

val generatedClassLengthThreshold = SparkEnv.get.conf.get(
  SQLConf.GENERATED_CLASS_LENGTH_THRESHOLD)

val (className, classInstance) = if (inlineToOuterClass) {
outerClassName -> ""
} else if (currClassSize > 1600000) {
} else if (currClassSize > generatedClassLengthThreshold) {
val className = freshName("NestedClass")
val classInstance = freshName("nestedClassInstance")

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -934,6 +934,15 @@ object SQLConf {
.intConf
.createWithDefault(10000)

val GENERATED_CLASS_LENGTH_THRESHOLD =
buildConf("spark.sql.codegen.generatedClass.size.threshold")
.doc("Threshold in bytes for the size of a generated class. If the generated class " +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add .internal(). I think this is not a parameter for users.

Copy link
Contributor Author

@mgaido91 mgaido91 Oct 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for my ignorance. If users cannot set it, how can this parameter be changed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is not a frequently-used parameter for users. Also, other JVM implimentation-specific parameters are internal, e.g., WHOLESTAGE_HUGE_METHOD_LIMIT. cc: @rednaxelafx

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with you, but might you please explain me how to set a parameter which is internal? Thanks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Users can change this value in the same way with other public parameters, but internal just means that parameters are not explicitly exposed to users.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the explanation, I'll add it immediately.

"size is higher of this value, a private nested class is created and used." +
"This is useful to limit the number of named constants in the class " +
"and therefore its Constant Pool. The default is 1600k.")
.intConf
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add .checkValue here with a valid minimum and maximum.

.createWithDefault(1600000)

object Deprecated {
val MAPRED_REDUCE_TASKS = "mapred.reduce.tasks"
}
Expand Down