17 changes: 16 additions & 1 deletion core/src/main/scala/org/apache/spark/Partitioner.scala
Original file line number Diff line number Diff line change
@@ -83,6 +83,8 @@ class HashPartitioner(partitions: Int) extends Partitioner {
    case _ =>
      false
  }

  override def hashCode: Int = numPartitions
}

/**
@@ -119,7 +121,7 @@ class RangePartitioner[K : Ordering : ClassTag, V](
}
}

-  def numPartitions = partitions
+  def numPartitions = rangeBounds.length + 1
Member Author

With numPartitions = partitions, it is possible that p1 == p2 while p1.numPartitions != p2.numPartitions. For example, if rdd.sample is empty, then p1 = new RangePartitioner[...](10, rdd, true) and p2 = new RangePartitioner[...](1, rdd, true) compare equal but report different partition counts.

That's confusing. So I changed partitions to rangeBounds.length + 1.
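
To make the scenario concrete, here is a minimal sketch. `SketchRangePartitioner` is a hypothetical stand-in, not Spark's `RangePartitioner`: it takes the bounds directly instead of sampling an RDD, and it assumes equality is decided by `rangeBounds` and `ascending`, which is what the `hashCode` added in this diff suggests.

```scala
// Hypothetical stand-in for RangePartitioner; not Spark's class.
final class SketchRangePartitioner(
    requestedPartitions: Int,     // what the caller asked for (unused on purpose)
    val rangeBounds: Array[Int],  // bounds actually derived from the sample
    val ascending: Boolean) {

  // Derived from the bounds, so equal partitioners always agree on it.
  def numPartitions: Int = rangeBounds.length + 1

  // Assumption: equality depends only on the bounds and the sort direction.
  override def equals(other: Any): Boolean = other match {
    case r: SketchRangePartitioner =>
      r.rangeBounds.sameElements(rangeBounds) && r.ascending == ascending
    case _ =>
      false
  }

  override def hashCode(): Int =
    31 * java.util.Arrays.hashCode(rangeBounds) + ascending.hashCode
}

object SketchRangePartitionerDemo {
  // An empty sample yields no bounds, whatever partition count was requested.
  val p1 = new SketchRangePartitioner(10, Array.empty[Int], ascending = true)
  val p2 = new SketchRangePartitioner(1, Array.empty[Int], ascending = true)
}
```

In this sketch `p1 == p2` holds and both report `numPartitions == 1`. Had `numPartitions` returned the requested count, the two equal objects would have reported 10 and 1.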


  private val binarySearch: ((Array[K], K) => Int) = CollectionsUtils.makeBinarySearch[K]

@@ -155,4 +157,17 @@ class RangePartitioner[K : Ordering : ClassTag, V](
    case _ =>
      false
  }


  override def hashCode(): Int = {
    val prime = 31
    var result = 1
    var i = 0
    while (i < rangeBounds.length) {
Member

Could this be simplified a lot with Arrays.hashCode()?

Member Author

There is no generic overload of Arrays.hashCode, so Arrays.hashCode(rangeBounds) does not compile.

Member

Darn, java.util.Arrays.hashCode(Object[]) doesn't match any Scala array? java.util.Arrays.hashCode(Array(1,2,3)) works fine, but that's not quite the situation here. Oh well. Maybe wrap it as a List and use its hashCode? It may not be worth it just to save a few lines of code.

Member Author

Maybe the Scala compiler cannot determine which overload to use here: Arrays.hashCode(Object[]), Arrays.hashCode(int[]), or Arrays.hashCode(double[])...

      result = prime * result + rangeBounds(i).hashCode
      i += 1
    }
    result = prime * result + ascending.hashCode
    result
  }
}
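
The Arrays.hashCode discussion in the review thread above can be illustrated with a small sketch. `boundsHash` is a hypothetical helper, not part of Spark. It folds element hash codes the way the while-loop in this diff does, and for a concrete `Array[Int]` it should agree with `java.util.Arrays.hashCode(int[])`, which uses the same 31-based recurrence.

```scala
import java.util.Arrays

object GenericArrayHashSketch {
  // Hypothetical helper: hashes a Scala array of any element type K.
  def boundsHash[K](bounds: Array[K]): Int = {
    var result = 1
    var i = 0
    while (i < bounds.length) {
      // hashCode is defined on Any, so this compiles for an unconstrained K.
      // Unlike Arrays.hashCode(Object[]), a null element would throw here.
      result = 31 * result + bounds(i).hashCode
      i += 1
    }
    result
  }

  // With a concrete element type, the Java overload resolves without trouble.
  val intBounds: Array[Int] = Array(1, 2, 3)
  val viaJava: Int = Arrays.hashCode(intBounds) // resolves to hashCode(int[])
  val viaLoop: Int = boundsHash(intBounds)

  // Inside boundsHash, Arrays.hashCode(bounds) would not compile,
  // because K does not pin down a single Java overload.
}
```

The explicit loop costs a few extra lines but avoids copying the bounds into another collection just to hash them.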
Original file line number Diff line number Diff line change
@@ -50,4 +50,6 @@ private[spark] class PythonPartitioner(
    case _ =>
      false
  }

  override def hashCode: Int = 31 * numPartitions + pyPartitionFunctionId.hashCode
}
Original file line number Diff line number Diff line change
@@ -124,4 +124,6 @@ class CustomPartitioner(partitions: Int) extends Partitioner {
      c.numPartitions == numPartitions
    case _ => false
  }

  override def hashCode: Int = numPartitions
}