
Commit 088c76c

Adds comment about SPARK-8501
1 parent 99a5e7e commit 088c76c

1 file changed

Lines changed: 6 additions & 0 deletions


sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/OrcSourceSuite.scala

@@ -43,6 +43,12 @@ abstract class OrcSuite extends QueryTest with BeforeAndAfterAll {
     orcTableDir.mkdir()
     import org.apache.spark.sql.hive.test.TestHive.implicits._
 
+    // Originally we were using a 10-row RDD for testing. However, when default parallelism is
+    // greater than 10 (e.g., running on a node with 32 cores), this RDD contains empty partitions,
+    // which result in empty ORC files. Unfortunately, ORC doesn't handle empty files properly and
+    // causes build failure on Jenkins, which happens to have 32 cores. Please refer to SPARK-8501
+    // for more details. To workaround this issue before fixing SPARK-8501, we simply increase row
+    // number in this RDD to avoid empty partitions.
     sparkContext
       .makeRDD(1 to 100)
       .map(i => OrcData(i, s"part-$i"))
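
The arithmetic behind the added comment can be checked without a Spark cluster. The following is a minimal sketch, not the code from this commit: it assumes that a local collection of `length` elements is cut into `numSlices` contiguous ranges with bounds `i * length / numSlices`, which is my understanding of how Spark slices a local collection. The object and method names (`EmptyPartitionSketch`, `sliceBounds`, `emptyPartitions`) are hypothetical and exist only for illustration.

```scala
// Illustrative sketch only: names are hypothetical, and the slicing rule is an
// assumption about how a local collection is divided into partitions.
object EmptyPartitionSketch {

  // Half-open index range [start, end) assigned to each of numSlices partitions.
  def sliceBounds(length: Int, numSlices: Int): Seq[(Int, Int)] =
    (0 until numSlices).map { i =>
      val start = ((i.toLong * length) / numSlices).toInt
      val end = (((i + 1).toLong * length) / numSlices).toInt
      (start, end)
    }

  // A partition is empty when its range contains no indices.
  def emptyPartitions(length: Int, numSlices: Int): Int =
    sliceBounds(length, numSlices).count { case (start, end) => start == end }

  def main(args: Array[String]): Unit = {
    val parallelism = 32 // core count mentioned in the commit comment
    for (rows <- Seq(10, 100)) {
      val empty = emptyPartitions(rows, parallelism)
      println(s"rows=$rows parallelism=$parallelism empty=$empty")
    }
    // prints:
    // rows=10 parallelism=32 empty=22
    // rows=100 parallelism=32 empty=0
  }
}
```

Under this assumption, 10 rows spread over 32 partitions leave 22 partitions empty, while 100 rows leave none, which matches the reasoning for raising the row count. On a real `SparkContext`, the per-partition sizes of an RDD can be inspected with `rdd.glom().map(_.length).collect()`.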
