
Commit 8b6232b

cloud-fan authored and gatorsmile committed
[SPARK-27521][SQL] Move data source v2 to catalyst module
## What changes were proposed in this pull request?

We are currently in an odd state: some data source v2 interfaces (the catalog-related ones) live in sql/catalyst, while others (Table, ScanBuilder, DataReader, etc.) live in sql/core. There is no good reason to keep the data source v2 API split across two modules. If we have to pick one, sql/catalyst is the one to go with: the catalyst module already contains user-facing types such as DataType and Row, and we have to update `Analyzer` and `SessionCatalog` to support the new catalog plugin, which means the API needs to be in the catalyst module. This PR solves the problem we have in #24246.

## How was this patch tested?

Existing tests.

Closes #24416 from cloud-fan/move.

Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: gatorsmile <[email protected]>
1 parent: 3f102a8 · commit: 8b6232b

60 files changed: 65 additions & 28 deletions

File tree

(Large commits have some content hidden by default; only a subset of files is shown below.)

project/MimaExcludes.scala

Lines changed: 39 additions & 0 deletions
```diff
@@ -291,6 +291,45 @@ object MimaExcludes {
       case _ => true
     },

+    // [SPARK-27521][SQL] Move data source v2 to catalyst module
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.vectorized.ColumnarBatch"),
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.vectorized.ArrowColumnVector"),
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.vectorized.ColumnarRow"),
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.vectorized.ColumnarArray"),
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.vectorized.ColumnarMap"),
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.vectorized.ColumnVector"),
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.sources.GreaterThanOrEqual"),
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.sources.StringEndsWith"),
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.sources.LessThanOrEqual$"),
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.sources.In$"),
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.sources.Not"),
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.sources.IsNotNull"),
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.sources.LessThan"),
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.sources.LessThanOrEqual"),
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.sources.EqualNullSafe$"),
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.sources.GreaterThan$"),
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.sources.In"),
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.sources.And"),
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.sources.StringStartsWith$"),
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.sources.EqualNullSafe"),
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.sources.StringEndsWith$"),
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.sources.GreaterThanOrEqual$"),
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.sources.Not$"),
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.sources.IsNull$"),
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.sources.LessThan$"),
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.sources.IsNotNull$"),
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.sources.Or"),
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.sources.EqualTo$"),
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.sources.GreaterThan"),
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.sources.StringContains"),
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.sources.Filter"),
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.sources.IsNull"),
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.sources.EqualTo"),
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.sources.And$"),
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.sources.Or$"),
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.sources.StringStartsWith"),
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.sources.StringContains$"),
+
     // [SPARK-26216][SQL] Do not use case class as public API (UserDefinedFunction)
     ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.expressions.UserDefinedFunction$"),
     ProblemFilters.exclude[AbstractClassProblem]("org.apache.spark.sql.expressions.UserDefinedFunction"),
```

sql/catalyst/pom.xml

Lines changed: 4 additions & 0 deletions
```diff
@@ -114,6 +114,10 @@
       <version>2.7.3</version>
       <type>jar</type>
     </dependency>
+    <dependency>
+      <groupId>org.apache.arrow</groupId>
+      <artifactId>arrow-vector</artifactId>
+    </dependency>
   </dependencies>
   <build>
     <outputDirectory>target/scala-${scala.binary.version}/classes</outputDirectory>
```

sql/core/src/main/java/org/apache/spark/sql/sources/v2/SessionConfigSupport.java renamed to sql/catalyst/src/main/java/org/apache/spark/sql/sources/v2/SessionConfigSupport.java

File renamed without changes.

sql/core/src/main/java/org/apache/spark/sql/sources/v2/SupportsRead.java renamed to sql/catalyst/src/main/java/org/apache/spark/sql/sources/v2/SupportsRead.java

File renamed without changes.

sql/core/src/main/java/org/apache/spark/sql/sources/v2/SupportsWrite.java renamed to sql/catalyst/src/main/java/org/apache/spark/sql/sources/v2/SupportsWrite.java

File renamed without changes.

sql/core/src/main/java/org/apache/spark/sql/sources/v2/TableProvider.java renamed to sql/catalyst/src/main/java/org/apache/spark/sql/sources/v2/TableProvider.java

Lines changed: 1 addition & 8 deletions
```diff
@@ -18,7 +18,6 @@
 package org.apache.spark.sql.sources.v2;

 import org.apache.spark.annotation.Evolving;
-import org.apache.spark.sql.sources.DataSourceRegister;
 import org.apache.spark.sql.types.StructType;
 import org.apache.spark.sql.util.CaseInsensitiveStringMap;

@@ -56,13 +55,7 @@ public interface TableProvider {
    * @throws UnsupportedOperationException
    */
   default Table getTable(CaseInsensitiveStringMap options, StructType schema) {
-    String name;
-    if (this instanceof DataSourceRegister) {
-      name = ((DataSourceRegister) this).shortName();
-    } else {
-      name = this.getClass().getName();
-    }
     throw new UnsupportedOperationException(
-        name + " source does not support user-specified schema");
+        this.getClass().getSimpleName() + " source does not support user-specified schema");
   }
 }
```
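The net effect of the second hunk is that the default `getTable` no longer consults `DataSourceRegister` for the error message and simply reports the implementing class's simple name. A minimal standalone sketch of the resulting behavior (the stripped-down `TableProvider` interface and the `CsvProvider` class below are illustrative stand-ins, not Spark's actual types):

```java
// Standalone sketch: this interface is a stripped-down stand-in for Spark's
// TableProvider, kept here only to show the simplified default method.
interface TableProvider {
    default Object getTable(String options, String schema) {
        // After this commit the message uses the implementing class's simple
        // name directly, with no DataSourceRegister lookup.
        throw new UnsupportedOperationException(
            this.getClass().getSimpleName() + " source does not support user-specified schema");
    }
}

public class Demo {
    // Hypothetical source that does not override getTable.
    public static class CsvProvider implements TableProvider {}

    public static void main(String[] args) {
        try {
            new CsvProvider().getTable("options", "schema");
        } catch (UnsupportedOperationException e) {
            // prints "CsvProvider source does not support user-specified schema"
            System.out.println(e.getMessage());
        }
    }
}
```

One consequence worth noting: sources that registered a short name via `DataSourceRegister` now appear in this error under their class name rather than their short name, in exchange for dropping the cross-module dependency.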

sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/Batch.java renamed to sql/catalyst/src/main/java/org/apache/spark/sql/sources/v2/reader/Batch.java

File renamed without changes.

sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/InputPartition.java renamed to sql/catalyst/src/main/java/org/apache/spark/sql/sources/v2/reader/InputPartition.java

File renamed without changes.

sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/PartitionReader.java renamed to sql/catalyst/src/main/java/org/apache/spark/sql/sources/v2/reader/PartitionReader.java

File renamed without changes.

sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/PartitionReaderFactory.java renamed to sql/catalyst/src/main/java/org/apache/spark/sql/sources/v2/reader/PartitionReaderFactory.java

File renamed without changes.
