
Conversation

@yaooqinn
Member

What changes were proposed in this pull request?

This PR adds Java ServiceProvider Interface (SPI) support for dynamic JDBC dialect registration.

A custom JDBC dialect can now be registered automatically instead of by calling JdbcDialects.registerDialect manually.

Why are the changes needed?

For pure SQL and other non-Java API users, it is difficult to register a custom JDBC dialect. With this patch, registration happens automatically as long as the jar containing the dialect class is visible to the Spark classloader.
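A minimal sketch of the intended usage, assuming the SPI extension point is the org.apache.spark.sql.jdbc.JdbcDialect class itself (the exact service interface and all names below are illustrative, not taken from this PR):

```scala
// Hypothetical custom dialect shipped in a third-party jar.
// The jar would also carry a service file, e.g.
//   META-INF/services/org.apache.spark.sql.jdbc.JdbcDialect
// containing the single line: com.example.jdbc.MyDatabaseDialect
package com.example.jdbc

import org.apache.spark.sql.jdbc.JdbcDialect

class MyDatabaseDialect extends JdbcDialect {
  // Claim JDBC URLs of the form jdbc:mydb://...
  override def canHandle(url: String): Boolean = url.startsWith("jdbc:mydb")
}
```

With such a jar on the Spark classloader, java.util.ServiceLoader can discover the dialect at startup, so no explicit JdbcDialects.registerDialect call is needed.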

Does this PR introduce any user-facing change?

Yes, but mostly for third-party developers

How was this patch tested?

new tests

Was this patch authored or co-authored using generative AI tooling?

no

@github-actions github-actions bot added the SQL label Mar 21, 2024
override def supportsOffset: Boolean = true
}

private[jdbc] object OracleDialect {
Member

@dongjoon-hyun Mar 21, 2024


Ur, actually, case object OracleDialect has more features (serializable, hashCode, toString) than object OracleDialect. Do we need to shrink features like this?
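For context, a rough sketch of the difference being discussed (Plain and Cased are illustrative names, not Spark code):

```scala
object CaseObjectDemo extends App {
  object Plain                // no Serializable, default Object-style toString/hashCode
  case object Cased           // compiler mixes in Product with Serializable and generates toString/hashCode

  println(Cased)                                    // prints "Cased"
  println(Plain)                                    // prints something like "CaseObjectDemo$Plain$@1b6d3586"
  println(Cased.isInstanceOf[Serializable])         // true
  println((Plain: Any).isInstanceOf[Serializable])  // false
}
```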

Member Author

@yaooqinn Mar 21, 2024


This object is only a place for some constant ints that represent type variants from the Oracle JDBC extensions, so a plain object is enough.
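A sketch of what such a constants-only object looks like (names and values are illustrative, not copied from the real OracleDialect):

```scala
// Type codes from the Oracle JDBC extensions; no instances are created or
// shipped to executors, so case-object features are not needed here.
private[jdbc] object OracleDialect {
  final val BINARY_FLOAT = 100
  final val BINARY_DOUBLE = 101
  final val TIMESTAMP_TZ = -101
}
```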

Member Author


Oops, errors occurred:

- simple scan with LIMIT *** FAILED *** (65 milliseconds)
[info]   org.apache.spark.SparkException: Task not serializable

assert(namedObservation.get === expected)
}


Member


extra empty line?

Member

@dongjoon-hyun left a comment


+1, LGTM (with minor comments). It's a nice feature, @yaooqinn!

@yaooqinn
Member Author

Thank you @dongjoon-hyun.

I addressed the comments

@dongjoon-hyun
Member

#45626 (comment) still seems to be missed. Are you going to recover it, @yaooqinn?

I addressed the comments

@yaooqinn
Member Author

Yes, @dongjoon-hyun, it was caused by the non-serializable object SpecificTypes inside MsSqlServerDialect.
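A minimal sketch of that failure mode (an assumption about the mechanism, not the exact Spark code; the names and value are illustrative):

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}

// A Serializable class whose nested object is not Serializable fails plain
// Java serialization once the nested module has been initialized; Spark
// surfaces this as "Task not serializable" when shipping the dialect to executors.
class Dialect extends Serializable {
  object SpecificTypes { val GEOMETRY: Int = -157 }  // nested object, not Serializable
  def geometryType: Int = SpecificTypes.GEOMETRY
}

object Repro extends App {
  val d = new Dialect
  d.geometryType   // forces initialization of the nested module field
  new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(d)
  // throws java.io.NotSerializableException for the nested module class
}
```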

@beliefer
Contributor

beliefer commented Mar 21, 2024

@yaooqinn Is it really worth it? As far as I know, users can already register dialects with JdbcDialects.registerDialect in a V2 catalog: they can implement CatalogPlugin and register both the catalog and the dialects.
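For reference, the manual path being described looks roughly like this (MyDialect is a hypothetical implementation; only JdbcDialects.registerDialect is an existing Spark API):

```scala
import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects}

object RegisterManually {
  // Hypothetical dialect implementation.
  class MyDialect extends JdbcDialect {
    override def canHandle(url: String): Boolean = url.startsWith("jdbc:mydb")
  }

  def main(args: Array[String]): Unit = {
    // Must run in user code before any JDBC read/write that needs the dialect.
    JdbcDialects.registerDialect(new MyDialect)
  }
}
```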

@github-actions github-actions bot added the BUILD label Mar 21, 2024
@yaooqinn
Member Author

@beliefer Yes, it's worth it.

Actually, there are plenty of hacky ways to register dialects, like the one you mentioned, but we can't treat those as typical API usage.

@yaooqinn yaooqinn closed this in 2b0e841 Mar 21, 2024
@yaooqinn yaooqinn deleted the SPARK-47496 branch March 21, 2024 11:35
@yaooqinn
Member Author

Merged to master.

Thank you @dongjoon-hyun

@smileyboy2019

In which version will this be released? And can multiple databases be joined in a single query?

@elijah-pl

@yaooqinn is this only in Spark 4, or in Spark 3 as well?

@yaooqinn
Member Author

@elijah-pl please refer to the JIRA for the fixed versions.
