feat: Introduce DataFormatOptions to configure the output of BigQuery data types#4010
Summary of Changes
Hello @lqiu96, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request refactors how data formatting options are managed within the BigQuery client library.
Code Review
This pull request introduces a DataFormatOptions wrapper to encapsulate data format settings, deprecating the old useInt64Timestamps option. The changes are well-structured, but I've found a couple of issues. There's a potential for NullPointerException in the new DataFormatOptions class, and a state inconsistency issue in BigQueryOptions due to the new obsolete setter. My review includes suggestions to fix these problems.
google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/DataFormatOptions.java
google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryOptions.java
/gemini review
Code Review
The pull request introduces a new DataFormatOptions class to centralize data format configurations, which is a great improvement for configurability. The changes correctly deprecate the old useInt64Timestamps option in BigQueryOptions and provide a backward-compatible way to handle it. However, the refactoring appears to be incomplete. While QueryRequestInfo is updated, several other places in BigQueryImpl still call the obsolete getOptions().getUseInt64Timestamps(). These should be updated to use the new DataFormatOptions to ensure consistent behavior. I've also added a few specific comments on the new DataFormatOptions class and on improving Javadoc clarity for the deprecated methods.
google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/DataFormatOptions.java
google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryOptions.java
google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/DataFormatOptions.java
/gemini review
Code Review
This pull request introduces a new DataFormatOptions class to encapsulate data formatting settings, starting with useInt64Timestamp. The changes are well-implemented across the codebase, with good handling of backward compatibility by deprecating old methods and ensuring the new and old settings work together correctly. New tests have been added to verify this behavior. I've found one potential issue in the new DataFormatOptions class that could lead to a runtime exception, for which I've provided a suggestion.
google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/DataFormatOptions.java
   *
   * <p>{@code DataFormatOptions.newBuilder().setUseInt64Timestamp(...).build()}
   */
  @ObsoleteApi("Use setDataFormatOptions(DataFormatOptions) instead")
Is there a reason you would prefer @ObsoleteApi over @Deprecated?
@Deprecated could result in downstream customers' CI jobs failing, depending on their compiler settings. We prefer @ObsoleteApi as a first warning, and then move to @Deprecated in a future major version.
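The distinction can be sketched with a stand-in annotation. `SoftObsolete` below is hypothetical and only mimics the role `@ObsoleteApi` plays in this thread; `LegacyOptions` and `ObsoleteMarkerDemo` are likewise illustrative names, not library code. The idea is that a custom marker annotation is invisible to javac's deprecation lint, so builds that combine `-Xlint:deprecation` with `-Werror` keep compiling, whereas `@Deprecated` would raise a warning there.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Hypothetical stand-in for a "soft" obsolescence marker; not the real @ObsoleteApi.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface SoftObsolete {
  String value();
}

// Illustrative options holder; the names are assumptions for this sketch.
class LegacyOptions {
  private boolean useInt64Timestamps;

  // Marked obsolete without @Deprecated, so callers see no deprecation lint.
  @SoftObsolete("Use the newer configuration object instead")
  void setUseInt64Timestamps(boolean value) {
    this.useInt64Timestamps = value;
  }

  boolean getUseInt64Timestamps() {
    return useInt64Timestamps;
  }
}

class ObsoleteMarkerDemo {
  private static Method find(Class<?> type, String name, Class<?>... params) {
    try {
      return type.getDeclaredMethod(name, params);
    } catch (NoSuchMethodException e) {
      throw new IllegalArgumentException(e);
    }
  }

  // Returns the obsolescence message attached to a method, or null if there is none.
  static String obsoleteMessage(Class<?> type, String name, Class<?>... params) {
    SoftObsolete marker = find(type, name, params).getAnnotation(SoftObsolete.class);
    return marker == null ? null : marker.value();
  }

  // True only if the method carries the standard @Deprecated annotation.
  static boolean isDeprecated(Class<?> type, String name, Class<?>... params) {
    return find(type, name, params).isAnnotationPresent(Deprecated.class);
  }
}
```

Tooling can still discover the marker through reflection or an annotation processor, which is what makes it usable as a first warning before a hard deprecation.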
  }
  if (request.getTimestampOutputFormat() != null) {
    builder.timestampFormatOptions(
        TimestampFormatOptions.valueOf(request.getTimestampOutputFormat()));
If the BQ backend adds a new format, this would fail, right? Can we wrap this in a try-catch so that it falls back to TIMESTAMP_OUTPUT_FORMAT_UNSPECIFIED?
Ah good point. This format was created from the proto files, but really only customers should be creating and setting it on the client side. This shouldn't come back as a server response at all.
I'll remove the fromPb() method.
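For reference, the fallback the reviewer asked about could look roughly like the sketch below. It is not what was merged, since the author chose to remove `fromPb()` instead. `SketchTimestampFormat` and `fromWire` are hypothetical names, and every constant other than `TIMESTAMP_OUTPUT_FORMAT_UNSPECIFIED` is an illustrative assumption.

```java
// Hypothetical enum standing in for the client-side timestamp format type.
enum SketchTimestampFormat {
  TIMESTAMP_OUTPUT_FORMAT_UNSPECIFIED,
  FLOAT64,
  INT64,
  ISO8601_STRING;

  // Maps a wire-format name to a constant, falling back to UNSPECIFIED
  // when the name is null or not known to this version of the client.
  static SketchTimestampFormat fromWire(String name) {
    if (name == null) {
      return TIMESTAMP_OUTPUT_FORMAT_UNSPECIFIED;
    }
    try {
      return valueOf(name);
    } catch (IllegalArgumentException e) {
      // Enum.valueOf throws IllegalArgumentException for unknown names.
      return TIMESTAMP_OUTPUT_FORMAT_UNSPECIFIED;
    }
  }
}
```

A plain `valueOf` call, as in the snippet under review, would instead propagate the `IllegalArgumentException` to the caller as soon as the server returned an unrecognized value.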
… data types (#4010)
* feat: Create DataFormatOptions in BigQuery
* feat: Add Builder class for DataFormatOptions
* fix: Update existing references of useInt64Timestamp to use DataFormatOption's variant
* chore: Fix lint issues
* chore: Address PR feedback
* chore: Add tests for useInt64Timestamp behavior
* chore: Address failing tests and GCA
* chore: Remove unused fromPb method
See b/447623336 for more information.
Introduces a DataFormatOptions class to configure how data is formatted in BigQuery outputs. It includes the existing useInt64Timestamps functionality, which is currently a setting in BigQueryOptions. That setting is absorbed into the new DataFormatOptions class.

This PR does not change the logic of the existing BigQuery code, but converts any existing uses of BigQueryOptions' useInt64Timestamps to reference BigQueryOptions' DataFormatOptions.useInt64Timestamps value instead.
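The relationship described above can be sketched as follows. This is a simplified illustration, not the library's code: `SketchDataFormatOptions` and `SketchClientOptions` are hypothetical stand-ins for DataFormatOptions and BigQueryOptions, and only the method names quoted in the review thread (`newBuilder()`, `setUseInt64Timestamp(...)`, `build()`, `setDataFormatOptions(...)`, `getUseInt64Timestamps()`) are taken from the PR.

```java
import java.util.Objects;

// Simplified stand-in for the DataFormatOptions class described above.
final class SketchDataFormatOptions {
  private final boolean useInt64Timestamp;

  private SketchDataFormatOptions(Builder builder) {
    this.useInt64Timestamp = builder.useInt64Timestamp;
  }

  boolean useInt64Timestamp() {
    return useInt64Timestamp;
  }

  static Builder newBuilder() {
    return new Builder();
  }

  // Copies the current values into a fresh builder (an assumed convenience).
  Builder toBuilder() {
    return new Builder().setUseInt64Timestamp(useInt64Timestamp);
  }

  static final class Builder {
    private boolean useInt64Timestamp; // primitive, so it can never be null

    Builder setUseInt64Timestamp(boolean value) {
      this.useInt64Timestamp = value;
      return this;
    }

    SketchDataFormatOptions build() {
      return new SketchDataFormatOptions(this);
    }
  }
}

// Simplified stand-in for the client options that own the format settings.
final class SketchClientOptions {
  private SketchDataFormatOptions dataFormatOptions =
      SketchDataFormatOptions.newBuilder().build();

  void setDataFormatOptions(SketchDataFormatOptions options) {
    this.dataFormatOptions = Objects.requireNonNull(options, "dataFormatOptions");
  }

  SketchDataFormatOptions getDataFormatOptions() {
    return dataFormatOptions;
  }

  // Legacy setter: writes through to the new options object so both views agree.
  void setUseInt64Timestamps(boolean value) {
    this.dataFormatOptions =
        dataFormatOptions.toBuilder().setUseInt64Timestamp(value).build();
  }

  // Legacy getter: reads from the new options object, the single source of truth.
  boolean getUseInt64Timestamps() {
    return dataFormatOptions.useInt64Timestamp();
  }
}
```

Keeping a single backing field is one way to avoid the state inconsistency the first review flagged, because the old and new accessors cannot drift apart.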