[Feature]Add Kingbase Catalog Support #10427

Open
LeonYoah wants to merge 21 commits into apache:dev from LeonYoah:featrue/kingbase-catalog

@LeonYoah
Collaborator

Purpose of this pull request

This PR adds catalog support for the domestic database Kingbase, enabling metadata querying and automatic table creation. It also includes a related fix: when a JDBC source uses query parameters against a Kingbase instance running in MySQL compatibility mode, the job previously failed with an "unsupported INT type" error, because the Kingbase JDBC driver returns INT types in that mode.

Does this PR introduce any user-facing change?

How was this patch tested?

Check list

@davidzollo
Contributor

Hi @LeonYoah, thanks for this contribution! Adding Catalog support for Kingbase is a great enhancement.

After reviewing the code, I have a few suggestions to improve code consistency and compatibility:

1. Naming Inconsistency

The existing classes in the project typically use Kingbase (only 'K' capitalized), such as KingbaseDialect and KingbaseTypeMapper.
However, the new classes introduced in this PR use KingBase (CamelCase 'K' and 'B'), such as KingBaseCatalog and KingBaseCatalogFactory.

Suggestion: Please rename the new classes to match the existing convention:

  • KingBaseCatalog -> KingbaseCatalog
  • KingBaseCatalogFactory -> KingbaseCatalogFactory
  • KingBaseCreateTableSqlBuilder -> KingbaseCreateTableSqlBuilder

2. Compatibility Risk with rownum

In KingBaseCatalog.java:

@Override
protected String getExistDataSql(TablePath tablePath) {
    return String.format(
            "select * from \"%s\".\"%s\" WHERE rownum = 1",
            tablePath.getSchemaName(), tablePath.getTableName());
}

Risk: rownum is Oracle-specific syntax. While Kingbase supports an Oracle compatibility mode, it usually runs in Postgres compatibility mode by default, where rownum might not be supported or might behave differently. This could cause the catalog to fail in environments running in standard PG mode.

Suggestion: Unless rownum is guaranteed to work in all Kingbase modes, it is safer to use the Postgres-style LIMIT 1:

"select * from \"%s\".\"%s\" LIMIT 1"
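
For illustration, the LIMIT 1 variant could be built roughly as follows. This is a minimal sketch that takes plain strings instead of SeaTunnel's TablePath; the class name ExistDataSql and its quote helper are hypothetical and not part of the PR. Doubling embedded double quotes keeps the quoted identifiers well-formed.

```java
// Hypothetical helper (not part of the PR): builds the "does the table hold data" probe.
final class ExistDataSql {
    private ExistDataSql() {}

    // Wrap an identifier in double quotes, doubling any embedded double quote.
    static String quote(String identifier) {
        return "\"" + identifier.replace("\"", "\"\"") + "\"";
    }

    // Returns a query that fetches at most one row from schema.table.
    static String build(String schemaName, String tableName) {
        return String.format(
                "SELECT * FROM %s.%s LIMIT 1", quote(schemaName), quote(tableName));
    }
}
```

Whether LIMIT is accepted in every Kingbase compatibility mode would still need to be verified against a real instance.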

3. Documentation

It would be helpful to update the Features section in docs/en/connectors/source/Kingbase.md to indicate that Kingbase now supports the Catalog feature.

@LeonYoah
Collaborator Author

Thank you for your review. I will make the changes.

@DanielCarter-stack

Issue 1: All Tests Disabled

Location: seatunnel-connectors-v2/connector-jdbc/src/test/java/org/apache/seatunnel/connectors/seatunnel/jdbc/catalog/kingbase/KingbaseCatalogTest.java:21

@Disabled("Please Test it in your local environment")
class KingbaseCatalogTest {
    // All test methods are disabled
}

Problem Description:
The test class is disabled by the @Disabled annotation, preventing CI from running these tests. This violates the "covered with tests" promise in the PR Checklist.

Potential Risks:

  • Unable to verify code correctness in CI
  • Potential regression issues cannot be detected in a timely manner
  • Reduced code quality assurance

Scope of Impact:

  • Direct impact: KingbaseCatalog, KingbaseCreateTableSqlBuilder
  • Indirect impact: All features depending on Kingbase Catalog
  • Affected area: Kingbase Connector

Severity: CRITICAL

Improvement Suggestions:

// Remove @Disabled or use Testcontainers
@Testcontainers
class KingbaseCatalogTest {
    @Container
    private static final PostgreSQLContainer<?> kingbaseContainer =
        new PostgreSQLContainer<>("kingbase/kes:latest")
            .withDatabaseName("test")
            .withUsername("kingbase")
            .withPassword("kingbase");
    
    @Test
    void databaseExists() {
        // Test using real containers
    }
}

Reason: Apache projects require all code changes to be covered by executable tests.


Issue 2: Hardcoded Chinese Exception Messages

Location: seatunnel-connectors-v2/connector-jdbc/src/main/java/org/apache/seatunnel/connectors/seatunnel/jdbc/catalog/kingbase/KingbaseCatalog.java:146

} catch (SQLException e) {
    throw new CatalogException("查询数据库是否存在失败: " + databaseName, e);
}

Problem Description:
Using hardcoded Chinese strings as exception messages does not comply with internationalization standards. SeaTunnel is an international project, and error messages should be in English.

Related Context:

  • Parent class AbstractJdbcCatalog uses English error messages
  • Other Catalog implementations (such as PostgresCatalog) use English

Potential Risks:

  • International users cannot understand error messages
  • Difficult log analysis
  • Inconsistent with project standards

Scope of Impact:

  • Direct impact: KingbaseCatalog.databaseExists()
  • Indirect impact: All user code calling this method
  • Affected area: Single Connector

Severity: MAJOR

Improvement Suggestions:

} catch (SQLException e) {
    throw new CatalogException(
        String.format("Failed to check if database exists: %s", databaseName), 
        e
    );
}

Reason: Complies with Apache project internationalization standards.


Issue 3: Missing Null Check

Location: seatunnel-connectors-v2/connector-jdbc/src/main/java/org/apache/seatunnel/connectors/seatunnel/jdbc/catalog/kingbase/KingbaseCatalog.java:196-214

@Override
protected Column buildColumn(ResultSet resultSet) throws SQLException {
    String columnName = resultSet.getString("COLUMN_NAME");
    String typeName = resultSet.getString("TYPE_NAME");
    String fullTypeName = resultSet.getString("FULL_TYPE_NAME");
    // ...
    Object defaultValue = resultSet.getObject("DEFAULT_VALUE");
    boolean isNullable = resultSet.getString("IS_NULLABLE").equals("YES");
    // ...
}

Problem Description:
There is no null check on the return values of getString(). If IS_NULLABLE is NULL in the query result, calling .equals() on it throws a NullPointerException.

Related Context:

  • SELECT_COLUMNS_SQL_TEMPLATE uses LEFT JOIN, which may produce NULL values
  • Especially column_comment and default_value fields may be NULL

Potential Risks:

  • When querying columns containing NULL values, NPE will be thrown
  • Users cannot retrieve table metadata
  • Job failure

Scope of Impact:

  • Direct impact: KingbaseCatalog.getTable()
  • Indirect impact: All operations using Catalog API to query table metadata
  • Affected area: Single Connector

Severity: MAJOR

Improvement Suggestions:

@Override
protected Column buildColumn(ResultSet resultSet) throws SQLException {
    String columnName = resultSet.getString("COLUMN_NAME");
    if (columnName == null) {
        throw new SQLException("COLUMN_NAME cannot be null");
    }
    
    String typeName = resultSet.getString("TYPE_NAME");
    String fullTypeName = resultSet.getString("FULL_TYPE_NAME");
    long columnLength = resultSet.getLong("COLUMN_LENGTH");
    long columnPrecision = resultSet.getLong("COLUMN_PRECISION");
    int columnScale = resultSet.getInt("COLUMN_SCALE");
    
    String columnComment = resultSet.getString("COLUMN_COMMENT");
    Object defaultValue = resultSet.getObject("DEFAULT_VALUE");
    
    String isNullableStr = resultSet.getString("IS_NULLABLE");
    boolean isNullable = "YES".equalsIgnoreCase(isNullableStr);
    
    BasicTypeDefine typeDefine = BasicTypeDefine.builder()
            .name(columnName)
            .columnType(typeName)
            .dataType(typeName)
            .length(columnLength)
            .precision(columnPrecision)
            .scale(columnScale)
            .nullable(isNullable)
            .defaultValue(defaultValue)
            .comment(columnComment)
            .build();
    return KingbaseTypeConverter.INSTANCE.convert(typeDefine);
}

Reason: Defensive programming to avoid NPE.
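
The null-safe comparison used in the suggestion can be checked in isolation. A minimal sketch, with a hypothetical class name (NullableFlag) that does not exist in the PR:

```java
// Hypothetical helper (not part of the PR): null-safe parsing of an IS_NULLABLE value.
final class NullableFlag {
    private NullableFlag() {}

    // Putting the constant first means a null argument yields false instead of an NPE.
    static boolean isNullable(String isNullableValue) {
        return "YES".equalsIgnoreCase(isNullableValue);
    }
}
```

Treating a NULL value as "not nullable" is a design choice in this sketch; a catalog could equally decide to fail fast with a descriptive exception.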


Issue 4: SQL Injection Risk (Potential)

Location: seatunnel-connectors-v2/connector-jdbc/src/main/java/org/apache/seatunnel/connectors/seatunnel/jdbc/catalog/kingbase/KingbaseCatalog.java:189-192

@Override
protected String getSelectColumnsSql(TablePath tablePath) {
    return String.format(
            SELECT_COLUMNS_SQL_TEMPLATE, 
            tablePath.getSchemaName(), 
            tablePath.getTableName());
}

Problem Description:
Using String.format to directly concatenate user input into SQL poses SQL injection risks. Although there are single quotes wrapping it in SELECT_COLUMNS_SQL_TEMPLATE, the input is not escaped.

Related Context:

  • Other implementations of AbstractJdbcCatalog (such as PostgresCatalog) also use the same pattern
  • This is a systemic issue, not specific to this PR
  • However, this PR introduces this risk point

Potential Risks:

  • If schema or table names contain single quotes, SQL syntax errors will occur
  • Malicious users may construct special table names for SQL injection attacks

Scope of Impact:

  • Direct impact: KingbaseCatalog.getSelectColumnsSql()
  • Indirect impact: buildColumn(), getTable()
  • Affected area: Single Connector

Severity: MAJOR

Improvement Suggestions:

@Override
protected String getSelectColumnsSql(TablePath tablePath) {
    String schemaName = escapeSqlIdentifier(tablePath.getSchemaName());
    String tableName = escapeSqlIdentifier(tablePath.getTableName());
    return String.format(SELECT_COLUMNS_SQL_TEMPLATE, schemaName, tableName);
}

private String escapeSqlIdentifier(String identifier) {
    return identifier.replace("'", "''");
}

Reason: Prevent SQL injection, comply with secure coding standards.
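
Note that the values substituted into the template sit inside single quotes, so they are string literals rather than identifiers. A minimal sketch of literal escaping follows; the class name SqlLiteral and the filter text are hypothetical stand-ins for SELECT_COLUMNS_SQL_TEMPLATE, and the sketch assumes the server treats backslashes in string literals as ordinary characters.

```java
// Hypothetical helper (not part of the PR): escapes values placed inside '...' literals.
final class SqlLiteral {
    private SqlLiteral() {}

    // Double every single quote so the value cannot terminate the literal early.
    static String escape(String value) {
        return value.replace("'", "''");
    }

    // Stand-in for the real template: filters catalog rows by schema and table name.
    static String columnsFilter(String schemaName, String tableName) {
        return String.format(
                "table_schema = '%s' AND table_name = '%s'",
                escape(schemaName), escape(tableName));
    }
}
```

Where the surrounding catalog code allows it, binding the names as PreparedStatement parameters avoids manual escaping altogether.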


Issue 5: KB_CLOB Type Sets columnLength Twice

Location: seatunnel-connectors-v2/connector-jdbc/src/main/java/org/apache/seatunnel/connectors/seatunnel/jdbc/internal/dialect/kingbase/KingbaseTypeConverter.java:190-194

case KB_CLOB:
    builder.dataType(BasicType.STRING_TYPE);
    builder.columnLength(typeDefine.getLength());
    builder.columnLength((long) (1024 * 1024 * 1024));  // Second call overwrites the first
    break;

Problem Description:
columnLength is set twice. The second call overwrites the first, so typeDefine.getLength() is ignored.

Related Context:

  • builder is Lombok's @Builder pattern
  • The second call will overwrite the first value
  • This is an obvious copy-paste error

Potential Risks:

  • CLOB type length is always set to 1GB
  • Ignores the actual length defined in the database
  • May lead to inaccurate type mapping

Scope of Impact:

  • Direct impact: KB_CLOB type conversion
  • Indirect impact: CLOB column metadata
  • Affected area: Single Connector

Severity: MINOR

Improvement Suggestions:

case KB_CLOB:
    builder.dataType(BasicType.STRING_TYPE);
    if (typeDefine.getLength() != null && typeDefine.getLength() > 0) {
        builder.columnLength(typeDefine.getLength());
    } else {
        builder.columnLength((long) (1024 * 1024 * 1024));
    }
    break;

Reason: Fix logic error, preserve user-defined length.
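
The fallback rule can be expressed as a small pure function, which makes it easy to unit test without a database. A minimal sketch with hypothetical names (ClobLength, DEFAULT_CLOB_LENGTH):

```java
// Hypothetical helper (not part of the PR): picks the length to report for a CLOB column.
final class ClobLength {
    // 1 GiB, the fallback used when the database reports no usable length.
    static final long DEFAULT_CLOB_LENGTH = 1024L * 1024 * 1024;

    private ClobLength() {}

    // Keep a positive declared length; otherwise fall back to the default.
    static long resolve(Long declaredLength) {
        return (declaredLength != null && declaredLength > 0)
                ? declaredLength
                : DEFAULT_CLOB_LENGTH;
    }
}
```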


Issue 6: MySqlTypeConverter Constant Visibility Modification Affects Encapsulation

Location: seatunnel-connectors-v2/connector-jdbc/src/main/java/org/apache/seatunnel/connectors/seatunnel/jdbc/internal/dialect/mysql/MySqlTypeConverter.java:45-98

// Before modification
-static final String MYSQL_INT = "INT";
-static final String MYSQL_DATETIME = "DATETIME";
// ... 38 constants

// After modification
+public static final String MYSQL_INT = "INT";
+public static final String MYSQL_DATETIME = "DATETIME";
// ... 38 constants

Problem Description:
Changing 38 package-private constants to public constants increases the API surface area and weakens encapsulation. Although this solves Kingbase's needs, it is not the optimal solution.

Related Context:

  • Referenced location: KingbaseTypeConverter.java:73-116
  • Caller: case MySqlTypeConverter.MYSQL_INT:
  • Similar references: OracleTypeConverter.ORACLE_NUMBER, SqlServerTypeConverter.SQLSERVER_DATETIME2

Potential Risks:

  • These constants now become part of the public API
  • Modifying these constants will affect all callers
  • Increases maintenance burden for MySQL, Oracle, and SQL Server implementations
  • Violates the "Principle of Least Knowledge"

Scope of Impact:

  • Direct impact: MySqlTypeConverter, OracleTypeConverter, SqlServerTypeConverter
  • Indirect impact: Other code that may depend on these constants in the future
  • Affected area: Multiple Connectors

Severity: MINOR

Improvement Suggestions:

// Create new file: JdbcCompatibilityTypeConstants.java
package org.apache.seatunnel.connectors.seatunnel.jdbc.internal.dialect;

public final class JdbcCompatibilityTypeConstants {
    // MySQL compatibility types
    public static final String MYSQL_INT = "INT";
    public static final String MYSQL_DATETIME = "DATETIME";
    // ...
    
    // Oracle compatibility types
    public static final String ORACLE_NUMBER = "NUMBER";
    // ...
    
    // SQLServer compatibility types
    public static final String SQLSERVER_DATETIME2 = "DATETIME2";
    // ...
    
    private JdbcCompatibilityTypeConstants() {}
}

Then reference in Kingbase:

import static org.apache.seatunnel.connectors.seatunnel.jdbc.internal.dialect.JdbcCompatibilityTypeConstants.*;

case MYSQL_INT:
    builder.dataType(BasicType.INT_TYPE);
    break;

Reason: Centrally manage compatibility type constants, reduce coupling, improve maintainability.


Issue 8: seatunnel-dist/pom.xml Not Updated

Location: seatunnel-dist/pom.xml

Problem Description:
The PR did not update the seatunnel-dist/pom.xml file. According to the Checklist, when adding a new connector, this file needs to be updated.

Related Context:

  • Checklist requirement: "Update the pom file of seatunnel-dist"
  • Kingbase uses the existing connector-jdbc, so may not need independent dependencies

Potential Risks:

  • If Kingbase requires specific JDBC driver dependencies, they need to be declared in the dist pom
  • May result in releases missing necessary drivers

Scope of Impact:

  • Direct impact: SeaTunnel release
  • Indirect impact: Users cannot use Kingbase after downloading the release
  • Affected area: Build and distribution

Severity: MAJOR

Improvement Suggestions:
Check if Kingbase JDBC driver dependency needs to be added:

<!-- In seatunnel-dist/pom.xml -->
<dependency>
    <groupId>com.kingbase</groupId>
    <artifactId>kingbase8</artifactId>
    <version>${kingbase.driver.version}</version>
</dependency>

Reason: Ensure releases include all necessary dependencies.


Issue 9: Incomplete Documentation

Location: docs/en/connectors/source/Kingbase.md, docs/zh/connectors/source/Kingbase.md

Problem Description:
The documentation only adds Connector configuration items, without explaining how to use Catalog functionality, supported data type mappings, and special handling for MySQL compatibility mode.

Related Context:

  • PR title: "Add Kingbase Catalog Support"
  • Documentation only updates the Source configuration section
  • Missing Catalog API usage examples

Potential Risks:

  • Users don't know how to use Kingbase Catalog functionality
  • Missing type mapping table may cause usage confusion
  • Special handling for MySQL compatibility mode is not documented

Scope of Impact:

  • Direct impact: Kingbase users
  • Indirect impact: User support and troubleshooting
  • Affected area: Kingbase Connector

Severity: MAJOR

Improvement Suggestions:
Add in docs/en/connectors/source/Kingbase.md:

## Catalog Support

Kingbase connector supports catalog functionality for metadata querying and automatic table creation.

### Supported Data Types

| Kingbase Type | SeaTunnel Type | Notes |
|---------------|----------------|-------|
| INT | INT | MySQL compatibility mode |
| INTEGER | INT | Native type |
| VARCHAR | STRING | |
| NUMBER | DECIMAL | Oracle compatibility mode |
| ... | ... | |

### MySQL Compatibility Mode

When Kingbase is running in MySQL compatibility mode, the connector will automatically map MySQL types (e.g., INT, DATETIME) to appropriate SeaTunnel types.

### Example
```hocon
source {
  KingbaseCatalog {
    catalog {
      database = "test_db"
      schema = "public"
    }
    table_path = "public.my_table"
  }
}
```

Reason: Provide complete usage documentation to reduce user learning costs.


Issue 10: label-scope-conf.yml Not Updated

Location: .github/workflows/labeler/label-scope-conf.yml

Problem Description:
The PR did not update the label-scope-conf.yml file. According to the Checklist, a CI label must be added when adding a new connector.

Related Context:

  • Checklist requirement: "Add ci label in label-scope-conf.yml"
  • Kingbase Catalog support is newly added
  • The CI system needs to be able to recognize Kingbase-related PRs

Potential Risks:

  • Kingbase-related PRs cannot be labeled automatically
  • May cause the CI workflow to run incorrectly

Scope of Impact:

  • Direct impact: GitHub Actions CI
  • Indirect impact: Automated PR handling
  • Affected area: CI/CD pipeline

Severity: MINOR

Improvement Suggestions:
Add in .github/workflows/labeler/label-scope-conf.yml:

connector-jdbc:
  - connector-jdbc/**/*
  - "**/*kingbase*"

Reason: Complies with PR Checklist requirements, ensures correct CI operation.


@LeonYoah changed the title from "Featrue/Add Kingbase Catalog Support" to "[Feature]Add Kingbase Catalog Support" on Feb 2, 2026

import java.util.List;

@Disabled("Please Test it in your local environment")
Collaborator

Please run the unit tests against a container and add dialect and catalog tests; see #10210 for reference.

Collaborator Author

OK, I will modify the Kingbase unit test support and add container test support.

BasicTypeDefine typeDefine =
        BasicTypeDefine.builder()
                .name(columnName)
                .columnType(typeName)
Collaborator

FULL_TYPE_NAME is read but not used; columnType/sourceType loses information such as VARCHAR(255), CHAR(10), and NUMERIC(38,18).
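
To illustrate what is lost, a full type name such as NUMERIC(38,18) carries a base name plus length/precision and scale. A minimal sketch of splitting it; the class FullTypeName is hypothetical and not part of the PR, and it only handles the simple NAME, NAME(n) and NAME(n, m) shapes (types such as "timestamp(6) without time zone" are rejected).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of the PR): splits "NUMERIC(38,18)" into its parts.
final class FullTypeName {
    // Base name, optionally followed by "(n)" or "(n, m)".
    private static final Pattern PATTERN =
            Pattern.compile(
                    "^\\s*([^()]+?)\\s*(?:\\(\\s*(\\d+)\\s*(?:,\\s*(\\d+)\\s*)?\\))?\\s*$");

    final String baseName;
    final Long length;   // null when no "(n)" part is present
    final Integer scale; // null when no ", m" part is present

    private FullTypeName(String baseName, Long length, Integer scale) {
        this.baseName = baseName;
        this.length = length;
        this.scale = scale;
    }

    static FullTypeName parse(String fullTypeName) {
        Matcher m = PATTERN.matcher(fullTypeName);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unsupported type name: " + fullTypeName);
        }
        Long length = m.group(2) == null ? null : Long.valueOf(m.group(2));
        Integer scale = m.group(3) == null ? null : Integer.valueOf(m.group(3));
        return new FullTypeName(m.group(1), length, scale);
    }
}
```

In the catalog itself, passing the unparsed FULL_TYPE_NAME through as the column type may be enough; the parsing above only shows which details the short TYPE_NAME omits.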

.append(CatalogUtils.quoteIdentifier(column.getName(), fieldIde, "\""))
.append(CatalogUtils.quoteIdentifier(" IS '", fieldIde))
.append(column.getComment())
.append("'");
Collaborator

Unescaped single quotes in comment concatenation: it is recommended to use replace("'", "''").
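
A minimal sketch of the escaping, assuming the Postgres-style COMMENT ON COLUMN statement and identifiers that are already quoted; the class name ColumnCommentSql is hypothetical and not part of the PR.

```java
// Hypothetical helper (not part of the PR): builds a column comment statement.
final class ColumnCommentSql {
    private ColumnCommentSql() {}

    // quotedTable and quotedColumn are expected to be quoted already, e.g. "public"."t1".
    static String build(String quotedTable, String quotedColumn, String comment) {
        // Double single quotes so the comment text cannot end the literal early.
        String escaped = comment.replace("'", "''");
        return "COMMENT ON COLUMN " + quotedTable + "." + quotedColumn
                + " IS '" + escaped + "'";
    }
}
```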

Collaborator Author

Thank you for your review. Please note that this feature is already available in other connectors. I forgot to add it and will include it later.

Member

Please obtain KINGBASE_LICENSE through environment variables

Collaborator Author

Could you give an example? Which variable is it placed in? Does it need to be set in the GitHub repository, such as in GitHub Secrets?
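
One possible shape for the Java side, as a sketch only: the test code reads KINGBASE_LICENSE from the process environment and skips or fails cleanly when it is absent. The class and method names below are hypothetical, and how the value reaches CI (a repository variable, a secret, or something else) is not settled in this thread.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper (not part of the PR): looks up the license from the environment.
final class KingbaseLicenseEnv {
    static final String VAR_NAME = "KINGBASE_LICENSE";

    private KingbaseLicenseEnv() {}

    // Takes the environment as a map so the lookup can be tested without real env vars.
    static Optional<String> resolve(Map<String, String> env) {
        String value = env.get(VAR_NAME);
        if (value == null || value.trim().isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(value.trim());
    }

    // Convenience wrapper around the real process environment.
    static Optional<String> fromProcessEnvironment() {
        return resolve(System.getenv());
    }
}
```

A test could then call KingbaseLicenseEnv.fromProcessEnvironment() and skip itself when the result is empty, so contributors without a license can still run the rest of the suite.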

@github-actions bot added the CI&CD label on Feb 4, 2026
davidzollo previously approved these changes Feb 6, 2026
Contributor

@davidzollo left a comment

+1
LGTM

./mvnw -B -T 1 clean verify -DskipUT=false -DskipIT=true -D"license.skipAddThirdParty"=true -D"skip.ui"=true --no-snapshot-updates
env:
MAVEN_OPTS: -Xmx4096m
KINGBASE_LICENSE: ${{ vars.KINGBASE_LICENSE }}
Member

If it is not used, please delete it

Collaborator Author

ok


6 participants