[Fix][Connector-V2][MySQL-CDC] Add support for MYSQL_SET_UNSIGNED type in MySqlTypeConverter#10453

Open
wgzhao wants to merge 7 commits into apache:dev from wgzhao:fix_10451

Conversation

wgzhao (Contributor) commented Feb 5, 2026

Purpose of this pull request

This pull request tries to fix #10451.

Does this PR introduce any user-facing change?

No

How was this patch tested?

I tested it in my production environment.

Check list

DanielCarter-stack commented Feb 5, 2026

Issue 1: The added MYSQL_SET_UNSIGNED type does not exist in MySQL

Location: MySqlTypeConverter.java:79

static final String MYSQL_SET_UNSIGNED = "SET UNSIGNED";

Issue Description:
MySQL does not define a SET UNSIGNED data type. SET is a string type whose values are drawn from a fixed list of permitted members, while the UNSIGNED attribute applies only to numeric types. Adding this constant could mislead readers into thinking the type exists in MySQL.

Related Context:

  • MySQL documentation: SET type syntax is SET('value1','value2',...), which does not support UNSIGNED
  • All *_UNSIGNED constants in the same file (lines 31-62) correspond to numeric types
  • OceanBaseMySqlTypeConverter doesn't even define MYSQL_SET (line 75), implying compatibility issues

Potential Risks:

  • Misleading future code maintainers
  • If someone tries to create a table using "SET UNSIGNED" type, it will fail at the database level
  • Unclear code semantics

Impact Scope:

  • Direct impact: MySqlTypeConverter class
  • Indirect impact: No actual runtime impact (because Debezium will not return this type)
  • Affected area: MySQL JDBC Connector + MySQL CDC Connector

Severity: MAJOR

Improvement Suggestions:

// Option 1: If truly necessary, add detailed comments explaining the source
/**
 * Handle edge case where Debezium may report type as "SET UNSIGNED" 
 * even though MySQL doesn't officially support this type.
 * See issue #10451 for details.
 */
static final String MYSQL_SET_UNSIGNED = "SET UNSIGNED";

// Option 2: Normalize at the type conversion entry point (recommended)
// In MySqlTypeUtils.convertToSeaTunnelColumn():
String dataType = column.typeName().toUpperCase();
if ("SET UNSIGNED".equals(dataType)) {
    log.warn("Normalizing unexpected type name 'SET UNSIGNED' to 'SET' for column {}", 
             column.name());
    dataType = "SET";
}
builder.dataType(dataType);

Rationale:

  • If this edge case truly needs to be handled, there should be clear documentation and comments
  • A better approach is to normalize at the data entry point, rather than adding a non-existent type in the core type conversion logic
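
The normalization idea in Option 2 can be sketched as a small standalone helper. This is only an illustration under assumptions: `TypeNameNormalizer` and `normalize` are hypothetical names that do not exist in SeaTunnel, and where such a call would actually be wired in depends on the real entry point. Unlike the snippet above, the sketch upper-cases with `Locale.ROOT` so the result does not depend on the JVM's default locale.

```java
import java.util.Locale;

/** Hypothetical helper (not part of SeaTunnel): normalizes a reported type name. */
final class TypeNameNormalizer {

    private TypeNameNormalizer() {}

    /**
     * Trims and upper-cases the reported type name, then maps the
     * non-existent "SET UNSIGNED" to plain "SET". All other names,
     * including real unsigned numeric types, are returned unchanged
     * apart from trimming and casing.
     */
    static String normalize(String rawTypeName) {
        if (rawTypeName == null) {
            return null;
        }
        String upper = rawTypeName.trim().toUpperCase(Locale.ROOT);
        if ("SET UNSIGNED".equals(upper)) {
            return "SET";
        }
        return upper;
    }
}
```

With a helper like this, the converter's switch would only ever see "SET", so no extra constant would be needed there. Whether the warning log belongs inside the helper or at the call site is left open.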

Issue 2: Missing corresponding unit tests

Location: MySqlTypeConverterTest.java

Issue Description:
The PR adds support for the MYSQL_SET_UNSIGNED type but does not add corresponding test cases. In contrast, when PR #9553 previously added MYSQL_SET support, it also added a testConvertSet() test.

Related Context:

  • Lines 1092-1105: There is a testConvertSet() test using dataType="SET"
  • Missing testConvertSetUnsigned() test using dataType="SET UNSIGNED"

Potential Risks:

  • Code changes are unverified by tests
  • Future refactoring might accidentally break this functionality
  • Cannot confirm this case actually works

Impact Scope:

  • Direct impact: Test coverage
  • Indirect impact: Code quality assurance
  • Affected area: MySQL JDBC Connector test suite

Severity: MAJOR

Improvement Suggestions:

@Test
public void testConvertSetUnsigned() {
    BasicTypeDefine<Object> typeDefine =
            BasicTypeDefine.builder()
                    .name("test")
                    .columnType("SET UNSIGNED")
                    .dataType("SET UNSIGNED")
                    .length(100L)
                    .build();
    Column column = MySqlTypeConverter.DEFAULT_INSTANCE.convert(typeDefine);
    Assertions.assertEquals(typeDefine.getName(), column.getName());
    Assertions.assertEquals(BasicType.STRING_TYPE, column.getDataType());
    Assertions.assertEquals(100, column.getColumnLength());
    Assertions.assertEquals(typeDefine.getColumnType(), column.getSourceType());
    
    // Test with default length
    typeDefine = BasicTypeDefine.builder()
            .name("test")
            .columnType("SET UNSIGNED")
            .dataType("SET UNSIGNED")
            .length(0L)
            .build();
    column = MySqlTypeConverter.DEFAULT_INSTANCE.convert(typeDefine);
    Assertions.assertEquals(100, column.getColumnLength());
}

Rationale:

  • Follow the project's testing standards
  • Ensure the new code path is verified
  • Provide usage examples for future maintainers

Issue 3: PR description does not match actual changes

Location: PR description section

Issue Description:
The PR description explicitly states "Does this PR introduce any user-facing change? No", but actually adds support for a new type. If MYSQL_SET_UNSIGNED is indeed a scenario that needs support, then this is a user-visible change.

Related Context:

Potential Risks:

  • Users are unaware that this version may fix certain issues
  • Release Notes cannot accurately reflect changes
  • Users may not be aware of this bug fix when upgrading

Impact Scope:

  • Direct impact: Release notes
  • Indirect impact: User perception
  • Affected area: Users using CDC Connector

Severity: MINOR

Improvement Suggestions:

Does this PR introduce _any_ user-facing change?

Yes. Users using MySQL CDC with tables containing SET columns may have 
previously encountered type conversion errors if the underlying Debezium 
library reported the type as "SET UNSIGNED". This fix ensures those columns 
are now properly recognized and converted to STRING type.

Rationale:

  • Accurately communicate the impact of changes
  • Help users understand the benefits of upgrading
  • Follow Apache project transparency requirements

Issue 4: Missing warning logs for exceptional types

Location: MySqlTypeConverter.java:248-250

case MYSQL_ENUM:
case MYSQL_SET:
case MYSQL_SET_UNSIGNED:
    builder.dataType(BasicType.STRING_TYPE);
    // ...

Issue Description:
Compared to the handling of other UNSIGNED types in the code, both MYSQL_FLOAT_UNSIGNED and MYSQL_DOUBLE_UNSIGNED have log.warn warnings about possible value overflow. If MYSQL_SET_UNSIGNED is an exceptional case (should not appear but did appear), there should be logging.

Related Context:

  • Line 180: log.warn("{} will probably cause value overflow.", MYSQL_FLOAT_UNSIGNED);
  • Line 186: log.warn("{} will probably cause value overflow.", MYSQL_DOUBLE_UNSIGNED);
  • Line 193: log.warn("{} will probably cause value overflow.", MYSQL_DECIMAL);

Potential Risks:

  • Cannot track this exceptional situation during operations
  • Cannot count how often this type occurs
  • Difficult to diagnose the root cause of related issues

Impact Scope:

  • Direct impact: Observability
  • Indirect impact: Problem diagnosis
  • Affected area: MySQL CDC Connector users

Severity: MINOR

Improvement Suggestions:

case MYSQL_ENUM:
case MYSQL_SET:
    builder.dataType(BasicType.STRING_TYPE);
    if (typeDefine.getLength() == null || typeDefine.getLength() <= 0) {
        builder.columnLength(100L);
    } else {
        builder.columnLength(typeDefine.getLength());
    }
    break;
case MYSQL_SET_UNSIGNED:
    log.warn("Unexpected type '{}' encountered for column '{}'. MySQL does not officially "
             + "support SET UNSIGNED. This may indicate a Debezium reporting issue. "
             + "Treating as SET type.", MYSQL_SET_UNSIGNED, typeDefine.getName());
    builder.dataType(BasicType.STRING_TYPE);
    if (typeDefine.getLength() == null || typeDefine.getLength() <= 0) {
        builder.columnLength(100L);
    } else {
        builder.columnLength(typeDefine.getLength());
    }
    break;

Rationale:

  • Provide diagnostic clues for issues
  • Help identify data source configuration problems
  • Consistent with the pattern of handling other exceptional types in the project
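
The suggestion above repeats the length fallback in both branches. One way to keep the two branches from drifting apart is to move that fallback into a helper, sketched below. The class and method names (`SetLengthDefaults`, `resolveLength`) are hypothetical and not taken from the SeaTunnel code base; the default of 100 mirrors the value used in the snippet above.

```java
/** Hypothetical helper (not part of SeaTunnel): length fallback for ENUM/SET columns. */
final class SetLengthDefaults {

    /** Default column length, matching the 100L used in the suggested switch branches. */
    static final long DEFAULT_LENGTH = 100L;

    private SetLengthDefaults() {}

    /** Returns the declared length, or DEFAULT_LENGTH when it is null, zero, or negative. */
    static long resolveLength(Long declaredLength) {
        if (declaredLength == null || declaredLength <= 0) {
            return DEFAULT_LENGTH;
        }
        return declaredLength;
    }
}
```

Each branch could then call `builder.columnLength(SetLengthDefaults.resolveLength(typeDefine.getLength()))`, leaving the warning log as the only difference between the SET and SET UNSIGNED cases.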

Issue 5: OceanBaseMySqlTypeConverter may need synchronized changes

Location: OceanBaseMySqlTypeConverter.java

Issue Description:
OceanBaseMySqlTypeConverter follows a similar design but does not define the MYSQL_SET constant (line 75 only has MYSQL_ENUM). If OceanBase also supports the SET type, similar issues may exist.

Related Context:

  • OceanBase is based on MySQL but may have differences
  • OceanBaseMySqlTypeConverter line 222 only handles MYSQL_ENUM, not MYSQL_SET

Potential Risks:

  • SET type columns for OceanBase users may not be handled correctly
  • If OceanBase's Debezium connector also reports "SET UNSIGNED", the same issue will occur

Impact Scope:

  • Direct impact: OceanBase Connector
  • Indirect impact: OceanBase CDC users
  • Affected area: OceanBase MySQL mode users

Severity: MINOR (needs confirmation if OceanBase supports SET)

Improvement Suggestions:

  1. Confirm if OceanBase supports SET type
  2. If supported, add corresponding constants and handling logic
  3. If not supported, document this in the documentation

// If OceanBase supports SET
static final String MYSQL_SET = "SET";

// Add in switch
case MYSQL_ENUM:
case MYSQL_SET:
    builder.dataType(BasicType.STRING_TYPE);
    // ...

Rationale:

  • Maintain consistency between MySQL and OceanBase connectors
  • Avoid issues for users migrating between different databases

Development

Successfully merging this pull request may close these issues.

[Bug] [MySQL-CDC] Fails to init when encountering mysql.event/mysql.proc with SET UNSIGNED field type

2 participants