Skip to content

Conversation

@2010YOUY01
Copy link
Contributor

Which issue does this PR close?

Try to implement the idea in #12415 for a single substr function

Rationale for this change

See #12415

What changes are included in this PR?

Organize sqllogictests for a single string function into different files (see substr_runner.slt for the idea)
Then we can set up logically equivalent table with different physical string column type (String/StringView), and repeat tests defined in another file

One decision to make is whether to keep similar tests in one file, or one folder per function under test
Generally different function require different kinds of data in table to test, and also they might have different number of string columns, so I think it's cleaner to separate functions to different files (but similar function like ltrim/rtim can be kept in single file)

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Sep 11, 2024
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks great @2010YOUY01 -- thank you 🙏

Something else that would be useful after we merge this PR is to document the pattern in https://github.com/apache/datafusion/tree/main/datafusion/sqllogictest#readme / give some hints to future readers how the test are structured

I also manually tested this locally by introducing an error on purpose and the test fails as expected:

Running "string_functions/substr/substr_runner.slt"
External error: query result mismatch:
[SQL] select
    substr(c1, 1),
    substr(c1, 3),
    substr(c1, 100),
    substr(c1, -1),
    substr(c1, 0, 0),
    substr(c1, -1, 2),
    substr(c1, -2, 10),
    substr(c1, -100, 200),
    substr(c1, -10, 10),
    substr(c1, -100, 10),
    substr(c1, 1, 100),
    substr(c1, 5, 3),
    substr(c1, 100, 200),
    substr(c1, 8, 0)
from test_substr;
[Diff] (-expected|+actual)
    foo o (empty) foo (empty) (empty) foo foo (empty) (empty) foo (empty) (empty) (empty)
    hello🌏世界 llo🌏世界 (empty) hello🌏世界 (empty) (empty) hello🌏世 hello🌏世界 (empty) (empty) hello🌏世界 o🌏世 (empty) (empty)
-   💩 (empty) (empty) 💩 (empty) (empty) 💩 💩 (dd) (empty) 💩 (empty) (empty) (empty)
+   💩 (empty) (empty) 💩 (empty) (empty) 💩 💩 (empty) (empty) 💩 (empty) (empty) (empty)
    ThisIsAVeryLongASCIIString isIsAVeryLongASCIIString (empty) ThisIsAVeryLongASCIIString (empty) (empty) ThisIsA ThisIsAVeryLongASCIIString (empty) (empty) ThisIsAVeryLongASCIIString IsA (empty) (empty)
    (empty) (empty) (empty) (empty) (empty) (empty) (empty) (empty) (empty) (empty) (empty) (empty) (empty) (empty)
    NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
at test_files/string_functions/substr/./substr_table.slt.part:22
at test_files/string_functions/substr/substr_runner.slt:39

Error: Execution("1 failures")
error: test failed, to rerun pass `-p datafusion-sqllogictest --test sqllogictests`

Caused by:
  process didn't exit successfully: `/Users/andrewlamb/Software/datafusion/target/debug/deps/sqllogictests-2b1885c4946979a7 string_functions` (exit status: 1)

# --------------------------------------
# Test `substr()` with literal arguments
# --------------------------------------
include ./substr_literal.slt.part
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a neat idea to break the literals as well into their own file

@alamb alamb changed the title Automate sqllogictest for StringView (for one function) Automate sqllogictest for StringView (for one function, substr) Sep 11, 2024
@alamb
Copy link
Contributor

alamb commented Sep 11, 2024

One decision to make is whether to keep similar tests in one file, or one folder per function under test
Generally different function require different kinds of data in table to test, and also they might have different number of string columns, so I think it's cleaner to separate functions to different files (but similar function like ltrim/rtim can be kept in single file)

I agree

@alamb alamb merged commit f71b0e0 into apache:main Sep 12, 2024
@alamb
Copy link
Contributor

alamb commented Sep 12, 2024

Thanks again @2010YOUY01

Would you like me to file tickets for other functions?

Or would you rather make a few more examples before I file the tickets?

@2010YOUY01
Copy link
Contributor Author

Thanks again @2010YOUY01

Would you like me to file tickets for other functions?

Yes, thank you so much

@2010YOUY01 2010YOUY01 deleted the automate-sqllogictest-stringview branch September 13, 2024 03:15
@alamb
Copy link
Contributor

alamb commented Sep 16, 2024

Thanks again @2010YOUY01
Would you like me to file tickets for other functions?

Yes, thank you so much

This is on my list to do, likely tomorrow

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants