Skip to content

Conversation

@facetosea
Copy link
Contributor

Description

Please briefly describe the code changes in this pull request.

Jira: https://jira.taosdata.com:18080/browse/TD-

Checklist

Please check the items in the checklist if applicable.

  • Is the user manual updated?
  • Are the test cases passed and automated?
  • Is there no significant decrease in test coverage?

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @facetosea, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the test coverage for several time-series window functions, including INTERVAL, STATE_WINDOW, EVENT_WINDOW, SESSION, and COUNT_WINDOW. The added tests explore a broader range of query structures, such as selecting non-aggregate columns and constants, and specific windowing conditions. A notable change in the parser involves commenting out a syntax validation for window functions lacking aggregate functions or state keys, which directly supports the expanded testing scope and potentially new feature capabilities.

Highlights

  • Parser Modification: A syntax check preventing window functions without aggregate functions or state keys has been commented out in parTranslater.c, potentially enabling new query patterns.
  • Interval Window Tests: New test cases for INTERVAL window functions have been added, specifically validating queries that select non-aggregate columns alongside window metadata.
  • State Window Enhancements: Expanded tests for STATE_WINDOW now cover scenarios involving null values at the window start and diverse column selections.
  • Event Window Coverage: Additional test queries for EVENT_WINDOW have been introduced, focusing on _wstart, _wend, and constant projections with slimit and limit.
  • Session Window Validation: Comprehensive tests for SESSION window functions have been added, examining behavior with and without aggregate functions, and in conjunction with partition by clauses.
  • Count Window Scenarios: New test cases for COUNT_WINDOW validate various column selections, aggregate functions, and partition by usage.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request enables window queries without requiring aggregate functions, a valuable feature enhancement. The core logic change in parTranslater.c is well-supported by a comprehensive suite of new test cases for various window types.

My review primarily focuses on improving code maintainability and style. I've suggested removing commented-out code in the C source file. For the Python test files, my recommendations include removing unused variables, adopting more idiomatic Python loops, ensuring consistent SQL query formatting, and refactoring duplicated test logic to improve readability and ease of maintenance.

Overall, the changes are solid and the tests are thorough. Addressing these minor points will further enhance the quality of the codebase.

Comment on lines +4647 to +4649
// if (NULL != pSelect->pWindow && !pSelect->hasAggFuncs && !pSelect->hasStateKey) {
// return generateSyntaxErrMsg(&pCxt->msgBuf, TSDB_CODE_PAR_NO_VALID_FUNC_IN_WIN);
// }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This block of code is commented out. If it's no longer needed, it should be removed to improve code clarity. Leaving dead code can be confusing for future maintenance.

mtPrefix = "m_in_mt"
tbNum = 10
rowNum = 20
totalNum = 200
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The variable totalNum is defined but never used. It should be removed to avoid confusion and improve code clarity.

Comment on lines +1558 to +1571
i = 0
while i < tbNum:
tb = tbPrefix + str(i)
tdSql.execute(f"create table {tb} using {mt} tags( {i} )")

x = 0
while x < rowNum:
cc = x * 60000
ms = 1601481600000 + cc

tdSql.execute(f"insert into {tb} values ({ms} , {x} )")
x = x + 1

i = i + 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

For better readability and to follow Python best practices, these while loops with manual incrementing can be replaced with for loops using range().

Suggested change
i = 0
while i < tbNum:
tb = tbPrefix + str(i)
tdSql.execute(f"create table {tb} using {mt} tags( {i} )")
x = 0
while x < rowNum:
cc = x * 60000
ms = 1601481600000 + cc
tdSql.execute(f"insert into {tb} values ({ms} , {x} )")
x = x + 1
i = i + 1
for i in range(tbNum):
tb = tbPrefix + str(i)
tdSql.execute(f"create table {tb} using {mt} tags( {i} )")
for x in range(rowNum):
cc = x * 60000
ms = 1601481600000 + cc
tdSql.execute(f"insert into {tb} values ({ms} , {x} )")

Comment on lines +1607 to +1619
sql = "select _wstart, 1, tgcol, count(tbcol) from m_in_mt0 partition by tbname interval(5m);"
tdSql.query(sql)
tdSql.checkRows(40)

sql = "select _wstart, _wend, tbname, 1, tgcol, count(tbcol) from m_in_mt0 partition by tbname interval(5m);"
tdSql.query(sql)
tdSql.checkRows(40)

sql = "select _wstart, tbcol, tbname, 1, tgcol, count(tbcol) from m_in_mt0 partition by tbname interval(5m);"
tdSql.error(sql)

sql = "select _wstart, ts, tbname, 1, tgcol, count(tbcol) from m_in_mt0 partition by tbname interval(5m);"
tdSql.error(sql)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

There's an inconsistent use of trailing semicolons in the SQL queries within this block. For consistency with the rest of the file and general Python practice for single-statement strings, it's better to remove them.

tdSql.checkData(3, 0, "2025-09-01 10:00:10.001")
tdSql.checkData(3, 1, 2999)
tdSql.checkData(3, 2, "2025-09-01 10:00:13.000")
tdSql.checkData(3, 3, "b")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This line has trailing whitespace, which should be removed to maintain code style consistency.

Suggested change
tdSql.checkData(3, 3, "b")
tdSql.checkData(3, 3, "b")

tdSql.checkData(0, 3, 'xx')

tdLog.info(f"======== test_event successfully executed")

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The file should end with a newline character. This is a common convention to prevent issues with file concatenation and some tools.

Suggested change

Comment on lines +153 to +185
sql = "select _wstart, _wend, 1, count(*) from dev_001 session(ts,1d)"
tdSql.query(sql)
tdSql.checkRows(4)
tdSql.checkData(0, 0, "2020-05-13 10:00:00.000")
tdSql.checkData(0, 1, "2020-05-14 13:00:00.001")
tdSql.checkData(0, 2, 1)
tdSql.checkData(0, 3, 13)
tdSql.checkData(1, 0, "2020-05-15 14:00:00.000")
tdSql.checkData(1, 1, "2020-05-15 14:00:00.000")
tdSql.checkData(1, 2, 1)
tdSql.checkData(2, 0, "2020-05-20 10:00:00.000")
tdSql.checkData(2, 1, "2020-05-20 10:00:00.000")
tdSql.checkData(2, 2, 1)
tdSql.checkData(3, 0, "2020-05-27 10:00:00.001")
tdSql.checkData(3, 1, "2020-05-27 10:00:00.001")
tdSql.checkData(3, 2, 1)


sql = "select _wstart, _wend, 1 from dev_001 session(ts,1d)"
tdSql.query(sql)
tdSql.checkRows(4)
tdSql.checkData(0, 0, "2020-05-13 10:00:00.000")
tdSql.checkData(0, 1, "2020-05-14 13:00:00.001")
tdSql.checkData(0, 2, 1)
tdSql.checkData(1, 0, "2020-05-15 14:00:00.000")
tdSql.checkData(1, 1, "2020-05-15 14:00:00.000")
tdSql.checkData(1, 2, 1)
tdSql.checkData(2, 0, "2020-05-20 10:00:00.000")
tdSql.checkData(2, 1, "2020-05-20 10:00:00.000")
tdSql.checkData(2, 2, 1)
tdSql.checkData(3, 0, "2020-05-27 10:00:00.001")
tdSql.checkData(3, 1, "2020-05-27 10:00:00.001")
tdSql.checkData(3, 2, 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The test logic for queries with and without count(*) is very similar, leading to significant code duplication. Consider refactoring this into a helper function to improve readability and maintainability. The function could take the select clause as an argument and perform the common checks.

Comment on lines +193 to +240
sql = "select _wstart, _wend, 1, tbname, count(*) from st partition by tbname session(ts,1d) order by tbname;"
tdSql.query(sql)
tdSql.checkRows(5)
tdSql.checkData(0, 0, "2020-05-13 10:00:00.000")
tdSql.checkData(0, 1, "2020-05-14 13:00:00.001")
tdSql.checkData(0, 2, 1)
tdSql.checkData(0, 3, "dev_001")
tdSql.checkData(1, 0, "2020-05-15 14:00:00.000")
tdSql.checkData(1, 1, "2020-05-15 14:00:00.000")
tdSql.checkData(1, 2, 1)
tdSql.checkData(1, 3, "dev_001")
tdSql.checkData(2, 0, "2020-05-20 10:00:00.000")
tdSql.checkData(2, 1, "2020-05-20 10:00:00.000")
tdSql.checkData(2, 2, 1)
tdSql.checkData(2, 3, "dev_001")
tdSql.checkData(3, 0, "2020-05-27 10:00:00.001")
tdSql.checkData(3, 1, "2020-05-27 10:00:00.001")
tdSql.checkData(3, 2, 1)
tdSql.checkData(3, 3, "dev_001")
tdSql.checkData(4, 0, "2020-05-13 10:00:00.000")
tdSql.checkData(4, 1, "2020-05-13 10:00:00.510")
tdSql.checkData(4, 2, 1)
tdSql.checkData(4, 3, "dev_002")


sql = "select _wstart, _wend, 1, tbname from st partition by tbname session(ts,1d) order by tbname;"
tdSql.query(sql)
tdSql.checkRows(5)
tdSql.checkData(0, 0, "2020-05-13 10:00:00.000")
tdSql.checkData(0, 1, "2020-05-14 13:00:00.001")
tdSql.checkData(0, 2, 1)
tdSql.checkData(0, 3, "dev_001")
tdSql.checkData(1, 0, "2020-05-15 14:00:00.000")
tdSql.checkData(1, 1, "2020-05-15 14:00:00.000")
tdSql.checkData(1, 2, 1)
tdSql.checkData(1, 3, "dev_001")
tdSql.checkData(2, 0, "2020-05-20 10:00:00.000")
tdSql.checkData(2, 1, "2020-05-20 10:00:00.000")
tdSql.checkData(2, 2, 1)
tdSql.checkData(2, 3, "dev_001")
tdSql.checkData(3, 0, "2020-05-27 10:00:00.001")
tdSql.checkData(3, 1, "2020-05-27 10:00:00.001")
tdSql.checkData(3, 2, 1)
tdSql.checkData(3, 3, "dev_001")
tdSql.checkData(4, 0, "2020-05-13 10:00:00.000")
tdSql.checkData(4, 1, "2020-05-13 10:00:00.510")
tdSql.checkData(4, 2, 1)
tdSql.checkData(4, 3, "dev_002")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Similar to the previous section, these two test blocks for queries with and without count(*) on a partitioned table are highly repetitive. A helper function could abstract the common setup and assertions, making the test code more concise and easier to maintain.

sql = f"select _wstart, 1, ta, tb, tc, tbname from t1 partition by tbname count_window(2) slimit 2 limit 2;"
tdSql.query(sql)
tdSql.checkRows(2)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This line contains unnecessary whitespace. It should be removed to maintain a clean code style.

Comment on lines +653 to +663
sql = f"select _wstart, 1, ta, tb, tc, tbname from t1 count_window(2) slimit 2 limit 2;"
tdSql.error(sql)
sql = f"select _wstart, 1, ta, tb, tc, count(*), tbname from t1 count_window(2) slimit 2 limit 2;"
tdSql.error(sql)

sql = f"select _wstart, 1 from t1 count_window(2);"
tdSql.query(sql)
tdSql.checkRows(3)

sql = f"select _wstart, 1, count(*) from t1 count_window(2);"
tdSql.query(sql)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Some SQL queries in this block end with a semicolon, while others do not. For consistency, it's recommended to remove the trailing semicolons from all queries.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants