-
Notifications
You must be signed in to change notification settings - Fork 2.3k
fix(view_loader): avoid lower casing the column name #1703
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #1703 +/- ##
=======================================
Coverage 91.72% 91.72%
=======================================
Files 72 72
Lines 2803 2803
=======================================
Hits 2571 2571
Misses 232 232
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ Changes requested. Reviewed everything up to d8d90d1 in 2 minutes and 54 seconds
More details
- Looked at
421lines of code in4files - Skipped
0files when reviewing. - Skipped posting
15drafted comments based on config settings.
1. pandasai/helpers/sql_sanitizer.py:9
- Draft comment:
Using parse_one with transform(quote_identifiers) ensures columns keep proper quoting. Ensure that invalid SQL inputs are correctly handled when parse_one might throw errors. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50%
The first part of the comment is just describing what the code does, which isn't useful. The second part asks to "ensure" error handling, which violates our rule about not asking authors to verify things. If error handling was critical, the comment should directly state what error handling to add. The function is called "sanitize" so some error handling might be appropriate, but the comment isn't making a clear case for it.
Perhaps I'm being too harsh - error handling for parse_one could be important for a sanitization function. The comment might be trying to point out a legitimate issue.
While error handling could be valuable, the comment doesn't make a clear case for why it's needed here or what specifically should be done. It's just asking the author to "ensure" something, which our rules explicitly say to avoid.
Delete the comment. It's partly descriptive (not useful) and partly asking the author to verify something (explicitly against our rules). If error handling is needed, a new comment should clearly state what to add.
2. pandasai/query_builders/view_query_builder.py:26
- Draft comment:
normalize_view_column_name now delegates to sanitize_view_column_name. Confirm this change is intended and that it handles all edge cases compared to prior normalization. - Reason this comment was not posted:
Comment did not seem useful. Confidence is useful =0%<= threshold50%
The comment is asking the PR author to confirm their intention and to ensure that the change handles all edge cases. This violates the rule against asking the author to confirm their intention or to ensure behavior is intended. Therefore, this comment should not be approved.
3. pandasai/query_builders/view_query_builder.py:30
- Draft comment:
normalize_view_column_alias now replaces dots with underscores then sanitizes. Verify this meets alias formatting requirements consistently. - Reason this comment was not posted:
Comment did not seem useful. Confidence is useful =0%<= threshold50%
The comment is asking the author to verify if the change meets alias formatting requirements, which is against the rules. It doesn't provide a specific suggestion or point out a clear issue with the code. It is not actionable or specific enough to be useful.
4. tests/unit_tests/query_builders/test_view_query_builder.py:118
- Draft comment:
Remove debugging print statements from test code. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 0% vs. threshold = 50%
While removing debug print statements is generally good practice, these print statements are in test code rather than production code. Test code often uses print statements for debugging test failures. The print statements don't affect production behavior. Our rules don't specifically mention test code print statements.
The comment assumes all print statements should be removed, but test code may legitimately use print statements for debugging. We may be being too strict about print statements in test files.
The print statements don't affect production code and may be helpful for test debugging. Our rules focus on production code quality, not test code.
We should not keep this comment since it refers to print statements in test code which may be legitimately useful for debugging and don't affect production behavior.
5. tests/unit_tests/query_builders/test_view_query_builder.py:181
- Draft comment:
Remove debugging print statement from test_column_name_injection. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 0% vs. threshold = 50%
While print statements in production code should be removed, this is a test file. Print statements in tests can be useful for debugging test failures. The print statement shows the actual query when the test fails, which helps developers understand what went wrong. This is a common and acceptable practice in test files.
Maybe the print statement adds noise to test output when all tests pass? Maybe it should be wrapped in a debug flag or only print on failure?
The benefit of having the query visible when the test fails outweighs the minor noise in successful test runs. This is standard practice in test files.
Do not remove the print statement as it serves a useful debugging purpose in this test file.
6. tests/unit_tests/query_builders/test_view_query_builder.py:337
- Draft comment:
Remove unnecessary debugging print statement in test_multiple_joins_same_table. - Reason this comment was not posted:
Marked as duplicate.
7. tests/unit_tests/query_builders/test_view_query_builder.py:393
- Draft comment:
Remove debugging print statement from test_multiple_joins_same_table_with_aliases. - Reason this comment was not posted:
Marked as duplicate.
8. pandasai/helpers/sql_sanitizer.py:12
- Draft comment:
Consider using a list comprehension for readability instead of list(map(...)). - Reason this comment was not posted:
Comment was on unchanged code.
9. pandasai/query_builders/view_query_builder.py:30
- Draft comment:
Review the alias normalization logic: replacing '.' with '_' and then sanitizing may not cover all edge cases. Consider unifying this logic with normalize_view_column_name if appropriate. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50%
The comment is speculative, using words like "may not cover all edge cases". It doesn't point to any specific issues or edge cases. The code has already been simplified and both methods use the same sanitization function. The comment suggests "consider unifying" but the code is already quite unified - they both use sanitize_view_column_name as their core sanitization.
The comment could be right that there are edge cases we haven't considered. The simple string replace of "." with "_" might be insufficient for some column names.
Without specific edge cases being identified, this is just speculation. The code is already using a dedicated sanitization function and has been simplified from a more complex version.
The comment should be deleted as it's speculative and doesn't point to specific issues. It suggests unifying logic that is already largely unified.
10. pandasai/query_builders/view_query_builder.py:112
- Draft comment:
Alias generation logic is duplicated between _get_aliases and _get_columns. Consider consolidating this to adhere to the DRY principle. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
11. tests/unit_tests/query_builders/test_view_query_builder.py:118
- Draft comment:
Remove debugging print statements from tests. - Reason this comment was not posted:
Marked as duplicate.
12. tests/unit_tests/query_builders/test_view_query_builder.py:181
- Draft comment:
Remove debugging print statements from tests. - Reason this comment was not posted:
Marked as duplicate.
13. tests/unit_tests/query_builders/test_view_query_builder.py:250
- Draft comment:
Remove debugging print statements from tests. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 0% vs. threshold = 50%
While print statements in production code should definitely be removed, print statements in test files are less problematic. They can actually be helpful during test development and debugging. The print statements here appear to be intentionally left to help debug test failures. They don't affect production code or test outcomes.
The comment assumes all print statements are bad and should be removed, but test files have different standards than production code. Print statements in tests can serve a legitimate debugging purpose.
While the comment identifies a real issue (print statements), the context of being in a test file makes these print statements potentially useful rather than problematic.
The comment should be deleted since print statements in test files can be useful for debugging and don't impact production code.
14. tests/unit_tests/query_builders/test_view_query_builder.py:337
- Draft comment:
Remove debugging print statements from tests. - Reason this comment was not posted:
Marked as duplicate.
15. tests/unit_tests/query_builders/test_view_query_builder.py:393
- Draft comment:
Remove debugging print statements from tests. - Reason this comment was not posted:
Marked as duplicate.
Workflow ID: wflow_9zzX56XhiI1f5j6i
Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.
| 0 | ||
| ].name = "column UNION SELECT username, password FROM users;" | ||
| query = view_query_builder.build_query() | ||
| print(query) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove debugging print statement from test_column_name_union_injection.
| print(query) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on 38bdf96 in 1 minute and 17 seconds
More details
- Looked at
20lines of code in1files - Skipped
0files when reviewing. - Skipped posting
5drafted comments based on config settings.
1. tests/unit_tests/query_builders/test_view_query_builder.py:178
- Draft comment:
Removed debugging print statement; it's best to avoid prints in production or test code. - Reason this comment was not posted:
Comment looked like it was already resolved.
2. tests/unit_tests/query_builders/test_view_query_builder.py:247
- Draft comment:
Removed debugging print statement; it's best to avoid prints in production or test code. - Reason this comment was not posted:
Comment looked like it was already resolved.
3. tests/unit_tests/query_builders/test_view_query_builder.py:179
- Draft comment:
Good removal of the debugging print statement. Avoid print statements in tests to keep output clean. - Reason this comment was not posted:
Comment looked like it was already resolved.
4. tests/unit_tests/query_builders/test_view_query_builder.py:247
- Draft comment:
Nice removal of the extraneous print statement. Debug prints should not be in committed test code. - Reason this comment was not posted:
Comment looked like it was already resolved.
5. tests/unit_tests/query_builders/test_view_query_builder.py:112
- Draft comment:
The function name 'test_get__group_by_columns' contains a double underscore after 'get'. This appears to be a typographical inconsistency as the method under test is _get_group_by_columns. Consider renaming the test to 'test_get_group_by_columns' for clarity. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
Workflow ID: wflow_tlEkp2heL9PRdD3H
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.
Important
Avoid lowercasing column names in SQL queries by using
quote_identifiersinsanitize_view_column_name.sanitize_view_column_nameinsql_sanitizer.pynow usesquote_identifiersto avoid lowercasing column names.normalize_view_column_nameandnormalize_view_column_aliasinViewQueryBuilderupdated to usesanitize_view_column_namedirectly.test_sql_sanitizer.pyandtest_view_query_builder.pyto reflect quoted column names.test_view_query_builder.py.This description was created by
for 38bdf96. It will automatically update as commits are pushed.