-
Notifications
You must be signed in to change notification settings - Fork 2.3k
fix(duckdbConnection): make duckdb connection manager non singleton #1710
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #1710 +/- ##
==========================================
- Coverage 91.71% 91.71% -0.01%
==========================================
Files 72 72
Lines 2800 2799 -1
==========================================
- Hits 2568 2567 -1
Misses 232 232
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ Changes requested. Reviewed everything up to 80c4e5a in 2 minutes and 20 seconds
More details
- Looked at
64lines of code in1files - Skipped
0files when reviewing. - Skipped posting
5drafted comments based on config settings.
1. pandasai/data_loader/duck_db_connection_manager.py:18
- Draft comment:
Consider adding an explicit type annotation for the 'df' parameter in register() for clarity (e.g., pandas.DataFrame). - Reason this comment was not posted:
Comment was on unchanged code.
2. pandasai/data_loader/duck_db_connection_manager.py:14
- Draft comment:
Using del for closing resources may not be reliably invoked. Consider providing an explicit close() method usage and clear documentation on lifecycle management. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50%
While the comment raises a valid general concern about del, the code already implements the suggested solution - it has an explicit close() method. The del is just a backup mechanism. The comment seems to be asking for something that's already there. The documentation could be improved, but that's not what the comment is specifically requesting.
The comment might be hinting at a deeper issue about resource management patterns in Python that could be worth addressing. Maybe there should be context manager support?
While context manager support could be nice, that's not what this comment is suggesting. The core concern about having an explicit close() method is already addressed in the code.
The comment should be deleted because its main suggestion (having an explicit close method) is already implemented in the code.
3. pandasai/data_loader/duck_db_connection_manager.py:25
- Draft comment:
Ensure that your changes, particularly the new unregister() method and non-singleton pattern, are covered by unit tests to avoid potential regressions. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50%
The rules explicitly state not to ask authors to "ensure" things or to "double-check". The comment starts with "Ensure that..." which is a red flag. While having tests is important, the comment is phrased as a verification request rather than pointing out a specific missing test. We don't have visibility into whether tests already exist.
Tests are crucial for code quality, so removing this comment might lead to untested code. The changes are significant enough that tests would be valuable.
While tests are important, the rules are clear about not making verification requests. If there's a specific missing test, it should be pointed out directly rather than asking for verification.
The comment should be deleted as it violates the rule against asking authors to verify or ensure things, and we don't have clear evidence that tests are actually missing.
4. pandasai/data_loader/duck_db_connection_manager.py:36
- Draft comment:
The connection existence check can be simplified. Since self.connection is always initialized, consider replacing 'if hasattr(self, "connection") and self.connection:' with 'if self.connection is not None:'. - Reason this comment was not posted:
Confidence changes required:33%<= threshold50%
None
5. pandasai/data_loader/duck_db_connection_manager.py:8
- Draft comment:
Update documentation and tests to reflect the new instance-based (non-singleton) connection management, ensuring that multiple connections do not lead to conflicts. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 0% vs. threshold = 50%
This is a significant architectural change that affects how connections are managed. However, the comment is asking for documentation updates and test changes, which violates our rules about not commenting on cross-file issues (documentation and tests would be in different files). We also don't have access to the documentation or test files to verify if updates are actually needed.
The architectural change is indeed significant and could have important implications for users of this class. Documentation and tests are crucial for such changes.
While the concern is valid, we must follow the rules strictly - we can't verify documentation or test files, and we shouldn't comment on cross-file issues.
The comment should be deleted as it refers to documentation and tests which are in other files, violating our rule about ignoring cross-file issues.
Workflow ID: wflow_rJsBf9YScE7zYkrY
Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ Changes requested. Incremental review on 80007c0 in 2 minutes and 6 seconds
More details
- Looked at
30lines of code in2files - Skipped
0files when reviewing. - Skipped posting
3drafted comments based on config settings.
1. tests/unit_tests/data_loader/test_duckdbmanager.py:17
- Draft comment:
When asserting table registration, consider adding an error message for clarity. - Reason this comment was not posted:
Confidence changes required:50%<= threshold50%
None
2. pandasai/data_loader/duck_db_connection_manager.py:18
- Draft comment:
Removed auto-unregistration for duplicate table names. Ensure that duckdb.register handles duplicate registrations gracefully, and consider adding an inline comment explaining this design decision. - Reason this comment was not posted:
Confidence changes required:33%<= threshold50%
None
3. tests/unit_tests/data_loader/test_duckdbmanager.py:14
- Draft comment:
Test accesses the private _registered_tables attribute directly. Consider providing a public accessor for better encapsulation. - Reason this comment was not posted:
Confidence changes required:33%<= threshold50%
None
Workflow ID: wflow_AyNtXeGQQM1KvlKf
Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.
memory issues in production where parallel request is happening same dataset name can override the already registered from other user
Important
DuckDBConnectionManageris refactored to be non-singleton, addingunregister()and ensuring proper connection closure.DuckDBConnectionManageris no longer a singleton; each instance manages its own connection.__new__method and singleton pattern; adds__del__to close connections.unregister()method to remove tables from_registered_tables.test_unregisterintest_duckdbmanager.pyto verifyunregister()functionality.close()does not throw errors when called multiple times.This description was created by
for 80007c0. It will automatically update as commits are pushed.