feature(LocalLoader): enabling local loader to use duckdb loader #1620
…355-query * commit '61ba3530794fbe2b5739a04c26dc35e704ce69c9': fix(dataset): slug format validation on load (sinaptik-ai#1609) fix(views): transformation using raw sql (sinaptik-ai#1608) Release v3.0.0-beta.10
* commit '78105f5c9f47500fa7771d3de6eeb5efba752487': Release v3.0.0-beta.11 fix(View): fixing aliases in view (sinaptik-ai#1614)
❌ Changes requested. Reviewed everything up to 87d12c5 in 2 minutes and 26 seconds
More details
- Looked at 408 lines of code in 5 files
- Skipped 0 files when reviewing.
- Skipped posting 6 drafted comments based on config settings.
1. pandasai/data_loader/local_loader.py:27
- Draft comment:
Initializing LocalQueryBuilder with both schema and dataset_path is clear. Ensure that any future changes to the constructor are reflected in tests so that context awareness of the local path is maintained.
- Reason this comment was not posted:
Confidence changes required: 0% <= threshold 50%
2. pandasai/data_loader/local_loader.py:39
- Draft comment:
Refactored load() method now uses execute_query to build and execute the query. Confirm that any prior filtering/transformations are intentionally removed and that execute_query covers all necessary operations.
- Reason this comment was not posted:
Confidence changes required: 33% <= threshold 50%
3. tests/unit_tests/data_loader/test_loader.py:15
- Draft comment:
Tests now correctly patch execute_query instead of the previous _read_csv_or_parquet. Ensure tests continue to cover schema validation and file reading for all supported local source types.
- Reason this comment was not posted:
Confidence changes required: 0% <= threshold 50%
4. pandasai/data_loader/local_loader.py:11
- Draft comment:
Unused import: 'LOCAL_SOURCE_TYPES' is imported but no longer used. Consider removing this import for cleaner code.
- Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
5. pandasai/query_builders/local_query_builder.py:22
- Draft comment:
To match the test expectations, consider returning uppercase function calls (e.g. 'READ_CSV' and 'READ_PARQUET') instead of 'read_csv'/'read_parquet'.
- Reason this comment was not posted:
Marked as duplicate.
6. pandasai/data_loader/local_loader.py:48
- Draft comment:
Good use of SQL sanitization via is_sql_query_safe. Consider rewording the error message to 'The SQL query is considered unsafe and will not be executed.' for clarity.
- Reason this comment was not posted:
Confidence changes required: 33% <= threshold 50%
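To make draft comment 6 concrete, here is a minimal sketch of a guard that refuses to run a query and raises the reworded message. The names `looks_safe`, `execute_if_safe`, and `UnsafeQueryError` are hypothetical. The keyword check is only a crude stand-in for `is_sql_query_safe`, whose actual implementation is not shown in this thread.

```python
import re
from typing import Callable, TypeVar

T = TypeVar("T")


class UnsafeQueryError(ValueError):
    """Hypothetical error type raised when a query is rejected."""


# Assumption: a simple deny-list of statement keywords. This is NOT the
# project's is_sql_query_safe, only an illustrative approximation.
_FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|create|truncate|attach|copy)\b",
    re.IGNORECASE,
)


def looks_safe(query: str) -> bool:
    """Return True only for a single read-only SELECT/WITH statement."""
    stripped = query.strip().rstrip(";")
    if ";" in stripped:
        # More than one statement in the string.
        return False
    if not re.match(r"(?i)^(select|with)\b", stripped):
        return False
    return _FORBIDDEN.search(stripped) is None


def execute_if_safe(query: str, run: Callable[[str], T]) -> T:
    """Run the query through `run` only if it passes the check."""
    if not looks_safe(query):
        raise UnsafeQueryError(
            "The SQL query is considered unsafe and will not be executed."
        )
    return run(query)
```

A deny-list like this is deliberately conservative: it also rejects harmless queries that merely mention a forbidden word, which is usually the safer failure mode for a guard in front of query execution.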
Workflow ID: wflow_lCEcJtmwyc6i1W5M
Important
Refactor LocalDatasetLoader to use DuckDB for query execution, updating LocalQueryBuilder and tests accordingly.
- LocalDatasetLoader in local_loader.py now uses DuckDB for executing queries instead of loading files directly.
- LocalQueryBuilder in local_query_builder.py constructs queries for DuckDB using read_csv and read_parquet.
- test_loader.py mocks execute_query instead of file reading methods.
- test_query_builder.py for LocalQueryBuilder verifies query construction for CSV and Parquet.
- test_group_by.py reflects changes in query building for local sources.
This description was created by Ellipsis for 87d12c5. It will automatically update as commits are pushed.
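As a rough illustration of the query construction summarized above, the sketch below builds a DuckDB query string that reads a local CSV or Parquet file through DuckDB's `read_csv` / `read_parquet` table functions. The helper name `build_local_query` and its parameters are hypothetical and do not reflect the actual `LocalQueryBuilder` API. The sketch only builds the string, on the assumption that the result is later handed to DuckDB for execution.

```python
from pathlib import PurePosixPath
from typing import Optional, Sequence


def build_local_query(
    dataset_path: str,
    file_name: str,
    columns: Optional[Sequence[str]] = None,
) -> str:
    """Hypothetical helper: build a SELECT over a local CSV/Parquet file."""
    file_path = str(PurePosixPath(dataset_path) / file_name)
    suffix = PurePosixPath(file_name).suffix.lower()

    # Pick the DuckDB table function from the file extension.
    if suffix == ".csv":
        reader = "read_csv"
    elif suffix == ".parquet":
        reader = "read_parquet"
    else:
        raise ValueError(f"Unsupported file format: {suffix!r}")

    # Escape single quotes so the path is a valid SQL string literal.
    escaped_path = file_path.replace("'", "''")

    # Quote column names as identifiers; default to all columns.
    if columns:
        select_list = ", ".join(
            '"' + col.replace('"', '""') + '"' for col in columns
        )
    else:
        select_list = "*"

    return f"SELECT {select_list} FROM {reader}('{escaped_path}')"
```

DuckDB function names are case-insensitive, so whether the builder emits `read_csv` or `READ_CSV` (the point raised in draft comment 5) matters for string-matching tests rather than for execution.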