Skip to content

Conversation

@ArslanSaleem
Copy link
Contributor

@ArslanSaleem ArslanSaleem commented Mar 13, 2025

Important

Add method to replace READ_PARQUET blocks with a dummy table for SQL validation in LocalDatasetLoader and update tests accordingly.

  • Behavior:
    • Add _replace_readparquet_block_with_table() in LocalDatasetLoader to replace READ_PARQUET blocks with dummy_table for SQL validation.
    • Modify execute_query() in LocalDatasetLoader to use the new method for query validation.
  • Tests:
    • Add test_read_parquet_file and test_read_parquet_file_with_mock_query_validator in test_loader.py to test READ_PARQUET block replacement and query validation.
    • Ensure mock_is_query_safe is called with the modified query in test_read_parquet_file_with_mock_query_validator.

This description was created by Ellipsis for 5ad945f. It will automatically update as commits are pushed.

@ArslanSaleem ArslanSaleem requested a review from gventuri March 13, 2025 13:41
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❌ Changes requested. Reviewed everything up to 5ad945f in 2 minutes and 20 seconds

More details
  • Looked at 74 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 6 drafted comments based on config settings.
1. pandasai/data_loader/local_loader.py:60
  • Draft comment:
    The error message could be more concise. Consider: 'Unsafe SQL query detected.' to improve clarity.
  • Reason this comment was not posted:
    Comment was on unchanged code.
2. tests/unit_tests/data_loader/test_loader.py:127
  • Draft comment:
    Similarly, use textwrap.dedent for the SQL strings in test_read_parquet_file_with_mock_query_validator to improve maintainability and avoid stray whitespace.
  • Reason this comment was not posted:
    Marked as duplicate.
3. tests/unit_tests/data_loader/test_loader.py:141
  • Draft comment:
    The assertion using assert_called_once_with on a multiline string is fragile. Consider normalizing or verifying a substring match to avoid test failures due to formatting.
  • Reason this comment was not posted:
    Marked as duplicate.
4. pandasai/data_loader/local_loader.py:43
  • Draft comment:
    Consider adding type hints (e.g., sql_query: str -> str) for _replace_readparquet_block_with_table to improve clarity.
  • Reason this comment was not posted:
    Marked as duplicate.
5. pandasai/data_loader/local_loader.py:57
  • Draft comment:
    Document the rationale for validating a sanitized query while executing the original one. This design assumes READ_PARQUET blocks (with potential '--' in URLs) are excluded only from safety checks.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
6. tests/unit_tests/data_loader/test_loader.py:140
  • Draft comment:
    The mock assertion is placed inside the pytest.raises block; if the exception is raised, the line may not execute. Consider moving the assert_called_once_with outside the with block.
  • Reason this comment was not posted:
    Marked as duplicate.

Workflow ID: wflow_SthrGZMwlR7Uuk5i


Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

path=self.dataset_path,
)

def _replace_readparquet_block_with_table(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add type hints and a docstring for _replace_readparquet_block_with_table. Consider annotating the sql_query parameter (e.g., sql_query: str) to clarify expected input.

loader = LocalDatasetLoader(sample_schema, "test/test")
with pytest.raises(RuntimeError):
loader.execute_query(
"""SELECT
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using textwrap.dedent to normalize the multiline SQL strings in test_read_parquet_file to avoid formatting issues.

@codecov
Copy link

codecov bot commented Mar 13, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 91.99%. Comparing base (3540d1f) to head (5ad945f).
Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1674      +/-   ##
==========================================
+ Coverage   91.97%   91.99%   +0.02%     
==========================================
  Files          72       72              
  Lines        2791     2799       +8     
==========================================
+ Hits         2567     2575       +8     
  Misses        224      224              
Flag Coverage Δ
unittests 91.99% <100.00%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@ArslanSaleem ArslanSaleem merged commit 3d17c1a into main Mar 13, 2025
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants