Skip to content

Conversation

@fealho
Copy link
Member

@fealho fealho commented Oct 4, 2025

CU-86b6xp7a0, Resolve #2688
CU-86b6xrcah, Resolve #2690

@sdv-team
Copy link
Contributor

sdv-team commented Oct 4, 2025

@fealho fealho marked this pull request as ready for review October 4, 2025 22:16
@fealho fealho requested a review from a team as a code owner October 4, 2025 22:16
@codecov
Copy link

codecov bot commented Oct 4, 2025

Codecov Report

❌ Patch coverage is 94.87179% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 96.88%. Comparing base (6ce265a) to head (3ecceee).
⚠️ Report is 1 commits behind head on feature-branch-download-demo.

Files with missing lines Patch % Lines
sdv/datasets/demo.py 94.87% 2 Missing ⚠️
Additional details and impacted files
@@                       Coverage Diff                        @@
##           feature-branch-download-demo    #2699      +/-   ##
================================================================
- Coverage                         98.16%   96.88%   -1.29%     
================================================================
  Files                                74       74              
  Lines                              7896     7923      +27     
================================================================
- Hits                               7751     7676      -75     
- Misses                              145      247     +102     
Flag Coverage Δ
integration ?
unit 96.88% <94.87%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@sdv-team
Copy link
Contributor

sdv-team commented Oct 4, 2025

This Pull Request is not linked to an issue. To ensure our community is able to accurately track resolved issues, please link any issue that will be closed by this PR!

Base automatically changed from issue-2691-download-demo-yaml to issue-2687-download-demo-output-folder October 6, 2025 14:54
@fealho fealho force-pushed the issue-2687-download-demo-output-folder branch from c3bdfa2 to 62ee13b Compare October 6, 2025 15:14
@fealho fealho force-pushed the issue-2688-download-demo-zip branch 2 times, most recently from 32da919 to 74d3cc5 Compare October 6, 2025 16:16
Base automatically changed from issue-2687-download-demo-output-folder to feature-branch-download-demo October 6, 2025 16:52
@fealho fealho force-pushed the issue-2688-download-demo-zip branch from 74d3cc5 to af00578 Compare October 6, 2025 16:53
@sdv-team
Copy link
Contributor

sdv-team commented Oct 6, 2025

This Pull Request is not linked to an issue. To ensure our community is able to accurately track resolved issues, please link any issue that will be closed by this PR!

Copy link
Contributor

@amontanez24 amontanez24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make latin-1 a constant and explain why that encoding is special

@fealho fealho requested a review from amontanez24 October 8, 2025 11:03
Comment on lines 241 to 246
try:
data[table_name] = pd.read_csv(io.BytesIO(file_), low_memory=False)
except UnicodeDecodeError:
data[table_name] = pd.read_csv(io.BytesIO(file_), low_memory=False, encoding='latin-1')
except Exception as e:
skipped_files.append(f'{filename}: {e}')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This and the previous bit of reading seems very similar, could we move it to its own function like read data ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's actually tricky. One approach for example would be to substitute lines 241-244 with:

def _read_csv_with_fallback(filepath_or_buffer, **kwargs):
    """Read a CSV with a fallback encoding on UnicodeDecodeError."""
    try:
        return pd.read_csv(filepath_or_buffer, **kwargs)
    except UnicodeDecodeError:
        kwargs = {**kwargs, 'encoding': FALLBACK_ENCODING}
        return pd.read_csv(filepath_or_buffer, **kwargs)

But this doesn't work because you need to rewind the BytesIO for the second call, and implementing that into this function makes it confusing.

Did you have some other approach in mind?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I had that in mind but I see why this won't work. It's okay to go ahead with your implementation. I think that if we did anything it would just be 'overcomplicating' a simple process.

@fealho fealho requested a review from pvk-developer October 8, 2025 18:59
@sdv-team
Copy link
Contributor

sdv-team commented Oct 8, 2025

This Pull Request is not linked to an issue. To ensure our community is able to accurately track resolved issues, please link any issue that will be closed by this PR!

@sdv-team
Copy link
Contributor

sdv-team commented Oct 8, 2025

This Pull Request is not linked to an issue. To ensure our community is able to accurately track resolved issues, please link any issue that will be closed by this PR!

@fealho fealho requested a review from amontanez24 October 9, 2025 15:47
@sdv-team
Copy link
Contributor

sdv-team commented Oct 9, 2025

This Pull Request is not linked to an issue. To ensure our community is able to accurately track resolved issues, please link any issue that will be closed by this PR!

@fealho fealho merged commit 18ece9e into feature-branch-download-demo Oct 16, 2025
22 of 47 checks passed
@fealho fealho deleted the issue-2688-download-demo-zip branch October 16, 2025 16:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants