Fix `download_demo` for data.zip files #2699

fealho · 2025-10-04T22:15:30Z

CU-86b6xp7a0, Resolve #2688
CU-86b6xrcah, Resolve #2690

sdv-team · 2025-10-04T22:15:35Z

Task linked: CU-86b6xp7a0 SDV - download_demo may fail for some data.zip files #2688
Task linked: CU-86b6xrcah SDV - download_demo should ignore non-csv files in data.zip #2690

codecov · 2025-10-04T22:17:14Z

Codecov Report

❌ Patch coverage is 94.87179% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 96.88%. Comparing base (6ce265a) to head (3ecceee).
⚠️ Report is 1 commits behind head on feature-branch-download-demo.

Files with missing lines	Patch %	Lines
sdv/datasets/demo.py	94.87%	2 Missing ⚠️

Additional details and impacted files

@@                       Coverage Diff                        @@
##           feature-branch-download-demo    #2699      +/-   ##
================================================================
- Coverage                         98.16%   96.88%   -1.29%     
================================================================
  Files                                74       74              
  Lines                              7896     7923      +27     
================================================================
- Hits                               7751     7676      -75     
- Misses                              145      247     +102

Flag	Coverage Δ
integration	`?`
unit	`96.88% <94.87%> (-0.02%)`	⬇️

sdv-team · 2025-10-04T22:28:39Z

This Pull Request is not linked to an issue. To ensure our community is able to accurately track resolved issues, please link any issue that will be closed by this PR!

amontanez24

Can we make `latin-1` a constant and explain why that encoding is special?

sdv/datasets/demo.py
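
As a small illustration of why `latin-1` is a common fallback: it maps each of the 256 byte values to the code point with the same number, so decoding can never raise. The sketch below uses only the standard library. The constant name `FALLBACK_ENCODING` is taken from the snippet later in this thread, and the sample bytes are made up for the example. Note that "never fails" does not mean "always right": a file written in some other encoding would still decode, just with the wrong characters.

```python
# Constant name borrowed from the snippet later in this thread.
FALLBACK_ENCODING = 'latin-1'

# Every possible byte value, 0x00 through 0xFF.
all_bytes = bytes(range(256))

# latin-1 maps byte N to code point N, so any byte string decodes
# without error and encodes back to the identical bytes.
decoded = all_bytes.decode(FALLBACK_ENCODING)
round_trips = decoded.encode(FALLBACK_ENCODING) == all_bytes

# Made-up sample: "café" as written by a latin-1 encoder.
sample = b'caf\xe9'

# The same bytes are not valid UTF-8, so the default decode fails.
try:
    sample.decode('utf-8')
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

# The fallback recovers the intended text for this sample.
fallback_text = sample.decode(FALLBACK_ENCODING)
```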

pvk-developer · 2025-10-08T17:15:32Z

sdv/datasets/demo.py

+        try:
+            data[table_name] = pd.read_csv(io.BytesIO(file_), low_memory=False)
+        except UnicodeDecodeError:
+            data[table_name] = pd.read_csv(io.BytesIO(file_), low_memory=False, encoding='latin-1')
+        except Exception as e:
+            skipped_files.append(f'{filename}: {e}')


This and the previous bit of reading seem very similar. Could we move it to its own function, something like `read_data`?

fealho · 2025-10-08

That's actually tricky. One approach, for example, would be to replace lines 241-244 with:

    def _read_csv_with_fallback(filepath_or_buffer, **kwargs):
        """Read a CSV with a fallback encoding on UnicodeDecodeError."""
        try:
            return pd.read_csv(filepath_or_buffer, **kwargs)
        except UnicodeDecodeError:
            kwargs = {**kwargs, 'encoding': FALLBACK_ENCODING}
            return pd.read_csv(filepath_or_buffer, **kwargs)

But this doesn't work, because the BytesIO has to be rewound before the second call, and building that into this function makes it confusing.

Did you have some other approach in mind?

pvk-developer · 2025-10-09

Yeah, I had that in mind, but I see why this won't work. It's okay to go ahead with your implementation. I think that if we did anything, it would just be overcomplicating a simple process.
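
For reference, one possible way around the rewind problem (a sketch only, not what this PR merged) is to have the helper accept the raw bytes rather than an open buffer, and wrap them in a fresh `io.BytesIO` for each attempt. The function name `read_csv_bytes_with_fallback` is hypothetical, and the sketch assumes the caller already holds the file contents as `bytes`, as the `io.BytesIO(file_)` call in the diff above suggests.

```python
import io

import pandas as pd

FALLBACK_ENCODING = 'latin-1'


def read_csv_bytes_with_fallback(raw_bytes, **kwargs):
    """Read CSV bytes, retrying with FALLBACK_ENCODING on UnicodeDecodeError.

    Hypothetical helper: each attempt gets its own BytesIO, so there is
    no shared buffer position to rewind between the two reads.
    """
    try:
        return pd.read_csv(io.BytesIO(raw_bytes), **kwargs)
    except UnicodeDecodeError:
        fallback_kwargs = {**kwargs, 'encoding': FALLBACK_ENCODING}
        return pd.read_csv(io.BytesIO(raw_bytes), **fallback_kwargs)


# Usage with made-up data: one payload is valid UTF-8, the other is latin-1.
utf8_table = read_csv_bytes_with_fallback('name\ncafé\n'.encode('utf-8'))
latin1_table = read_csv_bytes_with_fallback('name\ncafé\n'.encode('latin-1'))
```

The trade-off is that the helper no longer accepts file paths or arbitrary buffers, which may or may not suit the other read site in `demo.py`.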

fealho requested review from amontanez24 and pvk-developer October 4, 2025 22:16

fealho marked this pull request as ready for review October 4, 2025 22:16

fealho requested a review from a team as a code owner October 4, 2025 22:16

auto-assign bot assigned fealho Oct 4, 2025

Base automatically changed from issue-2691-download-demo-yaml to issue-2687-download-demo-output-folder October 6, 2025 14:54

fealho force-pushed the issue-2687-download-demo-output-folder branch from c3bdfa2 to 62ee13b Compare October 6, 2025 15:14

fealho force-pushed the issue-2688-download-demo-zip branch 2 times, most recently from 32da919 to 74d3cc5 Compare October 6, 2025 16:16

Base automatically changed from issue-2687-download-demo-output-folder to feature-branch-download-demo October 6, 2025 16:52

fealho added 4 commits October 6, 2025 09:53

Update the download_demo and get_available_demos functions (#2669)

4d0645b

Store metadata when downloading

07ee4f3

Make zip files work

2d4c619

Fix minor

af00578

fealho force-pushed the issue-2688-download-demo-zip branch from 74d3cc5 to af00578 Compare October 6, 2025 16:53

amontanez24 reviewed Oct 7, 2025

fealho requested a review from amontanez24 October 8, 2025 11:03

pvk-developer reviewed Oct 8, 2025

Feedback

7cb8fec

fealho requested a review from pvk-developer October 8, 2025 18:59

Feedback

5667207

amontanez24 reviewed Oct 9, 2025

Add fallback

3ecceee

fealho requested a review from amontanez24 October 9, 2025 15:47

pvk-developer approved these changes Oct 14, 2025

amontanez24 approved these changes Oct 14, 2025

fealho merged commit 18ece9e into feature-branch-download-demo Oct 16, 2025
22 of 47 checks passed

fealho deleted the issue-2688-download-demo-zip branch October 16, 2025 16:41

