Skip to content

Conversation

@afourney
Copy link
Member

No description provided.

@afourney afourney mentioned this pull request Mar 1, 2025
@afourney afourney marked this pull request as ready for review March 1, 2025 06:53
@afourney afourney requested a review from gagb March 1, 2025 07:03
@afourney
Copy link
Member Author

afourney commented Mar 1, 2025

@Zahlii
@robfitzgerald
@casperdcl
@AdrianVollmer

LMK what you thing about this approach. Main thing is that I want the converts to print useful message when they think they could handle the file, but don't have the right dependencies.

(alternatively, you could simply not register the converters at all... but then I worry about discoverability for people not reading docs)

@Zahlii
Copy link

Zahlii commented Mar 1, 2025

I think this will work fine for me - I didn’t check all of the dependencies in detail (maybe there are some more that could be marked as optional?) but at least those that currently lead to CVE warnings are now optional.

@afourney
Copy link
Member Author

afourney commented Mar 1, 2025

I think this will work fine for me - I didn’t check all of the dependencies in detail (maybe there are some more that could be marked as optional?) but at least those that currently lead to CVE warnings are now optional.

There's certainly a balance here. Some are needed just to help identify what filetypes are in use... and those need to be around in all cases. Others help with some of the fall-back converters (plain text etc.). And more generally, html-related dependencies are used by many converters (because the libraries might output HTML rather than markdown). Plus it's generally nice to be able to convert web pages out of the box, I'd argue. I think I'm satisfied with this first cut add organization.

Copy link

@casperdcl casperdcl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Comment on lines +73 to +74
* `[audio-transcription]` Installs dependencies for audio transcription of wav and mp3 files
* `[youtube-transcription]` Installs dependencies for fetching YouTube video transcription

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personal preference...

Suggested change
* `[audio-transcription]` Installs dependencies for audio transcription of wav and mp3 files
* `[youtube-transcription]` Installs dependencies for fetching YouTube video transcription
* `[audio]` Installs dependencies for audio transcription of wav and mp3 files
* `[youtube]` Installs dependencies for fetching YouTube video transcription

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree shorter is better. However, we can still get metadata for YouTube (title, video descriptions etc.) even if the transcription library is not installed. Likewise we can get metadata for audio files (runtime, track title, artist, album, etc) even when transcription is not enabled. To this end, I felt it was worth the added characters for precision.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strongly disagree; developer precision has nothing to do with user experience.

@AdrianVollmer
Copy link

I like it

@afourney
Copy link
Member Author

afourney commented Mar 3, 2025

K, merging to main... but I'll hold of from releasing to PyPi or cutting a release until we've had a chance to shake-test this a little more, and address a few other potentially API-breaking changes in the works.

@afourney afourney merged commit c5cd659 into main Mar 3, 2025
3 checks passed
@afourney afourney deleted the optional_dependencies branch March 3, 2025 17:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants