Fix npo support #31976
Thanks for your work!
I've made a few suggestions, really just conventions, so I'll let the CI test run now.
Co-authored-by: dirkf <[email protected]>
I have accepted all suggestions here. Sorry I missed some conventions.
* simplify comment * force CI
@dirkf Is it ready to merge now?
I have had great success using this fork to download NPO Start videos. Greatly appreciated @bartbroere. It would be nice to see these changes merged.
I'd like to merge this but it needs valid tests.
All the current test URLs redirect to the home page, but maybe that doesn't happen in NL. Please ensure that there is at least one valid test (with playable content, non-DRM) and mark any invalid URLs with 'skip': 'Content expired', (or ...: 'only available in NL', or whatever is appropriate) after the info_dict. If an invalid test URL doesn't match the extractor's _VALID_URL, it should be deleted (but I don't think that applies here).
If URLs are only valid in NL, or with an account, please post a console log of successful tests.
I've made some additional suggestions that I might improve with the aid of a usable test URL.
Thanks for the suggested improvements. I'll address all the feedback, and re-request review once I feel it's all done.
@dirkf Looking into it I realised this is probably caused by the recently released new NPO app and website. I'll check all tests and maybe add some new ones
yt-dlp/yt-dlp#9319 addresses the same issues. I think (speaking for the Dutch users a bit) that support for npo.nl is the main feature, and all the other sites often just embed the NPO player in some way. Since almost all of the other sites no longer work in the same way as before, I would propose throwing away extractors for many other domains, since rebuilding them is probably a lot quicker.
Re-implementing these is quicker for the cases where that's even still possible
It's a bit vexing that the yt-dlp PR has done the same job, but by all means plunder any useful stuff from there. I would generally merge/back-port an updated yt-dlp extractor but generally also only when ours doesn't work.
I'm not convinced that the pull request on On the other hand, my branch probably doesn't work for some of the So I think we can do a nice bit of "cross-plundering" between my
Great. Also, if there are standard patterns for the discarded sites where NPO videos are embedded, we can add a class variable

    @classmethod
    def _extract_urls(cls, webpage):
        # collect the 'url' group of every match of the patterns in
        # cls._EMBED_REGEX (needs `import re` at module level)
        def yield_urls():
            for p in cls._EMBED_REGEX:
                for m in re.finditer(p, webpage):
                    yield m.group('url')
        return list(yield_urls())

Eventually the webpage extraction system from yt-dlp should be pulled in, with the method standardised in
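To try the embed helper from the comment above outside an extractor, here is a minimal self-contained sketch. The class name, the player host and the iframe pattern are assumptions for illustration only. The real embed markup of the discarded sites is not shown in this thread.

```python
import re


class EmbedSketch(object):
    # Hypothetical pattern list; each pattern must define a named group 'url'.
    _EMBED_REGEX = [
        r'<iframe[^>]+src=(["\'])(?P<url>https?://player\.example\.invalid/embed/[^"\']+)\1',
    ]

    @classmethod
    def _extract_urls(cls, webpage):
        def yield_urls():
            for p in cls._EMBED_REGEX:
                for m in re.finditer(p, webpage):
                    yield m.group('url')
        return list(yield_urls())


# Placeholder page with two embeds and one unrelated element.
page = '''
<iframe src="https://player.example.invalid/embed/abc"></iframe>
<p>no player here</p>
<iframe src='https://player.example.invalid/embed/def'></iframe>
'''
urls = EmbedSketch._extract_urls(page)
```

Keeping the patterns in a class variable means a site-specific extractor would only need to override the list, not the method.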
Hey, this is amazing work! I wanted to test it out, but it no longer seems to work unfortunately. Issue with getting the token.
Thanks for reporting this! I'll look into it.
@bartbroere NPO has renamed 'token' to 'jwt'
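One way to stay tolerant of a rename like this is to try the new key first and fall back to the old one. The sketch below assumes the value arrives as a key in an already parsed JSON dict. The function name and payload shape are hypothetical and not taken from the branch.

```python
def pick_player_token(payload):
    """Return the token from a parsed response dict, trying the new key first."""
    for key in ('jwt', 'token'):
        value = payload.get(key)
        if value:
            return value
    # Neither key present (or both empty): let the caller decide how to fail.
    return None
```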
@bartbroere can this branch be (also) merged in yt-dlp and can development be synced there - or is there any specific reason why this is built on youtube-dl? (this was also referenced before in relation to yt-dlp/yt-dlp#9319) I might be off, but it seems that youtube-dl isn't considered upstream for yt-dlp (anymore) so any youtube-dl changes won't make it back to yt-dlp by default?
btw, related question. Is all content uploaded nowadays DRM protected? This is a great PR, but it seems like it can only be used on older content as the rest is DRM protected.
Thanks! I changed it here too!
I wouldn't be against merging this there as well. I did indeed think it would end up there automatically once merged here, but if that's not the case I could create a PR there too. There is an open PR there as well, let's start by tagging them so we can compare our work later: yt-dlp/yt-dlp#11227
Yes, unfortunately it seems practically all new content is now DRM protected. I kind of understand adding DRM when a broadcaster buys the rights to an international television show, but for things like the news I think it is a bit much (especially for a publicly funded organisation). In my opinion there are lots of legitimate reasons to want to use a news fragment offline. And you're also correct on the older content still being downloadable. I think this older content makes it worth the effort to fix the NPO extractor here.
Co-authored-by: dirkf <[email protected]>
Fix support for NPO sites
This fixes two things that have been changed on the NPO websites:
- The current video player no longer returns the token in the JSON body, but instead provides us with an XSRF token in the cookie.
- A second call changed from GET to POST and should include this XSRF token.

This branch started out as a small fix, but in the ~11 months the PR was open the NPO site was updated heavily, so now it changes a lot more than the things above. Many of the broadcaster (NL: omroep) sites' extractors (vpro etc.) no longer worked, so these have been removed entirely. I'm always willing to look into re-implementing support for some of these, if that's still possible. However, typically the broadcaster's website just embeds the npo.nl player, so grabbing the same media from the npo.nl URL is recommended.
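The two changes in the description (token taken from a cookie, second call sent as POST with that token) could look roughly like the standalone Python 3 sketch below. The cookie name, the header name, the URL and the request body are all assumptions made for illustration. The actual names used by NPO are not stated in this description, and the extractor itself would go through youtube-dl's own download helpers, not urllib.

```python
import json
from http.cookies import SimpleCookie
from urllib.parse import unquote
from urllib.request import Request


def xsrf_from_set_cookie(set_cookie_header, cookie_name='XSRF-TOKEN'):
    """Read the (assumed) XSRF cookie from a Set-Cookie style string."""
    cookie = SimpleCookie()
    cookie.load(set_cookie_header)
    morsel = cookie.get(cookie_name)
    # Cookie values are often percent-encoded, so decode before reuse.
    return unquote(morsel.value) if morsel is not None else None


def build_second_call(url, xsrf_token, body):
    """Build (but do not send) the follow-up POST carrying the XSRF token."""
    return Request(
        url,
        data=json.dumps(body).encode('utf-8'),
        headers={
            'Content-Type': 'application/json',
            'X-XSRF-TOKEN': xsrf_token,  # hypothetical header name
        },
        method='POST',
    )


token = xsrf_from_set_cookie('XSRF-TOKEN=abc%3D123; Path=/')
request = build_second_call(
    'https://example.invalid/player/PLACEHOLDER_ID',  # placeholder URL
    token,
    {'placeholder': True},  # placeholder body
)
```

Nothing is sent over the network here. The sketch only shows where the token would come from and how it would be attached to the request.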