[francetv] stop using retired FranceTV API and enable new one #29996
Conversation
youtube_dl/YoutubeDL.py
```python
    except (OSError, IOError):
        self.report_error('Cannot write subtitles file ' + sub_filename)
        return
elif sub_info.get('downloader') is not None:
```
Would `callable(sub_info.get('downloader'))` be a safer test?
Good one. Done.
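A quick sketch of why `callable()` is the safer guard here: any truthy but non-function value would pass an `is not None` check and then crash when invoked. The `sub_info` shape below is a simplified stand-in for the real subtitle dict, not the actual youtube-dl code path.

```python
# Hypothetical simplification of the subtitle-writing branch under discussion.
def write_subtitles(sub_info):
    downloader = sub_info.get('downloader')
    if callable(downloader):
        return downloader()            # safe: it really is invocable
    if downloader is not None:
        return 'skipped: not callable'  # junk value would have crashed an `is not None` branch
    return sub_info.get('data')         # fall back to inline subtitle data

assert write_subtitles({'downloader': lambda: 'WEBVTT'}) == 'WEBVTT'
assert write_subtitles({'downloader': 'oops'}) == 'skipped: not callable'
assert write_subtitles({'data': 'WEBVTT'}) == 'WEBVTT'
```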
You might also consider this, if your other changes haven't populated the properties mentioned.
Hi @sarnoud, I reviewed and tested your code with the fix for the FranceTV extractor.
Everything is OK. But there is a bug with the info returned as JSON: it is not serializable. This issue breaks the youtube-dl hook with mpv (MPlayer).
Presumably. Also at l.2077: …
Good catches! Submitted changes.
After both these commits, the JSON output is correct. But after some analysis, I don't understand your code for getting the subtitle downloader. With this lambda, …
Yes, you are correct. There were two ways to get captions: via a URL (the 'url' attribute) or by providing the data directly (the 'data' attribute). The issue with captions now is that they are delivered as an M3U8 stream that needs to be downloaded. So there would be two options: (i) download captions all the time and put them in 'data', or (ii) provide a lambda that will only do the download when needed. The lambda is also the thing that was not getting properly serialized in the json.dump calls, BTW. Does that make sense?
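The deferred-download idea described above can be sketched like this. Everything here is hypothetical illustration: `fetch_m3u8_captions` stands in for the real network call, and the `sub_info` keys mirror the 'url'/'data' convention mentioned in the comment.

```python
calls = []  # records downloads so we can see when they happen

def fetch_m3u8_captions(url):
    # stand-in for downloading and reassembling the M3U8 caption stream
    calls.append(url)
    return 'WEBVTT\n\n00:00.000 --> 00:01.000\nBonjour'

sub_info = {
    'ext': 'vtt',
    # option (ii): a zero-argument lambda, evaluated only on demand
    'downloader': lambda: fetch_m3u8_captions('https://example.invalid/captions.m3u8'),
}

# Nothing has been downloaded yet:
assert calls == []

# Only when the core decides to write subtitles is the lambda evaluated:
data = sub_info['downloader']()
assert calls == ['https://example.invalid/captions.m3u8']
assert data.startswith('WEBVTT')
```

The trade-off, as noted below, is that a function value like this cannot pass through `json.dump` unchanged.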
Possibly the subtitle generator should be evaluated when generating JSON rather than just emitting "not serializable"?
Technically possible, but the download itself is quite involved. See for yourself with something like: … I would advise against doing it for every JSON serialization.
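For concreteness, this is the failure mode being discussed: `json.dumps` raises `TypeError` on a function value. A `default=` hook (an assumed mitigation, not what the PR does) can substitute a placeholder without evaluating, i.e. without downloading, the lambda.

```python
import json

# A lambda buried in the info dict is not JSON-serializable:
info = {'id': 'abc', 'subtitles': {'fr': [{'downloader': lambda: 'WEBVTT'}]}}

try:
    json.dumps(info)
    raised = False
except TypeError:
    raised = True
assert raised

# Defensive option: replace non-serializable values with a placeholder
# instead of calling them (which would trigger the download every time).
dumped = json.dumps(info, default=lambda o: '<not serializable>')
assert '<not serializable>' in dumped
```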
@sarnoud thanks a lot for the fix! Works well 👍
As this is adding a new element to the extractor-core API, I think it needs a bit more thought.

Presumably the reason for not downloading the captions in the extractor is to avoid unnecessarily downloading them without second-guessing the core logic (…). So instead of returning the actual subtitles, we return a downloader function, essentially a closure, and on first glance this seems like a cleaner solution than the …

However, I believe that these are really two alternatives of the same sort. One is a URL that has to be downloaded as a plain text page; the new case is a URL that has to be downloaded as M3U8. Surely we should deal with both cases in the same logic? And, just as the core shouldn't be playing with random IEs, I'm not sure that an extractor should know about the downloader's …

Here's my proposal that addresses both concerns and (if it works) has some other benefits:

Provided that there aren't significant unwanted side-effects in using the …
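The "deal with both cases in the same logic" suggestion could look roughly like the dispatcher below: extractors declare only *where* the captions live, and the core picks the fetch strategy. All names here (`fetch_plain`, `fetch_m3u8`, `resolve_subtitle`, the `protocol` key) are hypothetical stand-ins, not actual youtube-dl API.

```python
def fetch_plain(url):
    # stand-in for an ordinary one-shot text download
    return 'WEBVTT (plain from %s)' % url

def fetch_m3u8(url):
    # stand-in for downloading and joining fragmented HLS captions
    return 'WEBVTT (reassembled from %s)' % url

def resolve_subtitle(entry):
    if 'data' in entry:                        # already inline, nothing to fetch
        return entry['data']
    url = entry['url']
    if entry.get('protocol') == 'm3u8_native':  # the new FranceTV case
        return fetch_m3u8(url)
    return fetch_plain(url)                     # the classic 'url' case

assert resolve_subtitle({'data': 'WEBVTT'}) == 'WEBVTT'
assert 'reassembled' in resolve_subtitle({'url': 'u', 'protocol': 'm3u8_native'})
assert 'plain' in resolve_subtitle({'url': 'u'})
```

The benefit over the closure approach is that the info dict stays plain data (so JSON serialization keeps working) while the extractor still never triggers an eager download.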
@dirkf I understand your motivation to encourage the …
…eep the download logic in YoutubeDl
Sure, understandable point. There's a halfway house: instead of replacing the existing …
Unfortunately it's not easy to scan the extractors for use of …
Yep.
It's good for me (if it works with FranceTV). If the real maintainers reappear they might prefer the two-stage approach, but it would be easy to switch to that.
Thanks @dirkf. What would be the next step? Would you approve it to trigger the workflow and get it merged?
I found two issues. It's possible that I've not correctly built the binary from git, but:
For example: … and then in mpv: … I can manually get the correct file by using the -f flag, but it no longer does the right thing automatically. The subtitles look like this: …
Does FranceTV have a separate programme page for the AD version (as with the BBC)? If so, the AD should only be used with that page's URL. Otherwise the AD should only be fetched if selected with …
How does that manifest itself? Although I'm no VTT expert, the text you quote looks like what is specified in RFC 8216 section 3.5 and its sources. From the UK I get 403 on both the M3U8 and MPD sources, so I can't test further. In …
For this issue, we must set … With this modification, the HLS audio with no audio description is always chosen as the "best audio" format. I also added a format note for audio description, to display it with the -F (list formats) flag.
There can be several audio tracks for the same programme on the same web page: French, French with audio description, and sometimes the original version ("Version Originale"), which could be English, Italian, etc. On the France.tv web page you can make a selection, but if you use the browser-based player it defaults to the French audio track even if the original programme was in Italian, for example. The audio-described track is never the default as far as I can tell. When previously using the -F and then -f flags to pick streams, I found it a bit unreliable, as you could often get French no matter which stream you picked with youtube-dl, even though the web player would play the Italian, for example. Montalbano sounds really strange in French...
I can confirm that in the past the VO audio tracks were often NOT present in the -F listings, or if it seemed like they were, then when you downloaded and played them they were all French audio even if they said otherwise. Yet the VO audio tracks were playable from the site, which means they were available. This just means that the files were not being picked up correctly by the francetv IE in the past. My thought is that the new PR introduced here will pick up these VO files in the new MPD formats that weren't being picked up by the old francetv IE. So far with my limited testing of the new PR, … The bad part is that the current … The only way to solve this is to add "descriptive" text to the AD and VO format ids. But that's only possible if that type of text is available in the source manifests.
I followed this issue and commit a bit on the … The ability to download subtitles isn't really a 'fix' if they're out of order and don't work. I'm going to clone sarnaud's PR and see for myself where things stand in this regard. I presume the old HLS subtitles work fine though, since they were working before …
Apparently yt-dlp has a whole module for reassembling WebVTT from fragment downloads, which might address the several TODO items in the code related to WebVTT.
The subtitle downloading and reassembling code was submitted to youtube-dl first in #6144. Of course, further improvements have been made to it in yt-dlp/yt-dlp#247. I hope either of these PRs and the discussions in them are helpful to you guys.
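To illustrate why reassembly needs a dedicated module at all: each HLS subtitle fragment repeats the `WEBVTT` header (and typically an `X-TIMESTAMP-MAP` line), so naive concatenation yields invalid VTT. The sketch below only strips the repeated headers; the real implementations referenced above also remap cue timestamps and deduplicate cues, which this deliberately omits.

```python
def join_vtt_fragments(fragments):
    # Naive join: keep one WEBVTT header, drop per-fragment header lines.
    out = ['WEBVTT']
    for frag in fragments:
        for line in frag.splitlines():
            if line.startswith('WEBVTT') or line.startswith('X-TIMESTAMP-MAP'):
                continue
            out.append(line)
    return '\n'.join(out)

frags = [
    'WEBVTT\nX-TIMESTAMP-MAP=LOCAL:00:00:00.000,MPEGTS:0\n\n00:00.000 --> 00:01.000\nBonjour',
    'WEBVTT\nX-TIMESTAMP-MAP=LOCAL:00:00:00.000,MPEGTS:90000\n\n00:01.000 --> 00:02.000\nSalut',
]
joined = join_vtt_fragments(frags)
assert joined.count('WEBVTT') == 1
assert 'Bonjour' in joined and 'Salut' in joined
```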
Done |
Thanks to both of you for that! I just cloned sarnaud's PR repo, but my guess is it would make more sense to clone yt-dlp to pick up the VTT ordering fixes already there and then try to merge sarnaud's PR #29996 into it, rather than the other way around, due to the number of files changed in yt-dlp #247. @pukkandan, are you going to merge sarnaud's #29996 into yt-dlp, or wait on it? My merge skills are a little rusty. I presume I just need to add a remote for sarnaud's fork, fetch his PR branch, merge it into yt-dlp master (or my own branch of it), and fix conflicts if they arise. Sound right? Thanks
```python
info = {
    'title': None,
    'subtitle': None,
    'image': None,
    'subtitles': {},
    'duration': None,
    'videos': [],
    'formats': [],
}
```
It seems rather silly to stuff extracted information into an info dictionary only to extract each key back one by one at the end. The churn of changing formats into info['formats'] later in the extractor inflates the diff size and makes code harder to review without actually accomplishing anything.
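The two styles under discussion are equivalent in result; the dict-accumulator version just renames every later use of `formats` to `info['formats']`, inflating the diff. A minimal illustration (variable names are made up, not the extractor's):

```python
# Dict-accumulator style: collect into a dict, then unpack it again.
info = {'title': None, 'formats': []}
info['title'] = 'Example'
info['formats'].append({'format_id': 'hls-1'})
result_a = {'title': info['title'], 'formats': info['formats']}

# Plain locals, assembled once at the end: same result, smaller diff.
title = 'Example'
formats = [{'format_id': 'hls-1'}]
result_b = {'title': title, 'formats': formats}

assert result_a == result_b
```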
```python
'id': video_id,
'title': self._live_title(title) if is_live else title,
'title': self._live_title(info['title']) if is_live else info['title'],
'description': clean_html(info.get('synopsis')),
```
This is no longer extracted. The info dictionary issue is masking a bug here.
```python
'duration': int_or_none(info.get('real_duration')) or parse_duration(info.get('duree')),
'thumbnail': info.get('image'),
'duration': int_or_none(info.get('duration')),
'timestamp': int_or_none(try_get(info, lambda x: x['diffusion']['timestamp'])),
```
Likewise.
```python
self._sort_formats(formats)
for f in info['formats']:
    preference = 100
    if f['format_id'].startswith('dash-audio_qtz=96000') or (f['format_id'].find('Description') >= 0):
```
```diff
- if f['format_id'].startswith('dash-audio_qtz=96000') or (f['format_id'].find('Description') >= 0):
+ if f.get('language') == 'qtz':
```
should work as well, does it not?
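A sketch of the preference tweak this suggestion enables: demote audio-description tracks (France TV tags them with the language code 'qtz') so default format selection never picks them, and label them for `-F` output. The field names follow the usual youtube-dl format-dict conventions, but the values and the demotion logic here are illustrative, not the PR's exact code.

```python
formats = [
    {'format_id': 'dash-audio_fre=96000', 'language': 'fre'},
    {'format_id': 'dash-audio_qtz=96000', 'language': 'qtz'},  # audio description
]

for f in formats:
    if f.get('language') == 'qtz':
        f['preference'] = -10                      # never the default pick
        f['format_note'] = 'audio description'     # visible in -F listings
    else:
        f['preference'] = 100

best = max(formats, key=lambda f: f['preference'])
assert best['language'] == 'fre'
assert formats[1]['format_note'] == 'audio description'
```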
@sarnoud We could have a fix for DRM videos here: …
Original PR: ytdl-org/youtube-dl#29996 Closes: #970, ytdl-org/youtube-dl#29956, ytdl-org/youtube-dl#29957, ytdl-org/youtube-dl#29969, ytdl-org/youtube-dl#29990, ytdl-org/youtube-dl#30010 Authored by: fstirlitz, sarnoud
@sarnoud I know that youtube-dl isn't meant to bypass DRM, but it did so before with France TV videos, so… Is there a way to implement such a command in the francetv.py extractor?
(Based on the …)
Sorry, I got lost. Is there a latest patch, and a how-to, for downloading e.g. https://www.france.tv/france-2/journal-20h00/2822769-edition-du-vendredi-22-octobre-2021.html?
@sarnoud If you’re still around: since I updated Python to 3.10, your patch no longer works when I replace the original files with your patched ones:
Please follow the guide below
Put an x into all the boxes [ ] relevant to your pull request (like this: [x]). Before submitting a pull request make sure you have:
In order to be accepted and merged into youtube-dl, each piece of code must be in the public domain or released under the Unlicense. Check one of the following options:
What is the purpose of your pull request?
Description of your pull request and other information
The initial goal is to fix the newly broken francetv extractor (see issue #29956 (comment))
A few changes: