I'm using model 4.1, on this video, downloaded in what youtube-dl/yt-dlp identifies as format number 248 (the default if you exclude resolutions above 1080p).
There is a "cut" between scenes/shots at 15 frames past the 11-second mark in the input video. In the output video (with default settings), this turns into a downright disorienting transition. Looking frame by frame, the first two frames after the cut are heavily distorted.
Here are those two frames, plus the next two for context:

The culprit is the if block starting at line 227:
    if ssim > 0.996:
        frame = read_buffer.get() # read a new frame
        if frame is None:
            break_flag = True
            frame = lastframe
        else:
            temp = frame
        I1 = torch.from_numpy(np.transpose(frame, (2,0,1))).to(device, non_blocking=True).unsqueeze(0).float() / 255.
        I1 = pad_image(I1)
        I1 = model.inference(I0, I1, args.scale)
        I1_small = F.interpolate(I1, (32, 32), mode='bilinear', align_corners=False)
        ssim = ssim_matlab(I0_small[:, :3], I1_small[:, :3])
        frame = (I1[0] * 255).byte().cpu().numpy().transpose(1, 2, 0)[:h, :w]
If I change the initial threshold from 0.996 to 1 (which I think disables the block because ssim can never be >1), the issue disappears:
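As a rough sanity check on the "ssim can never be >1" reasoning, here is a minimal pure-Python sketch. It uses a simplified single-window SSIM that I wrote for illustration (`global_ssim` is a hypothetical helper, not the repository's `ssim_matlab`, which as far as I can tell works on windowed tensors), and the pixel values are made up. It only shows that identical inputs score about 1 and that `> 1` does not fire, while dissimilar inputs land far below 0.996.

```python
# Simplified single-window SSIM over flat lists of values in [0, 1].
# Assumption: this stands in for the real metric only to illustrate its upper bound.

def global_ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    n = len(x)
    mx = sum(x) / n                      # mean of x
    my = sum(y) / n                      # mean of y
    vx = sum((a - mx) ** 2 for a in x) / n   # variance of x
    vy = sum((b - my) ** 2 for b in y) / n   # variance of y
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    numerator = (2 * mx * my + c1) * (2 * cov + c2)
    denominator = (mx * mx + my * my + c1) * (vx + vy + c2)
    return numerator / denominator


# Made-up "frames": one compared with itself, one compared across a cut.
same = [0.1, 0.5, 0.9, 0.3]
cut = [0.9, 0.2, 0.1, 0.8]

identical_score = global_ssim(same, same)
cut_score = global_ssim(same, cut)

# With a threshold of 1, the block's condition would not trigger even for
# identical frames (allowing a tiny tolerance for floating-point rounding).
triggers_at_threshold_1 = identical_score > 1.0 + 1e-9

print(identical_score, cut_score, triggers_at_threshold_1)
```

Since 2ab <= a^2 + b^2 for the means and 2*cov <= vx + vy for the variances, each factor of the ratio is at most 1 in exact arithmetic, which is why a threshold of 1 should effectively turn the block off.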
