If we can use GANs to produce believable images of a subject and we can identify whether a video frame can exist between two others, it seems like we can produce infinitely high framerate videos that look like they were captured that way. This also means we can make slow motion video when we didn't record at a high framerate. I think with that, colorization, and possibly similar tools for audio, we might see some amazing recreations of classic performances.
Yep, you certainly can. 'In-betweening', like superresolution, has been a GAN thing for years now, because triplets of frames are a clean dataset but you also care more about perceptual plausibility than pixel error. People use in-betweening GANs to make things like 60 FPS anime. (Not entirely sure why, but they do.)
>People use in-betweening GANs to make things like 60 FPS anime
Animation seems like an especially poor fit to me, since the actual framerate is often much lower than the video's framerate. Framerate can vary between scenes and even within different parts of one scene! Typically the background is very low framerate (sometimes as low as 4 FPS), the foreground is higher framerate (typically 8-12 FPS), while pans, zooms, and 3D elements are at a full 24 FPS. Most of the additional frames from interpolation will therefore be exact duplicates of other frames.
This does little to improve the smoothness of the video. It just adds in artifacts. And, since the frames between two drawings will be interpolated while frames within one drawing will be unchanged, the framerate will be inconsistent and appear as judder.
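To make the judder argument concrete, here is a toy sketch in Python. It assumes each frame can be stood in for by a single number (say, an object's horizontal position), that the shot is animated 'on threes', and that the interpolator just inserts a midpoint between every adjacent pair of source frames. `hold` and `naive_double` are made-up names for illustration, and the midpoint is only a stand-in for whatever a real interpolator would produce.

```python
def hold(drawings, n):
    """Repeat each drawing n times, i.e. animate 'on n-s'."""
    return [d for d in drawings for _ in range(n)]


def naive_double(frames):
    """Insert one new frame between every adjacent pair of source frames.

    Identical neighbours yield a duplicate; different neighbours yield
    their midpoint (a stand-in for a real interpolated frame).
    """
    out = []
    for a, b in zip(frames, frames[1:]):
        out.append(a)
        out.append((a + b) / 2)
    out.append(frames[-1])
    return out


# Four drawings, each held for three frames (8 drawings/s at 24 FPS).
source = hold([0, 3, 6, 9], 3)
doubled = naive_double(source)

# Inserted frames sit at the odd indices of the doubled sequence.
inserted = doubled[1::2]
duplicates = sum(1 for new, prev in zip(inserted, source) if new == prev)

# Per-frame movement in the doubled video.
steps = [b - a for a, b in zip(doubled, doubled[1:])]

print(f"{duplicates} of {len(inserted)} inserted frames are duplicates")
print("distinct per-frame steps:", sorted(set(steps)))
```

In this toy case 8 of the 11 inserted frames are plain duplicates, and the motion alternates between standing still for several frames and moving in short bursts, which is the uneven cadence described above.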
Interpolation will never work for 2D animation. No way, no how. Any worthwhile system will need to modify existing frames rather than simply adding more in between the original frames. I can understand interpolation for live action (though I still dislike it), but it is absolutely god-awful for animation.
I think that's wrong: the whole point of GANs is that they're quite intelligent and good at faking outputs. I've seen interpolated/in-betweened videos (mostly but not entirely live-action), and it looks realistic to me.
The reason I'm somewhat skeptical is that just because something looks realistic doesn't mean that it is what was intended. It's a version of the 'zoom in, enhance, enhance' problem. It's like the _Hobbit_ problem: a GAN could perfectly well fake a 60FPS version of a 30FPS version of the _Hobbit_ such that you couldn't tell that it wasn't the actual 60FPS version that Peter Jackson shot... but the problem is that it's 60FPS and that just feels wrong for cinema. Animators, anime included, use the limitations of framerate and deliberate switches between animating 'on twos' etc, with reductions in framerate being done deliberately for action segments, sakuga, and other reasons. An anime isn't simply a film which was unavoidably shot with a too-low framerate.
(This is less true of superresolution: in most cases, if an anime studio could have afforded to animate at a higher resolution originally, they would have; and you're not compromising any 'artistic vision' if you use a GAN to do a good upscaling job instead of a lousy bilinear upscale built into your video player.)
That's the problem: no matter how smart your algorithm is, you cannot make animation look smooth by only adding frames. Not even human animators could do that.
The framerate of animation is irrelevant. What matters is the number of drawings per second, not the number of frames. An intelligent system would interpolate between drawings, which would often require modifying or deleting frames from the source.
I'm not some purist claiming that this is an evil technology. It just plain doesn't apply to animation, except for pans or the rare scene animated at a full 24 FPS.
I'm not following. (If it doesn't apply at all, how is anyone doing it...?) Of course you can identify drawings per second, much the same way a monitor can display a 24FPS video at 120hz without needing to be an 'intelligent system': you increase or decrease the number of duplicates as necessary. You in-between pairs of different frames, replacing all the identical ones which are simply displaying the same drawing.
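A rough Python sketch of that idea, under some loud assumptions: frames are again stood in for by single numbers, a held drawing shows up as a run of exactly identical frames (real video would need a similarity threshold because of compression noise), and linear interpolation stands in for the GAN. `collapse_runs` and `inbetween_drawings` are hypothetical names, not anything from an existing tool.

```python
def collapse_runs(frames):
    """Group identical consecutive frames into (drawing, hold_length) pairs."""
    runs = []
    for f in frames:
        if runs and runs[-1][0] == f:
            runs[-1][1] += 1
        else:
            runs.append([f, 1])
    return [(d, n) for d, n in runs]


def inbetween_drawings(frames):
    """Replace held duplicates with in-betweens toward the next drawing.

    The output has the same number of frames as the input, so the video's
    framerate is unchanged; only the duplicate frames are rewritten.
    """
    runs = collapse_runs(frames)
    out = []
    for (a, n), (b, _) in zip(runs, runs[1:]):
        for i in range(n):
            # Linear blend as a placeholder for a learned in-betweener.
            out.append(a + (b - a) * i / n)
    # Nothing follows the last drawing, so it stays held.
    last, n_last = runs[-1]
    out.extend([last] * n_last)
    return out


# Four drawings, each held for three frames.
source = [0, 0, 0, 3, 3, 3, 6, 6, 6, 9, 9, 9]
print(collapse_runs(source))
print(inbetween_drawings(source))
```

Note that this rewrites existing frames rather than only adding new ones between them: every duplicate in a held run is replaced, and the per-frame motion comes out even until the final held drawing.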
We're so used to 24fps movies that it's become a subconscious cue for identifying 'realistic' film. Higher frame rates look like video games because all of the high frame rate CGI we see is in games.
(IMO this is just something we have to push through. I hate the low frame rate of movies.)
SVP [0], a Windows / Linux program, can handle animation (as well as film) quite well, interpolating to 60fps or greater. Try it out for yourself and see. It doesn't use GANs, however; it uses a sufficiently complex algorithm to do the interpolation.
Yes, and it does so by turning the system off almost entirely. For animation, people often disable interpolation for everything except pans. As I mentioned before, those are usually at 24 FPS already.
Could this then be used to reduce the storage required for phones to capture slow motion video, since the slow motion could simply be generated from regular video in post, server side?