What exactly is the problem? That they worked on video generation models? That they only used YouTube? That they downloaded videos from YouTube? That they downloaded multiple videos from YouTube?