I'd like to know more about the compression technique used as well. It would be nice to know how much "smear" or "lossiness" there is.
Now just free-wheel brainstorming here... but it seems to me that given an acceptably "clean" compression of the video, one could generate a good fingerprint from the hashes of the histograms of each frame.
Then a client-side app could sample a reasonable number of frames and, given an algorithm for finding invariant histograms (one that, despite changes in lighting, capture quality, etc., outputs a deterministic histogram within some acceptably narrow tolerance), generate a fingerprint good enough for a heuristic algorithm to search a database of movie fingerprints...
Essentially, Shazam for video.
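To make that concrete, here's a rough sketch of the per-frame idea: compute a coarse luminance histogram, quantize it into wide buckets so small lighting/quality shifts collapse to the same signature, and hash the result. The bin count, the 5% quantization step, and the function names are all hypothetical choices for illustration, not a tested scheme:

```python
import hashlib

def frame_fingerprint(pixels, bins=16):
    # Coarse luminance histogram: few bins + normalization give some
    # tolerance to small lighting/quality shifts.
    hist = [0] * bins
    for p in pixels:  # p is a 0-255 luminance value
        hist[min(p * bins // 256, bins - 1)] += 1
    total = len(pixels)
    # Quantize each bin to a 5% bucket so near-identical histograms
    # collapse to the same tuple, and therefore the same hash.
    quantized = tuple(round(100 * h / total / 5) for h in hist)
    return hashlib.sha1(repr(quantized).encode()).hexdigest()[:16]

def video_fingerprint(frames, step=10):
    # Sample every `step`-th frame; the fingerprint is the sequence of
    # per-frame hashes, which a server could match heuristically.
    return [frame_fingerprint(f) for f in frames[::step]]

# A slightly brightened copy of a frame can land in the same quantized
# histogram and so yield the same hash (here the pixel values sit safely
# inside their bins, so a +4 shift doesn't cross a bin boundary).
frame = [(x % 16) * 16 + 6 for x in range(10000)]
brighter = [min(p + 4, 255) for p in frame]
print(frame_fingerprint(frame) == frame_fingerprint(brighter))  # True
print(frame_fingerprint(frame) == frame_fingerprint([0] * 10000))  # False
```

The hard part, of course, is exactly what's hand-waved in the comment above: pixels near a bin or quantization boundary can flip buckets under small perturbations, which is why real systems tend to compare fingerprints by distance rather than exact hash equality.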