Yeah, games like StarCraft will probably need a working memory component. The task they solve here with RL is a simple puzzle game, so it'll be interesting to see whether this works for Atari games or StarCraft.
But, sadly, it doesn't work as advertised. My hypothesis is that it's extremely difficult to get great results with greedy training. Jointly learning multiple layers from huge datasets is what made things start working so well.
If you need to understand emotional reactions in video sources, our API can cover gaps that Google's Cloud Vision API currently leaves open: https://www.kairos.com/emotion-analysis-api
We use neural nets to generate descriptors of videos where motion is observed, and then classify events as normal or abnormal.