Good question. Right now I'm using AWS spot instances plus credits bought with Bitcoin. That makes the whole thing maybe 70 times cheaper than paying the normal on-demand rate.
But when I go above several billion images, I'll need some investment for sure :-)
TinEye has indexed about 20 billion images, and my estimates show I should be fine with my own resources up to 1B. That's good enough to improve and test the tech.
I'll agree with you that it's much harder than it should be (thankfully, finding the implementations is the hard part, not using them), but yes, these methods do exist.
DeepLIFT (the method I linked in my original comment: https://github.com/kundajelab/deeplift) takes a Keras model (with a Theano or TensorFlow backend) as input and provides feature importance scores for any desired layer of the network (raw data inputs, inputs to dense layers following convolutions, etc.). Keras-Vis (https://github.com/raghakot/keras-vis) is another nice package that makes it easy to visualize saliency maps and convolutional filters. Perturbing inputs and looking at the effect on the network's output is another technique people use pretty often; there's a rough sketch of that below.
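If it helps, here's roughly what the perturbation (occlusion) approach looks like with a Keras model. This is just a sketch under my own assumptions: the function name, patch size, and gray baseline are illustrative, not part of DeepLIFT or Keras-Vis.

    import numpy as np

    def occlusion_importance(model, img, patch=8, baseline=0.0):
        # Probability of the originally predicted class on the clean image.
        # `img` is a single image of shape (H, W, C); `model` is any
        # trained Keras classifier (names here are hypothetical).
        probs = model.predict(img[np.newaxis])[0]
        top = probs.argmax()
        h, w, _ = img.shape
        heat = np.zeros((h // patch, w // patch))
        # Slide a flat patch over the image; the drop in the top-class
        # probability tells you how much each region mattered.
        for i in range(0, h - patch + 1, patch):
            for j in range(0, w - patch + 1, patch):
                occluded = img.copy()
                occluded[i:i + patch, j:j + patch, :] = baseline
                p = model.predict(occluded[np.newaxis])[0][top]
                heat[i // patch, j // patch] = probs[top] - p
        return heat

Note the trade-off: occlusion needs one forward pass per patch position, while DeepLIFT and gradient-based saliency give you finer-grained scores from a single backward pass.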
I think there's a lot of room for the tooling in this space to get easier to use, especially for newer deep learning practitioners. On that point, I definitely agree with the author of this blog post.
It is more about economic power.
The UN could make Iran give up on nuclear weapons using economic sanctions, but in this situation the cats are much stronger economically.