Anyone who has tried to do significant translation knows that Google Translate is pretty bad. If you just want to translate a word or even a single sentence, it's fine. But if you want to venture into larger pieces of text (like a script, for example), then you couldn't do much worse than Google Translate.
There are definitely options available for Hebrew transcription, and Israel has no shortage of domestic computer talent. Human transcription services run about $150 per hour of program material, which is not a huge expense for a business. It's certainly an inconvenience, but a predictable one for the market they're operating in.
Cool, it's impressive how much it can do with a short sample, although this seems like an easy way for end users to deepfake their friends / enemies saying something.
Maybe the solution is to have a randomly generated paragraph of text to read, which expires in a short amount of time. So you can't predict it, and you don't have enough time to splice together a fake reading from something else.
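A rough sketch of that idea in Python, just to make it concrete. Everything here is hypothetical: the `WORDS` list, the `Challenge` type, and the `issue_challenge` / `is_acceptable` helpers are made up for illustration, and it assumes some separate speech-to-text step has already turned the recording into a transcript.

```python
import secrets
import time
from dataclasses import dataclass

# Hypothetical vocabulary; a real system would want a far larger word list
# so the passage is hard to assemble from existing recordings.
WORDS = [
    "river", "candle", "orbit", "velvet", "harbor", "lantern",
    "meadow", "copper", "thunder", "window", "garden", "silver",
]


@dataclass(frozen=True)
class Challenge:
    text: str          # passage the speaker must read aloud
    expires_at: float  # Unix timestamp after which the passage is rejected


def issue_challenge(num_words=25, ttl_seconds=60.0, now=None):
    """Create an unpredictable passage that is only valid for a short time."""
    issued = time.time() if now is None else now
    # secrets.choice draws from the OS random source, so the passage
    # cannot be predicted ahead of time.
    text = " ".join(secrets.choice(WORDS) for _ in range(num_words))
    return Challenge(text=text, expires_at=issued + ttl_seconds)


def is_acceptable(challenge, transcript, now=None):
    """Accept only a timely reading whose words match the passage exactly."""
    current = time.time() if now is None else now
    if current > challenge.expires_at:
        return False  # too late: leaves no time to splice a fake together
    return transcript.lower().split() == challenge.text.lower().split()
```

The exact word-for-word match is the simplest possible check. Real transcripts are noisy, so a practical version would probably need some tolerance for recognition errors, and it would also need to tie the transcript to the audio actually used for cloning.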
The problem with any anti-abuse measure is that someone can create another project which does not have any of it. There are a handful of projects which can do pretty good voice synthesis right now. Getting them all to adopt such a measure would be about as easy as getting a consensus for all photo editing tools to place a watermark on the image to prevent abuse.