Is there reason to believe Julia is actually fast outside of microbenchmarks? Their strategy of aggressive specialization will always look good in microbenchmarks, where there's only one code path, but could blow up in a large codebase where you actually have to dispatch across multiple options. I've never seen a Julia benchmark on a big piece of code.
I've had some problems with Julia performance in a finite element method implementation (https://github.com/scharris/WGFEA), mostly, I believe, because of (1) garbage generated in loops causing big slowdowns and (2) slow calls to closures. Functions defined at the top level that don't close over environment values are fast. Closures, however, are quite slow, which is really painful in exactly the situations where closures are most useful, e.g. passing them as integrands to integration functions.
There is a GitHub issue about the closure slowdown, but I don't have it handy. Both problems can be worked around by writing in a lower-level style, e.g. using explicit loops that act on pre-allocated buffers, avoiding higher-order functions, etc. The pre-allocated buffers can be a lurking hazard though (Rust avoids that danger when using the same strategy through its safe, immutable "borrow" idea). For my own tastes, these workarounds gave up too many of the advantages of a high-level approach.
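To make the buffer point concrete, here is a minimal Rust sketch of the pre-allocated buffer strategy. It is an illustration only, not code from WGFEA; the names `eval_midpoints` and `sum` are made up for this example. The buffer is allocated once and reused on every pass, and the borrow rules keep anyone from reading a stale view of it while it is being overwritten.

```rust
/// Hypothetical helper: fill `out` with `f` evaluated at the midpoints of
/// `out.len()` equal subintervals of [a, b]. No allocation happens here.
fn eval_midpoints<F: Fn(f64) -> f64>(f: F, a: f64, b: f64, out: &mut [f64]) {
    let n = out.len();
    let h = (b - a) / n as f64;
    for (i, slot) in out.iter_mut().enumerate() {
        *slot = f(a + (i as f64 + 0.5) * h);
    }
}

/// Hypothetical helper: read-only pass over the buffer (an immutable borrow).
fn sum(values: &[f64]) -> f64 {
    values.iter().sum()
}

fn main() {
    // Allocate the scratch buffer once, outside the loop.
    let mut buf = vec![0.0; 4];

    for k in 1..=3 {
        let scale = k as f64;
        // Mutable borrow: the buffer is overwritten in place.
        eval_midpoints(|x| scale * x, 0.0, 1.0, &mut buf);
        // Immutable borrow: safe, because the mutable borrow has ended.
        let total = sum(&buf);
        println!("k = {}: sum of samples = {}", k, total);
    }

    // The hazard the compiler rules out: keeping a read-only view alive
    // across a write. Uncommenting these lines is a compile error.
    // let view = &buf;
    // eval_midpoints(|x| x, 0.0, 1.0, &mut buf);
    // println!("{}", view[0]);
}
```

In a language without this check, the commented-out pattern would compile and silently read overwritten data, which is the lurking hazard with shared scratch buffers.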
I have converted to Rust, for sure to avoid the garbage collection, and I'm extremely pleased with the performance. It would be nice to have a REPL though; I do miss that. I do intend to stay involved with Julia, and I'm sure the situation will improve.
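For the integrand case, a rough sketch of what passing a closure looks like on the Rust side. This is an assumption-laden illustration, not the WGFEA code: `integrate` is a made-up midpoint-rule function. Because it is generic over the closure type, the compiler generates a specialized copy for each closure passed in, so the call to `f` is statically dispatched and the captured value needs no heap allocation.

```rust
/// Hypothetical midpoint-rule integrator, generic over the integrand type.
/// Each distinct closure passed in gets its own specialized copy of this function.
fn integrate<F>(f: F, a: f64, b: f64, n: usize) -> f64
where
    F: Fn(f64) -> f64,
{
    let h = (b - a) / n as f64;
    let mut acc = 0.0;
    for i in 0..n {
        // Evaluate the integrand at the midpoint of the i-th subinterval.
        acc += f(a + (i as f64 + 0.5) * h);
    }
    acc * h
}

fn main() {
    // `k` is captured from the environment, which is the case that was slow in Julia.
    let k = 3.0;
    let integrand = |x: f64| k * x * x;

    // The exact value of the integral of 3x^2 over [0, 1] is 1.
    let approx = integrate(integrand, 0.0, 1.0, 1000);
    println!("approximate integral = {}", approx);
}
```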
Good high-performance garbage collectors aren't easy to build (and they are easy to take for granted after being on the JVM for a while). That's probably the biggest challenge for Julia as a high-performance language, IMO.