Jeremy here. Thank you for sharing my article on HN! I've used more programming languages than I can count since I started coding 40 years ago, and it's not often I've gotten really excited about a new language. So the fact I've written this should tell you how special I think Mojo is.
It's early days of course - Mojo is still a prerelease, which means it's only available in an online playground and it's not open source yet (although that's planned). But for such a small team to create something like this so quickly gives me a lot of confidence for what's coming over the next few months.
Let me know if you have any questions or comments.
It seems like Julia already had a pretty good solution to the two-language problem, as well as being designed from the ground up for numerical computation (Python has numpy, but it seems bolted on and clunky by comparison). Yes, there are issues with large executables (large runtime), non-optimal GPU kernels, etc., but it seems like many of Julia's nagging issues could have been fairly easily solved if it had received as much investment as some of the alternatives have - Modular appears to have a lot of funding to develop Mojo, and frameworks like PyTorch and TensorFlow have the backing of mega tech corporations. At one point Swift for TensorFlow was going to be the next big ML language and was being funded and developed by Google.
I'm not sure there's a question here... more of a frustration that Julia seems to be a very good solution already (and is more mature than Mojo) and yet it seems to get passed over because companies decide to fund some new(er) shiny thing instead.
BTW: I think your matrix multiplication demo could achieve very similar performance (including the SIMD, parallelization and LoopVectorization) in Julia.
"get passed over because companies decide to fund some new(er) shiny thing instead" is not a fair or reasonable assessment. Why would a startup that's racing against time and money decide to create a new language when something already exists that can meet their needs with just a little tweaking?
My post embeds my recent keynote at JuliaCon in which I lay out some of the shortcomings of the language, as I see it.
MLIR didn't even exist when Julia was created. A language that can really take advantage of MLIR, and is designed from the ground up for modern accelerators, is not something that can be readily created by tweaking Julia AFAICT.
> Why would a startup that's racing against time and money decide to create a new language when something already exists that can meet their needs with just a little tweaking?
Have you MET programmers? The idea that we/they would do this is not anywhere close to surprising.
> MLIR didn't even exist when Julia was created. A language that can really take advantage of MLIR, and is designed from the ground up for modern accelerators, is not something that can be readily created by tweaking Julia AFAICT.
MLIR is another level of intermediate representation that sits at a higher level than LLVM-IR. Should the Julia folks decide that MLIR would offer advantages for optimization, it wouldn't be difficult for them to add an MLIR intermediate stage - that's kind of the whole point of an IR. But again, that would take some development effort - if the kind of investment that Mojo is getting were available for Julia, it would get done. Nothing about the language itself would need to change; it's in the middle-end and backend that MLIR can offer advantages.
Disagree that Julia solves the two-language problem, although it was a major step in that direction. It may currently solve it for specific roles (e.g. data scientists), but it does not solve it for general-purpose programming.
One specific example? Robotics and autonomous systems frequently use ahead-of-time (AOT) compiled binaries and libraries. You don't want your quadrotor / RC car / self-driving Tesla trying to perform JIT compilation while operating in time-sensitive scenarios. AOT compilation is a major Achilles heel for Julia--it's possible (with major limitations), but feels more like an afterthought than a core feature.
I can't see Julia scaling to general-purpose programming until its AOT compiling capabilities improve. JAX has some of the same drawbacks. I'm very interested to see if Mojo can fill the gap of AOT compilation with a reasonably simple syntax. (And yes, I am aware of Nim--I do wish its scientific ecosystem was more developed though).
This is a very good point. You must support AOT compilation if you are going to have a general-purpose system. This was clear when Julia came out, and I interacted with core developers at the time, communicating our experience creating SciPy: its critical reliance on the AOT features of Fortran/C/C++, as well as the bindings to Python. I believe simpler spellings can be achieved (i.e. more unification between the scripting, dynamic JIT, and foundational AOT use-cases), but the ecosystem is not close to a one-language-to-rule-them-all scenario.
I've always felt that the choice to 1-index and favor mathematics over engineering has kept the engineers away.
You can build a fantastic language for mathematicians, but people outside that domain won't feel welcome or compelled to use the language to scratch their itches.
I don’t think “mathematicians” is the right category. More like scientists/engineers who would otherwise use Fortran and Matlab (both have 1-based indexing) to do number crunching. That’s actually a lot of people, though, writing quite a bit of code. It’s just that they don’t work in the software industry, and usually for them the code is not the work product.
I think the aspiration was to provide something that a Matlab user can immediately pick up, that is almost as fast natively as Fortran, that also provides features to do good modern software engineering if you want to. And it succeeds amazingly at this.
In my own view, adoption has not been wider because 1) most people don’t need speed, so the trade-offs of the compiler + using a weird language aren’t worth it, and more importantly 2) outside of some incredible software gems, the package ecosystem tends to be flaky and maintained by overworked academics in their spare time.
> outside of some incredible software gems, the package ecosystem tends to be flaky and maintained by overworked academics in their spare time.
That was the point I was trying to make.
This language is too esoteric and leafy for most hackers / engineers / enthusiasts to spend their spare time building support packages. They'll pick up something like Nim or Rust as a new language before they'll look at Julia.
By catering to mathematicians, scientists, and engineers, the broader population of software folks were excluded. Or, if "excluded" is too harsh, they at least weren't incentivized.
Because of this, the best parsers, serializers, protocol implementations, scrapers, web servers, and other assortment of important technologies get written in other languages instead.
I think you are right about this -- I just wanted to draw the distinction between mathematicians and the much larger population of scientists/engineers, who really do write a lot of code which is in its way "serious computing" even if it doesn't deal with any of the concerns and practices of "good software engineering".
(I also think Julia would be okay with the current level of support for the "computer sciencey" technologies you describe like parser and web servers and so on -- it's pretty good at calling out to C or C++ if needed. But even many of the scientific/numerical packages don't have the maintenance resources they really need, so are often "good but flaky". This situation does seem to be slowly improving though.)
I've never understood why some people seem to care so much about 1- or 0-based indexing. Literally who cares? I cannot comprehend actually refusing to use an otherwise-perfectly-good language just because the indexes happen to start at 1.
People care about different things. There are probably people who don't give a rat's behind about stuff you care passionately about, or that irks you in some significant way. Does that invalidate your caring?
"... am I supposed to type (or copy paste) that symbol every time I want to xor two numbers? There must be a better way right?"
"Sorry, no."
I immediately gave up the language after this. Too bad, I really loved it and was looking forward to progress in DL stack in Julia at that time. Too math-y.
You can always just use `xor(a,b)`. Julia makes sure that there aren't any unicode operators that don't have an ascii equivalent (in Base at least). Also, most editors will allow you to type ⊻ as `\xor` and tab complete to ⊻.
Julia has built in support for special characters like that by typing \charname<tab>. You have to remember the name but it's not all that hard to type.
Julia didn't really gain momentum because its syntax is off-putting. When we started our first ML project in 2013, we tried R, Julia, and Python, and even though Python's ML libraries were rough at the time, we chose it because none of us liked the other languages' syntaxes, coming from a C++/Java background. Then we fully fell in love with Python.
Familiarity changes rapidly in the world of software development. But I do agree about discoverability.
However, I think the issue is rather that we need to find new discoverability systems that work for multiple dispatch rather than limit our languages to single dispatch because we can't come up with better ways to achieve discoverability.
Maybe (a,b)f would be better syntax.. or we could start writing backwards ;)
but now we're getting into seriously unfamiliar territory.
Multiple dispatch might actually be superior for discoverability when we get this right because you can filter the method list based on all arguments rather than just the first one.
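A toy sketch of that idea (in Python, purely illustrative - the registry, `discover`, and the method names are all hypothetical, not how Julia or any real IDE implements this): given methods keyed by their full argument-type signatures, a tool could narrow the candidate list using every argument, not just the first one.

```python
# Hypothetical method table: name -> list of argument-type signatures.
registry = {
    "distance": [(int, int), (float, float), (str, str)],
    "combine":  [(int, str), (list, list)],
}

def discover(args):
    """Return the names of methods whose signature matches the
    types of *all* supplied arguments, not just the receiver."""
    types = tuple(type(a) for a in args)
    return sorted(
        name
        for name, sigs in registry.items()
        if any(sig == types for sig in sigs)
    )

# Both arguments participate in narrowing the candidate list:
print(discover((1, "x")))    # ['combine']
print(discover((1.0, 2.0)))  # ['distance']
```

With single dispatch, a completion engine can only filter on the first argument's type; here every argument prunes the list further.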
Python's syntax is whitespace-sensitive and doesn't have curly braces around blocks.
They did finally add assignment expressions and conditional expressions, but every time I show a list comprehension to a Java programmer they glaze over.
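For anyone glazing over: a list comprehension is just a loop-plus-append collapsed into a single expression (minimal example, values chosen arbitrarily):

```python
# Squares of the even numbers 0..9, as an explicit loop...
squares = []
for n in range(10):
    if n % 2 == 0:
        squares.append(n * n)

# ...and as a list comprehension: same result, one expression.
squares = [n * n for n in range(10) if n % 2 == 0]
print(squares)  # [0, 4, 16, 36, 64]
```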
Hi Jeremy, thanks for sharing this, hope it works out well. Also, thanks for the past work on DAWNBench etc. that y'all did. It really helped push an era of speedy deep learning to the forefront and helped stoke the flames of my passion for this particular subfield of research. FastAI really put out a ton of good work that inspired and helped me a lot. I'm generally stuck on the linear version of OneCycle, but I've referred a lot to y'all and the community over the years when building out my toolbox.
In any case, you have more years under your belt than me in the software field, so I'm sure you have some of the standard skepticisms too; hopefully it works out really well. Will be keeping an eye on this software, it seems promising, especially as I tend towards the more esoteric methods that tend to break the existing deep-learning tooling out there.
Thanks, Jeremy. It does sound exciting. It reminds me a bit of what Blaze hoped to offer (a unified interface but to diverse data storage instead of processing systems) but never came to fruition. And folks have been talking about needing an ML/AI-centric language that could feel more natural and expressive than the abstractions that Tensorflow, PyTorch, and Jax provide. Maybe Mojo is it.
To get full performance though, you can't write just Python. As you show in your demo, you have to add verbose typedefs, structs, additional fn/defs, SIMD calls, vectorization hooks, loop-unrolling, autotune insertions, etc.
While great, this adds mental overhead and clutters an otherwise concise and elegant syntax. Do you think this syntactic molasses will become second nature to developers? Will IDE tools make writing it easier?
> To get full performance though, you can't write just Python. As you show in your demo, you have to add verbose typedefs, structs, additional fn/defs, SIMD calls, vectorization hooks, loop-unrolling, autotune insertions, etc.
As an engineer, it feels like you can never escape this. Speaking for myself, I don't want to!
Zooming in & out of different syntaxes that are tuned to that local context seems to be the better developer experience.
That’s what I felt when I saw mixed JSX (HTML & JavaScript) or mixed Swift & Obj-C & C++ code. Though, tbh, sometimes mixing stuff in does come with baggage that’s hard to jettison.
Regardless, as a software engineer who works with ml engineers, it is horrendously painful not having a unified systems language that helps build the model and deploy it into prod. Putting the ML scientists in charge of deploying to production by learning about SIMD calls, vectorization hooks, structs & typedefs is something I welcome.
Most of what you’re calling syntax isn’t syntax, it’s just extra code that allows you to be specific/concrete about how certain types and code should operate.
When it comes to types, once you’ve written the optimized type definition, you don’t have to think about it as much when actually using it. It doesn’t add extra clutter either.
As far as the other things, there isn’t really much extra to type other than being specific about what kind of optimization you want the code to use… vectorization, unrolling, etc.
Also I think it’s worth pointing out that AIs will be writing more and more code for us in the future. So assuming we’re writing much code at all in the future, it will probably be in as simple of a form as possible (like python) and then we can ask the AI to write whatever performance annotations it thinks will be effective.
2) they're the same thing. Consume is the word we're currently using for the operator, owned is the argument convention. We may need to iterate on terminology a bit more.
3) Because it composes properly with chained expressions: `x.foo().bar().baz^.do_thing()` vs something like `(move x.foo().bar().baz).do_thing()`
This is very exciting. I saw you mentioned Numba and Cython, but any comments on PyPy? How about interfacing Mojo's LLVM+MLIR with PyPy? Would that make it more compatible with the Python ecosystem, since 90% of the Python ecosystem works on PyPy?
I'm afraid I'm not making any sense, but this is what I've been waiting for for the past 13 years, ever since I first touched Python 2.5.
The broader context of the target audience of this language is very important here — it’s oriented towards the familiarity and needs of a community most comfortable with python