Julia 1.6: what has changed since Julia 1.0?

celrod · on Feb 14, 2021

> (* Technically not all mutable objects live on the heap, because some never live at all, as they are optimized away so are never allocated in the first place.)

The compiler will often stack allocate mutable objects in Julia. This is not the same as "never existed in the first place", because the stack pointer gets incremented and the underlying data layout you load from is the same as that of the mutable object you allocated.

Here is one example where that's very obviously what's happening: https://discourse.julialang.org/t/why-is-svector-faster-than... But it can happen now generally with mutable structures that can't practically just live in registers.

oxinabox · on Feb 14, 2021

Fixed, to be more limited in the claim.

celrod · on Feb 14, 2021

Cool, and fantastic summary! I enjoyed reading it.

calebwinston · on Feb 15, 2021

If they are optimized away, are their destructors/finalizers still called? I'm really hoping the answer is yes...

StefanKarpinski · on Feb 15, 2021

Yes, that is guaranteed.

klmadfejno · on Feb 14, 2021

> People often complain about the “Time To First Plot” (TTFP) in Julia. I personally have never minded it – by the time I am plotting something, I have done minutes of thinking so 20 seconds of compilation is nothing.

This amuses me. I hadn't really considered the author's perspective, and now I think it aligns with my take on it pretty well.

vanderZwan · on Feb 14, 2021

If that's what you're used to it's fine probably, you'll often have no choice but to adjust your workflow to it and make it work. But if you've ever experienced an environment that instantly shows you results and lets you get into a fast iterative loop to explore your data, it can be hard to give that up again.

Source: my current and previous job were basically data viz programming jobs which were all about optimizing said iterative loop for scientists. Going from minute-long to sub-second rendering speeds is a game-changer for many.

EDIT: having said that, this kind of reminds me of what the biggest difference between analog and digital photography is for me, namely whether or not you get instant feedback. I do remember from my art school days that in my experience film was a much better option for training the skill of observation and composition than digital, because it forces you to essentially picture the photograph before you take it. However, once you get somewhat decent at that... I'd switch to digital and reap all the benefits it has ;). The same logic might apply to learning how to plot your data.

stjohnswarts · on Feb 14, 2021

I used to think c++/rust was bad. Then I did some FPGA work and it was quite painful waiting for rebuilds :) . I even appreciated c++/rust build times a bit more after that. Waiting an hour before being able to test your code will definitely make you cognizant of dotting your i's and crossing your t's in new code. We had a design that filled almost 80% of the chip and had pretty strict timing (near the limit of the chip) and it took quite a while for the fitter to meet the timing requirements.

socialdemocrat · on Feb 14, 2021

It is literally “first” time, not every time. Thus if your worry is that you cannot do fast iterations in Julia where you plot multiple times, then your worries are misplaced.

Second, third, fourth etc. plots in Julia are fast. Likely faster than any of the competition, as it is running highly optimized native code at that point.
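A minimal sketch of the first-call effect described above, using only Base Julia. The function name `sum_of_squares` is made up for illustration, and absolute timings will vary by machine and Julia version:

```julia
# Hypothetical function, chosen only so there is something to compile.
sum_of_squares(xs) = sum(abs2, xs)

data = rand(1_000)

# The first call with a new argument type includes JIT compilation time.
first_call = @elapsed sum_of_squares(data)

# Later calls with the same argument types reuse the compiled native code.
second_call = @elapsed sum_of_squares(data)

println("first call:  ", first_call, " s")
println("second call: ", second_call, " s")
```

The same pattern applies to plotting, only with far more code to compile on the first call.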

cbkeller · on Feb 14, 2021

Totally fair, though note that these slow times are only for _first_ plot - I leave my repl open all day, and all but the first plot are sub-second

vanderZwan · on Feb 15, 2021

Ah, then I suppose it's just a good excuse to go fetch another cup of coffee or tea. I can imagine it being a bit irritating to some users though.

linspace · on Feb 14, 2021

It is nevertheless important because a simple plot is a nice test if you are considering using a language for scientific computing. Criticism about the quality of plotting libraries is even more valid. I cannot believe that plotting a scatter plot of a few million points is so slow while AAA videogames render millions of pixels in real time.

On the long term? I think Julia is a better language than Python, Matlab or R for several reasons like modularity, package management and performance. But these are things that require at least 10 hours of use (to say something) instead of 10 minutes. With so many languages promising enlightenment and transcendence to a next power level you cannot expect people to make that kind of investment.

eigenspace · on Feb 14, 2021

Makie.jl [1] does its plotting on the GPU (like a video game engine), so it can handle millions of datapoints just fine.

Also, note that for people to whom plotting is really important, it's quite easy nowadays to just AOT compile your plotting library to your sysimage with PackageCompiler.jl [2] for instant plots.

[1] https://github.com/JuliaPlots/Makie.jl

[2] https://github.com/JuliaLang/PackageCompiler.jl

The_rationalist · on Feb 15, 2021

What does Julia bring over Python Poetry package manager ? https://github.com/python-poetry/poetry What about modularity, concretely?

krastanov · on Feb 15, 2021

Julia's model is based around multimethods, which enables pretty crazy levels of interoperability (imagine being able to use scipy and tensorflow together, with tensorflow's autograd just magically working on most of scipy's functions). More about it here https://m.youtube.com/watch?v=kc9HwsxE1OY

linspace · on Feb 15, 2021

I don't use poetry but I will risk a comment. You will have to judge:

At the language level Julia separates files from modules. Modules are just a language construct, another object. They can span several files, you can have several of them in the same file, or you can even declare them in the REPL. This last point is important for me since I can (re)evaluate code in the REPL without polluting it. Usually Python feels more interactive but this is a point where Julia wins at the REPL.

Julia's package manager is just another library and you have support integrated for it inside the REPL.

I'm not devops, so I don't think I can judge the relative merits of Julia's package manager design relative to Python's, but for me "it feels" better. You will have to judge for yourself by reading the docs and these things about federated package management.

I have not tried it, but Julia allows you to build a sysimage with all dependencies included. I know there are similar things to build standalone Python apps but last time I tried (long ago) it was a pain.

Finally, Python depends a lot on external C libraries to achieve performance. This obviously complicates deployment and is a reason why I use Ubuntu: I have binary wheels for almost everything. Julia provides performance without external tools and packages are usually pure Julia code.

I think Julia is vastly superior concerning packaging but of course for 99.9% of people this is not enough to make a switch yet (me included, for the moment I use it for hobby projects).

Regarding modularity, it is amazing, and not obvious at first, how Julia's performance increases modularity. The reason is that in Python, since you need to use C/C++ for performance, your data structures need to be shaped appropriately when they cross this interface. This rigidity propagates through your program and makes you build big frameworks. I have in mind for example PyTorch or Tensorflow. So you have Numpy arrays, PyTorch tensors... you have of course almost transparent conversion between them since numpy is a standard. But all of this is achieved because a behemoth like Facebook or Google is behind it, injecting money and manpower. It's pyramid building: some engineering but a lot of work. Even then you are stuck with arrays and contorting your program to vectorized operations.

There are of course some dark spots for Julia but I have the feeling they will be solved. I don't consider myself a fan boy or an early adopter but I think it has a future. So I thought for Python 20 years ago :)

oblio · on Feb 14, 2021

That won't stop these people. It's the same discussion as the interpreted languages vs compiled languages or the editor flame wars.

On one hand you have people that say "thinking takes a lot longer than waiting a bit for compilation or actually editing source code" (I'm in this camp) and people that go "I don't want to wait for compilation and I want my editing to be hyper-efficient even if I have to invest hundreds and thousands of hours into it, so that I'm always in the zone/in the flow".

People are just different but every camp thinks They're Right and The Others Are Dumb and Stupid And Dangerous.

My personal guess is that besides the split in personalities/workflows, there's also a difference in projects. People who work on existing projects tend to read a ton more code/docs/team comms/architectural diagrams and edit/compile less so they care less about these issues. People who constantly create tons of mini-projects with short lifecycles care more about them.

cmeacham98 · on Feb 14, 2021

You seem to be falling into the same trap you lament in your comment:

- The other side is "these people" that "won't stop"

- They're striving for "hyper-efficiency" at the cost of "hundreds and thousands of hours"

- People who complain about this issue don't (or do significantly less of) reading documentation/code/etc

People talk about TTFP because it is a real issue that is off-putting for many programmers that would otherwise love to use Julia. Julia is roughly an order of magnitude slower than python in this instance on my computer, and that's not a good first impression.

That doesn't mean that everyone is going to be impacted by this issue (obvious ex: you aren't), but this isn't akin to a flamewar because unlike editor choice (which is opinion), Julia would be better for everyone if TTFP was improved. Whether or not it should be prioritized as a development goal is up to the Julia team, but it's not just some difference of opinion like interpreted vs compiled or functional vs OO.

DNF2 · on Feb 14, 2021

I cannot fathom how ttfp is important. It's time to first plot, not to every plot. After it's down to ~10 seconds, why on earth does anyone care?

enriquto · on Feb 14, 2021

> I cannot fathom how ttfp is important.

I personally do care for my concrete usage pattern.

This is my use case: A long shell script that does a lot of things. At some point, inside a loop that runs hundreds of times, it needs to solve a couple of small linear systems and plot a simple graph. There's hundreds of png graphs, that are then combined into a video sequence. Right now, the computation is done by calling octave (inside the loop) and then gnuplot (to create the actual graph from the octave computed data points). I would like to replace each call to octave+gnuplot to a single call to julia. Yet, this would make my script run in a few hours instead of a few seconds, because for this usage pattern all plots are first plots

Before you suggest that I should rewrite the whole thing in julia, maybe you are right but

1) it would take me a few weeks that I don't have

2) that's not my point. A good tool is a tool that can be used for purposes that it was not intended to, like this. If the time to first plot in julia was a millisecond instead of 10 seconds, then julia would be a much better tool.

stillyslalom · on Feb 14, 2021

You may want to look at [DaemonMode.jl](https://github.com/dmolina/DaemonMode.jl), which spins up a persistent Julia process in the background so you don't have to pay the TTFP penalty more than once, even when shell-scripting.

jarvist · on Feb 14, 2021

As you already have the gnuplot commands, perhaps using Julia + https://github.com/gcalderone/Gnuplot.jl might work? It's a very lightweight package, so even using it directly should be OK.

I do all my plotting with Gnuplot.jl, as gnuplot is fast, and I can save .gpt files which reproduce the plots for later reference and for making publication-quality figures.

DNF2 · on Feb 14, 2021

But this is really the worst conceivable use case for Julia. Why are you interested in making this change, when Julia offers no advantage in this scenario?

enriquto · on Feb 14, 2021

> Why are you interested in making this change, when Julia offers no advantage in this scenario?

Well I love julia the language. It's the interpreter quirks that I find annoying. If julia had something lean and superfast like luajit it would be incredible!

The_rationalist · on Feb 15, 2021

If only they could switch to graalvm :m

whateverwhynot · on Feb 14, 2021

I'm not the parent, but I faced exactly the same issue when I tried to switch to Julia from scilab. As for why I'd like to switch: Julia is a better language, it is way easier to run your script on a cluster and I've been bitten more than once by the two language problem... So Julia seems better for everything but for one of my most common use cases...

aliceryhl · on Feb 14, 2021

It would be a perfectly fine use-case for Julia if it wasn't so damn slow.

stjohnswarts · on Feb 14, 2021

I still have a soft spot for octave. It was my first matrix oriented program. I've moved on to pandas and c++ but I still pull up octave for signal analysis because it's just plain easier and has some great functions and I think it may have created folds in my brain as an EE undergrad that will always be there :)

cbkeller · on Feb 14, 2021

It’s not trivial to do, but compiling plots into your sysimg might actually fix that

enriquto · on Feb 14, 2021

Thanks! Is there a way to compile all installed packages into the sysimage? If so, why isn't that the default?

cbkeller · on Feb 14, 2021

Technically yes, but then IIRC you’d have to recompile the whole sysimg any time you update any single package, which could get to be a pain. The usual compromise seems to be to include just a handful of your most used packages in the sysimg (say Plots + Revise)

enriquto · on Feb 14, 2021

> you’d have to recompile the whole sysimg any time you update any single package

Are there any downsides to this? You never care how long does a system update take. You always care how long do your programs run.

Sukera · on Feb 14, 2021

The downside is that PackageCompiler (or SnoopCompile, for that matter) can only pack into the sysimage what it sees. This is usually done by "recording" which methods get called during a session and putting the compiled native code for those into the sysimage. If it's not called during setup, you'll still have dynamic compilation (as for any code that hasn't ever run).

cbkeller · on Feb 14, 2021

I’ve never tried it myself, but can’t think of any reason why it wouldn’t work in principle if you do want to go that route

sgt101 · on Feb 14, 2021

Could you cache the requests or write the parameters into a file and then make the call to Julia, and ask it to loop over the requests?

saiojd · on Feb 14, 2021

You can't possibly be serious. As things stand, even just redefining a struct field requires restarting the interpreter, which forces you to wait ~10-30 seconds for everything to compile, over and over again. Maybe you only use Julia for small scripts, and can afford to never restart the interpreter?

fishmaster · on Feb 14, 2021

> even just redefining a struct field requires restarting the interpreter

It really doesn't if you use a module.

saiojd · on Feb 14, 2021

I mean, I still need to recompile the module no? Or do you split your project into several small modules to avoid this? (I've avoided this so far as I found the module system a bit tedious tbh, as it requires to explicitly list everything you need to export, as well as manually import stuff from the parent module).

fishmaster · on Feb 15, 2021

Usually as long as I develop I have structs that I change inside a small module, then I can recompile just the module and access the new struct, e.g. for neural networks.

It's not perfect but it's something.
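A minimal sketch of that module trick, assuming a throwaway module evaluated at top level (in the REPL or a script). The names `Sandbox` and `Layer` are made up for illustration:

```julia
# First version of the sandbox module.
module Sandbox
struct Layer
    weights::Vector{Float64}
end
end

# Re-evaluating the module block replaces the module wholesale, so the
# struct can gain a field without restarting Julia. Julia prints a
# "replacing module" warning when this happens.
module Sandbox
struct Layer
    weights::Vector{Float64}
    bias::Float64
end
end

layer = Sandbox.Layer([0.1, 0.2], 0.5)
println(fieldnames(Sandbox.Layer))
```

Values created from the old `Sandbox.Layer` keep their old type, so anything built before the re-evaluation has to be rebuilt.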

saiojd · on Feb 17, 2021

Useful trick, I'll try that, thanks.

the-smug-one · on Feb 14, 2021

Dang, why can Common Lisp do this so well and Julia so poorly?

eigenspace · on Feb 14, 2021

The assumptions our compiler is able to make about types never changing (and a few other restrictions) allows for a lot of optimizations in julia that are not possible in Common Lisp. Julia's compiler is able to do a few things significantly more aggressively than CL.

That said, there is some very interesting work happening on getting around these restrictions. There was a PR from Tim Holy a while ago that could have allowed it, but there were some problems with the PR, and there were also some associated costs that were deemed too steep to pay.

That said, there's other great work on other ways around this. For instance, you can dynamically redefine structs all you want in Pluto.jl notebooks and there's no performance penalty!

This is something that kinda just fell out as a natural consequence of its reactive design.

whateverwhynot · on Feb 14, 2021

Well, with Julia it is time to first plot for every new session. So if your typical workflow is to start Julia and just do a single plot of the recently acquired data (quite common workflow in my field), Julia is an order of magnitude slower than python or scilab.

cbkeller · on Feb 14, 2021

If that’s your workflow it would probably be highly advisable to compile plots.jl into your sysimg

walshemj · on Feb 14, 2021

Me too. Not to go all Four Yorkshiremen here, but I can remember when you would have to write your own plotting programs in Fortran and then wait for the pen plotter to slowly draw your plot.

And not that long before that you would print the results out and manually plot them on graph paper.

dnautics · on Feb 14, 2021

> People are just different but every camp thinks They're Right and The Others Are Dumb and Stupid And Dangerous.

That's not quite right, I think. People are looking for excuses to not use Julia (or new technology X) because it serves as confirmation bias that their choice of <blub language> is still good and there is no need to start thinking that their extensive training and investment in blub is sunk.

detaro · on Feb 15, 2021

You really think "Julia is always superior and people just look for excuses" is the more likely option?

dnautics · on Feb 15, 2021

no, just for certain things. I don't use julia on a day to day, it is not the right tool for my job (CRUD apps).

jampekka · on Feb 14, 2021

I've tried it dozens of times. I tried it this week and gave up again after several hours.

As a language and technology it's way better than the alternatives, but usability of Julia as a programming language is broken by the ridiculous startup latency. I'm sure it's not even really hard to fix (at least by some caching hacks), but for some reason the Julia community is actively resisting such fixes.

And don't give me REPL. REPL is a fundamentally broken approach to programming, and REPL people just keep looking for excuses to keep using it.

Edit: And if you give me REPL, I can answer that I have tried it too. It's broken as well. Revise.jl breaks constantly with anything non-trivial. And with the effort going to horrible hacks like Revise.jl, I'm sure a simple caching of compilation results between calls would be nothing. It seems to be something ideological.

cbkeller · on Feb 14, 2021

I’m always amused to see these “I’m sure caching wouldn’t even be a hard fix, the community must be resisting it” takes.

Feel free to show me the PRs that have been rejected that would have solved the problem. Or if you think it’s easy, feel free to make that PR yourself.

As the quote goes, “There are two hard problems in programming: cache invalidation, naming things, and off-by-one errors”. This is cache invalidation.

jampekka · on Feb 14, 2021

I will probably give it a go when I have some time to spare. I don't expect to get a PR through, or even want one. Perhaps I'll release it as a package if it works at all.

Cache invalidation is not always that hard. For example pure functions are more or less trivial to cache, and you don't even have to do any explicit invalidation. Perhaps do some LRU type pruning if the disk starts to fill up.

I know next to nothing about Julia's internals, but given packages like Revise.jl, PackageCompiler.jl and SnoopCompile.jl are even possible, I don't think it can be that hard. Dumb caching should be a lot easier than any of these.

I may well be wrong, and that nobody has done this yet is a hint to me being wrong. But I think another scenario may be that Julia ecosystem is so hung up on REPLs and notebooks that this case just gets no attention. And very few non REPL-or-notebook people hang around long enough to get to know the internals at all. Maybe I'm just desperate enough?

There's also another possibility, which may sound bizarre but I think is possible. At some level people who come from scripting language background think that long compile times is a sign of a "real language". This is somewhat prevalent in e.g. Javascript scene, where more and more byzantine compilation systems are introduced for a language (or platform) that works just fine without compilation (or can do very fast on-the-fly "AOT" if needed).

socialdemocrat · on Feb 14, 2021

People have different style and preferences. I have programmed for over 30 years and I find that REPL based development in Julia beats anything else I have tried in term of productivity, and I have tried a ton of tools, IDEs and languages.

But sure it may not fit your particular preference or it may be that you have simply not learned to use it effectively. It takes some time to work effectively in a REPL style. It took me some years.

Not every language is suited for REPL development. Julia, LISP and Haskell seem quite well suited.

I don’t have quite the same good experience with Python e.g.

jampekka · on Feb 14, 2021

It probably depends also on what you program. If the task is simple enough and doesn't need much revisiting, REPL is probably fine. But OTOH, just writing the code for a simple case and running it isn't too bad either.

How do you persist and document your code with REPL-development? Do you log the commands to some separate file? How do you recreate the REPL state if it crashes or you have to reboot? How do you make sure the REPLs state is what you think it is?

These are (some of) the concrete problems that I see with REPL, and that don't exist for program based workflow. And I think these are fundamentally impossible to solve for REPL, and very important for e.g. reproducibility (and IMHO sanity).

_coveredInBees · on Feb 15, 2021

I just don't think you've tried or been exposed to responsible REPL prototyping and development. Like sure, you could have all sorts of issues if you use REPL irresponsibly and pollute your state with undocumented code and side effects. But if you use it responsibly, it can be really fast and efficient when prototyping things... Even complex subsystems in a larger codebase. I've prototyped entire reimplementations of SOTA object-detection DNN algorithms in Pytorch with no issues and have been far more productive because I can develop functionality in small chunks while intimately understanding the problem at hand and associated data and then immediately graduate it to a working script/module/package as I go along.

For more complex things, I can even have a stub for a function I am developing and breakpoint into it and then I prototype the functionality in the REPL and that is wayyy faster and less bug prone than trying to go at it blindly in your IDE without being able to experiment and verify your code as you develop it.

jampekka · on Feb 15, 2021

I probably haven't. But I have seen quite a bit of irresponsible REPL development.

What I don't see is the benefit of the REPL. I effectively use my editor and the shell as a "REPL", but instead prefer to have the code in a program structure all the time. This means I don't have to have extra discipline for not accidentally polluting the state. Plus I can use version control, which means I can quite easily try out quite deep changes and still revert back to any state I had before. This is difficult with REPL. And with this workflow I don't have to do any extra "graduating" step; the code usually cleans up during the process.

A big additional benefit is that I can use the shell. I can e.g. pipe stuff from other programs, and I make a CLI for the program on the fly.

The main problem perhaps is when there are some longer computations, as there often are in analyses. For these I prefer memoization to the disk. The usual way to do this across REPL-sessions is to write ad-hoc temporary result files. This gets hairy really fast when you have to update these when the codepath before the dumps change.
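A minimal sketch of disk memoization for a pure function, using only Julia's standard libraries. The names `disk_memoize`, `slow_square` and `cache_dir` are hypothetical. The cache key covers only a caller-supplied name and the arguments, so a change in the function body is not detected and the name would have to be bumped by hand:

```julia
using Serialization  # stdlib: serialize / deserialize
using SHA            # stdlib: sha256

# Cache the result of a pure function on disk, keyed by a hash of a
# caller-supplied name plus the printed form of the arguments.
function disk_memoize(f, name::AbstractString, args...; cache_dir::AbstractString)
    key = bytes2hex(sha256(repr((name, args))))
    path = joinpath(cache_dir, key * ".jls")
    isfile(path) && return deserialize(path)  # cache hit: skip the call
    result = f(args...)
    mkpath(cache_dir)
    serialize(path, result)
    return result
end

# Stand-in for an expensive computation; counts how often it really runs.
calls = Ref(0)
slow_square(x) = (calls[] += 1; x^2)

dir = mktempdir()
a = disk_memoize(slow_square, "slow_square", 12; cache_dir = dir)
b = disk_memoize(slow_square, "slow_square", 12; cache_dir = dir)
println((a, b, calls[]))
```

Pointing `cache_dir` at a fixed directory instead of a temporary one would let the cached results survive across separate runs of a script.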

I don't see much benefits of REPL over my workflow. Maybe some completions are nicer in REPL and you may get a bit nicer formatted output, but these are quite trivial. Perhaps people who are not accustomed to shell think that REPL is the only way to "rapidly iterate"?

oxinabox · on Feb 15, 2021

Generally I am copy-pasting code out of a WIP package to test out an idea as I go. So the code is already being written in a file. Or I am using Juno/VS-Code/Vim-Slime to simplify the copy-paste.

Other times I have the package I am editing loaded with Revise.jl active and I am calling in and trying out the methods I am concurrently writing in my text editor.

It's more like TDD than anything else. It's got that same quick back and forth of run, write, run, write. But a bit more interactive. (Note I am not saying that is TDD -- it isn't -- tests are not necessarily written or saved. Though I do often use this while doing TDD to run a test I have written)

krastanov · on Feb 15, 2021

I think you are being unfair. Just look at the various Julia forums: addressing startup latency has been the main focus since 1.3 and each of the versions since then (especially the 1.6 beta) has had significant improvements. Same with the REPL comments: you can easily just write scripts instead of using Revise/Pluto/IJulia or another REPL approach.

jampekka · on Feb 15, 2021

> Same with the REPL comments: you can easily just write scripts instead of using Revise/Pluto/IJulia or another REPL approach.

The problem is that I can't. The startup latency makes using scripts practically impossible. I think it could be relatively easy to fix (hack). I'll shut up about this on the very second there's some way to get same magnitude of latency with scripts as there's with REPL.

oxinabox · on Feb 15, 2021

Well then, DaemonMode.jl as a hack to run scripts as fast as in REPL by leaving a julia Daemon running and just sending work to it

https://github.com/dmolina/DaemonMode.jl

gugagore · on Feb 14, 2021

What do you prefer as an alternative to a REPL? How do you interact with your programs?

What seems broken to me about REPLs is how text-centric they usually are. But I want to be able to easily introspect and play with my programs, and REPLs are one way to do that. Really good debuggers and environments for static languages are "another" way.

jampekka · on Feb 14, 2021

I just make programs that have an "entry point" (e.g. main() in C or if __name__ == "__main__" in python) and run them off the shell. Oftentimes with some arguments. This is how I interact with most of my programs, and it works fine for my own ones too.

I agree that this is probably not for everybody, i.e. if you're not used to the CLI workflow. And admittedly it would be sometimes nice to have e.g. embedded graphics, but unfortunately the troubles usually outweigh the benefits (looking here at emulating damn 70's terminals too...).

I mostly do the introspection with print, dir and help straight in the code. Not ideal, but rarely fails you, and I've yet to find a debugger GUI or IDE that isn't more trouble than it's worth.

Something like autoupdating RMarkdown/Sweave/Pweave/etc would probably work often as well. I sometimes do use Pweave, although it tends to be a bit buggy as well, and doesn't have any caching logic (although it's easy to use your own).

Sadly most efforts seem to go to Jupyter notebooks and such, whose state/code inconsistency are simply a non-starter if one wants to keep some sanity.

gugagore · on Feb 14, 2021

> I mostly do the introspection with print, dir and help straight in the code.

This means you have to change your code in order to debug it. And you have to know what you're debugging before you change your code. This is really, in my opinion, much less than ideal, because you have to iteratively instrument your code while you figure out what is wrong. It's a cycle: you print out the first suspect thing, then that produces 5 potential suspects, and you have to decide which one to print next, or print all of them.

You're absolutely right that it rarely fails you, and so I surely want that facility to be at my finger tips. But getting a text representation of a value is literally 33% of what a REPL is for.

jampekka · on Feb 14, 2021

If I'm debugging something I'm writing myself it's usually quite obvious where the problem probably is. With other people's code it takes a bit of digging, but is usually found without a debugger. Just print enough stuff, with interpreted languages this is usually really fast. Something like Javascript's console.log is actually really nice for this (and browsers have hands down best debugging tools anyway).

I'm not sure how REPL helps you out of this. You still have to somehow change the state of the program, but if you "monkeypatch" it using REPL, you now have to keep in your head what the state is.

In the end code is just description of how to bring the program to some state. I like to have that description on file so I don't have to keep it in my head.

Sukera · on Feb 14, 2021

If you don't mind me asking, how are you using julia? Are you using modules to wrap your code and functions to structure it, or are you just running one big script in top level scope?

Aside from compiler improvements, most caching related optimizations are happening on the module level, because that's where namespaces are separated.

jampekka · on Feb 14, 2021

I'm not using modules. I usually start with one file with a demo or similarly named function that is called if the file is called as an entry point (like if __name__ == '__main__', except Julia makes it even worse). First the "actual" code is in separate functions in that file. No global state.

I tend to refactor code out of there to separate files, and then somehow import it. An ugly way is include, and I've tried Revise.jl with includet.

But I think the least ugly approach is the @from macro from here: https://github.com/Roger-luo/FromFile.jl Judging from some opinion in bug trackers, this is probably gonna get totally shunned by core devs and they'll keep on bikeshedding about the import stuff forever.

With this setup I have about 400 lines of code in three files. It compiles for 15 seconds. After every single change, and actually without any changes too.

I think performance wise this should be equivalent to using modules, but saving some pointless ceremony.

Sukera · on Feb 14, 2021

I'm using `!isinteractive() && main()` as my `if __name__ == '__main__'` equivalent, not sure how that's even worse?

It's not equivalent, no - include doesn't introduce a namespace and neither does includet. Compiled stuff from packages (=modules with a Project.toml) is cached between runs, scripts just don't have that luxury of separation. @from doesn't look into the files you're including and (somewhat simplified) verbatim pastes the code into your "main" file.

I don't think it's a lot of "pointless ceremony", especially since it keeps dependency management on a per project basis easy, is just a `]generate MyPkg` away and allows compiled code to be cached

If you don't want to use projects, that's fine - but please do so in a constructive manner and don't be surprised that the most common workflow (wrapping things in a package) gets more attention sooner. That just signals some disregard for other people's needs & wants, even if that's not intended.

jampekka · on Feb 14, 2021

`!isinteractive() && main()` probably doesn't work for my case. Oftentimes my files may have a main of their own, but that isn't called if the file is just imported. I don't use Julia interactively anyway (I've explained why many times in this thread). (Edit: the equivalent Julia chant is `abspath(PROGRAM_FILE) == @__FILE__`, IMHO even slightly more obtuse than Python's, but this is a minor detail).
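Spelled out as a file, the guard mentioned in the edit looks like this. A minimal sketch; the function name `main` and its body are placeholders, not anything Julia requires.

```julia
# Hypothetical entry point; `main` is only a naming convention.
function main()
    println("running as a script")
    return 0
end

# True only when this file is the one passed to `julia` on the command line,
# so `include`-ing the file from elsewhere does not trigger `main()`.
if abspath(PROGRAM_FILE) == @__FILE__
    main()
end
```

Unlike `!isinteractive() && main()`, this stays silent when the file is pulled in by another script, which is the case described above.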

If a "package" is used only by me and only from files controlled by me, Project.toml is clearly pointless ceremony. And `]generate MyPkg` too, and assumes REPL on top. Python manages this (albeit with some stupid arbitrary restrictions) fine, Node manages this fine. The compiler doesn't need that stuff for anything.

I didn't look into the implementation of @from, but I picked it up from a huge bikeshedding bug [1] (still open, from 2013...) about local module imports, and assumed it's doing imports instead of including, as it also has a separate namespace. From the code [2] it's not clear to me exactly what it does when, but one branch seems to be generating a module with the code imported on the fly. Not sure this should be any different than any other module for the compiler. Are you sure you're not talking out of your ass on this one?

I don't care if people for some reason want to write their pointless ceremony, but what I don't understand is that people are so jealous of it that they insist on pushing it on everybody else too. I just want to somehow get access to those symbols defined in another file, why does this need more than the path of the file? I'm sure using just files-as-modules would probably be less work for the compiler, and it's easy to have a byzantine package ceremony on top if you want (Python has a dozen or so available, so lots to draw from).

[1] https://github.com/JuliaLang/julia/issues/4600 [2] https://github.com/Roger-luo/FromFile.jl/blob/master/src/Fro...

simias · on Feb 14, 2021

I don't know how Julia fares but personally what bothers me with long compile times is when I can't context switch, waiting for the compiler's output.

What I mean in practice is that if you take Rust for instance, the compile times can be fairly long but the type checking occurs early on and is quite fast. Therefore once I know that this step succeeded I can usually let the compilation continue in the background while I focus my attention elsewhere.

If, on the other hand, I need to wait a lot longer to confirm that my code is actually valid, I find myself just staring at the output window, not willing to let go of my short term memory until I get a confirmation that my code was accepted.

The problem is not adding 20s to your overall dev time, it's to have a 20s interruption while you're "in the zone".

DNF2 · on Feb 14, 2021

If it were 20 seconds for every plot, this would be a major problem for me, as I tend to make lots of plots. But it's only the first plot where this is an issue. Surely you're not in 'the zone' that soon?

Seems to me like people are making a mountain out of a molehill.

jampekka · on Feb 14, 2021

If you don't use REPL (and for many good reasons you shouldn't) or some other such horror, every plot is the first plot. And it's pain. And not just plots really. Doing anything in Julia is pain if you try to use it as a programming language instead of an app for buggy, unreproducible and misunderstood ad-hoc analyses.

Seeing these answers makes me think Julia will never be fixed. I forecast Julia will be back in a niche within five years if they don't get their act together. And it's sad, because the alternatives are fundamentally broken. Julia isn't fundamentally broken, but the devs and the community seem to insist on superficial breakage.

eigenspace · on Feb 14, 2021

> (and for many good reasons you shouldn't)

Could you elaborate on this? I'd say repl based interactive programming is one of julia's greatest strengths, and avoiding the repl is probably setting yourself up for pain.

That said, if you do find yourself running lots of scripts and paying this penalty all the time, I'd suggest https://github.com/dmolina/DaemonMode.jl as a great way around these pains.

jampekka · on Feb 14, 2021

> Could you elaborate on this? I'd say repl based interactive programming is one of julia's greatest strengths, and avoiding the repl is probably setting yourself up for pain.

With REPL you have an invisible global state, can't reproduce what you have done, changes earlier in code path don't propagate to results, you don't have documentation of what you did.

It's for me really like trying to write a book by dictating. Except that you're dictating to somebody who's gonna give an independent summary of it to a third party and never gonna write down what you dictated. It boggles my mind how people can work like this, but they probably get hooked to REPL from the first tutorials and just don't know better.

I'll look into DaemonMode.jl. Not a fan of using a daemon (and I'm guessing there will be problems with e.g. interactive plots), but in the short term I'll take anything that could make Julia programming tolerable.

eigenspace · on Feb 14, 2021

> With REPL you have an invisible global state, can't reproduce what you have done, changes earlier in code path don't propagate to results, you don't have documentation of what you did.

Mhm, that's fair. I think Pluto.jl has a really neat approach to this, using reactivity (and technically even more state) to actually eliminate that experienced state.

If I could use it from emacs it might even be my go-to way to interact with julia, but I also don't mind the statefulness and find it manageable.

For me, the most important thing is that when I'm writing serious code, I create a local package. Then, in the REPL I load that package and have Revise.jl active so that it can watch the package source code and constantly do hot code reloading for me so that I'm never stuck with old versions of code running.

Then I do all my interactive analysis in the REPL, and plumbing in the package module. This eliminates a lot of statefulness, but keeps restarts to a minimum.

jampekka · on Feb 14, 2021

I briefly looked at Pluto.jl, and I think it's probably a good way. As I understood it it's a "notebook" that always runs the whole file. Like e.g. RMarkdown or sweave. All good. The state is fine too if it's explicit. But I'm fine with just CLI and print and occasional plot, which should be a lot simpler use case for development.

I create a "local package", meaning a file from which I relatively import. During development/analysis it's hard to foresee what the package structure is gonna be, so it's quite pointless to go through the whole packaging ceremony at this point. FromFile works fine for this.

As a temporary hack I could use REPL to call my "main" function and let Revise.jl update automatically. (In long term this is bad for interoperability with rest of the system). But in my experience Revise.jl tends to break a lot. Julia breakage is hard to analyze by itself, and Revise.jl often makes this more or less impossible.

I have to repeat that I really don't see how caching of the compilation results is even close to the complications that Revise.jl or Pluto.jl have to do.

eigenspace · on Feb 14, 2021

> I briefly looked at Pluto.jl, and I think it's probably a good way. As I understood it it's a "notebook" that always runs the whole file.

Not quite. It builds a dependency graph of your code and can figure out what definitions depend on others. So depending on what you change, maybe only one or two cells need to be rerun. Or in other circumstances, the whole notebook will have to re-run. It just depends on what changes.
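To illustrate the idea (this is a toy sketch, not Pluto's actual implementation; the `Cell` type and `cells_to_rerun` are made up for the example): if each cell records which names it defines and which it reads, the set of cells to re-run after an edit is everything downstream of the edited cell.

```julia
# Each hypothetical cell records which names it defines and which it reads.
struct Cell
    id::Int
    defines::Set{Symbol}
    uses::Set{Symbol}
end

# Return the ids of all cells that must re-run when cell `changed` is edited:
# the cell itself plus everything that (transitively) reads what it defines.
function cells_to_rerun(cells::Vector{Cell}, changed::Int)
    stale = Set([changed])
    grew = true
    while grew
        grew = false
        for c in cells
            c.id in stale && continue
            for s in cells
                # c becomes stale if it reads a name defined by a stale cell
                if s.id in stale && !isempty(intersect(c.uses, s.defines))
                    push!(stale, c.id)
                    grew = true
                    break
                end
            end
        end
    end
    return sort!(collect(stale))
end

cells = [
    Cell(1, Set([:data]),   Set{Symbol}()),
    Cell(2, Set([:scaled]), Set([:data])),
    Cell(3, Set([:label]),  Set{Symbol}()),
    Cell(4, Set([:figure]), Set([:scaled, :label])),
]

println(cells_to_rerun(cells, 3))  # prints [3, 4]
println(cells_to_rerun(cells, 1))  # prints [1, 2, 4]
```

Editing the label cell only re-runs the figure, while editing the data cell re-runs everything that depends on it.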

> I have to repeat that I really don't see how caching of the compilation results is even close to the complications that Revise.jl or Pluto.jl have to do.

I think the main trouble with the caching is that the native code you cache can depend very strongly on the exact combination of packages you have loaded. This means you can hit a combinatorial explosion of different methods to cache pretty quickly, so you'd need to find a very clever way to find the right methods to keep and which ones to delete once the cache gets too big.

I think there are also other potential issues that I understand less. This is being actively worked on though.

jampekka · on Feb 14, 2021

> Not quite. It builds a dependency graph of your code and can figure out what definitions depend on others. So depending on what you change, maybe only one or two cells need to be rerun. Or in other circumstances, the whole notebook will have to re-run. It just depends on what changes.

But the effect is still that any changes up-file will be always reflected down-file? If so, I don't care how it's implemented (given it's fast enough and doesn't break), the semantics is the point.

> I think the main trouble with the caching is that the native code you cache can depend very strongly on the exact combination of packages you have loaded. This means you can hit a combinatorial explosion of different methods to cache pretty quickly, so you'd need to find a very clever way to find the right methods to keep and which ones to delete once the cache gets too big.

Yes, I think this is a problem for a clean solution. But for a big fat ugly hack that isn't too picky on wasting disk space or occasionally recompiling stuff needlessly it's probably less so.

For a lot of cases very rough invalidation would probably suffice. E.g. invalidate all definitions from all files that are changed from the last run (i.e. like Make does). And invalidate all definitions for any name that gets any definition. I'd guess accomplishing this would cut the startup time greatly; the end-user code rarely redefines (at least intentionally) anything that's in the packages, and the vast majority of time is spent (re)compiling the packages themselves.
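The Make-style check itself is cheap to express. A rough sketch, assuming a hypothetical cache that remembers each file's modification time from the previous run (`record_stamps` and `changed_files` are illustrative names; nothing like this exists in Julia's compiler as far as this thread says):

```julia
# Remember the modification time of every source file for the next run.
record_stamps(files) = Dict{String,Float64}(f => mtime(f) for f in files)

# Return the files modified since the stamps were recorded.
# Files without a recorded stamp always count as changed.
function changed_files(files, stamps::Dict{String,Float64})
    return [f for f in files if mtime(f) > get(stamps, f, -Inf)]
end
```

Anything compiled from a file returned by `changed_files` would be thrown away and recompiled; everything else would be loaded from the cache. The hard part is the second rule in the comment above (definitions that affect other files), which this sketch does not address.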

I'm sure there are complications with type inference. But I'd be willing to pepper some explicit typing in my code if it means I don't have to recompile it every time I run it. Binary of a method with concrete types should at least be trivially cacheable (given no library changes between runs).
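For reference, Base already has `precompile`, which compiles one method for one concrete signature on request. A minimal sketch with a made-up function `scale`; note that this only warms up the current session, and persisting the result between runs of a plain script is exactly the missing piece being discussed.

```julia
# A method with fully concrete argument types (hypothetical example function).
scale(x::Float64, k::Float64) = k * x

# Ask the compiler to compile this exact signature now instead of on first call.
ok = precompile(scale, (Float64, Float64))

println(ok)
println(scale(2.0, 3.0))
```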

> This is being actively worked on though.

It's been worked on for as long as I've known of Julia. AFAIK there's still absolutely zero logic on caching compilations of "end-user-stuff" (as opposed to stuff like package precompilation). I don't think this is necessarily due to technical issues, but because the community says that REPL (or notebook) is the only way of using Julia, and those don't suffer from the problem that much (Revise.jl breakage notwithstanding).

Technically it's probably very difficult to do "perfectly", and I'm thinking this is how the compiler devs want to do it. I'm not sure they even mean persisting-between-runs caching when they say "caching" in compiler related discussions. It may well be just some run-time caching of some compilation artefacts that are now compiled multiple times. And that would probably not yield such dramatic performance gains for the re-run case.

For an AOT compiler Julia is clearly fast enough. There are probably no easy tricks left to make it a lot faster. But re-run performance doesn't need faster AOT, it just needs the compiler not to recompile the same identical stuff every time.

eigenspace · on Feb 14, 2021

> But the effect is still that any changes up-file will be always reflected down-file? If so, I don't care how it's implemented (given it's fast enough and doesn't break), the semantics is the point.

Yes, I was just bringing this up because it means that various things can be significantly faster, causing you to experience less latency than you normally would by re-running a whole file.

As to the rest of your post, I agree it'd be interesting to see a more quick and dirty solution. It appears that everyone who has the know-how to do this wants to 'do it right', so on the public facing side there's very little visible progress.

> It's been worked on for as long as I've known of Julia. AFAIK there's still absolutely zero logic on caching compilations of "end-user-stuff" (as opposed to stuff like package precompilation)

This is not really true. E.g. there's PackageCompiler.jl which does sysimage based caching and works quite well (at the expense of slow compilation and large binaries), and briefly there was StaticCompiler.jl which did good small binary compilation but then bitrotted quite fast.

All of our GPU compilation stuff is built using a small binary, static, AOT compiler (currently hosted in GPUCompiler.jl) and it's quite reliable. There's active work being done to make this work on the CPU again (basically a modern version of StaticCompiler.jl). So while I feel your frustration that this has been 'coming soon!' for a long time, progress has been made. The new compiler hooks for version 1.6 are partially designed to make this whole process less hacky and easier to iterate on.

jampekka · on Feb 15, 2021

Nice to hear about the progress. I did read up somewhere that AOT is already possible for GPUs. But I actually like the "just-in-time AOT" for development. For deployment a real AOT would be nice (but for the short term can be even something like precompiled blob with embedded runtime).

It would be huge if Julia could be compiled to shared objects with e.g. C interface. I don't even care if they are bloaty or hacky. Any way of accomplishing this would be an instant boost for using Julia in production. And would go beyond anything even close to Julia's productivity.

I think Julia people may underestimate the potential Julia has as a general purpose language, and overestimate the short term efforts to make it happen. Just add some hacks like AOT caching and any way to call with CFFI and it would go like wildfire.

I understand that most of Julia's community is about crunching data, and that's what I do most of the time too. But with that background it's probably not that clear how dire the situation in more general development is. An expressive, reasonably performant and interoperable language would be revolutionary.

orbifold · on Feb 15, 2021

I and probably many others intend to develop reproducible figures. The way to do that in python / matlab is to have a script which loads data from disk and then produces a figure (a png / pdf). You then execute that file many times each time tweaking one aspect of the figure. Julia makes that workflow almost impossibly slow.

cbkeller · on Feb 15, 2021

Then write a function that does everything your script would do in the clean local scope of that function, and call it many times as needed. I mean heck, that’s a more elegant solution even if script latency wasn’t in the equation.
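As a sketch of what that looks like (the function name, file names and the summary output are all stand-ins; a real version would call a plotting package where the summary is written):

```julia
# Hypothetical stand-in for a figure script: reads one number per line and
# writes a small summary file. Everything lives in the function's local scope.
function make_summary(data_path::AbstractString, out_path::AbstractString)
    values = parse.(Float64, readlines(data_path))
    open(out_path, "w") do io
        println(io, "n = ", length(values))
        println(io, "mean = ", sum(values) / length(values))
    end
    return out_path
end

# Example call with throwaway files; tweak the function and call it again.
mktempdir() do dir
    data = joinpath(dir, "data.txt")
    write(data, "1.0\n2.0\n3.0\n")
    out = make_summary(data, joinpath(dir, "summary.txt"))
    print(read(out, String))
end
```

The same function can be called from a long-lived session or from a script entry point, so it does not by itself settle the REPL versus shell question.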

jampekka · on Feb 15, 2021

I usually do a clean local scope of the function anyway. The problem is the reloading. Revise.jl works sometimes, but sometimes doesn't, and it makes debugging more difficult (this is difficult enough in Julia as is in my experience). Another problem is having to use the REPL that doesn't integrate as nicely with the rest of the OS as shell.

I don't see why REPLing it is a more elegant solution. With that solution I can call the function from the REPL if I want, but also from the shell if I want. With shell I get the elegance of having a persistent, complete and reproducible description of the state all the time, which can be e.g. version controlled.

eigenspace · on Feb 15, 2021

Why not just write a function? You can do this without restarting your julia session every time...

orbifold · on Feb 15, 2021

Well, to be more precise, the script contains functions of course, each of which produces one of the figures and takes the path to the data / data as input. In any case the REPL has no place in such a workflow because I want to be able to check in the script at the end and have a reliable way of reproducing any of the figures I put into the paper later.

eigenspace · on Feb 16, 2021

You could literally just stick your script into a function and have it be just as reproducible (well, probably more reproducible) and never need to restart julia...

dm3 · on Feb 14, 2021

That's right. However, the 20s delay only happens once during the first plot action in the fresh REPL. That's when Julia compiles all of the functions not present in the system image. All of the repeated plotting will not require recompilation of the "base" plotting libraries.

stjohnswarts · on Feb 14, 2021

I've settled on the view that it mostly comes down to the size and complexity of the project as a whole. If you ask me to write small automation scripts that will quickly become obsolete I will choose bash/python every time. If you ask me to make something that has to last a while and requirements won't change a lot and is of a decent size I might write those same scripts/apps in rust or c++ (modern! 17+) for maintainability and disciplined structure. I think a lot of people are in one camp or the other, but I like to be in the middle somewhere, hopefully seeing the advantage of both.

tomrod · on Feb 14, 2021

I come back every so often to check out Julia again. I have hopes for it.

Some questions still on my mind since I last looked:

(1) How is its database connectivity?

(2) Is there something like python's `requests` lib?

(3) Are the features mature enough that I don't anticipate major rewrites for code each year?

Nosferican · on Feb 14, 2021

(1) I have used Julia packages for connecting to PostgreSQL, mongoDB, and SQLite. It has been extremely solid. I still wish for the GIS components to be more feature extensive (e.g., writing multilayer features directly).

(2) The HTTP package (HTTP.jl) is great. You also have HTML and CSS selectors (Gumbo/Cascadia). Some of the best JSON parsers across languages too. I also developed WebDriver.jl (you can use it with Selenium). Diana.jl is a solid GraphQL client/server package. Genie.jl is a comprehensive web framework.

(3) Those packages have been stable for years. Web and databases are quite straightforward. The least matured one would be the web framework which published its current major version last summer.

Those two ecosystems might be the most matured ones in all of Julia and in most programming languages. I have been using them extensively in Julia for several years.

Sukera · on Feb 14, 2021

About (2), I've extensively used HTTP.jl and am pretty happy with it. I don't know about exact differences to e.g. requests, but I've found it sufficient for my uses.

Regarding (3), there's a daily CI job called "PkgEval" (which also runs before a new release is made) checking for regressions of julia vs. all registered packages, seeing if any break. This identifies misuse of internal APIs (which are allowed to break under semver) and actually breaking changes (which are then either undone or the packages are fixed). Additionally, you can [compat] bound julia itself in the Project.toml of your code.

Those two combined should make sure you don't have to rewrite your code. All 1.x versions are backwards compatible after all.

oxinabox · on Feb 14, 2021

> (2) Is there something like python's `requests` lib?

For HTTP (etc) requests?

There is HTTP.jl which I have never had problems with; tons of packages use it. I have used it to wrap a ton of different REST APIs etc.

And in 1.6, as mentioned, there is the new Downloads standard library, based on libcurl. Despite the name, I believe it can be used more generally than simply downloading things. Can also be used as a normal library in Julia 1.3+ https://github.com/JuliaLang/Downloads.jl
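A small sketch of using it for a plain GET into memory rather than to a file. The helper name `fetch_text` and the URL in the comment are made up for illustration; `Downloads.download` accepts an IO object as its output argument.

```julia
using Downloads

# Fetch a URL into memory and return the body as a String.
function fetch_text(url::AbstractString)
    buffer = IOBuffer()
    Downloads.download(url, buffer)
    return String(take!(buffer))
end

# Hypothetical usage; needs network access:
# body = fetch_text("https://example.com")
```

For other HTTP methods and access to status codes and headers, the library also provides `Downloads.request`.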

And there are several other projects

StefanKarpinski · on Feb 15, 2021

It also supports all kinds of HTTP requests and should be quite efficient and solid, being based on libcurl.

leephillips · on Feb 14, 2021

I don’t know about (1) and (2), but since v. 1.0 there has been practically no need to rewrite. I believe there is a commitment to not introducing breaking changes, or at least a strong reluctance to do so.

EDIT after actually looking at the article: the “no breaking changes” commitment is right at the top.

oxinabox · on Feb 14, 2021

> (1) How is its database connectivity?

Databases are not really my thing but: From what I hear: It is ok. Not amazing. But decent.

LibPQ.jl is very mature (we run it in production). MySQL.jl exists, I hear about SQLite.jl being used pretty often.

I know people use ODBC.jl, and JDBC.jl, though only because they complain about things. I suspect there are a fair few people using them without complaint that I never hear from. Though I haven't heard any mention really of JDBC in a while.

While there is nothing like SQLAlchemy, a nice thing about databases in julia is they all conform to Tables.jl tables. So very easy to take your DataFrame library of choice (or CSV reader, or Arrow.jl or a dozen other formats), and use that as the input or output from a database query.

swagonomixxx · on Feb 14, 2021

For (2), it seems like HTTP.jl [0] is the equivalent.

[0]: https://github.com/JuliaWeb/HTTP.jl

enriquto · on Feb 14, 2021

> (3) Are the features mature enough that I don't anticipate major rewrites for code each year?

There is a very good heuristic for that. Write a non-trivial program using the latest version of the language and according to current conventions. Then see how far you can go back into past versions of the language so that your program still runs correctly.

If the oldest version of the language that runs your program is X years old, then you can expect your program to stop running after X years in the future.

DNF2 · on Feb 14, 2021

No, this is incorrect, you are confusing backwards and forwards compatibility. Running your new code on old Julia versions could break immediately, just like in every programming language.

Backwards and forwards compatibility have very different horizons.

Running old code on new Julia versions should not break until the next major version. Packages, on the other hand, are different, and could break your code sooner, but that's the same in any language.

eigenspace · on Feb 14, 2021

Yeah, I just feel compelled to echo you.

What that person said is very wrong.

I guess that in a language without a good version bounding and manifest system like julia, it could be a semi-valid point because you might end up updating your packages and breaking your code that way, but julia has reproducible package environments, so you can get very strong guarantees about backwards compatibility even when you're updating packages.

enriquto · on Feb 14, 2021

Dudes, relax, it's just a heuristic, a rough zeroth-order estimate to measure the pace of evolution of a language.

eigenspace · on Feb 14, 2021

Even to zeroth order, this is a bad way to estimate backwards compatibility. It's a fine way to estimate pace of language evolution, but that wasn't your claim.

For instance, Fortran 2018 has new features that would fail if you tried to use them in Fortran 2015. However, Fortran 2018 is still backwards compatible with Fortran 2015, and indeed is backwards compatible with Fortran 1977.

That is, in this example Fortran maintains 41 years of backwards compatibility, yet only 3 years of forwards compatibility.

The two things are effectively decoupled from each other.

pkphilip · on Feb 14, 2021

There is a pretty decent framework (Genie) which can be used for building web apps, API backends with database connectivity etc. It is also quite well documented.

https://genieframework.github.io/Genie.jl/dev/guides/Working...

the__alchemist · on Feb 14, 2021

> Plotting, it turns out, is basically a really hard thing for a compiler. It is many, many, small methods, most of which are only called once. And unlike most Julia code, it doesn’t actually benefit all that much from Julia’s JIT. Julia’s JIT is normally specializing code, and running a ton of optimizations. But plotting itself isn’t in the hot-loop – optimizing the code takes longer than running it the few dozen times it might be used unoptimized. To make a long-story short, plotting is the poster child example for Julia needing to compile things before it can run them.

This is misleading. There's no reason you need to compile the plotting library every time you load a REPL or program. Python, a "slow" interpreted language handles this quickly, as does Rust, a slow-to-compile Lang - You can compile and run a Rust program that plots more quickly than in Julia, since it doesn't need to compile the plotting lib after it's initially installed.

find · on Feb 14, 2021

You can certainly achieve the Rust plotting solution for Julia by compiling a plotting package with PackageCompiler. I use this for day-to-day research tasks.

However, Julia users are greedy! They want their cake (composability, portability, dynamic language features) and eat it too (performance). Lots of effort has thus been put into the language towards not needing solutions like PackageCompiler.

oxinabox · on Feb 14, 2021

> You don't need to compile the plotting library every time.

You don't need to but Julia does. It's definitely an issue, and I know it is being worked on.

Julia doesn't store compiled binary code between sessions. Unless you compile it into a sysimage.

There are apparently reasons why caching compiled binary like this is complicated in Julia. But once that is solved, wow things are going to be nice.

CyberDildonics · on Feb 14, 2021

> There are apparently reasons why caching compiled binary like this is complicated in Julia. But once that is solved, wow things are going to be nice.

I put a lot of hours into julia five years ago and all the same things were being said. I don't know why the compilation is so slow or why the caching is so bad, but it was the main complaint then and still is. The solutions are all 'just around the corner'. It reminds me of java two decades ago.

Actually most languages that get a lot of use seem to go through this. The big problems for some reason have solutions "just around the corner" but they remain giant problems.

I think what really happens is that people work on what they want. Solving their hard problems is not fun and no one holds anyone's feet to the flames. C++ has had problems with compile time and template errors, but there has been real commercial pressure to making progress on those. Julia's problems are the same as they were half a decade ago. Start working and wait an enormous amount of time for the exact things to compile that you compiled yesterday when you started it up.

lhn · on Feb 14, 2021

I don't think it's fair to say Julia's problems haven't changed at all. While compilation latency is still an ongoing issue, it has consistently and noticeably improved over the years. Package caching is much better, you can save compilation results you depend on in your workflow with PackageCompiler, etc. There are now incredible tools (e.g., SnoopCompile.jl) for package developers to inspect closely where the compiler might have difficulty and fix the issues.

The major source of improvements in 1.6 is eliminating method invalidations. Julia's flexibility makes it vulnerable to invalidating already compiled code as new packages are loaded and new methods are defined. This triggers a disastrous cascade of recompiling a bunch of things, and is the main conceptual reason why nothing lower than type-inferred code is cached. If your method will be recompiled anyway, then what good is it to save the native code in the first place? Now that invalidations can be efficiently diagnosed and patched, there is definitely interest in caching lower levels of code in the compilation process, potentially even native machine code.
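For readers who haven't run into invalidation, a toy illustration (function names are made up; run it at top level, e.g. as a script): once `report` has been compiled against the generic `describe`, defining a more specific method makes that compiled code wrong, so Julia has to throw it away and compile `report` again.

```julia
# A generic method, and a caller that gets compiled against it.
describe(x) = "generic"
report() = describe(1)

first_result = report()       # compiles report() using describe(::Any)

# A more specific method arrives later, e.g. from a newly loaded package.
# The compiled report() is now stale and is invalidated.
describe(x::Int) = "integer"

second_result = report()      # recompiled, now dispatches to describe(::Int)

println(first_result)   # prints generic
println(second_result)  # prints integer
```

In a real session the same thing happens when loading a package adds methods to functions that already-compiled code depends on, which is where the recompilation cascade comes from.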

All of this progress, however, requires labor and care. I'd say the Julia community has put an admirable amount of effort into its latency issue, but there's a limit to how fast you can address these problems through open-source development without backing from major tech companies. Imagine the improvements to the Julia compiler had Google chosen Julia for its S4TF project, for instance.

Certhas · on Feb 14, 2021

I don't think that's fair, simply because things have actually gotten a lot better. This is shown in this blog post and it also is evident to people who use Julia:

https://i.redd.it/ik4uymvb28k51.png

Also, compiling a sysimage used to be an arcane art, and now works reasonably simply/well. It's easy to imagine a future where, with the tooling that is already there and without a magic breakthrough in caching Julia code, we simply get a per project sys-image in VSC that is recompiled when needed.

DNF2 · on Feb 14, 2021

Well, a lot of resources have been put into reducing compilation times, and large improvements have been achieved. It's not just perennially 'around the corner', the improvements are tangible and happening right now.

CoolGuySteve · on Feb 14, 2021

Yeah, I got fed up with Julia's plotting library and wrote a C++ Qt plot function that forks and plots arrays of doubles.

It runs instantly even with millions of points.

gugagore · on Feb 14, 2021

Now what if your points aren't IEEE double-precision floating point numbers? What if they are Unix time represented as 64-bit integers, and you'd like to format the tick labels according to your locale?

That kind of composability is what keeps me interested in Julia, so I hope that the community keeps improving the overall UX of the language.

CoolGuySteve · on Feb 15, 2021

It’s QCustomPlot so if I cared I could easily support those things. Julia does not have a monopoly on static types.

In the meantime, Julia GR plots are missing basic features and insist on putting bold black outlines around everything by default, obscuring the data.

npr11 · on Feb 14, 2021

Really nice to see so many little usability improvements -- like easy temporary envs, syntax highlighting in dependency conflict errors, and more partially-applied functions -- as well as more significant language development on threading, stack allocations, and reducing latency.

oscardssmith · on Feb 14, 2021

In a lot of ways, I feel like 1.0 was a backend LTS while 1.6 is more of a front end one. 1.0 had most of the basics nailed down, but it's taken a while for it to become as seamless as it now is.

saiojd · on Feb 14, 2021

I've tried Julia and really liked it, but the user experience was pretty bad. The language really needs faster interactivity or a strong type checker. I found myself waiting on the compiler a lot more than in some AOT compiled language like Rust... I fear the language suffers from being overused by people who are familiar with Matlab and Python and draws too much inspiration from them, much like how Rust draws too much inspiration from C++.

Sukera · on Feb 14, 2021

The most important part about the releases since 1.0 is that compile time has been significantly reduced, the article touches on basically all ways how this has been done - may I ask when you've last tried it?

saiojd · on Feb 14, 2021

A few months ago. First time to plot is noticeably better than it was a few versions ago, but still extremely slow compared to Python. The article gives a benchmark of 9 seconds. I mean, come on.

The main problem I had was simply that any time I needed to modify a struct field, or any time my program crashed, or any time the buggy IDE extension crashed, I needed to recompile everything. I also haven't found anyone particularly interested in basic things like interfaces, despite the language supporting type hierarchies (why can't we enforce contracts for types? The whole language is built around overloading...)
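To show what the interface complaint is about, here is the usual convention as a minimal sketch (the `Shape` hierarchy is invented for the example): the "contract" is a fallback method that errors, so a missing implementation is only noticed when the method is actually called.

```julia
abstract type Shape end

# The "contract": every Shape is expected to implement `area`.
# Nothing checks this at definition time; the fallback only fails when called.
area(s::Shape) = error("area is not implemented for $(typeof(s))")

struct Circle <: Shape
    radius::Float64
end
area(c::Circle) = pi * c.radius^2

struct Blob <: Shape end   # forgets to implement `area`, and nothing complains

println(area(Circle(1.0)))
# area(Blob()) would throw an ErrorException at run time.
```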

Overall this is very frustrating as the language is excellent in many regards, in particular multiple dispatch and the compilation model are just great. The "just ahead of time" compilation is one of those obvious-in-hindsight ideas IMO, better than full interpretation or full compilation for nearly all use cases, if only it could be cached between interpreter sessions or if you didn't need to restart all the time...

Sukera · on Feb 14, 2021

I know that there are plans to cache even more, but other than that I can basically only recommend to put distinct projects into proper projects (with a module and Project.toml). That will already cache precompiled code for that module, even between sessions. Having things in a script won't have that benefit. For experimenting with struct layouts, I've found that NamedTuples (https://docs.julialang.org/en/v1/base/base/#Core.NamedTuple) are amazing for prototyping, since they can be accessed via A.b just like structs but don't have the limitation of being const global.
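A minimal sketch of that prototyping workflow, assuming nothing beyond Base Julia. The names `particle` and `advance` are made up for illustration and do not come from any package:

```julia
# Prototype a record as a NamedTuple: there is no type definition,
# so the "shape" can change freely within one session.
particle = (x = 1.0, v = 0.5)

# Fields are read with dot syntax, just like struct fields.
# `merge` returns a new NamedTuple with the given fields replaced or added.
advance(p, dt) = merge(p, (x = p.x + p.v * dt,))

# Adding a field later is just another merge, with no restart needed.
particle2 = merge(particle, (mass = 2.0,))

moved = advance(particle2, 0.1)
println(moved)
```

Once the set of fields settles down, the same code should keep working after swapping the NamedTuple for a proper `struct`, since `advance` only relies on `p.x` and `p.v` for reading (the `merge` call would need to become a constructor call).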

The dynamism and flexibility combined with the compilation model is basically what leads down this path of recompilation, unfortunately. Since importing packages may change behaviour/invalidate some compiled method (that's what the SnoopCompile stuff in the article was about), it's nontrivial to just begin caching things left and right. You'd end up with an exponential explosion in the number of methods to cache, wasting huge amounts of disk space. That's not to say that there aren't more things that could be done, just that it's hard to do so.

saiojd · on Feb 14, 2021

I've read a bit on type invalidation and I know it's a hard problem (in fact it's hard for me to even wrap my head around it, lol). Still, it's unfortunate. One thing I would like to know is if the difficulties with invalidation are a symptom of the dynamic semantics, or of the compilation model.

NamedTuples are cool, but I'm not sure I understand the tradeoffs between using them and using structs. Can I just replace all structs in my project with named tuples, without having a performance hit?

Sukera · on Feb 14, 2021

Note that I didn't suggest replacing structs with NamedTuples entirely - only during prototyping, while you're figuring out what you want your struct to look like. Structs most definitely will be faster.

saiojd · on Feb 14, 2021

I mean, I could, it's just that it's pretty hard to know in advance which structs I will have to modify... Most of the time I only need to do minor edits like add 1 field. By that point I need to recompile anyway if I am to switch to NamedTuples...

oxinabox · on Feb 14, 2021

> Structs most definitely will be faster.

I am not 100% sure this is true.

Structs will definitely look cleaner in the code. Not sure they will be faster though.

saiojd · on Feb 14, 2021

Do you know what the differences end up being when it comes to compilation? (not a rhetorical question - I'd like to know)

oxinabox · on Feb 15, 2021

Run `@code_typed` and `@code_llvm` and find out?
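For anyone who wants to try that comparison, a minimal sketch might look like the following. `PointS` and `norm2` are hypothetical names made up for this example; `using InteractiveUtils` is only needed outside the REPL, which loads it automatically.

```julia
using InteractiveUtils  # provides @code_typed and @code_llvm in scripts

# A concrete struct and a NamedTuple with the same fields.
struct PointS
    x::Float64
    y::Float64
end

# One generic function; Julia compiles a specialization per argument type.
norm2(p) = p.x^2 + p.y^2

s  = PointS(3.0, 4.0)
nt = (x = 3.0, y = 4.0)

# Print the LLVM IR generated for each specialization and compare by eye.
@code_llvm norm2(s)
@code_llvm norm2(nt)
```

If the two printouts are essentially the same, that would suggest there is no runtime difference for this particular function; it says nothing about compile time, which has to be measured separately.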

Certhas · on Feb 14, 2021

I feel your pain on interfaces. As it stands, Julia simply doesn't encourage carefully thinking about and documenting what assumptions your code makes; it encourages just banging things together and hoping they work. Good luck dealing with any obscure MethodErrors that result if they don't.

It still, at the end of the day, is a mostly academic language. So mostly small projects with very few people working on them. No need to architect bigger solutions/patterns, etc...

BadInformatics · on Feb 14, 2021

I'm not sure that's a fair characterization. Core team members have expressed serious interest in getting more static verification, interfaces and other type goodness into the language, but if you try to force the issue it turns into a Python 2->3 problem. Not to mention that there are few if any examples of how to fit type checking alongside multiple dispatch (e.g. C# punts with dynamic).

I personally find static types indispensable when working with a large codebase, but to say people don't care about this in the Julia community is just not correct.

Certhas · on Feb 14, 2021

One thing is the Core Team Members and their long term plans, which I'm not privy to. The other is the impression generated by interacting with others in the community online, where people disagree with the very premise that there is a problem.

I also see your point that not too much is known in this design space, but I also think that's why it would be good for the community to step up and experiment with this more. Figure out what works. I had plans to do that last summer but life intervened so I am stuck commentating from the sidelines. :P

adgjlsfhk1 · on Feb 14, 2021

Note that if you ask on slack, you often will get answers from the core team members. They're pretty open. The main place where plans aren't the most clear is when there are features that everyone knows would be good, but aren't on the top of any of the main people's to-do list. The JuliaLang repo has over 1000 people who have contributed, so a lot of the time, a new feature is just the result of a community member making a random PR. Stack trace improvements for example, started as a package, got ported to Base, and got improved by a bunch of people contributing to design decisions.

dTal · on Feb 14, 2021

> The "just ahead of time" compilation is one of those obvious-in-hindsight ideas IMO

It's not new. One of the most widely used Lisp environments, SBCL, works this way. So does Chez Scheme, and therefore now Racket.

saiojd · on Feb 14, 2021

That's interesting, I didn't know Racket worked like that. What I meant wasn't so much that it's a new idea, rather that it's a good one.

DNF2 · on Feb 14, 2021

No one interested in interfaces? My impression is that interfaces are very commonly discussed, and are one of the most anticipated features in the language, though they may not come until v2.0.

saiojd · on Feb 14, 2021

OK, I was being hyperbolic. It's just that the interest is low compared to other features, as most people involved in the project are used to dynamic languages and don't feel the need. It's been in discussion for a long time, with no action so far. From what I've seen, the general attitude is a bit dismissive of the utility of static verification ("it's different in our language because X" type attitude)

Certhas · on Feb 14, 2021

My impression with this is that it's mostly coming from a few academics that never work on code they haven't written themselves, and that have no understanding of use-cases other than their own...

superbcarrot · on Feb 14, 2021

I'm trying really hard to like Julia but constantly chasing "ERROR: MethodError: no method matching" messages gets frustrating very quickly.

DNF2 · on Feb 14, 2021

When do you run into this kind of problem? Is it possible that you are over-typing your function signatures? The generally recommended style is to write generic code with loose type restrictions, or even none at all.
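A small sketch of what over-typing can look like in practice. The function names `total_strict` and `total_generic` are invented for this example:

```julia
# Over-typed: only accepts exactly Vector{Float64}.
total_strict(xs::Vector{Float64}) = sum(xs)

# Loosely typed: accepts anything `sum` can handle.
total_generic(xs) = sum(xs)

total_generic(1:3)         # a range works
total_generic([1, 2, 3])   # a Vector{Int} works

# The strict version rejects a Vector{Int}, because Vector{Int} is not
# a subtype of Vector{Float64}, even though the body would work fine.
try
    total_strict([1, 2, 3])
catch err
    println(typeof(err))   # the error type raised by the failed dispatch
end
```

The generic version still compiles to specialized code for each concrete argument type, so dropping the annotation is not expected to cost performance here.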

sgt101 · on Feb 14, 2021

>"strong type checker"????

I like julia because of the super powerful and super strong type checking. Have I misunderstood what is meant by strong type checker?

aliceryhl · on Feb 14, 2021

What do you even mean? It doesn't even catch that an argument's type doesn't match on a function call with types specified. It also can't catch trivial stuff like misspelling a struct field on a variable of known type.

If you want to be called a "super strong" type checker, you really have to catch that kind of simple issue at compile-time, _not_ when I run the code.

adgjlsfhk1 · on Feb 14, 2021

Technically it is catching the type error at compile time (but compile time is Just Ahead Of Time). If you want something that feels more like type checking in a statically compiled language, you should definitely check out https://github.com/aviatesk/JET.jl

gugagore · on Feb 15, 2021

> It also can't catch trivial stuff like misspelling a struct field on a variable of known type.

The default definition is

`getproperty(x, f::Symbol) = getfield(x, f)`

and `getproperty` can be overridden for a type, so `foo.a` can succeed even if `getfield(foo, :a)` fails. (`getfield` cannot be overridden).

So it's not trivial to determine, given the type of `foo`, from the syntax `foo.a` whether that code errors.
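A minimal illustration, with a made-up `Celsius` type whose `fahrenheit` property is computed rather than stored:

```julia
struct Celsius
    degrees::Float64
end

# Overload property access for this type. Use getfield inside the method,
# since writing `t.degrees` here would call getproperty again.
function Base.getproperty(t::Celsius, name::Symbol)
    if name === :fahrenheit
        return getfield(t, :degrees) * 9 / 5 + 32
    else
        return getfield(t, name)
    end
end

t = Celsius(100.0)

println(t.fahrenheit)                    # succeeds via getproperty
println(hasfield(Celsius, :fahrenheit))  # there is no such field
```

A checker that only looked at the declared fields of `Celsius` would wrongly flag `t.fahrenheit` as a misspelling.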

sgt101 · on Feb 14, 2021

Ahhhh - you want a statically typed language. That's definitely not Julia, you're probably best off with Java there.

aliceryhl · on Feb 15, 2021

I'm well aware which languages are statically typed and which are not. I was answering your question:

> Have I misunderstood what is meant by strong type checker?

cambalache · on Feb 14, 2021

What boggles my mind is that a language oriented to scientific programming has such a lousy time to first plot. I know it has been improving, but it is still not acceptable. Not for me as a user, and not for a language that wants to become mainstream.

oscardssmith · on Feb 14, 2021

It's a better time to first plot than matlab, which is one of the other major contenders. On my computer it is about 3 seconds, which is noticeable, but far from disqualifying.

systemvoltage · on Feb 14, 2021

Julia oversells itself as a general purpose language which I find absolutely out of line. Their marketing needs to be a lot more humble until they figure out the kinks.

Also, my guess would be Python and not Matlab as its main competitor.

eigenspace · on Feb 14, 2021

In what way is marketing julia as a general purpose programming language 'way out of line'?

People use julia to make webservers, write programming languages, create plotting libraries, do scientific analysis, do compiler research, make video games, do HPC, etc.

Julia has a design that's indeed strongly informed by scientific computing, but in order to actually meet the needs of the various people using it for scientific and technical purposes, it ended up needing to become a flexible enough language to be useful for anything.

systemvoltage · on Feb 14, 2021

Do you think R or Matlab is a general purpose language? Sure you could do all these things but should you?

Julia is clearly positioned as a scientific computing language. Let's be clear.

eigenspace · on Feb 14, 2021

Doing these things in julia is very different from doing these things in R or Matlab. The tooling and ecosystem for non-scientific applications in Julia is growing rapidly and is quite competent.

Julia is absolutely a general purpose language. Its user base skews heavily towards scientific computing, but the demographics and ecosystem are broadening daily.

systemvoltage · on Feb 14, 2021

Having used Julia for 2+ years, I couldn't disagree more. Productionizing Julia code has been a total nightmare. The community library support has been growing but hasn't gone through the wringer. Just because things are improving doesn't provide a meaningful understanding against its competitors.

I don't see any reason to use Julia over Go for backend webservers. Rust or C++ for systems programming. And frankly, I prefer Python for scientific computing.

Julia also has a tiny standard library and lots of flaky external libs which make productionization of code a risky adventure which I have personally been bitten by.

Most people are allured by Julia's overhyped marketing which is a shame because the original paper by Stefan is pretty impressive. We're seeing some criticisms of Julia in this thread, rightfully so.

My advice to people who are subscribed to Julia's marketing is to listen to people that are complaining. No one wants to just complain, they're saying that because of many reasons. Be humble and try to listen, accept Julia's many shortcomings (error messages and stack traces, library support, startup time, IDE, debugging, etc.). Julia has many shortcomings that are only apparent after using it outside of the Jupyter Notebooks. Not accepting those makes you an annoying fanboy.

eigenspace · on Feb 14, 2021

> I don't see any reason to use Julia over Go for backend webservers. Rust or C++ for systems programming. And frankly, I prefer Python for scientific computing.

Sure, I would never claim Julia is the best language for webservers or systems programming. If someone came to me saying they wanted to do this in Julia, I'd probably tell them "if this is important, I'd probably look at a more established language for this purpose unless you have a good reason to want to use julia for this"

That doesn't make julia not a general purpose programming language. It just means it's not the best language for every imaginable purpose (no language is).

I personally prefer Julia very strongly for scientific computing to its competitors, and because of the amount of time I've invested in it for that, I also do many other things in it and I find it quite nice for this.

It's totally fair that you prefer Python for scientific computing. Python has a great ecosystem and huge community with tonnes of investment! It's an incredibly stiff competitor. I prefer Julia, and think I have strong reasons to do so, but everyone's needs and desires are different.

> No one wants to just complain

This is an empirical claim about human psychology and it's false. But regardless, yes there are a lot of totally valid criticisms of julia in this thread! Just because these criticisms exist and some of them have good points doesn't make julia a bad language though.

Please consider the fact that not everybody has the same needs, desires and temperament as you. Every language has major problems with it, but different people feel these problems differently.

For many people (for example, me), Julia is a gigantic breath of fresh air! For others, it's painful and clunky. I think there's a lot of good here that people should see and check out and think about, even if they decide it's not for them. Especially because these things improve every day.

___________________________________________________

Just a disclaimer in case anyone is suspicious about my affiliations: I have absolutely zero financial stake in Julia's success. I am not employed by anyone who would benefit from more people switching to julia. I'm a physics PhD student. I simply find julia very useful and pleasant to use and want to share that with others.

systemvoltage · on Feb 14, 2021

> I'm a physics PhD student. I simply find julia very useful and pleasant to use and want to share that with others.

User base tends to be scientists and not seasoned Software Engineers. No offense to either one, just that the community inspires the language and its mechanics. This is exactly the reason it is not a general purpose language. You just proved my point.

Glad you find it useful for your endeavors. I reckon DiffEq and other hardcore math is great in Julia.

chrispeel · on Feb 14, 2021

> Most people are allured by Julia's overhyped marketing ...

Wikipedia says "Marketing refers to activities a company undertakes to promote the buying or selling of a product, service, or good." Julia is not a company; I think what you're calling "marketing" would better be labeled "user enthusiasm" :-)

systemvoltage · on Feb 14, 2021

The founders of Julia do have financial interest though! I think they're part of Julia Foundation. Same people wrote the marketing material I presume.

ChrisRackauckas · on Feb 14, 2021

> The mission of The Julia Foundation is to provide assistance to those in need while creating awareness of the power of art to heal and inspire.

http://www.thejuliafoundation.org/

I didn't think they were, but now it makes sense why Jeff Bezanson's voice is so soothing.

StefanKarpinski · on Feb 15, 2021

Out of curiosity, what marketing material are you referring to? I.e. what is it that you find misleading, specifically?

If anything, I feel that the Julia website and manual focus more on technical computing than they ought to and could stand to spend more time on general computing matters for which the language is also well suited. I’ve been meaning to write a blog post entitled “Julia is a General Purpose Language” for a long time. Which I suspect you would take issue with, but that’s ok.

gugagore · on Feb 14, 2021

You're going from "don't call it a general purpose language" to "be receptive to complaints about its short-comings". There's quite a gap in between.

I know that contributors are receptive about short-comings. And if you frankly prefer other languages, then that's totally fine too.

systemvoltage · on Feb 15, 2021

Short-comings are intrinsically related to its general purposeness. Tiny standard library is one of them. Not being able to put things into production is another.

I guess I see general purpose languages such as Python and Go as rock solid. They have warts but they're well understood and wrinkles have been ironed out.

sgt101 · on Feb 14, 2021

>Not accepting those makes you an annoying fanboy.

Thanks for the constructive input to the debate.

systemvoltage · on Feb 14, 2021

I am allowed a bit of unconstructiveness after providing a pretty wide take on the productionization of Julia code. Spare me? :)

FridgeSeal · on Feb 15, 2021

Julia is a general-purpose language, with language and implementation features conducive to scientific computing.

adgjlsfhk1 · on Feb 14, 2021

Julia has a lot of main competitors. I would consider Julia a competitor to Fortran, C, Matlab, R, and python. If you look at DifferentialEquations.jl or the clima.jl package, these are packages that are competing with low level libraries that would traditionally be written in C or Fortran. It competes with these by offering comparable performance, while having much better quality of life features (like automatic PGO, a package system, metaprogramming, and not having to deal with make files). It competes with matlab by having an incredibly rich linear algebra library, while being free and not making you do dumb stuff like 1 function per file.

rightbyte · on Feb 14, 2021

The plot is also very simple. No zoom, brush, save to file, label, title, regression etc that you need to process it in a GUI and save for a report.

I would really like to use Julia as a "Matlab or Octave but with nice string concatenation" but the UI is just lacking for one off calculations and data processing.

Sukera · on Feb 14, 2021

I'm afraid I don't understand your comment - both Plots.jl as well as Makie.jl (the two most commonly used plotting packages, as far as I know) support all of those things. Makie.jl does so natively, Plots.jl does so if the backend (e.g. Plotly, PyPlot) supports interactivity (the rest is available by default). Do you mind giving an example, such that this could be improved further?

rightbyte · on Feb 14, 2021

Ye sorry I meant in the plot window (GUI), not the scripting. For data processing of lab measurements zooming and panning, brushing, "drag to select" etc in an easy way is really convenient since you don't know where in the plot interesting stuff will be in advance. Adding titles and labels, text arrows etc is a nice extra.

Matlab has quite good capabilities of this kind; Octave is more limited (you can't add titles, labels, or regression lines, brush away data points, or get simple data statistics like sums or std devs as you can in Matlab, but you can zoom and pan, save to file etc).

Julia seems to launch a Qt-window with plot, so adding some menu bar with zoom and pan shouldn't add too much bloat.

E.g. exploring roots benefits from zooming a lot.

EDIT:

You seem to be able to switch the "backend" of Plots to e.g. PyPlot for some of this functionality. I didn't know that.

Sukera · on Feb 14, 2021

Yes precisely - it's not the frontend that's doing the lifting here. The interactivity as you describe it depends on if the backend supports it, which not all do.

I agree that there's room for improvement here, but I also think that it won't happen in Plots.jl (and if it does, it'll depend on a backend that already provides interactivity). I think it'll be more likely to see something like this plop out of Makie.jl or something built on top of Makie.jl, as that has a lot of primitives for interactivity available already.

thebooktocome · on Feb 14, 2021

There are absolutely labels, titles, and the ability to save figures to files.

I used to deliver project reports to customers using a combination of Julia and LaTeX. It was perfectly suitable for that application.

oxinabox · on Feb 14, 2021

I think they mean no GUI for that. To manually (rather than programmatically) do this.

ForHackernews · on Feb 14, 2021

...it's a programming language, not an Excel competitor.

You could use Julia to build some kind of GUI plot-making tool.

enriquto · on Feb 14, 2021

> Octave but with nice string concatenation

As a heavy octave user, I never felt a need to concatenate strings in any way. But I'd be happy if julia was an Octave but with fast loops, which it sort of is; but still not really there.

rightbyte · on Feb 14, 2021

When you use Matlab/Simulink for C code generation, and use it for a lot of purposes Matlab was not really made for, you might run into processing strings.

But ye for normal use it is not really a problem.

kitsune_ · on Feb 14, 2021

It takes 180s to import Statistics, PlutoUI, Images and OffsetArrays in Pluto on this Macbook Pro with 32GB that is a couple of years old, while the fans are going crazy. If I'm unlucky a worker process will seg fault. All in all the interactive / REPL part (even with stuff like Revise) is kind of a let down thanks to the slow precompilation / lack of caching? I like it, but I really don't see how it is ever going to replace Matlab, R or Python if these issues aren't addressed.

eigenspace · on Feb 14, 2021

I assume that's including precompilation, otherwise something is seriously wrong with your computer. Precompilation can take a while, but it's cached between sessions, so it should be much faster the second time.

Also, in version 1.6, we have multithreaded precompilation which happens at installation time instead of when you first try to load the package. This makes it much faster.

Acur · on Feb 14, 2021

Just checked. On my slightly dated laptop the import of these packages in Pluto takes 5 seconds with Julia 1.6. In general 1.6 is a huge improvement in compilation times and usability for me.

pkofod · on Feb 15, 2021

What does "32GB" have to do with it? Why would a worker process segfault in this scenario?

swagonomixxx · on Feb 14, 2021

As someone who knows very little but wants to learn: what's the best way to get up and running with Julia? On the website, they link a lot of videos, but I prefer textual formats. Is there something like the Rust Book for Julia?

thetwentyone · on Feb 14, 2021

- [JuliaLang.org](https://julialang.org/), the home site with the downloads to get started, and links to learning resources.

- [JuliaHub](https://juliahub.com/ui/Home) indexes open-source Julia packages and makes the entire ecosystem and documentation searchable from one place.

- [JuliaAcademy](https://juliaacademy.com/courses), which has free short courses in Data Science, Introduction to Julia, DataFrames.jl, Machine Learning, and more.

- [Data Science Tutorials](https://alan-turing-institute.github.io/DataScienceTutorials...) from the Alan Turing Institute.

- [Learn Julia in Y minutes](https://learnxinyminutes.com/docs/julia/), a great quick-start if you are already comfortable with coding.

- [Think Julia](https://benlauwens.github.io/ThinkJulia.jl/latest/book.html), a free e-book (or paid print edition) book which introduces programming from the start and teaches you valuable ways of thinking.

- [Design Patterns and Best Practices](https://www.packtpub.com/application-development/hands-desig...), a book that will help you as you transition from smaller, one-off scripts to designing larger packages and projects.

- Lots more topical books (Statistics, Optimization, etc) if looking for a Julia-oriented subject matter

nextos · on Feb 14, 2021

The official manual is very good, and reads more or less like a textbook: https://docs.julialang.org/en/v1/

cbkeller · on Feb 14, 2021

Yeah, this is probably actually the closest equivalent to the Rust Book, but I’ll also second the suggestion from the other comment of “Think Julia” (beginner) and “Design Patterns and Best Practices with Julia” (intermediate /advanced).

For me, the most important thing to grasp when coming from another language was that Julia’s multiple dispatch brings with it effectively a whole paradigm of “dispatch-centric programming” that you have to embrace to really get the most out of Julia, including the c-like speed (have to strictly avoid type-instability for that) and the composability that everyone talks about.

hpcjoe · on Feb 14, 2021

TIL of OhMyREPL. And tried it. It's a keeper.

moelf · on Feb 15, 2021

any chance you use fzf for history search too? ;)

hpcjoe · on Feb 23, 2021

Now I must research what fzf is ... :D

295310e0 · on Feb 14, 2021

Can someone comment on the ease of distributing Julia code? Can I easily take a bit of code I write (w/o external libraries) and produce a single binary I can ship to someone or do they require a full Julia environment to run it?

eigenspace · on Feb 14, 2021

Doing this without the user having Julia installed would require using PackageCompiler.jl to make a standalone executable. https://julialang.github.io/PackageCompiler.jl/dev/apps/

This is a pretty stable well established process now, but the binaries it produces are huge because they actually have the full Julia runtime in them.

Active work is happening on small binary static compilation. We already do it for GPUs, we just have to repurpose our GPU AOT compilation pipeline for the CPU. There are proofs of concept that currently work, but something more polished is feeling like it’ll probably be another year or so.

jakobnissen · on Feb 14, 2021

They require the full Julia environment to run it - and it's a heavy environment. It is possible to compile a binary that includes the environment and the compiler, but IIRC, that will result in a >400 MB hello-world script taking 150 MB of RAM to run.

The core devs have mentioned they are going to add the capacity to compile to actual static binaries, but that does not seem to be a top priority, so I wouldn't hold my breath waiting for it.

adgjlsfhk1 · on Feb 14, 2021

Note that this feature doesn't necessarily require work by the core devs. It is completely feasible (at least in theory) to write a library that can output static executables.

ChrisRackauckas · on Feb 14, 2021

It's actually done all of the time. GPUCompiler.jl, the core of the CUDA and AMD GPU stack, builds static binaries (compiled to .ptx by LLVM for CUDA for example, but the choice is just a switch), then stashes those binaries to use with a ccall in a Julia function. You could in theory use that stack to statically-compile anything that's GPU compilable, and it's really well-tested.

baldfat · on Feb 14, 2021

I loved the idea of Julia but R and specifically the tiddyverse https://www.tidyverse.org/ Just makes everything else seem not as elegant to my humble eyes.

tfehring · on Feb 14, 2021

I use R and the Tidyverse extensively. Exploratory data analysis is definitely clunkier in Julia - you need the `@pipe` macro to patch up some limitations in native pipes, there's no `dbplyr` equivalent that I know of, and despite Julia's better metaprogramming in general, the lack of built-in equivalents to scoped `select`/`mutate`/`summarize` is a real drag. But Julia's type system, substantially better date/time system and utilities, explicit vectorization with `.`, and the use of functions instead of scoped expressions in functions like filter are all real benefits over R.

If you write a lot of Rcpp, Julia's performance without dropping down into a lower-level language is also a significant advantage. It's easy, bordering on trivial, to performantly implement a generic join (i.e. `join(f, df1, df2)`) in Julia; `dplyr` still doesn't have those at all, `data.table` only sort of does, and I believe the canonical R implementation (AFAIK) in the `fuzzyjoin` package requires holding the Cartesian product of the dataframes in memory, which is obviously not great.

diarrhea · on Feb 14, 2021

> tiddyverse

That is not what you meant to say.

fishmaster · on Feb 14, 2021

You can use RCall to use R from Julia: https://github.com/JuliaInterop/RCall.jl

phillc73 · on Feb 14, 2021

Have you tried Query.jl or DataFramesMeta.jl?

cwyers · on Feb 14, 2021

Very much not the parent, but as a heavy R user, I don't think either of them quite nail the way dplyr and the tidyverse work. The thing about dplyr is... it's just functions. Okay, so, it's functions that leverage features R has (notably lazy evaluation and non-standard evaluation). But it's just functions. All you need is a function that takes a data frame and returns a data frame. So you can take a function out of the R standard library, you can take a function from a package written before dplyr came around, you can take a function from a recent non-tidyverse package, you can write your own function... it's all just functions.

In DataFramesMeta.jl, though, you have a macro, and everything runs inside that macro. So if you want to take something that isn't a part of DataFramesMeta.jl... here's an example. Let's say you want to take the popular mtcars dataset, and get the five cars with the best gas mileage. In dplyr, that goes

mtcars %>% arrange(desc(mpg)) %>% head(5)

arrange is a function from the dplyr package, head is a function from the standard library, but they both work seamlessly together.

DataFramesMeta.jl lets you work in a pipe-forward fashion, but (at last I knew, at least, it's been a while since I played with it), you couldn't use the Julia head function within a DataFramesMeta.jl pipeline. You have to do your data transformations, assign to a variable, and then get the head of that variable.

Which, okay, probably doesn't sound like a big deal. But I think it gets at the heart of what efforts to do something Tidyverse-like in other languages (Python and Julia, mostly) really miss. The key value proposition of the Tidyverse in R is that it is very composable and very extensible. That means, if you are trying to solve something in a Tidyverse way, you can probably find something that works for you. If you are doing financial analysis? Get tidyquant. If you're doing time series analysis, the tidyverts packages are for you. And it all works because there is so little friction involved in writing your own functions that extend the functionality of Tidyverse packages. Yes, dplyr is a useful querying DSL in its own right, but you can find a bunch of SQLish query languages, and they're all some degree of fine. Query.jl or DataFramesMeta.jl might expose a useful querying DSL for data frames, but they don't seem to me to be built to support building a whole ecosystem like dplyr and the Tidyverse are.

phillc73 · on Feb 14, 2021

That's a really good point that I'd not really thought about. I'd never really considered the difference between calling just functions versus macros.

Thinking about Query.jl and DataFramesMeta.jl, and I am for sure not an expert in either, I can't specifically speak to your `head` example, but other base functions can be combined with macros. For example, see the LINQ examples from DataFramesMeta.jl[1] where `mean` is being used. Or again the LINQ style examples in Query.jl[2], where `descending` is used in the first example, or `length` later in the Grouping examples.

Is that the kind of thing you meant?

For whatever reason, with the way my brain is wired, the LINQ style of query just works for me. I have never directly used LINQ, but do have some SQL experience. In fact, I wrote some dinky little wrapper functions[3] around duckdb[4] so I could directly query R dataframes and datatables with SQL using that backend, rather than sqldf[5].

[1] https://juliadata.github.io/DataFramesMeta.jl/stable/#@linq-...

[2] https://www.queryverse.org/Query.jl/stable/linqquerycommands...

[3] https://github.com/phillc73/duckdf

[4] https://duckdb.org/

[5] https://cran.r-project.org/web/packages/sqldf/index.html

baldfat · on Feb 14, 2021

I don't like working with data.table UNLESS it is a HUGE data frame. And if it is huge, I'll go towards Spark. I'm normally working with under a million objects, which with today's computers is not that big.

Edit: I meant to say that the data.table library in R reminds me more of Query.jl than the Tidyverse.

wodenokoto · on Feb 14, 2021

What are you doing in the tidyverse that is not related to data frames (or tibbles, as the subclass of data.frame that the tidyverse uses is called)?

phillc73 · on Feb 14, 2021

Query.jl supports two different paradigms, one inspired directly by LINQ, the other by dplyr.[1] I actually prefer data.table in R, over dplyr, and Query.jl is really quite different.

[1] http://www.queryverse.org/Query.jl/stable/

aliceryhl · on Feb 14, 2021

I really wish the language didn't force me to use the REPL.

moelf · on Feb 15, 2021

Most users use an IDE or a notebook; there's no need to use only the REPL.

If you're an old-school editor -> terminal run kind of person, check out https://github.com/dmolina/DaemonMode.jl

aliceryhl · on Feb 15, 2021

Thanks, I'll have to check this out. I really wish these things were more discoverable. It took me a month of using Julia until I figured out that the compile-times were even avoidable on the REPL by using Revise.jl.

krastanov · on Feb 15, 2021

How is it forcing you!?

aliceryhl · on Feb 15, 2021

Because if I run it from the command-line, it takes 40 seconds before it has precompiled and prepared my project for use. The only way to avoid that wait on every change is to use the REPL, where I only need to wait on the first run.

Outside the REPL, every run is the first run.

pkofod · on Feb 15, 2021

You don't precompile just because it's run from the command line, though. There must be something missing from your workflow description, or you're somehow misunderstanding what is going on. Since this is a v1.6 blog post: did you try a v1.6 RC?

aliceryhl · on Feb 15, 2021

Maybe precompile is the wrong word, and I should have called it JIT or something else. I don't know.

The point is that running a .jl script that calls my algorithm on a small test case takes 40 seconds to run. If I change the script to run the algorithm more than once on that dataset, all calls after the first complete in less than one second.

Running it with v1.6-rc1, it appears to have improved the running time from 40 seconds to 30. That's pretty good, but still way too slow to enable any kind of workflow that doesn't involve the REPL.

(The 30 and 40 second numbers are very consistent from run to run.)
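
A minimal sketch of the effect described above, for anyone who wants to reproduce it in a script: time the same call twice in one Julia session. `my_algorithm` and `data` here are hypothetical stand-ins, not the actual project code, and a toy function like this will show a far smaller gap than 40 seconds. The first call includes JIT compilation of the method for that argument type; the second reuses the compiled code.

```julia
# Hypothetical stand-in for the real algorithm (an assumption for illustration).
function my_algorithm(data)
    return sum(abs2, data) / length(data)
end

# Small hypothetical test case.
data = rand(1_000)

# First call: the measured time includes compiling my_algorithm for Vector{Float64}.
t_first = @elapsed my_algorithm(data)

# Second call: the compiled method is already cached in this session.
t_second = @elapsed my_algorithm(data)

println("first call:  ", t_first, " s")
println("second call: ", t_second, " s")
```

Because the compiled code lives only in the running session, a fresh `julia script.jl` pays the first-call cost again, which is why a long-lived process (the REPL with Revise.jl, or DaemonMode.jl as suggested above) avoids it.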