
Jupyter-style notebooks are already becoming the next VBA in some fields. And this is not a good thing.


I guess that some developers' requirements are orthogonal to those of citizen developers. For example, version control could be a must-have for us, but for others it is a layer of complexity just waiting to get in the way of getting things done.


Yes. Jupyter et al. are quick to get started with, and you can get quite far before things start to get unmanageable.

But this may not be good for anybody in the long run. For example, it tends to leave many students without an understanding of basic concepts, like variables. Which is understandable, because variables don't behave like variables in notebooks (e.g. the same variable in the same notebook may refer to different values in different cells depending on the order in which the cells are run). For many students this can cause wrong mental models that are almost impossible to shake (and which they will of course carry to "production" later on).

But as I argued in another thread here, it doesn't have to be this way. E.g. Pluto does notebooks in a more rigorous manner.

Almost all "software engineering" languages and tools make getting started and actually getting something done quickly needlessly difficult. It's probably uncontroversial that the git UI is a total mess, and things are getting even worse with more build tools, dogmatic static typing and general pointless ceremony.


Why? The usual suspects? Lack of version control? Hard to deploy (reproducibly)?


Yes and yes. But the larger and more fundamental problems are the mixing of program logic with state and the inability to make the code composable or modular. Problems in version control and deployment/reproducibility almost necessarily follow from these.

These are probably not impossible to solve for notebook-style tools, but there are not many efforts to solve them, and often they are not even acknowledged as problems.

Edit: There is Pluto for Julia, which attempts to solve the state problem. I have not used it in practice though; I've given up on Julia, in large part because the Julia community tends to be even actively hostile towards "stateless" development.


Thanks. Agreed.

By "stateless", I'm assuming you mean the functional programming paradigms of immutability, idempotency, and no side effects.

FWIW, for build pipelines, my quarter-baked notion is to use ZFS snapshots (or equiv).

I'll check out Pluto for Julia.

As you know, state is a challenge for "serverless" too.

I've been reacquainting myself w/ RDBMS tools. There are a few new strategies (implementations) for change tracking. Back in the day, we just banged the rocks together (ook, ook), so I'm very eager to learn the new hotness.


In the notebook context the main gripe is that notebooks have an "invisible" memory state, which means that one can't deduce from the notebook code what it actually does. Or more concretely, the order of execution of the cells affects what the notebook does. This leads to a sort of higher-level side effect. With usual side effects you get spaghetti; with notebooks you get moving spaghetti in five-dimensional space.
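
A minimal sketch of that execution-order problem, assuming a notebook kernel behaves roughly like one shared namespace that each cell is run against. The cell names and contents here are made up for illustration; this is not how Jupyter is implemented internally.

```python
# Each "cell" is a string of source code. All cells run against one shared
# namespace, which stands in for the kernel's hidden memory state.
cells = {
    "a": "x = 1",
    "b": "x = x + 10",
    "c": "result = x * 2",
}


def run(order):
    """Run the named cells in the given order, starting from a fresh namespace."""
    namespace = {}
    for name in order:
        exec(cells[name], namespace)
    return namespace["result"]


top_to_bottom = run(["a", "b", "c"])        # 22
cell_b_rerun = run(["a", "b", "b", "c"])    # 42
cell_b_skipped = run(["a", "c"])            # 2
```

The notebook text is identical in all three cases; only the click history differs, and that history is not saved anywhere in the code.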

Immutability and idempotency are good, and related, ideals too, although I think these can get too "unergonomic" if taken too dogmatically (like in Haskell or Redux), they should be used with almost goto-level discretion.

Of course there's the clear (short term) usability benefit of maintaining the memory state in that stuff doesn't have to be recomputed. But we can have that benefit and be stateless with pure functions and memoization. I quite often whip up a buggy and brittle ad-hoc solution to do so. There was also the IncPy project [1] that did this more rigorously, but it hasn't been updated in 13 years.

In general I'm a bit baffled why pure function memoization is so rarely used or proposed. Despite the old adage, cache invalidation is not actually half of the three hard problems in CS. With pure functions it's trivial.
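
A minimal sketch of pure function memoization using the standard library's `functools.lru_cache`. The function `slow_square` is a hypothetical stand-in for an expensive computation. Because the result depends only on the argument, the argument itself is a complete cache key and nothing ever needs to be invalidated.

```python
import functools


@functools.lru_cache(maxsize=None)
def slow_square(n):
    """Pure function: the output depends only on n, and nothing else is touched."""
    return n * n


first = slow_square(12)    # computed
second = slow_square(12)   # served from the cache, body does not run again
third = slow_square(13)    # new argument, computed

info = slow_square.cache_info()  # named tuple with hits, misses, maxsize, currsize
```

This cache lives in memory only. Keeping results across sessions, as IncPy aimed to, would also need the function's code to be part of the key, which this sketch does not attempt.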

Another baffle is why snapshotting/change tracking (and compressing) file systems haven't caught on. Instead these tend to get implemented badly in any sufficiently complicated application.

[1] https://github.com/pajju/IncPy


Sorry: of course side effects and mutability (not pure functions and immutability) should be used with goto-level caution.

Also the "higher-level side effects" apply more or less identically to REPL development.


Nix somehow manages changes. (Relies on ZFS?) My future perfect notebook style build script would start there.

I wasn't even thinking about REPL-style work. Mea culpa: I don't actually know how Jupyter et al. work, so I'm talking out my hat.

Your explanation reminded me of "prevalent" persistence (vs full orthogonal persistence). I guess I assumed something like that was happening between cells.

I suppose it's analogous to the transition of UI frameworks.

Bad: Mutate components directly.

Good: Mutate thru event queue. Get undo/redo for free. Debugging still sucks.

Better: Pretend it's a simulation and use an entity component system. I think this is what the kids are calling "reactive".
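
A minimal sketch of the "mutate thru event queue" option, assuming state is only ever derived by replaying a log of events. The `Store` class and the `(key, value)` event shape are hypothetical, not taken from any particular UI framework.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """Hypothetical store: callers never mutate state, they only dispatch events."""
    events: list = field(default_factory=list)
    undone: list = field(default_factory=list)

    def dispatch(self, event):
        self.events.append(event)
        self.undone.clear()  # a new event discards the redo history

    def undo(self):
        if self.events:
            self.undone.append(self.events.pop())

    def redo(self):
        if self.undone:
            self.events.append(self.undone.pop())

    @property
    def state(self):
        # Rebuild the state from scratch by replaying every (key, value) event.
        current = {}
        for key, value in self.events:
            current[key] = value
        return current


store = Store()
store.dispatch(("title", "Draft"))
store.dispatch(("title", "Final"))
store.undo()
after_undo = store.state   # {"title": "Draft"}
store.redo()
after_redo = store.state   # {"title": "Final"}
```

Undo and redo fall out of moving events between two lists, since the state is a function of the log rather than something edited in place.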

> pure function memoization

Answering just for me: because I'm just a simple bear.

I've been imperative for so long that continuations, currying, and lazy eval break my brain. Yup, a fully functional world would be a lot simpler. Maybe it's time for me to revisit Clojure.

Thanks again. This is fun to think about.



