Show HN: A toy interpreter for a minimalistic imperative language in 1.3k LoC

xigency · on Aug 14, 2015

Quite impressive. One improvement that I don't think would increase your lines of code might be to use a `realloc' for your variable storage, so you wouldn't run out of space or have a hard limit.

https://github.com/bbu/simple-interpreter/blob/master/src/ru...

And of course I want to test this myself with a few interesting programs.

In terms of computing power, I think this language is just shy of being Turing complete. It's probably in the same class as regular expressions.

I'm working on a similar project, with a minimal compiler for an integer-only language, and I expect it to have the same limitation at first. I think the easiest way to remedy that is to implement dynamic arrays or some form of recursion, although I'm not sure if the latter is sufficient.

DougMerritt · on Aug 16, 2015

> I think this language is just shy of being Turing complete. It's probably in the same class as regular expressions.

If it is possible to code an infinite loop, that tends to be enough to bring a language up to the level of being Turing equivalent.

To put it more strongly, it is really hard to define a nontrivial language that has infinite loops that is not Turing equivalent.

And in fact I can see how to simulate a cpu in this language, so there you go.

xigency · on Aug 17, 2015

Not so. Without the ability to implement or simulate a stack, a language cannot be Turing complete. Otherwise, that limits the computational power to that of a finite-state machine. Even so, being able to implement a stack is not sufficient, because pushdown automata are defined as finite-state machines combined with a stack, and PDA's are a proper subset of Turing-complete languages. Given that a CPU in this case could be stateless or have a limited number of states, that does not change the class of language.

Anyway, I see the author has implemented arrays now, and more specifically dynamically-sized arrays, so I will concede that it is Turing complete. That wasn't really the intent of my post, as there are other ways to devise a language, but cool none the less.

For a view of the classes of languages in computation, see this hierarchy: https://en.wikipedia.org/wiki/Chomsky_hierarchy#The_hierarch... Here, the set of languages that Turing machines recognize are labeled as recursively enumerable languages.

DougMerritt · on Aug 17, 2015

Thank you, but I am aware of all that, and I asked for arrays above in part to be able to emulate a stack (or a tape, if going the turing machine route).

(I could emulate a bounded stack with a bunch of variables and conditionals choosing them, but that's pretty ugly, and brings home just how bounded the result would be)

> CPU in this case could be stateless or have a limited number of states

The bounded state is an issue, but it is for Pentiums as well. We have approximations of Turing equivalent machines in real life, and that's close enough.

Regarding these other things:

> pushdown automata are defined as finite-state machines combined with a stack

I didn't say it was impossible, I said it was difficult with a nontrivial language -- because I've seen people try, and accidentally make things Turing equivalent.

If you know what you're doing, like you for instance, then sure, you just leave out problematic features and make sure you don't leave the level of the Chomsky hierarchy you were aiming for.

But no, people always want handy things like assignment and arithmetic expressions and loops and function calls and so on. :)

A classic example is pure standard SQL (up to a few years ago), which had no loop construct, and the language was famously sub-Turing.

But if the embedding language puts some SQL in a loop, it's trivial to emulate a CPU with simple SQL, which won't surprise someone classically trained like yourself, but does tend to surprise random programmers in the wild.

(I hear a more recent SQL standard added loops, which I have mixed feelings about.)

bluetomcat · on Aug 16, 2015

I didn't really intend to make the interpreter part complex. It's more like a proof of concept that the tree can be easily traversed and evaluated. Besides, anyone who intends to use more than 128 variables in this language has either chosen the wrong language, or has to extend it at least with functions :-)

The parts that I wanted to make robust and recoverable from all kinds of errors were the lexer and the parser.

DougMerritt · on Aug 16, 2015

Well add an array anyway, to make my above hypothetical project easier. :)

bluetomcat · on Aug 16, 2015

I just added array support: https://github.com/bbu/simple-interpreter/commit/392c38e7899...

The arrays themselves are dynamically realloc'ed and checked for out of bounds access. I've also added some runtime warnings for access to undefined variables and negative or out of bounds array access.

Also, there is an equivalence between variables and arrays: `var[0]` is the same as `var`.

All in 1466 lines :-)

DougMerritt · on Aug 17, 2015

Excellent, that's all good stuff. And yes, that's pretty concise. My toy interpreters tend to grow without bounds.

bluetomcat · on Aug 16, 2015

Suggestion considered :-)