SC, an s-expression based front end to C

coryrc · on Dec 15, 2008

This is exactly what I was looking for!

I write a lot of embedded microcontroller code (usually AVR but some ARM) and have written my own preprocessor that compiles Lisp code between slash-star-L-star and star-L-star-slash tags (sorry, HN turned the characters into formatting; I think you can divine what I mean), appending its standard output into the C source, which is then compiled by gcc as normal. I have Lisp In Small Pieces and have intended to write something like this, but the time hasn't been worth it yet.

"Why?" you may ask. A couple reasons:

* If you use clock dividers to get a particular frequency and then change the main clock, you have to make sure you change every instance in the code to the new divider ratio. I wrote functions that automatically generate the C code which correctly programs the timer, UARTs, etc.

* Most programmers automatically generate their lookup tables, but they have to rerun the program then cut-and-paste the new values into the C source. Then you have to make sure the length is correct, etc. With the code right there, you can just change a single number and recompile, being assured the new table will be correct.
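For illustration, the clock-divider point can be sketched in plain C rather than with the Lisp preprocessor described above. The names F_CPU_HZ, BAUD_RATE, UBRR_FOR, and UBRR_VALUE are hypothetical; the formula is the usual AVR USART one for normal asynchronous mode, and the 12-bit register limit is an assumption that holds for many ATmega parts but should be checked against the datasheet.

```c
/* Assumed values for illustration: edit these two lines only. */
#define F_CPU_HZ   16000000UL   /* main clock in Hz */
#define BAUD_RATE  9600UL       /* desired UART baud rate */

/* AVR USART, normal asynchronous mode: UBRR = f_osc / (16 * baud) - 1.
   Adding 8 * baud before dividing rounds to the nearest integer. */
#define UBRR_FOR(f_cpu, baud) \
    ((((f_cpu) + 8UL * (baud)) / (16UL * (baud))) - 1UL)

/* The derived divider; it follows F_CPU_HZ automatically. */
enum { UBRR_VALUE = UBRR_FOR(F_CPU_HZ, BAUD_RATE) };

/* Fail the build instead of silently programming a bad divider
   (assumes a 12-bit UBRR register). */
_Static_assert(UBRR_VALUE <= 4095, "baud divider does not fit in UBRR");
```

With these values UBRR_VALUE works out to 103; changing only F_CPU_HZ to 8000000UL gives 51.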

joe_bleau · on Dec 15, 2008

As an embedded hacker, I'd love to read more about some of your ideas.

coryrc · on Dec 16, 2008

Well, I'm a big Lispnik and wish I could use it for everything -- so I try! But AVRs (8-bit microcontrollers) can't run managed code anywhere near fast enough. My code is completely statically or stack allocated and almost all structures are in the global namespace, so it is nearly perfectly optimized. A lot less is in the global namespace on ARM processors because the word size can index the entire memory.

One way I've used Lisp is to generate code in either an if-else form or switch-case form. If you have a sparse lookup table (say, 30 values out of 256 map to a byte) you don't want to allocate space for those useless 226 values, so you could write out each by hand. Instead, I maintain a list of pairs at the beginning of the file (along with all the other constants) and I wrote a CL function that converts them to C code. Sometimes using if-elses results in a smaller binary versus switch-case, and sometimes the other way. I only have to change two words and recompile to test the result both ways. Doing this by hand for each statement... yuck. But I saved over 100 bytes on my last project, enough to get a necessary feature working before shipping.
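The if-else versus switch-case experiment can be approximated in plain C with an X-macro, which keeps the pairs in one list and expands them into either form. This is a stand-in for the CL function described above, not its actual output; the pairs, DEFAULT_VALUE, and the function names are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical sparse table: the (key, value) pairs live in one place. */
#define SPARSE_PAIRS(X) \
    X(0x03, 0x10)       \
    X(0x2A, 0x7F)       \
    X(0xF0, 0x01)

#define DEFAULT_VALUE 0xFF   /* returned for keys that are not in the list */

/* Form 1: expand every pair into a case label. */
#define AS_CASE(k, v) case (k): return (v);
uint8_t lookup_switch(uint8_t key)
{
    switch (key) {
        SPARSE_PAIRS(AS_CASE)
        default: return DEFAULT_VALUE;
    }
}

/* Form 2: expand every pair into an if test, checked in list order. */
#define AS_IF(k, v) if (key == (k)) return (v);
uint8_t lookup_if(uint8_t key)
{
    SPARSE_PAIRS(AS_IF)
    return DEFAULT_VALUE;
}
```

Both functions return the same results, so a project can call either one and compare the size of the resulting binary.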

I use emacs, so when I edit code inside the lisp preprocessor tags, I type 'M-x lisp-mode' and press enter. Then emacs properly indents the lines and I can use SLIME to execute the preprocessor functions right there. Sometimes I need to add the comment:

;|

as the first line because the lisp mode catches a pipe in the C source (CL block comment) and formats everything incorrectly. Conversely, I sometimes have to place this after a Lisp section:

//'

...for the single quote (even inside a comment) freaks out the emacs reader. Not sure how this works in emacs23 yet.

Basically, I use Lisp to remove as many assumptions from the code as possible. As a result, my devices work just the way I say they will -- contrary to the other programmer here. Sadly, my coworkers think waterfall is a genius new idea they came up with. It's quite frustrating because I'd still like to learn from others, but I fear going somewhere where I'd be forced to follow a bad methodology (at least here I have the freedom to follow what I think are best practices, on the projects I develop/lead/design).

Back to the Lisp preprocessor: when I call 'make', each source file is attached to the standard input of a Lisp process launched off a file of <processor_name>.lisp, and the output is written into <original_file>.c.cpp-lisp.c (yes, very ugly, but it works). Then gcc is run as normal, with the output written back to <original_file>.o to clean up the file structure.

One problem I ran into is CL up-casing all input. Fortunately, avr-libc has all constants as uppercase, and most of my preprocessor passes through strings unchanged (if it makes sense), but I have had many times when I had to work around this. The project above should allow these problems to be solved easily.

Atmel releases all the register information (names, bit names, read/write/both, etc) in an XML file. The next step in upgrading my preprocessor is to automatically generate mapping functions from generic names to specific names from this XML file (which is first processed by xml-to-sexp). This function lets me write something like this:

  (set-bit 'RXENn :sequence COMPUTER_UART)

and then change the zero to one in

  (defparameter COMPUTER_UART 0)

to use the second UART instead of the first. If this were done in pure C, you would either do a search and replace for RXEN0->RXEN1 (and you'd have to check that you don't use any other UART registers, in this or any other file, etc.) or keep a series of #defines at the top of the file like #define COMPUTER_UART_RX_ENABLE() ...

Finally, the other major problem is changing output pins. As the AVR is 8-bit, the output pins are in groups of eight. Normally, you would have code like this:

  #define LED_PIN PC1
  ...
  PORTC |= (1<<LED_PIN);

but, what if you change LED_PIN to be PB0? Now it is on PORTB so you must change everywhere you write it. So you do:

  #define LED_PIN PB0
  #define LED_PIN_PORT PORTB
  #define SET_LED_PIN() LED_PIN_PORT |= (1<<LED_PIN)
  ...
  SET_LED_PIN();

well, then you need CLEAR_LED_PIN(), and SET_LED_PIN_AS_OUTPUT() and SET_LED_PIN_AS_INPUT()... all of a sudden you want one little constant and you get five extra garbage lines of boilerplate. Instead, my mapping function knows the port of PB0 is PORTB so you only need one line:

  (defparameter LED_PIN 'PB0)

Now, if you end up needing more direct control, you can still use the Lisp code to generate assertions.
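A rough pure-C approximation of the mapping idea uses preprocessor token pasting, so that one definition carries both the port letter and the bit number (or the UART index). This is only a sketch, not the Lisp mapping function described above: the PIN_*, PASTE*, and UART_ENABLE_RX macros are hypothetical, the RXEN bit position is an assumed value, and the registers are plain variables standing in for the memory-mapped AVR registers so that the sketch compiles on a host.

```c
#include <stdint.h>

/* Host stand-ins; on a real AVR these names come from <avr/io.h>. */
uint8_t PORTB, DDRB, PORTC, DDRC;
uint8_t UCSR0B, UCSR1B;
#define RXEN0 4   /* assumed bit position, check the datasheet */
#define RXEN1 4

/* The only lines to edit when the wiring changes. */
#define LED_PIN        B, 0   /* port letter, bit number */
#define COMPUTER_UART  0      /* 0 = first UART, 1 = second */

/* Inner macros receive the already expanded "B, 0" as two arguments. */
#define PIN_SET_(port, bit)    (PORT##port |= (uint8_t)(1u << (bit)))
#define PIN_CLEAR_(port, bit)  (PORT##port &= (uint8_t)~(1u << (bit)))
#define PIN_OUTPUT_(port, bit) (DDR##port  |= (uint8_t)(1u << (bit)))
#define PIN_SET(pin)    PIN_SET_(pin)
#define PIN_CLEAR(pin)  PIN_CLEAR_(pin)
#define PIN_OUTPUT(pin) PIN_OUTPUT_(pin)

/* Two-level pasting so COMPUTER_UART expands to 0 or 1 before joining. */
#define PASTE2_(a, b)    a##b
#define PASTE2(a, b)     PASTE2_(a, b)
#define PASTE3_(a, b, c) a##b##c
#define PASTE3(a, b, c)  PASTE3_(a, b, c)
#define UART_ENABLE_RX(n) \
    (PASTE3(UCSR, n, B) |= (uint8_t)(1u << PASTE2(RXEN, n)))

void led_init(void) { PIN_OUTPUT(LED_PIN); }
void led_on(void)   { PIN_SET(LED_PIN); }
void led_off(void)  { PIN_CLEAR(LED_PIN); }
void computer_uart_enable_rx(void) { UART_ENABLE_RX(COMPUTER_UART); }
```

Moving the LED to another port means editing only the LED_PIN line, and switching UARTs means editing only COMPUTER_UART. Unlike a generator that reads the vendor's register description, this approach cannot check that the chosen pin or UART actually exists on the part.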

Anyway, all the problems in these two posts are some of the most common ones brought up on the forums at avrfreaks.net and that I've experienced, so I wrote these tools to mitigate them. Unfortunately, it is too gross (in the dark corners you can find some of my first real CL code!) and too poorly designed (should be automatically generated based off Atmel XML files) to be released to the public, but the proof is in the pudding: I came back to a six-month-old project, made what could have been major changes (we were upgrading the product) in less than a day and everything worked the first compile and passed all the "QA" tests.

bayareaguy · on Dec 16, 2008

Your system sounds like Cog http://nedbatchelder.com/code/cog except with lisp instead of python.

yters · on Dec 16, 2008

That is pretty sweet. On a much less cool level, I've been designing a similar system to do some basic formal verification of my java code.

sctb · on Dec 15, 2008

There's also PreScheme, used to implement Scheme48's VM, whose compiler is written in Scheme. It seems that PreScheme differs by providing macros in the PreScheme language rather than requiring the user to write `transformations' in the compiler language (Common Lisp in the case of SC).

rw · on Dec 15, 2008

Is this the panacea for those of us wanting both powerful macros and high(est) performance? Is there already something like this for a language with more support for distributed computation?

coliveira · on Dec 15, 2008

This looks interesting as a code generator, but I fail to understand how it can be useful to Lisp programmers. After the code is generated, there is no easy way to call the C-compiled function, other than manually running a C compiler, then using whatever foreign function interface is available in your Lisp version. That is: it sucks almost as badly as writing the C code manually. Am I missing something crucial here?

shiro · on Dec 16, 2008

I'm using a similar technique in the Scheme implementation Gauche. Gauche compiles Scheme into VM instructions, but the VM itself, some foreign bridges, and performance-critical pieces are in C. I started from plain C, then gradually moved to C-in-S-expression. Here's my experience:

* As others pointed out, the ability to use powerful macros is a huge relief compared to the crude macros of C.

* I can manipulate a piece of C code (in S-expr) programmatically. For example, I can automate VM instruction fusion (create a piece of code that combines two or more basic VM instructions).

* I can mix Scheme code and C code more naturally, allowing me to micro-tune inner loop and/or reducing overhead of FFI.
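To make the instruction-fusion point concrete, here is a small hand-written C sketch of what a fused instruction looks like. It is a hypothetical toy stack VM, not Gauche's VM or its generated code: the opcodes, the BODY_* macros, and run are invented for illustration, and the fusion is written by hand where the comment above describes automating it.

```c
#include <stddef.h>

/* Hypothetical opcodes. OP_PUSH_ADD is the fusion of OP_PUSH and OP_ADD. */
enum { OP_PUSH, OP_ADD, OP_PUSH_ADD, OP_HALT };

/* Each basic instruction body is a macro so that a fused instruction
   can be built by placing the bodies one after another. */
#define BODY_PUSH (stack[sp++] = code[pc++])            /* operand follows opcode */
#define BODY_ADD  (sp--, stack[sp - 1] += stack[sp])    /* pop two, push the sum */

/* Runs until OP_HALT and returns the top of the stack.
   No bounds checks: the stack depth of 16 is an arbitrary choice. */
int run(const int *code)
{
    int stack[16];
    size_t sp = 0, pc = 0;

    for (;;) {
        switch (code[pc++]) {
        case OP_PUSH:     BODY_PUSH;           break;
        case OP_ADD:      BODY_ADD;            break;
        case OP_PUSH_ADD: BODY_PUSH; BODY_ADD; break;  /* one dispatch, two bodies */
        case OP_HALT:     return sp ? stack[sp - 1] : 0;
        default:          return -1;                   /* unknown opcode */
        }
    }
}
```

The program PUSH 2, PUSH 3, ADD, HALT and the shorter PUSH 2, PUSH_ADD 3, HALT compute the same result, but the second goes through the dispatch switch one time fewer.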

Of course, if you have a really good native Lisp compiler on your target platform, you won't need this kind of stuff.

s3graham · on Dec 16, 2008

I guess if you don't have a Lisp environment for your target, it could be somewhat better than having to write C code directly (because you could more easily generate it with macros, etc.)

jrockway · on Dec 15, 2008

Wow, it really scares me to see this today. I was just thinking about writing something like this last night.

arjungmenon · on Dec 15, 2008

There's also BitC: http://www.bitc-lang.org/

Quote from their website: BitC is a new systems programming language. It seeks to combine the flexibility, safety, and richness of Standard ML or Haskell with the low-level expressiveness of C.

ngvrnd · on Dec 15, 2008

In fact, when I went hunting for SC, not knowing the name, I first found BitC -- which I agree sounds quite interesting, but seems very heavyweight. I had heard the description "It's just C, written in S-expressions.", and BitC is decidedly not that.

apgwoz · on Dec 15, 2008

BitC's syntax probably isn't going to remain s-exp based. They used it originally because they were familiar with it and didn't want to worry about syntax while defining the core language.

malkia · on Dec 16, 2008

I know a little bit about BitC, but it has a different purpose and different ideas, and the lispish syntax is just for the prototype phase.

SC sounds fun!

Imagine writing CUDA/OpenCL kernels the Scheme/Lisp way :)

Zak · on Dec 16, 2008

>the lispish syntax is just for the prototype phase

That's what they said about Lisp itself. Let's hope it's just as true.