Google is a mess of configuration languages. Starlark, GCL, piccolo, textproto, ...

danpalmer · on June 17, 2024

It's a mess, yes, but it's better than most places I've seen that end up using a bunch of YAML/JSON and then poorly defined custom tools to mangle things together.

Also I'd argue that there's a good core in Google's config stack: everything resolving to a proto at the bottom, textproto for "raw" representations, and having one language above that. If we could standardise on that it would be good, but of course that's unlikely.

Does Boq use a subset of Starlark? I thought it was Piccolo. I've always assumed Piccolo predated Starlark and that Starlark was essentially the external-first rewrite with some learnings.

laurentlb · on June 17, 2024

From what I remember, Piccolo uses a Python interpreter.

Blaze also used to call a Python interpreter to evaluate BUILD files, but this is something I've changed. Starlark is unrelated to Piccolo; it started as an interpreter written inside Blaze. Starlark was created just before we open-sourced Bazel (because I didn't want to open-source the Python preprocessing).

Edit: indeed, Boq's manifest.bzl is Starlark.

esprehn · on June 17, 2024

Boq has a lot of restrictions on what you can do (e.g. no user functions except a tiny allowlist they own). It lives in a manifest.bzl file though, and I'm pretty sure it uses the Go Starlark interpreter.

miki123211 · on June 17, 2024

> (because I didn't want to open-source the Python preprocessing).

Can you explain why?

I'm not trying to criticize anybody here, I'm just curious about the reasoning behind a decision to open source one but not the other.

laurentlb · on June 17, 2024

It was a complete mess: the code was hard to maintain and had scalability and performance problems. The semantics were very error-prone because evaluation combined two interpreters: an optional Python interpreter as a preprocessing step, followed by the interpreter in Bazel.

Bazel started a Python process for each BUILD file to preprocess. If two BUILD files were using the same bzl file, that bzl file would be parsed/evaluated twice. In a graph with lots of dependencies, it was causing a lot of redundant work. It's a bit like #include vs modules in C++: Starlark can evaluate a dependency file once, and provide the result for each BUILD file.
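To make the #include-vs-modules comparison concrete, here is a minimal Python sketch of the two strategies. It is not Bazel's actual loader: the file table, the `loads`/`body` layout, and the function names are all hypothetical, and `exec` merely stands in for a real interpreter. It only counts how often each file gets evaluated with and without a shared cache.

```python
# Hypothetical in-memory "workspace": two BUILD files load the same .bzl file.
FILES = {
    "defs.bzl": {"loads": [], "body": "COPTS = ['-O2']"},
    "a/BUILD": {"loads": ["defs.bzl"], "body": "name = 'a'"},
    "b/BUILD": {"loads": ["defs.bzl"], "body": "name = 'b'"},
}


def evaluate(path, files, cache, counts):
    """Evaluate `path` and its loads; reuse results when `cache` is a dict."""
    if cache is not None and path in cache:
        return cache[path]
    counts[path] = counts.get(path, 0) + 1  # record one real evaluation
    env = {}
    for dep in files[path]["loads"]:
        # Bring the dependency's bindings into scope before running the body.
        env.update(evaluate(dep, files, cache, counts))
    exec(files[path]["body"], {}, env)  # stand-in for a real interpreter
    if cache is not None:
        cache[path] = env
    return env


def evaluate_all(build_files, files, shared_cache):
    """Evaluate every BUILD file; return how many times each file was run."""
    counts = {}
    cache = {} if shared_cache else None
    for build in build_files:
        evaluate(build, files, cache, counts)
    return counts


builds = ["a/BUILD", "b/BUILD"]
per_file = evaluate_all(builds, FILES, shared_cache=False)  # "#include" style
cached = evaluate_all(builds, FILES, shared_cache=True)     # "module" style
print(per_file["defs.bzl"], cached["defs.bzl"])
```

Without the cache, `defs.bzl` is evaluated once per BUILD file that loads it; with the cache it is evaluated once in total, and the gap widens as the dependency graph grows.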

Python can be hard to sandbox, and we got multiple security reports about exploits.

Interestingly, Facebook's Buck went down the same path. They originally copied the Python preprocessing approach. When we open-sourced Bazel, the Buck team took our Starlark (aka Skylark) interpreter and started the migration. See https://buck.build/concept/skylark.html

laurentlb · on June 17, 2024

Google is trying to standardize on a few configuration languages: GCL (because it was too hard to migrate away from it), Starlark (when BCL is not suitable) and textproto (for things that don't need logic).

It used to be a complete mess, but a plan was written a couple of years ago. So there's hope of improvement.

danpalmer · on June 17, 2024

That's good to hear, but sad that I (as a SWE/SRE) haven't heard of the plan! Maybe I've been living under a rock, but knowing that I should be building new projects with these and not, say, Piccolo, would be nice. Updating codelabs to reflect this where there are multiple options would also be nice. IIRC, the CDP codelab still uses Piccolo, but could probably use GCL fairly easily?

ithkuil · on June 17, 2024

Interesting to see that GCL still lives on.

Does it have a debugger now?

I worked there a decade ago and we were trying out the "gcl2" reimplementation that was based on an actual formal specification of the language. The semantics were subtly different and we couldn't switch our complex config to it quite yet back then. I wonder whether the experiment succeeded, or whether you're back on the implementation-defined language semantics.

Rebelgecko · on June 17, 2024

Plus bcl, ncl, OWNERS, piccolo, SDL, and probably dozens more