aclements's comments

aclements · 2025-10-31T15:37:43

Thanks for this question! We added a couple sentences to the blog post to explain what a page is. In general, a page is a region of memory that has a large-ish fixed power-of-two size and is also aligned to its size. Virtual memory structures memory around pages, which are typically 4 KiB to 64 KiB depending on the hardware. The Go memory manager, and many other memory managers, also structure memory around pages, which may or may not match the hardware page size. In Go, pages are always 8 KiB and aligned to 8 KiB.
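To make "power-of-two size and aligned to its size" concrete, here is a minimal sketch of the address arithmetic that alignment allows, assuming the 8 KiB page size mentioned above. The names `pageShift`, `pageSize`, `pageBase`, `pageOffset`, and `isPageAligned` are hypothetical and chosen for illustration; they are not meant to mirror the Go runtime's internals.

```go
package main

import "fmt"

const (
	pageShift = 13             // assumption: log2 of the 8 KiB page size
	pageSize  = 1 << pageShift // 8192 bytes
)

// pageBase rounds addr down to the start of the page containing it.
// This is a single mask because pageSize is a power of two.
func pageBase(addr uintptr) uintptr { return addr &^ (pageSize - 1) }

// pageOffset returns the byte offset of addr within its page.
func pageOffset(addr uintptr) uintptr { return addr & (pageSize - 1) }

// isPageAligned reports whether addr sits exactly on a page boundary.
func isPageAligned(addr uintptr) bool { return addr&(pageSize-1) == 0 }

func main() {
	addr := uintptr(0x12345)
	fmt.Printf("base=%#x offset=%#x aligned=%v\n",
		pageBase(addr), pageOffset(addr), isPageAligned(addr))
}
```

Because every page starts at a multiple of its own size, finding the page that holds an arbitrary address needs no lookup, only a bit mask.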

aclements · 2025-10-31T15:33:29

I'm a long-time fan of matcha and wrote the initial prototype that demonstrated Green Tea was viable while cafe crawling in Yokohama and drinking lots of matcha. "Matcha" didn't seem like a great name for a garbage collector, but matcha is a form of green tea and "Green Tea GC" rolled off the tongue, so I called my prototype Green Tea and the name stuck.

aclements · on March 6, 2017

Go already performs cross-package inlining, so it can already inline library calls. (This is relatively easy to do in Go compared to other languages because packages must form a DAG. Compiling package A writes out enough information in the object file for A that compiling package B that depends on A can inline calls to functions in A.)

pcwalton · on March 6, 2017

> This is relatively easy to do in Go compared to other languages because packages must form a DAG.

That doesn't make sense to me. The complicated part is storing the IR in packages in a form that can be read back into the compiler later. That's needed to do inlining at all. Once you have that done, doing LTO is trivial: you just slurp your IR for all modules linked together into the compiler and emit a single binary. (You can be fancier, like ThinLTO does, but again, the effort needed to do ThinLTO is independent of whether you have cyclic dependencies or not.)

DannyBee · on March 6, 2017

"Compiling package A writes out enough information in the object file for A that compiling package B that depends on A can inline calls to functions in A"

So it records the calling convention, architecture flags, alignment, and other ABI pieces, etc.? As well as an estimate of instruction-level inlining cost, summary info about arguments, etc., so you can effectively decide whether inlining it will help or hurt, without having the IR around to try?

FWIW: Writing out the info is usually not the hard part, actually, and is unrelated to the DAG-ness of the packages.

GCC is just the perennial example here, but they refused to write it out for years for political reasons, not technical ones :)

aclements · on March 6, 2017

"So it records the calling convention, architecture flags, alignment, and other ABI pieces etc?"

No. At the moment it records the AST in the object file, because the inliner works at the Go AST level. In the future it may instead record the SSA representation (which would obviously give better cost estimates; the current heuristics are really extremely simple).

"FWIW: Writing out the info is usually not the hard part, actually, and is unrelated to the DAG-ness of the packages."

The DAG-ness means it's always available when compiling the call site, even if it's a cross-package call. It means you don't have to do it at link time.
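A minimal sketch of that ordering argument, assuming a made-up three-package import graph (`app`, `lib`, `util` are hypothetical names, and `compileOrder` is an illustrative helper, not how the Go toolchain is actually implemented): with no import cycles, a post-order walk compiles every dependency before its importers, so whatever the dependency exported is already on disk when a call site is compiled.

```go
package main

import "fmt"

// importGraph maps each hypothetical package to the packages it imports.
// It must be acyclic, as Go requires of real import graphs.
var importGraph = map[string][]string{
	"app":  {"lib", "util"},
	"lib":  {"util"},
	"util": {},
}

// compileOrder returns the packages reachable from root in an order where
// every package appears after all of its dependencies (DFS post-order).
func compileOrder(graph map[string][]string, root string) []string {
	var order []string
	visited := map[string]bool{}
	var visit func(pkg string)
	visit = func(pkg string) {
		if visited[pkg] {
			return
		}
		visited[pkg] = true
		for _, dep := range graph[pkg] {
			visit(dep)
		}
		order = append(order, pkg)
	}
	visit(root)
	return order
}

func main() {
	compiled := map[string]bool{}
	for _, pkg := range compileOrder(importGraph, "app") {
		for _, dep := range importGraph[pkg] {
			// In a DAG this is always true: the dependency was compiled first.
			fmt.Printf("compiling %s: export data for %s ready: %v\n", pkg, dep, compiled[dep])
		}
		compiled[pkg] = true
	}
}
```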

DannyBee · on March 6, 2017

"In the future it may instead record the SSA representation (which would obviously give better cost estimates; the current heuristics are really extremely simple)."

This would be identical to what others do then :)

"The DAG-ness means it's always available when compiling the call site, even if it's a cross-package call. It means you don't have to do it at link time. "

This is unrelated to DAG-ness. Unless you mean something else by DAG-ness. DAG-ness means it's a directed-acyclic graph. That is, all other things being equal, it has no cycles.

This is unrelated to the problem.

For example, in other languages, a function could be weakly defined, or overridable in some other form, regardless of whether the dependency graph has cycles, whether the function is actually multiply defined, etc. That requires link-time resolution, because you can't optimize it under an assumption about its callees/callers, or even inline it, and then just hand that version to others: you may have screwed it up in a way one of those other callers depends on.

That is, the overridability is an attribute of the function, not a problem of how it is used.

Ditto on the ABI, alignment, etc.

None of the interesting problems they have to solve are related to packaging. They occur with DAGs or non-DAGs, and are just related to these languages supporting a richer set of things you can do to functions :)

pcwalton · on March 6, 2017

> The DAG-ness means it's always available when compiling the call site, even if it's a cross-package call. It means you don't have to do it at link time.

Why is it any harder to do at link time?

(I've implemented this in a production compiler, and choosing whether to do it at compile time or link time was a trivial decision.)

DannyBee · on March 7, 2017

Traditionally, this required a linker that understands there is IR in the files. In practice, I don't believe this has been a problem for many years now (and again, it was only a problem in the open source world, so saying it's related to the language is kind of strange).

Every good production C++ compiler has had some form of link time optimization for many years.

IBM's, for example, has been happily cross-optimizing between C++, Java, Fortran, PL/IX, etc. without any issues, going on at least 15, maybe 25+ years now (I know it's 15 for sure, I suspect it's closer to 25).

aclements · on March 6, 2017

We plan to expose all of the inlining information in the DWARF tables so debuggers won't have any problems with this. Internally, the runtime uses a different representation just so we can make it more compact and optimized for the runtime's exact needs. This way, you can also strip the debug info without breaking the runtime's own ability to walk stacks.
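As a small illustration of the runtime walking its own stack, here is a sketch using the public `runtime.Callers` and `runtime.CallersFrames` API as it exists in current Go releases; it does not touch DWARF. The functions `leaf`, `middle`, and `isFrame` are hypothetical names for this example. `CallersFrames` is documented to account for inlined functions, so `leaf` and `middle` should still be reported as separate frames even if the compiler inlines them.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// leaf captures the current call stack and returns the function names,
// innermost frame first.
func leaf() []string {
	pcs := make([]uintptr, 16)
	n := runtime.Callers(1, pcs) // skip=1 starts at the caller of Callers, i.e. leaf
	frames := runtime.CallersFrames(pcs[:n])
	var names []string
	for {
		frame, more := frames.Next()
		names = append(names, frame.Function)
		if !more {
			break
		}
	}
	return names
}

// middle is small enough that the compiler may inline it into its caller.
func middle() []string { return leaf() }

// isFrame reports whether a fully qualified function name refers to fn,
// ignoring the package path prefix.
func isFrame(name, fn string) bool { return strings.HasSuffix(name, "."+fn) }

func main() {
	for i, name := range middle() {
		fmt.Printf("%d: %s\n", i, name)
	}
}
```

The same program keeps working on a binary built with debug info stripped (for example with `-ldflags=-w`), which matches the point that the runtime's stack walking does not depend on the DWARF tables.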

CUViper · on March 6, 2017

Isn't that what `.eh_frame` is for?

mnemonik · on March 7, 2017

.eh_frame is DWARF with a couple tiny tweaks

CUViper · on March 7, 2017

Right, but it's an allocated section that doesn't get stripped like debuginfo.