
Tensor diagrams are standard, but some notation is missing. My goal was to be able to handle the entire Matrix Cookbook.

For this I needed a good notation for functions applied to specific dimensions and broadcast over the rest, like softmax in a transformer.
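
For concreteness, here is a minimal numpy sketch (my own illustration, not notation from the book) of what "apply along one dimension, broadcast over the rest" means for softmax:

    import numpy as np

    def softmax(x, axis=-1):
        # Subtract the max along the chosen axis for numerical stability;
        # keepdims=True lets the result broadcast back against x.
        z = x - x.max(axis=axis, keepdims=True)
        e = np.exp(z)
        return e / e.sum(axis=axis, keepdims=True)

    # Attention scores: (batch, head, query, key). Softmax is applied
    # along the key axis and broadcast over the other three.
    scores = np.random.randn(2, 4, 8, 8)
    attn = softmax(scores, axis=-1)
    assert np.allclose(attn.sum(axis=-1), 1.0)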

The function chapter of the book is still under development, though. So if you have any good references for how this has been done graphically in the past that I might have missed, feel free to share them.



You can do broadcasting with a tensor, at least for products and sums. The product is multilinear, and a sum can be done in two steps, the first using a tensor to implement fanout. Though I can see the value in representing structure that can be exploited more efficiently, versus just another box for a tensor. Anything beyond that (softmax?) seems kind of awkward, since you're outside the domain of your "domain-specific language". I don't see why it's needed to extend the Matrix Cookbook to tensor diagrams.
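
To make the two-step sum concrete, here is a small numpy sketch (my construction, using an all-ones vector as the fanout tensor):

    import numpy as np

    m, n = 3, 4
    A = np.random.randn(m, n)
    b = np.random.randn(n)

    # Step 1: fanout. The all-ones vector copies b across the row
    # dimension, materializing the broadcast as an outer product.
    b_bcast = np.einsum('i,j->ij', np.ones(m), b)

    # Step 2: an ordinary elementwise tensor sum.
    C = A + b_bcast

    assert np.allclose(C, A + b)  # matches numpy's implicit broadcasting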


I come back to this every few months and do some work trying to make sense of how tensors are used in machine learning. Tensors as used in physics, whose notation these tools inherit, are there for coordinate transforms and nothing else.
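
For instance (a toy numpy check of my own, not anything from these tools), the point of the physics transformation laws is that invariants survive a change of basis:

    import numpy as np

    J = np.random.randn(3, 3)   # change-of-basis matrix (assumed invertible)
    Jinv = np.linalg.inv(J)
    v = np.random.randn(3)      # contravariant components: transform with J
    g = np.random.randn(3, 3)
    g = g + g.T                 # covariant rank-2 tensor (e.g. a metric)

    v_new = J @ v
    # Each covariant index transforms with the inverse: g' = Jinv^T g Jinv
    g_new = np.einsum('ia,jb,ij->ab', Jinv, Jinv, g)

    # The scalar g_ij v^i v^j is coordinate-independent.
    assert np.isclose(v_new @ g_new @ v_new, v @ g @ v)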

Tensors, as used in ML, are much closer to a key-value store with composite keys and scalar values, with most of the complexity coming from deciding how to filter on those composite keys.
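
A toy version of that view (my sketch, not a real implementation):

    import numpy as np

    T = np.random.randn(2, 3, 4)

    # The tensor as a key-value store: composite key = index tuple.
    store = {(i, j, k): T[i, j, k]
             for i in range(2) for j in range(3) for k in range(4)}

    # Slicing T[:, 1, :] is just filtering on the middle key component.
    sliced = {key: val for key, val in store.items() if key[1] == 1}
    assert np.allclose(sorted(sliced.values()), np.sort(T[:, 1, :].ravel()))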

Drop me a line if you're interested in a chat. This is something I've been thinking about for years now.



