
10/10, you're doing god's work, my friend. I can't wait to spend some time this weekend trying to understand what's going on here. I can't overstate how much I value small libraries. I can't think of a faster way to learn about a concept than to step through someone else's barebones implementation.


Thanks! I agree that the project has educational value. It helped me get a better understanding of the neural network layers involved in the transformer model. It was also a good playground for practicing low-level optimization techniques. Another cool thing was that, with the help of the community, we came up with a faster way to evaluate the Encoder (at the cost of some accuracy), which ultimately enabled the WASM and RPi4 examples (see #137 if you're interested in the discussion).


I liked reading the different implementations of the low-level tensor ops (simple C / AVX / AVX2 / WASM 128-bit / ARM NEON) -- it will help me learn how to use x86 ASM. Thank you for writing this! Do you have any other recommendations or examples of how numerical code can be optimized with SIMD routines?


I don't have other recommendations, as I am a novice myself when it comes to SIMD. I think the multiplication routines in `whisper.cpp` are relatively basic: dot product and fused multiply-add. After some trial and error I came up with these implementations, but I'm not sure they are optimal.
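
For anyone following along, here is a rough sketch of what those two kinds of routine (dot product and fused multiply-add) can look like in C. This is not the code from `whisper.cpp`. The names `dot_f32` and `mad_f32` are made up for illustration, and the AVX2/FMA path is only compiled when the compiler defines `__AVX2__` and `__FMA__` (for example with `-mavx2 -mfma` on GCC/Clang); otherwise the plain C loop does all the work.

```c
#include <stddef.h>
#if defined(__AVX2__) && defined(__FMA__)
#include <immintrin.h>
#endif

/* Hypothetical helper (not from whisper.cpp): returns sum of x[i] * y[i]. */
static float dot_f32(const float *x, const float *y, size_t n) {
    float sum = 0.0f;
    size_t i = 0;
#if defined(__AVX2__) && defined(__FMA__)
    /* Process 8 floats per iteration, accumulating in a 256-bit register. */
    __m256 acc = _mm256_setzero_ps();
    for (; i + 8 <= n; i += 8) {
        __m256 vx = _mm256_loadu_ps(x + i);
        __m256 vy = _mm256_loadu_ps(y + i);
        acc = _mm256_fmadd_ps(vx, vy, acc); /* acc = vx * vy + acc */
    }
    /* Horizontal reduction: spill the 8 lanes and add them up. */
    float lanes[8];
    _mm256_storeu_ps(lanes, acc);
    for (int k = 0; k < 8; ++k) {
        sum += lanes[k];
    }
#endif
    /* Scalar tail (or the whole array when SIMD is unavailable). */
    for (; i < n; ++i) {
        sum += x[i] * y[i];
    }
    return sum;
}

/* Hypothetical helper (not from whisper.cpp): y[i] += a * x[i]. */
static void mad_f32(float *y, const float *x, float a, size_t n) {
    size_t i = 0;
#if defined(__AVX2__) && defined(__FMA__)
    __m256 va = _mm256_set1_ps(a); /* broadcast the scalar to all 8 lanes */
    for (; i + 8 <= n; i += 8) {
        __m256 vx = _mm256_loadu_ps(x + i);
        __m256 vy = _mm256_loadu_ps(y + i);
        _mm256_storeu_ps(y + i, _mm256_fmadd_ps(va, vx, vy));
    }
#endif
    for (; i < n; ++i) {
        y[i] += a * x[i];
    }
}
```

The unaligned loads (`_mm256_loadu_ps`) keep the sketch safe for arbitrary pointers. Note that the SIMD path sums in a different order than the scalar loop, so results can differ slightly in the last bits for general floating-point inputs.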



