bkitano19's comments

You can use voice prompting; it's supported on ElevenLabs and Hume.

Awesome post!


Indeed, the title undersells it, and I'm glad I didn't skip over it: the article is basically an information-dense but approachable summary of audio generation.




Related work:

Interpreting Modular Addition in MLPs https://www.lesswrong.com/posts/cbDEjnRheYn38Dpc5/interpreti...

Paper Replication Walkthrough: Reverse-Engineering Modular Addition https://www.neelnanda.io/mechanistic-interpretability/modula...


And more recently, Language Models Use Trigonometry to Do Addition https://arxiv.org/abs/2502.00873


hume.ai specializes in expressive prosody for TTS (disclaimer - I work here)


Time to first token is just as important to know for many use cases, yet people rarely report it.
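For anyone who wants to report it, here is a minimal sketch of measuring time to first token on any streaming iterator. The names `measure_stream` and `fake_stream` are made up for illustration, and `fake_stream` only simulates latency with sleeps; swap in whatever streaming client you actually use.

```python
import time
from typing import Iterable, Iterator, Tuple


def measure_stream(chunks: Iterable[str]) -> Tuple[float, float, int]:
    """Consume a stream and return (time_to_first_chunk, total_time, n_chunks).

    Times are in seconds, measured from the moment iteration starts.
    """
    start = time.perf_counter()
    first = None
    count = 0
    for _ in chunks:
        if first is None:
            # Latency until the first chunk arrives.
            first = time.perf_counter() - start
        count += 1
    total = time.perf_counter() - start
    # An empty stream has no first chunk; fall back to the total time.
    return (first if first is not None else total, total, count)


def fake_stream(delay_first: float = 0.05, delay_rest: float = 0.01, n: int = 5) -> Iterator[str]:
    """Hypothetical stand-in for a streaming API: slow first chunk, faster rest."""
    time.sleep(delay_first)
    for i in range(n):
        if i:
            time.sleep(delay_rest)
        yield f"chunk{i}"


if __name__ == "__main__":
    ttft, total, n = measure_stream(fake_stream())
    print(f"time to first chunk: {ttft:.3f}s, total: {total:.3f}s, chunks: {n}")
```

Reporting both numbers matters because a stream can have a low total time and still feel slow if the first chunk arrives late.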



this is nuts


We think so too, big things coming :)


www.juicelabs.co


+1, had the fortune to work with him at a previous startup and meet up in person. Our convo very much broadened my perspective on engineering as a career and a craft, and I'm always excited to see what he's working on. Good luck Simon!


https://transformer-circuits.pub/2022/in-context-learning-an...

There is a lot of evidence to suggest that they are performing induction.
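To make "induction" concrete, here is a toy sketch of the pattern as I understand it from the linked article: given a sequence ending in a token seen before, copy whatever followed its most recent earlier occurrence ([A][B] ... [A] -> [B]). The function `induction_predict` is a made-up name, and this is plain Python over a token list, not a claim about how a transformer implements it internally.

```python
from typing import Optional, Sequence


def induction_predict(tokens: Sequence[str]) -> Optional[str]:
    """Predict the next token by copying what followed the last earlier
    occurrence of the current (final) token. Returns None if there is none."""
    if not tokens:
        return None
    current = tokens[-1]
    # Scan backwards over earlier positions, most recent first.
    for i in range(len(tokens) - 2, -1, -1):
        if tokens[i] == current:
            return tokens[i + 1]
    return None


if __name__ == "__main__":
    print(induction_predict(["A", "B", "C", "A"]))
```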

