I vouch for this. Pretty solid and keeps improving. The OP is in the class of Magic Wizards of programming like Fabrice Bellard!
There are frequent updates and performance improvements. There is also a small community of active users around this.
Almost all feedback gets implemented, and the OP is very responsive.
The OP made it possible to do state-of-the-art voice recognition without the PyTorch baggage and in C/C++, pretty incredible! It's one of those rare high-value projects.
Very grateful for this project and respect to the OP!
Someday, if an open version of ChatGPT becomes available, this could mean voice assistants that talk sense and understand the human, as long as you have a beefy machine.
The current efficiency is pretty surprising: even on a low-spec device it performs faster than real time.
I don't know what to say. But I'm blown away.
I expect to see more magic from the OP in the future.
He even has a project for a cool sound modem that works over ultrasound! Not new stuff, but the implementation is the most robust I have seen.
I recommend that hackers here check out his other project too and maybe contribute with testing, patches, and so on!
Yup, this is so magical. I've always felt there was something off about requiring what is essentially a PyTorch/ML dev environment to be set up every time end users "just" want to run inference.
A single binary that does all this w/o the Python stack is just incredible!
Ran it and everything worked! Amazing. Note that only the large (3GB) whisper-v2 is available at the moment, but I haven't seen any errors yet from the older small ones. Wild.