Currently it's OpenHermes2-Mistral-7B (via exllama2) for the LLM, OpenAI Whisper (via faster_whisper) for speech-to-text, and StyleTTS2 (which uses HF Transformers) for speech synthesis. All PyTorch-based.
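For the speech-to-text leg, the faster_whisper piece is roughly this shape (a minimal sketch; the model size, device settings, and file name are placeholders, not my exact setup):

    from faster_whisper import WhisperModel

    # Load a CTranslate2-converted Whisper model. "large-v2" on CUDA with
    # float16 is just an assumption; smaller models run fine on CPU with int8.
    model = WhisperModel("large-v2", device="cuda", compute_type="float16")

    # Transcribe a recorded mic snippet (path is hypothetical).
    # segments is a generator, so iterate it to collect the text.
    segments, info = model.transcribe("mic_capture.wav", beam_size=5)
    text = " ".join(seg.text.strip() for seg in segments)
    print(info.language, text)

The transcribed text then gets fed to the LLM, and the reply goes to the TTS.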
I will probably update to the OpenHermes vision model when Nous Research releases it, so it'll be able to see with the webcam or even read your screen and chat about what you're working on! I also need to update to Whisper-v3 or Distil-Whisper and to a newer StyleTTS2, and I plan to add a Mandarin TTS and a Qwen-7B bilingual LLM for a bilingual chatbot. The amount of movement in open AI (not to be confused with OpenAI) is difficult to keep up with.
Of course I need to add better attribution for all this stuff, and a whole lot of other things, like a basic UI. Very much an MVP at the moment.
This is why I think the personal Jarvis for everyday use won't end up in the cloud. It can already be done on local hardware, as you're demonstrating, and the cloud has big downsides around privacy, latency, and reliability.
Like you said, it's difficult to keep up with, and to me it feels very much like open-source stuff might win for inference.
And yet, of all things, word processing and spreadsheets are moving to the cloud, and even coding.
I'm not sure the big players won't push heavily (as in, not release their best models) for the fat subscriptions and data gathering in the cloud, even if I'd much rather see local (as in cloud-at-home) computing.
I understand what you mean. It makes me wonder if there's room for a solution for those who want to own their own hardware and data. Almost like other appliances and equipment that initially cost too much for household ownership; maybe having a "brain" in your house will be a luxury appliance, for example. As a Crestron system owner I would love to plug Jarvis into my smart home somehow.
Maybe it could be a hardware-with-markup model, plus support and consulting. That way there could be many competitors in various regions or countries, all using the same collection of open-source tools. That would be pretty neat. Unlikely, I guess, but still worth thinking about how it could work.