Ask HN: What are you building LLM/RAG chatbots with
7 points by petervandijck on March 19, 2024 | hide | past | favorite | 9 comments
LangChain? Cohere? LLamaIndex? DIY?

Are you finding specific pros/cons with the ones that try to be a platform? As an example, we've found LangSmith's integration with LangChain super useful, even though LangChain itself has its pros and cons.



I'm mainly hacking around with my LLM CLI tool, experimenting with different combinations of embedding models and LLMs: https://til.simonwillison.net/llms/embed-paragraphs#user-con...

I really need to add a web interface to that so it's a bit more accessible to people who don't live in the terminal!
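The embed-and-retrieve loop behind this kind of experiment can be sketched in plain Python. This is a toy stand-in: the bag-of-words "embedding" below is just for illustration, and a real setup would call an actual embedding model (e.g. via the `llm` CLI or an API) instead.

```python
from collections import Counter
import math

def embed(text):
    # Toy bag-of-words "embedding" -- swap in a real embedding model here.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def most_similar(query, paragraphs):
    # Return the stored paragraph closest to the query -- the
    # retrieval step that feeds context into a RAG prompt.
    q = embed(query)
    return max(paragraphs, key=lambda p: cosine(q, embed(p)))
```

The retrieved paragraph would then be pasted into the LLM prompt as context, which is the whole trick of RAG.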


Simon! Still loving your blog posts about this stuff. Thank you for doing that.

Agreed that not everyone lives in the terminal, but you know.




I'm taking a DIY approach to RAG/function calling for a work tool. We're looking for data sovereignty, so we're probably going to self-host. To that end, I'm using Ollama to serve some models. If you want to go DIY, I'd highly recommend NexusRaven as your function-calling model.

No promises, but I'm hopeful we can open-source our work eventually.
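The core of DIY function calling is a dispatch step: the model emits a structured tool call, and your code routes it to a real Python function. A minimal sketch, assuming the model was prompted to reply with JSON of the form `{"name": ..., "arguments": {...}}` (the tool name and schema here are made up; NexusRaven itself emits Python-style call strings, so you'd adapt the parser to your model):

```python
import json

# Whitelist of tools the model is allowed to call (illustrative only).
TOOLS = {
    "get_ticket": lambda ticket_id: {"id": ticket_id, "status": "open"},
}

def dispatch(model_output: str):
    # Parse the model's JSON tool call and execute the matching function.
    call = json.loads(model_output)
    fn = TOOLS[call["name"]]  # KeyError here means a hallucinated tool
    return fn(**call["arguments"])
```

The result of `dispatch` is then fed back to the model as the tool's output, closing the loop.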


+1. I posted the exact same question in the NexusRaven Discord yesterday, asking for an example or quick start with Ollama. Before that, I tried to hack on their NexusRaven pip client, which uses a TGI inference endpoint, and on non-langchain.py from their evaluation repo, which uses the TGI pipeline. Both attempts failed.


Why NexusRaven specifically? What has your experience been?


In my testing it seems good at function calling, including nested calls, even compared to GPT-4, since OpenAI's function definition format doesn't let you specify a return value's name and type. With Ollama it's quantized and can run on a laptop GPU. There are other options, like Functionary and fireworks.ai's function-calling models on Hugging Face, but they aren't quantized, so I couldn't test them.
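A nested call from a model like this arrives as a string such as `get_weather(get_location("1.2.3.4"))`. One hedged way to execute it safely (the tool names and data below are invented for illustration) is to parse it with `ast` and walk the tree, only running whitelisted functions:

```python
import ast

# Whitelisted tools (toy implementations for the example).
TOOLS = {
    "get_location": lambda ip: "Amsterdam" if ip == "1.2.3.4" else "unknown",
    "get_weather": lambda city: {"Amsterdam": "rain"}.get(city, "n/a"),
}

def run_call(expr: str):
    # Evaluate a (possibly nested) call string against the whitelist,
    # without ever touching eval().
    def ev(node):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            fn = TOOLS[node.func.id]  # unknown names raise KeyError
            return fn(*[ev(a) for a in node.args])
        if isinstance(node, ast.Constant):
            return node.value
        raise ValueError("disallowed expression")
    return ev(ast.parse(expr, mode="eval").body)
```

Inner calls are evaluated first, so their results feed the outer call, which is exactly the nested-call behavior being tested for.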


I used LangChain and models hosted on Ollama for my latest project [1]. Since I have a GPU and Ollama is now available for Windows, I can build LLM-based applications quickly with local debugging.

[1] https://github.com/bovem/chat-with-doc
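A "chat with your doc" pipeline like this starts by splitting the document into overlapping chunks before embedding them. A minimal sketch of that chunking step (the size and overlap values are illustrative, not what the linked project uses):

```python
def chunk(text: str, size: int = 400, overlap: int = 50):
    # Split a document into overlapping character windows so that
    # context straddling a chunk boundary isn't lost at retrieval time.
    step = size - overlap
    return [text[i:i + size]
            for i in range(0, max(len(text) - overlap, 1), step)]
```

Each chunk is then embedded and stored; at question time the closest chunks are retrieved and stuffed into the prompt.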



