Hacker News | anupsurendran's comments

Some CTA stats are included, based on research with 12 SaaS companies.


Does Pathway's ability to use multithreading improve processing when you deal with multiple Kafka topics?


I had the same question, Jacknews. I couldn't find the person who is running this at Vercel to ask. Maybe we should just tweet?


I don't think it is made by Vercel; they are just hosting it on Vercel. There is a link to the developer's Twitter and GitHub at the top of the page.


Historically, yes: uppercase letters came first. Even when lowercase letters showed up, the key combination needed to type them was harder, so they took more time to catch up.


Every side has a story to tell. A story without emotion is not worth telling.


"Story"? It's not a fucking story; that's the whole point of the discussion.


This is a Google Colab notebook (it works in Jupyter too). The notebook can connect to live data sources (e.g. Kafka) so you can do analysis. You could also do a CSV replay if you have timestamped data entries.


100% vthallam. The upside for OpenAI is much higher.


... unless Sam Altman, the major research leaders, and 50%+ of the employees had transferred to Microsoft.


Does this only work with Jupyter? Or does it also work with Google Colab?


Jan, can you explain briefly how the deduplicator checks if the new answer is significantly different? Is there code in the repository we can take a look at?


Sure: when a new response is produced because some source documents have changed, we ask an LLM to compare the responses and tell us whether they are significantly different. Even a simplistic prompt, like the one used in the example, will do:

    Are the two following responses deviating?
    Answer with Yes or No.

    First response: "{old}"

    Second response: "{new}"
(used in https://github.com/pathwaycom/llm-app/blob/69709a2cf58cdf6ea...)
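The flow described above can be sketched as a small helper. This is a hypothetical sketch, not the code from the linked repo: the prompt wording follows the example shown, while `ask_llm` stands in for whatever LLM client you actually use.

```python
def build_dedup_prompt(old: str, new: str) -> str:
    """Build the comparison prompt shown in the example above."""
    return (
        "Are the two following responses deviating?\n"
        "Answer with Yes or No.\n\n"
        f'First response: "{old}"\n\n'
        f'Second response: "{new}"'
    )


def is_significantly_different(old: str, new: str, ask_llm) -> bool:
    """ask_llm is any callable that sends a prompt to an LLM and
    returns its text reply (hypothetical; plug in your own client)."""
    reply = ask_llm(build_dedup_prompt(old, new))
    # Treat any reply starting with "Yes" as a significant change.
    return reply.strip().lower().startswith("yes")
```

With this shape, only responses the LLM judges as "Yes" would be forwarded downstream.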


Couldn't you just compare the similarity of the embeddings? I imagine that would work in the vast majority of cases and save a lot of LLM calls.


That's a good idea. The deduplication criterion is easy to change; using an LLM is faster to get started, but after a while a corpus of decisions accumulates and can be used either to select another mechanism or, e.g., to train one on top of BERT embeddings.
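The embedding-based criterion suggested here could look like the following. A minimal sketch, assuming you already have embedding vectors for both answers; the 0.9 threshold is an arbitrary illustration, not a recommended value.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def answers_differ(emb_old: list[float], emb_new: list[float],
                   threshold: float = 0.9) -> bool:
    """Flag the new answer as different when its embedding drifts
    below the similarity threshold (value chosen for illustration)."""
    return cosine_similarity(emb_old, emb_new) < threshold
```

This trades one cheap embedding call per answer for the per-comparison LLM call, at the cost of tuning the threshold.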


I feel that there are too many moving pieces here, especially for prototyping. There was a much simpler app I took a look at in a recent Hacker News post: https://news.ycombinator.com/item?id=36894142

They still have work to do on different connectors (e.g. PDF), but the simple realtime document pipeline is what helps a lot.

