Akula112233's comments | Hacker News

Our free tier doesn't include anomaly/error detection (noted on the site, though we can make that clearer). And your numbers do add up! That's exactly why you can't just run all your logs through an LLM.

Aggregated (and simplified) versions of your logs, plus the flagged anomalies, are what get passed through our LLMs.


I assume this is mostly in regard to our anomaly/error detection! Deterministic rules for flagging anomalies, plus human feedback, help us adapt the flagging system accordingly, so hallucinations won't directly impact flagged anomalies. The rules (patterns) we generate are on the stricter end, so they err on the side of flagging more.

However, the rules themselves aren't deterministically generated (and are therefore prone to LLM hallucinations). To address this, we currently have a simpler system that lets you mark incorrectly flagged anomalies so they can be folded back into our generated rules. There's plenty of room to improve, which we're actively working on: exposing our generated patterns in a human-digestible form (so they can be corrected), introducing metrics and more data sources for context, and connecting with a codebase.
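To give a rough feel for the shape of this (a simplified sketch, not our actual implementation; the rule patterns, service names, and feedback format below are made up):

    import re

    # Simplified sketch: deterministic regex rules flag lines, and human
    # feedback ("Not an error") feeds a suppression list that mutes a rule
    # for a given service. Everything here is hypothetical.
    RULES = {
        "db-timeout": re.compile(r"timed out after \d+ms", re.IGNORECASE),
        "traceback": re.compile(r"Traceback \(most recent call last\)"),
    }

    # Populated from "Not an error" clicks in the UI (hypothetical format).
    SUPPRESSED = {("db-timeout", "health-checker")}

    def flag(log_line: str, service: str) -> list[str]:
        """Return rule IDs that match this line, minus human-suppressed ones."""
        hits = [rid for rid, pat in RULES.items() if pat.search(log_line)]
        return [rid for rid in hits if (rid, service) not in SUPPRESSED]

    print(flag("request timed out after 5003ms", "checkout"))        # ['db-timeout']
    print(flag("request timed out after 5003ms", "health-checker"))  # [] (suppressed)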


We offer competing core functionality in terms of storage and search, but we’re also focused on intelligence: real-time anomaly detection, semantic log analysis, and natural language search.

Would recommend the demo video and playground environment we linked above! Feel free to reach out at founders@runsift.com if you’d like to learn more


Yes, we support OpenTelemetry ingestion! Also Datadog, Splunk, and various other vendors’ agents/forwarders - even custom HTTP Daemons.

If you’ve already set up logging, there's a good chance you can just point your existing instrumentation at us and we'll know how to ingest and handle it.
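For the custom-HTTP path, the gist is just batching lines and POSTing them (the endpoint, header, and payload shape here are placeholders, not our documented API):

    import json
    import time
    import requests  # third-party: pip install requests

    # Placeholder endpoint and auth header, purely for illustration.
    INGEST_URL = "https://ingest.example.com/v1/logs"
    API_KEY = "YOUR_API_KEY"

    def ship_logs(lines):
        """Forward a small batch of log lines as a JSON payload over HTTP."""
        payload = {
            "service": "checkout",
            "sent_at": time.time(),
            "logs": [{"message": line} for line in lines],
        }
        resp = requests.post(
            INGEST_URL,
            data=json.dumps(payload),
            headers={"Content-Type": "application/json", "x-api-key": API_KEY},
            timeout=5,
        )
        resp.raise_for_status()

    ship_logs(["GET /cart 200 12ms", "payment retry scheduled"])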


Ah, any particular reason to want these SDKs public? Happy to, especially since you can see source on install anyway. Just curious!

And kudos to SigNoz as well - we'll have to check out the other folks in the space :)


My initial concern was what transitive deps it was pulling in, but the other answer to your question is the thing that most GH repos are good for: submitting bugs and submitting fixes.

It is also good for finding out what the buffering story is, because I would want to know whether I'm dragging an unbounded queue into my app (putting memory pressure on me) or whether your service returning 503s is going to eat logs. That's the kind of thing only reading the source will tell you for sure, because the docs don't even hint at such operational concerns.
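To be concrete about what I'd look for in the source (stdlib-only Python, purely illustrative): a bounded queue between the app and the sender, so a slow or 503-ing backend drops logs instead of eating memory.

    import logging
    import logging.handlers
    import queue

    # Bounded queue: if the backend is slow or erroring, the queue fills up
    # and new records get dropped instead of growing memory without limit.
    log_queue = queue.Queue(maxsize=10_000)
    sender = logging.StreamHandler()  # stand-in for the SDK's real HTTP sender

    listener = logging.handlers.QueueListener(log_queue, sender)
    listener.start()

    logger = logging.getLogger("app")
    logger.setLevel(logging.INFO)
    logger.addHandler(logging.handlers.QueueHandler(log_queue))

    logger.info("checkout completed")  # never blocks the request path
    listener.stop()                    # flush and stop on shutdown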

Anyway, the only reason I mentioned the dead link is that your PyPI page linked to GH in the first place. So if you don't intend for people to think there's supposed to be a repo, I'd suggest removing the repo link.


Noted, thank you! Will make some changes accordingly.


> Can it run completely on prem ?

Yep, we have an on-prem offering as well; we've gotten similar notes from folks before!

> What stops me from building my own logger that sends a request to write a record to a DB and later asks an LLM what it means ?

Great question! The main limitation of the brute-force approach is the sheer volume of noise, which drowns out the relevant context. We tried this and realized it wasn't working. From a numbers perspective, even at just tens of GBs/day (not close to enterprise scale), mainstream LLMs can't provide the context windows you'd need to cover more than a few minutes of operational data. And larger models suffer from other factors (like attention diffusion/dilution and drift).
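As a back-of-envelope check (assuming roughly 4 bytes per token and a 128k-token window; both are ballpark assumptions):

    # How much wall-clock time fits in one context window at 10 GB/day of logs?
    bytes_per_day = 10e9
    bytes_per_token = 4        # rough assumption
    context_tokens = 128_000   # rough assumption

    tokens_per_second = bytes_per_day / bytes_per_token / 86_400  # ~29k tokens/s
    seconds_per_window = context_tokens / tokens_per_second       # ~4.4 seconds

    print(f"~{seconds_per_window:.1f} seconds of logs per window")

So even a generous window covers seconds, not minutes, of raw logs at that volume.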

> I see the landing page. The pricing should be clear though “Contact Us” is scary.

Noted!


Thanks!

I hope my tone wasn’t too brash.

If you can update the pricing I might be able to pitch this to my org later this year. We’d definitely like an on prem solution though!


Yep, but it's sometimes a compromise people may be unwilling to make. Too often I hear (and have seen via DD customers) horror stories about initiatives to fix observability getting squashed by teams in the hopes of shipping faster.

Moving fast has its downsides, and I can't say I blame people for deprioritizing good logging practices. But it does come back to bite...

Though as a caveat, you don't always have control over your logs -- especially with third-party services, large but fragmented engineering organizations, etc. Even with great internal practices, there's always something.

On another note, access to the codebase + live logs gives room to develop better auto-instrumentation tooling. Though perhaps Cursor could do a decent enough job of getting folks started.


> I don't understand, what about that is a "silent failure"?

Silent failures can be "allowed" behavior in your application: things that aren't actually labeled as errors but are still irregular. Think race conditions, deadlocks, silent timeouts, or even just mislabeled error logs.

> in order for your product to even know about it, wouldn't I need to write a log message for every single record update?

That's right, and this may not always be feasible (or necessary!), but if your application can be impacted by errors like these, it may be worth logging anyway.
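For instance, a structured update log can be as small as this (the field names are purely illustrative):

    import json
    import logging
    import time

    logging.basicConfig(level=logging.INFO)
    logger = logging.getLogger("orders")

    def log_row_update(service: str, table: str, row_id: int) -> None:
        # One structured line per write; with fields like these, a rule such as
        # "same row updated by two services within 50ms" becomes checkable later.
        logger.info(json.dumps({
            "event": "row_update",
            "service": service,
            "table": table,
            "row_id": row_id,
            "ts_ms": int(time.time() * 1000),
        }))

    log_row_update("checkout", "orders", 8812)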

> the general question I have with any product that's marketing itself as being "AI-powered" - how do hallucinations get resolved?

> and if my architecture allows two microservices to update the same row in the same database...maybe it happening within 50ms is expected?

> if I ask your product "what caused such-and-such outage" and the answer that comes back is incorrect, how do I "teach" it the correct answer?

For these concerns, human-in-the-loop feedback is our preliminary approach! We have our own feedback mechanisms running internally to account for changes and false errors, but explanations from human input (even something as simple as "Not an error" or "Missed error" buttons) are very helpful.

> when that happens I can walk through their thought process and analysis chain with them and identify the gap that led them to the incorrect conclusion. often this is a useful signal that our system documentation needs to be updated, or log messages need to be clarified, or a dashboard should include a different metric, etc etc.

Got it; I imagine it'll be very helpful for us to display our chain of thought in our dashboards too. Great feedback, thank you!


> Think race conditions, deadlocks, silent timeouts, or even just mislabeled error logs.

I agree that those are bad things.

but how does your product help me with them?

I have some code that has a deadlock. are you suggesting that I can find the deadlock by shipping my logs to a 3rd-party service that will feed them into an LLM?


Agreed! Metrics are a high priority, especially since we're working to increase the available context around each anomaly we flag.

Logs were a natural starting point because that's where developers often spend a significant amount of time: stuck reading & searching for the right information, manually tracking down issues, and jumping between logs across services. In a way, just finding and summarizing the relevant logs for the user made debugging easier.

But metrics will introduce more dimensions to establish baseline behavior, so we're pretty excited about it too.


I tend to use logs the least when debugging production issues. I realize that's a personal anecdote, so I see your point.


Very relatable experience with log diving; it feels very much like a needle-in-a-haystack problem that gets so much harder when you're not the only one who contributed to the source of the errors (often the case).

As for the skepticism about LLMs stumbling around raw logs: it's super deserved. Even the developers who wrote the program often need broader app context when debugging, so it's not as easy as throwing a bunch of logs into an LLM. Plus, context window limits and the relative lack of "understanding" as contexts grow larger are troublesome.

We found it helped a lot to profile application logs over time. Think aggregation, but for individual flows rather than similar logs. By grouping and ordering logs into flows, we bring the context of thousands of (repetitive) logs down to the core flows, which makes it much easier to spot when things are out of the ordinary.
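A toy version of the idea (grouping by request ID and reducing each request to an ordered sequence of log templates; the real pipeline is more involved than this):

    import re
    from collections import Counter, defaultdict

    # Toy flow profiling: group lines by request ID, normalize each line into
    # a template, and count how common each ordered flow signature is. Rare
    # signatures are candidates for "out of the ordinary".
    LINES = [
        ("req-1", "GET /cart user=42"),
        ("req-1", "cache hit key=cart:42"),
        ("req-1", "200 OK in 12ms"),
        ("req-2", "GET /cart user=99"),
        ("req-2", "cache miss key=cart:99"),
        ("req-2", "db query took 900ms"),
        ("req-2", "200 OK in 954ms"),
    ]

    def template(line: str) -> str:
        return re.sub(r"\d+", "<N>", line)  # collapse numbers so similar lines group

    flows = defaultdict(list)
    for req_id, line in LINES:
        flows[req_id].append(template(line))

    signatures = Counter(tuple(steps) for steps in flows.values())
    for sig, count in signatures.items():
        print(count, "x", " -> ".join(sig))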

There's still a lot to improve with regard to false positives and variations in application flows.

