doorhammer's comments

I'm wildly out of my depth here, but sometimes I find I learn quickly if I try out my intuition publicly and fail spectacularly :)

> "This is necessary for optimization but can lead to invalid programs."

Is this not the case? It feels right in my head, but I assume I'm missing something.

My understanding:

- Backtracking gets used to find other possible solutions
- Cut stops backtracking early, which means you might miss valid solutions
- Cut is often useful to prune search branches you know are a waste of time but Prolog doesn't know that
- But if you're wrong, you might cut a branch with solutions you would have wanted, and if Prolog then enumerates all the other solutions, I guess you could say it's produced an invalid solution/program? (Tiny toy example of what I mean below.)
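
Something like this, concretely (completely made-up predicates, so the example itself might be off):

  likes(alice, coffee).
  likes(alice, tea).
  likes(alice, cocoa).

  first_like(X) :- likes(alice, X), !.

  % ?- likes(alice, X).  gives X = coffee ; X = tea ; X = cocoa.
  % ?- first_like(X).    gives only X = coffee, because the cut throws
  %                      away the remaining choice points.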

Again, please be gentle. This sounded reasonable to me and I'm trying to understand why it wouldn't be. It's totally possible it only feels reasonable because it's a common misconception I've picked up elsewhere. My understanding of how Prolog actually works under the hood is very patchy.


>> Cut stops backtracking early which means you might miss valid solutions

That's right, but missing valid solutions doesn't mean that your program is "invalid", whatever that means. The author doesn't say.

Cuts are difficult and dangerous. The danger is that they make your program behave in unexpected ways. Then again, Prolog programs behave in unexpected ways even without the cut, and once you understand why, you can use the cut to make them behave.

In my experience, when one begins to program in Prolog, they pepper their code with cuts to try and stop unwanted backtracking, which can often be avoided by understanding why Prolog is backtracking in the first place. But that's a hard thing to get one's head around, so everyone who starts out makes a mess of their code with the cut.

There are very legitimate and safe ways to use cuts. Prolog textbooks sometimes introduce a terminology of "red" and "green" cuts. Red cuts change the set of answers found by a query, green cuts don't. And that, in itself, is already hard enough to get one's head around.
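
A tiny illustration, using the usual textbook max/3 example rather than anything specific, so treat it as a sketch:

  % Green cut: both clauses test the condition, so the cut only trims a
  % redundant choice point; the set of answers doesn't change.
  max_green(X, Y, X) :- X >= Y, !.
  max_green(X, Y, Y) :- X < Y.

  % Red cut: the second clause has no test, so the cut is doing real work;
  % remove it and max_red(5, 3, M) also gives the wrong answer M = 3.
  max_red(X, Y, X) :- X >= Y, !.
  max_red(_, Y, Y).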

At first, don't use the cut until you know what you're doing: that, I think, is the best advice to give to beginner Prolog programmers. And to advanced ones sometimes. I've seen things...


> In my experience, when one begins to program in Prolog, they pepper their code with cuts to try and stop unwanted backtracking, which can often be avoided by understanding why Prolog is backtracking in the first place.

This gets to the heart of my problem with Prolog: it's sold as if it's logic programming - just write your first-order predicate logic and we'll solve it. But then to actually use it you have to understand how it's executed - "understanding why Prolog is backtracking in the first place".

At that point, I would just prefer a regular imperative programming language, where understanding how it's executed is really straightforward, combined with some nice unification library and maybe a backtracking library that I can use explicitly when they are the appropriate tools.


> This gets to the heart of my problem with Prolog: it's sold as if it's logic programming - just write your first-order predicate logic and we'll solve it. But then to actually use it you have to understand how it's executed

Prolog is a logic-flavored programming language. I don't recall Prolog ever being "sold" as pure logic. More likely, an uninformed person simply assumed that Prolog used pure logic.

Complaining that Prolog logic doesn't match mathematical logic is like complaining that C++ objects don't accurately model real-life objects.


    I don't recall Prolog ever being "sold" as pure logic.
One of the guides linked above describes it as:

    The core of Prolog is restricted to a Turing complete subset of first-order predicate logic called Horn clauses


> The core of Prolog is restricted to a Turing complete subset of first-order predicate logic called Horn clauses

Does this sound to you like an attempt to deceive the reader into believing, as the GP comment stated, that the user can

> just write your first-order predicate logic and we'll solve it.


It absolutely does sound like "write your first order logic in this subset and we'll solve it". There's no reasonable expectation that it's going to do the impossible like solve decidability for first order logic.


> It absolutely does sound like "write your first order logic in this subset and we'll solve it".

No it does not. Please read the words that you are citing, not the words that you imagine. I honestly can't tell if you are unable to parse that sentence or if you are cynically lying about your interpretation in order to "win" an internet argument.

All programming languages are restricted, at least, to a "Turing complete subset of first-order predicate logic." There is absolutely no implication or suggestion of automatically solving any, much less most, first order logic queries.


Except it cannot decide all Horn clauses.


>> This gets to the heart of my problem with Prolog: it's sold as if it's logic programming - just write your first-order predicate logic and we'll solve it. But then to actually use it you have to understand how it's executed - "understanding why Prolog is backtracking in the first place".

Prolog isn't "sold" as a logic programming language. It is a logic programming language. Like, what else is it?

I have to be honest and say I've heard this criticism before and it's just letting the perfect be the enemy of the good. The criticism is really that Prolog is not a 100% purely declarative language with 100% the same syntax and semantics as First Order Logic.

Well, it isn't, but if it was, it would be unusable. That would make the critics very happy, or at least the kind of critics that don't want anyone else to have cool stuff, but in the current timeline we just have a programming language that defines the logic programming paradigm, so it makes no sense to say it isn't a logic programming language.

Edit:

>> At that point, I would just prefer a regular imperative programming language, where understanding how it's executed is really straightforward, combined with some nice unification library and maybe a backtracking library that I can use explicitly when they are the appropriate tools.

Yeah, see what I mean? Let's just use Python, or Java, or C++ instead, which has 0% of FOL syntax and semantics and is 0% declarative (or maybe 10% in the case of C++ templates). Because we can't make do with 99% logic-based and declarative, gosh no. Better have no alternative than have a less than absolutely idealised perfect ivory tower alternative.

Btw, Prolog's value is its SLD-Resolution based interpretation. Backtracking is an implementation detail. If you need backtracking use yield or whatever other keyword your favourite imperative language gives you. As to unification, good luck with a "nice unification library" for other languages. Most programmers can't even get their head around regexes. And good luck convincing functional programmers that "two-way pattern matching" (i.e. unification) is less deadly than the Bubonic Plague.
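
For a quick flavour of what two-way pattern matching buys you, here's the standard append/3 run in two different modes:

  % Which prefix, appended to [3], gives [1,2,3]?
  ?- append(X, [3], [1,2,3]).
  X = [1, 2].

  % Partial information is fine too: the result is [1|Y], whatever Y turns out to be.
  ?- append([1], Y, Z).
  Z = [1|Y].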


> Red cuts change the set of answers found by a query, green cuts don't.

Ohhh, interesting. So a green cut is basically what I described as cutting branches you know are a waste of time, and red cuts are the ones where you're wrong and cut real solutions?

> At first, don't use the cut, until you know what you're doing, is I think the best advice to give to beginner Prolog programmers. And to advanced ones sometimes. I've seen things...

Yeah, I'm wondering how much of this is almost social or use-case in nature?

E.g., I'm experimenting with Prolog strictly as a logic language, and I experiment (at a really novice level) with things like program synthesis or model-to-model transformations to emulate macro systems, flowing kind of like how JetBrains MPS handles similar things. I'm basically just trying to bend and flex bidirectional pure relations (I'm probably conflating FP terms here) because it's just sort of fun to me, yeah?

So cut _feels_ like something I'd only use if I were optimizing, and mostly like something I'd never use, because for my specific goals it'd be kind of antithetical--and also I'm not an expert, so it scares me. Basically, since I'm using Prolog strictly for the logic angle, cut doesn't feel like a bad thing; it just feels like something I wouldn't reach for unless I created a situation where I needed it to get solutions faster or something--again, naively anyway.

Whereas if I were using Prolog as a daily GP language to actually get stuff done, which I know it's capable of, it makes a lot of sense to me to see cut and `break` as similar constructs for breaking out of a branch of computation that you know doesn't actually go anywhere?

I'm mostly spit-balling here and could be off base. Very much appreciate the response, either way.


>> So a green cut is basically what I described as cutting branches you know are a waste of time, and red cuts are the ones where you're wrong and cut real solutions?

Basically, yes, except it's not necessarily "wrong", just dangerous because it's tempting to use it when you don't really understand what answers you're cutting. So you may end up cutting answers you'd like to see after all. The "red" is supposed to signify danger. Think of it as red stripes, like.

Cuts make stuff go faster too (well, a little bit). So, yeah, cuts in general help the compiler/interpreter optimise code execution. I however use them much more for their ability to help me control my program. Prolog makes many concessions to efficiency and usability, and the upshot of this is you need to be aware of its idiosyncrasies, the cut being just one of them.

>> Whereas if I were using Prolog as a daily GP language to actually get stuff done, which I know it's capable of, it makes a lot of sense to me to see cut and `break` as similar constructs for breaking out of a branch of computation that you know doesn't actually go anywhere?

Cuts work like breaks sometimes, but not always. To give a clear example of where I always use cuts, there's a skeleton you use when you want to process the elements of a list that looks like this:

  list_processing([], Bind, Bind):-
    !. % <-- Easiest way to not backtrack once the list is empty.

  list_processing([X|Xs], ..., Acc, Bind, ... ):-
     condition(X)
    ,! % Easiest way to not fall over to the last clause.
    ,process_a(X,Y)
    ,list_processing(Xs, ..., [Y|Acc], Bind, ... ).

  list_processing([X|Xs], ..., Acc, Bind, ... ):-
     process_b(X,Y)
    ,list_processing(Xs, ..., [Y|Acc], Bind, ... ).
So, the first cut is a green cut because it doesn't change the set of answers your program will find, because once the list in the first argument is empty, it's empty, there's no more to process. However, Prolog will leave two choice points behind, for each of the other two clauses, because it can't know what you're trying to do, so it can't just stop because it found an empty list.

The second cut is technically a red cut: you'd get more answers if you allowed both process_a and process_b to modify your list's elements, but the point is you don't want that, so you cut as soon as you know you only want process_a. So this is forcing a path down one branch of search, not quite like a break (nor a continue).

You could also get the same behaviour without a cut, by e.g. having a negated condition(X) check in the last clause and also checking that the list is not empty in every other clause (most compilers are smart enough to know that means no more choice points are needed), but, why? All you gain this way is theoretical purity, and more verbose code. I prefer to just cut there and get it done. Others of course disagree.
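
Roughly, the cut-free version I mean would look like this (same skeleton, same placeholder arguments, just a sketch):

  list_processing([], Bind, Bind).

  list_processing([X|Xs], ..., Acc, Bind, ... ):-
     condition(X)
    ,process_a(X,Y)
    ,list_processing(Xs, ..., [Y|Acc], Bind, ... ).

  list_processing([X|Xs], ..., Acc, Bind, ... ):-
     \+ condition(X) % negated check instead of the cut
    ,process_b(X,Y)
    ,list_processing(Xs, ..., [Y|Acc], Bind, ... ).

Same set of answers, just with the condition written (and checked) in two places, which is the extra verbosity I mean.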


> I'm wildly out of my depth here, but sometimes I find I learn quickly if I try out my intuition publicly and fail spectacularly :)

Fair enough. I believe this is a variation of Cunningham's Law, which states "the best way to get the right answer on the internet is not to ask a question; it's to post the wrong answer."

Everything you wrote about backtracking is completely correct. If I may paraphrase, it boils down to: cut can be used to avoid executing unnecessary code, but using it in the wrong place will avoid executing necessary code, and that would be bad. My point is: the same could be said about the "break" keyword in C++: it can avoid unnecessary iterations in a loop, or it can exit a loop prematurely. Cut and break are both control structures which make sense in the context of their respective languages, but neither would be accurately described as "for optimization."


Well, sometimes you can gain a few LIPS by cutting strategically but it's not a big deal. Most textbooks will tell you that cuts help the compiler optimise etc, but most of the time you're not writing e.g. a raytracer in Prolog, so the efficiency gains are slim.


I always come back to prolog to tool around with it but haven’t done a ton.

Bidirectionality has always been super fascinating.

Didn’t know about Picat. 100% going to check it out.


I'll warn you that Picat is very much a "research language" and a lot of the affordances you'd expect with a polished PL just aren't there yet. There's also this really great "field notes" repo from another person who learned it: https://github.com/dsagman/picat


Side note: Just clocked your name. Read through Practical TLA+ recently modeling a few things at work. Incredibly helpful book for working through my first concrete model in practice.


Totally fair. Realistically “check it out” means I’ll probably spin up an env and try modeling a few things to see how it feels.

I’m mostly a language tourist who likes kicking the tires on modes of modeling problems that feel different to my brain.

Started skimming those notes. Really solid info. Appreciate it!


I think you're interpreting

> I understand how people can get addicted to it

as

> I understand how people can get addicted to it and I endorse it as a route to making all your worries go away.

I'm going to put words in the OP's mouth here and assume what they were communicating is more akin to: "It's absolutely terrifying how quickly, easily, and thoroughly fentanyl can erase your sufferings and worries, replacing them with a feeling of total peace."

I'm assuming they didn't immediately become a fentanyl addict, precisely because they understand how destructive a path to equanimity it is.

Meditation and therapy are great, but addiction disorders often come with comorbidities like (or are comorbid to) PTSD, ADHD, MDD, and bipolar disorder. These are all things that can make establishing a habit like meditation difficult to impossible. Combine that with a lack of life skills and limited access to healthcare (or a complete unfamiliarity with navigating that system re:life skills) and therapy feels impossible as well.

In the last two years I've lost two very close family members to fentanyl. We scheduled therapy sessions and drove them there ourselves, we helped try to find rehab centers, we worked with them to find jobs, walked them through buying cheap transport on craigslist, helped work through medicaid paperwork with them, connected them with people we know who've gone through similar things, and in the end, they didn't make it.

I'm going to guess you're getting down-voted because your response interprets the OP as being against or unaware of meditation and therapy as tools for healthy living; it reads as lacking empathy and a recognition of the realities of addiction.

I'd encourage you to look into the literature in that area and read through the stories of people who have gone through it and survived. I find that for me it was especially helpful to find the stories of people who had life circumstances similar to mine, and still fell into addiction.

I also have strong opinions on the likelihood that meditation and therapy could mimic or match the physiological response a brain has to fentanyl, but the whole topic is draining for me. I hope you'll forgive me for passing on it. I think it might be worth your time to specifically research the physiological mechanisms as well, though.


It's less a reading of "GP endorses it as a route to making all your worries go away" and more one of "GP thinks it should be especially salient to us as a route to making all your worries go away". This is where I disagree. If the thought of erasing all your worries from the mind is tempting to you to the point that you "understand" the addictive potential of a narcotic drug through that one lens, your first-line approach should be learning about equanimity and structured therapy, not strong narcotics.

Also, clearly we don't need to "match the physiological response a brain has to fentanyl" (though there are newer substances like suboxone, now approved for medical use in the US and the EU, that seem to have some limited potential wrt. this), we only have to offer genuinely viable and sustainable approaches (which of course fentanyl isn't) to the narrower issue of dealing with the stressful worries in one's life.


Again, not the OP, so I can't speak to exactly their use-case, but the vast majority of call center calls fall into really clear buckets.

To give you an idea: Phonetic transcription was the "state of the art" when I was a QA analyst. It broke call transcripts apart into a stream of phonemes and when you did a search, it would similarly convert your search into a string of phonemes, then look for a match. As you can imagine, this is pretty error prone and you have to get a little clever with it, but realistically, it was more than good enough for the scale we operated at.

If it were an ecom site you'd already know the categories of calls you're interested in because you've been doing that tracking manually for years. Maybe something like "late delivery", "broken item", "unexpected out of stock", "missing pieces", etc.

Basically, you'd have a lot of known context to anchor the LLM's analysis, which would (probably) cover the vast majority of your calls, leaving you freed up to interact with outliers more directly.

At work as a software dev, having an LLM summarize a meeting incorrectly can be really really bad, so I appreciate the point you're making, but at a call center for an f500 company you're looking for trends and you're aware of your false positive/negative rates. Realistically, those can be relatively high and still provide a lot of value.

Also, if it's a really large company, they almost certainly had someone validate the calls, second-by-second, against the summaries (I know because that was my job for a period of time). That's a minimum bar for _any_ call analysis software so you can justify the spend. Sure, it's possible that was hand-waved, but as the person responsible for the outcome of the new summarization technique with LLMs, you'd be really screwing yourself to handwave a product that made you measurably less effective. There are better ways to integrate the AI hype train into a QA department than replacing the foundation of your analysis, if that's all you're trying to do.


Thanks for the detailed domain-specific explanation. If we assume that some whale clients of the company will end up in the call center, isn't it more probable that more competent human agents will be responsible for the call, whereas in the alternative scenario it's pretty much the same AI agent addressing the whale client as the regular customers?


Yeah, if I were running a QA department I wouldn't let llms anywhere near actual customers as far as trying to resolve a customer issue directly.

And, this is just a guess, but it's not uncommon that whale customers like that have their own dedicated account person and I'd personally stick with that model.

The use-case where I'm like "huh, yeah, I could see that working well" is mostly around doing sentiment analysis and call tagging--maybe actual summaries that humans might read, if I had a really well-designed context for the LLM to work within. Basically anything where you can have an acceptable false positive/negative rate.


I genuinely don't think the GP is actually making someone listen to the recording and check whether the summary is wrong.

I almost have a gut feeling that's the case (I may be wrong though).

Like, imagine this: if the agent could just spend 3 minutes writing a summary, why would you use AI to create a summary and then have some other person listen to the whole audio recording and check if the summary is right?

It would take an agent 3 minutes out of, let's say, a 1 hour long conversation/call.

On the other hand, you'd have someone listen to the whole 1 hour recording and then check the summary? That's now 1 hour compared to 3 minutes. Nah, I don't think so.

Even if we assume that multiple agents are involved in the same call, they can all simply write a summary of what they did and whom they redirected to, and you just follow that chain of summaries.

And after this, I think your read that they're really screwing themselves is accurate.

Kinda funny how the GP comment was the first thing I saw in this post, and how even I was kinda convinced they were one of the smarter ones integrating AI, but your comment made me come to the realization that they're actually just screwing themselves.

Imagine the irony: a post about how AI companies are screwing themselves by burning a lot of money, and then the people using them don't get any value out of it either.

And then the one example on HN that sounded like it finally made sense also turns out not to make sense... and they're screwing themselves over.

The irony is just ridiculous. So funny it made me giggle


They might not be, and their use-case might not be one I agree with. I can just imagine a plausible reality where they made a reasonable decision given the incentives and constraints, and I default to that.

I'm basically inferring how this would go down in the context I worked under, not the GP, because I don't know the details of their real context.

I think I'm seeing where I'm not being as clear as I could, though.

I'm talking about the lifecycle of a methodology for categorizing calls, regardless of whether or not it's a human categorizing them or a machine.

If your call center agent is writing summaries and categorizing their own calls, you still typically have a QA department of humans that listen to a random sample of full calls for any given agent on a schedule to verify that your human classifiers are accurately tagging calls. The QA agents will typically listen to them at like 4x speed or more, but mostly they're just sampling and validating the sample.

The same goes for _any_ automated process you want to apply at scale. You run it in parallel to your existing methodology and you randomly sample classified calls, verifying that the results were correct and you _also_ compare the overall results of the new method to the existing one, because you know how accurate the existing method is.

But you don't do that for _every_ call.

You find a new methodology you think is worth trying and you trial it to validate the results. You compare the cost and accuracy of that method against the cost and accuracy of the old one. And you absolutely would often have a real human listen to full calls, just not _all_ of them.

In that respect, LLMs aren't particularly special. They're just a function that takes a call and returns some categories and metadata. You compare that to the output of your existing function.

But it's all part of the: New tech consideration? -> Set up conditions to validate quantitatively -> run trials -> measure -> compare -> decide

Then, on a schedule, you go back and do another analysis to make sure your methodology is still providing the accuracy you need it to, even if you haven't changed anything.


Man, firstly I wanted to say that I loved your comment that I responded to, and then this comment too. I actually feel happy reading it; it's maybe hard to explain, but maybe it's because I learned something new.

So firstly, I thought you meant that they had to listen to every call, so yeah, a misunderstanding on my part since I admittedly don't know much about it, but it's still great to hear from an expert.

I also don't know the GP's context, but I truly felt like this because, as I said in some other comments, people are just slapping AI stickers on things and markets are rewarding it even though they're mostly being reckless in how they're using AI (which the post basically says), and I thought of them as the same. I still have my doubts, though; only more context from their side can tell.

Secondly, I really appreciate the paragraph you wrote about testing different strategies and how in-depth you went, man. It really feels like one of those comments that will be useful for me one day or another. Seriously, thanks!


Hey, thanks for saying that. I have huge gaps in time commenting on HN stuff because tbh, it's just social anxiety I don't need to sign up for :| so I really value someone taking the time to express appreciation if they got something out of my novels.

I don't ever want to come across like I think I know what's up better than someone else. I just want to share my perspective given my experience and if I'm wrong, hope someone will be kind when they point it out.

Tbh it's been awhile since I've worked directly in a call center (I've done some consulting type stuff here and there since then, but not much) so I'm mostly just extrapolating based on new tech and people I still know in that industry.

Fwiw, the way I try to approach interpreting something like the GPs post is to try to predict the possible realities and decide which ones I think are most plausible. After that I usually contribute the less represented perspective--but only if I think it's plausible.

I think the reality you were describing is totally plausible. My gut feeling is that it's probably not what's happening, but I wouldn't bet any money on that.

If someone said "Pick a side. I'll give you $20k if you're right and take $20k if you're wrong" I'm just flat out not participating, lol. If I _had_ to participate I'd reluctantly take the benefit-of-the-doubt side, but I wouldn't love having to commit to something I'm not at all confident about.

As it stands it's just a fun vehicle to talk about call center dynamics. Weirdly, I think they're super interesting


I'm curious, have you noticed an impact on agent morale with this?

Specifically: Do they spend more time actually taking calls now? I guess as long as you're not at the burnout point with utilization it's probably fine, but when I was still supporting call centers I can't count the number of projects I saw trying to push utilization up without realizing how real burnout is at call centers.

I assume that's not news to you, of course. At a certain utilization threshold we'd always start to see AHTs creep up as agents got burned out and consciously or not started trying to stay on good calls.

Guess it also partly depends on if you're in more of a cust serv call center or sales.

I hated working as an actual agent on the phones, but call center ops and strategy at scale has always been fascinating.


Thank you, I came to say this too. You're mushing your humans harder, and they'll break. Those 5 mins of downtime post-call aren't 100% note taking - they're catching their breath, re-composing after dealing with a nasty customer, re-energising after a deep technical session, etc.

I think AI in general is just being misused to optimise local minima to the detriment of the overall system.


Imagine how AI changed the call center's work:

1. Some agents have been laid off.

2. Survivors got stripped of their 5-minute summarizing breaks between calls and assigned new, higher targets for how many calls they should take per hour/day.

And it wasn't a good job before AI...


So, I fully agree that we should be aware of how AI use is impacting front-line agents--honestly, I'd bet AI is overall a bad thing in most cases--but that's just a gut feeling.

That said, it's possible the agents weren't given extra time to make notes about calls and write summaries; often they're not.

You usually have different states you can be in as a call center agent. Something like: "On a call", "Available to take a new call", "Unavailable to take a new call"

Being on a call is also being unavailable to take a call, but you'd obviously track that separately.

"Unavailable" time is usually further broken down into paid time (breaks), unpaid time (lunch) etc

And _sometimes_ the agent will have a state called something like "After Call Work" which is an "Unavailable" state that you use to finish up tasks related to the call you were just on.

So, full disclosure: I did work for a huge e-com supporting huge call centers, but I only worked for one company supporting call centers. What I'm talking about is my experience there and what I heard from people who also worked there who had experience with other call centers.

A lot of call centers don't give agents any "After Call Work" time and if they do, it's heavily discouraged and negatively impacts your metrics. They're expected to finish everything related to the call _during_ the call.

If you're thinking "that's not great" then, yeah, I agree, but it was above my paygrade.

It's entirely possible that offloading that task to an LLM gives agents _more_ breathing room.

But it's also totally possible that you're right. I don't know the GP's exact situation, but I feel pretty confident that other call centers are doing similar things with AI tagging and summaries, and that you see both situations (AI giving more breathing room in some places and taking it away in others).


>> It's entirely possible that offloading that task to an LLM gives agents _more_ breathing room

In theory, yes. But there is no way they are going to save millions by giving more breathing room to agents.


As a whole the incentives of capitalism are aligned as you suggest, but every major corp I've worked with has not-so-rare pockets of savvy middle managers that know how to play the game and also care about the welfare of their employees--even if the cultural incentives don't lean that way. (I'm assuming a US market here--and I'm at least tangentially aware that other cultures aren't identical)

E.g., when I worked in call centers I was directly part of initiatives that saved millions and made agents' lives better, with an intentionality toward both outcomes.

I also saw people drive agents into the ground trying to maximize utilization and/or schedule adherence with total disregard for the negative morale and business value they were pushing.

It makes me wonder if there are any robust org psych studies about the prevalence and success of middle managers trying to strategically navigate those kinds of situations to benefit their employees. I'd bet it's more rare than not, but I have no idea by how much.


Moral Mazes ( https://en.wikipedia.org/wiki/Moral_Mazes ) is a sociology classic along these lines.

Here's a relevant interview with the author, Robert Jackall: https://anso.williams.edu/files/2015/07/Jackall_interview_Ch...


Sentiment analysis, nuanced categorization by issue, detecting new issues, tracking trends, etc, are the bread and butter of any data team at a f500 call center.

I'm not going to say every project born out of that data makes good business sense (big enough companies have fluff everywhere), but ime anyway, projects grounded to that kind of data are typically some of the most straight-forward to concretely tie to a dollar value outcome.


Yes, those sound like important and useful use cases. However, these have been solved by boring old-school ML models for years...


I think what they're saying is that you need the summaries to do these things


It's easier and simpler to use an LLM service than to maintain those ad hoc models. Many replaced their old NLP pipelines with LLMs.


The place I work at, we replaced our old NLP pipelines with LLMs because they are easier to maintain and reach the same level of accuracy with much less work.

We are not running a call centre ourselves, but we are a SaaS offering services for call centre data analysis.


Sentiment analysis was not solved and companies were paying analyst firms shit tons of money to do that for them manually.


So, I wouldn't be surprised if someone in charge of a QA/ops department chose LLMs over similarly effective existing ML models in part because the AI hype is hitting so hard right now.

Two things _would_ surprise me, though:

- That they'd integrate it into any meaningful process without having done an actual analysis of the LLM-based performance vs their existing tech

- That they'd integrate the LLM into a core process their department is judged on knowing it was substantially worse when they could find a less impactful place to sneak it in

I'm not saying those are impossible realities. I've certainly known call center senior management to make more harebrained decisions than that, but barring more insight I personally default to assuming OP isn't among the harebrained.


My company gets a bunch of product listings from our clients and we try to group them together (so that if you search for a product name you can see all the retailers who are selling that product). Since there aren't reliable UPCs for the kinds of products we work with, we need to generate embeddings (vectors) for the products by their name/brand/category and do a nearest-neighbor search. This problem has many many many "old school" ML solutions to it, and when I was asked to design this system I came up with a few implementations and proposed them.

Instead of doing any of those (we have the infrastructure to do it), we are paying OpenAI for their embeddings APIs. Perhaps OpenAI is just doing old-school ML under the hood, but there is definitely an instinct among product managers to reach for shiny tools from shiny companies instead of considering more conservative options.


Yeah, I don't want to downplay the reality of companies making bad decisions.

I think for me, the way the GP phrased things just made me want to give them the benefit of the doubt.

Given my experience, people I've worked with, and how the GP phrased things, in my mind it's more likely than not that they're not making a naive "chase-the-AI" decision, and that a lot of replies didn't have a whole lot of call center experience.

The department I worked with when I did work in call centers was particularly competent and also pretty org savvy. Decisions were always a mix of pragmatism and optics. I don't think it's hard to find people like that in most companies. I also don't think it's hard to find the opposite.

But yeah, when I say something would be surprising, I don't mean it's impossible. I mean that the GP sounds informed and competent, and if I assume that, it'd be surprising to me if they sacrificed long-term success for an immediate boost by slotting LLMs into something so core to their success metrics.

But, I could be wrong. It's just my hunch, not a quantitative analysis or anything. Feature factory product influence is a real thing, for sure. It's why the _main_ question I ask in interviews is for everyone to describe the relationship between product and eng, so I definitely self-select toward a specific dynamic that probably unduly influences my perspective. I've been places where the balance is hard product, and it sucks working somewhere like that.

But yeah, for deciding if more standard ML techniques are worth replacing with LLMs, I'd ultimately need to see actual numbers from someone concretely comparing the two approaches. I just don't have that context.


Those have been done for 10+ years. We were running sentiment analysis on email support to determine prioritization back in 2013. Also ran Bayesian categorization to offer support reps quick responses/actions. Don't need expensive LLMs for it.


Yeah, I was a QA data analyst supporting three multi-thousand agent call-centers for an F500 in 2012 and we were using phoneme matching for transcript categorization. It was definitely good enough for pretty nuanced analysis.

I'm not saying any given department should, by some objective measure, switch to LLMs and I actually default to a certain level of skepticism whenever my department talks about applications.

I'm just saying I can imagine plausible realities where an intelligent and competent person would choose to switch toward using LLMs in a call center context.

There are also a ton of plausible realities where someone is just riding the hype train gunning for the next promotion.

I think it's useful to talk about alternate strategies and how they might compare, but I'm personally just defaulting to assuming the OP made a reasonable decision and didn't want to write a novel to justify it (a trait I don't suffer from, apparently), vs assuming they just have no idea what they're doing.

Everyone is free to decide which assumed reality they want to respond to. I just have a different default.


Not the op, but I did work supporting three massive call centers for an f500 ecom.

It's 100% plausible it's busy work but it could also be for:

- Categorizing calls into broad buckets to see which issues are trending
- Sentiment analysis
- Identifying surges of some novel/unique issue
- Categorizing calls across vendors and doing sentiment analysis that way (looking for upticks in problem calls related to specific TSPs or whatever)
- etc

False positives and negatives aren't really a problem once you hit a certain scale because you're just looking for trends. If you find one, you go spot-check it and do a deeper dive to get better accuracy.

Which is also how you end up with some schlepp like me listening to a few hundred calls in a day at 8x speed (back when I was a QA data analyst) to verify the bucketing. And when I was doing it, everything was based on phonetic indexing, which I can't imagine comes anywhere near LLMs in terms of accuracy, and it still provided a ton of business value at scale.


I think it’s just a mistype

I have a pro plan and I hammer o3--I’d guess more than a hundred a day sometimes--and have never run into limits personally

Wouldn’t shock me if something like that happened but haven’t seen evidence of it yet


> the debugger nonsense, and with the weird CLI live reload removal

C# is probably my favorite overall language and this resonates a lot with me. I did C# Windows dev for the first five years of my career. I think I've got about four years of Go sprinkled in through the rest (and mixtures of node, Ruby, Clojure, and some others filling the gaps)

When I was doing Windows dev full time I used LINQPad for almost all of my scripting because it was so convenient, and I still haven't found a clean workflow for that with C# outside of Windows, partly because of things like that. I haven't checked back in the last year or so, so it might have been sorted, but I completely get that being a red flag.

I deeply respect the very intentional and consistent design philosophy of Go--it's obvious everything is there for a reason once you know the reasons--but personally prefer C#. That might just be because it's what I sort of "grew up" on.

Which reminds me that I've been meaning to go back to Go. I haven't used it in earnest since generics were added, so it's been a while. Something I always really preferred in C# was the ergonomics of collection transformations with LINQ + MoreLinq over for loops--which isn't to say one or the other is bad, but I prefer the functional composition style, personally. I wasn't sure if those Go idioms had changed at all with their generics implementation.


C# and Clojure are probably my two favorite languages, but I've done much much less Clojure in production, and for my personal disposition C# is my overall favorite.

For bread-and-butter concurrency (make a bunch of external service calls at once, collect them up and do something) you've got async/await, but you probably knew that.

Some other things you might look into as far as distributed/parallel libraries/frameworks go:

- Orleans, a framework for using an actor model in C# to build distributed systems
- TPL Dataflow (Task Parallel Library), a lib for composing "blocks" to create parallel dataflow graphs

Productivity wise, the tooling for C# imo is pretty top notch, and I'd just be repeating what the article said.

Something I radically prefer about Clojure is the LISP heritage of highly interactive debugging and app dev. You can get great feedback loops with C#, but it won't be what you can do with Clojure and Calva or Cursive (is Cursive still a thing? it's been a while).

On the other hand, I personally prefer strongly typed languages, especially with C#, because of how good some of the static analysis and refactoring tooling can get in things like Rider (a JetBrains IDE for C#).

I think deployments are going to be a toss up. For my money Go is still the gold standard of "just make a binary and chuck it out there" and Clojure and C# are more normal "once you're used to it, it's completely fine, but it won't blow you away"

