
My current hypothesis: the more familiar you are with a topic the worse the results from any LLM.


Amen to this. As soon as you ask an LLM to explain something in detail that you’re a domain expert in, that’s when you notice the flaws.


Yes, it’s particularly bad when the information found on the web is flawed.

For example, I’m not a domain expert, but I was looking for an RC motor for a toy project and OpenAI’s Deep Research happily tried to source a few. The only problem was that the best candidate it picked contained an obvious typo in the motor spec (68 grams instead of 680 grams), which is just impossible for a motor of the specified dimensions.
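
A rough plausibility check makes the typo obvious. Here's a minimal sketch, using hypothetical dimensions (the actual listing's dimensions aren't reproduced here): a motor is mostly copper windings and steel laminations, so its bulk density should land in the low single digits of g/cm³, not below the density of water.

    import math

    # Hypothetical dimensions for a mid-size RC motor; not the actual listing.
    diameter_cm = 6.0
    length_cm = 6.0
    volume_cm3 = math.pi * (diameter_cm / 2) ** 2 * length_cm  # ~170 cm^3

    for mass_g in (68, 680):
        print(f"{mass_g} g -> {mass_g / volume_cm3:.2f} g/cm^3")

    # 68 g  -> ~0.40 g/cm^3: lighter than water, impossible for a device made
    #          mostly of copper and steel.
    # 680 g -> ~4.0 g/cm^3: plausible for a mix of copper, steel, magnets,
    #          and air gaps.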


> Yes, it’s particularly bad when the information found on the web is flawed.

It's funny you say that, because I was going to echo your parent's sentiment and point out that it's exactly the same with any news article you read.

The majority of the content these LLMs are consuming is not from domain experts.


Right, but LLMs are also consuming AWS product documentation and the Terraform language docs, both of which I have read a lot of, and they’re often badly wrong about things from both of those domains in ways that are really easy for me to spot.

This isn’t just “shit in, shit out”. Hallucination is real and still problematic.


I had it generate a baseball lineup the other day: it printed out a list of the 13 kids' names, then said "(12 players)". It just straight up miscounted its own output, throwing a wrench into everything else it was doing beyond that point.
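
A mechanical recount catches this kind of slip immediately. A minimal sketch, with a hypothetical roster and claimed count standing in for the model's output:

    # Recount an LLM-generated roster and compare it to the count the model claims.
    # The names and the claimed count are hypothetical stand-ins for model output.
    lineup = ["1. Ava", "2. Ben", "3. Cole"]
    claimed_count = 2  # what the model printed, e.g. "(2 players)"

    actual_count = len(lineup)
    if actual_count != claimed_count:
        print(f"Miscount: {actual_count} names listed, but the model claimed {claimed_count}.")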


He was saying that 3.5 is better than 3.7 on the same topic he knows well, though.


> My current hypothesis: the more familiar you are with a topic the worse the results from any LLM.

That's not really true, since your prompts are also getting better. "Better input leads to better output" remains true even with LLMs (when you treat them as tools).


Being more familiar with the topic definitely doesn't always make your prompts better. For a lot of things the prompt doesn't really change (explain X, compare X and Y...), and that is what is being discussed here. For "building" instructions (like writing code) it helps a bit, but even if you know exactly what you want it to write, getting it to do that is pretty much trial and error (too much detail makes it follow you word-for-word and produce bad code, too little and it misses important parts or makes dumb mistakes).


The opposite may be true: the more effective the model, the lazier the prompting, since it can seemingly handle not being micromanaged the way earlier versions had to be.


The more familiar you are with the state of “Jira hygiene” in the megacorp environment, the less hope you have that LLMs will be able to make sense of things.

That said, the “AI all the things” mandates could be the lever that ultimately accomplishes what 100+ PjMs couldn’t - making people write issues as if they really mattered. Because garbage in, garbage out.


It is like this with expert humans too, which is why, no matter what, we will continue to require expert humans not just "in the loop" but as the critical cogs that are the loop itself, just as it always has been. However, this time around those people will have AI augmentation, and be intellectual athletes of a nature our civilization has never seen.


I always tell people to trust the LLM to the same extent as an intern. Avoid giving it tasks you cannot verify the correctness of.


That is certainly the case in niche topics where published information is lacking, or where common sense is needed to synthesize proper outputs [1].

However, in this specific example, I don't remember if it was ChatGPT or Gemini or 3.5 Haiku, but the other(s) explained it well enough. I think I re-asked 3.5 Haiku at a later point in time and, to my complete non-surprise, it gave an answer that was quite decent.

1 - For example, the field of DIY audio, which was funnily enough the source of my question. I'm no speaker designer, but combining creativity with engineering basics/rules of thumb seems to be something LLMs struggle with terribly. Ask them to design a speaker and they come up with the most vanilla, tired, textbook design, despite several existing market products that are already far more innovative.

I'm confident that if you asked an LLM an identical question for which there is more discourse - e.g. make an interesting/innovative phone - you'd get relatively much better results.


I built open baffle speakers based on measurements and discussion I had with Claude. I think it is really good.

I am a novice, maybe that's why I liked it.


Not really. I'm getting pretty good Computer Science theory out of Gemini and even ChatGPT.



