Hacker News

Anyone who doesn't understand this either doesn't need the utility it provides or has no idea how to prompt it correctly. My wife is a bookkeeper. There are some tasks that are a pain in the ass without writing some custom code. In her case, we just saved her about 2 hours by asking Claude to do it. It wrote the code, applied the code to a CSV we uploaded, and gave us exactly what we needed in 2 minutes.
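For a sense of scale, the "wrote the code, applied it to a CSV" step is usually something this small. Here's a hypothetical sketch (column names and values invented for illustration) of the kind of throwaway script an LLM generates and then runs against the uploaded file:

```python
# Hypothetical sketch of a one-off bookkeeping script an LLM might
# generate; the "category"/"amount" columns are invented for this example.
import csv
import io

def summarize_by_category(rows):
    """Total the 'amount' column per 'category' value."""
    totals = {}
    for row in rows:
        cat = row["category"]
        totals[cat] = totals.get(cat, 0.0) + float(row["amount"])
    return totals

# Inline CSV data standing in for the uploaded file:
sample = io.StringIO("category,amount\nrent,1200\nfood,85.50\nfood,42\n")
totals = summarize_by_category(csv.DictReader(sample))
# totals == {"rent": 1200.0, "food": 127.5}
```

The point isn't the code itself; it's that a task like this is trivial with a ten-line script and tedious without one.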


>Anyone who doesn't understand this either doesn't need the utility it provides or has no idea how to prompt it correctly.

Almost every counter-criticism of LLMs boils down to

1. you're holding it wrong

2. Well, I use it at $DAYJOB and it works great for me! (And $DAYJOB is software engineering.)

I'm glad your wife was able to save 2 hours of work, but forgive me if that doesn't translate to the trillion dollar valuation OpenAI is claiming. It's strange you don't see the inherent irony in your post. Instead of your wife just directly uploading the dataset and a prompt, she first has to prompt it to write code. There are clear limitations and it looks like LLMs are stuck at some sort of wall.


> 1. you're holding it wrong

When computers/the internet first came about, there were (and still are!) people who would struggle with basic tasks. Without knowing the specific task you're trying to do, it's hard to judge whether it's a problem with the model or with you.

I would also say that prompting isn't as simple as it's made out to be. It is a skill in itself and requires you to be a good communicator. In fact, I would say there is a reasonable chance that even if we end up with AGI-level models, a good chunk of people will not be able to use them effectively because they can't communicate requirements clearly.


So it's a natural language interface, except it can only be useful if we stick to a subset of natural language. Then we're stuck trying to reverse engineer an undocumented, non-deterministic API. One that will keep changing under whatever you build on top of it. That is a pretty horrid value proposition.


Short of it being able to mind read, you need to communicate with it in some way. No different from the real world where you'll have a harder time getting things done if you don't know how to effectively communicate. I imagine for a lot of popular use-cases, we'll build a simpler UX for people to click and tap before it gets sent to a model.


I'd rather run commands and write code than try to reverse engineer a non-deterministic, undocumented, ever-changing API.


Boiling down to a couple cases would be more useful if you actually tried to disprove those cases or explain why they're not good enough.

> It's strange you don't see the inherent irony in your post. Instead of your wife just directly uploading the dataset and a prompt, she first has to prompt it to write code. There are clear limitations and it looks like LLMs are stuck at some sort of wall.

What's ironic about that? That's such a tiny imperfection. If that's anything near the biggest flaw then things look amazing. (Not that I think it is, but I'm not here to talk about my opinion, I'm here to talk about your irony claim.)


>Boiling down to a couple cases would be more useful if you actually tried to disprove those cases or explain why they're not good enough.

This reply is 4 comments deep into such cases, and the OP is about a well-educated person who describes their difficulties.

>What's ironic about that? That's such a tiny imperfection.

I'd argue it's not tiny - it highlights the limitations of LLMs. LLMs excel at writing basic code but seem to struggle, or are untrustworthy, outside of those tasks.

Imagine generalizing his case: his wife goes to work and tells other bookkeepers "ChatGPClaudeSeek is amazing, it saved me 2 hours." A coworker, married to a lawyer instead of a software engineer, hears this, tries it for himself, and comes up short. Returning to work the next day and describing his experience, he is told: "oh, you weren't holding it right, ChatGPClaudeSeek can't do the work for you, you have to ask it to write code, which you must then run." Turns out he needs an expert to hold it properly, and from the coworker's point of view he would probably need to hire an expert to help automate the task, which will likely only be marginally less expensive than it was 5 years ago.

From where I stand, things don't look amazing; at least not as amazing as the fundraisers have claimed. I agree that LLMs are awesome tools, but I'm evaluating from the point of a potential future where OpenAI is worth a trillion dollars and is replacing every job. You call it a tiny imperfection, but that comes across as myopic to me: large swaths of industries can't effectively use LLMs! How is that tiny?


> Turns out he needs an expert to hold it properly and from the coworker's point of view he would probably need to hire an expert to help automate the task, which will likely only be marginally less expensive than it was 5 years ago.

The LLM wrote the code, then used the code itself, without needing a coder around. So the only negative was needing to ask it specifically to use code, right? In that case, with code being the thing it's good at, "tell the LLM to make and use code" is going to be in the basic tutorials. It doesn't need an expert. It really is about "holding it right" in a non-mocking way, the kind of instructions you expect to go through for using a new tool.

If you can go through a one hour or less training course while only half paying attention, and immediately save two hours on your first use, that's a great return on the time investment.


Honest question: how do you validate it?



