Hacker News | Grimblewald's comments

Does it? If I make a system prompt for most models right now, telling them they were trained on {list} of datasets and to attribute their answers to their training data, I get quite similar output. It even seems quite reasonable. The reason is that each data corpus has a "vibe" to it, and the predictions simply match response vibe to dataset vibe.

That's still firmly in divination land.


What I am reading here is that when the model is wrong, it still (at least sometimes) confidently attributes the answer to some knowledge base; is that correct? If so, how is this different from simply predicting the vibe of a given corpus and assigning provenance to it? Much less impressive imo, and something most models can do without explicit training. All precision, no recall, as it were.

I think this was answered before: it comes down to the constraints of the model's architecture. You can't expect something fundamentally different from an LLM, because that's how they work. It's different from other models because they were not designed for this. Maybe you were expecting more, but that's not OP's fault or demerit.

What you're saying fits my understanding/expectations. However, the post and the user I am replying to seem to imply otherwise. This makes me wonder: is my understanding incomplete, or is this post marketing hype dressed up as insight? So I am asking for transparency.

It is not hype. You can try the model on huggingface yourself to see its capabilities. My reply here was clarifying that the examples we showed were ones where the model didn't make a mistake. This is intentional, because over the next few weeks we will show how the concepts and attribution we enable can allow you to fix these mistakes more easily. All the claims in the post are supported by evidence, no marketing here.

We are probably at the point where hype and insight aren't all that distinguishable, other than by what bears fruit in the future, but I agree with you.

People who rape, murder, and eat children run the country and face no hint of repercussion. There never was rule of law. Only the appearance of it.

Rape is clearly in the Epstein files.

Murder is implied in the Epstein files with an email about burying girls on the property.

Eating sounds like an unhelpful exaggeration, unless I missed a major news story.


> Eating sounds like an unhelpful exaggeration, unless I missed a major news story.

There's a bunch of mentions of "jerky" in the files, some people have taken it to mean eating people.


Oh man, for a second I thought Ironwolf, the VR game, was about to see some love.

Hit 'em with bet fixing while they're at it.

Old-world decay model; the new world is Twitter or Facebook. Mass user exodus to the point a platform is a genuine wasteland, which means bots get deployed to prop up metrics. The money doesn't come from users, but from the belief of access to them via a platform. As long as there is an appearance of consumer data/attention you can access, then everything is fine re: revenue. Dunno how Discord will fudge things though, since Discord doesn't quite (historically) fit traditional social media models, so maybe you'll be right in the end.

I spew Elon hate every chance I get, and I maintain I am being too kind to him.

Depends on the work you're doing. Cookie-cutter / derivative work like I do for some hobby projects? Sure, it can just about full-auto it. More abstract or cutting-edge stuff like in academic research environments? It needs correction at just about every step. Your workflow sounds like it deals with the former, which is fine, but that isn't everyone.

Hard disagree: the interface hasn't changed at all. What has happened is that new tools have appeared that make natural language a viable interface. It is a new, lesser interface, not a replacement. Like a GUI, it is more accessible but functionally restricted. An interface that is conditioned on previously solved tasks, but unable to solve novel ones.

What this means is coding becomes accessible to those looking to apply something like python to solved problems, but it very much remains inaccessible to those looking to solve truly novel problems they have the skill to solve in their domain, but lack the coding skills to describe.

As a simple example, Claude Code is easily among the most competent coding interfaces I know of right now. However, give it a toy problem I've been poking at as a hobby project and it breaks so badly it starts hallucinating that it is ChatGPT.

```
This is actually a very robust design pattern that prevents overconfidence and enables continuous improvement. The [...lots of rambling...] correctly.

  ChatGPT

  Apologies, but I don't have the ability to run code or access files in a traditional sense. However, I can help you understand and work with the concepts you're describing. Let me
  provide a more focused analysis:
```

/insights doesn't help, of course; it simply recommends I clear context in those situations and try again, but naturally that runs into the same problems. This isn't isolated: unless I give it simple tasks, it fails. The easy tasks it excels at, though; it has handled a broad variety of them to a high degree of satisfaction. But it is a long shot away from replacing just writing the code yourself.

Bottom line: LLMs give coding a GUI, but like a GUI, it is restricted and buggy.


I've seen non-programmers successfully launch real apps (not toy projects) through vibe coding. I'm doing it myself, and I'm about to ship a developer tool built the same way.

They'll still need to pick up the fundamentals of programming; that part isn't optional yet. And getting to that level as a non-programmer takes real effort. But if the interest is there, it's far from impossible. In fact, I'd argue someone with genuine passion and domain expertise might have better odds than an average developer just going through the motions.


You're not getting it. Making an app is a solved problem, especially if the app's function, features, and purpose are derivative of existing things.

Think of it like image generation AI. You can make acceptable, if sloppy, art with it, using styles that already exist. However, you cannot create a new style. You cannot create pictures of things that are truly novel; to do that you have to pick up the brush yourself.

Coding with LLMs is the exact same thing. It can give you copies of what exists, and sometimes reasonable interpolations/mashups, but I have not seen a single successful example of extrapolation. Not one. You simply leave the learned manifold and everything gets chaotic, as in the example I provided.

If AI can make what you want, then the thing you made is not as novel as you thought. You're repurposing solved problems. Still useful, still interesting, just not as groundbreaking as the bot will try to tell you.


I've been writing a new textbook for undergrads (chemistry domain focus), and I think this excerpt is generally solid advice that is applicable here. Any feedback is welcome (textbook to be published under GPLv3 via GitHub). I appreciate I am on the conservative side here. The following is a copy-paste of the final notes/tips/warnings in the book, taken from the LaTeX source with minimal edits for display here:

Rather than viewing AI as forbidden or universally permitted, consider this progression:

1. Foundation Phase (Avoid generation, embrace explanation)

When learning a new library (e.g., your first RDKit script or lmfit model), do not ask the AI to write the code.

Instead, write your own attempt, then use AI to:

• Explain error tracebacks in plain language

• Compare your approach to idiomatic patterns

• Suggest documentation sections you may have missed

2. Apprenticeship Phase (Pair programming)

Once you can write working but inelegant code, use AI as a collaborative reviewer:

• Refactor working scripts for readability

• Vectorize slow loops you have already prototyped

• Generate unit tests for functions you have written
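The last bullet can be made concrete. Below is a sketch with a hypothetical function and hand-picked values: you write the dilution calculation yourself, ask the AI to propose unit tests, and audit each suggested case before keeping it.

```python
def diluted_concentration(c1, v1, v2):
    """Concentration after diluting volume v1 of stock (concentration c1)
    to final volume v2, from C1*V1 = C2*V2."""
    if c1 < 0 or v1 < 0 or v2 <= 0:
        raise ValueError("concentration/volumes must be non-negative and v2 > 0")
    return c1 * v1 / v2

# AI-suggested tests, checked by hand before being kept:
assert diluted_concentration(1.0, 10.0, 100.0) == 0.1  # ten-fold dilution
assert diluted_concentration(0.5, 0.0, 50.0) == 0.0    # no stock added
try:
    diluted_concentration(1.0, 10.0, 0.0)              # degenerate final volume
    raise AssertionError("expected ValueError")
except ValueError:
    pass
```

The audit step matters more than the generation: a suggested test that encodes a wrong expectation is only catchable if you could have written the function yourself.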

3. Independence Phase (Managed delegation)

When you have the skill to write the code yourself but choose to delegate to save time, you are essentially trading the effort of writing for the effort of auditing. Because your prompts are condensed summaries of intent rather than literal instructions, the LLM must fill the "ambiguity gap" with educated guesses.

Delegation only works if you are skilled enough to recognise when those guesses miss the mark; if your words were precise enough to never be misunderstood, they would already be code. Coding without oversight is dangerous and deeply incompetent behaviour in professional environments.

Examples of use-cases are:

• Generate boilerplate for familiar patterns, then audit line-by-line

• Prototype alternative algorithms you already understand conceptually

• Document code you have written (reverse the typical workflow)
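As a sketch of the first use-case (all names and arguments hypothetical): generated boilerplate for a familiar pattern, here an argparse command-line interface, with the line-by-line audit recorded as comments.

```python
import argparse

def build_parser():
    # Generated boilerplate; each argument was checked against the argparse
    # documentation and the script's actual needs during the audit.
    p = argparse.ArgumentParser(description="Fit a model to a CSV of measurements.")
    p.add_argument("infile", help="path to the input CSV")
    p.add_argument("--delimiter", default=",", help="column separator (default ',')")
    p.add_argument("--verbose", action="store_true", help="print fit diagnostics")
    return p

# Audit by exercising the parser rather than trusting it on sight:
args = build_parser().parse_args(["data.csv", "--verbose"])
assert args.infile == "data.csv" and args.verbose and args.delimiter == ","
```

Auditing by running the code, not just reading it, is the habit that makes delegation safe.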

