
> If you consider that evolution has taken millions of years to produce intelligent humans, then the fact that LLM training completed in a matter of months can produce parrots of humans is impressive by itself.

I disagree that such a comparison is useful. Training should be compared to training, and LLM training feeds in so many more words than a baby gets. (A baby has other senses, but it's not as if feeding in 20 years of video footage would make an LLM more competent.)



No, a baby is pre-trained. We know from linguistics that there is a natural-language grammar template all humans follow. This template is intrinsic to our biology: it is encoded, not learned through observation.


A baby has a template, but so does an LLM.

The better comparison to the templating is all the labor that went into making the LLM, not how long the GPUs run.

Template versus template, or specific training versus specific training. Those comparisons make a lot more sense than going criss-cross.


The template is what makes the training process so short for humans. We need minimal data, and we can run off of it.

Training is both longer and less effective for the LLM because there is no template.

To give an example, suppose it takes just one picture for a human to learn to recognize a dog, while it takes a million pictures for an ML model to do the same. What I’m saying is that it’s like this because humans come preprogrammed with application-specific wetware, so learning and recognition can run as generic, built-in operations. That’s why it’s so quick. For AI we are doing it as a one-shot operation on something that is not application specific. The training takes longer because of this and is less effective.
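To make that contrast concrete, here is a toy numpy sketch. It is purely illustrative: the dimensions, noise levels, and the choice to model the "template" as a fixed pretrained feature projection are my own assumptions, not claims about real brains or real LLMs. A learner that already has the right feature space can recognize a new class from a single example, while a learner starting from raw inputs needs many more:

    # Toy sketch of the "template" argument (all numbers made up).
    # Learner A gets a pre-built feature projection (the "template") and one
    # example per class; Learner B learns a linear classifier from raw inputs.
    import numpy as np

    rng = np.random.default_rng(0)
    D_RAW, D_FEAT = 100, 5

    # Ground truth: the task only depends on a 5-dimensional projection of
    # the 100-dimensional raw input. This projection plays the template role.
    template = np.linalg.qr(rng.normal(size=(D_RAW, D_FEAT)))[0]
    class_means = {"dog": rng.normal(size=D_FEAT), "cat": rng.normal(size=D_FEAT)}

    def sample(label, n):
        """Draw n raw examples of a class, with noise in both spaces."""
        feat = class_means[label] + 0.3 * rng.normal(size=(n, D_FEAT))
        return feat @ template.T + 1.0 * rng.normal(size=(n, D_RAW))

    # Learner A: template plus one stored prototype per class (nearest prototype).
    protos = {lbl: (sample(lbl, 1) @ template)[0] for lbl in class_means}

    def predict_with_template(x):
        f = x @ template
        return min(protos, key=lambda lbl: np.linalg.norm(f - protos[lbl]))

    # Learner B: logistic regression on raw inputs, trained by gradient descent.
    def train_from_scratch(n_per_class, steps=1000, lr=0.05):
        X = np.vstack([sample("dog", n_per_class), sample("cat", n_per_class)])
        y = np.array([1.0] * n_per_class + [0.0] * n_per_class)
        w, b = np.zeros(D_RAW), 0.0
        for _ in range(steps):
            p = 1 / (1 + np.exp(-(X @ w + b)))
            w -= lr * X.T @ (p - y) / len(y)
            b -= lr * np.mean(p - y)
        return lambda x: "dog" if x @ w + b > 0 else "cat"

    # Evaluate both learners on fresh data.
    test = [("dog", x) for x in sample("dog", 200)] + [("cat", x) for x in sample("cat", 200)]

    def acc(predict):
        return np.mean([predict(x) == lbl for lbl, x in test])

    print("template, 1 example/class   :", acc(predict_with_template))
    for n in (1, 10, 100):
        print(f"from scratch, {n:>3} examples/class:", acc(train_from_scratch(n)))

The exact numbers don't matter; the point is only that a single example goes a long way once the feature space is already matched to the task, while the from-scratch learner has to average away the raw-space noise with many examples.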


I disagree that an LLM has no template, but this is getting away from the point.

Did you look at the post I was replying to? You're talking about LLMs being slower, while that post was impressed by LLMs being "faster".

They're posing it as if LLMs recreate the same templating during their training time, and my core point is that I disagree with that. The two should not be compared so directly.


They are slower. In theory an LLM with all the right weights could have intelligence equivalent or superior to a human's.

But the training never gets there. It's so slow that it never reaches human intelligence, even though we know these networks can, in principle, compute anything.



