Hacker News

> unlimited higher-speed GPT-4 access

Aka the nerfed version. "High speed" means the weights were relaxed, leading to faster output but worse reasoning and memory.



Do you have any references on this? I have only seen a lot of speculation.


It's been discussed on Twitter and /r/chatgpt, but I've noticed it myself. I always find it funny when people say ChatGPT hasn't changed since launch when I see it with my own eyes.

> The party told you to reject the evidence of your eyes and ears. It was their final, most essential command


I'm most curious about this: "weights were relaxed." Is that something you've seen with your own eyes? What does it even mean, and how did you observe it? It seems hard to verify without proprietary information, if it even means something to begin with.


Or it means that the compute on the inference nodes is more efficient? Or that it’s tenanted in a way that decreases oversaturation? Or you’re getting programmatic improvements in the inference layer that are being funded by the enterprise spend?


If they had a code improvement that made inference faster without damaging capability, they would roll it out everywhere. Compute is money, after all.

Worst case just add a `sleep()` to the non-enterprise version.


What does it mean to “relax” weights and how does that speed up output?


I assume he means quantization (e.g. reducing the weights from 16-bit to 4-bit precision), which speeds up the output by shrinking the model and reducing the amount of work done per token.
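For the curious, here's a minimal sketch of what symmetric int4 quantization looks like (a simplification: real serving stacks use per-group scales and fused low-bit kernels, but it shows the trade: 4x less memory and bandwidth in exchange for rounding error in every weight):

```python
import numpy as np

def quantize_int4(w: np.ndarray):
    # Symmetric int4 covers integers in [-7, 7]; one scale per tensor.
    scale = np.max(np.abs(w)) / 7.0
    q = np.clip(np.round(w / scale), -7, 7).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    # Weights come back as coarse multiples of `scale` -- precision is lost.
    return q.astype(np.float32) * scale

w = np.array([0.12, -0.5, 0.33, 0.01], dtype=np.float32)
q, s = quantize_int4(w)
w_hat = dequantize(q, s)
err = np.max(np.abs(w - w_hat))  # nonzero, bounded by scale / 2
```

The rounding error is small per weight, but across billions of weights it can plausibly show up as degraded reasoning, which is the kind of quiet change the parent is speculating about.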


Or they get priority on high-end hardware, or even dedicated hardware.



