Yes, verbatim from the paper: "Moreover, training with PET can be performed in several hours on a single GPU without
requiring expensive hyperparameter optimization."
This is a common misconception. GPT-3 was trained on a ~300B-token (~300 GB) subset of Common Crawl and friends. The model is larger than the dataset.
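A quick back-of-envelope check of that last claim, taking the comment's ~300 GB dataset figure at face value and assuming GPT-3's widely reported 175B parameters stored in fp16 (2 bytes each):

```python
# Back-of-envelope: GPT-3 parameter storage vs. training-data size.
# Assumptions (not from the comment): 175e9 parameters, fp16 (2 bytes each).
params = 175e9
param_bytes = params * 2        # fp16: 2 bytes per parameter
dataset_bytes = 300e9           # ~300 GB of training text, per the comment

print(param_bytes / 1e9)        # 350.0 -- GB of weights
print(param_bytes > dataset_bytes)  # True: the weights outweigh the data
```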
My 2c (apologies for the aggressive tone -- I'm just excited about AGI):
That's a very weak upper bound on how much hardware it takes. I think it's not all that different from emulating a Nintendo 64 with a quantum simulation of its hardware.
For complex systems to work (not to mention evolve), they need to be robust to small perturbations -- there's no way the computation the brain is doing is sensitive to the details of particular atoms. There has to be redundancy, modularity, etc. These things aren't human inventions so much as they are the only way to meaningfully move in a 2^|giant-number| state-space.
You could argue that despite the huge number of physical degrees of freedom, the operations on DNA reduce to copy, repair, express, suppress. On the other hand, there's still a ton of intrinsic complexity in storing a huge amount of data, and yeah, some nucleotides are totally essential.
The other thing I wonder about: sure, maintaining a proteome is hugely complicated, but how much of this complexity goes into maintaining homeostasis (e.g. metabolism, cytoskeleton and membrane maintenance, replication, ...) vs. enabling computation? Seems like silicon has the advantage here.
I can't believe that you're talking about the mass-murder of civilians like it's some sort of triumph, especially since you could have made the exact same point citing the trinity test. And it's so weird that you use the term "toxic environment" right after that.
IMO AI progress in the last 10 years has drastically exceeded reasonable expectations.
I don't think of the "mass-murder of civilians like it's some sort of triumph", although I can see how the way I wrote it sounds like that's what I meant (and I can't edit it because of HN). Quit assuming bad faith in my comment. Why on earth would I think that?
I meant that we discovered fission and then built a bomb in under 6 years. Even if you disagree with the outcome of what happened in Japan, the Manhattan Project was an amazing scientific accomplishment, and there is no way the US could do something of that scale, scientifically, in 2021.
Still, I think it's pretty hard to compare engineering achievements like this. Is it harder to make a fission bomb in 6 years with 1940s technology, or a COVID vaccine in 1 year with 2020s technology?
Also, it seems like we could be having this conversation in 1940 about an American battleship program.
The COVID vaccine was an achievement, without a doubt, but I still think we need time to see how effective it will be as the virus mutates, and to make sure there are no long-term side effects. If the vaccines achieve herd immunity and end the pandemic, then history will be very kind to them and the scientists that created them.
One thought is that the actual creation of the vaccine must have been considerably less complicated than the atom bomb's if it was really created in one day.
It's only less complicated if you exclude the years of research into the techniques, and we're seeing manufacturing delays because those are not easy to scale. This work started in the 2000s, so I think it'd be closer to asking how long it took to build a bomb after you had developed the physics, built the mines and processing systems, etc. To me it really highlights how easy it is for us to forget how much science depends on less-publicized work -- there must have been hundreds of people whose careers went into the manufacturing techniques alone.
The L^1 norm's derivative is a perfectly good function; it's not defined or continuous at 0, but whatever... it's for the same reason that the median handles even-sized sets in a special way.
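A quick numeric sketch of that analogy: the L^1 loss sum(|x_i - m|) is minimized by the median, and for an even-sized set every m between the two middle values ties -- the same flat spot where the derivative of |x| fails to exist at 0. (Hypothetical helper `l1_loss` is just for illustration.)

```python
# The L1 loss sum(|x_i - m|) is minimized by the median; for an even-sized
# set, any m between the two middle values is a minimizer.
def l1_loss(data, m):
    return sum(abs(x - m) for x in data)

data = [1, 2, 8, 9]  # even-sized: any m in [2, 8] minimizes the L1 loss
print(l1_loss(data, 2), l1_loss(data, 5), l1_loss(data, 8))  # all equal: 14 14 14
print(l1_loss(data, 1.9))  # 14.2 -- stepping outside [2, 8] raises the loss
```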
For gravity and EM fields the inverse square law is a consequence of a massless force carrier particle and Lorentz invariance. Quantum Field Theory in a Nutshell by Zee talks about this early on.
An interesting but rather overcomplicated-seeming explanation. The posted article is basically saying that inverse-square laws result from conservation of flux (of any kind) in 3D. That seems like a pretty good explanation to me.
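The flux argument fits in one line: if the total flux Phi through any sphere around the source is conserved, then in 3D

```latex
% Conserved flux \Phi through a sphere of radius r forces an inverse-square field:
\Phi = F(r)\cdot 4\pi r^{2} = \text{const}
\quad\Longrightarrow\quad
F(r) = \frac{\Phi}{4\pi r^{2}} \propto \frac{1}{r^{2}}
```

The 4*pi*r^2 is just the surface area of the sphere, which is where the dimension of space enters.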
But for force laws, having a massless carrier is critical for an inverse-square law. With massive carriers (like the carriers of the weak force, the W and Z bosons) the range of the force scales like 1/M; the force law is more like exp(-Mr)/r, which goes to 1/r as M -> 0. The diminishing of the force with exp(-Mr) means flux isn't conserved. (Note: I'm working in units where hbar = c = 1, so the W's mass ~= 80 GeV/c^2 corresponds to 1/M << 1 fm.)
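To put a number on that range, a quick check in those units (a sketch; uses the standard value hbar*c ~= 197.3 MeV*fm and the comment's M ~= 80 GeV/c^2):

```python
import math

# Range of a force with a massive carrier: r ~ hbar*c / (M c^2).
HBAR_C_MEV_FM = 197.3      # hbar*c in MeV*fm (standard value)
M_W_MEV = 80_000.0         # W boson mass ~ 80 GeV/c^2, expressed in MeV/c^2

range_fm = HBAR_C_MEV_FM / M_W_MEV
print(f"{range_fm:.4f} fm")  # ~0.0025 fm, i.e. 1/M << 1 fm as claimed

# Yukawa suppression exp(-M r) relative to bare 1/r at r = 1 fm:
suppression = math.exp(-1.0 / range_fm)  # astronomically small
```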
But inverse-square laws also show up in things that aren't radiation or high-energy physics: they're ubiquitous in hydrodynamics, heat flow, and electrostatics. Good ole fashioned 19th-century physics. And I think that's their most natural arena.