Please reread (or.. read) the paper. They do not make that mistake, specifically section 7.1.
A reward function (R) may be hackable by a model's response, but when asked to confess it is easier to get an honest confession reward function (Rc) because you have the response with all the hacking in front of you, and that gives the Rc more ability to verify honesty than R had to verify correctness.
There are human examples you could construct (say, granting immunity for better confessions), but they don't map well to this really fascinating insight with LLMs.
While I have several disagreements with this deck, there are two large ones:
1. In my experience, a lot of teams don't have long enough meetings to avoid the litany of small meetings. For example, a lot of staff meetings could easily be 2 hours and then cancel many project specific meetings that have 50%+ of the same attendees later in the week. They also enforce a cadence of execution - everyone knows they need to prepare for the weekly staff meeting, rather than many small meetings every day. It also avoids the problem of people feeling not included - you're always invited to the one huge meeting every week, it's up to you to attend or skip.
2. The problem with meeting culture cannot be solved with education on how to say no, it's about admitting that attending meetings actually does convey a lot of things. Lots of information is not shared outside of meetings. Seniority of attendees actually does have a huge impact on visibility in folks' careers. A lot of the advice in this slide deck feels like it should work, but doesn't in practice because of self interest.
The education that needs to happen is quite different imo:
- leadership needs to be done through writing
- meetings should be recorded and minutes sent out broadly, along with allowing silent attendance.
- decisions need to give time for dissent outside of meeting attendees before committing.
While I agree a lot of information is conferred, most of it is not useful.
I'm quite a fan of not attending meetings where I don't get specifically invited (as in, directly, not as part of a group). This may or may not fly at a given organisation. Anyhow, my main learning has been that:
1. All truly important information will be repeated (in the form of tickets, slack messages, further meetings). Usually several times.
2. Most useful subordinate information (the kind that doesn't get repeated) only needs to be related to 1 person 80%+ of the time. It's vanishingly rare 3 or more people need some information that isn't ever repeated elsewhere.
The only really useful work in meetings is making decisions. This is an essential feature, but a big problem is often many "spectators" are invited (attendees without decision power or context). Being a pure spectator in a meeting is almost always completely pointless. Also, people like to make decisions/input so meetings are rife with bike shedding (most people have decision power + context for low importance items usually).
Math notation is high context, so it's great to just ask llm's to print out the low context version in something like lisp where I can read and decompose it quickly.
attention required: 10 minute video > 10 second short
When the written word took over with the printing press, the same concern was levied. The amount of attention required to listen and memorize a story/poem is a lot more than just reading it.
The change with smart phones is just one of access/time spent on these things. There are people who are spending ~5 hours/day watching this content. There is a big difference between someone listening to 5 hours of a single poem, to reading 5 hours of a single book, to reading 5 hours of blog posts, to watching 5 hours of a youtube video, to watching 5 hours of random videos, to 5 hours of <10s videos.
There is a big matrix of risk/reward for any DC location.
You bet Meta asked for incentives, but sometimes a guarantee of future power capacity, fast permitting, or ideal locations are worth more than the incentives the state could afford.
Sure, but you don't have to build your DC at the place that has the cheapest bid.
You just need to make them think you might not build somewhere else unless they sweeten the deal.
Hopefully this just means that governments have wisened up to the fact that a gazillion DCs are going to be built so if you pass on Meta you can just pickup Google's.
A reward function (R) may be hackable by a model's response, but when asked to confess it is easier to get an honest confession reward function (Rc) because you have the response with all the hacking in front of you, and that gives the Rc more ability to verify honesty than R had to verify correctness.
There are human examples you could construct (say, granting immunity for better confessions), but they don't map well to this really fascinating insight with LLMs.
reply