fellowniusmonk · 2025-12-25T04:46:44 1766638004

Merry Christmas!

fellowniusmonk · 2025-12-19T21:10:36 1766178636

Explore vs exploit?

Let's run an experiment where we just run exploit forever. Let's restructure the private sector, our country's moral baselines, and eventually our executive leadership to be maximally exploitive. Let's do that for about 45 years and see where it lands us. -Some "greed is good" guys in the 80s, probably.

fellowniusmonk · 2025-12-19T16:39:04 1766162344

So this author loves the easy part (writing code), hates the hard part (reading and reviewing), and lacks so much self awareness that he is going to lecture people on skill atrophy?

If you want to be an artist, be an artist. That's fine, but don't confuse artistry with engineering.

I write art code for myself, I engineer code professionally.

The author wraps with a false dichotomy that uses emotionally loaded language at the end: "You Believe We have Entered a New Post-Work Era, and Trust the Corporations to Shepherd Us Into It". I mean, what? Why can't I think it's quickly becoming a new era _and_ not trust corporations? Why does the author take that idea off the table? Is this logic or rhetoric? Who is this author trying to convince?

SWE life has always had smatterings of weird gatekeeping, self identities wrapped up in external tooling or paradigms, fragile egos, general misanthropy, post-hoc rationalization, etc. but... man watching the progressions of the crash outs these last few years has been wild.

shadowgovt · 2025-12-19T16:49:54 1766162994

This is a key insight.

In my day job, I use best practices. If I'm changing a SQL database, I write database migrations.

In my hobby coding? I will never write a database migration. You couldn't force me to at gunpoint. I just hate them, aesthetically. I will come up with the most elaborate and fragile solutions to avoid writing them. It's part of the fun.

fellowniusmonk · 2025-12-19T03:17:24 1766114244

It's not hard to start a community but it does take a certain minimal amount of capital, physical space and sufficient population density.

I've done it on a shoestring in the past, but I don't have the space to do it currently. Getting off the ground is the hardest part. Considerably harder than software unless certain resources align.

arjie · 2025-12-19T06:34:26 1766126066

What did you start? If you happen to already have it written down somewhere, I'd be eager to read.

fellowniusmonk · 2025-12-19T16:49:03 1766162943

An insolvent co-working space. My company took it over after the founders offered it to us, as we were the only reliable paying members, and we turned it into an ethos-driven local community social club and co-working space. I come from a multi-generational missionary family, though I myself am no longer theistic.

Most people don't have the experience of seeing one person turn into 200+ in a geographically co-located place.

My family has probably planted more churches than any living group, they range from <100 people to >10,000.

There are transitional stages at every growth milestone, the hardest part is the first 15 people and you need capitalization to get over early humps, that's why missionary organizations exist.

Eventually you can hit break-even or even start a rainy day fund, but people aren't willing to capitalize communities the way they capitalize companies, even though in the end communities have way more socio-political power.

As a fellow SWE: engineers and tech people really underestimate the writing, training and empiricism for polity and growth that non-coercive community building entails.

Even people who are vaguely exposed to missionary work are typically exposed to force-coercion oriented groups.

arjie · 2025-12-19T19:09:34 1766171374

Thank you for sharing that. Yeah, I was curious if someone has put in the work to talk about the mechanistic aspects to making a successful community space. It's an evolving thing, but I'm sure there are many details that one can just get right. Would be a cool book to read.

fellowniusmonk · 2025-12-18T18:30:10 1766082610

In all my unpublished tests, which focus on 1. unique logic puzzles that are intentionally adjacent to existing puzzles and 2. implementing a specific unique CRDT algorithm that is not particularly common but has an official reference implementation on GitHub (so the model's definitely been trained on it), I find that 5.2 overfits to the more common implementation and will actively break working code and puzzles.

I find it pattern matches incorrectly with a very narrow focus and will ignore real documented differences even when they're explicitly highlighted in the prompt text (this is X CRDT algo, not Y CRDT algo).

I've canceled my subscription. The idea that on any larger edit it will just start wrecking nuance and then refuse to accept prompts that point this out is an extremely dangerous form of target fixation.

pillefitz · 2025-12-18T19:06:21 1766084781

How does Claude perform?

fellowniusmonk · 2025-12-18T19:29:33 1766086173

They all have difficulty with certain CRDT types in general. 4.5 Opus has to go through a round of ask to give it clarifying instructions, but then it's fine. Neither gets it perfectly as a one-shot. Claude, if you jump straight into agent, won't break code but will churn for a bit.

fellowniusmonk · 2025-12-17T23:13:24 1766013204

Is this an attempt to engender developer goodwill that they've lost over 5.2's overfitting issues?

I've canceled my subscription, and I don't plan on releasing an app to their platform.

deepvibrations · 2025-12-18T00:21:05 1766017265

Also cancelled - it does feel like commoditisation is here now for LLMs. Recently, I've found Gemini & DeepSeek as good or better at 95% of what GPT can do now, so I can no longer justify paying for it.

dcre · 2025-12-18T00:17:13 1766017033

They’ve been planning this since October.

fellowniusmonk · 2025-12-17T20:22:25 1766002945

The whole latest round of OpenAI models is massively overfit.

fellowniusmonk · 2025-12-17T19:44:05 1766000645

I have a very complex set of logic puzzles I run through my own tests.

My logic test, and trying to get an agent to develop a certain type of ** implementation (one that is published, so the model has been trained on it to some limited extent), really stress test models. 5.2 is a complete failure of overfitting.

Really really bad in an unrecoverable infinite loop way.

It helps when you have existing working code that you know a model can't be trained on.

It doesn't actually evaluate the working code; it just assumes it's wrong and starts trying to rewrite it as a different type of **.

Even when I link it to the explanation and the git repo of the reference implementation, it still persists in trying to force a different **.

This is the worst model since pre o3. Just terrible.

fellowniusmonk · 2025-12-16T18:46:43 1765910803

Regardless of the technical and UX merit of this project, there are clearly a bunch of people who haven't used a new interface in perhaps years and are simply obtuse.

It took me less than 5 seconds to start using this one handed while I was pissing at a urinal, I mean that quite literally.

nkrisc · 2025-12-18T21:09:10 1766092150

Yes, the users must be wrong.

fellowniusmonk · 2025-12-13T08:58:38 1765616318

5.2 seems worse on overfitting for esoteric logic puzzles in my testing. These are tests using precise language, where attention has to be paid to picking the correct definition among many for a given word. It now charges ahead with wrong definitions, with far lower accuracy than before.