More

smueller1234 · 2025-12-14T16:53:35 1765731215

Slight problem with that if you would like to live in a functioning, thriving democracy: democracy in the sense of "one person, one vote" requires or at least greatly benefits from a broadly educated population. It's not sufficient, but very likely necessary.

barbazoo · 2025-12-15T02:28:30 1765765710

See my last sentence

smueller1234 · 2025-12-07T12:18:25 1765109905

You're right -- the theoretical particle physicists at my faculty were using Mathematica very heavily when I was still in academia and maintained a dedicated compute cluster for it.

They really did not appreciate the debugging experience, but maybe that's improved in 15 years. :)

smueller1234 · 2025-09-25T11:26:57 1758799617

I realize you're making a general point about space/IO ratios and the below is orthogonal, no contradiction.

It's actually a lot less user-facing per disk IO capacity that you will be able to "sell" in a large distributed storage system. There's constant maintenance churn to keep data available: - local hardware failure - planned larger scale maintenance - transient, unplanned larger scale failures (etc)

In general, you can fall back to using reconstruction from the erasure codes for serving during degradation. But that's a) enormously expensive in IO and CPU and b) you carry higher availability and/or durability risk because you lost redundancy.

Additionally, it may make sense to rebalance where data lives for optimal read throughput (and other performance reasons).

So in practice, there's constant rebalancing going on in a sophisticated distributed storage system that takes a good chunk of your HDD IOPS.

This + garbage collection also makes tape really unattractive for all but very static archives.

smueller1234 · 2025-08-02T21:32:27 1754170347

I think Chips and Cheese is more like a fine replacement for realworldtech.com sans the toxic and highly educational and entertaining forums. Anandtech was much more accessible to the general tech public, but also more commercial and thus hit and miss on the content (no judgement intended, gotta eat).

smueller1234 · 2025-04-14T14:14:54 1744640094

Google's internal systems have been written against the Colossus semantics for many, many years and thus benefit from it's upsides (performance, cost efficiency, reliability, strong isolation for a multi tenant system, ability to scale byte and IO usage fairly independently, tremendously good abstraction against and automation of underlying physical maintenance, etc) while not really having too much of an issue with any of the conscious trade-offs (like no random writes).

On the other hand, if you've been building your applications against expectations of different semantics (like POSIX), retrofitting this into your existing application is really hard, and potentially awkward. This is (IMO) why there hasn't been an overtly Colossus based Google Cloud offering previously. (Though it's well publicized that both Persistent Disk and GCS use Colossus in their implementation.)

One of the reasons why it would be extremely hard to just set up or build CFS elsewhere or on a different abstraction level is that while it may look quite achievable to implement the high level architecture, there is vast complexity in the practical implementation side. The tremendous user isolation it affords for an MT system, the resilience it has against various types of failures and high throughput planned maintenance, the specialization it and its dependencies have to use specific hardware optimally.

(I work on Google storage part time, I am not a Colossus developer.)

smueller1234 · 2025-04-14T14:02:27 1744639347

Concur, Colossus is one of the examples where Google built what almost feels like magic technology. I work on Google Storage (among other things), and I've wished for a Cloud offering that exposes Colossus for years.

I don't know that it took "AI branding" to convince anybody. I think these workloads potentially enabled additional demand/market for such a product that may not have been there before.

One of the challenges with exposing native Colossus was always that it's just different enough from how people elsewhere are used to use Storage that there was a lot of uncertainty about the addressable market of a "native" Colossus offering. It's not a POSIX file system. Some of the specific differences (eg. no random writes) are part of what makes Colossus powerful and performant on HDDs, but it means you have to write your application to work well within its constraints. Google has been doing that for a long time. If you haven't, even if it's an amazing product, is it worth rewriting your applications or middleware?

Rapid Storage basically addresses this by adding the object store API on top if it (TIL from this thread that there's a lower abstraction client in the works as well).

Anyway, the team behind this is awesome. Awesome tech, awesome people. Seeing this launched at Next and seeing some appreciation on HN makes me very grateful.

smueller1234 · 2025-03-26T22:47:06 1743029226

4% of revenue is terrifying for large corporations.

ziddoap · 2025-03-26T22:57:45 1743029865

Have they ever issued a fine for 4% of revenue? That's the maximum fine possible, under the non-standard "higher maximum" category. This breach surely won't be given the maximum considering there isn't really anything noteworthy about it.

We should consider the maximum that has actually been issued, than subtract some off of that. You also have to subtract out all of the money they saved over the years of reduced investment into security.

I think that lands us squarely back into "cost of doing business" land.

nukem222 · 2025-03-26T22:56:07 1743029767

It's impossible to take their fears seriously—literally any kind of social obligation is going to be scary for an entity with no desire to do anything but feed its owners.

Wait until you see what kind of reaction 40% gets! Existential threats will be the only things that work.

smueller1234 · 2025-02-24T13:34:23 1740404063

https://www.s3ns.io/en

This is Google + Thales doing the 3rd party operator model, with the operator being a subsidiary of Thales and not Google.

(NB: I work for Google in the EU.)

smueller1234 · 2025-02-17T08:06:55 1739779615

Was about to say that we managed upwards of 4k servers worth of MySQL databases (as in 4k baremetal servers worth, not 4k small VMs each having a small MySQL) using "Orchestrator" at Booking.com ten years ago. I checked if it was still kicking before writing this and found that the GitHub repo was archived last year. The next Google hit I find is this article from three days ago:

https://www.percona.com/blog/orchestrator-for-managing-mysql...

Please do some additional research into the state of maintenance of that piece of tech before jumping on it. But it certainly did a lot of powerful things for us back in the day. The automatic promotion of followers was key to our deployment.

b112 · 2025-02-17T10:55:28 1739789728

That, and patching MySQL in weird ways. I recall something about the filesystem layer being monkeyed with, because "who needs safe writes, it slows things down"

smueller1234 · on Sept 28, 2024

They make it easier, but just at a source code level. They're not a real (and certainly not full) abstraction. An example that'll be making it obvious: if you replace the underlying type with a floating point type, the semantics would change dramatically, fully visible to the user code.

With larger types that otherwise have similar semantics, you can still have breakage. A straightforward one would be padding in structs. Another one is that a lot of use cases convert pointers to integers and back, so if you change the underlying representation, that's guaranteed to break. Whether that's a good or not is another question, but it's certainly not uncommon.

(Edit: sibling comments make the same point much more succinctly: ABI compatibility!)