I'd like to see Cassandra implemented in Go (currently it's Java/JVM-based).


Why?


I would imagine his reasoning is that Go handles concurrency and threading very efficiently and with a low memory footprint, while Java tends to use more memory.


Mainly to see if it would have better GC ability than Cassandra.

I love Cassandra, and the JVM is solid too, but the "stop-the-world" pause in its current garbage collectors truly blows. All of that low latency is for naught if the JVM decides to spend a few seconds churning through garbage. :(

I've looked into Azul's Zing JVM, but it's basically only for beefy hardware (16+ cores and 32GB+ RAM), and it's also crazy expensive.


> Mainly to see if it would have better GC ability than Cassandra.

Considering the naïvety of Go's GC (a non-generational, non-compacting, conservative, stop-the-world and broken-on-32b[0] mark&sweep)... I would tend to bet against it way sooner than on it.

And remember that goroutines use shared memory, so you can't have an emergent concurrent GC as in Erlang.

[0] http://utcc.utoronto.ca/~cks/space/blog/programming/GoLang32... https://groups.google.com/forum/?fromgroups#!topic/golang-nu...


Currently, most commercial JVMs beat Go's GC hands down, especially on 32-bit machines.


I've never used Cassandra, but from what I've read about Netflix's and others' use of it, everyone uses at least 32GB of RAM in each node?

Am I totally wrong?


So, there's a difference between "how much RAM is in the box", and "how much RAM is allocated to the JVM".

The latter is where Azul Zing requires a lot of RAM to operate, as I understand it.

The recommended JVM heap size for running Cassandra is 8GB, and 4GB isn't uncommon in the cloud. Netflix uses 12GB, and is contemplating going back to 8GB (they made that decision a while ago, and things have changed since then).

I have boxes that run JVM heaps as low as 2GB just fine. It really just depends on how you're using it. (I've heard of people using 256MB heaps with Cassandra.)
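If you want to check how much heap a given JVM was actually allowed (as opposed to how much RAM is in the box), a rough sketch like this should do. The class name `HeapCheck` is just made up for illustration; `Runtime.maxMemory()` and `-Xmx` are the standard JVM pieces.

```java
public class HeapCheck {
    // Returns the maximum heap the JVM will attempt to use, in MiB.
    // This reflects the JVM's limit (e.g. from -Xmx), not the machine's total RAM.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // After compiling, run with e.g. `java -Xmx2g HeapCheck` to see the configured cap.
        System.out.println("Max JVM heap: " + maxHeapMiB() + " MiB");
    }
}
```

The number printed usually comes out a bit under the `-Xmx` value, since the JVM reserves some space for itself.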


Thanks, that makes sense


I was under the impression that Cassandra used concurrent mark and sweep (i.e. -XX:+UseConcMarkSweepGC) as the default setting.


It does, but that's still not pause-less, and it doesn't cover a "full" collection at all, just the intermediate ones.

I get a full collection roughly once every 10 minutes. Until then, the pauses are only in the 20-30ms range (which is still double the usual latency for a request).

With Azul Zing, the longest you'll ever see is the 20-30ms range, and it's virtually always around 1ms.
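For anyone who wants to see how often their own JVM collects and how much time it spends doing so, here's a minimal sketch using the standard `java.lang.management` beans. The class name `GcStats` and the helper `totalGcMillis` are hypothetical. Note these are cumulative totals per collector, not individual pause lengths, so treat them as a rough indicator only.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcStats {
    // Sums the approximate accumulated collection time (ms) across all collectors.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // One line per collector (e.g. a young-generation and an old-generation collector).
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": count=" + gc.getCollectionCount()
                    + ", time=" + gc.getCollectionTime() + " ms");
        }
        System.out.println("Total GC time: " + totalGcMillis() + " ms");
    }
}
```

To get actual per-pause timings you'd more likely turn on the JVM's GC logging instead of polling these counters.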



