I am truly looking forward to CockroachDB becoming the next PostgreSQL for planet-spanning database workloads. In our ecosystem we are seeing more and more requests for CRDB integration.
Generally though, I would not say that full compatibility should even be desired. A k/v database simply works differently from a strictly relational one: there are concerns like shard IDs, avoiding hot spots, and deciding how to paginate data.
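To make the pagination point concrete: on a distributed store, OFFSET-based pagination forces the cluster to scan and discard rows on every page, while keyset pagination turns each page into a bounded range scan. A rough sketch (the table and column names here are illustrative, not from the thread):

```sql
-- Keyset pagination: each page is a bounded scan starting after the
-- last key seen, instead of an OFFSET that rescans and discards rows.
-- `events`, `id`, and `$last_seen_id` are made-up names for illustration.
SELECT id, payload
FROM events
WHERE id > $last_seen_id   -- cursor: the last id from the previous page
ORDER BY id
LIMIT 100;
```

Because the predicate is on the primary key, this compiles to a single bounded scan over one key span, which is exactly the kind of access pattern a k/v-backed database is good at.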
Leaning too far into being a "PostgreSQL replacement" will eventually hurt CRDB, because too much focus will go into making legacy enterprise SQL (along the lines of https://news.ycombinator.com/item?id=25454635) work on this system. It's the forklift approach to moving to the cloud, and it becomes clear when skimming through the docs.
I think many of the "light SQL" patterns are really great - things like SELECT or WHERE, which e.g. DynamoDB simply cannot offer without Elasticsearch. All in all, I am very excited for CRDB to gain more industry acceptance, and I think their cloud offering could also become very interesting - a competitor to Google Bigtable / Cloud Spanner, AWS DynamoDB, or Cloud SQL.
Sure, but this is an implementation detail. It is designed to be consistent with, and to support the affordances of, more traditional single-node RDBMSs.
What are the salient differences in your mind? Under the hood, CockroachDB executes writes to and reads from such a table in the same way that you would against a key-value store. You can explore this for yourself with the "kv trace" functionality of CockroachDB's SQL shell, which logs all of the KV API calls that a SQL query emits:
$ ./cockroach demo
# Welcome to the CockroachDB demo database!
#
# You are connected to a temporary, in-memory CockroachDB cluster of 1 node.
# ...
#
demo@127.0.0.1:26257/test> CREATE TABLE kv (k STRING PRIMARY KEY, v STRING);
CREATE TABLE
Time: 5ms total (execution 5ms / network 0ms)
demo@127.0.0.1:26257/test> \set auto_trace=on,kv
demo@127.0.0.1:26257/test> INSERT INTO kv VALUES('a', 'b');
INSERT 1
Time: 2ms total (execution 2ms / network 0ms)
timestamp | age | message | tag | location | operation | span
--------------------------------------+-----------------+--------------------------------------------------+------------------------------------------------------------+-----------------------------------------+------------------+-------
2020-12-17 23:17:46.626696+00:00:00 | 00:00:00.001123 | CPut /Table/53/1/"a"/0 -> /TUPLE/2:2:Bytes/b | [n1,client=127.0.0.1:49216,hostssl,user=demo] | sql/row/writer.go:207 | flow | 6
2020-12-17 23:17:46.626754+00:00:00 | 00:00:00.001181 | querying next range at /Table/53/1/"a"/0 | [n1,client=127.0.0.1:49216,hostssl,user=demo,txn=dcce3954] | kv/kvclient/kvcoord/range_iter.go:159 | dist sender send | 8
2020-12-17 23:17:46.626792+00:00:00 | 00:00:00.001219 | r36: sending batch 1 CPut, 1 EndTxn to (n1,s1):1 | [n1,client=127.0.0.1:49216,hostssl,user=demo,txn=dcce3954] | kv/kvclient/kvcoord/dist_sender.go:1851 | dist sender send | 8
2020-12-17 23:17:46.627281+00:00:00 | 00:00:00.001708 | fast path completed | [n1,client=127.0.0.1:49216,hostssl,user=demo] | sql/plan_node_to_row_source.go:145 | flow | 6
2020-12-17 23:17:46.627322+00:00:00 | 00:00:00.001749 | rows affected: 1 | [n1,client=127.0.0.1:49216,hostssl,user=demo] | sql/conn_executor_exec.go:622 | exec stmt | 4
(5 rows)
Time: 1ms total (execution 1ms / network 0ms)
demo@127.0.0.1:26257/test> SELECT * FROM kv WHERE k = 'a';
k | v
----+----
a | b
(1 row)
Time: 6ms total (execution 6ms / network 0ms)
timestamp | age | message | tag | location | operation | span
--------------------------------------+-----------------+----------------------------------------+------------------------------------------------------------+-----------------------------------------+------------------+-------
2020-12-17 23:17:54.402735+00:00:00 | 00:00:00.003116 | Scan /Table/53/1/"a"{-/#} | [n1,client=127.0.0.1:49216,hostssl,user=demo] | sql/row/kv_batch_fetcher.go:337 | materializer | 7
2020-12-17 23:17:54.402763+00:00:00 | 00:00:00.003144 | querying next range at /Table/53/1/"a" | [n1,client=127.0.0.1:49216,hostssl,user=demo,txn=d30bcbc9] | kv/kvclient/kvcoord/range_iter.go:159 | dist sender send | 9
2020-12-17 23:17:54.404565+00:00:00 | 00:00:00.004946 | r36: sending batch 1 Scan to (n1,s1):1 | [n1,client=127.0.0.1:49216,hostssl,user=demo,txn=d30bcbc9] | kv/kvclient/kvcoord/dist_sender.go:1851 | dist sender send | 9
2020-12-17 23:17:54.405091+00:00:00 | 00:00:00.005472 | fetched: /kv/primary/'a'/v -> /'b' | [n1,client=127.0.0.1:49216,hostssl,user=demo] | sql/colfetcher/cfetcher.go:888 | materializer | 7
2020-12-17 23:17:54.405895+00:00:00 | 00:00:00.006276 | rows affected: 1 | [n1,client=127.0.0.1:49216,hostssl,user=demo] | sql/conn_executor_exec.go:622 | exec stmt | 4
(5 rows)
Time: 1ms total (execution 1ms / network 0ms)
demo@127.0.0.1:26257/test>
I'll draw your attention to two lines in particular: the `CPut` in the INSERT's trace, and the `Scan` in the SELECT's.
These operations (`CPut` and `Scan`) are KV operations that you'd be able to run yourself against any key-value store. CockroachDB doesn't give you access to those operations directly, but crafting your queries in this way is really not significantly different.
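To make that mapping concrete, here's a toy sketch in plain Python (not CockroachDB code) of the two primitives the traces above show: a CPut, i.e. a conditional put that succeeds only if the existing value matches an expectation (absence, for a fresh INSERT), and a Scan over a key range. The key strings mimic the `/Table/53/1/"a"/0` layout from the trace.

```python
# Toy key-value store illustrating the two primitives from the trace:
# CPut (conditional put) and Scan. Illustrative only, not CockroachDB's
# actual implementation.

class ToyKV:
    def __init__(self):
        self.data = {}

    def cput(self, key, value, expected=None):
        """Write key -> value only if the current value equals `expected`
        (None meaning the key must not exist). This condition is how an
        INSERT detects a duplicate primary key in one round trip."""
        if self.data.get(key) != expected:
            raise KeyError(f"condition failed for {key!r}")
        self.data[key] = value

    def scan(self, start, end):
        """Return all (key, value) pairs with start <= key < end, in key
        order: what a point or range SELECT compiles down to."""
        return [(k, self.data[k]) for k in sorted(self.data) if start <= k < end]

kv = ToyKV()
# INSERT INTO kv VALUES ('a', 'b') becomes roughly:
kv.cput('/Table/53/1/"a"/0', 'b')   # fails if the row already exists
# SELECT * FROM kv WHERE k = 'a' becomes roughly a scan of the row's key span:
rows = kv.scan('/Table/53/1/"a"', '/Table/53/1/"a"/\xff')
print(rows)  # [('/Table/53/1/"a"/0', 'b')]
```

The point is that both SQL statements bottom out in exactly one KV operation on one key span, which is why the traces above show a single `CPut` and a single `Scan`.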
I think it’s the “under the hood” that’s the important part here. The salient difference is that calling it a key-value store eliminates the business reason for using it.
In that sense, all SQL databases are just KV stores: they compile SQL into actions on various internal KV stores. CRDB's approach just makes the separation more explicit than most, since it's designed to run as a cluster.