Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Oriole's design seems to require transaction-aware indexes with point entry removal, which has its own cost.

E.g. a GiST equivalent (for e.g. spatial indexes) would be a hassle to maintain due to its nature of having no precise knowledge about the location of each index tuple, GIN (e.g. FTS indexing) could be extremely bulky due to a lack of compressibility in posting trees, and I can't imagine how they'd implement an equivalent to BRIN (which allows for quickly eliminating huge portions of a physical table from a query result if they contain no interesting data), given their use of index-organized tables. Sure, you can partition on PK ranges instead of block ranges, but value density in a primary key can vary wildly over both time and value range.

Does the author have any info on how they plan to implement these more complex (but extremely useful) index methods?

This doesn't even consider the issues that might appear if the ordering rules (collation) change. Postgres' heap and vacuuming is ordering-unaware, meaning you can often fix corruption caused by collation changes by removing and reinserting the rows that are in the wrong location after the collation changed, with vacuum eventually getting rid of the broken tuples. I'm not sure Oriole can do that, as it won't be able to find the original tuple that it needed to remove with point lookup queries, thus probably requiring a full index rebuild to fix known corruption cases in the index, which sounds like a lot of additional maintenance.



> Does the author have any info on how they plan to implement these more complex (but extremely useful) index methods?

Regarding GiST analogue my plan is to build B-tree over some space-filling curve. Also, I'm planning to add union keys to the internal pages to make search over this tree faster and simpler.

Regarding GIN analogue, it would be still possible to compress the posting lists. The possible option would be to associate undo record not with posting list item, but with the whole posting list.

Regarding BRIN, I don't think we can do some direct analogue since we're using index-organized tables. But we can do something interesting with union keys in the internal pages of PK.

> This doesn't even consider the issues that might appear if the ordering rules (collation) change.

You're right, collation issue is serious. We will need to stick every collation-aware index to particular libicu collation version, before we go to GA.


> Regarding GiST analogue my plan is to build B-tree over some space-filling curve. Also, I'm planning to add union keys to the internal pages to make search over this tree faster and simpler.

This is my first time hearing of "union keys", and I can't seem to find it using DDG or arxiv. Would you mind explaining the concept (or pointing me in the right direction)?




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: