OK, fair enough. But it looks like there's some awareness of these issues in tha...

delano · on Feb 12, 2009

Search engine library is better, ya.

However, creating the search engine is the hard problem. Tools that make some of the peripheral problems a bit easier do help, but the real challenge is calculating a configurable set of multi-dimensional scores.

Scoring takes place at three points: indexing-time, run-time, and search-time. Every solution (commercial and open-source) has an indexing process. Most have some run-time processing, but only a few have multi-dimensional scoring at search-time, and that is the real differentiator.
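To make the index-time vs. search-time distinction concrete, here is a toy Python sketch. Everything in it is made up for illustration: the `Doc` class, the `freshness` and `popularity` dimensions, and the weighted-sum formula are assumptions, not how any particular product does it. Run-time processing is not modeled, and it ignores scale entirely (everything sits in memory).

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    doc_id: str
    text: str
    # Scores assumed to be computed once at indexing time, one per dimension.
    # The dimension names used below are hypothetical.
    static_scores: dict = field(default_factory=dict)


def build_index(docs):
    """Indexing time: map each term to the ids of the documents containing it."""
    inverted = {}
    for doc in docs:
        for term in set(doc.text.lower().split()):
            inverted.setdefault(term, set()).add(doc.doc_id)
    return inverted


def search(query, docs_by_id, inverted, weights):
    """Search time: combine per-dimension scores using caller-supplied weights."""
    terms = query.lower().split()
    candidates = set()
    for term in terms:
        candidates |= inverted.get(term, set())

    results = []
    for doc_id in candidates:
        doc = docs_by_id[doc_id]
        words = doc.text.lower().split()
        dims = dict(doc.static_scores)
        # One dimension that can only be computed once the query is known.
        dims["term_match"] = sum(words.count(t) for t in terms) / len(words)
        # Dimensions without a weight contribute nothing to the final score.
        score = sum(weights.get(name, 0.0) * value for name, value in dims.items())
        results.append((doc_id, score))
    return sorted(results, key=lambda r: r[1], reverse=True)


docs = [
    Doc("a", "fast search engine library", {"freshness": 0.9, "popularity": 0.1}),
    Doc("b", "search engine search engine", {"freshness": 0.1, "popularity": 0.9}),
]
docs_by_id = {d.doc_id: d for d in docs}
inverted = build_index(docs)

by_freshness = search("search", docs_by_id, inverted, {"term_match": 1.0, "freshness": 2.0})
by_popularity = search("search", docs_by_id, inverted, {"term_match": 1.0, "popularity": 2.0})
```

The same index gives a different ranking for the same query depending on the weights passed in at search time. That part is easy at this size; the sketch says nothing about doing it over gigabytes of data.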

When you make the relations configurable and increase the data set to gigabytes or terabytes, you have several incredibly difficult problems. There are commercial products that do it, big and small, but to my knowledge there are no open source projects that can.

In other words, the state of commercial products is so far ahead of open source ones that it's misleading to refer to both as "search engines".