On my recent Rails project I decided to try the Sphinx search engine. Before that I used Ferret and then SOLR. I abandoned Ferret because of its instability and the lack of tools for tracking down problems (like Luke, the index browser for Lucene).
So, Sphinx. I found two plugins for Rails: acts_as_sphinx and Ultrasphinx. I had heard on mailing lists that Ultrasphinx is better (and the Sphinx recipe in the Advanced Rails Recipes book also uses it), so I decided to go with it.
First I had to install Sphinx from source, because the version in MacPorts (although it is the latest released version, 0.9.7) is too old and Ultrasphinx requires a newer one (at that point, release candidate 2 of the next revision, 0.9.8rc2). Then I had to do various dances around Sphinx to get it to compile on my Mac OS X (these are described in my previous post).
And then it began:
1. New data (e.g. new articles) had to be searchable with Sphinx right after it was added. And then I found out that such frequent updates are discouraged, and that you'd better run a full update once a day... Wtf? I have heard that there is something called "deltas", but as far as I could tell from the plugin, it doesn't install any hooks on models, so I assume that deltas need to be built periodically (which is also unacceptable).
After consulting with other people who have used Sphinx I found out that:
1) they don't use plugins and do all communication manually through some low-level library (Riddle, as far as I remember).
2) they install their own hooks on models and call the indexer manually to reindex their models. I tried installing an after_save hook, but it runs BEFORE the transaction is committed, so the indexer can't see the inserted/updated data, and I don't know how this can be accomplished easily.
2. Sphinx configuration is one big scary thing. Although Ultrasphinx managed to generate it for me, I needed to make some tweaks to it, and the next time I had to build the configuration for another model my tweaks were lost.
3. The model on which I used Ultrasphinx (i.e. called ::is_indexed) failed to load through the automatic dependency loader. After several hours of tracking down this problem I stuck "require 'my_model'" into environment.rb (which helped) and cursed Sphinx and all its plugins.
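The after_save timing problem from point 1 is easier to see in a toy, plain-Ruby simulation. FakeDatabase and its after_commit method are hypothetical names made up for this sketch; they are not part of Rails, Ultrasphinx or Sphinx. The only point is that work done inside a still-open transaction cannot see the new rows from outside, while work queued until after the commit can.

```ruby
# Toy model of the timing problem. FakeDatabase is hypothetical,
# not a Rails or Sphinx API.
class FakeDatabase
  attr_reader :committed

  def initialize
    @committed = []     # rows visible to other connections (e.g. the indexer)
    @pending = []       # rows written inside the open transaction
    @after_commit = []  # work deferred until the commit has happened
  end

  def transaction
    @pending = []
    @after_commit = []
    yield self
    @committed.concat(@pending)  # commit: rows become visible to everyone
    callbacks = @after_commit
    @pending = []
    @after_commit = []
    callbacks.each(&:call)       # only now run the deferred work
  end

  def insert(row)
    @pending << row
  end

  def after_commit(&block)
    @after_commit << block
  end
end

db = FakeDatabase.new
seen_in_after_save = nil
seen_after_commit = nil

db.transaction do |tx|
  tx.insert("new article")
  # What an after_save hook effectively does: it looks at the data
  # while the transaction is still open.
  seen_in_after_save = db.committed.dup
  # Deferred alternative: look at the data only after the commit.
  tx.after_commit { seen_after_commit = db.committed.dup }
end
```

In this sketch seen_in_after_save stands for what an external indexer would find if it were started from the hook, and seen_after_commit for what it would find if the reindex were postponed until the transaction had finished.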
So, for me Sphinx is a definite FAIL.
PS
I have tried SOLR and it worked like a charm:
1) almost no configuration.
SOLR configuration consists of type definitions (which describe how a value of that type should be analyzed), field definitions and dynamic field definitions (the acts_as_solr Rails plugin uses only dynamic fields). Acts_as_solr comes with a default SOLR configuration which contains commonly used types and dynamic field definitions for them.
2) no compilation needed. Acts_as_solr comes with the SOLR JAR files, so you just need a proper version of the Java runtime installed.
3) works like a charm. Everything you would expect from a full-text search engine.
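To give an idea of what "dynamic field definitions" means in point 1, here is a small plain-Ruby sketch of the matching idea: a dynamic field is a wildcard pattern such as "*_t", and any field name that fits the pattern gets the associated type. The patterns and type names in the table below are illustrative assumptions, not a copy of the configuration shipped with acts_as_solr, and dynamic_type_for is a hypothetical helper written for this example.

```ruby
# Illustrative patterns only; the real list lives in the SOLR
# configuration that comes with the plugin.
DYNAMIC_FIELDS = {
  "*_t" => "text",
  "*_s" => "string",
  "*_i" => "integer"
}

# Hypothetical helper: return the type of the first pattern whose
# suffix matches the field name, or nil when no pattern matches.
def dynamic_type_for(field_name)
  DYNAMIC_FIELDS.each do |pattern, type|
    suffix = pattern.sub("*", "")
    return type if field_name.end_with?(suffix)
  end
  nil
end
```

With such patterns in place, indexing a new attribute does not require a new field definition, only a field name with a suffix that one of the patterns already covers.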