Ruby on Rails: Using Full Text Search with Tagging

I’m working on a new Rails application that uses both tagging and full-text indexing. I decided to go with Ferret, a Ruby port of Lucene. With the acts_as_ferret plugin, it’s dead simple to integrate into your application. First, install Ferret on your machine, and then run this command:

script/plugin install svn://projects.jkraemer.net/acts_as_ferret/tags/stable/acts_as_ferret

At this point, you are ready to start enabling searching on your objects. Let’s say you want to enable full text search through a list of blog posts in your application. All you need is to add the acts_as_ferret line to your model class.

class Post < ActiveRecord::Base
  acts_as_ferret
end

And there you have it, full text indexing is enabled. To search through your objects, you’ll do nothing more than this:

Post.find_by_contents("Agile Rails")

That’s pretty much it. Dead simple! Next we need tagging support for our application, because it’s much easier to find a post if it’s tagged with “Agile Ruby Rails”. Here’s where I had a few issues, because there seem to be any number of different tagging plugins and mixins out there. I ended up taking the simplest one, as I figure I’ll add my own features as I go along anyway: the ActsAsTaggable plugin, mostly because I found it on the official Rails wiki. This plugin was every bit as simple to install…

script/plugin install acts_as_taggable
script/generate migration add_tag_support

In case you are unfamiliar, the second line generates a Rails migration, which lets you create your database tables with Ruby code. It fits into the overall Rails framework a lot better, so I’ve been using it. If you don’t use migrations, you’ll need to create a set of tables mimicking what the Ruby code is meant to do. Add this to the add_tag_support_0xx.rb file that the last command creates.

class AddTagSupport < ActiveRecord::Migration
  def self.up
    # Table for your tags
    create_table :tags do |t|
      t.column :name, :string
    end

    create_table :taggings do |t|
      t.column :tag_id, :integer
      # id of tagged object
      t.column :taggable_id, :integer
      # type of object tagged
      t.column :taggable_type, :string
    end
  end

  def self.down
    drop_table :tags
    drop_table :taggings
  end
end

Now add acts_as_taggable to your model class:

class Post < ActiveRecord::Base
  acts_as_ferret
  acts_as_taggable
end

Here’s some sample code to get you started:

# tags a post with both the tags "agile" and "rails"
post.tag_with("agile rails")

# returns a string containing "agile rails", space delimited
post.tag_list

# find all posts tagged with "rails" (a class method, so call it on Post)
Post.find_tagged_with("rails")

Let’s recap. We installed Ferret, the acts_as_ferret plugin, and the acts_as_taggable plugin. At this point we have both tagging support as well as the ability to full text search through our objects as simply as doing Post.find.

But there’s a problem! I’ve tagged my post with “development”, but when I type that into my search box, nothing is coming up…

The issue is that the acts_as_ferret plugin is not indexing the tags, just the Post object’s own attributes. There’s some way to change the default handling by overriding the to_doc method, but I haven’t gotten that far yet, and it seems like overkill to redo this whole plugin just because I want to search the tags. So I’ve come up with a simpler solution: add a new column to my model containing the list of tags. Sure, this violates the DRY principle, but I’m betting it also helps performance, not to mention my sanity. The other thought I had was that I don’t want to maintain a custom version of this plugin if I don’t have to. If, or when, a new version of these two plugins comes out, I’d like to be able to just do a straightforward upgrade.

So, I added a new column to my Post model and called it taglist. When I save the tags using tag_with(), I also save the list of tags into the taglist column. This is really the only duplicated piece of code, and it’s just a one-liner. I’m sure there’s a better way to override the default behavior, but this method was quicker. And now I can find items when I search for “development”. Ferret is soooooooo fast!
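To make the idea concrete, here is a minimal plain-Ruby sketch of mirroring the tags into a taglist column. It is not the plugin's code: Post is a stand-in rather than an ActiveRecord model, tag_with is a simplified imitation of what acts_as_taggable provides, and tag_and_store is a hypothetical name for the one-liner wrapper.

```ruby
# Stand-in for an ActiveRecord model. In a real app, Post would inherit from
# ActiveRecord::Base and tag_with would come from acts_as_taggable.
class Post
  attr_accessor :title, :taglist
  attr_reader :tags

  def initialize(title)
    @title = title
    @tags = []
    @taglist = ""
  end

  # Simplified imitation of the plugin's tag_with: splits on whitespace.
  def tag_with(list)
    @tags = list.split
  end

  # Hypothetical wrapper: tag the post, then copy the tags into the taglist
  # column so a full-text indexer sees them as an ordinary attribute.
  def tag_and_store(list)
    tag_with(list)
    self.taglist = tags.join(" ")
  end
end

post = Post.new("Agile development with Rails")
post.tag_and_store("development rails")
puts post.taglist
```

In a real model you would also need to save the record after setting taglist so that the column, and with it the index, picks up the change.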

Here’s a quick list of the packages I used: Ferret, the acts_as_ferret plugin, and the acts_as_taggable plugin.

Enjoy.

Update: I had mistakenly put acts_as_taggable in the line below, when I meant to put acts_as_ferret.

Update (again): The correct syntax for specifying the fields is like this:

acts_as_ferret :fields => ["fieldname", "otherfield", :tag_list]

That works perfectly, thanks to John Gray in the comments.
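As a rough illustration of what mixing column names and method names in :fields amounts to, here is a plain-Ruby sketch. It is an assumption about the general idea, not acts_as_ferret's actual implementation: Post is a stand-in class, and indexable_fields is a hypothetical helper that simply calls each named attribute or method on the record.

```ruby
# Hypothetical stand-in model; not an ActiveRecord class.
class Post
  attr_reader :title, :description

  def initialize(title, description, tags)
    @title = title
    @description = description
    @tags = tags
  end

  # Imitates the space-delimited string that acts_as_taggable's tag_list returns.
  def tag_list
    @tags.join(" ")
  end
end

# Hypothetical helper: build a field => text hash by calling each named
# attribute or method on the record. Strings and symbols both work with
# public_send, so columns and plain methods are treated the same way.
def indexable_fields(record, fields)
  fields.each_with_object({}) do |field, doc|
    doc[field.to_sym] = record.public_send(field).to_s
  end
end

post = Post.new("Agile Rails", "Notes on testing", %w[development rails])
doc = indexable_fields(post, ["title", "description", :tag_list])
puts doc.inspect
```

The point is only that tag_list is a method returning a string, so it can sit in the field list next to real columns.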


18 Responses to “Ruby on Rails: Using Full Text Search with Tagging”

  1. Matt Rubens Says:

    It’s actually trivially easy to insert your tags into the Lucene index without code duplication — check out the ‘additional_fields’ option to acts_as_ferret.

    http://projects.jkraemer.net/acts_as_ferret/rdoc/classes/FerretMixin/Acts/ARFerret/ClassMethods.html#M000006

    The key point is that acts_as_ferret can either index attributes or methods that return strings. Since acts_as_taggable provides your Post class with the tag_list method already, you should just be able to index that.

    Something like the following should work (haven’t tested it though):

    class Post [:tag_list]
    end

  2. Matt Rubens Says:

    Hmm some formatting and content got lost in that last post.

    To summarize: first declare your model acts_as_taggable, then declare it acts_as_ferret with additional_fields [:tag_list].

  3. Johnny Says:

    I actually did try the additional_fields option, and it doesn’t seem to do anything. I’ve looked through the source, and I don’t see a There’s a way to override the to_doc method at this URL: http://duncan.beevers.net/?m=200606

    I didn’t want to have to deal with that, and since I couldn’t get additional_fields working, I just went with this method.

  4. John Gray Says:

    Having Ferret index method results (like :tag_list) worked out-of-the-box for me with acts_as_ferret. I specified attributes (table columns) as strings and methods as symbols, like so:

    acts_as_ferret :fields => ['title', :tag_list]

    I’ve had issues with corrupted indexes and various platform-related problems, but indexing and searching has been a breeze.

  5. Johnny Says:

    John,

    That worked perfectly! Thank you! Pity there isn’t better documentation on these things.

  6. ben Says:

    After doing the plugin install I’m getting SyntaxError errors from /ruby/lib/ruby/gems/1.8/gems/actionpack-1.12.5/lib/action_view/base.rb:307:in `compile_and_render_template'
    on all of my views, as if something went in and put strange characters in the files. When I remove the plugin everything works again.

    This is before calling any acts_as_ferret code… this is just because of installing the plugin.

    Any Ideas?

    Running on WEBrick 1.3.1, WinXP, Ruby 1.8.4, Rails 1.1.6, ferret-0.10.2-mswin32

    I haven’t tried deploying to the linux staging box for obvious reasons. :)

  7. ben Says:

    I’m pretty sure ferret-0.10.2-mswin32 is doing this and it’s unrelated to your plugin. If I require ‘ferret’ this starts happening. Will explore that further and let you know if I find a fix.

  8. Johnny Says:

    Ben,

    I had a lot of problems getting Ferret to work on Windows. I ended up having to go with an older version of Ferret (0.3.2), and the acts_as_ferret plugin that matches (I think it’s 0.2.0, but I might be wrong, that’s off the top of my head).

    Development on Linux is definitely better when it comes to Ruby/Rails. I have switched over to using Ubuntu for my development box.

    Also, the plugin for acts_as_ferret isn’t mine, so I don’t want to take credit for it =)

  9. Eric Pugh Says:

    The problem with the compile in the .rhtml files is because of weird, possibly tab, characters in your code. I asked, and received the answer:
    http://rubyforge.org/pipermail/ferret-talk/2006-September/001329.html

    I used TextPad to show invisible characters, and saw weird chars!

  10. c Says:

    How do you get it to search titles, or title and description, OR tags? Plus, it says it tokenizes fields by default, but my title definitely wasn’t tokenized according to the results. Any ideas?

  11. Johnny Says:

    By default, it should tokenize everything on your object. Getting it to tokenize related objects might be more difficult.

    If you send me an email I can probably help you out a little more.

  12. c Says:

    Actually it wasn’t tokenization. It turns out that Ferret ignores some words like ‘for’. Did you add acts_as_ferret to the tag.rb class in vendor/plugins/acts_as_taggable/lib, or only in the post.rb model class? It almost seems like it’s not indexing my tag list: when I enter tags for a new post, they don’t get found by the search. Here’s how I did my model post.rb:

    Class Post [:title,:description,:tag_list]


    end

  13. Johnny Says:

    I only added acts_as_ferret to the model class.

    You will want to make your post.rb file look something like this:

    class Post < ActiveRecord::Base
      acts_as_ferret :fields => ['title', 'description', :tag_list]
    end

  14. c Says:

    Funny, that’s exactly what I had. I just noticed, though, that my tags aren’t indexed until I explicitly say Post.rebuild_index(Post) (for now from script/console). Do your new posts get automatically reindexed? Maybe I have a setting wrong? Anyone else seeing this?

  15. c Says:

    Yeah, the titles and descriptions are indexed automatically, but the tags aren’t. That’s the thing.

  16. newbie Says:

    Regarding C’s problem: the tags are not indexed because the index for the model object (Post or whatever) is created as soon as the object is saved, which needs to happen first if you want to have something to tag. This means that ‘title’ and ‘description’ get properly indexed since they’re attributes of the object. However, since tags are applied after the object is saved, there is no ‘tag_list’ to index yet.

    Solution: after tagging (e.g. post.tag_with("agile rails")), you need to call ferret_update so that the index for post gets updated with the new tag_list (e.g. post.ferret_update).

    Hope that was clear enough. ;)
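The ordering problem described in this comment can be sketched in plain Ruby. This is only a toy model under stated assumptions, not the plugins' real code: Post is a stand-in class, indexed_text is a hypothetical string standing in for the Ferret index, and save, tag_with and ferret_update merely imitate the order in which the real methods touch the index.

```ruby
# Stand-in for a model using acts_as_taggable and acts_as_ferret.
# Nothing here is the plugins' real code; it only mimics the ordering.
class Post
  attr_reader :title, :indexed_text

  def initialize(title)
    @title = title
    @tags = []
    @indexed_text = ""
  end

  def tag_list
    @tags.join(" ")
  end

  # Saving snapshots the fields into the "index", as the save hook would.
  def save
    ferret_update
  end

  # Tagging changes the tags but does not touch the index.
  def tag_with(list)
    @tags = list.split
  end

  # Re-snapshot title and tag_list, like re-adding the record to the index.
  def ferret_update
    @indexed_text = [title, tag_list].join(" ")
  end
end

post = Post.new("Agile Rails")
post.save                                          # index built before any tags exist
post.tag_with("development")
stale = post.indexed_text.include?("development")  # false: index predates the tags
post.ferret_update
fresh = post.indexed_text.include?("development")  # true: index rebuilt after tagging
puts "before ferret_update: #{stale}, after: #{fresh}"
```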

  17. Christopher Says:

    Your explanation makes sense, but as a newbie, nothing to me is “dead simple.”

    Acts_as_ferret is particularly confusing, and no one has posted a full application and tutorial demonstrating how to deploy this useful function.

    Do you know of someone who has posted less fragmentary advice for those of us who need more hand-holding?

    You’ve written:
    “And there you have it, full text indexing is enabled. To search through your objects, you’ll do nothing more than this:

    Post.find_by_contents("Agile Rails")

    That’s pretty much it. Dead Simple!”

    But where does this code fit in? What does the search box look like?

    Thanks!

  18. club penguin Says:

    I just noticed though that my tags aren’t indexed until I explicitly say Post.rebuild_index(Post) (now from script/console) Do your new posts get automatically reindexed? Maybe I have a setting wrong?
