Creating, Updating, Deleting documents in a Lucene Index with symfony

by Dave Dash 24Apr07

Previously we covered an all-at-once approach to indexing objects in your symfony app. But for some reason, people insist on letting users sign up or change their email addresses, and all of a sudden our wonderful Lucene index is out of date.

Here lies the strength of using Zend Search Lucene in your app: you can interact with a Lucene index no matter how it was created, and add, update and delete documents in it.

The last thing you want to do is have a cron job in charge of making sure your index is always up to date by reindexing regularly. This is an inelegant and inefficient process.

A smarter method would be to trigger an update of the index each time you update your database. Luckily the ORM layer allows us to do this using objects (in our case Propel objects).

If we look at our user example from before, we already set ourselves up to do this easily with our User::generateZSLDocument() function, which does most of the heavy lifting.

We can make a few small changes to the User class.

We add an attribute called $reindex. When it is false we don't need to worry about the index. When something significant changes, like an update to your name or email address, we set $reindex to true. Then, when we save, an overridden save method checks the flag and, if it is set, replaces the user's document in the index.
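
Here is a minimal sketch of what those changes might look like. It is an illustration under a few assumptions rather than the exact listing from this project: BaseUser and FakeIndex are stand-ins for the Propel-generated base class and the Zend Search Lucene index so the snippet runs by itself, the static $index property is a hypothetical place to keep the index, and the bodies of removeFromIndex() and generateZSLDocument() are placeholders for the versions from the previous article.

```php
<?php
// Stand-in for the Propel-generated BaseUser, so this sketch runs without Propel.
class BaseUser
{
    protected $id, $name, $email;

    public function setId($v)    { $this->id = $v; }
    public function getId()      { return $this->id; }
    public function setName($v)  { $this->name = $v; }
    public function getName()    { return $this->name; }
    public function setEmail($v) { $this->email = $v; }
    public function getEmail()   { return $this->email; }
    public function save($con = null) { /* Propel would write the row here */ }
}

// Hypothetical in-memory stand-in for the Zend Search Lucene index.
class FakeIndex
{
    public $docs = array();
    public function addDocument($doc) { $this->docs[] = $doc; }
}

class User extends BaseUser
{
    public static $index;        // assumption: where this sketch keeps its index
    protected $reindex = false;  // true once an indexed field has changed

    public function setName($v)
    {
        parent::setName($v);
        $this->reindex = true;
    }

    public function setEmail($v)
    {
        parent::setEmail($v);
        $this->reindex = true;
    }

    public function save($con = null)
    {
        parent::save($con);
        if ($this->reindex) {
            $index = $this->removeFromIndex();   // drop the stale document
            $index->addDocument($this->generateZSLDocument());
            $this->reindex = false;              // nothing left to sync
        }
    }

    // Placeholder: removes this user's documents and returns the index.
    public function removeFromIndex()
    {
        foreach (self::$index->docs as $key => $doc) {
            if ($doc['id'] === $this->getId()) {
                unset(self::$index->docs[$key]);
            }
        }
        self::$index->docs = array_values(self::$index->docs);
        return self::$index;
    }

    // Placeholder: the real method builds a Zend_Search_Lucene_Document.
    public function generateZSLDocument()
    {
        return array(
            'id'    => $this->getId(),
            'name'  => $this->getName(),
            'email' => $this->getEmail(),
        );
    }
}
```

Saving an untouched object skips the index entirely, and because the flag is reset after each reindex, repeated saves only touch the index when an indexed field has changed again.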

Now the index holds the exact same data that we created during our original indexing. This handles creating and updating objects, but we still miss updating the index when objects are deleted.

Luckily we already made a function User::removeFromIndex() to remove any related documents from the index, so our delete function can be pretty simple. It only needs to delete the row and call that function.
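
A minimal sketch of such a delete override follows, under the same kind of assumptions: BaseUser and FakeIndex are stand-ins so the snippet runs without Propel or Zend Search Lucene, the static $index property is hypothetical, and the body of removeFromIndex() is a placeholder for the real one.

```php
<?php
// Stand-in for the Propel-generated BaseUser, so this sketch runs without Propel.
class BaseUser
{
    protected $id;
    protected $deleted = false;

    public function setId($v)   { $this->id = $v; }
    public function getId()     { return $this->id; }
    public function isDeleted() { return $this->deleted; }

    public function delete($con = null)
    {
        $this->deleted = true;  // Propel would delete the row here
    }
}

// Hypothetical in-memory stand-in for the Zend Search Lucene index.
class FakeIndex
{
    public $docs = array();
}

class User extends BaseUser
{
    public static $index;  // assumption: where this sketch keeps its index

    public function delete($con = null)
    {
        parent::delete($con);      // remove the row first
        $this->removeFromIndex();  // then drop the matching documents
    }

    // Placeholder: removes this user's documents and returns the index.
    public function removeFromIndex()
    {
        foreach (self::$index->docs as $key => $doc) {
            if ($doc['id'] === $this->getId()) {
                unset(self::$index->docs[$key]);
            }
        }
        self::$index->docs = array_values(self::$index->docs);
        return self::$index;
    }
}
```

The object still knows its id after the row is gone, so the index lookup works in either order; deleting the row first simply means a failed database delete never leaves the index missing a live user.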


"Creating, Updating, Deleting documents in a Lucene Index with symfony" is filed under symfony. It was published in April 2007.

8 Responses to “Creating, Updating, Deleting documents in a Lucene Index with symfony”


  1. Markus, posted April 25th, 2007 - 1:57 am

    Hi,

    Why not use $this->modifiedColumns from the BaseXXX class and check it against the columns in the index, instead of overriding all the setter methods?

    Bye, Markus

  2. Markus, posted April 25th, 2007 - 2:20 am

    Additionally, in my opinion this could easily be implemented as a Propel behaviour…? Then you would only have to define the columns and the index to be used, and the rest could be generic functions added using the sfMixer class.

  3. Dave Dash, posted April 25th, 2007 - 10:43 am

    Hey Markus,

    I’ve been writing these tutorials as I wrote the code, and there’s plenty of room for optimization.

    I’ll look at both the modifiedColumns property and propel behaviors, to see if this can be done better.

    The most recent project I was working on involved multiple tables/objects and the code seemed very un-DRY, so I thought I might have to finally learn how to use sfMixin ;)

    Let me know if you come up with some better examples; in the meantime I’ll try to tweak my code and post something.

    Best,

    dd

  4. Dave Dash, posted April 25th, 2007 - 12:11 pm

    Markus,

    This is great. It’s a minor point, but we should first register which columns should trigger an index update; I can take care of that…

    I think array_intersect() can take care of that, and we can just define an indexedColumns array as part of the class.

    … without looking at proper syntax:

    class User extends BaseUser {
      private $indexedColumns = array(UserPeer::NAME, UserPeer::EMAIL);

      ...

      function save($con = null)
      {
        // Check before saving: Propel resets modifiedColumns once the row is written.
        $reindex = count(array_intersect($this->indexedColumns, $this->modifiedColumns)) > 0;
        parent::save($con);
        if ($reindex)
        {
          $index = $this->removeFromIndex();
          $doc   = $this->generateZSLDocument();
          $index->addDocument($doc);
        }
      }
    }
    
  5. David Cook, posted June 10th, 2008 - 3:16 pm

    Hey Dave,

    Very nice articles about Zend Search Lucene and Symfony.

    I was wondering whether ZSL allows indexing across multiple tables / fields. It seems like you mention it in one of your comments above, but I can’t seem to find a quick answer / example of this.

    Do you happen to know if this is possible and if so could you direct me somewhere that shows an example of this in action?

    Thanks,

    DC

  6. Dave Dash, posted June 10th, 2008 - 3:24 pm

    Hi David,

    If you are talking about creating separate indexes (let’s say Tags and Urls) and want to make one query to Lucene on both, I don’t think that’s possible (I’m not up to date on Lucene).

    If I was doing this with the technology of last year, I would query my Tags and Urls indexes separately and perhaps create a collection of Tags and Urls that are sorted by search score, and then return the result that way.

    IMO, it’s not something you want at such a low level like lucene, but I can see moving it to a middle tier so your front end doesn’t have to worry about it.

    Hope that helps.

    -d

  7. David Cook, posted June 11th, 2008 - 2:01 pm

    Hey Dave,

    Here’s the basic scenario I have.

    I have users and users are grouped into categories.

    Categories: Moderator, Administrator, User

    Users: Sam Smith – Moderator; Sara Steigler – Administrator; David User – Moderator; Alex Franz – User

    So I want to do a search that uses both the user’s name and the category’s name. The search would combine both of these elements and come up with a score. I’d like the name to be weighted more heavily than the category. So if I searched for ‘user’, then ‘David User’ would come back as the highest rated result and ‘Alex Franz’ would come back as the next highest.

    Is this possible using the Zend Search Lucene or would I have to do something else? I noticed you could put the entire contents of a page into an index item so I’m thinking I’d just drop all the data pertaining to a certain user in for each index.

    I was also considering using something like Sphinx for my search needs.

    Let me know your thoughts, thanks for your help.

    DC

Who's linking?

  1. 1 Finding things using Zend Search Lucene in symfony at Spindrop Pingback on May 23rd, 2007
    "[...] now know how to manipulate the index via our model classes. But let’s actually do something useful with our ... "
