The road so far….

April 10, 2010

Lucene: prefer IndexWriter.commit over IndexWriter.close

Filed under: java, lucene — Rahul Sharma @ 12:45 pm

In one of our projects we were using Lucene to index data. We needed to create separate indexes depending on a property of the data. Because we were writing to separate indexes, we closed the IndexWriter on the current index and then opened a new IndexWriter on the next index whenever required.

At first this seemed to be the right approach, because closing the writer flushes the data to the index and optimises it. But when we started testing the application we found that it ran considerably slower than expected. When we looked into the IndexWriter documentation we found that close is a costly operation. Lucene recommends using commit rather than close if the IndexWriter is going to be used again and again. In our measurements close was about 5 times slower than commit.

Both close and commit flush the data to the Directory and make it visible to any IndexReader/IndexSearcher opened (or reopened) afterwards. Commit only flushes the data, while close also waits for pending segment merges to finish (the optimisation work) and releases the writer's resources. This is why close takes more time than commit.

If you add data continually it is better to use commit, as close would not yield much benefit: you would flush the data and optimise the index, and on the next iteration you would reopen the IndexWriter and optimise the index all over again. Commit, on the other hand, only flushes the data; it neither optimises the index nor closes the writer. The writer can be used again to add more data, and when you are finally done you can call close to perform the required optimisations.
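The pattern described above (one long-lived writer per index, commit after each batch, close once at the very end) can be sketched without any Lucene dependency. Everything named here is hypothetical and only for illustration: `BatchWriter` is a stand-in for the three IndexWriter calls the post relies on, `WriterFactory` stands in for whatever code opens a writer on a given index, and `CountingWriter` is an in-memory fake that counts calls. A real adapter around IndexWriter would also have to deal with the IOException those calls can throw, which this sketch leaves out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical minimal view of an index writer (not a Lucene type). */
interface BatchWriter {
    void addDocument(String document);
    void commit();  // cheap: persist pending documents, writer stays usable
    void close();   // costly: final cleanup, writer is unusable afterwards
}

/** Hypothetical factory that opens a writer for the index named by a key. */
interface WriterFactory {
    BatchWriter open(String indexKey);
}

/** Keeps one writer open per index instead of closing and reopening on every switch. */
final class WriterPool {
    private final Map<String, BatchWriter> writers = new LinkedHashMap<>();
    private final WriterFactory factory;

    WriterPool(WriterFactory factory) {
        this.factory = factory;
    }

    /** Routes a document to the writer for its index, opening that writer on first use. */
    void add(String indexKey, String document) {
        BatchWriter writer = writers.get(indexKey);
        if (writer == null) {
            writer = factory.open(indexKey);
            writers.put(indexKey, writer);
        }
        writer.addDocument(document);
    }

    /** Commits every open writer; call this after each batch. */
    void commitAll() {
        for (BatchWriter writer : writers.values()) {
            writer.commit();
        }
    }

    /** Closes every writer; call this once, when indexing is finished. */
    void closeAll() {
        for (BatchWriter writer : writers.values()) {
            writer.close();
        }
        writers.clear();
    }

    int openWriters() {
        return writers.size();
    }
}

/** In-memory stand-in used only to exercise the pool; it counts calls instead of touching disk. */
final class CountingWriter implements BatchWriter {
    int pending;
    int committed;
    int commits;
    int closes;

    public void addDocument(String document) {
        pending++;
    }

    public void commit() {
        committed += pending;
        pending = 0;
        commits++;
    }

    public void close() {
        // Like a graceful shutdown, closing also persists anything still pending.
        committed += pending;
        pending = 0;
        closes++;
    }
}
```

A caller would create the pool with `new WriterPool(key -> new CountingWriter())`, call `add(key, document)` for each record, `commitAll()` at the end of each batch, and `closeAll()` only when the whole run is done. The expensive close is then paid once per index rather than every time the data property changes.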


2 Comments »

  1. Thanks for the explanation, but can you show any source code that uses commit instead of close? I am trying to do incremental updates using Lucene.

    Comment by Jacobian — October 16, 2010 @ 7:03 pm

    • The usage of commit is pretty simple: you can call commit in place of the close call.

      private IndexWriter writer;
      ….
      writer.addDocument(document);
      ……
      writer.commit();

      You can use these APIs to build the index incrementally.

      Comment by Rahul Sharma — October 18, 2010 @ 9:43 pm

