Introduction to Hibernate Search

Bring the power of Lucene to your database-backed applications


Configuration

Two sets of configuration files are declared in the sample project -- one for unit testing, and the other packaged in a WAR file for deployment on a Web server. The JPA settings, including entityManagerFactory and transactionManager, are configured in the Spring application context XML files. If you enable bean autowiring in Spring, these XML files become very succinct. The settings for Hibernate Search are specified in the hibernate.cfg.xml file, shown in Listing 12. This file is referenced from the JPA persistence.xml file through a vendor-specific property.

Listing 12. hibernate.cfg.xml

<hibernate-configuration>
   <session-factory>
      <property name="hibernate.search.default.directory_provider">
      org.hibernate.search.store.FSDirectoryProvider
      </property>
      <property name="hibernate.search.default.indexBase">./lucene/indexes</property>
      <property name="hibernate.search.default.batch.merge_factor">10</property>
      <property name="hibernate.search.default.batch.max_buffered_docs">10</property>
      
      <mapping class="demo.hibernatesearch.model.User" />
      <mapping class="demo.hibernatesearch.model.Resume" />
   </session-factory>
</hibernate-configuration>

Where's the Web tier?

The Web tier is not implemented in the sample application. The AppFuse and AppFuse Light projects hosted by Java.net provide project templates covering a variety of Web frameworks. You can pick a template for your preferred Web technology and integrate it with this article's sample application.

The first property specifies the type of directory for Lucene -- a file system directory, in this case. The second property, indexBase, identifies the base directory where the index files reside. merge_factor is a Lucene setting that governs how often index segments are merged, and therefore affects disk I/O; the value of 10 is the default. max_buffered_docs controls how many Lucene Document objects can be buffered in memory during indexing before they are written to disk. These last two parameters may be used for performance tuning. Normally, indexBase is the only setting in this file you need to change.
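To make the role of indexBase concrete, the following plain-Java sketch holds the four settings from Listing 12 in a java.util.Properties object and derives the directory for each index. It is not Hibernate Search code: the class IndexLocationSketch and its methods are hypothetical, and it assumes that each index is stored in a subdirectory of indexBase named after the index, which by default is the fully qualified entity class name.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

// Illustrative sketch only: plain Java, not part of Hibernate Search.
class IndexLocationSketch {

    // The same keys and values as Listing 12, held in a Properties object.
    static Properties listing12Settings() {
        Properties settings = new Properties();
        settings.setProperty("hibernate.search.default.directory_provider",
                "org.hibernate.search.store.FSDirectoryProvider");
        settings.setProperty("hibernate.search.default.indexBase", "./lucene/indexes");
        settings.setProperty("hibernate.search.default.batch.merge_factor", "10");
        settings.setProperty("hibernate.search.default.batch.max_buffered_docs", "10");
        return settings;
    }

    // Assumption: one subdirectory per index, named after the index.
    static Path indexDirectory(Properties settings, String indexName) {
        String base = settings.getProperty("hibernate.search.default.indexBase");
        return Paths.get(base).resolve(indexName).normalize();
    }

    public static void main(String[] args) {
        Properties settings = listing12Settings();
        // Index names taken from the two mapped entity classes in Listing 12.
        System.out.println(indexDirectory(settings, "demo.hibernatesearch.model.User"));
        System.out.println(indexDirectory(settings, "demo.hibernatesearch.model.Resume"));
    }
}
```

Because the path is relative, the resulting directories depend on the working directory of the JVM, which is one reason indexBase is the setting most likely to need changing between the unit-test and WAR configurations.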

Luke: Lucene index toolbox

A number of handy tools are available with which developers can browse, manipulate, and search Lucene index files. Luke is a powerful and comprehensive Java Swing application for this purpose. If you point Luke to the index base created by the sample application, you will see two indexes, demo.hibernatesearch.model.Resume and demo.hibernatesearch.model.User. Opening the Resume index gives you insight into the Lucene Document structure. The Document contains a list of fields, among which _hibernate_class (displayed by Luke as <_hibernate_class>) is created by Hibernate Search to identify the persistent entity class. The Document structure also reflects how entity relationships are represented in Lucene indexes: the relationship from Resume to User is denormalized and nested in the primary Resume index. Note that embeddable objects (components) are not indexed independently, as entities are. Figure 1 shows the Resume index opened in the Luke UI.

Figure 1. The document structure of the resume index displayed in Luke
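The denormalized structure described above can be pictured as a flat map of field names to values. The sketch below is plain Java rather than Lucene or Hibernate Search code. The Resume and User shapes and the field names id, summary, and user.name are hypothetical stand-ins, since the sample application's actual mappings are not shown on this page; only the _hibernate_class field and the two class names come from the text.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: models an index document as a flat map of fields.
class ResumeDocumentSketch {

    // Hypothetical, simplified entity shapes (not the sample project's classes).
    static final class User {
        final String name;
        User(String name) { this.name = name; }
    }

    static final class Resume {
        final long id;
        final String summary;
        final User user;
        Resume(long id, String summary, User user) {
            this.id = id;
            this.summary = summary;
            this.user = user;
        }
    }

    // Flattens a Resume and its associated User into one document's fields.
    static Map<String, String> toDocumentFields(Resume resume) {
        Map<String, String> fields = new LinkedHashMap<>();
        // Field that identifies the entity class, as described in the text.
        fields.put("_hibernate_class", "demo.hibernatesearch.model.Resume");
        fields.put("id", Long.toString(resume.id));
        fields.put("summary", resume.summary);
        // Denormalized association: the User data is copied into the Resume
        // document under a prefixed field name (the name here is assumed).
        fields.put("user.name", resume.user.name);
        return fields;
    }

    public static void main(String[] args) {
        Resume resume = new Resume(1L, "Java developer", new User("Ada"));
        toDocumentFields(resume).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

The point of the sketch is that a search against the Resume index can match on user data without a join, at the cost of storing that data a second time alongside the User index.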

Clustering and more

Hibernate Search provides two solutions for clustered server environments. The easier one is to point the index base to a shared network directory. The more robust and higher-performance approach is to use JMS for asynchronous index updates between a master node and several slave nodes, where the Web servers run. In a nutshell, indexing operations occurring on each slave node are queued in a JMS destination and applied to a master copy of the index files; meanwhile, the master copy is periodically replicated to the local copy on each slave node. A drawback of this approach is that changes to the index are not immediately available on the slave nodes. More details about clustering, index sharing, manual indexing, and performance tuning may be found in the Hibernate Search Reference Guide.
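The master/slave flow and its staleness drawback can be simulated in a few lines of plain Java. This is a single-process sketch under stated assumptions, not Hibernate Search or JMS code: an in-memory BlockingQueue stands in for the JMS destination, lists of strings stand in for index files, and the Master and Slave classes with their methods are hypothetical. Replication is modeled here as the slave copying the master's index on demand.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative simulation only: no real JMS broker or Lucene index is involved.
class ClusteredIndexSketch {

    static final class Master {
        // Stand-in for the JMS destination that receives indexing work.
        final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        // Stand-in for the master copy of the index files.
        final List<String> masterIndex = new ArrayList<>();

        // Applies all queued indexing work to the master copy.
        void processQueue() {
            queue.drainTo(masterIndex);
        }
    }

    static final class Slave {
        private final Master master;
        // Local copy of the index that this node searches against.
        private List<String> localIndex = new ArrayList<>();

        Slave(Master master) { this.master = master; }

        // Indexing work is not applied locally; it is queued for the master.
        void index(String document) {
            master.queue.add(document);
        }

        // Periodic replication step: replace the local copy with the master's.
        void refresh() {
            localIndex = new ArrayList<>(master.masterIndex);
        }

        boolean canSearch(String document) {
            return localIndex.contains(document);
        }
    }

    public static void main(String[] args) {
        Master master = new Master();
        Slave nodeA = new Slave(master);
        Slave nodeB = new Slave(master);

        nodeA.index("resume-1");
        master.processQueue();
        System.out.println("before refresh: " + nodeB.canSearch("resume-1"));

        nodeB.refresh();
        System.out.println("after refresh: " + nodeB.canSearch("resume-1"));
    }
}
```

In this simulation a document indexed on one node is invisible to searches on every node, including the one that submitted it, until the master has processed the queue and the node has refreshed its local copy. That window is the delay the paragraph above describes.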

In conclusion

Relational databases and search engines are not mutually exclusive technologies. Hibernate Search brings the power of Lucene full-text searching to Hibernate ORM through a high-level, universal API without compromising the database-level portability of the application. It seamlessly and transparently integrates the Lucene indexing processes with the Hibernate/JPA-managed database operations on the persistence domain objects. Programming cost and time are greatly reduced due to auto-indexing in Hibernate Search and the annotations in Hibernate Search, Hibernate/JPA, and the Spring 2.5 application framework.

Dr. Xinyu Liu is a Sun Microsystems certified enterprise architect working in a healthcare corporation.

