Managing the Search index

The Search application uses a Lucene 3.0.3 index, supplemented by social facet information. The location of the Search index is mapped to an IBM® WebSphere® Application Server variable, SEARCH_INDEX_DIR. The value of this variable is set to CONNECTIONS_DATA_DIRECTORY/search/index by default.
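
As a minimal sketch of how the default location is composed, the following helper joins a data directory with the search/index suffix described above. The function name and the example path are hypothetical and are not part of the product; in a real deployment the value comes from the SEARCH_INDEX_DIR WebSphere variable.

```python
from pathlib import PurePosixPath


def default_search_index_dir(connections_data_dir):
    """Return the default index path: <data dir>/search/index.

    Illustration only: the helper name is hypothetical, and the real
    value is whatever the SEARCH_INDEX_DIR variable is set to.
    """
    return PurePosixPath(connections_data_dir) / "search" / "index"


# Hypothetical data directory, used only to show the resulting layout.
print(default_search_index_dir("/opt/connections/data"))
```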

The index is generated by retrieving all the necessary information from each HCL Connections application on an administrator-defined schedule. Each task defines which applications to crawl and whether to optimize the index at the end of the task. The following applications can be indexed: Activities, Blogs, Bookmarks, Communities, Files, ECM files, Forums, Profiles, and Wikis. Status updates and community calendar events can also be indexed.

Search uses the WebSphere Application Server scheduling service to create and update the Search index. The scheduling service is based on the Cron calendar, which uses predefined date algorithms to determine when a task should run. Although the scheduling service also supports a Simple calendar, HCL Connections does not currently support it. For more information about the WebSphere Application Server scheduler, see Scheduling tasks.

HCL Connections applications maintain delete and access-control update information for a maximum of 30 days. If an index is not updated for 30 days, it is considered out of date and incremental indexing can no longer bring it up to date. In that case, you must delete and recreate the index to ensure data integrity.
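
The 30-day rule can be expressed as a simple age check. The sketch below is an illustration only: the function name is hypothetical, the caller must supply the time of the last successful indexing run from their own records, and treating exactly 30 days as stale is an assumption.

```python
from datetime import datetime, timedelta

# Retention window for delete and access-control updates, per the text above.
MAX_INDEX_AGE = timedelta(days=30)


def index_needs_rebuild(last_indexed, now):
    """Return True if the index has gone without updates for 30 days or more.

    Assumption: exactly 30 days is treated as stale. Both arguments are
    datetime values supplied by the caller.
    """
    return now - last_indexed >= MAX_INDEX_AGE


now = datetime(2024, 3, 31)
print(index_needs_rebuild(datetime(2024, 3, 20), now))  # 11 days old
print(index_needs_rebuild(datetime(2024, 2, 1), now))   # 59 days old
```

If the check returns True, an incremental update is not enough and the index has to be deleted and recreated.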

Note: When indexing on a Microsoft® Windows® 2008 deployment, you might get the following error: java.io.IOException: Access is denied. This error is caused by an underlying Lucene issue and prevents the index from being updated. To resolve the problem, restart all the machines in the cluster.

The indexing process
The Search index is generated by retrieving information from each of the applications based on a schedule defined by the administrator. Search uses the IBM WebSphere Application Server scheduling service for creating and updating the Search index. The index must be deployed on each node running the Search enterprise application.
Creating Search indexes
Search indexing is automatically configured for HCL Connections during installation.
Index settings
Indexing is automatically configured in HCL Connections. However, when setting up indexing for your environment, you might need to perform additional configuration tasks.
Verifying Search
You can perform a number of steps to verify that Search index creation has completed successfully and the Search application is working as expected.
Adding a new component to an existing index
Add HCL Connections components that were not included when the index was built to an existing index.
Configuring scheduled tasks
The SearchService MBean is used to access a service that provides an administrative interface for adding scheduled task definitions to the Home page database.
Running one-off tasks
The SearchService MBean provides commands that allow you to create an indexing or optimize task that runs exactly once, 30 seconds after the command is called.
Retrieving file content
Use SearchService commands to perform file content retrieval tasks.
Purging content from the index
Use the SearchService.deleteFeatureIndex command to purge content for a specific application from the Search index in a single-node environment.
Reindexing content
Use the retryIndexing command when you want to reindex content that was not indexed successfully during initial or incremental indexing.
Reindexing content on a clustered HCL Connections deployment
When you reindex from scratch on a clustered HCL Connections deployment, reindexing takes place in the background while the system continues to serve requests from the existing index. When reindexing completes, you switch to the new index. The update process includes a ripple restart of the Search nodes.
Deleting persisted seedlist data
You can free up disk space by deleting persisted seedlists from your system using the SearchService.flushPersistedCrawlContent command.
Deleting the index
Delete the index by deleting the contents of the directory specified by the IBM WebSphere Application Server variable, SEARCH_INDEX_DIR.
Listing indexing nodes
Use the SearchService.listIndexingNodes command when you need to check the names of the Search indexing nodes in your deployment. For example, if you want to remove an indexing node from the index management table, you can use this command to verify the name of the node that you want to remove.
Removing a node from the index management table
When you are removing a node from a cluster, use the SearchService.removeIndexingNode wsadmin command to remove the node from the index management table.
Backup and restore
Create a backup of the Search index and save it to a secure location so that it can be used to restore the index in the event of loss or corruption.
Configuring file attachment indexing settings
Edit settings in the search-config.xml file to configure Search for file attachments.
Configuring the number of crawling threads
Edit settings in the search-config.xml file to specify the maximum number of threads used when crawling. Do not specify more threads than the number of applications installed in your deployment.
Configuring the number of indexing threads
Edit settings in the search-config.xml file to specify the maximum number of threads used when indexing.
Boosting search results by tag, title, or recency
Influence the quality and ranking of search results by configuring a boost to the relevance score associated with specified fields in the index.
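
The "Deleting the index" topic above describes deleting the contents of the directory that SEARCH_INDEX_DIR points to. As a minimal sketch, assuming nothing is reading or writing the directory at the time, the following hypothetical helper empties a directory while leaving the directory itself in place. It is not a product command; follow the steps in the linked topic for the supported procedure.

```python
import shutil
from pathlib import Path


def clear_index_dir(index_dir):
    """Delete everything inside index_dir but keep the directory itself.

    Hypothetical helper for illustration. Returns the sorted names of the
    entries that were removed.
    """
    root = Path(index_dir)
    if not root.is_dir():
        raise NotADirectoryError(f"{index_dir} is not a directory")

    removed = []
    for entry in root.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)   # remove a subdirectory and its contents
        else:
            entry.unlink()         # remove a regular file or a symlink
        removed.append(entry.name)
    return sorted(removed)
```

Pass the resolved value of SEARCH_INDEX_DIR as the argument. Keeping the directory itself avoids having to recreate it before the index is rebuilt.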

Parent topic: Administering Search

Related information

Changing the location of the Search index

Scheduling tasks

Troubleshooting Search

Restoring the Search index