You are on page 1of 13

How-to Guide SAP NetWeaver 04

How To Configure TREX 6.1 for Efficient Indexing


Document Version 1.00 January 2005

Applicable Releases: SAP NetWeaver 04

Copyright 2005 SAP AG. All rights reserved. No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP AG. The information contained herein may be changed without prior notice. Some software products marketed by SAP AG and its distributors contain proprietary software components of other software vendors. Microsoft, Windows, Outlook, and PowerPoint are registered trademarks of Microsoft Corporation. IBM, DB2, DB2 Universal Database, OS/2, Parallel Sysplex, MVS/ESA, AIX, S/390, AS/400, OS/390, OS/400, iSeries, pSeries, xSeries, zSeries, z/OS, AFP, Intelligent Miner, WebSphere, Netfinity, Tivoli, and Informix are trademarks or registered trademarks of IBM Corporation in the United States and/or other countries. Oracle is a registered trademark of Oracle Corporation. UNIX, X/Open, OSF/1, and Motif are registered trademarks of the Open Group. Citrix, ICA, Program Neighborhood, MetaFrame, WinFrame, VideoFrame, and MultiWin are trademarks or registered trademarks of Citrix Systems, Inc. HTML, XML, XHTML and W3C are trademarks or registered trademarks of W3C, World Wide Web Consortium, Massachusetts Institute of Technology. Java is a registered trademark of Sun Microsystems, Inc. JavaScript is a registered trademark of Sun Microsystems, Inc., used under license for technology invented and implemented by Netscape. MaxDB is a trademark of MySQL AB, Sweden. SAP, R/3, mySAP, mySAP.com, xApps, xApp, SAP NetWeaver, and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP AG in Germany and in several other countries all over the world. All other product and service names mentioned are the trademarks of their respective companies. Data

contained in this document serves informational purposes only. National product specifications may vary. These materials are subject to change without notice. These materials are provided by SAP AG and its affiliated companies ("SAP Group") for informational purposes only, without representation or warranty of any kind, and SAP Group shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP Group products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty. These materials are provided as is without a warranty of any kind, either express or implied, including but not limited to, the implied warranties of merchantability, fitness for a particular purpose, or non-infringement. SAP shall not be liable for damages of any kind including without limitation direct, special, indirect, or consequential damages that may result from the use of these materials. SAP does not warrant the accuracy or completeness of the information, text, graphics, links or other items contained within these materials. SAP has no control over the information that you may access through the use of hot links contained in these materials and does not endorse your use of third party web pages nor provide any warranty whatsoever relating to third party web pages. SAP NetWeaver How-to Guides are intended to simplify the product implementation. While specific product features and procedures typically are explained in a practical business context, it is not implied that those features and procedures are the only approach in solving a specific business problem using SAP NetWeaver. Should you wish to receive additional information, clarification or support, please refer to SAP Consulting. Any software coding and/or code lines / strings (Code) included in this documentation are only examples and are not intended to be used in a productive system environment. The Code is only intended better explain and visualize the syntax and phrasing rules of certain coding. SAP does not warrant the correctness and completeness of the Code given herein, and SAP shall not be liable for errors or damages caused by the usage of the Code, except if such damages were caused by SAP intentionally or grossly negligent.

Content
1 2 2.1 2.1.1 2.1.2 2.2 2.2.1 2.2.2 2.3 2.4 2.4.1 2.4.2 2.5 2.6 2.7 3 3.1 3.2 About This Document .......................................................................................................2 Indexing Configuration .....................................................................................................2 TREX as Part of Knowledge Management (KM)................................................................. 3 Initial Indexing...................................................................................................................... 3 Ongoing Index Updates....................................................................................................... 3 TREX and Master Data Management/Search Engine Service (MDM/SES) ....................... 4 Initial Indexing...................................................................................................................... 4 Ongoing Index Updates....................................................................................................... 5 TREX and SAP Business Information Warehouse (SAP BW) ............................................ 6 TREX and SAP Knowledge Warehouse (SAP KW) ............................................................ 7 Initial Indexing...................................................................................................................... 7 Ongoing Index Updates....................................................................................................... 7 TREX and mySAP Human Resources e-Recruiting............................................................ 8 TREX and mySAP Product Lifecycle Management (mySAP PLM) .................................... 8 TREX und mySAP Customer Relationship Management (mySAP CRM)........................... 9 Appendix ..........................................................................................................................10 Configuring Queue Parameters......................................................................................... 10 Activating the Delta Index.................................................................................................. 10

-1-

1 About This Document


This document contains recommendations for configuring Search and Classification (TREX) 6.1 for efficient indexing. The recommendations have the following aims: Fast initial indexing of large data sets Fast updating of indexes Fast index replication in distributed TREX systems

Target group of document Technology consultants Constraints Indexing can take place with or without the TREX queue server. This document is targeted only at scenarios in which indexing takes place using the TREX queue server.

2 Indexing Configuration
Both the configuration of the application to which TREX is connected and the configuration of TREX itself are important to the performance of indexing. The parameters below are significant on TREX-side. Queue parameters The queue parameters control the interaction between the queue server and the index server. They determine when the queue server transmits documents to the index server and when the index server begins the indexing process. In a distributed TREX system the queue parameters also control when the updated indexes are replicated.

Delta index The delta index is a separate index that TREX generates in addition to the main index. Activating the delta index at the right time speeds up the index update. The best configuration for these parameters depends on the scenario in which you are using TREX. The individual scenarios differ in the following ways: The types of data to be indexed (whether the data is structured or unstructured) The initial amount of data to be indexed The amount of data to be indexed during productive operation

How fast you want an updated index to be available for searching The sections below describe which settings we recommend for individual scenarios. The appendix contains information on how you set up the configuration.

-2-

2.1 TREX as Part of Knowledge Management (KM)


If you are using TREX as part of Knowledge Management (KM), unstructured data in the form of text documents (Word documents, PDF documents, and so on) needs to be indexed. Usually there are very large document collections (more than 100,000 documents) to be indexed initially. The initial indexing run should be as efficient as possible. After the initial indexing run there may be ongoing changes to the document collection: Documents may be added, changed, or deleted. The indexes should be updated as promptly as possible and replicated if necessary so that the changes are available for searching as quickly as possible.

2.1.1 Initial Indexing


Configure TREX as follows before beginning the initial indexing run: Queue parameters Parameters Schedule Type Schedule Time Schedule Max Documents Transmit Bulk Size Max. Size of Transmit Bulk Synchronize Bulk Size Initial Indexing Mode Merge Time (Delta Index) Replication Time Replicate After Synchronize Delta index Not activated. This is the standard configuration. Recommended Value
Count

Comment

10000 10000 1073741824

(= 100 MB)
10 On


Off

2.1.2 Ongoing Index Updates


Change the configuration of TREX as follows after the initial indexing run: Queue parameters Parameters Schedule Type Schedule Time Schedule Max Documents Transmit Bulk Size Max. Size of Transmit Bulk Synchronize Bulk Size Initial Indexing Mode Recommended Value
Time All-01:00

Comment

Documents are transferred to the index server every hour.

10000 1073741824

(= 100 MB)
1 Off

-3-

Parameters Merge Time (Delta Index) Replication Time Replicate After Synchronize

Recommended Value
All-08:00

Comment The delta index is integrated into the main index every 8 hours.

Distributed TREX system: On Non-distributed TREX system: Off

Delta index Activated. Parameter Threshold Recommended Value


500 MB

Comment

2.2 TREX and Master Data Management/Search Engine Service (MDM/SES)


If you are using TREX with Master Data Management/Search Engine Service (MDM/SES), indexing and searching takes place using TREX business objects. Business objects are structured data such as product data or business partner data. Note the following: If an index might possibly exceed 10 million business objects, we recommend implementing a distributed TREX system with one master and two slave hosts.

2.2.1 Initial Indexing


Configure TREX as follows before beginning the initial indexing run: Queue parameters Parameters Schedule Type Schedule Time Schedule Max Documents Transmit Bulk Size Max. Size of Transmit Bulk Synchronize Bulk Size Initial Indexing Mode Recommended Value
Both All-0:30

Comment

Documents are transferred to the index server every 30 minutes.

10000 100000 100000000 1 On

This value is set automatically by MDM. You do not need to set this value manually.

Merge Time (Delta Index) Replication Time Replicate After Synchronize Delta index


Off

Not activated. This is the standard configuration.

-4-

2.2.2 Ongoing Index Updates


Change the configuration of TREX as follows after the initial indexing run: Queue parameters Parameters Schedule Type Schedule Time Schedule Max Documents Transmit Bulk Size Max. Size of Transmit Bulk Synchronize Bulk Size Initial Indexing Mode Recommended Value
Both All-0:01

Comment

Documents are transferred to the index server every minute.

10000 100000 100000000 1 On

This value is set automatically by MDM. You do not need to set this value manually. The delta index is integrated every night at midnight or every Sunday at 6am. Choose a time when the system load is light.

Merge Time (Delta Index)

All(0)

or
Sun(6)

Replication Time Replicate After Synchronize

Distributed TREX System: On Non-distributed TREX system: Off

Delta index Activated. Parameter Threshold Recommended Value


1000000

Comment You can specify the number of documents in the index (1000000) or the size of the index in GB (1 GB).

or
1 GB

-5-

2.3 TREX and SAP Business Information Warehouse (SAP BW)


If you are using TREX with SAP Business Information Warehouse (SAP BW), configure TREX as follows: Queue parameters Parameters Schedule Type Schedule Time Schedule Max Documents Transmit Bulk Size Max. Size of Transmit Bulk Synchronize Bulk Size Initial Indexing Mode Merge Time (Delta Index) Replication Time Replicate After Synchronize Delta index Not activated. This is the standard configuration. Recommended Value
Both All-0:01

Comment

Documents are transferred to the index server every minute.

10000 6000 10000 1 Off


Off

-6-

2.4 TREX and SAP Knowledge Warehouse (SAP KW)


If you are using TREX with SAP Knowledge Warehouse (SAP KW), configure TREX as follows:

2.4.1 Initial Indexing


Configure TREX as follows before beginning the initial indexing run: Queue parameters Parameters Schedule Type Schedule Time Schedule Max Documents Transmit Bulk Size Max. Size of Transmit Bulk Synchronize Bulk Size Initial Indexing Mode Merge Time (Delta Index) Replication Time Replicate After Synchronize Delta index Not activated. This is the standard configuration. Recommended Value
Both All-0:30

Comment

Documents are transferred to the index server every 30 minutes.

10000 6000 10000 1 Off


Off

2.4.2 Ongoing Index Updates


Change the configuration of TREX as follows after the initial indexing run: Queue parameters Parameters Schedule Type Schedule Time Schedule Max Documents Transmit Bulk Size Max. Size of Transmit Bulk Synchronize Bulk Size Initial Indexing Mode Merge Time (Delta Index) Replication Time Replicate After Synchronize Delta index Not activated. This is the standard configuration. Recommended Value
Both All-0:01

Comment

Documents are transferred to the index server every minute.

10000 6000 10000 1 Off


Off

-7-

2.5 TREX and mySAP Human Resources e-Recruiting


If you are using TREX with mySAP Human Resources e-Recruiting, configure TREX as follows: Queue parameters Parameters Schedule Type Schedule Time Schedule Max Documents Transmit Bulk Size Max. Size of Transmit Bulk Synchronize Bulk Size Initial Indexing Mode Merge Time (Delta Index) Replication Time Replicate After Synchronize Delta index Not activated. This is the standard configuration. Recommended Value
Both All-0:01

Comment

Documents are transferred to the index server every minute.

10000 6000 10000 1 Off


Off

2.6 TREX and mySAP Product Lifecycle Management (mySAP PLM)


If you are using TREX with mySAP Product Lifecycle Management (mySAP PLM), configure TREX as follows: Queue parameters Parameters Schedule Type Schedule Time Schedule Max Documents Transmit Bulk Size Max. Size of Transmit Bulk Synchronize Bulk Size Initial Indexing Mode Merge Time (Delta Index) Replication Time Replicate After Synchronize Delta index Not activated. This is the standard configuration. Recommended Value
Both All-0:01

Comment

Documents are transferred to the index server every minute.

10000 6000 10000 1 Off


Off

-8-

2.7 TREX and mySAP Customer Relationship Management (mySAP CRM)


If you are using TREX with mySAP Customer Relationship Management (mySAP CRM), configure TREX as follows: Queue parameters Parameters Schedule Type Schedule Time Schedule Max Documents Transmit Bulk Size Max. Size of Transmit Bulk Synchronize Bulk Size Initial Indexing Mode Merge Time (Delta Index) Replication Time Replicate After Synchronize Delta index Not activated. This is the standard configuration. Recommended Value
Both All-0:01

Comment

Documents are transferred to the index server every minute.

10000 6000 10000 1 Off


Off

-9-

3 Appendix
3.1 Configuring Queue Parameters
You can use the following tools to configure the queue parameters: Tool TREX Admin Tool (Stand-Alone) TREX monitor in the portal From Release 6.20: TREX Admin Tool in SAP R/3 Path Queue Admin Queue Parameters System Administration Monitoring Knowledge Management TREX Monitor Edit Queue Parameters Transaction TREXADMIN Queue Server Administration Set Queue Parameters

For more detailed information on the tools and the significance of the queue parameters, see help.sap.com/nw04 SAP NetWeaver Information Integration Knowledge Management Administration Guide Technical Operations in Detail TREX Admin Tools in the SAP Library.

3.2 Activating the Delta Index


You can use the following tool to configure the delta index: Tool TREX Admin Tool (Stand-Alone) Path Index Admin Index Info Index Parameters Configure Delta Index

For more detailed information on the delta index, see help.sap.com/nw04 SAP NetWeaver Information Integration Knowledge Management Administration Guide Technical Operations in Detail Delta Index Configuration in the SAP Library.

- 10 -

http://www.sdn.sap.com/irj/sdn/howtoguides