Agenda
Introduction to TREX in SAP NetWeaver
TREX Functions and Features
TREX Architecture and Details
TREX Platform, Sizing Guidelines, Summary
TREX
TREX is the one search technology in SAP solutions
TREX is deployed in over a dozen SAP products
TREX searches and analyzes unstructured documents as well as structured business data
TREX in knowledge management provides search access to an extensible number of document repositories
TREX will provide the backend technology for Enterprise Search
[Diagram: UI layer (solution-specific UI), application layer (e.g. IC Web Client), and the TREX engine with several indexes]
Document classification
Assign a document to predefined categories
Term search
Find better search terms; discover interesting relationships
Document clustering
Discover sets of related documents
Term clustering
Discover sets of related terms in the current corpus
New:
Multihost enabled
Attribute search enabled not only for case-insensitive ASCII but also for case-insensitive Unicode
[Figure: guided navigation for a search on "Berlin". Result groups: Street (5 ranges, A-D ... M-P ..., with 42 and 35 hits shown), Name (26 ranges, A to Z, 35 hits shown), Age (5 ranges, 0-18 ..., 314 hits shown). The user clicks Street and sees the hits, e.g. Pariserstr (3 hits).]
Search results: hits are grouped by attributes (groups overlap) and listed by attribute value ranges (ranges are disjoint)
Search results: groups are ordered by relevance for this search
Example: all hits are in M-P streets in Berlin
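The grouping of hits into disjoint value ranges can be illustrated with a minimal sketch. This is not the TREX implementation: the hit records, the range boundaries, and the function names `group_hits` and `order_groups` are assumptions made for illustration, and ordering by hit count is only a stand-in for the relevance ordering mentioned on the slide.

```python
from collections import defaultdict

# Hypothetical hits; field names and values are invented for illustration.
HITS = [
    {"name": "Meyer", "street": "Pariserstr", "age": 17},
    {"name": "Adams", "street": "Mainzer Str", "age": 12},
    {"name": "Zeller", "street": "Pariserstr", "age": 40},
]

# Assumed disjoint letter ranges: (label, first letter, last letter).
LETTER_RANGES = [("A-D", "A", "D"), ("E-L", "E", "L"), ("M-P", "M", "P"), ("Q-Z", "Q", "Z")]


def range_label(value, ranges):
    """Return the label of the range that contains the first letter of value."""
    letter = value[0].upper()
    for label, low, high in ranges:
        if low <= letter <= high:
            return label
    return None


def group_hits(hits, attribute, ranges):
    """Bucket hits by one attribute; each hit lands in at most one range."""
    groups = defaultdict(list)
    for hit in hits:
        label = range_label(hit[attribute], ranges)
        if label is not None:
            groups[label].append(hit)
    return dict(groups)


def order_groups(groups):
    """Order groups by hit count, largest first (a stand-in for relevance)."""
    return sorted(groups.items(), key=lambda item: len(item[1]), reverse=True)


by_street = group_hits(HITS, "street", LETTER_RANGES)
by_name = group_hits(HITS, "name", LETTER_RANGES)
for label, members in order_groups(by_street):
    print("street", label, len(members))
```

Every hit appears once in `by_street` and once in `by_name`, which is why groups of different attributes overlap while the ranges within one attribute do not.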
[Architecture diagram: users, browser UI, Java client, web server (server pages), and www access the TREX APIs; the index server (natural language interface engine, join engine, attribute engine) works on the indexes; DB as a data source]
Supported languages: Arabic, Chinese (traditional), Chinese (simplified), Czech, Danish, Dutch, English, Finnish, French, German, Greek, Hebrew, Hungarian, Italian, Japanese, Korean, Polish, Portuguese, Norwegian (Bokmål), Norwegian (Nynorsk), Romanian, Russian, Spanish, Swedish, Thai, Turkish, and more
Supported document formats, e.g.:
ASCII Text: 7 & 8 bit versions available
Corel WordPerfect for Windows: Versions through 9.0
DEC WPS Plus (DX): Versions through 4.0
DEC WPS Plus (WPL): Versions through 4.1
DisplayWrite 2 & 3 (TXT): All versions
DisplayWrite 4 & 5: Versions through 2.0
Enable: Versions 3.0, 4.0 and 4.5
First Choice: Versions through 3.0
Framework: Version 3.0
HTML: Versions through 3.0 (some limitations)
IBM FFT: All versions
IBM Revisable Form Text: All versions
IBM Writing Assistant: Version 1.01
JustWrite: Versions through 3.0
Legacy: Versions through 1.1
Lotus AMI/AMI Professional: Versions through 3.1
Lotus Manuscript: Versions through 2.0
Lotus WordPro (Win16 and Win32 / Intel platforms)
Note: The Lotus WordPro filter is for Win32 on the Intel platforms only, and is provided to SAP "as is", without any representations or warranties.
Lotus WordPro (Non-Windows platforms - text only)
MacWrite II: Version 1.1
MASS11: Versions through 8.0
Microsoft Rich Text Format (RTF): All versions
Microsoft Word for DOS: Versions through 6.0
Microsoft Word for Macintosh: Versions 4.0 through 98
Microsoft Word for Windows: Versions through 2000
Microsoft WordPad: All versions
Microsoft Works for DOS: Versions through 2.0
Microsoft Works for Macintosh: Versions through 2.0
Microsoft Works for Windows: Versions through 4.0
Microsoft Write: Versions through 3.0
MultiMate: Versions through 4.0
Navy DIF: All versions
Nota Bene: Version 3.0
Novell Perfect Works: Version 2.0
Novell WordPerfect for DOS: Versions through 6.1
Novell WordPerfect for Mac: Versions 1.02 through 3.0
Novell WordPerfect for Windows: Versions through 7.0
Office Writer: Version 4.0 to 6.0
PC-File Letter: Versions through 5.0
PC-File+ Letter: Versions through 3.0
PFS:Write: Versions A, B, and C
Professional Write for DOS: Versions through 2.1
Q&A for DOS: Version 2.0
Professional Write Plus: Version 1.0
Q&A Write for Windows: Version 3.0
Samna Word: Versions through Samna Word IV+
SmartWare II: Version 1.02
Sprint: Version 1.0
Total Word: Version 1.2
Unicode Text: All versions
In Detail: http://service.sap.com/pam
Optimize scalability and performance more efficiently for fewer platforms
Ensure a highly performance-optimized TREX for our customers
Ensure completeness of 64-bit coding for the next release
No negative or limiting impact on other SAP solutions is expected, because of the TREX-internal client/server architecture and the planned appliance delivery
Reduce the cost of development and support, and focus on the expertise available in TREX development
TREX releases in current use: TREX 5.0, TREX 6.0, TREX 6.1, TREX 7.0
Out of maintenance: End 2006, 2013, 2014
Summary
The future TREX platform focus on Windows and Linux is a decision made in view of the development resources and expertise available in TREX development. It will enable more focused development to optimize TREX performance and supportability. It does not express any general platform preference or trend at SAP.
TREX can be distributed on multiple hosts
TREX hosts can have dedicated roles (indexing, searching, backup, ...)
TREX processes can run multiple times within the same TREX instance on one host
TREX hosts can be added at any time
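[Diagram: master host and slave hosts with the components RFC, WS, NS, PP, Q, master index server (M IS) and slave index servers (S IS); the master holds the master index (MI), the slaves hold slave indexes (SI)]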
Master index server:
Responsible for indexing
Can also be used for searching, but not in the default configuration
Manages the original version of the index
Slave index server:
Responsible for searching
Ensures search performance during indexing times
Manages a copy of the master index
The index copy is created and updated using a replication procedure
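The division of work between master and slave index servers can be sketched as a toy model. This is only an illustration under stated assumptions: the class `Landscape`, its methods, the host names, and the round-robin distribution of searches are hypothetical and do not represent TREX APIs or its actual replication procedure.

```python
import itertools


class Landscape:
    """Toy model: one master index server and one or more slave index servers."""

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = list(slaves)  # at least one slave is assumed
        # Each host holds its own copy of the index (doc_id -> text).
        self.indexes = {host: {} for host in [master, *self.slaves]}
        self._next_slave = itertools.cycle(self.slaves)

    def index_document(self, doc_id, text):
        # Indexing always goes to the master, which holds the original index.
        self.indexes[self.master][doc_id] = text

    def replicate(self):
        # Each slave receives a fresh copy of the master index.
        for slave in self.slaves:
            self.indexes[slave] = dict(self.indexes[self.master])

    def search(self, term):
        # Searches are spread over the slaves (round robin), not the master.
        host = next(self._next_slave)
        matches = sorted(
            doc_id for doc_id, text in self.indexes[host].items() if term in text
        )
        return host, matches


landscape = Landscape("master01", ["slave01", "slave02"])
landscape.index_document("doc1", "TREX sizing guide")
print(landscape.search("sizing"))  # slave copy is still empty before replication
landscape.replicate()
print(landscape.search("sizing"))  # the next slave now finds the document
```

The sketch shows why search performance is unaffected while the master is busy indexing: searches only touch the replicated copies, at the price of results lagging behind until the next replication.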
Backup server:
Replaces the master index server and queue server if they become unavailable
Inactive while the master server and queue server are available
Data has to be stored centrally (file server)
Use one backup server for the whole system or one backup server per master server
Landscape Example
[Diagram: several hosts, each with the components RFC, WS, NS, PP, IS and Q, holding slave indexes (SI)]
KPIs for TREX sizing
Quick information on BIA sizing
Sizing methods and tools
CPU: processing time (load during indexing and search), expressed in SAPS
Disk: storage of indexes and queues, expressed in MB
Memory: memory consumption during indexing and search, expressed in MB
Network load: transferred amount of data, in KB per server request
Factors that influence sizing:
Amount of indexed data
Search load
Type of indexed data
Number of languages
Amount and frequency of delta indexing
High availability needs
[Diagram: BI Accelerator with TREX engine and indexes for mainly attribute data (records); TREX engine with indexes for mainly text data]
AttrValue 1.4
TREX Engine
#attributes (mixed set of integer, string, text) x #values x #objects < 100 million attributes per index server
Rule of thumb:
100 million attributes: about 1000 SAPS (2 GB RAM)
200 million attributes: about 2000 SAPS (4 GB RAM)
Varies largely depending on the number of string and text attributes
#values applies to multivalue attributes
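The formula and rule of thumb above can be turned into a rough calculation. The following is a minimal sketch, not an official sizing tool: the function name `estimate_attribute_sizing` is hypothetical, the linear scaling between the two rule-of-thumb points is an assumption, and the 100 million limit is treated as inclusive for simplicity.

```python
import math

ATTRIBUTES_PER_INDEX_SERVER = 100_000_000  # limit taken from the formula above
SAPS_PER_100M = 1000    # rule of thumb: about 1000 SAPS per 100 million attributes
RAM_GB_PER_100M = 2     # rule of thumb: about 2 GB RAM per 100 million attributes


def estimate_attribute_sizing(num_attributes, values_per_attribute, num_objects):
    """Rough estimate; assumes CPU and RAM scale linearly with attribute count."""
    total = num_attributes * values_per_attribute * num_objects
    units = total / 100_000_000
    return {
        "total_attributes": total,
        # Assumption: a server may hold up to and including 100 million attributes.
        "index_servers": max(1, math.ceil(total / ATTRIBUTES_PER_INDEX_SERVER)),
        "saps": units * SAPS_PER_100M,
        "ram_gb": units * RAM_GB_PER_100M,
    }


# Hypothetical scenario: 20 single-valued attributes on 10 million objects.
print(estimate_attribute_sizing(20, 1, 10_000_000))
```

Because the result varies largely with the share of string and text attributes, such a figure is only a starting point for the hands-on sizing with test data.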
1. Use the questionnaire to get an overview of your scenario
2. Use the given formula to get a rough idea of how many index servers you need
3. Do hands-on sizing by either
   using test data from the application, or
   generating test data on the TREX machine, if you know the datasets
4. Test indexing and search performance by monitoring CPU load and RAM consumption
5. Conclude whether your datasets allow larger or smaller numbers of attribute sets per index server
6. Split the index or design a landscape with different indexes
[Diagram: four index servers (IS), labelled "Leads to"]
Compression ratio of 1:40 from size of source data (documents) to index size in main memory
Searching: up to 18 000 per hour
Indexing: 24 hrs time consumption
2000 SAPS / 6 GB RAM
4000 SAPS / 20 GB RAM
25 GB (50 GB x 0.5)
Document set size x 0.5 x 0.7
1.25 GB
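The figures above can be reproduced with simple arithmetic. The sketch below assumes that 1.25 GB is the 50 GB document set divided by the 1:40 compression ratio, which the numbers suggest but the slide does not state; the function name `text_index_sizes` and its parameter names are hypothetical. The additional factor 0.7 is not modelled because its meaning is not given.

```python
def text_index_sizes(document_set_gb, size_factor=0.5, compression_ratio=40):
    """Return (size after applying the 0.5 factor, index size in main memory) in GB."""
    scaled_gb = document_set_gb * size_factor
    # Assumption: memory size follows from the 1:40 compression ratio.
    memory_gb = document_set_gb / compression_ratio
    return scaled_gb, memory_gb


scaled_gb, memory_gb = text_index_sizes(50)
print(scaled_gb, memory_gb)  # prints "25.0 1.25"
```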
[Diagram: the application system sends notes and discussion threads to the TREX engine, which holds a notes index and a discussions index]
[Diagram: discussions index landscape. Master host (mytrexmaster02) with RFC, NS, PP, QS and IS in index mode; slave host (mytrexslave02) with RFC, NS, PP and IS in search mode; both hosts serve one logical index.]
2 slave hosts supporting master host system
All in all 5 slave host systems (10 servers)
Hosts: masterhosts01+02, slavehosts01+02, slavehosts03+04, slavehosts05+06, slavehosts07+08, slavehosts09+10
48 GB 60 GB
Disk
Index is updated twice a day with about 10 000 new or changed documents per index run
Running on 12 blades
An Example 6: Notes
[Diagram: notes index landscape. masterhost01 with RFC, NS, PP, QS and IS in index mode; slavehost01 to slavehost05, each with RFC, NS, PP and IS.]
Using a delta index
Merges every hour
Index size: German 7 GB, English 8 GB, Japanese 1 GB
[Diagram: four application servers accessing the discussions index and the notes index]
An Example 8: Summary
5.8 million objects
More than 20 000 new or changed documents per day
35 000 users
170 000 search requests per day
25 languages to be processed
CPU: 36 000 SAPS
RAM: 48 GB
Disk:
Solution requirements
Index updates twice a day for discussion threads and every hour for notes
Stage II
High load during the initial indexing stage: use multiple master index servers and preprocessors to speed up initial indexing
Less indexing and preprocessing required: use the master host (indexing) / slave host (searching) concept and remove one host from the landscape
Start with a small installation: one host for indexing and searching, due to low update frequency and few search requests
Add master and/or slave hosts: when there is more search load than expected and/or a backup server is necessary
No part of this publication may be reproduced or transmitted in any form or for any purpose without the express written permission of SAP AG. The information contained in this publication may be changed without prior notice. Some software products marketed by SAP AG and its distributors contain software components that are the property of other software vendors. SAP, R/3, xApps, xApp, SAP NetWeaver, Duet, SAP Business ByDesign, ByDesign, PartnerEdge, and other SAP products and services mentioned in this document, as well as their respective logos, are trademarks or registered trademarks of SAP AG in Germany and in several other countries worldwide. All other product and service names mentioned in this document, and the associated company logos, are trademarks of their respective companies. The statements in this text are non-binding and serve informational purposes only. Products may have country-specific differences. The information contained in this document is the property of SAP. This document is a preliminary version and is not subject to your license agreement or any other agreement with SAP. This document contains only intended strategies, developments, and functions of the SAP product and does not bind SAP to any particular course of business, product strategy, or development. SAP assumes no responsibility for errors or omissions in these materials. SAP does not warrant the accuracy or completeness of the information, text, graphics, links, or other items contained in these materials. This publication is provided without warranty of any kind, either express or implied. This applies, among other things but not exclusively, to the warranties of merchantability and fitness for a particular purpose, and to the warranty of non-infringement of applicable law.
SAP assumes no liability for damages of any kind, including without limitation direct, special, indirect, or consequential damages, in connection with the use of these materials. This limitation does not apply in cases of intent or gross negligence. Statutory liability for personal injury and product liability remain unaffected. The information that you may access through hot links contained in this material is not under the control of SAP; SAP does not endorse your use of third-party web pages and makes no warranties or commitments regarding third-party web pages. All rights reserved.