
Talend Open Studio for MDM
Installation Guide

5.0_a


Adapted for the Talend Open Studio for MDM and Talend MDM Web User Interface v5.0.x releases.

Copyleft
This documentation is provided under the terms of the Creative Commons Public License (CCPL). For more information about what you can and cannot do with this documentation in accordance with the CCPL, please read: http://creativecommons.org/licenses/by-nc-sa/2.0/

Table of Contents

Preface
  1. General information
    1.1. Purpose
    1.2. Audience
    1.3. Typographical conventions
  2. History of changes
  3. Feedback and Support

Chapter 1. Prior to installing MDM
  1.1. Hardware requirements
    1.1.1. Memory usage
    1.1.2. Disk usage
    1.1.3. Compatible Operating Systems
    1.1.4. Compatible Web browsers
    1.1.5. Naming conventions
  1.2. Third-party softwares

Chapter 2. Installing the MDM server
  2.1. Two different installation modes
  2.2. Installing MDM modules using the Windows/Linux executable file
  2.3. Installing MDM modules using the jar file
    2.3.1. Installing in GUI mode
    2.3.2. Installing in Command/Console mode

Chapter 3. Migrating databases and MDM objects
  3.1. Migrating MDM projects
    3.1.1. Migrating the eXist database
    3.1.2. Reimporting and redeploying your Jobs
    3.1.3. Moving the pictures and web resources

Chapter 4. Managing MDM database(s)
  4.1. Managing the eXist database
    4.1.1. eXist tuning and performance
    4.1.2. eXists database backup/restore
    4.1.3. Standalone eXist

Chapter 5. Important Configuration subjects
  5.1. Configuring session timeout for the Web User Interface
  5.2. Configuring access control information for the Studio and the Web User Interface
  5.3. Changing the default ports in JBOSS
    5.3.1. Default port list
    5.3.2. Using an alternate binding


Preface
1. General information
1.1. Purpose
This Installation Guide explains how to install and configure Talend MDM modules and related applications. For detailed explanations of how to use and fine-tune Talend MDM applications, please refer to the Talend Open Studio for MDM Administrator Guide and the Talend MDM Web User Interface User Guide. Information presented in this document applies to Talend MDM releases beginning with 5.0.x.

1.2. Audience
This guide is intended for administrators of Talend Open Studio for MDM and Talend MDM Web User Interface. The layout of GUI screens provided in this document may vary slightly from your actual GUI.

1.3. Typographical conventions


This guide uses the following typographical conventions:
- text in bold: window and dialog box buttons and fields, keyboard keys, menus, and menu options,
- text in [bold]: window, wizard, and dialog box titles,
- text in courier: system parameters typed in by the user,
- text in italics: file, schema, column, row, and variable names,
- One icon indicates an item that provides additional information about an important point. It is also used to add comments related to a table or a figure,
- Another icon indicates a message that gives information about the execution requirements or recommendation type. It is also used to refer to situations or information the end-user needs to be aware of or pay special attention to.
- Any command is highlighted with a grey background.

2. History of changes
The table below lists the changes made in the Talend MDM Installation Guide.

Version | Date | History of Change
v 4.2_a | 19/05/2011 | Creation of an MDM Installation Guide.
v 4.2_b | 11/07/2011 | Updates in the Talend MDM Installation Guide include: a new hardware and software prerequisites chapter; slight modification and reorganization in the MDM server installation chapter; a new section in the database management chapter about managing the Talend XML database.
v 5.0_a | 21/11/2011 | Updates in the Talend MDM Installation Guide include: splitting the MDM IG into two guides, one for Talend Open Studio for MDM and the other for Talend Enterprise MDM Studio; updated documentation to reflect new product names. For further information on these changes, see the Talend website.

3. Feedback and Support


Your feedback is valuable. Do not hesitate to give your input, make suggestions or requests regarding this documentation or product, and find support from the Talend team on Talend's Forum website at: http://talendforge.org/forum


Chapter 1. Prior to installing MDM


This chapter provides useful information on software and hardware prerequisites you should be aware of prior to starting the installation of Talend MDM modules.


1.1. Hardware requirements


To get the most out of Talend MDM, please consider the hardware recommendations listed in the following sections. As the installation of MDM includes Talend Open Studio for Data Integration and its web application, related modules are included in the recommendation lists.

1.1.1. Memory usage


Memory usage heavily depends on the size and nature of your Talend projects. In short, if your Jobs include many transformation components, you should consider increasing the total amount of memory allocated to your servers, based on the following recommendations.

Product | Client/Server | Recommended alloc. memory
Talend MDM Web User Interface | Server | 1GB minimum (default configuration), 4 GB recommended
Talend Open Studio for MDM | Client | 1GB minimum, 2 GB recommended
Talend Administration Center | Server | 1GB
Commandline | Server | 1GB
JobServer | Server | Depending on your projects: 4GB+

1.1.2. Disk usage


The same considerations apply to disk usage. It also depends on your projects, but can be summarized as follows:

Product | Required disk space for installation | Required disk space for use
Talend MDM Web User Interface | 700 MB | (server) 1 GB+; (MDM database) 2 x number of records, in KB. For example: 5 M records = 10 GB. This represents the size that will be needed on the disk. However, we recommend multiplying the size actually needed on the disk by 2 in order to avoid problems during periods of high transaction volume.
Talend Open Studio for MDM | 400 MB | 1 GB+
Talend Administration Center | 50MB (WAR) + 70MB (deployed) | ~50MB (cache)
Commandline | 400MB | 1GB+
JobServer | 3MB | 3MB + project size = 100MB+
Talend Open Studio for Data Integration | 400MB | 1GB+
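The MDM database sizing rule above (2 KB per record, then doubled as a safety margin) can be sketched as a small calculation. This is only an illustration: the function name is hypothetical, and it assumes decimal units (1 GB = 1,000,000 KB), which is what the "5 M records = 10 GB" example implies.

```python
def estimate_mdm_database_disk_gb(record_count, kb_per_record=2, safety_factor=2):
    """Return (needed_gb, recommended_gb) for the MDM database.

    Assumes decimal units (1 GB = 1,000,000 KB), matching the guide's
    "5 M records = 10 GB" example. The function name is hypothetical.
    """
    needed_kb = record_count * kb_per_record
    needed_gb = needed_kb / 1_000_000
    # The guide recommends doubling the size actually needed on disk.
    return needed_gb, needed_gb * safety_factor


needed, recommended = estimate_mdm_database_disk_gb(5_000_000)
print(f"needed: {needed:g} GB, recommended: {recommended:g} GB")
```

For 5 million records this gives 10 GB needed and 20 GB recommended, not counting the 1 GB+ the server itself uses.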


1.1.3. Compatible Operating Systems


Despite our intensive tests, you might encounter some issues when installing Talend MDM on some operating systems. Please refer to the grid below for a summary of supported OS environments. Based on reported issues, some operating systems are considered unsupported even though the issue can be resolved under particular conditions. In such cases, a note provides configuration details.

The grid covers the following modules: Talend MDM Web User Interface, Talend Open Studio for MDM, Talend Administration Center / CommandLine, Talend Open Studio for Data Integration, and JobServer. Each operating system below is listed as Working for these modules, subject to notes 1 and 2:

- WINDOWS XP
- WINDOWS VISTA (32bits / 64 bits)
- WINDOWS 2003/2008 SERVER (32bits/64bits)
- SUN SOLARIS 64bits SUN SPARC
- SOLARIS SUN x86-64
- LINUX MANDRIVA
- LINUX DEBIAN / UBUNTU
- LINUX REDHAT
- LINUX CENTOS
- HP UX
- IBM AIX (32bits / 64 bits)

1. Requires the use of an IBM JVM version 1.6+ 32bits. Only limited support is provided. Contact Support for details.
2. Graphical mode is not supported; only Commandline can be used.

1.1.4. Compatible Web browsers


Despite our intensive tests, you might encounter some issues when accessing Talend MDM Web User Interface with some Web browsers. Please refer to the table below for a summary of supported Web browsers. Based on reported issues, some Web browsers are considered unsupported even though the issue can be resolved under particular conditions. In such cases, a note provides configuration details.

Web browser | Talend MDM Web User Interface
Mozilla Firefox | Working (Versions 3.0 and above)
Microsoft Internet Explorer 7 and above | Working (Versions below 9.0)
Google Chrome | Working (1)
Safari | Working (1)
Opera | Working (1)

1. Only limited support is provided. Contact Support for details.

1.1.5. Naming conventions


The email you received from Talend lists a number of links to the software modules you are allowed to download according to the license you have. The file naming conventions are as follows:

Zip file naming convention | Example | Description
Talend-All-rYYYY-vA.B.C | Talend-All-r63143-V4.2.2.zip | Commandline interface to the IDE + Talend Open Studio for MDM IDE (GUI)
TMDM_TDQEEMPX-Server-All-rYYYY-VA.B.C | TMDM_TDQEEMPX-Server-All-r63143-V4.2.2.jar | The MDM server
TAC-rYYYY-vA.B.C | TAC-r63143-V4.2.2.zip | Talend Administration Center: Web-based application used to administrate Talend Integration Suite projects and users.
org.talend.remote.jobserver_A.B.C_rYYYY | org.talend.remote.jobserver_4.2.2_r63143.zip | JobServer: Standalone execution server
Soamanager-rYYYY-VA.B.C | soamanager-63143-V4.2.2.jar | SOA Manager: helps deploying Web services Jobs

Where:
- YYYY: revision number,
- A.B.C: Major.Minor.Patch, the revision level if relevant.

All software modules must be in the same version/revision. This means that both YYYY and A.B.C must match on both the client side and the server side.
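The rule that revision and version must match across modules can be checked with a short script before installing. The sketch below is not a Talend tool: the function names and regular expressions are assumptions derived only from the example file names listed above, and it assumes that a revision number has at least four digits.

```python
import re

# Assumption: a revision is 4+ digits after "-" or "_", optionally prefixed by "r".
REVISION_PATTERN = re.compile(r"[-_]r?(\d{4,})")
# Assumption: a version is A.B.C preceded by "v", "V", or "_".
VERSION_PATTERN = re.compile(r"[vV_](\d+\.\d+\.\d+)")


def parse_module_name(file_name):
    """Return (revision, version) extracted from a module file name."""
    revision = REVISION_PATTERN.search(file_name)
    version = VERSION_PATTERN.search(file_name)
    if revision is None or version is None:
        raise ValueError(f"unrecognized module file name: {file_name}")
    return revision.group(1), version.group(1)


def modules_match(file_names):
    """True when every file shares the same revision and version."""
    return len({parse_module_name(name) for name in file_names}) == 1


downloads = [
    "Talend-All-r63143-V4.2.2.zip",
    "TMDM_TDQEEMPX-Server-All-r63143-V4.2.2.jar",
    "org.talend.remote.jobserver_4.2.2_r63143.zip",
]
print(modules_match(downloads))
```

File names that follow a different pattern raise a ValueError rather than being silently accepted.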

1.2. Third-party softwares


Some additional third-party applications are required for Talend MDM modules to work smoothly together. As the installation of MDM includes Talend Open Studio for Data Integration and its web application, related applications are included in the lists.

- A Web application server able to deploy WAR files, for example:
  - Apache Tomcat version 5.5 or 6.0 (version 6.0 is recommended) - http://tomcat.apache.org
  and/or
  - JBoss Application Server version 4.2.2 - http://www.jboss.org/jbossas/downloads/
  By default, Talend global Installer installs both of the above servers. You can still customize the installation to deploy everything on JBoss only. However, this configuration requires some expertise. You are also not required to download JBoss prior to installation, as the server is included in the install bundle. For further information on Talend global Installer, see the User Guide.
- Sun Microsystems (JDK or JRE) JVM 1.5+ (but version 1.6+ is recommended) - http://java.sun.com/javase/downloads/index.jsp
- Subversion for storing your projects - http://subversion.tigris.org/ or http://www.visualsvn.com/server/download/


Chapter 2. Installing the MDM server


This chapter provides information about how to install the MDM server using: a graphical installer, your console server or the silent installation XML file generated at the end of installing the server via the installer.


2.1. Two different installation modes


You can install the MDM modules using either an executable or a jar file. The common installation mode is using the executable file to install the MDM modules on Windows or Linux. The less common installation mode is using the jar file to install the MDM modules on all platforms other than Windows and Linux.

2.2. Installing MDM modules using the Windows/Linux executable file


The executable file allows you to launch a global Installer that helps to set up all Talend modules including those for MDM. However, if you want to use the global Installer to install only the MDM modules, you must select the Custom installation type in the Installer. For further information about using the global installer to install MDM on Windows or Linux, see the User Guide.

2.3. Installing MDM modules using the jar file


The jar file allows you to launch a cross-platform MDM-dedicated graphical installer to install JBoss 4.2.2 and deploy the MDM Server in simple click-next steps. The jar file is usually used with platforms other than Windows and Linux.

Using the jar file provided by Talend, you can install the MDM modules in two different modes:
- GUI mode: a cross-platform graphical installer helps you install JBoss 4.2.2 and deploy the MDM Server in simple click-next steps. On Windows, just double-click the .jar file included in the product archive file and follow the instructions. On other platforms, you may execute the jar by right-clicking it and selecting the OpenJDK JRE or Sun's JRE. For further information, see Section 2.3.1, Installing in GUI mode.
- Command/Console mode: open your command line and use the command: java -jar <jar name>.jar -console, and then follow the instructions to complete the installation of the MDM server. For further information, see Section 2.3.2, Installing in Command/Console mode.

The sections below explain these installation modes in detail.

2.3.1. Installing in GUI mode


Talend Open Studio for MDM and Talend MDM Web User Interface, which together make up Talend MDM, require that you install an MDM server.

Prerequisite(s):
- JDK 1.6.0 must be installed. You should also make sure that the JAVA_HOME environment variable is set to point to the JDK directory. For example, if the path is C:\Java\JDKx.x.x\bin, you must set the JAVA_HOME environment variable to point to: C:\Java\JDKx.x.x.
- (Linux only) A window manager must be installed.
- It is highly recommended that the full path to the server installation directory is as short as possible and does not contain any space character.
- If you already have a suitable JDK installed in a path that contains a space, put quotes around the path when setting the value of the environment variable.

To install the MDM server using a .jar file, complete the following:
1. Unzip the server file provided by Talend.
2. On Windows, double-click the cross-platform .jar file to run the installer. On other platforms, you may execute the .jar file by right-clicking it and selecting the OpenJDK JRE or Sun's JRE. A language selection pop-up displays.
3. From the language selection pop-up, select an installation language from the list and click OK to close the pop-up.
4. On the Talend MDM welcome page, click Next.
5. Read the license agreement, select the accept option, and click Next.
6. Read the JBoss information and click Next.
7. Select the check boxes of the packs you want to install, and then click Next. The check boxes of required packs are already selected and unavailable (MDM in this case). If you have a JBoss application server already installed on your machine and you do not want to re-install it, clear the JBoss check box.
8. Browse to where you want to install JBoss and the MDM server, and then click Next. A message displays to inform you about the creation of a target directory.
9. If you want to install JBoss as a Windows service, select the Create JBoss Windows service check box and then click Next.
10. Read the installation settings, and then click Next to start the installation. Two progress bars indicate how much of the installation has been completed.
11. When the progress bars indicate the end of the installation, click Next to display a confirmation message that the installation completed successfully.
12. Click Done to close the installer.

The MDM server is installed. An MBean is provided to manage the MDM server caches; it is available in the JBoss JMX console.

To run the MDM server, execute run.bat (Windows) or run.sh (Linux) in the JBoss.4.2.2.GA folder. To shut the MDM server down, press Ctrl + C in the console window, or run bin/shutdown.bat or bin/shutdown.sh.

2.3.2. Installing in Command/Console mode


You can install the MDM server in a non-GUI mode using the command line.

Prerequisite(s):
- JDK 1.6.0 must be installed. You should also make sure that the JAVA_HOME environment variable is set to point to the JDK directory. For example, if the path is C:\Java\JDKx.x.x\bin, you must set the JAVA_HOME environment variable to point to: C:\Java\JDKx.x.x.
- (Linux only) A window manager must be installed.
- It is highly recommended that the full path to the server installation directory is as short as possible and does not contain any space character.
- If you already have a suitable JDK installed in a path that contains a space, put quotes around the path when setting the value of the environment variable.

To use the command-line capabilities to install the MDM server:
1. Unzip the .jar server file provided by Talend.
2. Open a console on your platform.
3. Enter the command below, and then press the Enter key to launch the installation procedure through this text-only interface:
   java -jar <jar name>.jar -console
4. Follow the instructions to install the MDM server.
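Some of the prerequisites above can be checked before launching the console installer. The sketch below is illustrative only: the function names are hypothetical, it checks only the JAVA_HOME and installation path recommendations, and it builds the documented command without running it. The jar name is taken from the naming-convention example earlier in this guide.

```python
import re


def check_prerequisites(env, install_dir):
    """Return a list of problems found with JAVA_HOME and the install path."""
    problems = []
    java_home = env.get("JAVA_HOME", "").strip('"')
    if not java_home:
        problems.append("JAVA_HOME is not set")
    else:
        # JAVA_HOME must point to the JDK directory, not to its bin folder.
        last_segment = re.split(r"[\\/]", java_home.rstrip("\\/"))[-1]
        if last_segment.lower() == "bin":
            problems.append("JAVA_HOME points to the bin folder, not the JDK directory")
    if " " in install_dir:
        problems.append("installation path contains a space character")
    return problems


def build_console_command(jar_path):
    """Build the documented console-mode command as an argument list."""
    return ["java", "-jar", jar_path, "-console"]


print(check_prerequisites({"JAVA_HOME": r"C:\Java\JDKx.x.x\bin"}, r"C:\mdm"))
print(" ".join(build_console_command("TMDM_TDQEEMPX-Server-All-r63143-V4.2.2.jar")))
```

In a real script, os.environ would be passed as env and the returned argument list handed to subprocess.run.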


Chapter 3. Migrating databases and MDM objects


This chapter provides you with information on how to migrate XML databases and other MDM objects (Jobs, pictures, workflows, etc.) on the MDM server.


3.1. Migrating MDM projects


The MDM repository and master records are both stored in the database. On startup, MDM compares the initial database version (the version that was set when you first launched the software) with the current version of the software and, if necessary, applies all the migration tasks needed to upgrade the database to the correct version. However, as not everything is in the database, you must manually import and redeploy everything that is not in the database, namely: Jobs, workflows, pictures, and web resources.

You must delete your web browser cache and cookies whenever you change the version of either the Studio (Talend Open Studio for MDM or Talend Enterprise MDM Studio) or Talend MDM. Unpredictable behavior or display errors will occur if you do not.

The sections below explain all the tasks you must carry out for a complete migration of all the data objects you have on the MDM server, including master records, Jobs, workflows, pictures and web resources.

3.1.1. Migrating the eXist database


Prerequisite(s): Make sure that neither MDM server is running.

To migrate the eXist database to a newer MDM version, complete the following:
1. In the JBoss folder of the old MDM version, browse to:
   jboss-4.2.2.GA/server/default/deploy/exist-1.4.0-rev11706-TalendPatch.war/WEB-INF/data
2. Copy the data folder of the old MDM version and paste it in the same path in the new MDM version.
3. If you have pictures in your model, make sure to copy jboss-4.2.2.GA/server/default/deploy/zz.50.ext.imageserver.war/upload to the same path on the new server. For detailed information, see Section 3.1.3, Moving the pictures and web resources.
4. Launch the MDM server and then Talend Open Studio for MDM of the new MDM version as usual. You should have access to the migrated data objects.
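The copy steps above can be scripted. The following is a minimal sketch rather than a Talend-provided tool: the helper name migrate_folder is hypothetical, the relative paths are the ones quoted in the procedure, and the demonstration runs against temporary folders instead of real installations. Both servers are assumed to be stopped. It relies on shutil.copytree with dirs_exist_ok, which requires Python 3.8 or later.

```python
import shutil
import tempfile
from pathlib import Path

# Paths relative to the jboss-4.2.2.GA folder, as given in the procedure.
DATA_SUBPATH = Path("server/default/deploy/exist-1.4.0-rev11706-TalendPatch.war/WEB-INF/data")
UPLOAD_SUBPATH = Path("server/default/deploy/zz.50.ext.imageserver.war/upload")


def migrate_folder(old_jboss, new_jboss, subpath):
    """Copy subpath from the old JBoss folder to the same path in the new one.

    Returns False when the source folder does not exist (nothing to migrate).
    """
    source = Path(old_jboss) / subpath
    target = Path(new_jboss) / subpath
    if not source.is_dir():
        return False
    shutil.copytree(source, target, dirs_exist_ok=True)
    return True


# Demonstration on temporary folders standing in for two installations.
with tempfile.TemporaryDirectory() as tmp:
    old_jboss = Path(tmp) / "old" / "jboss-4.2.2.GA"
    new_jboss = Path(tmp) / "new" / "jboss-4.2.2.GA"
    (old_jboss / DATA_SUBPATH).mkdir(parents=True)
    (old_jboss / DATA_SUBPATH / "sample.dbx").write_text("placeholder")
    copied_data = migrate_folder(old_jboss, new_jboss, DATA_SUBPATH)
    copied_upload = migrate_folder(old_jboss, new_jboss, UPLOAD_SUBPATH)
    data_file_present = (new_jboss / DATA_SUBPATH / "sample.dbx").is_file()

print(copied_data, copied_upload, data_file_present)
```

In the demonstration the upload folder is absent from the old installation, so the second call reports that there was nothing to copy.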

3.1.2. Reimporting and redeploying your Jobs


Prerequisite(s): Make sure that the MDM server is up and running.

If you have Talend Jobs in your old MDM application, complete the following to migrate these Jobs:
1. Switch to the Data Integration perspective and import the old workspace to retrieve the Jobs. For further information on importing items from the remote repository, see the Talend Open Studio for Data Integration User Guide.
   You can simply import your Jobs if they were exported as archive files from older MDM Studios. For further information on importing/exporting items, Routes or Jobs, see the Talend Open Studio for Data Integration User Guide.
2. Deploy the Jobs to the new MDM server one by one. For further information, see the Talend Open Studio for MDM Administrator Guide.

You can also copy/paste the job scripts (.war or .zip) from their corresponding folder in the old application to the same folder in the new application: jboss-4.2.2.GA/server/default/deploy for .war files and jboss-4.2.2.GA/jobox/deploy for .zip files. However, this does not import the Job design, which you may need at some point.

Another limitation of this copy/paste mode is that it is recommended only between two MDM servers that have the same major version (the first number of the version identifier). If the major versions differ, it is very likely that the MDM components will not work with the new MDM server. If you are migrating between two identical versions, or two versions where only the minor version differs, copying the .war or .zip files is a lot faster than redeploying the Jobs.
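The major-version rule for the copy/paste mode can be expressed as a one-line comparison. This is a sketch with a hypothetical function name, and it assumes version identifiers of the form A.B.C as used elsewhere in this guide.

```python
def copy_paste_recommended(old_version, new_version):
    """True when both versions share the same major number (first component)."""
    old_major = int(old_version.split(".")[0])
    new_major = int(new_version.split(".")[0])
    return old_major == new_major


print(copy_paste_recommended("4.1.1", "4.2.2"))
print(copy_paste_recommended("4.2.2", "5.0.1"))
```

When the function returns False, redeploying the Jobs from the Studio is the safer route.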

3.1.3. Moving the pictures and web resources


Prerequisite(s): Make sure that neither MDM server is running.

If you use pictures in your data model, complete the following to migrate them to the new MDM server:
1. In the JBoss folder of the old MDM version, browse to:
   jboss-4.2.2.GA/server/default/deploy/zz.50.ext.imageserver.war/upload
2. Copy the upload folder of the old MDM version and paste it in the same path in the new MDM version.
3. Launch the MDM server and then Talend Open Studio for MDM of the new MDM version as usual. You should have access to the migrated data objects.

If you use web resources (images, css, js, etc.) in your smart views, complete the following to migrate them to the new MDM server:
1. In the JBoss folder of the old MDM version, browse to:
   jboss-4.2.2.GA/server/default/deploy/jboss-web.deployer/ROOT.war
2. Copy the web resources from the old MDM version and paste them in the same path in the new MDM version.
3. Launch the MDM server and then Talend Open Studio for MDM of the new MDM version as usual. You should have access to the migrated data objects.


Chapter 4. Managing MDM database(s)


This chapter describes some XML database management options regarding database performance, database backup and restore and the installation of an eXist standalone database.


4.1. Managing the eXist database


Talend Enterprise MDM Studio uses an eXist database to store the MDM repository and master data records. The sections below detail some management operations you can carry out on the eXist database.

4.1.1. eXist tuning and performance


The performance of Talend MDM depends largely on the eXist database. Below are some tuning tips you can use to improve performance.

4.1.1.1. Configuration of the eXist cache


The eXist cache needs all the memory you can give it. By default, the eXist cache is very conservative: 48 MB. There is a very good chance that for every request you make, eXist spends most of its time swapping pages back and forth in the cache. The same applies to most operations, including loading data. The eXist cache has a big impact on paging the record sets in both Talend Open Studio for MDM and the Talend MDM Web User Interface.

If you used the installer to install the MDM server, eXist is part of a web application which is hosted by JBoss:
TMDM_TDQPERTX-Server-All-r50363-V4.1.1/jboss-4.2.2.GA/server/default/deploy/exist-1.4.0-rev11706-TalendPatch.war

It therefore shares the JVM with JBoss. The total memory allocated to the JVM is specified with the -Xmx switch:
- On Windows, use the file below:
  TMDM_TDQPERTX-Server-All-r50363-V4.1.1/jboss-4.2.2.GA/bin/run.bat
- On Linux, use the file below:
  TMDM_TDQPERTX-Server-All-r50363-V4.1.1/jboss-4.2.2.GA/bin/run.conf

We default it to 1 GB. Example in run.bat:
set JAVA_OPTS=%JAVA_OPTS% -Xms512m -Xmx1024m -XX:MaxPermSize=256m

Some of this memory can be allocated specifically to the eXist cache. This is specified in the eXist settings:
TMDM_TDQPERTX-Server-All-r50363-V4.1.1/jboss-4.2.2.GA/server/default/deploy/exist-1.4.0-rev11706-TalendPatch.war/WEB-INF/conf.xml

The property to look for is cacheSize. Change the 48 MB default to something more realistic. It is recommended not to go over 1/2 of the JVM total memory (-Xmx) when the database coexists with other applications. Keep in mind that you still need everything else in JBoss to keep working.
<db-connection cacheSize="48M" (...)> <!-- change to 512M max when -Xmx is 1024 -->

On a 64 bit machine with memory aplenty, and with a 64 bit JVM of course, you can set the -Xmx to a high number, say 8 GB, and cacheSize to much more than half of that.
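The half-of-Xmx guideline for a shared JVM can be derived from the JAVA_OPTS string itself. The sketch below uses hypothetical helper names and only handles -Xmx values that carry a k, m, or g suffix. It computes the suggested upper bound for cacheSize and does not edit conf.xml.

```python
import re


def xmx_megabytes(java_opts):
    """Return the -Xmx value in megabytes, or None when it is absent.

    Only values with a k, m, or g suffix are handled in this sketch.
    """
    match = re.search(r"-Xmx(\d+)([kKmMgG])", java_opts)
    if match is None:
        return None
    value, unit = int(match.group(1)), match.group(2).lower()
    factor = {"k": 1 / 1024, "m": 1, "g": 1024}[unit]
    return int(value * factor)


def max_shared_cache_size(java_opts):
    """Suggested cacheSize upper bound (half of -Xmx) when eXist shares the JVM."""
    xmx = xmx_megabytes(java_opts)
    return None if xmx is None else f"{xmx // 2}M"


print(max_shared_cache_size("-Xms512m -Xmx1024m -XX:MaxPermSize=256m"))
```

With the default 1024 MB heap this yields 512M, the same ceiling given in the conf.xml comment above.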


4.1.1.2. eXist outside J2EE


eXist also works as a standalone application outside a J2EE container. It then has its own JVM, so you can set the -Xmx independently.

You may delete/move:
TMDM_TDQPERTX-Server-All-r50363-V4.1.1/jboss-4.2.2.GA/server/default/deploy/exist-1.4.0-rev11706-TalendPatch.war

And follow the instructions outlined in Section 4.1.3, Standalone eXist.

You can safely move the folder below to this new instance to get your data back:
TMDM_TDQPERTX-Server-All-r50363-V4.1.1/jboss-4.2.2.GA/server/default/deploy/exist-1.4.0-rev11706-TalendPatch.war/WEB-INF/data

4.1.1.3. Range indexes


When you click the Search button with no search criteria, Talend MDM basically performs a full search, and the index is of no help. However, as soon as you set some criteria, you will want to set up the corresponding range indexes in the database to improve performance. The following procedure is based on http://exist-db.org/indexing.html.

Basically, you may want to index every primary key and foreign key, as well as every element specified in the searchable section of the browse items views.

Talend MDM data containers are stored as collections under /db. For instance, the path to a Product data container is: /db/Product.

To specify an index for this collection, create a document called collection.xconf under:
/db/system/config/<same full path to the collection>
For example:
/db/system/config/db/Product/collection.xconf

1. Launch the Admin Client as outlined in Section 4.1.2.1, Launching the eXist Admin Client.
2. Navigate to: /db/system/config
3. Use File - Create Collection to create a db collection under: /db/system/config
4. Go into the newly created collection, so the current path is now: /db/system/config/db
5. Use File - Create Collection to create a collection with the exact same name as the data container to index, e.g. Product.
6. Go into the newly created collection, so the current path is for instance: /db/system/config/db/Product
7. Use File - Create blank Document to create a new empty document with this exact name: collection.xconf.
8. Open collection.xconf to specify the index. Below is a sample collection.xconf:

   <?xml version="1.0" encoding="UTF-8"?>
   <collection xmlns="http://exist-db.org/collection-config/1.0">
     <index>
       <!-- Range indexes -->
       <create qname="Id" type="xs:string"/>
       <create qname="AgencyFK" type="xs:string"/>
       <create qname="Name" type="xs:string"/>
       <create qname="Firstname" type="xs:string"/>
       <create qname="Lastname" type="xs:string"/>
       <!-- Full text index -->
       <lucene>
         <text qname="Product">
           <ignore qname="Id"/>
         </text>
       </lucene>
     </index>
   </collection>

9. Navigate back to the top level /db, select your data container (e.g. Product) and run File - Reindex Collection. If you skip this step, only new records will be indexed, so the index will be incomplete and you will be able to search only new records.

You can specify range indexes for integers, decimals, dates and strings. You can also create full-text indexes. Please refer to http://exist-db.org/indexing.html.

It is recommended to set the element name by QName instead of XPath. Therefore, it is a best practice to always give the PKs the same name, for instance Id, so that if you set an index on this QName, all PKs are indexed.
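When several data containers need the same kind of configuration, the collection.xconf content can be generated rather than typed by hand. The sketch below uses Python's standard xml.etree.ElementTree module. The function names are hypothetical, every range index is assumed to be of type xs:string as in the sample, and the output still has to be stored in eXist through the Admin Client as described in the procedure.

```python
import xml.etree.ElementTree as ET

XCONF_NS = "http://exist-db.org/collection-config/1.0"


def config_path(collection_path):
    """Return where collection.xconf must be stored for a given collection."""
    return "/db/system/config" + collection_path + "/collection.xconf"


def build_collection_xconf(range_qnames, fulltext_qname=None, ignored_qnames=()):
    """Build collection.xconf content with xs:string range indexes.

    Optionally adds a Lucene full-text index on fulltext_qname, ignoring
    the elements listed in ignored_qnames.
    """
    ET.register_namespace("", XCONF_NS)
    root = ET.Element(f"{{{XCONF_NS}}}collection")
    index = ET.SubElement(root, f"{{{XCONF_NS}}}index")
    for qname in range_qnames:
        ET.SubElement(index, f"{{{XCONF_NS}}}create", {"qname": qname, "type": "xs:string"})
    if fulltext_qname:
        lucene = ET.SubElement(index, f"{{{XCONF_NS}}}lucene")
        text = ET.SubElement(lucene, f"{{{XCONF_NS}}}text", {"qname": fulltext_qname})
        for qname in ignored_qnames:
            ET.SubElement(text, f"{{{XCONF_NS}}}ignore", {"qname": qname})
    return ET.tostring(root, encoding="unicode")


print(config_path("/db/Product"))
print(build_collection_xconf(["Id", "Name"], "Product", ["Id"]))
```

Remember that File - Reindex Collection must still be run afterwards, otherwise only new records are indexed.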

4.1.1.4. Full-text indexes (Lucene)


As eXist embeds Lucene, you just need to add the entity name (e.g. Product) in the searchable section of the browse items view in order to allow the web interface to select full-text search. Keep in mind that the search is performed in Lucene, so if your Lucene index is not populated, you will get no results.

You specify the Lucene index in collection.xconf, just like the range indexes:

<lucene>
  <text qname="Product">
    <ignore qname="Id"/>
  </text>
</lucene>

The <ignore> element tells Lucene not to index the Product key (usually meaningless, no need to pollute the index).

4.1.1.5. Too many open files issue


By default, eXist serializes the XML documents extracted from the database onto the file system. It does that twice: on the server side, and as part of the XML-RPC API we use to communicate with eXist. As a result, when you set the page size to a relatively high number (several hundred), eXist creates too many temporary files. This might go unnoticed on some operating systems (Windows), but you are very likely to reach a system limit on Linux, where the maximum number of file handles that one JVM can create is 1024.


eXist creates temporary files in the first place because native XML databases were originally containers for big, if not huge, XML documents, and deserializing those in memory was hardly an option. However, most of the time this use case does not apply to MDM, where you will usually have numerous small documents. In addition, since eXist is an open-source database, we have modified it to optionally not create temporary files. The standard installation by the graphical installer uses this modified version by default. To activate the option, add the following options in JAVA_OPTS:

-Dorg.exist.xmldb.inMemory.remote.content=true
-Dorg.exist.xmlrpc.inMemory.retrieve.content=true

In the end, the JAVA_OPTS variable in run.bat could look like this:

set JAVA_OPTS=%JAVA_OPTS% -Xms512m -Xmx1024m -XX:MaxPermSize=256m -Dorg.exist.xmldb.inMemory.remote.content=true -Dorg.exist.xmlrpc.inMemory.retrieve.content=true
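If the JAVA_OPTS line is maintained by a deployment script, the two in-memory options can be appended idempotently. The helper below is a hypothetical sketch that works on the option string only. It does not read or write run.bat or run.conf.

```python
IN_MEMORY_FLAGS = (
    "-Dorg.exist.xmldb.inMemory.remote.content=true",
    "-Dorg.exist.xmlrpc.inMemory.retrieve.content=true",
)


def with_in_memory_flags(java_opts):
    """Return java_opts with both in-memory flags present exactly once."""
    options = java_opts.split()
    for flag in IN_MEMORY_FLAGS:
        if flag not in options:
            options.append(flag)
    return " ".join(options)


print(with_in_memory_flags("-Xms512m -Xmx1024m -XX:MaxPermSize=256m"))
```

Calling the function again on its own output leaves the string unchanged, so it is safe to run on every deployment.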

4.1.2. eXist database backup/restore


Backups are strongly recommended to protect your data in the event of a system crash or loss of data. Backups are also very useful for exporting data in order to re-import all or part of it into a different database, e.g. while upgrading eXist to a newer version. eXist provides different methods for creating backups. For detailed information, see http://exist.sourceforge.net/backup.html. You may use the Admin Client web application to back up and restore eXist data.

4.1.2.1. Launching the eXist Admin Client


Before you can use the eXist Java Admin Client to back up and restore data, you first need to launch it.

To launch the Java Admin Client, complete the following:

1. Connect to http://localhost:8080/exist/.
2. At the bottom left corner of the page, in the Administration panel, click Launch.
3. Click OK to accept that the Java Webstart Launcher starts the administration client. If a security warning message is displayed, accept to run the application to proceed to the next step.
4. In the login page of the administration client, enter admin in the Username field.
5. Make sure the Type is set to Remote and the URL is set to: xmldb:exist://localhost:8080/exist/xmlrpc
6. In the Password field, enter 1bc29b36f623ba82aaf6724fd3b16718. The administrator password is specified in jboss-4.2.2.GA/bin/mdm.conf.
7. If required, enter a favorite in the Title field, MDM DB for example, and then click the Save button to the right of the page. The new favorite is listed in the Favorites list. The next time you want to launch the administration client, you can double-click this favorite to fill in the login information instead of entering it manually.
8. Click OK to close the login page and open the administration client.


From this page, you can see the content of the eXist database. You can also use the buttons at the top of the page to carry out different management operations on the data, including creating backups.

4.1.2.2. Backing up and restoring data


From the Java Admin Client, by using the backup and restore buttons, you can respectively create backups of the eXist data and restore your database files from a backup. For detailed information, see http://exist.sourceforge.net/backup.html.

4.1.3. Standalone eXist


The XML database can be installed and run in two modes in Talend MDM: either embedded as a component of the application server, which is the default, or as a separate application, independent of the application server. In standalone mode, eXist can run on a different machine from the MDM server. The only requirement is that a TCP connection can be established on a single selectable port from the machine hosting the MDM server to the machine hosting the eXist server. The sections below describe the steps required to install a standalone eXist and use it with Talend MDM.

4.1.3.1. Downloading eXist


Download the latest eXist .jar installer from http://www.exist-db.org/download.html.


Make sure you remember the password of the admin user.

4.1.3.2. Fine-tuning eXist

How to edit {eXist Dir}/bin/functions.d/eXist-settings.sh


Allocate memory to eXist by updating the -Xmx parameter in set_java_options(). For instance, this will allocate 2GB of RAM for eXist:

    set_java_options() {
        if [ -z "${JAVA_OPTIONS}" ]; then
            JAVA_OPTIONS="-Xms128m -Xmx2048m -Dfile.encoding=UTF-8";
        fi
        JAVA_OPTIONS="${JAVA_OPTIONS} -Djava.endorsed.dirs=${JAVA_ENDORSED_DIRS}";
    }

How to edit {eXist Dir}/conf.xml

- If needed, change the admin password in:
      <cluster dbaPassword=[enter your password here]
- Increase the cache memory in:
      <db-connection cacheSize=xxM
  to no more than half of the allocated heap size (i.e. the previous -Xmx parameter in JAVA_OPTIONS).
- Increase the collection cache memory in:
      <db-connection collectioncache=yyM
  to no more than half of the cache, and only if you are using a lot of containers/collections (heavy use of versions and revisions in Talend Open Studio for MDM).
- Activate automatic backups (recommended) by uncommenting the section:
      <job type=system name=backup
  Backups are triggered by default every 6 hours. This may be changed using a cron-like syntax.

How to edit {eXist Dir}/server.xml


Change the port on which eXist listens for requests in the element:

    <listener port=xxxx

A typical value is 8088. The default value (8080) will clash with the port used by JBoss and the MDM Server.

4.1.3.3. Launching eXist


You can run eXist in two modes:

- a lightweight, server-only mode, with no web-based administration,
- the complete mode, which includes the web administration.

Check the eXist documentation for the implications of each mode. Start eXist through:

- server.sh or server.bat, as appropriate, for the server-only mode,
- startup.sh or startup.bat for the complete web-based mode.

4.1.3.4. Updating the MDM server to use the standalone eXist


Stop the MDM Server, then update the eXist DB settings (elsewhere in this guide, these xmldb.* properties are located in jboss-4.2.2.GA/bin/mdm.conf):

    ######################################################
    # eXist DB Setting
    ######################################################
    xmldb.server.name = the server name or the IP address of the server running eXist
    xmldb.server.port = the port set in server.xml above
    xmldb.administrator.username = usually "admin"
    xmldb.administrator.password = the admin password, may be empty after a default install
    xmldb.dburl = xmlrpc/db if you start eXist through server.sh, or exist/xmlrpc/db if you start it through startup.sh
    xmldb.isupurl = leave empty if you start eXist through server.sh, or xmlrpc/db if you start it through startup.sh

Make sure the settings in xmldb.dburl and xmldb.isupurl are consistent with the mode you chose.

Below is an example for a default install of eXist on a machine called exa, starting eXist with server.sh:

    ######################################################
    # eXist DB Setting
    ######################################################
    xmldb.server.name=exa
    xmldb.server.port=8088
    xmldb.administrator.username=admin
    xmldb.administrator.password=
    xmldb.dburl=xmlrpc/db
    xmldb.isupurl=

Since you will not need the embedded eXist, you may archive {MDM JBoss Dir}/server/default/deploy/exist-1.4.0-rev10440.war somewhere else to prevent JBoss from deploying it. You may also want to copy the WEB-INF/data directory to {eXist Dir}/webapp/WEB-INF/data if you want to restore the exact same database.

At this point you can start up the MDM Server.
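
The rule that ties xmldb.dburl and xmldb.isupurl to the eXist start mode can be sketched as a small shell helper. This is only an illustration: the function name exist_db_settings and its arguments are hypothetical, the user name is fixed to admin as in the example above, and the output still has to be copied into the configuration file by hand.

```shell
#!/bin/sh
# Print the eXist DB settings for a given start mode.
# (exist_db_settings is a hypothetical helper, not part of Talend MDM.)
#   $1 = host name or IP address of the machine running eXist
#   $2 = port set in server.xml
#   $3 = start mode: "server" (server.sh) or "startup" (startup.sh)
#   $4 = admin password (may be empty after a default install)
exist_db_settings() {
    case "$3" in
        server)  dburl="xmlrpc/db";       isupurl="" ;;
        startup) dburl="exist/xmlrpc/db"; isupurl="xmlrpc/db" ;;
        *) echo "unknown mode: $3 (expected server or startup)" >&2
           return 1 ;;
    esac
    printf 'xmldb.server.name=%s\n' "$1"
    printf 'xmldb.server.port=%s\n' "$2"
    printf 'xmldb.administrator.username=admin\n'
    printf 'xmldb.administrator.password=%s\n' "$4"
    printf 'xmldb.dburl=%s\n' "$dburl"
    printf 'xmldb.isupurl=%s\n' "$isupurl"
}

# Example: exist_db_settings exa 8088 server ""
```

Deriving both values from a single mode argument avoids the inconsistent combinations that the text warns about.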

4.1.3.5. General notes


The URL to enter in the eXist client to access a standalone, server-mode eXist (started through server.sh) is:

    xmldb:exist://{name of the machine}:8088/xmlrpc

The Talend MDM run.sh or run.bat startup script should include a mechanism that starts the eXist server before the MDM server itself is started. Something like (Unix/Linux/Mac only):

    # Check if eXist is up
    if [ -n "$(pgrep -l -f exist.home)" ]; then
        echo "eXist is already running"
    else
        echo "*****************************************"
        echo "** Starting eXist"
        echo "*****************************************"
        # Timestamp used to give each run its own log file
        d=$(date +%Y%m%d%H%M%S)
        /opt/eXist-1.4/bin/server.sh > "server_$d.log" 2>&1 &
    fi


Chapter 5. Important Configuration subjects


This chapter provides useful information about miscellaneous configuration subjects including configuring session timeout or access control and changing the default ports in JBoss.


5.1. Configuring session timeout for the Web User Interface


The user session timeout for the Talend MDM Web User Interface is set to 30 minutes by default: the business user or data steward is redirected to the login page of the Web User Interface after 30 minutes of inactivity. You can change this session timeout if required.

To set up a new timeout for users connecting to the Web User Interface, complete the following:

1. In the JBoss folder, browse to the web.xml file in: server\default\deploy\jboss-web.deployer\conf
2. Open the web.xml file in a text editor and search for the following tag: <!-- Default Session Configuration -->
3. Change the value of the default session timeout as desired.
4. Save your modifications.

The new session timeout parameter has been set for users connecting to the Web User Interface.
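
On Unix-like systems, the same change can be scripted. The sketch below assumes that the section found under the Default Session Configuration comment contains a standard servlet <session-timeout> element whose value is expressed in minutes; this guide does not show that element, so check your own web.xml first. The function name set_session_timeout is hypothetical.

```shell
#!/bin/sh
# Rewrite the <session-timeout> value (in minutes) in a web.xml file.
# (set_session_timeout is a hypothetical helper name.)
#   $1 = path to web.xml
#   $2 = new timeout, a whole number of minutes
set_session_timeout() {
    file="$1"
    minutes="$2"
    # Accept digits only; anything else is rejected before touching the file.
    case "$minutes" in
        ''|*[!0-9]*)
            echo "timeout must be a whole number of minutes" >&2
            return 1 ;;
    esac
    # Keep a copy of the original next to the file, then rewrite from it.
    cp "$file" "$file.bak" || return 1
    sed "s|<session-timeout>[0-9]*</session-timeout>|<session-timeout>$minutes</session-timeout>|" \
        "$file.bak" > "$file"
}

# Example (path relative to the JBoss folder):
# set_session_timeout server/default/deploy/jboss-web.deployer/conf/web.xml 60
```

The original file is kept as web.xml.bak, so the change can be reverted by copying it back. As with the manual procedure, restart the server if the new value is not picked up.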

5.2. Configuring access control information for the Studio and the Web User Interface
The default authorized users for Talend Open Studio for MDM and Talend MDM Web User Interface use the following authentication information:

- admin as the login and talend as the password for the Studio;
- user/administrator as the login and user/administrator as the password for the Web User Interface.

An administrator can change this access control information if required.

To configure new logins and passwords, complete the following:

1. Browse to the login-config.xml file in: JBoss\server\default\conf
2. Double-click this file to open it and search for the following tag:

       <!-- Policy for talend MDM -->

3. Change the default access control information in the following elements, as desired:

       <module-option name="logins">admin,administrator,user</module-option>
       <module-option name="passwords">talend,administrator,user</module-option>

4. Save your modifications.

The new logins and passwords have been set for the Studio and the Web User Interface.
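
The default values suggest that the two lists are matched by position (admin with talend, administrator with administrator, user with user). Assuming that is how the login module reads them, a quick check that both lists have the same number of entries can catch an editing slip before the server is restarted. The helper names count_items and lists_match are hypothetical:

```shell
#!/bin/sh
# Count the comma-separated items in a string (an empty string counts as 0).
count_items() {
    printf '%s\n' "$1" | awk -F',' '{ print NF }'
}

# Succeed only when both lists contain the same number of items.
lists_match() {
    [ "$(count_items "$1")" -eq "$(count_items "$2")" ]
}

# Example:
# lists_match "admin,administrator,user" "talend,administrator,user" \
#     && echo "logins and passwords line up"
```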

5.3. Changing the default ports in JBOSS


You may also want to browse the JBoss documentation for running multiple instances of JBoss on the same machine at http://community.jboss.org/wiki/ConfiguringMultipleJBossInstancesOnOnemachine.

5.3.1. Default port list


Below is the default port list:

Port    Change in
----    ---------
8080    bin/mdm.conf
        deploy/jboss-web.deployer/server.xml
        deploy/http-invoker.sar/META-INF/jboss-service.xml
        deploy/jbossws.sar/jbossws.beans/META-INF/jboss-beans.xml
8443    deploy/jboss-web.deployer/server.xml
        deploy/jbossws.sar/jbossws.beans/META-INF/jboss-beans.xml
8009    deploy/jboss-web.deployer/server.xml
3873    deploy/ejb3.deployer/META-INF/jboss-service.xml
8093    deploy/jms/uil2-service.xml
8083    conf/jboss-service.xml
1099    conf/jboss-minimal.xml
        conf/jboss-service.xml
1098    conf/jboss-minimal.xml
        conf/jboss-service.xml
4444    conf/jboss-service.xml
4445    conf/jboss-service.xml
4446    conf/jboss-service.xml

5.3.2. Using an alternate binding


1. Browse to the following file: jboss-4.2.2.GA\server\default\conf\jboss-service.xml
2. Uncomment the following:

       <mbean code="org.jboss.services.binding.ServiceBindingManager"
              name="jboss.system:service=ServiceBindingManager">
           <attribute name="ServerName">ports-01</attribute>
           <attribute name="StoreURL">${jboss.home.url}/docs/examples/binding-manager/sample-bindings.xml</attribute>
           <attribute name="StoreFactoryClassName">org.jboss.services.binding.XMLServicesStoreFactory</attribute>
       </mbean>

3. In \jboss-4.2.2.GA\bin\mdm.conf, modify the HTTP port accordingly:

       #xmldb.server.port=8080
       xmldb.server.port=8180

4. Windows service only: update the port in \jboss-4.2.2.GA\bin\service.bat:

       call shutdown -s jnp://localhost:1199 -S < .s.lock >> shutdown.log 2>&1
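
The two values shown above (8080 becoming 8180, and 1099 becoming 1199) suggest that the ports-01 binding shifts ports by 100. Assuming the same offset applies to the other ports, which should be verified against sample-bindings.xml, the expected values can be listed with a short sketch. The function name shift_ports is hypothetical:

```shell
#!/bin/sh
# Print each port next to the value it would take under a fixed offset.
# (shift_ports is a hypothetical helper; the offset is an assumption.)
#   $1 = offset, remaining arguments = ports
shift_ports() {
    offset="$1"
    shift
    for port in "$@"; do
        printf '%s -> %s\n' "$port" "$((port + offset))"
    done
}

# Example with the default port list from this guide:
# shift_ports 100 8080 8443 8009 3873 8093 8083 1099 1098 4444 4445 4446
```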
