Learn to deploy,
administer, and
maintain
Wei-Dong Zhu
Dan Adams
Dominik Baer
Bill Carpenter
Chuck Fay
Dan McCoy
Thomas Schrenk
Bruce Weaver
ibm.com/redbooks
International Technical Support Organization
April 2008
SG24-7547-00
Note: Before using this information and the product it supports, read the information in
“Notices” on page xi.
This edition applies to Version 4, Release 0 of IBM FileNet Content Manager (product
number 5724-R81).
Notices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
Trademarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
The team that wrote this book . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiv
Become a published author . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvi
Comments welcome. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvi
Contents v
6.3 P8 Content Manager administration support. . . . . . . . . . . . . . . . . . . . . . 149
6.4 JAAS overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
6.5 Product documentation for security. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
10.4.2 Access to the environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260
10.4.3 Post-cloning activities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260
10.4.4 Backup changes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260
10.5 Deployment by export, transform, and import . . . . . . . . . . . . . . . . . . . . 261
10.5.1 Incremental deployment compared to full deployment . . . . . . . . . 261
10.5.2 Reduce complexity of inter-object relationships . . . . . . . . . . . . . . 262
10.5.3 Deployment automation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 264
10.6 P8 Content Manager deployment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 264
10.6.1 CE-Export . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
10.6.2 CE-Objects transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 268
10.6.3 CE-Import . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
10.6.4 Exporting and importing other components . . . . . . . . . . . . . . . . . 269
viii IBM FileNet Content Manager Implementation Best Practices and Recommendations
11.11.3 Online backup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
11.11.4 System restore . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 312
11.11.5 Consistency Check utility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 312
11.11.6 Application consistency check . . . . . . . . . . . . . . . . . . . . . . . . . . 314
11.12 Task schedule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315
11.13 Best practice summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 316
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 401
Notices
This information was developed for products and services offered in the U.S.A.
IBM may not offer the products, services, or features discussed in this document in other countries. Consult
your local IBM representative for information on the products and services currently available in your area.
Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM
product, program, or service may be used. Any functionally equivalent product, program, or service that
does not infringe any IBM intellectual property right may be used instead. However, it is the user's
responsibility to evaluate and verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in this document.
The furnishing of this document does not give you any license to these patents. You can send license
inquiries, in writing, to:
IBM Director of Licensing, IBM Corporation, North Castle Drive, Armonk, NY 10504-1785 U.S.A.
The following paragraph does not apply to the United Kingdom or any other country where such
provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION
PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR
IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT,
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer
of express or implied warranties in certain transactions, therefore, this statement may not apply to you.
This information could include technical inaccuracies or typographical errors. Changes are periodically made
to the information herein; these changes will be incorporated in new editions of the publication. IBM may
make improvements and/or changes in the product(s) and/or the program(s) described in this publication at
any time without notice.
Any references in this information to non-IBM Web sites are provided for convenience only and do not in any
manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the
materials for this IBM product and use of those Web sites is at your own risk.
IBM may use or distribute any of the information you supply in any way it believes appropriate without
incurring any obligation to you.
Information concerning non-IBM products was obtained from the suppliers of those products, their published
announcements or other publicly available sources. IBM has not tested those products and cannot confirm
the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on
the capabilities of non-IBM products should be addressed to the suppliers of those products.
This information contains examples of data and reports used in daily business operations. To illustrate them
as completely as possible, the examples include the names of individuals, companies, brands, and products.
All of these names are fictitious and any similarity to the names and addresses used by an actual business
enterprise is entirely coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrate programming
techniques on various operating platforms. You may copy, modify, and distribute these sample programs in
any form without payment to IBM, for the purposes of developing, using, marketing or distributing application
programs conforming to the application programming interface for the operating platform for which the
sample programs are written. These examples have not been thoroughly tested under all conditions. IBM,
therefore, cannot guarantee or imply reliability, serviceability, or function of these programs.
SAP, and SAP logos are trademarks or registered trademarks of SAP AG in Germany and in several other
countries.
Oracle, JD Edwards, PeopleSoft, Siebel, and TopLink are registered trademarks of Oracle Corporation
and/or its affiliates.
ReplicatorX, Network Appliance, SnapMirror, SnapLock, NetApp, and the Network Appliance logo are
trademarks or registered trademarks of Network Appliance, Inc. in the U.S. and other countries.
FileNet, and the FileNet logo are registered trademarks of FileNet Corporation in the United States, other
countries or both.
Enterprise JavaBeans, EJB, Java, JavaBeans, JDBC, JRE, JSP, JVM, J2EE, Solaris, Sun, and all
Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or
both.
Active Directory, Excel, Microsoft, Outlook, PowerPoint, SharePoint, Visio, Windows Server, Windows, and
the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Linux is a trademark of Linus Torvalds in the United States, other countries, or both.
Other company, product, or service names may be trademarks or service marks of others.
Preface
IBM® FileNet® Content Manager provides full content life cycle and extensive
document management capabilities for digital content. IBM FileNet Content
Manager is tightly integrated with the family of IBM FileNet P8 products and
serves as the core content management, security management, and storage
management engine for the products.
This IBM Redbooks® publication covers the implementation best practices and
recommendations for IBM FileNet Content Manager solutions. It introduces the
functions and features of IBM FileNet Content Manager, common use cases of
the product, and a design methodology that provides implementation guidance
from requirements analysis through deployment and administration planning.
It is important to note that this book describes features offered in IBM FileNet
Content Manager Version 4.0. Many of the features described in the book also
apply to previous versions of IBM FileNet Content Manager. For specific details,
refer to the correct version of IBM FileNet Content Manager documentation.
Bill Carpenter is an ECM Architect with IBM in the Seattle, Washington, area.
Bill has nine years of experience in Enterprise Content Management, at FileNet
and IBM, as a developer, development manager, and architect. He has previous
experience in building large software systems at Fortune 50 companies and has
also worked at small companies. He has been a frequent mailing list and patch
contributor to several open source projects. Bill holds degrees in Mathematics
and Computer Science from Rensselaer Polytechnic Institute.
Chuck Fay is a Software Architect at IBM in Costa Mesa, California, reporting to
the CTO for Enterprise Content Management, with responsibilities in system
architecture, patents, and industry standards. He has thirty years of experience
in the software industry, as a developer, manager, and CTO staff member,
including eight years with Xerox Corporation, nineteen with FileNet Corporation,
and one with IBM. At FileNet, he was responsible for the design, development,
and deployment of complex document image management systems and
electronic document management application products. For the past six years,
he has advised FileNet (and now IBM) engineering, support, and technical sales
representatives, as well as clients, in the area of system architecture for high
availability and disaster recovery for IBM FileNet P8 products. He holds an A.B.
in Philosophy and an M.S. in Computer Science, both from Stanford University.
Dan McCoy is a Principal Consultant with IBM. He lives in San Diego, California.
He has 10 years of experience working with clients in Content and Records
Management. He holds a degree in Computer Science from San Diego State
University. His areas of expertise include FileNet/IBM Content Services, P8,
Records Manager, and Email Manager. He has written extensively about best
practices for implementing Records and Email Management Systems.
Very special thanks to Michael Seaman who has contributed to the review and
part of the writing remotely from England.
We also thank the following people for their contributions to this project:
Deanna Polm
International Technical Support Organization, San Jose Center
Kevin Bates
Debbie Lelek
Qiuping Lu
Xingdong Ji
Gregory Miller
Tim Morgan
Joseph Raby
Shari Perryman
Yvonne Santiago
Diane Searer
Michael Tucker
Shawn Waters
Mike Winter
IBM Software Group, Costa Mesa, California
Your efforts will help increase product acceptance and client satisfaction. As a
bonus, you will develop a network of contacts in IBM development labs, and
increase your productivity and marketability.
Find out more about the residency program, browse the residency index, and
apply online at:
ibm.com/redbooks/residencies.html
Comments welcome
Your comments are important to us!
Send your comments in an e-mail to:
redbooks@us.ibm.com
Mail your comments to:
IBM Corporation, International Technical Support Organization
Dept. HYTD Mail Station P099
2455 South Road
Poughkeepsie, NY 12601-5400
1
The ability to make decisions better and faster is a real competitive advantage
that IBM Enterprise Content Management (ECM) solutions can help provide.
IBM ECM improves workforce effectiveness by enabling organizations to
transform their business processes, access and manage all forms of content,
secure and control information related to compliance needs, and optimize the
infrastructure required to deliver content anywhere at any time.
IBM ECM helps organizations make quick, smart, and cost-effective decisions,
right at the moment that it matters the most.
The IBM FileNet P8 products are based on the IBM FileNet P8 Platform, which
is a unified content, process, and compliance platform that offers maximum
flexibility, accelerates application deployment, and lowers the total cost of
ownership. It is an integrated platform that provides interoperability to a wide
selection of database, operating system, storage, security, and Web server
environments. It serves as the core content management, security management,
and storage management engine for the IBM FileNet P8 family of products.
The IBM FileNet P8 Platform includes the baseline components for enterprise
content management solutions, including Content Engine, Process Engine,
Application Engine, and Rendition Engine. These components address
enterprise content management and Business Process Management
requirements. We discuss these components (excluding the Rendition Engine) in
3.1.1, “Major components of an IBM FileNet P8 Platform” on page 28.
All IBM FileNet P8 Platform capabilities are inherited and therefore are available
in all IBM FileNet P8 products. Additional components can be added to a system
to enable additional capabilities.
The IBM FileNet P8 Platform capabilities can be leveraged for a wide range of
enterprise scalable solutions, including Business Process Manager, Content
Manager, Email Manager, Forms Manager, Image Manager, Records Manager,
and more.
For a list of IBM FileNet P8 family of products and a brief introduction to several
of these products, see 1.4, “IBM FileNet P8 family of products” on page 8.
In the next section, we focus on the main product that this book addresses, P8
Content Manager.
P8 Content Manager provides the ability to actively manage content across the
enterprise regardless of the repository in which it resides, using Content
Federation Services. It is integrated with the IBM FileNet P8 Platform, which
provides interoperability with the widest selection of database, operating system,
storage, security, and Web server environments in the industry.
According to the October 2007 published white paper, FileNet P8 4.0: Content
Engine Performance and Scalability using WebSphere Application Server v6 and
DB2 9 Data Server on IBM System p5 595:
You can download the white paper from the following Web site:
http://w3.ibm.com/software/xl/portal/viewcontent?type=doc&srcID=DM&docID=M491211F04837D11
Scalable architecture
P8 Content Manager achieves these performance rates with a scalable
architecture. Multiple servers can be added in load-balanced configurations to
handle increasing transaction loads (see Figure 1-1). This architecture makes
the P8 Content Manager repository an ideal candidate for large corporations,
government agencies, or any client with large information management
requirements.
[Figure 1-1. Recoverable labels: Load Balancer, Scaled Content Manager Servers, Repository]
P8 Content Manager features event action scripts that can be triggered when
objects are created, modified, or deleted in the repository. Event actions can
launch workflows or execute Java applications. Events, and the actions that they
trigger, are the mechanism that enables active content.
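The event mechanism described above can be pictured as a small publish and subscribe dispatcher. The following Python sketch is illustrative only and is not the Content Engine API: the class `ToyRepository`, its `subscribe` method, and the event names are hypothetical stand-ins, and the "workflow" is just a list that records which objects triggered it.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

# Event names taken from the text: objects are created, modified, or deleted.
EVENTS = ("create", "modify", "delete")

# An action receives the event name and the ID of the affected object.
Action = Callable[[str, str], None]


class ToyRepository:
    """Hypothetical in-memory repository that fires event actions."""

    def __init__(self) -> None:
        self.objects: Dict[str, str] = {}
        self._actions: DefaultDict[str, List[Action]] = defaultdict(list)

    def subscribe(self, event: str, action: Action) -> None:
        """Register an action to run whenever the given event occurs."""
        if event not in EVENTS:
            raise ValueError(f"unknown event: {event}")
        self._actions[event].append(action)

    def _fire(self, event: str, object_id: str) -> None:
        # Run every action subscribed to this event, in registration order.
        for action in self._actions[event]:
            action(event, object_id)

    def create(self, object_id: str, content: str) -> None:
        self.objects[object_id] = content
        self._fire("create", object_id)

    def modify(self, object_id: str, content: str) -> None:
        if object_id not in self.objects:
            raise KeyError(object_id)
        self.objects[object_id] = content
        self._fire("modify", object_id)

    def delete(self, object_id: str) -> None:
        del self.objects[object_id]
        self._fire("delete", object_id)


# Usage: "launch" a stand-in workflow whenever a document is created.
launched: List[str] = []
repo = ToyRepository()
repo.subscribe("create", lambda event, object_id: launched.append(object_id))
repo.create("claim-001", "fax image")
repo.modify("claim-001", "fax image with annotation")
print(launched)
```

Only the create call triggers the subscribed action here; the later modify call changes the stored content without launching anything, because no action was registered for that event.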
[Diagram labels: Create, In Review, Approval, Publish]
Figure 1-2 Active content example: Simple Document revision life cycle
In this example, the business process is the revision cycle, and the steps of the
process are Create, Revision (In Review), Approval, and Publish. The
document’s state corresponds to the step in the life cycle.
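One way to picture the relationship between document state and process step is a small state machine. The Python sketch below is a simplified illustration, not product code; it assumes the cycle is strictly linear in the order given in the text, and the names `State`, `NEXT_STATE`, and `advance` are hypothetical.

```python
from enum import Enum


class State(Enum):
    CREATE = "Create"
    IN_REVIEW = "In Review"
    APPROVAL = "Approval"
    PUBLISH = "Publish"


# Assumption: the cycle is strictly linear, in the order given in the text.
NEXT_STATE = {
    State.CREATE: State.IN_REVIEW,
    State.IN_REVIEW: State.APPROVAL,
    State.APPROVAL: State.PUBLISH,
}


def advance(state: State) -> State:
    """Return the state that follows `state`, or raise if it is final."""
    if state not in NEXT_STATE:
        raise ValueError(f"{state.value} is the final state")
    return NEXT_STATE[state]


# Walk one document through the whole cycle and record each state.
state = State.CREATE
history = [state.value]
while state in NEXT_STATE:
    state = advance(state)
    history.append(state.value)
print(history)
```

A real revision cycle usually also allows a document to be sent back from review or approval; that would be modeled by adding more entries to the transition table.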
Using P8 Content Manager and IBM FileNet Business Process Manager (BPM),
designers can build a workflow (Claim Process) that launches when insurance
claims arrive by fax.
[Figure: claim process workflow. Recoverable labels: BPM, Claim Record, Claim Adjustment, Claim Actions, Claim Approval]
The elements of active content — route control, document states, and event
action scripts — offer designers a powerful toolset for creating strong enterprise
content management systems.
There are many content products under the IBM FileNet P8 suite. It is beyond the
scope of this book to introduce all of them. However, to help you better
understand what these products can do for your corporation, we briefly introduce
several of them here.
Note: BPM comes with the three core engines (Content Engine, Process
Engine, and Application Engine) that IBM FileNet Content Manager provides.
The difference is that BPM extends the basic process capabilities of IBM
FileNet Content Manager and provides much more advanced features and
functions for implementing complex business processes.
minor versions.
[Figure: document versions in the repository. Recoverable labels: 0.1, 0.2, 1.0, Repository]
Solution description
This solution uses P8 Content Manager features without additional
programming. In the design, safety documents are stored in the repository where
they are available to all users (including factory workers) for reference.
Each safety document goes through a document life cycle with multiple states. In
this implementation, the states are minor and major versions. A minor version is
a draft document; a major version is a completed document that has been
approved and released. A security policy is implemented to define the security
that applies to documents in the major version state and those in the minor
version state. Minor versions can only be viewed and modified by authors or
managers. They are invisible to general users. All users can view major versions,
but only authors and managers can modify them.
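The access rules in the preceding paragraph reduce to two small predicates. The Python sketch below restates them for illustration only; it is not how a P8 security policy is configured, and the role names and function names are assumptions introduced here.

```python
# Roles named in the text; any other role is treated as a general user.
PRIVILEGED_ROLES = {"author", "manager"}


def can_view(role: str, is_major_version: bool) -> bool:
    """Everyone can view major versions; only privileged roles see drafts."""
    return is_major_version or role in PRIVILEGED_ROLES


def can_modify(role: str, is_major_version: bool) -> bool:
    """Only authors and managers can modify, whatever the version state."""
    return role in PRIVILEGED_ROLES


# Usage: print the resulting access matrix.
for role in ("author", "manager", "factory worker"):
    for is_major in (False, True):
        label = "major" if is_major else "minor"
        print(role, label, can_view(role, is_major), can_modify(role, is_major))
```

Note that `can_modify` ignores the version state on purpose: in the described policy, the state changes who can view a document, not who can change it.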
For simplification and to reflect the majority of actual solutions, this sample
solution does not include document retention. When implementing a document
revision solution for your environment, you must address your document
retention requirements and include them in your solution implementation as
necessary.
This P8 Content Manager solution is implemented with the following features and
components:
Fax Capture
Business Process Manager (for workflow process management)
Active content event actions
Annotations
Branching workflow steps
Auto processing step (step processor)
Notification through e-mail
PDF rendition
[Figure: insurance claim solution. Recoverable labels: Capture, Repository, BPM. Recoverable step callouts: 3. A workflow launches automatically. 8. Claim documents are sent to the client.]
Solution description
In this solution, the active content is the insurance claims. They move through a
business process in a series of steps.
IBM FileNet Fax Capture receives faxes, converts them to TIFF images, and
stores them in the repository along with metadata supplied by the insurance field
agent on the fax cover form. A claim workflow is launched automatically when
the fax image is added to the system (the add event). Annotation security, the
TIFF viewer, and other security measures control who can review, approve, and
deny the claim at each step. A custom interface (known as a custom step
processor) provides the interaction with the existing accounting systems and
e-mail servers. Finally, P8 Content Manager’s PDF rendition feature records the
process in a log for an audit trail and adds the PDF format to the repository.
The Health Insurance Portability and Accountability Act (HIPAA) requires that
insurance providers protect medical records from unauthorized release. This
solution uses P8 Content Manager security features to ensure that only the
patient, the patient’s doctors, and authorized case workers can access the
patient’s record.
This P8 Content Manager solution is implemented with the following features and
components:
IBM FileNet Fax Capture
IBM FileNet Records Crawler
P8 Content Manager security
P8 Content Manager server farm
Custom application (using P8 Content Manager APIs)
High performance search operation (load balancer)
Scalability
Figure 2-3 on page 19 illustrates the implemented call center support operation
solution using P8 Content Manager.
[Figure 2-3. Recoverable labels: Server-Farmed Repositories, Load-Balanced Application (Web) Servers, Input Files]
Solution description
This solution utilizes the following P8 Content Manager components and
capabilities: Records Crawler, server farms, and a custom application using P8
Content Manager APIs.
Server Farms
For applications with high volume loads, P8 Content Manager can be configured
as a server farm. A server farm employs multiple servers to multiply processing
power. In this solution, the document processing load is spread across three
separate P8 Content Manager servers. A load balancer distributes the incoming
load evenly so that even a very high ingestion rate does not overload a single
server.
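To make the idea of even distribution concrete, the following Python sketch routes requests to three servers in turn. It is a minimal illustration under stated assumptions: round-robin is only one possible balancing policy (the text does not say which policy the load balancer uses), and the server names and the `RoundRobinBalancer` class are hypothetical.

```python
from collections import Counter
from itertools import cycle
from typing import Iterable


class RoundRobinBalancer:
    """Hands each incoming request to the next server in a fixed rotation."""

    def __init__(self, servers: Iterable[str]) -> None:
        server_list = list(servers)
        if not server_list:
            raise ValueError("at least one server is required")
        self._rotation = cycle(server_list)

    def route(self, request_id: int) -> str:
        # The request ID is not inspected; only arrival order matters here.
        return next(self._rotation)


# Usage: nine ingestion requests spread over three hypothetical servers.
balancer = RoundRobinBalancer(["cm-server-1", "cm-server-2", "cm-server-3"])
assignments = [balancer.route(request_id) for request_id in range(9)]
per_server = Counter(assignments)
print(per_server)
```

With nine requests and three servers, each server receives three requests, so no single server absorbs the whole ingestion burst.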
This P8 Content Manager solution is implemented with the following features and
components:
IBM FileNet Email Manager with rule-based automation
IBM FileNet Records Manager
Figure 2-4 on page 21 illustrates the implemented e-mail capture for compliance
solution using P8 Content Manager.
[Figure 2-4. Recoverable labels: e-mail Server, Inbox, Records File Plan]
Solution description
This solution uses IBM FileNet Email Manager to monitor an Exchange Server
Journal (Lotus Notes and Novell GroupWise are also supported). The journal
contains a copy of all incoming and outgoing messages. IBM FileNet Email
Manager monitors the journal and searches for messages that meet a set of
conditions or rules. Common conditions include:
Messages that contain particular keywords
Messages to or from a particular set of addresses
Messages that pertain to compliance issues raised by the legal department
Messages that meet the set of conditions or rules are treated this way:
The message is captured and added to P8 Content Manager.
Duplicates of the message (if the message was sent to multiple recipients)
are identified. Only one copy is added to the repository.
The message is classified and declared as an official record subject to legal
retention rules.
In the user’s mailbox, the message is replaced by a stub. When the user
clicks the stub, the message is retrieved from the repository and displayed in
Outlook® as expected.
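The capture logic described above (match a message against rules, then store only one copy of duplicates) can be sketched in a few lines of Python. This is an illustration under assumptions, not the Email Manager rule engine: the keyword list, the watched address, the `Message` type, and the choice to treat copies as duplicates when sender, subject, and body are identical are all hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple

# Hypothetical rule data; real rules would come from the compliance team.
KEYWORDS = {"audit", "contract"}
WATCHED_ADDRESSES = {"legal@example.com"}


@dataclass(frozen=True)
class Message:
    sender: str
    recipients: Tuple[str, ...]
    subject: str
    body: str


def matches_rules(msg: Message) -> bool:
    """True if the message contains a keyword or involves a watched address."""
    text = f"{msg.subject} {msg.body}".lower()
    if any(keyword in text for keyword in KEYWORDS):
        return True
    return bool({msg.sender, *msg.recipients} & WATCHED_ADDRESSES)


def fingerprint(msg: Message) -> str:
    # Assumption: copies delivered to different recipients share the same
    # sender, subject, and body, so those fields identify a duplicate.
    data = "\x00".join((msg.sender, msg.subject, msg.body)).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def capture(journal: Iterable[Message]) -> Dict[str, Message]:
    """Return the captured messages, keeping one copy per fingerprint."""
    repository: Dict[str, Message] = {}
    for msg in journal:
        if matches_rules(msg):
            repository.setdefault(fingerprint(msg), msg)
    return repository


# Usage: two copies of one matching message plus one unrelated message.
journal = [
    Message("cfo@example.com", ("a@example.com",), "Audit plan", "See attached."),
    Message("cfo@example.com", ("b@example.com",), "Audit plan", "See attached."),
    Message("hr@example.com", ("a@example.com",), "Lunch", "Pizza on Friday."),
]
captured = capture(journal)
print(len(captured))
```

The two journal copies of the audit message collapse into a single stored entry, and the unrelated message is never captured because it meets none of the conditions.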
We outline the recommended design methodology that has been used by many
IBM Content Management Lab Services architects in the field and has been
applied successfully in many client situations.
The remaining chapters of this book address the concepts and recommendations
for each step in the methodology. When you design a P8 Content Manager
solution, we recommend following the chapters in the book and using our
suggestions and recommendations to meet your design challenges.
Typically, requirements gathering is an iterative process. You will start with the
functional design, revisit the requirements, complete the requirements analysis,
and then revisit the functional design. This process continues until you feel
confident that all known requirements have been identified and addressed.
[Figure 2-5. Recoverable labels: ECM; Content Ingestion; Content and Workflow Management; Presentation and Delivery Management; Applications; Browsing; Bind documents together; SMTP; Send]
Figure 2-5 A simple input and output diagram to assess functional requirements
As you read through each chapter of this book, remember that each chapter
provides many of the best practices for a number of scenarios but in a
generalized way. Use these best practices and recommendations within the
context of the actual functional requirements for your solution; do not apply them
as is.
The repository design typically is tightly linked to the functional design. It affects
and is affected by the security design. The repository design must be carefully
synchronized with the application and security design.
For more details, refer to Chapter 5, “Basic repository design” on page 85 and
Chapter 8, “Advanced repository design” on page 185.
2.2.7 Deployment
Deployment is defined as the methodology to move a designed solution from
development to production. When planning for deployment, issues related to
release management, change management, testing, and the steps for the actual
move need to be considered. It is important to plan for deployment as early as
possible, especially at development time, to address many of the challenges that
might arise in this area.
For more details, refer to Chapter 11, “System administration and maintenance”
on page 273.
For the remainder of this chapter, we use the general term IBM FileNet P8
system for a system that is based on P8 Content Manager.
Both Content Engine and Process Engine have their own databases. In addition,
the Global Configuration Database (GCD) stores global system configuration
information for all servers in the IBM FileNet P8 domain.
Figure 3-1 on page 30 shows the major components of the IBM FileNet P8
Platform.
Table 3-1 lists the typical subcomponents that are installed in the main
components of the IBM FileNet P8 Platform.
Figure 3-2 on page 32 shows a complete, basic IBM FileNet P8 system, including
the directory service.
[Figure 3-2. Recoverable labels: P8 Domain (System), Global Configuration Database, P8 Domain, AE Workplace, PE]
The oval around the server in Figure 3-2 marks the IBM FileNet P8 domain to
which all servers belong. In IBM FileNet P8 4.0, all configuration data is stored in
a central location in the Global Configuration Database (GCD). The GCD
contains server and system configuration data, such as information about the
engines that have been installed in the IBM FileNet P8 domain, the topology of
the system, and the locations for caches and storage areas. The GCD belongs to
the entire system.
3.2 Scalability
When planning for an enterprise-wide system, it is hard to predict future
workload. If establishment of the IBM FileNet P8 system is the first project, and
more content, process, and compliance-related projects follow, there is a good
chance that the system capacity has to be increased and capacity planning has
to be adjusted. It is therefore important to create a system that can easily scale
upwards to accommodate increased workload on the system.
The question to ask when discussing scalability is, “Should an enterprise use a
few large machines or multiple small ones?” The answer depends on the client’s
existing system infrastructure, preference, available resources, and business
requirements. To help answer this question, we introduce horizontal and vertical
scaling.
Farming
This system can be scaled out further by farming engines. In IBM FileNet P8 4.0,
Application Engine, Content Engine, and Process Engine can be farmed using
load balancing. This farming approach can generally be used for components
that do not store data. For databases, the approach is usually vertical scaling. An
exception is Oracle® Real Application Clusters (RAC), which also supports
farming.
Note: Because the terms cluster and farm are not used consistently in the
industry, in this book, we define the terms as:
Cluster: Multiple servers are connected by a heartbeat, access shared
storage in an active/passive way, and communicate to the outside world
with one IP address regardless of which server is active.
Farm: Multiple servers access a shared resource with each node active,
where the single servers are addressed via a hardware or software load
balancer.
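The difference between the two definitions can be shown with a toy model. In the Python sketch below, a cluster always answers from its single active node and fails over to a standby, while a farm rotates requests across all healthy nodes. The classes, node names, and the `fail` method (which stands in for a missed heartbeat or a failed health check) are hypothetical, and the rotation policy in the farm is an assumption.

```python
from typing import Iterable, List, Set


class Cluster:
    """Active/passive: one node serves; a standby takes over on failure."""

    def __init__(self, nodes: Iterable[str]) -> None:
        self.nodes: List[str] = list(nodes)
        self.healthy: Set[str] = set(self.nodes)

    def fail(self, node: str) -> None:
        # Stands in for a missed heartbeat.
        self.healthy.discard(node)

    def handle(self, request: str) -> str:
        # The first healthy node in priority order is the active one.
        for node in self.nodes:
            if node in self.healthy:
                return node
        raise RuntimeError("no healthy node left")


class Farm:
    """Active/active: a load balancer rotates over all healthy nodes."""

    def __init__(self, nodes: Iterable[str]) -> None:
        self.nodes: List[str] = list(nodes)
        self.healthy: Set[str] = set(self.nodes)
        self._counter = 0

    def fail(self, node: str) -> None:
        # Stands in for a failed load balancer health check.
        self.healthy.discard(node)

    def handle(self, request: str) -> str:
        live = [node for node in self.nodes if node in self.healthy]
        if not live:
            raise RuntimeError("no healthy node left")
        node = live[self._counter % len(live)]
        self._counter += 1
        return node


# Usage: the cluster sticks to one node until it fails.
cluster = Cluster(["db-a", "db-b"])
cluster_before = [cluster.handle("q1"), cluster.handle("q2")]
cluster.fail("db-a")
cluster_after = cluster.handle("q3")

# Usage: the farm uses every node from the start.
farm = Farm(["ce-1", "ce-2", "ce-3"])
farm_nodes = [farm.handle(f"r{i}") for i in range(3)]
print(cluster_before, cluster_after, farm_nodes)
```

The cluster adds availability but not capacity, because only one node works at a time; the farm adds both, which is why the text reserves farming for components that do not store data.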
Figure 3-4 on page 35 illustrates an additional level of scaling using farming and
load balance technology.
[Figure 3-4. Recoverable labels: Load Balancer (two instances), Database Server, Database Instance]
In this scenario, each server in the Application Engine farm can talk to the
Content Engine and Process Engine farm. With this scenario, we therefore
address both scalability and high availability requirements.
Scaling of the standard IT components, such as directory server, file system, and
database, is not done for this example.
Instead of just scaling up the server with additional hardware, you can use
multiple J2EE instances on a single physical server, and each application runs
independently in its own J2EE instance. By separating the applications, you
achieve more efficient use of system resources.
Figure 3-5 illustrates the extended vertical scalability option for Content Engine
and Application Engine.
[Figure 3-5. Recoverable labels: Directory Server (existing); Load Balancer; Application Engine JVMs; Content Engine JVMs (JVM 1 Port 9080, JVM 2 Port 9081, JVM 3 Port 9083); Process Engine; Database Server; Databases]
3.3 Virtualization
Virtualization has become a major trend in the IT industry. The drivers for
virtualization are cost reduction and providing better management of hardware
resources. Virtualization can be applied over servers, storage, and applications.
In this section, we focus on server virtualization.
The benefit of virtualization is better use of the current hardware, because the
number of physical boxes decreases, and a physical box becomes a virtual
machine. Instead of managing multiple systems, the resource optimization can
be concentrated at one point. It also opens new pathways for high availability and
disaster recovery, because you can copy entire systems to another location.
For example, if at the end of the month, usage of a certain virtualized application
increases sharply, it can be scaled on demand and assigned more system
resources. In that way, the system hardware is used more efficiently.
Another example is systems that are usually idle and have predictable peak
times. Given the fact that the peak times occur at different points in time, you can
benefit by moving applications from these systems onto one virtualized server.
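A short calculation shows why staggered peaks matter. The Python sketch below uses invented load figures (CPU cores needed per time slot for three hypothetical systems) to compare the capacity required when each system has its own server with the capacity required when all three share one virtualized server.

```python
# Hypothetical CPU demand, in cores, for four consecutive time slots.
# Each system is mostly idle and peaks in a different slot.
loads = {
    "reporting": [1, 1, 8, 1],
    "billing": [1, 6, 1, 1],
    "archive": [1, 1, 1, 7],
}

# Separate servers: each one must be sized for its own peak.
separate_capacity = sum(max(demand) for demand in loads.values())

# One virtualized server: size it for the peak of the combined demand.
combined_demand = [sum(slot) for slot in zip(*loads.values())]
consolidated_capacity = max(combined_demand)

print(separate_capacity)      # 8 + 6 + 7 = 21 cores
print(combined_demand)        # [3, 8, 10, 9]
print(consolidated_capacity)  # 10 cores
```

With these assumed figures, consolidation needs 10 cores instead of 21. The saving disappears if the peaks coincide, which is why the text stresses that the peak times must occur at different points in time.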
A third example is systems that are used for training and support. Because
virtualization technology provides the option to clone an existing system, you can
clone a training system with preloaded data from another system. In the area of
client support, environments with different operating systems, application
versions, and patch levels can be stored and started on demand. That increases
flexibility and speeds up problem diagnosis, because no time-consuming
installation tasks are necessary.
Within the IBM FileNet context, virtualization was first used for training systems.
Now, certain development and user acceptance systems are also using virtual
machines. In several instances, clients use virtualization in production systems.
When the VM wants to access resources that are managed in a system context,
the access is performed by a virtual machine monitor (VMM). The VMM analyzes
the code and provides a replacement function that safely accesses the
resources. Figure 3-6 illustrates virtual machines using VMM.
In certain implementations, the host operating system and VMM are combined
into a single layer. Examples of this approach are VMware products or Microsoft
Virtual Server.
In this scenario, the coupling between the host operating system and the VM is
much tighter. Because only one kernel is used, the overhead incurred with this
approach is very small. However, the disadvantage of virtualization at the
operating system level is that it does not allow you to run different operating
systems.
The isolation of the single partition is key, because the system operates in one
kernel. This is done in the partition management part of the operating system.
Resource management, which is where physical resources, such as CPUs and
memory, are assigned, is also done in the partition management part of the
operating system.
This level of virtualization is very popular for service providers who offer Internet
services or host special services. For this scenario, the low overhead and the
automation for replicating and horizontal scaling of virtual servers is key.
Figure 3-8 Deploying IBM FileNet P8 system with each engine in its own virtual machine
This architecture offers the highest flexibility and scalability because of the
number of virtual machines that you can have in the configuration. However, it
presents the highest complexity in regard to network configuration.
In general, you can use this architecture for a production system where
scalability is key.
In this scenario, the applications are separated from the data. All servers in VM1
and VM2 talk to the gateway server in VM3 which holds the connection to the
outside world.
When duplicating the three VMs, the IP address to the outside world has to be
adjusted on the new gateway server VM, and then the clients can access the
new system through a new IP address.
If you need another IBM FileNet P8 system, duplicate the three VMs, reconfigure
the host name and network settings on the gateway server VM and pass the new
URL to the users.
Figure 3-11 on page 43 shows the result after duplication of the system.
Note that the gateway VM is not an IBM FileNet P8 component, but it is used as
an abstraction layer to the clients. It translates the client IP addresses to the
internal VM IP addresses. All IBM FileNet P8 VM clones operate with the same
IP address but only talk to the translating gateway.
In this section, we discuss options for clients using a shared infrastructure model
and provide best practices with regard to the requirements.
The architectures presented are based upon the basic IBM FileNet P8 system
introduced in 3.1.2, “A basic IBM FileNet P8 system” on page 31, that is
colocated on one server.
The Content Engine manages object stores. Each object store uses zero to many
file stores.
Each project looks only at its own content and project data. The existing
hardware stays the same and is used with each new project. All engines share
the same operating system. Patches on the operating system and application
level only need to be installed one time.
The content data is secured by the object store security. Each project only sees
the process data of the connected isolated region. All projects share the same
database. Data separation is done by using different isolated regions. It is
possible to use another object store as a shared repository between the projects.
Depending on the number of processes and the duration of projects, you must
carefully examine the size of the Process Engine database.
There is no limit to the number of isolated regions that you can define in a
Process Engine database. However, memory considerations impose a limit on
the number of isolated regions running concurrently, because data specific to an
isolated region is loaded into physical memory on the server when that isolated
region runs (that is, when a logged-on user initiates workflow activity in the
region).
Recommendations
Use this approach when you want to segregate data because of independent
projects. Content data is physically separated; process data is stored in one
database but is separated by isolated regions.
This system architecture works for the independent projects that run in the same
environment. The projects can share the same infrastructure with a common
upgrade path and the same maintenance hours.
Although the projects share a common LDAP/ADS system, you can separate
security on the application level (for example, by implementing additional filters).
This medium data segregation approach can also be combined with the previous
shared system scenario where isolated regions are used.
In this scenario, the content data is secured by the object store security. Each
project can only see the process data of the connected isolated region. All
projects use different databases. Data separation is therefore higher compared
to the previous scenario in which they were separated by isolated regions only.
The scalability in this scenario is unlimited. You can establish a new project with
medium segregation by setting up a new Application Engine, Content Engine,
and Process Engine and using another object store and isolated region in an
existing system.
Recommendations
Use this approach for high volume content and process activity. You can use it
when data segregation needs to be at a database level.
This architecture works for projects that run in the same environment. They can
share the same infrastructure with a common upgrade path and the same
maintenance hours.
The content data is secured by the object store security. Each project can only
see the process data of the connected isolated region. All projects use different
databases. Data separation is on the database level just as in the medium data
segregation scenario.
With separate systems for high data segregation, data collaboration is limited
due to the different security across the systems. A cross-repository search has to
Each system can use a different LDAP/ADS security and reside in a separate
IBM FileNet P8 domain.
The scalability is unlimited. You can establish a new project with high
segregation by setting up a new system in a new domain.
Each system can have its own upgrade path, because there are no shared
components.
Recommendations
This scenario is for clients who need to separate security and block collaboration
among all of their systems (applications).
This architecture works for projects that do not share the same environment. The
projects can use different infrastructure, reside in multiple time zones, and have
different maintenance windows.
Can the different projects share content with each other? If so, we suggest one
domain. If not, we suggest looking into multiple domains.
If two projects that use different security structures must share data, use two
separate systems and implement some kind of data replication. Another way is to
put all users in a common directory service and secure the content via an access
control list.
One other consideration is the variety of the projects and operation hours. If all
clients are located in the same time zone or the system has a time window for
maintenance, a solution might be to use the medium segregation solution. If the
system has to manage projects with clients in different time zones without a
common maintenance window, you might need to either establish online backups
or set up separate systems.
Last but not least, the diversity of the projects can also be an indicator for the
architecture. Are these groups cooperative and will they accept the same time
Backup window                     Must be same time   Must be same time   Any time
Independent upgrade of projects   No                  No                  Yes
In the case study scenario, an application service provider hosts a system with
multiple applications and wants to offer different qualities of service at different
prices:
A mission critical application scaled through multiple Application Engine,
Content Engine, and Process Engine servers to guarantee performance and
availability
A less mission critical application scaled through one Application Engine,
Content Engine, and Process Engine server
In this case study scenario (see Figure 3-12 on page 52), the farmed IBM FileNet
P8 system consists of four Application Engines, three Content Engines, and
three Process Engines. A load balancer is used for each engine group to
represent the group as one virtual server. When configuring URLs in Workplace,
we configure the URL of the virtual server and let the load balancer distribute the
workload.
The idea for different qualities of service is to use different virtual servers on each
load balancer layer depending on the calling module.
In this case study, there are three projects. Project01 is a non-critical application
wherein the hosting service was sold to the tenant for a special rate. This is a
candidate for the low quality of service category. Project02 is medium-critical.
Project03 is a mission-critical application that must be able to scale and must not
have a single point of failure.
The projects are separately deployed and available under a URL that contains
the project name. The URL for Project03 is as follows:
http://proj03vae:9080/WorkplaceXT
We appended the host name with vae because the URL points to a virtual
Application Engine.
The Application Engine load balancer is available via Domain Name System
(DNS) under three IP addresses, each of which represents a virtual server
(proj01vae, proj02vae, and proj03vae).
If a user uses this URL, the virtual server on the Application Engine level for
project03 is used (which is proj03vae). The load balancer passes this request to
the physical servers, AE3 and AE4, and performs round-robin load balancing.
The Content Engine load balancer is available via DNS under three IP
addresses, each representing a virtual server (vlowce, vmediumce, and
vhighce).
In the case study scenario, the user who uses the following Web address
connects to the virtual Application Engine server proj03vae, which connects to
the virtual Content Engine server vhighce and the virtual Process Engine server
vhighpe:
http://proj03vae:9080/WorkplaceXT
Figure 3-12 on page 52 summarizes the idea and explains it for the Application
Engine and Content Engine level. As described earlier, this works the same for
Process Engine.
The triangle below the load balancers marks the servers that are pooled in a
virtual server.
Summary:
For project01, we use a low quality of service, using AE1, CE1, and PE1.
For project02, a medium quality of service is provided using AE2, CE1 + CE2,
and PE1 + PE2.
For project03, a high quality of service is provided using AE3 + AE4, CE1 +
CE2 + CE3, and PE1 + PE2 + PE3.
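The pooling in this summary can be sketched in a few lines of Python. This is a minimal illustration, not the configuration of any particular load balancer product. The virtual server names and the Application Engine pools come from the case study; the class name VirtualServerBalancer and the dictionary layout are assumptions made for the example.

```python
from itertools import cycle

# Application Engine pools as listed in the summary (layout is illustrative).
AE_POOLS = {
    "proj01vae": ["AE1"],
    "proj02vae": ["AE2"],
    "proj03vae": ["AE3", "AE4"],
}


class VirtualServerBalancer:
    """Round-robin dispatch over the physical servers pooled in a virtual server."""

    def __init__(self, pools):
        # One endless iterator per virtual server keeps the rotation state.
        self._rotation = {name: cycle(servers) for name, servers in pools.items()}

    def route(self, virtual_server):
        """Return the next physical server for a request to the virtual server."""
        return next(self._rotation[virtual_server])


balancer = VirtualServerBalancer(AE_POOLS)
project03_targets = [balancer.route("proj03vae") for _ in range(4)]
project01_target = balancer.route("proj01vae")
```

The same pattern applies to the Content Engine and Process Engine load balancer layers, each with its own set of virtual servers and pools.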
From our experience, the trend in Web application design is toward centralized,
highly available applications. However, under certain circumstances, it makes
sense to think about a distributed system. For example, a client who has multiple
geographical locations, each of which exclusively uses local resources, might
have different options in setting up the system. The client can use different
independent systems, one for each location. However, if all users are managed
in a central directory service, a better solution is one distributed system. In this
case, based on security, a multi-repository search is possible. Collaboration
between the locations is better, and enterprise-wide records and retention
management can be focused on one system.
Figure 3-13 on page 54 shows one IBM FileNet P8 system distributed at a domain
level over two locations: the main location and a satellite location.
A virtual server is the logical service point with which Content Engine clients
interact. A virtual server can map to a single independent server instance or to a
set of server instances. When a virtual server contains multiple server instances,
Figure 3-14 illustrates a hierarchical view of the domain, sites, virtual server, and
server as displayed in the IBM FileNet Enterprise Manager.
Figure 3-14 Hierarchical view of domain, sites, virtual servers, and server
The cache operates in write-through mode. This means that the cache is updated
with any content that is added or updated (written) into the system. At retrieval
time, Content Engine checks whether the document is already in the cache before
retrieving it from a file store. Documents remain in the cache until cleanup time.
Although you can assign a cache at a site, virtual server, or server level, we
recommend assigning a single cache at the site level. A cache can also be used
by more than one Content Engine server. Using custom programming, you can
preload (also known as prefetch) a cache during the night if the retrieved objects
can be predicted. A preloaded cache achieves optimal performance, because
content can be quickly retrieved.
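The write-through and prefetch behavior described above can be illustrated with a small Python sketch. It is a conceptual model only, not the Content Engine implementation: the class WriteThroughCache, its method names, and the use of dictionaries in place of a file store are all assumptions made for the example. The sketch also assumes that a cache miss loads the retrieved document into the cache, which is what makes prefetching effective.

```python
class WriteThroughCache:
    """Conceptual write-through content cache in front of a file store."""

    def __init__(self, file_store):
        self.file_store = file_store  # dict standing in for the file store
        self.cache = {}
        self.hits = 0
        self.misses = 0

    def write(self, doc_id, content):
        # Write-through: every write updates the file store and the cache.
        self.file_store[doc_id] = content
        self.cache[doc_id] = content

    def read(self, doc_id):
        # Check the cache first; fall back to the file store on a miss.
        if doc_id in self.cache:
            self.hits += 1
            return self.cache[doc_id]
        self.misses += 1
        content = self.file_store[doc_id]
        self.cache[doc_id] = content
        return content

    def prefetch(self, doc_ids):
        # Preload predictable documents, for example during the night.
        for doc_id in doc_ids:
            self.read(doc_id)

    def cleanup(self):
        # Documents stay cached until cleanup time.
        self.cache.clear()


file_store = {"doc-1": b"scanned invoice"}
cache = WriteThroughCache(file_store)
cache.write("doc-2", b"contract")   # lands in the file store and the cache
cache.prefetch(["doc-1"])           # first access is a miss that fills the cache
content = cache.read("doc-1")       # served from the cache
```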
Recommendations
The performance tests that have been done in the IBM FileNet lab provide good
assistance in helping you to decide which configuration is best suited for your
requirements.
Note: At the time of this writing, we used the white paper IBM FileNet P8 3.0.0
WAN Performance as reference. Currently, further tests of IBM FileNet P8 are
in progress. Check for the latest documentation.
In the IBM FileNet P8 3.0.0 WAN Performance white paper, different distributed
architectures are tested and response times for each architecture are
documented. The general findings are:
Having all the systems at the same site, or having only the directory service
remote from the other components, results in the best performance. Note,
Figure 3-15 on page 58 shows a system distributed over two locations, main and
satellite locations. Request forwarding is disabled.
Figure 3-15 System distributed over main and satellite locations, request forwarding disabled (multiple round-trips over WAN)
When Content Engine (main) talks to the database (sat) and searches for
metadata, this can require a number of queries, and therefore network
round-trips occur to complete the request. If the WAN link between the sites has
high latency, delayed response times are the consequence.
(Figure: main location and satellite location connected over a WAN; multiple round-trips over LAN)
When enabling request forwarding, you declare that each defined object store
has affinity with a specific site.
Again, the client (main) addresses Application Engine (main), which contacts
Content Engine (main). Instead of directly contacting the database (sat), Content
Engine (main) forwards the request to Content Engine (sat), which contacts the
database (sat). Content Engine (sat) gathers all data and returns it to Content
Engine (main). Again, Content Engine (main) passes the result back to
Application Engine (main) where it is presented to the client.
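The effect of request forwarding on response time can be estimated with simple arithmetic. Without forwarding, each of the n database queries costs one WAN round-trip. With forwarding, the request crosses the WAN once and the queries run over the LAN at the satellite site. The following sketch expresses that estimate in Python. The function name and all numbers are hypothetical, and the model deliberately ignores processing time, bandwidth, and payload size.

```python
def estimated_latency_ms(queries, wan_rtt_ms, lan_rtt_ms, forwarding):
    """Rough network latency for one metadata request against a remote database."""
    if forwarding:
        # One WAN round-trip for the forwarded request; the database
        # queries then run over the LAN at the satellite location.
        return wan_rtt_ms + queries * lan_rtt_ms
    # Every database query crosses the WAN.
    return queries * wan_rtt_ms


# Hypothetical values: 20 queries, 80 ms WAN round-trip, 1 ms LAN round-trip.
without_forwarding = estimated_latency_ms(20, 80, 1, forwarding=False)
with_forwarding = estimated_latency_ms(20, 80, 1, forwarding=True)
```

With these assumed values, the estimate is 1600 ms without forwarding and 100 ms with forwarding, which shows why the number of round-trips matters more than raw bandwidth on a high-latency link.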
At the time that a Content Engine server receives a request, it evaluates the
request to decide whether to forward it or not. For metadata requests, if all
actions in the client request are based on an object store at a different site,
Content Engine will attempt to forward it. At the destination site, the administrator
enables one or more virtual servers to be able to receive the incoming requests.
In our example, Content Engine (main) has to attempt forwarding, because it is
possible to temporarily disable acceptance of incoming forwarded requests at
Content Engine (sat), for example, for maintenance reasons.
The criteria for whether a request is forwarded are whether the majority of the
actions address one object store at a different site and whether there are any
requests for the current site. A forwarded request is not forwarded again. Request
forwarding works across the Enterprise JavaBeans™ (EJB™) transport layer only
and is supported only across homogeneous application servers.
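One possible reading of these criteria is sketched below in Python. It is an interpretation for illustration, not the actual Content Engine logic: the Action type, the function forwarding_target, and the exact rule (no action targets the current site, and more than half of the actions address a single object store at another site) are assumptions.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Action:
    object_store: str
    site: str  # site with which the object store has affinity


def forwarding_target(actions, current_site, already_forwarded=False):
    """Return the site to forward the request to, or None to handle it locally."""
    # A forwarded request is never forwarded again.
    if already_forwarded or not actions:
        return None
    # Assumption: any action for the current site keeps the request local.
    if any(action.site == current_site for action in actions):
        return None
    # Find the object store addressed by most actions.
    store, count = Counter(a.object_store for a in actions).most_common(1)[0]
    if count * 2 <= len(actions):
        return None  # no single object store holds the majority
    return next(a.site for a in actions if a.object_store == store)


remote = [Action("OS_SAT", "sat"), Action("OS_SAT", "sat"), Action("OS_B", "site_b")]
mixed = [Action("OS_SAT", "sat"), Action("OS_MAIN", "main")]
```

In this sketch, the request in `remote` is forwarded to the satellite site, whereas the request in `mixed` is handled locally because one of its actions targets the current site.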
The main location contains a full IBM FileNet P8 system (Application Engine,
Content Engine, Process Engine, an object store, file store, database, and
Directory Service). You can set up the following options at the satellite location:
No IBM FileNet P8 components are deployed at the satellite location. Only
third party solutions are deployed.
The easiest way to enable the users at the satellite location to use the system
is to provide them with the URL of the Workplace application (or a custom
application) at the main location. You can choose this approach if the satellite
location has an infrastructure similar to that of the main location, with high
bandwidth and low latency.
An alternate approach is the use of third-party software, such as Microsoft
Terminal Server or Citrix, in which the application runs at the main location
and only the window content is transferred. This is a solution for clients
who have already deployed this technology.
In the following paragraphs, we describe typical client scenarios and the possible
architectural solution.
If the purpose of this scenario is basic capture, one solution is to use IBM FileNet
Capture Content Engine clients. You do not need additional server components.
Instead, you configure a shared Content Engine repository using a local SQL
database and then commit the documents at an appropriate time to the Content
Engine at the main location.
If another ingestion method is used and the data is temporarily stored at the
satellite location, one approach is to install a Content Engine with a local file
store and a database. You can create custom code to move the content to the main
location at an appropriate time, for example, during off-peak hours. Alternatively,
and this is what we generally recommend, use batched ingestion across the WAN to
the main site.
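The scheduling side of such custom code can be sketched as follows. This is a minimal Python illustration that covers only the off-peak check and the batching; the actual transfer to the Content Engine at the main location is not shown. The function names, the off-peak window of 22:00 to 05:00, and the batch size are hypothetical.

```python
from datetime import time


def is_off_peak(now, start=time(22, 0), end=time(5, 0)):
    """Return True if `now` falls in the off-peak window (may wrap past midnight)."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def batches(items, size):
    """Yield consecutive batches of at most `size` items."""
    for index in range(0, len(items), size):
        yield items[index:index + size]


pending_documents = ["doc-1", "doc-2", "doc-3", "doc-4", "doc-5"]
planned_batches = list(batches(pending_documents, 2))
```

A scheduler would call `is_off_peak` before starting a transfer and then send one batch at a time across the WAN.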
Figure 3-17 One decentralized system with different scaling per location
The marketing team uses a System Capacity Planning Tool called Scout to
model transactions and to obtain answers to questions, such as:
Based on the projected use of the IBM FileNet P8 system, what servers are
needed?
Given a certain hardware configuration, how busy will the servers be?
After modeling a workload, Scout produces utilization reports that show the
demand placed upon a given set of hardware by that workload.
Figure 4-1 illustrates the basic modeling process for capacity planning.
Scout uses at least two input sources. One is the hardware configuration, and
the other is the defined workload that consists of one or multiple transactions.
The output from Scout consists of performance charts. If the system utilization of
all components is below a threshold, the system is deemed adequate to meet the
When defining a workload in a presales situation, the details of a model might not
be obvious. Therefore, it might be easiest to develop your general model first and
refine it as you learn more details.
You might want to start with a moderate hardware configuration. When defining
your workload, you can immediately see the result of each added transaction in
the chart and scale the hardware with the transactions. This gives you a better
feel for the cost per modeled transaction. In addition, there is a chart option to
view utilization by transaction function to get the explicit cost per modeled
transaction function.
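The idea of adding transactions one at a time and comparing the resulting utilization with a threshold can be illustrated with a small Python sketch. It is not Scout's model: the function names, the cost units, the capacity figures, and the 70% threshold are hypothetical values chosen only to show the arithmetic.

```python
THRESHOLD = 0.70  # hypothetical utilization threshold


def utilization(transactions, capacity_per_hour):
    """Fraction of hourly capacity consumed by the modeled workload.

    transactions: list of (name, count_per_hour, cost_units_each).
    """
    demand = sum(count * cost for _, count, cost in transactions)
    return demand / capacity_per_hour


def cost_share(transactions):
    """Share of the total demand contributed by each transaction."""
    total = sum(count * cost for _, count, cost in transactions)
    return {name: count * cost / total for name, count, cost in transactions}


workload = [
    ("logon", 200, 1.0),
    ("create document", 500, 4.0),
    ("retrieve document", 1000, 1.5),
]

moderate = utilization(workload, capacity_per_hour=5000)   # 3700 / 5000
scaled = utilization(workload, capacity_per_hour=10000)    # 3700 / 10000
needs_more_hardware = moderate > THRESHOLD
```

With these assumed values, the moderate configuration runs at 74% and exceeds the threshold, so the hardware is scaled and the model is evaluated again. The `cost_share` helper corresponds to viewing utilization by transaction.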
When modeling the workload, Scout provides a walk-through wizard for a quick
start that helps you to configure the basic parameters of the components that you
want to size. We found it useful to use the wizard and save the result to another
file. The wizard helps you learn which transaction functions to add to your
workload, but it creates a simplified model. Some of the less-used functions
can be obtained only by manually adding them to your workload from
the Transaction Templates in the tree view.
Client environment
The following list provides questions to ask during sizing that are related to the
client environment:
Does the client prefer specific hardware? If yes, which vendor?
Are there standard machine types that the client wants to use? If yes, what is
the standard server, which processor, and how many CPUs?
What application server will be used?
What database server will be used?
What are the default working hours? You can override this default value in
each transaction if needed.
Content ingestion
The following list provides questions to ask during sizing that are related to
content ingestion:
If content is ingested through scanning:
– What are the scanning hours?
User activities
After the content is ingested, corresponding actions are started. The content can
be processed by Business Process Management or simply stored and used for
retrieval later. A user can work on the content using a custom application or
Workplace. How the user uses the content might determine the sizing of the
system. General questions related to user activities are:
For logon and logoff activities:
– How many times does a user generally log on and log off per day or per
week?
– Are there peak hours of logon and logoff activities during the day or during
the week?
– Are there different logon and logoff behaviors for different users (for
example, are there different behaviors for power users compared to
occasional users)?
Note: After documents are checked out, they usually are viewed. This
viewing is modeled as an additional retrieval.
You can see the Content Engine load throughout the day. In the morning hours
between 8:30 a.m. and 11:30 a.m., the system load is higher due to scanning
activities. From 11:30 a.m. to 4:30 p.m., the activity level is lower, because only
retrieval and processing activities occur. Between 3 a.m. and 4 a.m., prefetching
takes place. Documents that are needed for the next day are retrieved and
loaded into the cache for better performance.
The first step is to collect baseline data for the involved systems. For the Content
Engine baseline, you use the System Manager Dashboard. A dashboard is a tool
for gathering performance data and provides current Content Engine utilization
data. (For more information about dashboards, refer to 11.2.2, “Dashboard” on
For modeling purposes, we import the current Content Engine utilization with a
factor of two, import the Image Services utilization, and add an application that
accounts for an increased workload of 20%.
Figure 4-4 on page 74 shows the utilization for the Image Services system.
The chart shows the workload summary after importing the three workload
profiles: one for Content Engine, one for Image Services, and one for the
additional third-party application. The various colors represent single services
that run simultaneously. The chart illustrates the imported workload together with
the new application workload. The result is that with the additional application,
the Image Services server exceeds its threshold at 7:30 a.m. It needs to be
scaled up with two additional CPUs.
Figure 4-5 Content Engine under heavy load (utilization is more than 90%)
As shown in Figure 4-6, the IBM FileNet P8 4.x Java Create Documents
transaction creates the most intense workload. When verifying the model
against the system in this example, we find a typographical error in the
number of input documents and correct it.
With the correction made, we see in Figure 4-7 on page 78 that the system
operates well under the threshold.
Figure 4-8 on page 79 shows the spreadsheet that contains the input system
values and the output, which is the estimated disk space required for Content
Engine and Process Engine.
Each section in this chapter explains the design-specific features of that element.
In addition, the architectural framework as a whole provides a number of
features that support the overall design goals. The elements of solution space
decomposition directly support scalability and encapsulate these concepts in a
manner that is easy to reconcile with the physical topology of the infrastructure.
The levels at which any given element is controlled, as well as rolling this
administration up through a single tool, both enhance the power and simplify the
task of continued administration and control of these elements throughout the life
span of the solution. Security features are present at almost every single level
and every single manifestation of the repository elements and, in many
instances, in multiple ways. These features provide a variety of security
granularities, from very broad to very specific and individualized. See
Chapter 6, “Security” on page 131 for the details.
Relationship or composition
Encapsulation of the relationships of objects is based on an understanding of
how entities relate to each other. This relational association can be grouped into
a hierarchical relationship, where one entity is said to inherit the characteristics of
a parent, in an associative manner where diversely characterized entities can be
related in an ordinal manner, or in an aggregate association, where collections
and groups of entities can be handled as a unit. Decomposing based on
composition allows the associations and relationships of entities, regardless of
type, to be carried across the design.
Examining how content is utilized and accessed can reveal patterns that show
relationships that are not formally captured in metadata but exist nonetheless. A
spreadsheet listing appraisers in a specific geography is usually accessed along
with claims. Customer service representatives typically look at all documents
relating to a specific user or geography. This perspective focuses on how people
access and utilize content to complete their tasks.
Behavior or function
Encapsulation of behavior is grouping entities based on the behavior that they
exhibit, the life cycles through which they go, or functionality that they provide.
Decomposing based on function allows simple changes to be made that can alter
the behavior of a wide set of entities.
The business processes that utilize content, the document life cycles, and
workflows all give a perspective that is based on the functionality of the documents.
The active content perspective allows the grouping of content based on what it
does. Content can be combined with various other content to create new content,
such as report generation. The grouping of content in this manner is best
understood in relationship to active content or as it is considered with Business
Process Management (BPM).
Designing the repository from the bottom up is analyzing the existing content and
processes in use by the organization and synthesizing the abstract entities from
this information. Repeated applications of grouping the resultant entities based
on a specific set of characteristics from the four basic types and then
synthesizing the next layer up by abstracting these groupings yields the resultant
design. Each level of organization of entities allows a different facet of design
detail characteristics to be focused on and separated out from the others.
The bottom-up approach has the advantage of your being able to work with
existing, well understood content and with workers who have expert knowledge
of that content. The abstract characteristics from the four basic
types are well understood and usually easy to determine. Often as the design
grows from the bottom up, it becomes more difficult, especially for the knowledge
workers, to abstract further away from the concrete details with which they are
used to working.
The bottom-up approach has the disadvantage of taking all of the implicit
knowledge about how the existing problem space is approached, including any
and all artificial constructs that were utilized for historic or other reasons that are
contrary to a good design. It is frequently very difficult to overcome these
Designing from the top down involves understanding the global picture and
decomposing the various levels of the design through either clear design goals or
specific design choices. It is also an iterative process, which in this case drives
from the most abstract down toward the concrete levels. By designing from the
top down, the specific order of design characteristics can be approached in the
manner that makes the most strategic sense for the organization.
The top-down approach has the advantage of developing a design that does not
include any artificial barriers based on constructs, such as organization as
opposed to function, and producing a design that emphasizes the strategic
requirements of the solution. This often results in the most flexible and adaptable
design moving forward.
Design team
The design team itself can consist of one or more architects with the specific
responsibility of producing the design. Regardless of the number of individuals in
the design team, there is a clear set of roles and responsibilities that must be
represented. These roles cover both the technical facets of the design as well as
the business facets. The team is usually led by a technical architect who has the
direct responsibility for the content solution. The team either includes
architects and representatives from the following areas or has contacts in
those areas who can provide feedback and direction as needed
without being full-time team members:
P8 Content Manager architect technical role
This is the architect who has the ultimate responsibility for the overall
repository and solution design itself. This role must always be assumed by a
full-time member of the design staff who has expert level knowledge of the P8
Content Manager product itself.
Enterprise architect technical role
This is the architect who is responsible for overseeing the technical fit of the
solution into the existing solution portfolio. This role must always be assumed
by someone who has an expert level of understanding of the current
technology across the enterprise.
Application architect technical role
This is the architect who has direct responsibility for the specific application
or applications being addressed at this phase of the design and who is
responsible for tracing the business requirements into the solution space.
Enterprise security technical role
This is someone who has expert level understanding of the security
environments and models that are utilized in the enterprise infrastructure. The
purpose of this role is to assure that all existing security policies are adhered
to and to provide support as needed for security requirements outside of the
P8 Content Manager solution itself.
Technical support roles
There must be experts in server administration, database administration,
storage administration, network administration, and directory service
administration either represented on the team or available to the team. The
purpose of these roles is to assure that the various infrastructure elements
Interviewing process
In most cases, it is impossible to staff the design team with all of the requisite
expertise that is required. Even when experts from these areas are direct team
members or are directly available to full-time team members, often specific
individuals and groups have pieces of knowledge that are required. The iterative
process of gathering these pieces of knowledge is done through a series of
interviews. Each interview needs to start with a clear set of questions to be
answered and involves the complete design team as well as all individuals who
can contribute to obtaining the answers. Conducting interviews in a group format
allows for the potential of addressing additional questions and issues that are
uncovered during the process and greatly reduces the time and effort required by
all to obtain the body of information that is required from any particular area.
There are a number of standard references to different labels and points that
must be considered in every case. These are presented followed by the call-out
of several naming constructs for specific objects that have been shown to be
useful.
Where there are custom interfaces of any kind between the system and the user,
the use of this label is optional. Wherever the object is accessed through the
standard interfaces and tools, such as FileNet Enterprise Manager and
Workplace, this field is used as the object name.
An example of display names is prompting the users with User Name on a panel
for the users to enter their names.
5.3.3 Uniqueness
Object names across the entire design generally have a requirement for
uniqueness. Unique naming tracks with appropriate naming, that is, when proper
consideration is given to naming objects, the uniqueness typically follows.
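A uniqueness requirement of this kind can be checked mechanically before any object is created. The following is a minimal sketch, not a P8 Content Manager facility: the function name `find_duplicate_names` and the choice to compare names case-insensitively after trimming whitespace are assumptions made for illustration.

```python
from collections import Counter


def find_duplicate_names(names):
    """Return the normalized names that occur more than once.

    Assumption: names are compared case-insensitively after trimming
    surrounding whitespace; a real design might use stricter rules.
    """
    counts = Counter(name.strip().lower() for name in names)
    return sorted(key for key, count in counts.items() if count > 1)


# Hypothetical list of proposed object names for a design review.
proposed = ["XYZ Enterprise", "ZYX Operations", "ZYX Support", "zyx support "]
print(find_duplicate_names(proposed))
```

Running a check like this over the full list of proposed names during design review surfaces collisions early, when renaming is still inexpensive.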
5.3.4 Taxonomy
Taxonomy is the establishment of categorization based on naming. Having a
specific pattern that is applied to names with definitions for each name part that
are well understood facilitates an organized taxonomy. Giving initial thought to
taxonomy and developing a taxonomy prior to the actual naming simplifies the
naming task and accents the self-descriptiveness of the name given.
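One way to make a naming pattern with well-defined parts concrete is to express it as a regular expression and validate proposed names against it. The sketch below assumes a purely hypothetical three-part pattern (`<scope>-<kind>-<subject>`); the part names, the allowed kind codes, and the function `parse_name` are illustrative and are not prescribed by P8 Content Manager.

```python
import re

# Hypothetical pattern: <scope>-<kind>-<subject>, for example "HR-DOC-Contract".
#   scope   : 2 to 4 uppercase letters identifying the business area
#   kind    : the type of object being named (assumed codes)
#   subject : a capitalized, self-descriptive subject
NAME_PATTERN = re.compile(
    r"(?P<scope>[A-Z]{2,4})-(?P<kind>DOC|FLD|PROP)-(?P<subject>[A-Z][A-Za-z0-9]*)"
)


def parse_name(name):
    """Split a name into its taxonomy parts, or return None if it does not conform."""
    match = NAME_PATTERN.fullmatch(name)
    return match.groupdict() if match else None


print(parse_name("HR-DOC-Contract"))
print(parse_name("contract"))
```

Because every part has a defined meaning, a conforming name categorizes the object by itself, and a nonconforming name is caught before it enters the design.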
5.3.5 Consistency
Consistency is important because, as the base of people who use the
names broadens, it leads to better understanding and less confusion as the
system moves forward in scope and in age. Establishing consistency standards
well is beneficial in the long run. Consistency is facilitated by the complete
application of the ideas that are already presented.
Object stores
Object stores are the highest point of naming for a given repository as well as the
first level of decomposition for the solution space. Make sure that you indicate
the part of the solution that an object store represents when you name it.
For example, company XYZ with a single object store can name its object store
XYZ Enterprise. Another company ZYX has two object stores and it can name
the two object stores, ZYX Operations and ZYX Support. The object stores
represent repositories for all content pertaining directly to the business of ZYX:
Storage areas
Storage areas are where the content is saved; there are various types of storage
areas, including file system, cached content, and fixed content. Each type can
represent a number of varieties, each with specific characteristics. Naming the
storage areas in a manner that encapsulates the type and characteristics of the
storage area is useful, because the storage areas are accessed and applied
throughout the lifetime of the system.
For example, Company XYZ has three storage areas in use for the Company
XYZ repository. The first storage area is a file store hosted on the network
accessible protected storage segment of a storage area network (SAN) by a
Network File System (NFS) mount. The second storage area is a fixed storage
area that links to the company’s image management system. The third storage
area is a file store on a local Just a Bunch of Disks (JBOD) device, also through
an NFS mount. These three storage areas are named NFS-RAID, IMAGES, and
NFS-CHEAP.
Property templates
Property templates need special consideration, because a given property
template can be used widely across many different objects. The
names chosen for the property templates need to describe both the
characteristics of the property template and the intended use of the
template.
Choice lists
Choice lists are similar to property templates, but they are used to limit the
entries that the user will fill in for a property template. A choice list is associated
For example, the four choice lists in company XYZ are States, Geos, ClaimType,
and Month.
[Figure: repository diagram with the labels Repository, Object Store, Database Content, and File Content]
There are four major stages involved in the population of a repository: three
design stages and one production stage. The three design stages include
organizational design as described in 5.5, “Repository organizational objects” on
page 101, repository design as described in 5.6, “Repository design objects” on
page 106, and repository content design as described in 5.7, “Repository content
objects” on page 128. The final stage in repository population is the actual
production, or test, usage of the repository. The following sections describe the
design stages and their relationships.
During all of these design phases, there are certain commonalities that are
universally, or nearly universally, utilized in the objects of the design.
Here, we list several of the system properties that have potential application in
other places of the design:
Class description
The class description contains the immutable description of the class from
which this object is instantiated.
Display name
This immutable label is intended for display to the user for prompting for the
entry of the value of this object.
Descriptive text
This immutable text describes the purpose and meaning intended for this
object.
Is hidden
This is a Boolean value that indicates if the object is hidden in its current
context. This property affects the user interface and is exposed to external
systems.
Symbolic name
This immutable label is used for internal, programmatic references to the
object.
ID
This immutable global unique identifier (GUID) can be used to reference this
specific object throughout its lifetime. In this context, global is only across
the IBM FileNet P8 domain, because there might be other objects in other
domains with the same GUID.
Is content-based retrieval (CBR)-enabled
This is a Boolean value that indicates if content-based retrieval is enabled in
the current context of the object.
Modifying and removing design elements can be a tricky procedure given the
complex relationships that are possible. This is especially noticeable when
attempting to remove a design element that might be utilized or referenced from
a number of other design elements at differing levels of the design. It is always
best to be as thorough as possible in the system design prior to actually creating
the elements in the P8 Content Manager, because this avoids most of these
difficult situations.
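The difficulty of removing a referenced design element can be illustrated with a small dependency check. This is a minimal sketch over a hand-written reference map; the element names and the functions `referrers` and `can_remove` are hypothetical and do not correspond to any P8 Content Manager API.

```python
def referrers(element, references):
    """Return the design elements that directly reference `element`."""
    return sorted(
        source for source, targets in references.items() if element in targets
    )


def can_remove(element, references):
    """An element is safe to remove only when nothing else references it."""
    return not referrers(element, references)


# Hypothetical design: each key references the elements in its value set.
references = {
    "ClaimDocument": {"ClaimType", "States"},
    "PolicyDocument": {"States"},
    "ClaimType": set(),
    "States": set(),
    "Month": set(),
}

print(referrers("States", references))
print(can_remove("Month", references))
```

Keeping a map like this in the design documentation makes it clear which elements must be changed first before another element can be modified or removed.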
P8 Content Manager has a number of wizards that assist in the creation and
modification of the various design elements.
100 IBM FileNet Content Manager Implementation Best Practices and Recommendations
5.5 Repository organizational objects
The solution space is divided into a number of logical divisions. Each division
serves a specific purpose. The composition of all of these divisions provides a
powerful solution that allows the requirements of any implementation to be
clearly and succinctly decomposed.
Figure 5-3 shows the logical relationships among the decomposition elements,
domain, sites, virtual servers, and server instances.
[Figure 5-3: a domain decomposed into sites, virtual servers, and server instances]
All of the logical elements composing a domain are administered and managed
through IBM FileNet Enterprise Manager.
In this example, we have two central offices where 80% of all of the computing
resources and data reside, along with a number of satellite offices, each with its
own resources and a need to interact with the centralized services as well. Each
office has a highly reliable local area network (LAN) connecting all of the
resources in each office. There is a dual-redundant connection between the
central offices that provides both high reliability and high performance across
their communications. There is a lower speed wide area network (WAN)
connecting all of the satellite offices with the two central offices.
5.5.1 Domains
The highest level of purview for a given P8 Content Manager implementation is
the domain. There is a single domain in this specific P8 Content Manager
implementation. You can choose to have more than a single domain where you
want to totally isolate environments. Do this to support development in its own
logically separate location or for other reasons (see Chapter 8, “Advanced
repository design” on page 185 for complete guidance about multiple domains).
The domain encapsulates all of the logical resources for the implementation, as
well as all of the logical services that provide access to those resources. The
domain defines the absolute boundaries that none of the logical resources or
logical services can cross. The heart of a domain is embodied in the Global
Configuration Database (GCD), which is a database that encapsulates all of the
hierarchy of logical elements that provide access to the resources in the domain.
Each Content Engine that is installed is bound to a specific domain, and it is
within this domain that all of the repository elements are defined.
The domain is, in most cases, analogous to the enterprise. If there are valid
business and technical reasons that the domain is a division of the enterprise, it
is perfectly reasonable to make it so.
In our example, all of the resources shown in Figure 5-4 on page 102 are
included in a single domain, because there are requirements for sharing these
resources across the entire enterprise.
5.5.2 Sites
Sites are encapsulations of geographically colocated physical elements. These
elements are interconnected across fast local area network (LAN)
connections. Interconnection between sites is assumed to be across the wide
area network (WAN), with slower connections and lower bandwidth.
Site decomposition must always be done with direct consideration for the
interconnections. Any site must only contain elements that are interconnected
through high performance, high bandwidth, and highly reliable network
connections. There is typically no functional reason to decompose any
geographic location that is connected through a single LAN into multiple sites;
however, this might be warranted in specific cases.
Object stores, storage areas, and virtual servers are all associated with a specific
site. P8 Content Manager does not limit the number of object stores,
storage areas, or virtual servers that a site can contain.
Sites are created through FileNet Enterprise Manager’s site wizard, which can be
accessed by right-clicking on the top-level folder labeled sites. The only
requirement is to give the site a distinct name and a meaningful description.
When declaring virtual servers into sets that utilize load balancing through
software or hardware techniques, consider the grouping of the individual server
instances that they contain. These virtual server groupings provide both
performance and availability scaling for the clients that will utilize them as their
access points.
P8 Content Manager does not limit the number of server instances that a virtual
server can have.
Virtual server objects are created dynamically during system initialization and
startup based on the configured topology of the application server or via specific
system properties.
In our example from Figure 5-4 on page 102, each satellite office only contains
virtual servers that contain a single server instance, because they have no
requirement for providing high-performance access to their resources for internal
use. The two central offices each contain multiple virtual servers that utilize load
balancing across multiple server instances. This allows the central offices to
support not only internal resource access but also the frequent accesses
that will come from the satellite offices.
5.5.4 Server instances
A server instance represents a single Java 2 Platform, Enterprise Edition
(J2EE) application server instance. It does not necessarily equate to a single
physical server, because any physical server can contain any number
of J2EE application servers, each running in its own individual Java virtual
machine (JVM) space, or even in its own logical server division on a specific
physical server. The server instances are where the individual compute platforms
of the Content Engine are actually deployed. A server instance is associated with a
specific virtual server, which is the client entry point for that set of server instances.
A server instance contains exactly one J2EE application server instance, and
P8 Content Manager imposes no limit on the number of server
instances in a domain. In our example in Figure 5-4 on page 102, every
server where the Content Engine has been installed must be a server instance.
Figure 5-5 on page 106 gives a visual representation of the GCD layout. The
GCD is set up at install time and is then managed through IBM FileNet
Enterprise Manager.
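[Figure 5-5: GCD layout. The domain holds the object stores (ObjectStore1 and ObjectStore2, each with DS/DSXA data sources), the subsystem configuration, the trace logging configuration, and the sites. Site1 contains VirtualServer 1 (ServerInstance 1 and ServerInstance 2) and VirtualServer 2 (ServerInstance 3), followed by Site2 …]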
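The containment hierarchy of domain, sites, virtual servers, and server instances can be sketched as plain data classes. This is an illustrative model only, not the GCD schema or a FileNet API: the class names, the `load_balanced` property, and the nesting of the sample data (read from the labels of Figure 5-5) are assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ServerInstance:
    name: str  # one J2EE application server instance


@dataclass
class VirtualServer:
    name: str
    instances: List[ServerInstance] = field(default_factory=list)

    @property
    def load_balanced(self) -> bool:
        # Assumption: more than one instance implies load balancing.
        return len(self.instances) > 1


@dataclass
class Site:
    name: str
    virtual_servers: List[VirtualServer] = field(default_factory=list)


@dataclass
class Domain:
    name: str
    sites: List[Site] = field(default_factory=list)

    def server_instance_count(self) -> int:
        """Count every server instance in every site of the domain."""
        return sum(
            len(vs.instances) for site in self.sites for vs in site.virtual_servers
        )


domain = Domain("Domain", [
    Site("Site1", [
        VirtualServer("VirtualServer 1", [
            ServerInstance("ServerInstance 1"),
            ServerInstance("ServerInstance 2"),
        ]),
        VirtualServer("VirtualServer 2", [ServerInstance("ServerInstance 3")]),
    ]),
    Site("Site2"),
])

print(domain.server_instance_count())
```

In this model, a satellite office from the running example would be a site whose virtual servers each hold a single server instance, while a central office would hold virtual servers with several instances.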
metadata objects along with their connections to the content where applicable.
An object store can contain all of the content for the entire enterprise or can be
segmented from the overall enterprise design and assigned to a specific subset of
the overall problem. Regardless of the purpose, the object store contains the
entirety of all of the definitions required for use by users and any applications that
will access it. Figure 5-6 shows a graphical representation of the scope of an
object store.
[Figure 5-6: scope of an object store, showing class definitions and other definitions (workflows, choice lists, properties, events, security policies, and other)]
An object store is conceptually an object like all entities that make up a repository
and that has specific characteristics. Object stores are created through the use of
The first page of the wizard prompts for the display name, symbolic name, and
description of the object store. The wizard displays a list of currently used names
that are not allowed to be reused.
The second page of the wizard prompts for the Java Naming and Directory
Interface (JNDI) data sources for the object store database, where all of the
definitions and metadata will be stored. It requires both the regular JNDI name as
well as the transaction (JNDI XA) name.
The third page of the wizard prompts for the default storage location that will be
used for content storage. This can be the database, file storage, or a fixed
store.
The fourth page of the wizard prompts for the entry of the initial object store
administrators, as entries from the LDAP.
The fifth page of the wizard prompts for the initial ACL for the object store, which
lists the initial user groups associated with the store from the LDAP.
The sixth and final page of the wizard allows all of the information entered to be
reviewed one final time prior to acceptance and the actual creation of the object
store.
Recommendation: If your design calls for more than a single object store,
create a metastore that can contain all of the design objects that are common
across all of the stores and replicate this as changes are made.
If a metastore is utilized, do not roll this store out into production, because it is
strictly a development object store.
When creating an object store, always set the object store administrator to a
valid administrator logon and grant the administrator all permissions.
In addition to storage media type, there are a number of logical storage types,
such as database stores, file stores, fixed content stores, cached content stores,
and others. Each of these logical types has implications for performance and
functionality that must be considered when determining specifically where to
store content.
Database store
There is a single database store per object store, where the database store is
analogous to the object store. The database store can be used to store content,
but the content will be stored as a database binary large object (BLOB).
For larger content, this is not an effective use of the
database and can have serious impacts on the database performance.
Recommendation: The database store must only be used for content that is
no larger than 10 KB in size. Larger content sizes must be stored in a file store
to avoid detrimental impact on the database performance.
File store
There can be multiple file stores per object store, each one a separate
directory structure on the server. The file store can be on local storage media or
can be a mount point for remote, or networked, storage media. This is the typical
location for content; file stores on different media types can be
used for different content where appropriate (see Chapter 8, “Advanced
repository design” on page 185 for more details on file stores).
Recommendation: There must always be at least one file store defined for
the repository where content that is larger than 10 KB in size can be stored. The
media type and cost must be clearly understood from the file store name to
eliminate content storage area errors.
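The size-based guidance can be expressed as a simple routing rule. The sketch below is an illustration only and is not how P8 Content Manager storage policies are configured: it uses the 10 KB limit from the database store recommendation, assumes 1 KB means 1024 bytes, and borrows the file store name NFS-RAID from the earlier naming example as a hypothetical default.

```python
# Limit from the database store recommendation; 1 KB = 1024 bytes is an assumption.
DATABASE_LIMIT_BYTES = 10 * 1024


def choose_storage(content_size_bytes, file_store="NFS-RAID"):
    """Suggest a storage location for content of the given size.

    Content up to the limit may be kept in the database store; anything
    larger is directed to the named file store.
    """
    if content_size_bytes < 0:
        raise ValueError("content size cannot be negative")
    if content_size_bytes <= DATABASE_LIMIT_BYTES:
        return "database"
    return file_store


print(choose_storage(2_000))
print(choose_storage(5_000_000))
```

A rule of this shape also shows why at least one file store must exist: without it, there is nowhere to direct content that exceeds the database limit.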
The first level of document class design is concerned with the common
enterprise objects, as opposed to specific application objects. The result of this
first round of design is a hierarchical document class tree that contains all of the
common enterprise document classes that can be leveraged by specific
applications, because they are included in the P8 Content Manager solution.
There need to be a reasonable number of properties defined in each class. It is
easier to administer and expand a design where each document class is
concerned with a specific aspect of the design. The resultant tree is typically
neither extremely narrow, nor extremely wide. A narrow tree usually indicates
that the class design has focused too specifically on an aspect and has been too
exclusive. A wide tree usually indicates that there are too many aspects of the
design encapsulated at a level.
Another test that can be applied to the resultant design is to see how various
changes to the design can be made. If there are properties that have historically
changed somewhat frequently, or any properties that are projected to change,
see what changes need to be made to the design to accommodate the changes.
The ideal is to address a change with a change in a single class. This is a good
indication that you have the proper level of design encapsulation. The types of
changes to consider are property redefinitions, property additions, property
deletions, class additions, class modifications, class deletions, security updates,
functional changes, and organizational changes.
design much more usable as well. You must always take usability into
consideration during all the design phases. The use of subject matter experts at
this phase can greatly assist you in meeting the unspoken requirements and
usability goals of users.
Because document classes are key design objects in the system, there are many
additional components on which they depend. Most of these dependencies are
covered in the specific sections for the dependent elements. Probably the most
important dependency is the usage of the property templates in the class
designs. This dependency underscores the need to be clear and concise in the
property template definitions and consistent with naming and topology across the
entire design.
Finally, try to avoid designing for the current organization without being modular
enough to accommodate change. Avoid carrying over limitations of the current
system that stem from its design flaws or from limitations of
the tools that are used to support it. Take into account any current or future
processes in which the content is utilized. That is, always consider business
process automation in the design. Remember that there will always be additional
applications and functional areas that the system will need to support that are not
currently identified or even identifiable.
There are three focus areas that the document class design typically follows.
These are design based on organization, design based on content, and design
based on function. Although these are the major design approaches that are
used, variations on these themes as well as modifications and combinations of
these approaches are also successfully used. The right approach to utilize is
highly dependent on the specific details of your corporation and the application
that is supported by P8 Content Manager:
Design based on organization
Design based on organization starts with the first level of decomposition after
the enterprise root document class, which groups classes according to how the
corporation is organized. This can be reflected in line of business (LOB)
objects, support and business value objects, or any other high-level structure
that represents your organization. The subsequent layers of the hierarchy
then follow the organization down into smaller and smaller groupings. Each
level can also have classes that capture content-specific aspects where the
document content that they represent has consistency across the entire
organization from that root down the hierarchy. Eventually, the lowest level
represents document content classes that correspond to specific functional
areas or specific content.
This facilitates future changes that occur at the organizational level by
capturing these aspects as high in the tree as feasible and letting these
properties and attributes be inherited down the hierarchy.
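The benefit of capturing an aspect high in the tree can be seen in a small model of property inheritance. This is a sketch under stated assumptions, not the P8 Content Manager class model: the `DocumentClass` type and all class and property names are hypothetical.

```python
class DocumentClass:
    """A document class that inherits every property of its ancestors."""

    def __init__(self, name, parent=None, properties=()):
        self.name = name
        self.parent = parent
        self.own_properties = list(properties)

    def all_properties(self):
        """Return inherited properties first, then the class's own additions."""
        inherited = self.parent.all_properties() if self.parent else []
        return inherited + [p for p in self.own_properties if p not in inherited]


# Hypothetical organization-based hierarchy.
root = DocumentClass("EnterpriseDocument", properties=["DocumentTitle"])
operations = DocumentClass("OperationsDocument", root, ["Department"])
claim = DocumentClass("ClaimDocument", operations, ["ClaimType"])

print(claim.all_properties())

# An organizational change is made once, at the level where it belongs,
# and every class below that level picks it up through inheritance.
operations.own_properties.append("CostCenter")
print(claim.all_properties())
```

A change that touches only one class, as in the last step, is the outcome that the single-class change test described earlier is looking for.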
The first page of the wizard prompts for the display name, symbolic name, and
description of the document class. The wizard displays a list of currently used
names that are not allowed to be reused.
The second page of the wizard prompts for adding properties to the class, based
on the existing property templates. Although you have the ability to launch the
property template wizard from this page to create a new property, the best
practice is to establish all the necessary property templates prior to creating the
document classes.
The third page of the wizard allows you to set the attributes of the properties of
this class. You can mark a property as required, hidden, or the name property; set a
default value and a maximum size; associate a choice list with
the property; and apply other settings. You can set the attributes not only of the custom
properties that you defined on the second page of the wizard, but also of any system
or inherited property.
The fourth page of the wizard allows you to set the storage parameters of the
class. You can assign a storage policy here, direct the class to use a specific
storage area, or choose to inherit the settings of its parent class.
The fifth page of the wizard prompts for any event auditing that you want on
objects of this class. When you define trigger events, entries are made in
the audit log when those events occur.
The sixth and final page of the wizard allows all of the information entered to be
reviewed one final time prior to acceptance and the actual creation of the
document class.
All property templates, choice lists, storage policies, and storage areas need
to be created prior to creating any document classes that utilize them.
Never skip the step of designing high-level abstract objects that are for aspect
encapsulation and will most likely never be instantiated.
A key design decision that needs to be made is whether the main access
mechanism for content follows the search paradigm (represented in Figure 5-7)
or follows the browse paradigm (represented in Figure 5-8 on page 115). Both of
these paradigms offer their own strengths and weaknesses, and this decision
directly affects how folder classes will be used and instantiated.
[Figure 5-7: the search paradigm, shown as a Search Options form with Name and Size fields and Cancel and Submit buttons]
The search paradigm is very powerful, because it does not rely on the user
knowing where the content is in the system or the name of the object that
contains the content. Searching also returns a set of objects as an atomic
operation, and the maximum size of this set can be controlled. The set can
include objects that are located in diverse places in the repository. Effective use
of the search paradigm requires the selection of distinguishing
properties for the objects that have meaning to users. It also requires meaningful
document classes that are understood by users.
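The essentials of the search paradigm (matching on property values, ignoring location, and capping the result set) can be sketched over an in-memory list. This is not a FileNet query; the `search` function, the `max_results` parameter, and the sample objects are hypothetical.

```python
def search(objects, criteria, max_results=50):
    """Return objects whose properties match every criterion, up to max_results.

    The folder in which an object is filed plays no part in the match.
    """
    matches = [
        obj for obj in objects
        if all(obj.get(key) == value for key, value in criteria.items())
    ]
    return matches[:max_results]


# Hypothetical repository contents, filed in diverse places.
repository = [
    {"name": "Claim 1001", "folder": "/Claims/2007", "ClaimType": "Auto", "State": "CA"},
    {"name": "Claim 1002", "folder": "/Claims/2008", "ClaimType": "Home", "State": "CA"},
    {"name": "Claim 1003", "folder": "/Archive", "ClaimType": "Auto", "State": "NY"},
]

for found in search(repository, {"ClaimType": "Auto"}):
    print(found["name"], found["folder"])
```

The search is only as useful as the properties it can match on, which is why the choice of distinguishing properties matters so much in the class design.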
The search paradigm can be fronted with various methods of compiling the
search criteria and is usually best served by designing searches or by building
custom interfaces. It is usually a faster and more reliable method of finding
content than is offered by the browse paradigm.
The model for the browse paradigm is represented in Figure 5-8 as a typical file
system structure. There is some meaningful relationship between sets of folders
that leads the user to sets of content in an understandable way. The best
analogy is a file system tree structure. Although the analogy presented to help
understand the browse paradigm is a file system structure, a file system folder is
not the same as a P8 Content Manager folder, which supports multiple filed
locations.
The browse paradigm relies on the users who add the content to be thoughtful
and knowledgeable in the manner in which the content is filed. This potentially
includes filing the same content object in multiple folders. There is also a
requirement that the name of the content object has meaning in its context that is
understood by users.
The browse paradigm can increase the time that it takes for the system to search
for content, but it is well suited to users who all understand the basic concepts of
foldering and are used to using foldering for file system access. The browse
paradigm typically takes longer for users to find content than the search
paradigm, and it requires users to have inherent knowledge to be able to reliably
find content.
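The difference between a file system folder and a folder that supports multiple filed locations can be sketched as follows. This is an illustrative model only; the `FolderTree` class, its methods, and the paths are hypothetical and are not the P8 Content Manager foldering API.

```python
from collections import defaultdict


class FolderTree:
    """Folders that reference documents, so one document can be filed in many."""

    def __init__(self):
        self._contents = defaultdict(list)

    def file(self, document, folder_path):
        """File a document in a folder; filing it twice has no extra effect."""
        if document not in self._contents[folder_path]:
            self._contents[folder_path].append(document)

    def browse(self, folder_path):
        """List the documents filed directly in a folder."""
        return list(self._contents.get(folder_path, []))

    def locations(self, document):
        """List every folder in which a document is filed."""
        return sorted(
            path for path, docs in self._contents.items() if document in docs
        )


tree = FolderTree()
tree.file("Claim 1001", "/Claims/2007")
tree.file("Claim 1001", "/Customers/Smith")
tree.file("Claim 1002", "/Claims/2007")

print(tree.browse("/Claims/2007"))
print(tree.locations("Claim 1001"))
```

Whether a user finds Claim 1001 depends entirely on how thoughtfully it was filed, which is the reliance on the filing user that the browse paradigm carries.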
The first page of the wizard prompts for the display name, symbolic name, and
description of the folder class. The wizard displays a list of currently used names
that are not allowed to be reused.
The second page of the wizard prompts for adding properties to the class, based
on the existing property templates. Although you have the ability to launch the
property template wizard from this page and create a new property, the best
practice is to have established all the necessary property templates prior to
creating the folder classes.
The third page of the wizard allows you to set the attributes of the properties of
this class. You can mark a property as required, hidden, or the name property; set a
default value and a maximum size; associate a choice list with the
property; and apply other settings. You can set the attributes not only of the custom
properties that you defined on the second page of the wizard, but also of any system
or inherited property.
The fourth page of the wizard prompts for any event auditing that you want on
objects of this class. When you define trigger events, entries are made in
the audit log when those events occur.
The fifth and final page of the wizard allows all of the information entered to be
reviewed one final time prior to acceptance and the actual creation of the folder
class.
Avoid too many layers of too many folders (keep the total number to tens of
folders, not hundreds); this can impact retrieval performance.
There needs to be a single, top-level folder class that extends the base folder
class and from which all other folder classes will be derived.
All property templates and choice lists must be created prior to creating any
folder classes that utilize them.
Folder class characteristics
Folder classes have the following characteristics:
Have metadata
Are containable
Are not versionable
Are not content
Are containers
Most of the same considerations that are given to creating document object
classes (see 5.6.3, “Document classes” on page 110) also apply to designing
custom object classes: a single top-level class from which all others are derived,
a single design aspect captured per class, design with changes in mind, design in
modularity, and do not repeat any mistakes that the current system or processes
might have.
Custom object classes are created through a wizard interface in IBM FileNet
Enterprise Manager by right-clicking on an existing custom object class, such as
the base custom object class, and choosing new class (the custom object
classes are grouped under the heading of other classes along with a number of
other class types).
The first page of the wizard prompts for the display name, symbolic name, and
description of the custom object class. The wizard displays a list of currently
used names that are not allowed to be reused.
The second page of the wizard prompts for adding properties to the class, based
on the existing property templates. Although you have the ability to launch the
property template wizard from this page and create a new property, the best
practice is to have established all the necessary property templates prior to
creating the custom object classes.
The third page of the wizard allows you to set the attributes of the properties of
this class. You can mark a property as required, hidden, or the name property; set a
default value and a maximum size; associate a choice list with the
property; and apply other settings. You can set the attributes not only of the custom
properties that you defined on the second page of the wizard, but also of any system
or inherited property.
The fifth and final page of the wizard allows all of the information entered to be
reviewed one final time prior to acceptance and the actual creation of the custom
object class.
All property templates and choice lists need to be created prior to creating any
custom object classes that utilize them.
In addition to the methods for searching and browsing normal documents, you
can also search for component relationship objects using the IBM FileNet
Enterprise Manager’s search functionality.
Compound documents provide benefits to organizations by enabling:
Independent modifications to various components
Reuse of components in other documents
Time savings
Enhanced document quality and integrity
Most of the same considerations that are given to creating document object
classes (see 5.6.3, “Document classes” on page 110) also apply to designing
compound document classes: a single top-level class from which all others are
derived, a single design aspect captured per class, design with changes in mind,
design in modularity, and do not repeat any mistakes that the current system or
processes have.
There are two types of properties: the system properties that come preinstalled in
P8 Content Manager and custom properties that you create for your specific
installation. All of these properties can be utilized in any definitions as you see
fit.
The largest distinction between these property types is how they are displayed in
the properties tab of a class properties dialog. You can selectively display just
your custom properties, your custom properties and the system properties, or all
properties associated with a class.
Property templates must always have a data type associated with them. The
data type can have a cardinality of either single value or multi-value for all data
types. The basic data types used by P8 Content Manager are:
String Can contain any printable character up to the string size
limit set
Integer Can contain signed integer numbers with up to a 32-bit
representation
Object Can contain a reference to any object within the object
store
Float Can contain floating point numbers up to a 64-bit
representation. Floating point is inherently inexact in its
representation of decimal values and must only be used in
scientific and mathematical contexts where that is
understood.
Date/Time Can contain a representation of the date and time,
containing year, month, day, hour, minute, second, and
millisecond
Boolean Can contain a Boolean value of either true or false
Binary Can contain arbitrary BLOBs of binary data that can
contain non-printable characters. You cannot search on
multi-value binary data.
Primary ID Can contain a Microsoft globally unique identifier (GUID) for
reference to external entities
There are a number of attributes and controls that you can set on a property
template:
Name property A flag that indicates that the value of this property must be
used as the display name for the object as opposed to
using the GUID or filename defaults
Value required A flag that indicates that this property must not be empty
Hidden A flag that indicates that this property must not be visible
to the users of this class
Settability Controls when this property can be modified. The choices
include read/write, settable only on create, and settable
only before checkin.
Category An optional string value that can be used to group similar
property templates together and can be used for sorting
purposes
Choice list An association with a defined choice list that restricts the
users to selecting a value from the choice list as opposed
to a free-form entry
Minimum value Depending on the property data type, this property sets
the minimum allowed value to which the property can be
set.
Maximum value Depending on the property data type, this property sets
the maximum allowed value to which the property can be
set.
Default value Depending on the property data type, this property sets
the default value for the property.
String size Depending on the property data type, this property sets
the maximum length that the string property can be.
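The interplay of these attributes can be sketched in a few lines of Python. This is only an illustrative model, not the P8 Content Manager API: the class name `PropertyTemplate`, its field names, and the `validate` method are all hypothetical, and only the string and integer data types are covered.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class PropertyTemplate:
    """Hypothetical stand-in for a property template (not the product API)."""
    symbolic_name: str
    data_type: str                       # "string" or "integer" in this sketch
    value_required: bool = False
    minimum: Optional[int] = None        # meaningful for integer properties
    maximum: Optional[int] = None        # meaningful for integer properties
    string_size: Optional[int] = None    # meaningful for string properties
    choices: Optional[Sequence] = None   # values of an associated choice list

    def validate(self, value) -> List[str]:
        """Return a list of problems; an empty list means the value is accepted."""
        problems = []
        if value is None:
            if self.value_required:
                problems.append("a value is required")
            return problems
        if self.data_type == "integer":
            if self.minimum is not None and value < self.minimum:
                problems.append(f"{value} is below the minimum {self.minimum}")
            if self.maximum is not None and value > self.maximum:
                problems.append(f"{value} is above the maximum {self.maximum}")
        if self.data_type == "string" and self.string_size is not None:
            if len(value) > self.string_size:
                problems.append(f"longer than {self.string_size} characters")
        if self.choices is not None and value not in self.choices:
            problems.append(f"{value!r} is not in the choice list")
        return problems


priority = PropertyTemplate("Priority", "integer", value_required=True,
                            minimum=1, maximum=5)
region = PropertyTemplate("Region", "string", string_size=4,
                          choices=["EMEA", "APAC"])
```

In this sketch, `priority.validate(9)` reports one problem (above the maximum), while `region.validate("AMERICAS")` reports two (too long and not in the choice list).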
Property templates are created utilizing IBM FileNet Enterprise Manager and the
property template wizard. The wizard can be invoked by selecting the property
template folder in a specific object store, right-clicking, and selecting new
property template.
The first page of the wizard prompts for the display name, symbolic name, and
description of the property template. The wizard displays a list of currently used
names that are not allowed to be reused.
The second page of the wizard prompts for the selection of the property
template’s data type, one of the previously mentioned types.
The third page of the wizard allows the selection of a choice list to be associated
with this property template. This page of the wizard is only shown for the string
and integer data types. When the data type is string, this page allows the
association of a marking set in place of a choice list (for details about marking
sets, see Chapter 6, “Security” on page 131).
The fourth page of the wizard allows you to set the cardinality of the property
template and, in the multi-value case, if these values are either non-unique but
The fifth and final page of the wizard allows all of the information entered to be
reviewed one final time prior to acceptance and the actual creation of the
property template.
Avoid the creation of property templates that are named in such a manner that
it might be confusing to know which template to use.
Choice lists can consist of levels of groupings of values to make it easier for the
correct value to be selected. In the case of multi-value properties, the user can
select multiple entries from the choice list. Choice lists are created by selecting
the choice list wizard in IBM FileNet Enterprise Manager by right-clicking on the
choice list folder and selecting new choice list.
The first page of the wizard prompts for the display name, symbolic name, and
description of the choice list. The wizard displays a list of currently used names
that are not allowed to be reused.
The second page of the wizard prompts for the selection of the choice list data
type, which can be either integer or string.
The third page of the wizard allows interactive building of the choice hierarchy by
adding, moving, and deleting groups and entries until the desired choice list is
created.
The fourth and final page of the wizard allows all of the information entered to be
reviewed one final time prior to acceptance and the actual creation of the choice
list.
Limit the number of elements in each group to a small enough set that it can
be easily displayed and scanned.
Avoid assigning the same value to more than one item in a choice list.
Do not use choice lists for properties where the values are expected to change
frequently.
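The first two recommendations lend themselves to an automated check during design reviews. The following Python sketch assumes a choice list has been written down as nested dictionaries, where a group carries a `children` key and an entry carries a `value` key. That representation, the sample data, and the function names are assumptions for illustration and do not reflect how the product stores choice lists.

```python
from collections import Counter


def flatten(choice_list):
    """Yield (display_name, value) for every entry, descending into groups."""
    for item in choice_list:
        if "children" in item:           # a group of further choices
            yield from flatten(item["children"])
        else:                            # a selectable entry
            yield item["name"], item["value"]


def duplicate_values(choice_list):
    """Return the values that are assigned to more than one entry."""
    counts = Counter(value for _, value in flatten(choice_list))
    return sorted(v for v, n in counts.items() if n > 1)


def oversized_groups(choice_list, limit, name="(top level)"):
    """Return the names of groups holding more than `limit` direct elements."""
    found = [name] if len(choice_list) > limit else []
    for item in choice_list:
        if "children" in item:
            found += oversized_groups(item["children"], limit, item["name"])
    return found


# Hypothetical sample data: "US" is deliberately assigned twice.
regions = [
    {"name": "Europe", "children": [
        {"name": "Germany", "value": "DE"},
        {"name": "France", "value": "FR"},
    ]},
    {"name": "Americas", "children": [
        {"name": "USA", "value": "US"},
        {"name": "United States", "value": "US"},
    ]},
]
```

Here `duplicate_values(regions)` flags `US`, and `oversized_groups(regions, 2)` returns an empty list because no group exceeds two elements. A suitable limit depends on the client application that displays the list.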
5.6.9 Annotations
Annotations allow users to link additional information or comments to other
objects, such as documents. These annotations can be in any format, such as
text, audio, video, image, highlight, and sticky note. An annotation’s content does
not necessarily have to be the same format as its parent document and can be
published separately. Document annotations are uniquely associated with a
single document version; they are not versioned or carried forward when their
document version is updated, and a new version is created.
You can modify and delete annotations independently of their annotated object.
However, you cannot create versions of an annotation separately from the object
with which it is associated. By design, the annotation will be deleted whenever its
associated parent object is deleted. Annotations receive their default security
from both the annotation’s class and the parent object. You can apply security to
annotations that is different from the security applied to the parent.
Applications that use annotations will likely add properties for the particular kind
of annotation being implemented. For example, a property can be added
indicating the presence and location of the annotation. A voice annotation needs
a BLOB property in order to contain the sound file.
Annotation classes are created by utilizing the annotation wizard in IBM FileNet
Enterprise Manager by right-clicking on an existing annotation under other
classes and choosing new class.
The first page of the wizard prompts for the display name, symbolic name, and
description of the annotation class. The wizard displays a list of currently used
names that are not allowed to be reused.
The third page of the wizard allows you to set the attributes of the properties of
this class. You can mark a property as required, hidden, or the name property;
set its default value and maximum size; associate a choice list with it; and
adjust other settings. You can set the attributes not only of the custom
properties that you defined on the second page of the wizard, but also of any
system or inherited property.
The fourth page of the wizard allows you to set the storage parameters of the
class. You can assign a storage policy here, direct the class to use a specific
storage area, or choose to inherit the settings of its parent class.
The fifth page of the wizard prompts for any event auditing that you want on
objects of this class. By defining trigger events, subsequent entries are made in
the audit log when those events occur.
The sixth and final page of the wizard allows all of the information entered to be
reviewed one final time prior to acceptance and the actual creation of the
annotation class.
The first page of the wizard prompts for a description of this specific annotation
and gives you the option to associate one or more content objects with the
annotation.
The second page of the wizard is displayed when you are associating content
with the annotation. It prompts for the location of the files to add and allows you
to add any number of them to this annotation.
The third page of the wizard allows you to select the specific annotation class for
this annotation, as well as the storage policy to use for the annotation.
The fourth and final page of the wizard allows all of the information entered to be
reviewed one final time prior to acceptance and the actual creation of the
annotation.
5.6.10 Document life cycles
Document life cycles allow for the fact that a given document exists in a number
of states throughout its lifetime. Figure 5-9 shows a sample state diagram for the
typical document life cycle in the XYZ corporation.
[Figure 5-9: sample document life cycle state diagram (image not reproduced).
States: A Personal Documents, B Workgroup Documents, C Corporate Documents.
Labeled transitions: A1 Revise, A2 Share/Collaborate, A3 Retain Final Version,
A4 Destroy; B1 Process (workflow), B2 Revise, B3 Retain Final Version,
B4 Destroy; C1 Prepare for Revision, C2 Destroy.
Inputs: system-generated and external documents (non-production and production).]
In this example, documents are in one of three states: personal documents not
being shared or collaborated on, workgroup documents that have a limited scope
of sharing and are intended for collaboration, and corporate records that have
meaningful business value to the company.
In the first two states, a document can be revised and remain in its current state,
reach its end of life and be destroyed, or be promoted to a higher state. In the
workgroup collaboration state, documents can also be processed in some
automated way, such as through Business Process Manager. In the final
corporate document state, a document can also be demoted back to the
workgroup for revisions and updates.
While the figure captures the states and transitions between the states that a
document can take, it also illustrates how IBM FileNet Content Manager
document life cycles can be extremely useful. A document life cycle allows for the
definition of the states in which a document can be and then can associate that
IBM FileNet Enterprise Manager enables you to set up life cycles for documents.
Document life cycles are contained in two design classes: the life cycle policy
class and the life cycle action class:
Life cycle policy class
Defines the document’s states and identifies the life cycle action that
executes in response to state changes.
Life cycle action class
Action that the system performs when a document moves from one state to
another.
Document types in the Content Engine have default life cycle policies. You can
also assign a default life cycle policy to any new document class. When you
create a document using a class with an associated life cycle policy, the
document uses it as a default life cycle policy. This can be overridden at creation
time by assigning a different life cycle policy to the document.
A document that has a life cycle policy assigned also receives an additional tab
in the document’s property sheet. This tab enables promotion, demotion,
resetting, or placing the document in an exception state. Use this method to
change a document state manually when you design and test your life cycle
policies.
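The four manual operations can be pictured as a small state machine. The Python sketch below is a simplified model only: the class `LifeCycle` and its methods are hypothetical, and two behaviors are assumptions of the sketch rather than statements about the product, namely that promotion and demotion are blocked while the document is in an exception state and that a reset returns the document to the first state.

```python
class LifeCycle:
    """Toy model of a document moving through an ordered list of states."""

    def __init__(self, states):
        if not states:
            raise ValueError("at least one state is required")
        self.states = list(states)
        self.index = 0
        self.in_exception = False

    @property
    def state(self):
        return self.states[self.index]

    def _require_normal(self):
        # Assumption of this sketch: no transitions while in exception.
        if self.in_exception:
            raise RuntimeError("document is in an exception state")

    def promote(self):
        self._require_normal()
        if self.index == len(self.states) - 1:
            raise ValueError("already in the final state")
        self.index += 1

    def demote(self):
        self._require_normal()
        if self.index == 0:
            raise ValueError("already in the first state")
        self.index -= 1

    def set_exception(self):
        self.in_exception = True

    def reset(self):
        # Assumption of this sketch: reset returns to the first state.
        self.index = 0
        self.in_exception = False


doc = LifeCycle(["Personal", "Workgroup", "Corporate"])
doc.promote()   # Personal -> Workgroup
```

Walking a test document through every transition in this way, including the exception path, is one means of checking a policy design before documents depend on it.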
5.6.11 Events
IBM FileNet Enterprise Manager enables you to define events that extend the
functionality of an object store, which enables you to configure objects to perform
actions in response to specific activities that occur on each object defined on a
Content Engine server.
or class of objects to which the action applies, as well as which events trigger the
action to occur.
Keep event actions short to ensure quick completion. This is especially true
for synchronous subscriptions where the subscription processor waits for an
event action to complete before moving on to subsequent processing.
Make sure that you thoroughly test your events and subscriptions before
implementing them.
Set up each event action with code stubs that specify each event trigger
(Create, Update, Delete, CheckIn, CheckOut, File Event, Unfile Event), even if
you do not define functions for every trigger. The subscription controls which
of the triggers call an action. You need to prepare the action to handle all
triggers gracefully.
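The last recommendation can be illustrated with a dispatcher that has a stub for every trigger. The sketch below is written in Python purely to show the shape of the idea. The class `SampleEventAction`, its method names, and the returned strings are hypothetical and do not represent the Content Engine event action interface.

```python
import logging

log = logging.getLogger("event_action")

# Trigger names taken from the list above; "File" and "Unfile" stand for
# the File Event and Unfile Event triggers.
TRIGGERS = ("Create", "Update", "Delete", "CheckIn", "CheckOut", "File", "Unfile")


class SampleEventAction:
    """Handles every known trigger; unknown ones are logged, never raised."""

    def on_event(self, trigger, object_id):
        if trigger not in TRIGGERS:
            log.warning("unexpected trigger %s for %s", trigger, object_id)
            return "ignored"
        handler = getattr(self, "on_" + trigger.lower())
        return handler(object_id)

    # Triggers this action really cares about.
    def on_create(self, object_id):
        return f"recorded Create for {object_id}"

    def on_delete(self, object_id):
        return f"recorded Delete for {object_id}"

    # Stubs: present so that a misconfigured subscription does no harm.
    def _stub(self, object_id):
        return "no-op"

    on_update = on_checkin = on_checkout = on_file = on_unfile = _stub
```

With this layout, a subscription that unexpectedly routes a CheckOut event to the action results in a harmless no-op instead of an error, and each stub stays short, in line with the first recommendation.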
This section touches on points for you to consider when laying out the content in
the object store.
Recommendation: Try to limit the number of folder objects in the system and
try to avoid using the browse paradigm whenever possible.
5.7.2 Other objects
When laying out other objects in the repository, consider how the folder objects
are intended to be used and leverage their unique ability. Another aspect of object
repository layout is in the storage media that the content will use. Try to provide a
range of storage media and use it appropriately.
Try to match content with storage media in a meaningful way. Internal memos
and other short-lived pieces of content without lots of business value can be
stored on a simple network-attached storage (NAS) device while
content that is critical to the business operation can utilize a high-speed,
highly available storage subsystem that also has a higher cost associated with
it.
Chapter 6. Security
In this chapter, we discuss security concepts and how they apply to IBM FileNet
Content Manager (P8 Content Manager) and how P8 Content Manager can fit
into an enterprise security architecture. We introduce the basic concepts and
aspects of security. We describe the specific ways in which P8 Content Manager
utilizes various mechanisms for security and explain how security can be
controlled to provide highly granular and layered security.
Access
The access control facet of security concerns controlling the access to the data.
This access control includes controlling viewing the existence of the information,
creating information, viewing the information itself, modifying the information,
and removing the information. The access control facet is focused on being the
gatekeeper, allowing access to the information for only the authorized
operators and the authorized processes.
Integrity
The integrity facet concerns controlling the ability to alter the data. This includes
not only modification, but also creation and deletion. Integrity of the
information goes beyond the basic access control facet and provides a granular
ability to control and audit all changes that are made. This builds on the concept
of access by focusing on the control of changes that are made. This can include
specific controls over access at a highly granular level as well as specific access
permissions for specific operations, such as viewing, creating, modifying, and
deleting information.
P8 Content Manager provides for granular controls that can specify who is
authorized to alter any piece of data that it contains. These controls are provided
through ACLs that are associated with every object in the system. These ACLs
allow individual settings for actions, such as creation, modification, viewing, and
deletion based on an individual security principal.
Privacy
The data privacy facet of security concerns controlling visibility of the content of
the data. Although this appears to be the same as the access facet, this facet is
focused on techniques, such as data encryption and cryptography of the actual
data, as opposed to the access facet, which focuses on accessing the data
without regard to understanding the content or messages contained therein.
The major concern for privacy is the physical security of the hardware and
hardware access points of the information. The hardware must be trusted and
controlled as to who can physically access it. In addition, the network access
points, as well as the network topology, must be considered. Physical security is
outside the control of the P8 Content Manager.
Verification
The verification facet of security is concerned not directly with the data, but with
the user, or entity, that is attempting to interact with the data. The aspect of
authenticating, or verifying, the identity under which the process that is trying to
access a piece of data is running, is an example of the concerns of the
verification facet.
There are two major classes into which verification falls: a stand-alone,
self-contained verification source and a trusted federation of identity source:
Enterprise security service (single source)
An enterprise security service is an enterprise service that is responsible for
authenticating identities. It is usually a single source of authentication
information, such as Lightweight Directory Access Protocol (LDAP),
administered under the same administrative authority as the rest of the
security domain in the enterprise. A corporate-wide directory service is an
example of a single source for verification. It is important to note that a
corporate-wide security service is only considered secure and valid if it is
considered the repository of record for the authentication information.
It is also possible to have multiple sources or copies of the information, but
established in some synchronized manner that allows a master service with
distributed subordinate services as required, possibly in differing formats. In
cases where a single source is not technologically compatible with P8
Content Manager, it is acceptable for a bridging technology to provide the
information in a format that is consumable by JAAS as long as it does not
cache information, but always references back to the single source.
Security service federation (SSO)
Federation, or SSO, is the conceptual single point for individuals to
authenticate themselves and then have the resultant security token
recognized across all other locations where the same security is in place and
the token is recognized. Federation relies on established circles of trust that
allow externally verified identifications to be accepted into the system or
enterprise as trusted identities. Security federation technologies, such as
Security Assertion Markup Language (SAML), can also provide trusted
authentication verification for an individual or process. These technologies
enable trusted and secure transport of identity from outside the enterprise or
organization that utilizes a repository of record. This allows the authentication
mechanisms of trusted and vetted external entities to be used in the system.
6.1.2 Models
Various facets of security are implemented through different security models.
Where 6.1.1, “Facets of security” on page 132 focuses on security of information
from a system or enterprise view, the models focus on security from the user
perspective. These models represent different ways of approaching security
issues. They are not necessarily mutually exclusive, nor are they intended to be
exhaustively implemented. These models need to encapsulate the security
viewpoints and goals of the security policies of any entity. We also discuss how
each of these models can be applied to, or supported by, the P8 Content
Manager environment.
An access control entry (ACE) is a single entry from the ACL that contains a
single identity. The ACL is a list that is made up of ACEs.
Recommendation: Fully understand the ACL and ACE model and how it is
utilized in P8 Content Manager prior to doing any design work.
Silos
Silos or islands are individual instances of security that are based on
organizational or functional composition of business. Silos are not centralized,
and they are not enterprise-wide. Silos typically represent ad hoc or
departmental security frameworks that are not tied into external resources. This
model allows isolation of security into a smaller than usual domain. It is typically
used during the development and testing of products in order to keep any
security vulnerabilities out of the production environment.
Recommendation: Isolate the security domain into its own silo during
development and testing. Utilize development to test principals and
authentication mechanisms that are separate from the production systems.
This safeguards the enterprise in the event of unforeseen interactions or
breaches while tuning the security setup of P8 Content Manager.
Chinese Wall
The Chinese Wall model (also known as the Brewer and Nash model; see footnote 1) is a
security model where read/write access to files is governed by membership of
data in conflict-of-interest classes and data sets. This is the basic model used to
provide both privacy and integrity for data. This methodology allows the security
to be driven directly from the classes and data sets of the information,
segregating the accesses based on their content.
P8 Content Manager can support the Chinese Wall methodology through ACL
settings on the data objects contained in the system themselves. The control can
be assigned on a role basis, identifying the conflict of interest data sets at design
time.
Footnote 1: Dr. David F. C. Brewer and Dr. Michael J. Nash (1989), “The Chinese Wall Security Policy”:
http://www.cs.purdue.edu/homes/ninghui/readings/AccessControl/brewer_nash_89.pdf
Note: Although roles are not supported in IBM FileNet P8, IBM FileNet P8 can
leverage a role-based methodology. See “Role-based access control” on
page 138. Role-based access control is also a very important concept from
the enterprise security level.
Perimeter
The Perimeter security model concerns entrance and exit points around the
data. This model assumes that a perimeter can be visualized around the security
concerns, all entities inside the perimeter are considered trusted and secure,
while all entities outside the perimeter are suspect. This focuses the security
control on all the interaction areas between the trusted and untrusted zones.
P8 Content Manager supports perimeter security through the JAAS modules that
control the authentication and verification of entities attempting to access the
system. The controls can be put in the JAAS layer to act as a gatekeeper across
this line. Perimeter controls that have the P8 Content Manager inside the trusted
zone are outside of the control and purview of the P8 Content Manager.
Multi-level
Multi-level security occurs at separate and distinct places in the enterprise where
each location is focused on securing a specific security aspect. This method
allows distribution of the security administration to specific organizational entities
that have the best understanding of their specific security requirements.
Layered security
Layered security occurs at a number of layers, thereby protecting the central
core element. Similar to multi-level security, a layered model accumulates the
security from all the surrounding layers into a single security setting for the data
at access time. The major distinction with this model is every layer serves to
This method allows an abstraction of security roles away from the concrete
identity of a specific individual. This model allows a great flexibility in the
assignment of principal identities in a dynamic manner while retaining a relatively
static assignment of roles from the application perspective. This is a powerful
model to decouple the direct dependence of a certain application on the
underlying authentication mechanism.
From the perspective of the application, P8 Content Manager in our case, this is
indistinguishable from identity-based control as the translation from individual
principal identity and role is accomplished in the security level directly.
Although P8 Content Manager does not contain a role concept, this can be
modeled for P8 Content Manager by utilizing the corporate directory service and
assigning groups based on a role perspective. Having groups in the corporate
directory that represent roles provides that abstract layer benefit of role-based
access control.
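The benefit of that abstraction layer can be shown with a short Python sketch. Everything in it is hypothetical: the group names, the users, and the helper functions are invented for illustration, and the `directory` dictionary merely stands in for a corporate directory service.

```python
# Directory groups that represent roles (hypothetical names and members).
directory = {
    "claims-adjuster": {"alice", "bob"},
    "claims-auditor": {"carol"},
}

# The ACL on the content refers only to the role group, never to individuals.
modify_grantees = {"claims-adjuster"}


def identities_for(user):
    """Return the user's own identity plus every group the user belongs to."""
    return {user} | {group for group, members in directory.items()
                     if user in members}


def may_modify(user):
    """True if any of the user's identities is a grantee on the ACL."""
    return bool(identities_for(user) & modify_grantees)
```

When `carol` changes jobs, adding her to the `claims-adjuster` group in the directory is sufficient. The ACLs on the content do not need to be touched, which is the decoupling that role-based access control is meant to provide.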
controls. Objects assigned to the life cycle can then progress through the state
transitions and have their security aspects changed as appropriate.
P8 Content Manager supports the object aspect model directly through the
facility of marking sets, thereby allowing specific aspects of an object to be
directly responsible for their access control. We explain object aspect control
more completely in the P8 Content Manager security features section.
The ACM model provides a complex relationship that allows every possible
situation to be individually modeled. It is seldom used in practice due to the
overhead and complexity involved. P8 Content Manager can simulate certain
aspects of ACM through the Workplace interface and its implementation of roles.
The role-based security provided is outside of the scope of this document, and
therefore, we do not discuss it. The ACL support of P8 Content Manager, as
previously discussed, is the support for ACM provided at this level.
Users and groups in the security system can be authenticated by that system,
and gain access to a P8 Content Manager system in their domain, but there are
no inherent permissions with this authentication. All of the authorization
permissions exist in the P8 Content Manager-specific security contexts that have
been assigned. Individual accounts are typically mapped into role-based groups,
and then these defined roles are utilized in the system to set all the ACL
permissions as appropriate. Because the authorization mechanism is inside the
control of P8 Content Manager, it is possible for a superuser of the system to
have no rights or privileges in the P8 Content Manager.
6.2.1 Authentication
Authentication of individuals, or ideally of the roles that an individual has, through
the external authentication mechanism is key to the security features in P8
Content Manager. The two standards at the core of the authentication process in
P8 Content Manager are the Java Authentication and Authorization Service
(JAAS) standard and the Web Services Security standard (WS-Security). The
JAAS standard forms the framework for security interoperability in the J2EE
world, while the WS-Security standard forms the framework for security
interoperability in the heterogeneous world of clients and servers that
communicate through Web services interfaces.
The CREATOR-OWNER identity takes on the principal identity of the user who
creates an object at creation time. This identity is typically used in the default
instance permissions and is replaced by the actual creator’s identity when the
default permissions are transferred into the object instance at creation time. This
identity is typically granted full rights to an object, because it has total control
over its modification and destruction, which allows designers to model security
that will be given to the person who actually creates an object at run time.
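The substitution described here can be sketched as follows. This is a simplified Python model under stated assumptions: permissions are represented as plain `(grantee, access, right)` tuples, and the function name and the right names are hypothetical. It is not how Content Engine stores or copies permissions internally.

```python
CREATOR_OWNER = "CREATOR-OWNER"   # placeholder identity used at design time


def instantiate_default_permissions(default_acl, creator):
    """Copy the class's default permissions, replacing the placeholder
    with the identity of the user who is creating the object."""
    return [
        (creator if grantee == CREATOR_OWNER else grantee, access, right)
        for grantee, access, right in default_acl
    ]


# Hypothetical default instance permissions defined on a class.
class_defaults = [
    (CREATOR_OWNER, "allow", "FULL_CONTROL"),
    ("staff", "allow", "VIEW"),
]

instance_acl = instantiate_default_permissions(class_defaults, "alice")
```

After the call, `instance_acl` names `alice` wherever the class definition named the placeholder, and the class defaults themselves are unchanged.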
6.2.3 Authorization
When an individual, who has already been authenticated, attempts to access
IBM FileNet P8 objects, Content Engine will attempt to retrieve that individual’s
The ACL on a specific object has a number of entries or ACEs. Each ACE either
allows or denies a specific right or a set of rights to a specific identity. For
example, a particular class of documents can allow one identity to modify a
document and at the same time deny a second identity the same right. It is
important to note that deny always takes precedence over allow (see footnote 2), which means
that you must set up ACLs carefully. If an individual is allowed access to a
document under one identity but belongs to a group identity that is denied
access, the individual will not have access to the object.
Every ACE has a source, which you can view in the IBM FileNet Enterprise
Manager’s security editor:
Default: Permissions are placed on an object by the Default Instance Security
ACL of its class, as well as permissions placed on a subclass by its parent
class. Default permissions are copied from the class definition to the object’s
ACL at creation time and are treated identically to Direct ACEs. Default ACEs
are directly editable; if you edit a Default ACE, its source type becomes
Direct.
Direct: Permissions are added directly to an object. Direct ACEs are directly
editable.
Inherited: Permissions are placed on the object by a security parent or by
setting up a relationship with an object-valued property whose Security Proxy
Type has been set to Inherited. Inherited ACEs are not directly editable.
Template: Permissions are placed on the object by a security policy or
document life cycle policy. Template ACEs are not directly editable and do
not appear on classes. Rather, a document, folder, or custom object class
might have a default Security policy that will pass template ACEs to the
instances of the class, if all the conditions for the template apply.
Each ACE has one access type: either Allow or Deny. When evaluating the
access granted by a particular ACL, the current system applies ACEs in the
following order of precedence (higher in the list takes precedence over lower):
Direct/Default Deny
Direct/Default Allow
Template Deny
Template Allow
Inherited Deny
Inherited Allow
Footnote 2: That is, within the same hierarchical level. Directly applied ACEs take
precedence over inherited ACEs.
You cannot remove or change an inherited access right, but you can override one
by directly allowing or denying an access right. To edit an inherited access right,
the administrator must modify the parent that is the source of the inherited
access right.
Because Deny has precedence over Allow within each category (for example, a
Template Deny takes precedence over a Template Allow), if you explicitly deny an
access right to a group and explicitly allow it to a member of that group, the
access right will be denied to the member.
Thus, if an ACL contained two ACEs that were identical in every respect except
that one was an Inherited Deny and the other a Direct Allow, the Direct Allow
takes precedence, with the result that the user is allowed the access right.
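The precedence rules can be expressed compactly in code. The following Python sketch models them under simplifying assumptions: each ACE covers a single right, group membership is supplied as a flat set of identities, and the names `ACE`, `PRECEDENCE`, and `is_allowed` are hypothetical. It illustrates the ordering described in the text and is not the Content Engine implementation.

```python
from dataclasses import dataclass

# Highest precedence first. Default ACEs are treated like Direct ACEs.
PRECEDENCE = [
    ("direct", "deny"), ("direct", "allow"),
    ("template", "deny"), ("template", "allow"),
    ("inherited", "deny"), ("inherited", "allow"),
]


@dataclass(frozen=True)
class ACE:
    grantee: str             # user or group name
    access: str              # "allow" or "deny"
    right: str               # a single right, for example "MODIFY"
    source: str = "direct"   # "direct", "default", "template", or "inherited"


def _rank(ace):
    source = "direct" if ace.source == "default" else ace.source
    return PRECEDENCE.index((source, ace.access))


def is_allowed(acl, identities, right):
    """Decide `right` for a principal known by `identities` (the user plus groups)."""
    applicable = [a for a in acl
                  if a.grantee in identities and a.right == right]
    if not applicable:
        return False          # rights are implicitly denied
    return min(applicable, key=_rank).access == "allow"


# An Inherited Deny is overridden by a Direct Allow ...
acl_a = [ACE("alice", "deny", "MODIFY", "inherited"),
         ACE("alice", "allow", "MODIFY", "direct")]
# ... but a Direct Deny on a group beats a Direct Allow on its member.
acl_b = [ACE("staff", "deny", "DELETE"),
         ACE("alice", "allow", "DELETE")]
```

With these inputs, `is_allowed(acl_a, {"alice"}, "MODIFY")` is true, while `is_allowed(acl_b, {"alice", "staff"}, "DELETE")` is false, matching the two cases discussed above.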
Access rights are assigned within P8 Content Manager to identities that are
defined in the directory service. Thus, a directory service user who might, in
other contexts, be considered a superuser, might not have any rights within P8
Content Manager, depending on how the IBM FileNet P8 administrator chooses
to assign access rights to the user.
Each ACE that is present on an object’s ACL either allows or denies the right to
do certain things. For example, a particular class of documents can allow one
user to delete a document but deny another user the same right. Following
standard practice, deny always takes precedence over allow, which means you
must set up ACLs carefully. An ACE can allow or deny rights to either a user or a
group, which is referred to as the grantee. If the grantee is a group, the ACE
applies to anyone who is a member of that group. Thus, if one user is allowed
access to a document as a user but belongs to a group that is denied access, the
user will not have access to the object.
Note that permission granted by the object store ACL is necessary, but not
sufficient. The object must pass through both gates for the operation to be
allowed.
Object gate
The object gate evaluates the rights granted for the specific object that is the
target of the requested operation and either permits or disallows the operation
depending on whether those rights include those particular to that operation. For
example, a delete operation will be permitted or disallowed by the DELETE
access right.
Evaluation of the rights granted to the object is also in two steps. First, the
object’s own ACL is evaluated, yielding a provisional set of granted rights. Then,
a check is made of the markings applied to the object, if any, which might result
in some rights being removed from the provisional set. (Markings only remove
rights; they do not add them. We discuss markings in more detail later.)
Evaluation of the object’s ACL follows the order of precedence described earlier.
All rights are implicitly denied unless there is an explicit allow ACE that is not
overridden by a higher precedence deny.
Inheritance model
This model allows the security of an object to be determined based on placement
in a containment hierarchy or, more generally, in any subordinate relationship to
another object, inheriting its security from that object. The mere placement of an
object in a container does not automatically activate inheritance; it must be set by
identifying the container as a security parent. In this case, the security parent's
ACL is also checked as part of the authorization process, which allows folders to
have specific ACL settings that can supersede or augment security on the objects
that they contain.
Life cycle model
This model allows the security of an object to be determined based on transitions
through states in a life cycle, either the implicit life cycle of the versioning model,
an explicit life cycle defined through a document life cycle policy, or
application-defined state transitions. In all these cases, the security effect of the
state transition involves the addition and removal of ACEs with a source
Template from the object’s ACL. Versioning and application-defined state
transitions enact these changes through a Security Policy. Document life cycle
state transitions enact these changes through security templates attached to the
state definitions within the document life cycle policy.
Document life cycle actions and life cycle policies have the following security
characteristics:
Both are instances of their own classes: the Document Lifecycle Policy class
and Document Lifecycle Action class. Therefore, they obtain initial security
from the Default Instance Security ACL of their class, just as all objects do
when first created. Both classes are subclassable. You can view and modify
these classes under IBM FileNet Enterprise Manager’s Other classes node.
Life cycle actions and policies are independently securable. They are not
required to have the same security as the security placed on the document
class (or individual document) to which they are attached. It is obviously a
much simpler security model if they do. However, it can be configured
differently if required by the needs of the application’s security.
Document life cycle actions and policies do not have a security parent
relationship with any other object. Specifically, a life cycle policy does not
have a security inheritance relationship with the document class to which it is
associated.
In IBM FileNet Enterprise Manager, individual life cycle actions and life cycle
policies are displayed in subfolders of the Object Stores → object store
name → Document Lifecycles node. If these folders are empty, it means that
none have been created for that object store. Both objects have property
sheets containing a Security tab, which allows you to view and modify the
security on that object.
Like other objects, life cycle actions and life cycle policies have an owner
property. The owner does not need to be the same as the owner of the
document with which the life cycle policy is associated.
Note that the default ACEs placed into the object’s ACL are copies from the class
definition. Subsequent changes to the class definition do not take effect on
existing instances of that class.
Marking sets
Marking sets are intended to provide a mechanism for supporting records
management of content stored in P8 Content Manager. Marking sets allow
specific metadata attributes to be identified that control the effective access of an
object in conjunction with the object’s ACL. Because marking sets were designed
for this specific purpose, use this security mechanism only if no other feature
can provide the model that you need. When using marking sets, exercise caution
and make sure that you thoroughly understand how they operate. Marking sets
are used in the IBM FileNet Records Manager extension to P8 Content Manager.
A marking modifies the access granted to an object (by the object’s ACL) based
on a specific property value. The marking set definition assigns a constraint
mask to each possible value of the marking-controlled property and assigns
rights to “use” each value to individual users or groups. If a user attempts to
access an object marked with a value to which that user does not have “use”
permission, the access rights represented by the constraint mask for that value
are removed from the provisional set of rights determined by evaluating the
object’s ACL. (If “use” permission is granted, nothing happens; no rights are
added).
You can have multiple properties assigned to a single class with associated
marking sets, and they will all be used to determine the final access to the object.
The collection of all markings that are actually applied to a particular object is
displayed by the IBM FileNet Enterprise Manager as the object’s “active
markings”.
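The effect of markings on the provisional set of rights can be expressed as a bit mask operation. The following is a minimal sketch under our own assumptions: the class names, the right bits, and the marking values are hypothetical, and the real Content Engine evaluation has more detail than is shown here.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class MarkingSketch {
    // Hypothetical right bits used only by this sketch.
    public static final int VIEW = 1;
    public static final int MODIFY = 2;
    public static final int DELETE = 4;

    public static final class Marking {
        final String value;            // value of the marking-controlled property
        final int constraintMask;      // rights removed when "use" is not granted
        final Set<String> useGrantees; // users or groups with "use" permission

        public Marking(String value, int constraintMask, String... useGrantees) {
            this.value = value;
            this.constraintMask = constraintMask;
            this.useGrantees = new HashSet<>(Arrays.asList(useGrantees));
        }
    }

    /**
     * Starts from the rights granted by the ACL and removes the constraint mask
     * of every active marking that the caller is not permitted to use.
     * Rights are never added.
     */
    public static int effectiveRights(int provisionalRights,
                                      List<Marking> activeMarkings,
                                      Set<String> identities) {
        int rights = provisionalRights;
        for (Marking marking : activeMarkings) {
            boolean hasUse = !Collections.disjoint(marking.useGrantees, identities);
            if (!hasUse) {
                rights &= ~marking.constraintMask;
            }
        }
        return rights;
    }

    public static void main(String[] args) {
        List<Marking> active = Arrays.asList(
                new Marking("Secret", MODIFY | DELETE, "Cleared"),
                new Marking("Legal Hold", DELETE, "Records"));
        int fromAcl = VIEW | MODIFY | DELETE;

        Set<String> bob = new HashSet<>(Arrays.asList("bob", "Records"));
        Set<String> carol = new HashSet<>(Arrays.asList("carol", "Cleared"));

        // bob cannot use "Secret", so MODIFY and DELETE are removed.
        System.out.println(effectiveRights(fromAcl, active, bob));
        // carol cannot use "Legal Hold", so only DELETE is removed.
        System.out.println(effectiveRights(fromAcl, active, carol));
    }
}
```

Because every active marking is applied in turn, a user must hold "use" permission on all of them to keep the full set of rights granted by the ACL.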
Except where specifically mentioned, this topic describes the association of
security policies with documents and document classes. To fully understand this
topic, you must be familiar with document versioning and the versioning states
Released, In Process, Reservation, and Superseded.
Note: Security policies are just one way to apply ACEs to an object’s ACL.
The other sources are the object’s class, a security parent, direct edits to the
object’s security, and programmatic setting of the object’s access rights.
Sequence tables detail the versioning states through which documents proceed
following check in, check out, and other versioning actions.
To create a security policy, you run one of the security policy wizards provided by
IBM FileNet Enterprise Manager and by Workplace and Workplace XT. The
wizard creates a security policy that the system administrator can then customize
by adding security templates. When created, the security policy can be
associated with documents, folders, or custom objects. Alternatively, the
administrator can make the security policy the default value for the security policy
property for one or more classes. Making a specific security policy the default for
a class ensures that all instances of the class are associated with that security
policy unless the value is explicitly overridden.
The security policy class can have subclasses, just like the other classes in IBM
FileNet Enterprise Manager’s Other Classes node. The security policy wizard
lets you create a security policy using a subclass, whereas IBM FileNet
Enterprise Manager’s wizard supports only the base class. You can also use one
of the supported IBM FileNet P8 APIs to create a security policy using a
subclass.
Newly created security templates contain no default permissions that have been
placed on them by the Content Engine. Administrators can add permissions at
creation time while running the security policy wizard, or at any later time.
Applying a template that contains no permissions to an object will have the effect
of removing any existing permissions on that object that were previously applied
by a security policy.
In the first case, the default security policy is automatically associated with the
object instance at the time of creation unless the default is explicitly overridden.
The default security policy will continue to be associated with all versions in the
document’s version series, unless you do something to change the association.
By having the same security policy for all documents in a class, you have a
simple, easily understandable and manageable security scheme. If, however,
you change a single document version’s class, the default security policy of the
new class (provided there is one) is immediately applied to that document
version, and the old security policy (if there was one) is removed. However,
changing a version’s class does not override a security policy that was directly
assigned to that version by a user, nor does it change any earlier versions of the
same document.
In the second case, you assign a security policy to a specific document version.
Each document version in a version series can, theoretically, have a different
security policy assigned to it. The default security policy of the document class
will be placed on each instance of the class, but you can override the default with
a different security policy. You do this manually by using IBM FileNet Enterprise
Manager to open the document version’s property sheet and changing the
security policy. This is cumbersome and difficult to manage from a system
administrator’s point of view and must be done only as an exception to the
normal application by the document class.
In addition to the list of security templates associated with it, each security policy
has an important property called Preserve Direct ACEs (also called Preserve
Direct Permissions). This property, which can be set to either True or False,
governs whether or not direct permissions are preserved in the target object’s
ACL when a security template is applied to it. The value of this property applies
to all the templates contained by the security policy.
By default, this property is set to True, because this is likely to be the most
common use case. In fact, the security policy wizard does not ask you to set a
value and just sets it to True. After you have created the security policy, you can
open its property sheet’s General tab to view or change the Preserve Direct
ACEs setting.
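The interaction between a security template and the Preserve Direct ACEs setting can be modeled in a few lines. The following is a minimal sketch, not the Content Engine implementation: the class and method names are hypothetical, and we assume that applying a template replaces all ACEs whose source is Template, removes direct ACEs only when the setting is False, and leaves inherited ACEs untouched.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class TemplateApplySketch {
    public enum Source { DIRECT, TEMPLATE, INHERITED }

    public static final class Ace {
        public final String grantee;
        public final Source source;
        public final boolean allow;
        public final int rights;

        public Ace(String grantee, Source source, boolean allow, int rights) {
            this.grantee = grantee;
            this.source = source;
            this.allow = allow;
            this.rights = rights;
        }
    }

    /** Returns the ACL that results from applying a security template. */
    public static List<Ace> applyTemplate(List<Ace> acl,
                                          List<Ace> templateAces,
                                          boolean preserveDirectAces) {
        List<Ace> result = new ArrayList<>();
        for (Ace ace : acl) {
            if (ace.source == Source.TEMPLATE) {
                continue; // ACEs from a previously applied template are removed
            }
            if (ace.source == Source.DIRECT && !preserveDirectAces) {
                continue; // direct ACEs survive only when the setting is True
            }
            result.add(ace);
        }
        for (Ace ace : templateAces) {
            // ACEs copied from the template carry the source Template.
            result.add(new Ace(ace.grantee, Source.TEMPLATE, ace.allow, ace.rights));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Ace> acl = Arrays.asList(
                new Ace("alice", Source.DIRECT, true, 1),
                new Ace("Authors", Source.TEMPLATE, true, 3),
                new Ace("Everyone", Source.INHERITED, true, 1));

        // An empty template removes the old template ACE and adds nothing.
        List<Ace> empty = Collections.emptyList();
        System.out.println(applyTemplate(acl, empty, true).size());
        System.out.println(applyTemplate(acl, empty, false).size());
    }
}
```

The sketch also shows why an empty template is not a no-op: it still removes the permissions that an earlier template placed on the object.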
IBM FileNet Enterprise Manager is the tool that system administrators use in
their daily work. IBM FileNet Enterprise Manager gives system administrators
easy access to most of the administrative and security features needed for
Content Engine security configuration tasks.
Security auditing can be turned on for any class and for any operation that can
be authorized for any object. Turning on security auditing adds an audit log to the
object for which it has been activated. Note that security auditing, especially
if activated for actions such as view or search, can be very resource intensive
in terms of both storage and processing. You must exercise care when activating any auditing
feature. When these types of changes are made, we highly recommend that
there be a formal logging of what has specifically been activated in order to
Although most security changes beyond the design phase are normally atomic in
nature, explicit security settings are sometimes done in bulk. IBM FileNet
Enterprise Manager provides for a batched-mode update operation, which is
driven by a search to allow a bulk change in things, such as specific roles or
identities.
[Figure: JAAS authentication process. Labels: User Credentials, JAAS Login Module, JAAS Subject, JAAS Credentials]
The user presents a set of credentials to the security service. These credentials
are processed through the JAAS login module, which references the directory
service as well as a secure credential store. The login module, upon
authentication of the user credentials, creates a JAAS subject containing the
identities of the user as given in the directory server. This JAAS subject is then
returned to the user’s session and is then available to be utilized for authorization
of operations as they are requested. The entire process of subject creation is
dependent on the J2EE architecture and on the J2EE container service, such as
WebSphere, on which P8 Content Manager is installed.
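The authentication flow described above can be reproduced with the standard JAAS classes in the JDK. The following is a minimal sketch that runs outside any application server: the login module, the principal class, and the in-memory user data are hypothetical stand-ins for the directory service and credential store, and a real deployment relies on the login modules configured in the J2EE container instead.

```java
import java.security.Principal;
import java.util.*;
import javax.security.auth.Subject;
import javax.security.auth.callback.*;
import javax.security.auth.login.*;
import javax.security.auth.spi.LoginModule;

public class JaasLoginSketch {
    // Hypothetical stand-ins for the credential store and the directory service.
    static final Map<String, String> PASSWORDS = new HashMap<>();
    static final Map<String, List<String>> GROUPS = new HashMap<>();
    static {
        PASSWORDS.put("alice", "secret");
        GROUPS.put("alice", Arrays.asList("Engineering"));
    }

    public static final class NamedPrincipal implements Principal {
        private final String name;
        public NamedPrincipal(String name) { this.name = name; }
        @Override public String getName() { return name; }
        @Override public boolean equals(Object o) {
            return o instanceof NamedPrincipal && ((NamedPrincipal) o).name.equals(name);
        }
        @Override public int hashCode() { return name.hashCode(); }
    }

    /** Hypothetical login module that checks the in-memory data above. */
    public static final class DemoLoginModule implements LoginModule {
        private Subject subject;
        private CallbackHandler handler;
        private String user;

        @Override public void initialize(Subject subject, CallbackHandler handler,
                                         Map<String, ?> sharedState, Map<String, ?> options) {
            this.subject = subject;
            this.handler = handler;
        }

        @Override public boolean login() throws LoginException {
            NameCallback name = new NameCallback("user: ");
            PasswordCallback password = new PasswordCallback("password: ", false);
            try {
                handler.handle(new Callback[] { name, password });
            } catch (Exception e) {
                throw new LoginException(e.toString());
            }
            String expected = PASSWORDS.get(name.getName());
            char[] given = password.getPassword();
            if (expected == null || given == null || !expected.equals(new String(given))) {
                throw new FailedLoginException("bad credentials");
            }
            user = name.getName();
            return true;
        }

        @Override public boolean commit() {
            // Populate the subject with the user identity and its group identities.
            subject.getPrincipals().add(new NamedPrincipal(user));
            for (String group : GROUPS.getOrDefault(user, Collections.<String>emptyList())) {
                subject.getPrincipals().add(new NamedPrincipal(group));
            }
            return true;
        }

        @Override public boolean abort() { user = null; return true; }
        @Override public boolean logout() { subject.getPrincipals().clear(); return true; }
    }

    /** Presents the credentials to the login module and returns the resulting subject. */
    public static Subject authenticate(final String user, final String password)
            throws LoginException {
        CallbackHandler handler = new CallbackHandler() {
            @Override public void handle(Callback[] callbacks) {
                for (Callback cb : callbacks) {
                    if (cb instanceof NameCallback) ((NameCallback) cb).setName(user);
                    if (cb instanceof PasswordCallback) {
                        ((PasswordCallback) cb).setPassword(password.toCharArray());
                    }
                }
            }
        };
        Configuration config = new Configuration() {
            @Override public AppConfigurationEntry[] getAppConfigurationEntry(String name) {
                return new AppConfigurationEntry[] { new AppConfigurationEntry(
                        DemoLoginModule.class.getName(),
                        AppConfigurationEntry.LoginModuleControlFlag.REQUIRED,
                        Collections.<String, Object>emptyMap()) };
            }
        };
        LoginContext context = new LoginContext("demo", null, handler, config);
        context.login();
        return context.getSubject();
    }

    public static void main(String[] args) throws LoginException {
        for (Principal p : authenticate("alice", "secret").getPrincipals()) {
            System.out.println(p.getName());
        }
    }
}
```

In a J2EE container the configuration is normally supplied by the application server rather than built in code; the programmatic Configuration here only keeps the sketch self-contained.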
After a user has a JAAS subject, the JAAS subject is then utilized for the
authorization of specific operations as shown in Figure 6-2, which illustrates the
JAAS authorization process. The application server infrastructure automatically
makes the JAAS subject available to the P8 Content Manager server. How this
happens depends on the application server implementation, because the JAAS
subject is not always passed directly over the wire. WebLogic might serialize the
subject, but WebSphere uses a security token to propagate and revive the JAAS
subject on the other end.
The identities that are encapsulated in the JAAS subject are from the directory
service and are the same set of identities that is utilized in P8 Content Manager
ACL settings. The authorization process takes the identity from the JAAS subject
(along with group membership information for that identity that is obtained from
the directory) and uses that combined information for comparison against
applicable objects’ ACLs.
For exhaustive and detailed coverage of security for the IBM FileNet P8 Platform,
including the P8 Content Manager and its specifics, see IBM FileNet P8 Security
Help, GC31-5524, which is a compilation of the detailed installation and
configuration information for IBM FileNet P8 security from ecm_help. You can
download it from the previous Web site or directly from the following URL:
ftp://ftp.software.ibm.com/software/data/cm/filenet/docs/p8doc/40x/FileNet_P8_security.pdf
7
Because it is an administrator’s tool that can be used for doing extraordinary and
powerful low-level changes, IBM FileNet Enterprise Manager strikes a balance. It
exposes low-level details of the P8 Content Manager, yet it remains usable
through extensive task wizards and other user interface help.
There are many things for which IBM FileNet Enterprise Manager can be
used, but here are just a few of the things for which you are most likely to use it:
Creating an object store, storage area, or other infrastructure object
Creating or changing a marking set
Adding classes and properties to an object store
Creating and running queries for routine maintenance and troubleshooting
Adjusting Content Engine server logging levels
Managing subscriptions and events
Browsing the contents of an object store
Exporting and importing objects from and to object stores
7.1.2 Workplace and Workplace XT
In contrast to IBM FileNet Enterprise Manager (see 7.1.1, “IBM FileNet
Enterprise Manager” on page 154), Workplace and Workplace XT are intended
for the wider audience of non-administrator users. Even though they are generic
in nature, they still provide a comfortable and productive user interface for
accomplishing a variety of everyday tasks. Workplace and Workplace XT are
Web applications. The user interface for these applications uses emerging Web
2.0 and Ajax technologies to closely model a desktop application experience. It
provides easy-to-use windows and wizards for navigating and searching for
documents and folders. It also provides integration with the Process Engine
through an inbox for workflow tasks and the ability to manage and launch
workflows.
BPF is optimized for creating applications that involve workflow steps for
processing content. Many business applications fall into this category, which is
often called case management. Examples of case management in everyday
situations include loan processing, customer service inquiries, document
authoring and approval cycles, and many others. Typically, there are one or
more documents, which together must go through several logical business steps
performed by different organizations, employees, or automated systems.
7.2 Application technologies
Content Manager comes with a set of applications that you can use as is. The
applications include IBM FileNet Enterprise Manager, Workplace XT, and others
(see 7.1, “IBM FileNet P8 applications” on page 154). The operations and
interfaces provided by these applications might not always satisfy your
company’s business requirements. In many circumstances, you have to create
custom applications to fulfill your business needs. Your applications will be
designed with specific business goals in mind, and those come in many varieties.
We do not attempt to cover business goals here. Instead, we discuss more
general application technologies.
One of the JAR files distributed with P8 Content Manager is Jace.jar. The
Jace.jar file contains the classes and supporting files for the Content Engine
Java API. The API acts remotely from the server but ultimately must be able to
communicate with the server to do any productive work. For a discussion of the
available transports for communicating with the server, see 7.3.2, “Transports
available with the APIs” on page 165. The use of the Web services transport with
a thick client is easy to understand: the API translates requests into Web
services calls, and the server’s Web services listener receives and responds to
them.
If Enterprise JavaBean (EJB) transport is used, the interaction within the API is
more complicated. The details of the interaction are not exposed, and you
generally do not have to worry about them. However, it is useful to have a basic
understanding. The J2EE specification refers to a stand-alone Java virtual
The most obvious disadvantage is that the users must run a Java-capable Web
browser, and the use of Java applets must be enabled. All major Web browsers
are Java-capable, but for security reasons, organizational policies sometimes
forbid enabling the running of Java applets.
An applet is launched from a link on a Web page. The applet infrastructure has
built-in mechanisms for caching the applet and its supporting JAR files on the
client machine. The infrastructure automatically notices version updates and
performs fresh downloads when needed.
Applets have restricted access to client system resources. Your applet can be
granted access to whatever resources you need, because Java has a rich
permissions infrastructure. However, the conservative permissions infrastructure
makes it difficult to deploy even simple applets without security pop-up dialogs.
The dialogs either create unnecessary worry in users, or they are casually
approved without full consideration. This unpleasant aspect of the user
experience has given applets a reputation for being difficult for the everyday
user, and their use is often limited to experts and administrators.
7.2.3 J2EE Web applications and other components
The technology underlying much of the enterprise software development these
days is Java 2 Enterprise Edition (J2EE). The J2EE platform helps you make
efficient use of resources by providing common services, such as security, high
availability, transaction management, and scalability. Because the platform
provides these services with mechanisms for configuring them when the
applications are deployed, you are free to concentrate on business logic in your
applications. The Content Engine, which is implemented as J2EE components,
uses many common features of the J2EE platform. You can write Content
Engine applications with traditional thick client Java applications or even
non-Java client technologies, but the tightest integration will naturally be
available when your application is integrated with a J2EE application server.
There are many standardized technologies available in the J2EE platform, but a
couple are particularly worth mentioning, because they often show up in typical
J2EE application development: servlets and Enterprise JavaBeans (EJBs):
The J2EE servlet container is often thought of as the container for Web
applications, because it represents the tier where J2EE presentation logic is
generally placed. Web applications are perhaps the most popular use for
servlets, but it is not necessary to have an actual Web interface to use
servlets. For example, the P8 Content Manager WebDAV provider is
implemented using a servlet, and the user interface is provided by the
WebDAV client applications. The servlet container is appropriate for
application components, which receive and respond to outside requests and
which optionally preserve some state on the server side between requests.
The J2EE EJB container provides what are often thought of as
enterprise-level services. For example, EJBs can have declarative security
and transactional properties, provide transparent load balancing across
servers, and provide nearly transparent access to relational databases. EJBs
are frequently used to encapsulate reusable business logic and seldom, if
ever, contain any presentation logic.
Consider implementing SOA if you can decompose your application logic into
small, well-defined request and response pieces that map well to the loose
coupling of SOA.
This section describes the APIs available in the IBM FileNet P8 4.0 release. We
do not discuss the compatibility APIs (for Java and Content Manager) that exist
to help in the transition of applications written for earlier P8 Content Manager
releases. The IBM FileNet P8 Platform Version 4.0 Installation and Upgrade
Guide describes features and limitations of the compatibility APIs. We
recommend the IBM FileNet P8 4.0 APIs for any new development and, where
possible, for additions to existing applications. It is not possible to use the IBM
FileNet P8 4.0 APIs to communicate with a pre-4.0 Content Engine server.
Sample applications are made available from time to time in the support area for
P8 Content Manager at:
http://www.ibm.com/software/ecm
Java API
P8 Content Manager provides a full-featured Java API. Any feature that is
available in the server is completely available to Java programmers. This access
includes routine operations, such as retrieving and updating Document objects,
and specialized operations, such as adding a custom class or property to an
object store’s metadata definitions.
The Java API is structured with Java classes having names that match system
metadata classes in the server. For example, there are Java classes for
Document, Folder, CustomObject, and so on. When you instantiate a Java object
(usually through a subclass of the Factory class), it refers to an object that
actually resides in the server. The API maintains stateless interactions with the
server and is intentionally loosely coupled, which means that the API objects are
not actually holding a server-side connection or other resources for the life of the
Java object.
Dirty property values and pending actions are not sent to the server until an
explicit call is made to do so. If an API object is discarded without that call, the
changes are never made on the server. The most common method of sending
changes to the server is to call the save() method on an API object. There is also
a batching mechanism for sending updates to multiple objects in a single
round-trip over the network. Batching provides improved performance and
provides transactional atomicity for all of the changes in the batch.
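The deferred-update behavior can be illustrated with a toy model. The following is a minimal sketch and deliberately does not use the com.filenet.api classes: the ClientObject and Batch classes, the in-memory SERVER map, and the rule that a null value makes an update fail are all hypothetical, chosen only to show that changes stay local until they are explicitly sent and that a batch is applied as a unit.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PendingChangesSketch {
    /** Stand-in for the server: object id mapped to its stored properties. */
    public static final Map<String, Map<String, Object>> SERVER = new HashMap<>();

    /** Client-side object that buffers dirty property values. */
    public static final class ClientObject {
        final String id;
        final Map<String, Object> dirty = new HashMap<>();

        public ClientObject(String id) { this.id = id; }

        /** Records the change locally only; nothing is sent to the server. */
        public void putValue(String property, Object value) {
            dirty.put(property, value);
        }

        /** Sends the pending changes in one round-trip and clears the buffer. */
        public void save() {
            SERVER.computeIfAbsent(id, k -> new HashMap<>()).putAll(dirty);
            dirty.clear();
        }
    }

    /** Groups updates to several objects into one all-or-nothing request. */
    public static final class Batch {
        private final List<ClientObject> members = new ArrayList<>();

        public void add(ClientObject object) { members.add(object); }

        public void update() {
            // Validate every member first so that a failure changes nothing.
            for (ClientObject object : members) {
                for (Object value : object.dirty.values()) {
                    if (value == null) {
                        throw new IllegalArgumentException("invalid value in " + object.id);
                    }
                }
            }
            for (ClientObject object : members) {
                object.save();
            }
        }
    }

    public static void main(String[] args) {
        ClientObject doc = new ClientObject("doc-1");
        doc.putValue("DocumentTitle", "Draft");
        System.out.println(SERVER.containsKey("doc-1")); // change is still local
        doc.save();
        System.out.println(SERVER.get("doc-1"));
    }
}
```

If a ClientObject in this model is discarded before save() is called, the SERVER map never sees its changes, which mirrors the behavior described for API objects.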
.NET
P8 Content Manager provides a full-featured .NET API, which you can use to
write programs in any .NET-compatible language. With a couple of exceptions,
any feature that is available in the server is completely available to .NET
programmers. The exceptions are mainly custom code that must be executed
within the server, for example, EventActions. Because the Content Engine
server is a J2EE application, internally executed custom code is limited to
Java-compatible technologies.
The principles behind the .NET API are the same as those behind the Java API
(see “Java API” on page 161), so we do not repeat that discussion here. One
significant feature available only with the .NET API is the use of Kerberos to
perform authentication via Microsoft Windows Integrated Login. This is only
possible when both the client application and the Content Engine server are
running on Microsoft Windows.
Web services
Modern, loosely coupled frameworks, such as service-oriented architecture,
favor Web services protocols for connecting components. P8 Content Manager
provides Content Engine Web Services (CEWS) for accessing nearly all features
available in the server.
There are still a few occasions where the direct use of CEWS might be useful:
You have an application already using CEWS, and no plans exist for
immediately porting it to the Java or .NET API.
You are building an application component as part of a framework in which
the use of Web services is the model for communicating with external
systems.
Although a rare occurrence, you might be using a language or technology that
can make use of Web services but is not compatible with the use of a Java or
.NET API.
For these occasions, the direct use of CEWS is a good choice and is fully
supported.
When setting things up, use care in choosing the Web services endpoints. There
are two distinct sets of endpoints. One set, which you sometimes hear referred to
as “the 3.5 endpoints”, has WSDLs compatible with CEWS from P8 Content
Manager 3.5.x releases. Those endpoints, which are easily identified by noting
the 35 in the endpoint name, are supported for direct use. The other set of
endpoints, sometimes referred to as “the 4.0 endpoints”, is used internally by the
APIs for transport and is not supported for direct use outside of the APIs.
In theory, you can take the WSDL file for CEWS and use any current Web
services toolkit to generate the interfaces that you will use on your end. In
practice, however, toolkits are still individualistic in their handling of various
WSDL features, and it is difficult to write a WSDL for a complex service that is
usable by a wide cross-section of Web services toolkits. Check the latest
hardware and software support documentation, IBM FileNet P8 Platform 4.0.x
Hardware and Software Requirements, and only use a supported toolkit.
Your toolkit will generate programming language stubs and other artifacts so that
you can include CEWS calls in your program logic. The details of those artifacts
vary from toolkit to toolkit, but you will surely see representations of Content
Engine objects, properties, and update operations. The P8 Content Manager
product documentation describes how these pieces interrelate, but we
recommend you also read the developer documentation for the Java or .NET API
to get additional understanding of how the things mentioned in the WSDL were
intended to be used.
In all, there are over 600 classes in the APIs. To assist developers in keeping the
usage of all these classes organized, the classes themselves are arranged into
packages in the Java API and namespaces in the .NET API. The arrangement is
similar between the two APIs, differing only in the stylistic naming conventions of
the different programming environments. All exposed classes (classes supported
for external use) are in subpackages of com.filenet.api (in Java) or
subnamespaces of FileNet.Api (in .NET). Unless specifically documented
otherwise, classes in any other package or namespace are strictly internal and
not supported for external use.
Table 7-1 summarizes the more prominent types of classes in the APIs. The table
is intentionally incomplete and only gives a flavor of the API organization.
Package Description Examples
There are two available transports: Web services (WS) and Enterprise
JavaBeans (EJB). EJB transport is available only for the Java API, whereas WS
transport is available for both APIs. For most situations, the EJB transport is
preferred, but the WS transport can be used in more environments. In all cases,
the transport is considered stateless, which means that the APIs operate on the
basis of a single request and response for each interaction. No client state is
maintained by the server after a request has been serviced.
EJB transport
The EJB transport internally uses EJB method calls. The method calls are made
on the client side and transported by the application server to the server side of
WS transport
As its name implies, the WS transport uses Web services protocols. In fact, the
WS transport uses an enhanced version of the Content Engine Web Services
(CEWS) protocol. You probably already know that means XML over HTTP or
HTTPS. Because HTTP and HTTPS use only a single port for the entire
conversation and use a strict client-server interaction model, it is generally easier
to configure a firewall or reverse proxy through which to allow WS transport
requests to pass.
Web services attachments are used for carrying pieces of content between the
client and server sides. Attachment handling has undergone a lot of changes
over the years, and different environments and tools support different standards:
When using the Java API, you must select the CEWS endpoint that supports
Direct Internet Message Encapsulation (DIME) attachments (recognizable
because it has DIME in the endpoint name).
When using the .NET API, you must select the CEWS endpoint that supports
Message Transmission Optimization Mechanism (MTOM) attachments
(recognizable because it has MTOM in the endpoint name).
In both cases, you must select the endpoint with 40 in the endpoint name. Do not
select the endpoint with 35 in the name.
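These selection rules are simple enough to express as a filter. The following is a minimal sketch: the endpoint names in the example are hypothetical placeholders that merely follow the naming pattern described above, so check the product documentation for the actual endpoint names in your installation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class EndpointChoiceSketch {
    public enum Client { JAVA_API, DOTNET_API }

    /**
     * Picks the first endpoint whose name carries the 40 marker and the
     * attachment style required by the given API (DIME for Java, MTOM for .NET).
     */
    public static Optional<String> choose(List<String> endpointNames, Client client) {
        String attachment = (client == Client.JAVA_API) ? "DIME" : "MTOM";
        return endpointNames.stream()
                .filter(name -> name.contains("40") && !name.contains("35"))
                .filter(name -> name.contains(attachment))
                .findFirst();
    }

    public static void main(String[] args) {
        // Hypothetical endpoint names used only for illustration.
        List<String> names = Arrays.asList(
                "EXAMPLE35DIME", "EXAMPLE40DIME", "EXAMPLE40MTOM");
        System.out.println(choose(names, Client.JAVA_API).orElse("none"));
        System.out.println(choose(names, Client.DOTNET_API).orElse("none"));
    }
}
```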
transport unless instructed otherwise through a parameter setting on the
Connection object.
The EJB used by the EJB transport automatically propagates any ambient
JAAS authentication context to the server. If you are already using a
JAAS-based authentication scheme, either in isolation or as part of a single
sign-on (SSO) framework, P8 Content Manager is very likely to participate in
that scheme with few or no configuration changes.
In contrast, there is no general framework for propagating an authentication
context when using WS transport. Although a standard called WS-Security
provides a high-level framework for adding authentication schemes, WS
transport can only support schemes backed by specific implementation
programming. P8 Content Manager directly supports WS-Security Username
token and Kerberos token authentication schemes. The latter can be used to
facilitate integration with Microsoft Windows applications. Custom
authentication schemes can also be implemented by using the IBM FileNet
Web Services Extensible Authentication Framework (WS-EAF). Specific
details of using Kerberos and WS-EAF are provided in the Web Service
Extensible Authentication Framework Developer’s Guide section of the online
help files, IBM FileNet P8 Documentation.
WS transport, which is based on HTTP or HTTPS, uses just one or two
TCP/IP ports for all interactions. There are also commercially available
products for examining and validating Web services traffic. Therefore, many
administrators find it easier and more secure to open their firewalls to WS
transport requests. In contrast, EJB transport might use a vendor-specific
binary protocol. Such protocols often employ a range of TCP/IP ports. These
factors typically lead to a greater willingness to allow WS transport to pass
through firewalls and a reluctance to do the same for EJB transport.
In cases where WS transport is using Username token authentication, the
credentials will appear on the wire unprotected unless you use Secure
Sockets Layer (SSL), which we strongly recommend.
For the first of these problems, the software industry has evolved to a model of
pluggable authentication. That means that components for verifying different
credential types can be developed independently of the framework into which
they fit. The output of a pluggable authentication framework is often a token
affirming that valid credentials were presented and verified. That is typically
enough information for most authentication consumers, although some systems
also provide information about the types of credentials that were presented.
The Java environment and JAAS framework work together to affiliate a security
Subject with a thread of execution after authentication has taken place. When
you use EJB transport to connect to the Content Engine server, the transport
automatically propagates the Subject along with the rest of the request. The
Content Engine server does not care what credentials were used to authenticate
the user; it cares only that authentication was successfully performed via JAAS
within a trusted environment. P8 Content Manager includes sample JAAS
configuration files suitable for use with various application servers, but the
Content Engine does not depend on their use. For any particular environment,
any JAAS configuration can be used as long as it is compatible with the
application server.
If you use WS transport to connect to the Content Engine server, the JAAS
Subject cannot be directly propagated due to technology differences. Instead,
individual authentication and credential schemes must be specifically
anticipated in code. For a Java environment, you must use the
com.filenet.api.util.WSILoginModule. It intercepts the raw user ID and
password credentials and arranges for them to be transmitted to the Content
Engine server. For .NET environments, the Content Engine .NET API can
transmit Kerberos tokens to take advantage of Windows Integrated Logon. In
either case, the WS Listener on the Content Engine server immediately takes the
security context information that it receives and performs a JAAS login, so that
the bulk of the Content Engine server is only aware of the JAAS framework.
Get or fetch
When many people think about interacting with an object from the server, they
first think about doing a round-trip to fetch the object. That is a necessity for
many things, but there are several cases where you do not need that initial fetch.
For example, if you are only going to use an object so you can set the value of an
object-valued property on another object, you really only need a reference. If you
somehow know that the object already exists, you can skip the round-trip to fetch
it. (If it turns out that you were wrong and it did not already exist, the referential
integrity mechanisms in Content Engine will throw an exception when you try to
save the referencing object.) The APIs have a mechanism called fetchless
instantiation. There are three flavors of Factory methods for creating
programming language objects that reference Content Engine objects, and you
can tell them apart by the word used as the beginning of the method name:
create indicates that a new Content Engine object is to be created. No
round-trip is done as the result of this Factory method call, although a save
call must eventually be done.
fetch indicates that a round-trip will be immediately made to the Content
Engine to verify that the object exists and to return an initial set of properties.
The properties returned can be fine-tuned via an optional
PropertyFilter (see “Property filters” on page 170).
get indicates fetchless instantiation: a reference to an existing Content
Engine object is created without a round-trip, so the existence of the object
is not verified at that point.
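To make the round-trip costs concrete, here is a toy model in plain Java. It is a sketch under stated assumptions: FetchlessServer and FetchlessRef are hypothetical stand-ins that only count simulated round-trips, and they are not the Content Engine API. The method names merely echo the create, fetch, and get prefixes discussed above.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model only: these classes are hypothetical stand-ins, not the
// Content Engine API. The server object simply counts round-trips.
final class FetchlessRef {
    final String id;
    FetchlessRef(String id) { this.id = id; }
}

final class FetchlessServer {
    private final Set<String> existingIds = new HashSet<>();
    int roundTrips = 0;

    // Test setup helper; not counted as a round-trip.
    void seed(String id) { existingIds.add(id); }

    // fetch-style call: one round-trip that verifies existence.
    FetchlessRef fetchInstance(String id) {
        roundTrips++;
        if (!existingIds.contains(id)) {
            throw new IllegalStateException("No such object: " + id);
        }
        return new FetchlessRef(id);
    }

    // Fetchless instantiation: a local reference only, no round-trip.
    FetchlessRef getInstance(String id) {
        return new FetchlessRef(id);
    }

    // create-style call: also local; the object reaches the server on save.
    FetchlessRef createInstance(String id) {
        return new FetchlessRef(id);
    }

    // Saving an object that points at 'target' costs one round-trip. The
    // referential integrity check happens here, on the server side.
    void saveWithReference(FetchlessRef owner, FetchlessRef target) {
        roundTrips++;
        if (!existingIds.contains(target.id)) {
            throw new IllegalStateException("Referenced object missing: " + target.id);
        }
        existingIds.add(owner.id);
    }
}
```

In this model, setting a reference to a known object and saving the referencing object costs one round-trip in total. Fetching the target first would cost two.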
Property filters
Property filters are optional parameters to a number of methods that fetch
objects or properties from the Content Engine. They allow highly granular control
of the objects or properties being returned.
Options for using property filters are described in detail in the online reference
help for the PropertyFilter class. Most of the Content Engine API calls that can
take a property filter will also accept a null value. In these cases, the API still
works correctly, but it might make additional round-trips behind the scenes. It is
designed that way so that you can get your application working quickly and
optimize the performance later.
Pending actions
Methods that at first glance seem to make updates to Content Engine
objects (for example, checkin(), checkout(), and delete()) only mark
the programming language object with the change. The APIs call these pending
actions. This is why a call to save() is needed to send pending actions and
property value updates to the Content Engine. Less obvious is that you can
queue up multiple pending actions on a single object. Not all combinations of
pending actions make sense, but when it does make sense to combine them, you
can save network round-trips. This feature is especially useful when combined
with batching.
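The following toy class sketches the idea of queuing pending actions locally and sending them together. PendingActionDocument is a hypothetical name and the class is not part of the Content Engine API; it only records actions and counts simulated round-trips.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy model only (hypothetical class, not the Content Engine API).
final class PendingActionDocument {
    private final List<String> pendingActions = new ArrayList<>();
    private final List<String> committedActions = new ArrayList<>();
    private int roundTrips = 0;

    // Each of these methods only marks the local object; nothing is sent.
    void create() { pendingActions.add("create"); }
    void setProperty(String name, String value) {
        pendingActions.add("set " + name + "=" + value);
    }
    void checkin() { pendingActions.add("checkin"); }

    // One simulated round-trip carries every queued action.
    void save() {
        roundTrips++;
        committedActions.addAll(pendingActions);
        pendingActions.clear();
    }

    int roundTrips() { return roundTrips; }
    int pendingCount() { return pendingActions.size(); }
    List<String> committedActions() {
        return Collections.unmodifiableList(committedActions);
    }
}
```

Queuing create, a property update, and checkin before a single save() reaches the server in one trip in this model, where saving after each step would take three.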
Batching
The Content Engine APIs contain two separate but similar batching mechanisms:
A RetrievingBatch is used to fetch multiple, possibly unrelated, objects from
the Content Engine in a single round-trip. Object references and property
filters are added to the batch, and retrieveBatch() is called to trigger the
round-trip.
An UpdatingBatch is used to group multiple updates in a single round-trip to
the Content Engine. Instead of calling save() on individual objects, the
objects are added to the batch, and updateBatch() is called to trigger the
round-trip. Updates are performed as an atomic transaction.
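The all-or-nothing behavior of an updating batch can be sketched with a toy model. ToyUpdatingBatch is a hypothetical class that works against an in-memory map; it is not the Content Engine UpdatingBatch class, and the rule that a null value is an invalid update is an assumption made only to force a failure.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only (hypothetical class, not the Content Engine API).
final class ToyUpdatingBatch {
    private final Map<String, String> store;              // simulated server state
    private final List<String[]> updates = new ArrayList<>();
    int roundTrips = 0;

    ToyUpdatingBatch(Map<String, String> store) { this.store = store; }

    // Adding to the batch is local; nothing is sent yet.
    void add(String id, String value) { updates.add(new String[] { id, value }); }

    // One simulated round-trip. Changes are staged on a copy and published
    // only if every update succeeds, so the batch is atomic.
    void updateBatch() {
        roundTrips++;
        Map<String, String> working = new HashMap<>(store);
        for (String[] update : updates) {
            if (update[1] == null) {   // assumed failure condition for the demo
                throw new IllegalArgumentException("Invalid update for " + update[0]);
            }
            working.put(update[0], update[1]);
        }
        store.clear();
        store.putAll(working);
        updates.clear();
    }
}
```

If any update in the batch fails, the simulated store is left exactly as it was before updateBatch() was called.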
There is another type of transaction that you can control in your application. If
you use the Java API with EJB transport, you can include Content Engine activity
within a client-side transaction. This feature is unavailable when using WS
transport (see “WS transport” on page 166). The client-side transaction can be
started implicitly by the J2EE container or started explicitly through your use of a
javax.transaction.UserTransaction object.
P8 Content Manager follows the J2EE model for transactions, and J2EE in turn
follows industry standards for distributed transactions. In this context, the
relevant facts are that a transaction is started, operations performed by a
transactional resource (in this case, Content Engine) are tagged with the
transaction identifier, and the transaction is either committed or rolled back. All
changes tagged with a given transaction identifier are committed or rolled back
as an atomic unit.
You control whether or not your Content Engine calls participate in a client-side
transaction by configuring the Connection object. By default, Connection objects
are configured to not participate in a client-side transaction (even for EJB
transport). This presents the least surprising behavior, because both transports
give the same behavior by default. You make the following call on a Connection
object conn to change from the default behavior:
conn.setParameter(ConfigurationParameter.CONNECTION_PARTICIPATES_IN_TRANSACTION, Boolean.TRUE);
On analysis, applications using client-side transactions can almost always be
rewritten to use API batching. For the few
cases where client-side transactions are genuinely needed, they are supported
as described. The case where you might be forced into a client-side transaction
is when your application must include transactional resources outside of P8
Content Manager. For example, if you must include P8 Content Manager
updates atomically with updates to a stand-alone database, that is a motive for
using a client-side transaction. If you do find yourself using a client-side
transaction that you cannot avoid, do your best to minimize the amount of time
that the transaction is active.
An available AddOn can then be installed into an object store, which means that
the data is imported and the post-install script is run. IBM FileNet Enterprise
Manager has menu actions and wizards for manipulating AddOns, including
selecting which AddOns to install when an object store is created.
The SQL query syntax is exactly the same for the JDBC interface and the native
API queries, including extensions for handling class hierarchies and folder
containment. Here is an example of a typical query:
In this example, Invoice is a custom class and the properties mentioned are
custom properties on that class. For the purposes of this query, it does not really
matter whether it is a subclass of Document, Folder, or some other class.
The JDBC interface follows the JDBC specifications and programming models,
but the motivation for its development was primarily for use by reporting tools.
The JDBC interface is also purely read-only. Therefore, the JDBC interface is not
an especially good choice for use in application development. For general
application programming, the native APIs provide a richer interface.
When an event occurs in Content Engine, any active subscriptions link the event
to an EventAction and ultimately to your code, which implements the interface
com.filenet.api.engine.EventActionHandler. (Because the Content Engine
runs in a J2EE application server, all event action handlers are written in Java,
so that interface does not appear in the .NET API.) Through the onEvent() method,
your code receives parameters that describe the event that occurred as well as
the state of the object when the event occurred. For some events, you get both
before and after snapshots of the object.
For an asynchronous event subscription, your event action handler is called
after the change to the object has been committed to the database. Your
handler does not run within the context of an active EJB transaction. Instead,
it has its own transaction started by Content Engine. You can make changes
to the triggering object, but those changes are just normal, additional changes
like you might make from a client program. Your handler cannot veto the
original change, because it has already happened and been committed.
Because it is an asynchronous event subscription, your handler is called at
some time after the commit. Although you can usually expect your handler to
be called within a few hundred milliseconds, overall system load and
competing event action handlers contribute to the overall timing, and there is
no guaranteed time by which your handler will run.
By using the event subscription model in Content Engine, you can create
handlers that monitor changes to objects not just from your application or
components, but from all sources.
The details of how to create and publish an API or framework vary by technology
and by organizational environments. Here are general considerations when
planning these interfaces:
Can you separate out the control of your interface into configuration (which
controls overall behavior, locations, names, and so on) and parameters
(which can vary from call to call within the same invocation)?
Have you built any assumptions into the interface which can just as easily be
made configuration items or input parameters?
Alternatively, do not make things into configuration items or parameters if
they will actually never change. It is easy to add configuration items or
parameters later with appropriate default values, but it is more difficult to
remove or change them after they are published.
What is the appropriate layer in your module that is likely to be useful to
someone else? The separation between infrastructure and use cases is a
good starting point, but you might find further layering points within the
infrastructure and use cases. You might end up publishing more than one API
or framework.
Your organization, industry segment, or technology community might already
have formal or informal standards for the look and feel of APIs or frameworks.
Use those as a guide when creating your own, although you will ultimately
have to come to your own conclusions when there is not an exact fit. The goal
is to make it easy for others to understand and use your interfaces
productively.
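The separation of configuration from call parameters in the list above can be sketched as follows. All names here (ArchiverConfig, Archiver, the /Invoices default, and the maxRetries item) are hypothetical and exist only to illustrate the layering; they do not come from any product API.

```java
// Hypothetical names throughout; this is a sketch of the layering idea only.
final class ArchiverConfig {
    final String objectStoreName;
    final String rootFolder;
    final int maxRetries;

    private ArchiverConfig(Builder b) {
        this.objectStoreName = b.objectStoreName;
        this.rootFolder = b.rootFolder;
        this.maxRetries = b.maxRetries;
    }

    static Builder builder(String objectStoreName) {
        return new Builder(objectStoreName);
    }

    static final class Builder {
        private final String objectStoreName;
        // Configuration items carry defaults, so an item added in a later
        // release does not break callers written against an earlier one.
        private String rootFolder = "/Invoices";
        private int maxRetries = 3;

        private Builder(String objectStoreName) {
            this.objectStoreName = objectStoreName;
        }

        Builder rootFolder(String value) { this.rootFolder = value; return this; }
        Builder maxRetries(int value) { this.maxRetries = value; return this; }
        ArchiverConfig build() { return new ArchiverConfig(this); }
    }
}

final class Archiver {
    private final ArchiverConfig config;

    Archiver(ArchiverConfig config) { this.config = config; }

    // Parameters vary from call to call; configuration does not.
    String targetPath(String customerId, String invoiceNumber) {
        return config.rootFolder + "/" + customerId + "/" + invoiceNumber;
    }
}
```

A builder with defaults is one way to add configuration items later without changing existing callers. Removing an item after publication remains the harder change.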
There is a downside to reusable APIs and frameworks. The more others come to
depend on them, the harder it is to change them in upwardly compatible ways.
There is a really good chance that the first revision will not be correct, and you
will want to change things in later releases in ways that are incompatible with
earlier releases. If you are the only developer or if the development of all the
using applications is in the same small organization, it is usually not a problem to
have a short period of time when everything switches from the old interface to the
new interface. In even moderately complex development environments, that
short period of time can be infeasible. The usual solution to this challenge is to
organize things so that different generations of your API or framework can be in
use by different applications at the same time. This solution avoids the problem
of forcing the conversion of all applications at exactly the same time.
7.3.10 Logging
P8 Content Manager APIs have built-in logging, which focuses on providing
details of round-trips between the client and server. The reason for that focus is
that those details are typically useful for resolving both
performance and functional problems. The main purpose of the logging is to
have artifacts for diagnosing problems when hands-on debugging is not possible.
When designing logging for your own applications, you are likely to have similar
goals. You might want to consider the following points:
Determine the interesting interactions in your application. Focus your logging
efforts on those interactions first. You can always add more logging as your
application evolves or as you get a feel for the types of problems that occur in
production. Think of logging those interesting interactions as a unit, whether
they are all contained within a given software module or not.
Do not log uninteresting details. Log files can become quite large, and many
details that are logged will turn out to be distracting clutter when you are
looking at log files later. If something is likely to help solve a problem, log it.
If there is just a remote possibility that it will help, skip it.
Be careful about tying things to source code. It is fine to assume that the
people looking at the logs will have access to the source code to see what
entries mean, but only if that is actually true. Otherwise, log entries must be
reasonably self-explanatory so that you can teach someone what they mean.
Log the impossible. In any application, there are conditions that are supposed
to be impossible. It is tempting to ignore those conditions in program logic. If
one of those conditions actually happens, it must be logged, because it is an
indication of a design flaw or something seriously strange in the runtime
environment.
Pick a few severity and verbosity levels. It is probably better to have fewer
rather than more levels of granularity in your controls for logging. Modern
logging toolkits often give you the freedom to control things with many levels.
Do you really need them all? You probably do not. You probably do not need
much more than on, off, and perhaps one level in between. For each
combination, ask yourself who will really use it and why it is better than
another combination that you already need. One reason to have an
intermediate level is that voluminous logging usually has an impact on
performance. Intermediate logging alone can sometimes show you where to
narrow your focus.
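Several of the points above can be sketched with the standard java.util.logging package. This is one possible arrangement, not the logging built into the P8 Content Manager APIs: the logger name, the three verbosity settings, and the describe() state check are all hypothetical. Note one deliberate choice: the quietest setting still records SEVERE entries so that "impossible" conditions are never lost.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Sketch only: names and levels are illustrative choices.
final class RoundTripLogging {
    static final Logger LOG = Logger.getLogger("example.roundtrip");

    // Three settings only: quiet, one intermediate level, and full detail.
    enum Verbosity { QUIET, SUMMARY, DETAIL }

    private RoundTripLogging() { }

    static void configure(Verbosity verbosity) {
        switch (verbosity) {
            case QUIET:
                LOG.setLevel(Level.SEVERE);   // impossible conditions only
                break;
            case SUMMARY:
                LOG.setLevel(Level.INFO);
                break;
            default:
                // Handlers have their own levels; the default console
                // handler must also be lowered before FINE output appears.
                LOG.setLevel(Level.FINE);
                break;
        }
    }

    // Log one interesting interaction as a unit, in self-explanatory terms.
    static void logRoundTrip(String operation, long millis, int objectCount) {
        LOG.log(Level.INFO, "{0} took {1} ms", new Object[] { operation, millis });
        if (LOG.isLoggable(Level.FINE)) {
            LOG.log(Level.FINE, "{0} returned {1} objects",
                    new Object[] { operation, objectCount });
        }
    }

    // Log the impossible: a state outside the designed range is recorded
    // before the failure is raised.
    static String describe(int state) {
        switch (state) {
            case 0: return "draft";
            case 1: return "released";
            default:
                LOG.log(Level.SEVERE, "Impossible state {0}; expected 0 or 1", state);
                throw new IllegalStateException("Impossible state " + state);
        }
    }
}
```

The isLoggable() guard avoids building detailed messages when they will be discarded, which keeps the cost of the intermediate setting low.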
If you think your application might run in an environment without direct Content
Engine connectivity, there are alternative approaches.
Custom protocols
If your application makes only a few types of requests to the Content Engine
without much variation in the request parameters, you might consider creating an
abstraction layer for those requests and creating an application-specific proxy
solution. For this solution, you build a proxy, probably as a J2EE servlet, to
receive requests from your application and translate them into appropriate
Content Engine API calls. The proxy is also responsible for taking the results of
the Content Engine API calls and relaying them back to your application.
You obviously have quite a few technology options available to you for creating
the proxy and the protocol used between the proxy and the application. Instead
of creating a one-of-a-kind, technically isolated solution, think about generalizing
the types of requests and responses. You can probably let your proxy design
evolve into one or more services that fit into the SOA model. A lot of information
and tools are available for SOA solutions, so your overall effort will likely be less
than a one-of-a-kind solution. Here are points to consider when using your own
proxy and protocol:
Do not assume that any requests coming in are coming from your application.
When a protocol listener is deployed on a network, you really cannot control
who or what connects to it.
Use a robust authentication scheme. You might have an application in which
anonymous access is allowed, but those situations are pretty rare. Even if
access is not tightly constrained, there is often still a requirement for logging
who had access.
Do not expose user credentials in non-encrypted network packets. Even if
your system handles only low security information, users often have the same
credentials for many systems (whether there is a policy forbidding that or not).
Compromising user credentials is often a bigger problem than compromising
your own system. You will often be able to use the easy solution of securing
the entire connection with SSL or Transport Layer Security (TLS); as a side
effect, the credentials are also protected.
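The points above amount to a gate that every incoming request passes before the proxy touches the Content Engine. The following is a minimal sketch of such a gate with hypothetical names (ProxyGate, Decision, and the operation strings); it assumes that the surrounding servlet or listener has already established whether the connection is encrypted and who the authenticated caller is.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch only: a hypothetical request gate for an application-specific proxy.
final class ProxyGate {
    enum Decision { ALLOW, REJECT_INSECURE, REJECT_UNAUTHENTICATED, REJECT_UNKNOWN_OPERATION }

    private final Set<String> allowedOperations;
    private final List<String> auditTrail = new ArrayList<>();

    ProxyGate(Set<String> allowedOperations) {
        this.allowedOperations = new HashSet<>(allowedOperations);
    }

    // Never assume that the caller is your own application: check the
    // transport, the caller identity, and the requested operation.
    Decision check(boolean overTls, String authenticatedUser, String operation) {
        Decision decision;
        if (!overTls) {
            decision = Decision.REJECT_INSECURE;
        } else if (authenticatedUser == null || authenticatedUser.isEmpty()) {
            decision = Decision.REJECT_UNAUTHENTICATED;
        } else if (!allowedOperations.contains(operation)) {
            decision = Decision.REJECT_UNKNOWN_OPERATION;
        } else {
            decision = Decision.ALLOW;
        }
        // Record who asked for what, even when access is loosely constrained.
        auditTrail.add(authenticatedUser + " " + operation + " " + decision);
        return decision;
    }

    List<String> auditTrail() { return new ArrayList<>(auditTrail); }
}
```

Only requests that receive ALLOW would be translated into Content Engine API calls. Rejected requests are still recorded in the audit trail.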
Reverse proxies
After reading the section about custom protocols, a natural question is why the
product does not offer this type of solution for common use cases. Actually, it
does. It offers this not just for common use cases but for all use cases that the
Content Engine APIs serve.
You can restrict yourself to the WS transport and use an HTTP/HTTPS reverse
proxy for getting access to Content Engine inside the firewall. There are many
reverse proxy solutions available, and we are not discussing any specific
package here. The functional principles of a reverse proxy are the same at a high
level. The reverse proxy resides in the boundary area of the firewall and
selectively allows requests and responses to pass through.
Because WS transport can be used with both the Content Engine .NET API and
the Content Engine Java API, there are many application-building technologies
available to you.
In principle, you can use a reverse proxy with the EJB transport, but the options
are much more limited. Depending on the application server and underlying
protocol, there might not be any workable reverse proxy for EJB transport. Many
EJB protocols are derivatives of the standard Java Remote Method Invocation
(RMI) protocol. RMI can be tunneled over HTTP, but there is usually a pretty
significant performance cost to do that. If you can find a reverse proxy that
handles the underlying protocol of your application server’s EJB layer with
reasonable efficiency, the reverse proxy is a good model for EJB transport.
Object-valued properties
One of the more powerful features of the data model is object-valued properties
(OVPs). When one object needs to reference another object, use OVPs instead
of storing the ID or path to the object. By using OVPs, you can directly navigate
from object to object. For an OVP, the metadata provides type safety by only
allowing you to point to objects of a given class (or subclass), just like an object
reference in a programming language. The server provides features for
referential integrity (prevents the occurrence of pointers to nonexistent objects)
and configurable cascading deletion (automatically controlling the deletion of
pointed-to objects or preventing the deletion of pointing-to objects).
Reflective properties
A particularly useful form of OVP is the reflective property, also known as an
association property. You can configure these OVPs via a wizard in IBM
FileNet Enterprise Manager. More than one object can point to a particular other
object. When that happens, the reflective property mechanism is used to simplify
the bookkeeping and let Content Engine perform most of the work. The usual
examples have a parent and many children. Suppose you have an Invoice
object with many LineItem child objects. With the reflective property mechanism,
define an Invoice property on the LineItem class and a LineItems property on
the Invoice class. (The naming is just a convention that works well in practice.
Any property names can be used.) To affiliate a new LineItem with the Invoice,
you need only populate the Invoice property on the LineItem object. Because
it was created as a reflective property, the LineItems property on the Invoice
class is automatically updated to reflect the new line item. When you access the
multi-valued property (the LineItems property in our example), Content Engine
automatically performs a query for applicable objects with the appropriate value
in the single-valued property (the Invoice property in our example).
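The Invoice and LineItem example can be modeled in a few lines of plain Java. This is a toy illustration of the mechanism only: ToyInvoice, ToyLineItem, and ToyStore are hypothetical classes and an in-memory list stands in for the object store.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only (hypothetical classes, not the Content Engine API).
final class ReflectiveDemo {
    private ReflectiveDemo() { }

    static final class ToyInvoice {
        final String number;
        ToyInvoice(String number) { this.number = number; }
    }

    static final class ToyLineItem {
        final String description;
        ToyInvoice invoice;   // the single-valued property; the only one you set
        ToyLineItem(String description) { this.description = description; }
    }

    static final class ToyStore {
        private final List<ToyLineItem> savedLineItems = new ArrayList<>();

        void save(ToyLineItem item) { savedLineItems.add(item); }

        // The multi-valued side is never stored. It is answered by a query
        // for line items whose invoice property points at the given invoice.
        List<ToyLineItem> lineItemsOf(ToyInvoice invoice) {
            List<ToyLineItem> result = new ArrayList<>();
            for (ToyLineItem item : savedLineItems) {
                if (item.invoice == invoice) {
                    result.add(item);
                }
            }
            return result;
        }
    }
}
```

Because only the single-valued side is written, there is no second list to keep in step, which is the bookkeeping that the reflective mechanism removes.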
Many-to-many relationships
Especially because of reflective properties, it is easy to use OVPs to model
one-to-many and many-to-one relationships. You might find the need to model a
many-to-many relationship. The usual solution for that is to use an intermediate
object to express a single pair of relationships. The system class,
ReferentialContainmentRelationship (RCR), is an example of this solution for
the special case of containing objects in folders. A single object can be contained
in many folders, and a folder can contain many objects. The Document class has
a reflective property, Containers, which identifies all the RCRs (and, therefore, all
the containment relationships) that reference a specific Document instance. The
Folder class likewise has a Containees property.
You can see that this intermediate relationship object, combined with reflective
properties, is a powerful tool for simplifying your modeling of many-to-many
relationships. Not only does it express the relationship, but it can also have
properties specific to that particular relationship. For example, an RCR has a
property, ContainmentName, that gives a unique name to a contained object for
the purposes of path-based navigation. When you use an intermediate object for
a relationship, you can add whatever properties are appropriate to your business
needs. Both ReferentialContainmentRelationship and
DynamicReferentialContainmentRelationship classes are subclassable, and
you can use them for your own relationships if they happen to fit the folder
containment model. Other good choices for the intermediate object are
subclasses of CustomObject and Link system classes.
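The intermediate-object pattern can be sketched in plain Java. The classes below are hypothetical and model folder containment with strings; they are not the RCR system class. The rule that a containment name must be unique within a folder is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only (hypothetical classes, not the Content Engine API).
final class ContainmentDemo {
    private ContainmentDemo() { }

    // One relationship object expresses one folder-document pair and
    // carries a property that belongs to the relationship itself.
    static final class ToyRelationship {
        final String folder;
        final String document;
        final String containmentName;

        ToyRelationship(String folder, String document, String containmentName) {
            this.folder = folder;
            this.document = document;
            this.containmentName = containmentName;
        }
    }

    static final class ToyObjectStore {
        private final List<ToyRelationship> relationships = new ArrayList<>();

        ToyRelationship file(String folder, String document, String containmentName) {
            for (ToyRelationship r : relationships) {
                // Assumed rule: names are unique within a single folder.
                if (r.folder.equals(folder) && r.containmentName.equals(containmentName)) {
                    throw new IllegalArgumentException(
                            "Name already used in " + folder + ": " + containmentName);
                }
            }
            ToyRelationship created = new ToyRelationship(folder, document, containmentName);
            relationships.add(created);
            return created;
        }

        // Both directions are answered from the same relationship objects.
        List<String> foldersContaining(String document) {
            List<String> result = new ArrayList<>();
            for (ToyRelationship r : relationships) {
                if (r.document.equals(document)) result.add(r.folder);
            }
            return result;
        }

        List<String> documentsIn(String folder) {
            List<String> result = new ArrayList<>();
            for (ToyRelationship r : relationships) {
                if (r.folder.equals(folder)) result.add(r.document);
            }
            return result;
        }
    }
}
```

Any further property that describes the pairing itself, rather than either endpoint, would be added to the relationship class.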
Custom objects
You will often find yourself with a need to hold a collection of related properties
for one reason or another. In a database programming environment, you might
create a new table with rows representing the collection of information. The P8
Content Manager solution for this is to create a subclass of the CustomObject
class. The CustomObject system class has only a few properties of its own, and it
exists specifically to be subclassed for this kind of use. The invoice and line item
example used for reflective properties can also be modeled this way.
The technical note provides the following guidelines (several of which we have
discussed in previous sections of this chapter):
Limit rows returned.
Avoid non-indexed ordering and searching.
Avoid non-function-indexed case-insensitive comparisons.
Avoid unnecessary object type searches.
Avoid unnecessary column returns.
Use the free-threading model.
Tune query batch size parameters.
Avoid complex table linkages.
Avoid unnecessary result row ordering.
Avoid subqueries (Oracle).
Although the paper is written for Version 3.x, many of the guidelines are still
applicable for Version 4.0 software. To download the complete technical notice,
visit:
ftp://ftp.software.ibm.com/software/data/cm/filenet/docs/p8doc/35x/V10_P8_Query_Perf_Guidelines_TechNote.pdf
Another technical notice, IBM FileNet P8 Recommendations for Handling Large
Numbers of Folders and Objects Technical Notice, provides recommendations
about handling large numbers of contained folders, documents, and custom
objects in an IBM FileNet P8 environment. The paper is also written for Version
3.x. However, the recommendations are still applicable for Version 4.0 software.
To download the complete technical notice, visit:
ftp://ftp.software.ibm.com/software/data/cm/filenet/docs/p8doc/35x/folderlimitrecommendations.pdf
To view all available technical notices, go to the product documentation page for
IBM FileNet P8 Platform and look for the technical notices:
http://www.ibm.com/support/docview.wss?rs=3278&uid=swg27010422
One of the primary benefits of filing into a folder is browsing. Browsing allows
users to traverse a folder structure and locate content inside a folder. Ideally,
all the content in any given folder relates to a particular activity or function.
Another advantage is that in a P8 Content Manager repository, content can be
filed in more than one folder at a time. There is one master copy of the content,
and references filed in multiple folders point back to the single master.
Remember that with P8 Content Manager, users can always search for and view
any content that meets search criteria whether or not the content is filed in a
folder. Folders are simply a convenience for users who wish to browse for
repository content.
There are use cases where unfiled content makes sense. Table 8-1 is a decision
table that compares the filed option with the unfiled option.
Unfiled content (does not use folders):
Content is only accessible by search.
There is no need to organize repository content using folders.
Transactions that add content are slightly faster.
Appropriate for high-volume image applications where access will be by
search only.
In solutions of this kind, searching becomes the primary mechanism for content
retrieval. For this reason, the metadata that identifies the content when it is
added to the repository is vital.
folder structure; the first three levels are sufficient. The goal for the first three
folder hierarchical levels is a structure that is accessible at first glance to any
member of your organization.
Best practice: When designing a central repository folder structure, start with
the first three levels of the structure. Build this out for your entire organization.
The first three levels of the folder hierarchy form the central organization scheme
for your repository. Three levels are not an absolute rule; four or five levels might
be necessary for large organizations. The idea is to create a structure that
provides an organizational foundation.
Example: By function
The next folder structure is based on function. This structure is appropriate for
records systems that are typically organized by the function of the document, the
activity with which it belongs, and the record category under which it needs to be
filed. In this scheme, as shown in Figure 8-4, the folder levels are:
(1) Function → (2) Activity → (3) Document type
Best practice: Create a folder structure that makes sense for your entire
organization. Develop your folder structure to at least the third folder
hierarchical level. This structure forms the framework for your repository
organizational scheme.
In the example in Figure 8-5 on page 192, folder creation rights to levels 1-3
need to be reserved for system administrators only. Folder creation rights under
the accounting folder need to be granted to the accounting group manager, and
folder creation rights under the projects folder need to be granted to the IT
group manager.
[Figure 8-5, labels recovered from the figure: System-wide levels may not be
changed. Level 2: finance, HR, marketing, IT. Level 3: accounting, tax, audit,
projects. Also shown: Mary’s Files.]
Best practice: Do not create a folder that contains more than several hundred
subfolders. Otherwise, performance suffers.
Although the paper is written for Version 3.x, the recommendations are still
applicable for Version 4.0 software. To download the complete technical notice,
visit:
ftp://ftp.software.ibm.com/software/data/cm/filenet/docs/p8doc/35x/folderlimitrecommendations.pdf
Or, go to the product documentation page for IBM FileNet P8 Platform and look
for the technical notice:
http://www.ibm.com/support/docview.wss?rs=3278&uid=swg27010422
Catalog
Object Stores
When choosing a storage method for your content, keep in mind that each of
these storage methods can be configured on a per document class basis.
8.2.1 Catalog
The catalog is a relational database (RDBMS) that is specified at installation time.
The catalog can be created on any supported RDBMS; refer to the product
documentation for information about supported brands and versions.
Property values
Document (content) links
Search definitions
The exception to this rule is custom property indexes. You can create a database
index for any class property except system-owned properties. These database
indexes, also known as single indexes, are stored within the object store
database. For properties that users search frequently, single indexes reduce
processing time for queries on this property. After creating indexes through IBM
FileNet Enterprise Manager, you can use RDBMS tools to analyze the
performance of the indexes and apply refinements as necessary.
Figure 8-7 on page 196 shows the Date Hired property where you can set or
remove the associated database index.
Note: When you select a property to index, the object store search must be
case-sensitive, or the index is not created correctly. You must create
additional indexes in Oracle and DB2® to avoid full table scans.
When choosing either file or database storage for a document class, consider the
size of the content that will be stored. We recommend storing large content in file
stores. For small content (maximum file size smaller than 10 MB), a database
store has a measurable storage and retrieval performance advantage over file
storage.
8.2.3 File stores
With a file store, P8 Content Manager stores content files on a local or shared
network disk drive. A file store is the most common object store configuration. To
organize the files on disk, P8 Content Manager sets up a managed hierarchy of
directories on the specified drive.
Note: Content files written to this directory are named for the content
object’s Globally Unique Identifier (GUID).
File store directory hierarchies can be quite large and are limited only by
available disk space. The software handles directory size limitations by
automatically creating new directories when needed.
[Figure: storage policy diagram with labels Doc class A, Storage Policy1, Disk Drive, Storage Farm]
Farming
A storage area farm is a group of storage areas that acts as a single logical
target for content storage. With farming, Content Engine provides load-balancing
capabilities for content storage by transparently spreading the content elements
across multiple storage areas. Therefore, the storage policy functions as both the
mechanism for defining the membership of a storage area farm and also the
means for assigning documents to that farm.
3. Create a new file store under a new directory name, and complete both
wizards. See Figure 8-10.
4. Refresh the object store.
5. Right-click the document class whose content will be stored in the new
location and select Properties. Select the new storage policy.
All documents created with this document class will now be stored in the new
storage area.
If it becomes necessary to move a file store to a new disk, the Move File Store
wizard enables you to relocate a file storage area from one physical location to
another. Rather than moving a file storage area for you, the Move File Store
wizard prompts you to perform certain steps, and it also performs certain steps
for you. For the actual transfer, you can use the file transfer tools of your choice.
8.3 P8 Content Manager searches
There are several methods of searching for content in the P8 Content Manager
repository. The methods can be divided by the purpose of the search:
User-invoked searches
Content-based searches
Repository maintenance searches
Workplace search
Workplace search is a Web site that can be customized by individual users. It is
a feature of the Workplace Web interface. Search appears when users log in to
P8 Content Manager using Workplace (see Figure 8-11 on page 202).
Workplace search is an ideal tool for user-invoked ad hoc searches for repository
content. When users click Modify in the lower right area of the page, they can
modify the search criteria. Any system or custom property can be added to the
criteria display.
Note: When users modify their search criteria, the system remembers the
settings and will display them again on the next visit to the site.
Workplace-stored searches
Workplace also offers a tool for designing search templates for more
sophisticated content searches. Search Designer offers the following enhanced
features:
Cross-object store searches
Search criteria expressions (AND/OR options)
Preset criteria for filtering search results
Searches that appear as links on a browser favorites menu
Use Search Designer to create stored searches. This is an applet that can be
found on the Advanced Tools page within Workplace.
Cross-repository search
To create a cross-repository search, click the Object Store tab and add the
repositories to be searched to the selected object store list. See Figure 8-12.
Figure 8-14 Preset search criteria
assist with managing the size of your audit log and for managing entries in the
QueueItem table.
Create, save, and run SQL queries.
Searches can be combined with bulk operations that include the following
actions (available on the Query Builder’s Actions tab):
– Delete objects.
– Add objects to export manifest.
– Undo checkout (for documents).
– Containment actions (for documents, custom objects, and folders): file in
folder and unfile from folder.
– Run VBScripts or JScripts (Query Builder Script tab).
– Edit security by adding or removing users and groups.
– Lifecycle actions: set exception, clear exception, promote, demote, and
reset.
Query Builder
To access Query Builder, open IBM FileNet Enterprise Manager and expand the
object store that you want to search. Right-click Search Results and select New
Search. See Figure 8-15.
There are two ways to construct searches: Simple View and SQL View.
Select View from the toolbar to choose a view style:
Simple View offers a point-and-click interface where you can select tables,
classes, and criteria from drop-down lists.
SQL View translates anything that you create in Simple View (one-way
translation only: you cannot translate an SQL View into a Simple View) and
presents the query in an SQL text window that you can then edit directly.
You can also load any *.qry files that you have saved on the network.
Tip: To aid administrators using SQL View, the P8 Content Manager help files
contain the P8 Content Manager database view schema.
IBM FileNet-provided search templates are installed with every Content Engine
or IBM FileNet Enterprise Manager-only installation into a folder on the local
server named SearchTemplates, which is located in the FileNet installation
directory. Any queries placed in this folder appear in IBM FileNet Enterprise
Manager’s Saved Searches node as long as they have .sch as a filename
extension.
Multiselect operations
Multiselect (or bulk) operations perform an operation on all objects returned in
the search result dialog. This feature is especially useful for object store
maintenance activities. With multiselect operations, you can perform the
following actions on multiple files at the same time:
Delete
File to folder
Unfile from folder
Undo checkouts
Change life cycle states
Add to security access control lists (you cannot delete existing entries)
Run an event action script
To access the multiselect menu, run a Query Builder search and select the items
that you want to modify from the result set. Right-click and select Multiselect
Operations.
For example, assume that several documents had been checked out by
someone who left your company. Using multiselect operations, you can search
for all documents that were left checked out by that person and undo these
checkouts in one operation. To do this, you use the Query Builder to construct a
search to find all documents currently checked out under the former employee’s
system login name.
The IBM FileNet Records Manager object store that hosts record information is
subject to processing-intensive database activity during retention and disposition
processing. In addition, record objects are small and best-suited for database
stores. For these reasons, records need to be stored in a separate object store.
Best practice: Set up a separate, database object store for IBM FileNet
Records Manager. This object store is commonly called the file plan object
store (FPOS).
[Figure: Email Manager with labels Email Object Store, Content Object Store, Records Object Store]
Another reason to implement multiple object stores is a requirement to strictly
separate content for security reasons. Although it is possible to keep classified
content secure using marking sets and security policies, certain content must be
kept absolutely separate. In these situations, install a second object store for
classified content.
Here are a few situations where secure object stores are a solution:
Board of director level content
Secret or top secret government content
Public-facing Internet accessible libraries
Service companies that offer enterprise content management services to
multiple customers
[Figure: repositories at Location 1 and Location 2 connected by a WAN link]
9
Business continuity planning in the limited scope of IT functions will involve the IT
department, facility management, telecommunications, and line of business
management who can assist in evaluating which IT functions are mission-critical
after a disruption or disaster. High availability and disaster recovery plans need
to be formally developed and reviewed by all these stakeholders, implemented,
and then regularly tested by all staff to be certain that they will function as
expected during and after a real disruption.
This chapter covers the part of business continuity that concerns restoring IT
functions, in particular P8 Content Manager, after a disruptive event.
9.2 Defining high availability (HA)
What is high availability (HA) and how is it measured? We start by defining
availability. A business system is said to be available whenever it is fully
accessible by its users. Availability is measured as a percentage of the planned
uptime for a system during which the system is available to its users, that is,
during which it is fully accessible for all its normal uses.
Planned uptime is the time that the system administrators have agreed to keep
the system up and running for its users, frequently in the form of a Service Level
Agreement (SLA) with the user organizations. The SLA might allow the system
administrators to take the system down nightly or weekly for backups and
maintenance, or, in an increasing number of applications, rarely if at all. Certain
mission critical systems for around-the-clock operations now need to be
available 24 hours a day, 365 days a year.
The concept of high availability roughly equates to the system and its data being
available almost all of the time, 24 hours a day, 7 days a week, and 365 days a
year. Achieving high availability means having the system up and running for a
period of time that meets or exceeds the SLA for system availability, measured
as a percentage of the planned uptime for a system.
Table 9-1 on page 216 helps quantify and classify a range of availability targets
for IT systems. At the low end of the availability range, 95% availability is a fairly
modest target and hence is termed basic availability. It can typically be achieved
with standard tape backup and restore facilities. The next level up, enhanced
availability, requires more robust features, such as a Redundant Array of
Independent Disks (RAID) storage system, which prevents data loss in the first
place, rather than the more basic mechanisms for recovering from data loss after
it occurs. Highly available systems will range from 99.9% to 99.999% availability
and require protection from both application loss and data loss. At the high end of
this continuum of availability is a fault tolerant system that is designed to avoid
any downtime ever, because the system is used in life and death situations.
To make this more concrete, consider the maximum downtime that can be
absorbed in a year while still achieving 99.999% availability, also called five nines
availability. As Table 9-1 indicates, five nines permits no more than 5.3 minutes of
unscheduled downtime per year, or even less if the system is not scheduled for
round-the-clock operation. This is near continuous availability, but not strictly fault
tolerant. For a three nines target of 99.9%, we can allow 100 times more
downtime, or 8.8 hours per year. An availability target of 99%, which still sounds
like a high target, can be achieved even if the system is down 88 hours per year,
or over three and a half days. So the range of availability is actually quite large.
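The downtime figures above follow directly from the definition of availability as a percentage of planned uptime. The following minimal sketch reproduces the arithmetic; it assumes round-the-clock planned uptime of 8,760 hours per year, and the function name is illustrative only.

```python
# Assumption: planned uptime is round-the-clock, 365 days x 24 hours.
HOURS_PER_YEAR = 365 * 24


def max_downtime_hours(availability_percent, planned_uptime_hours=HOURS_PER_YEAR):
    """Return the downtime, in hours, that an availability target allows."""
    return planned_uptime_hours * (1 - availability_percent / 100)


for target in (99.0, 99.9, 99.999):
    hours = max_downtime_hours(target)
    print(f"{target}% availability allows {hours:.2f} hours "
          f"({hours * 60:.1f} minutes) of downtime per year")
```

With these assumptions, 99% allows 87.6 hours, 99.9% allows 8.76 hours, and 99.999% allows about 5.26 minutes per year. Passing a smaller `planned_uptime_hours` value shows how the allowance shrinks when the system is not scheduled for round-the-clock operation.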
You might be asking yourself, “Why not provide for the highest levels of
availability on all IT systems?” The answer, as always, is cost. The cost of
providing high availability goes up exponentially as availability approaches
99.9% and higher.
Our advice is to determine what has caused the most downtime in the past for a
particular system and focus first on that. Frequently, we have found that stricter
change control and better load testing for new applications will pay off the most.
Focus on the root causes of outages first and then address the secondary and
tertiary causes only after protecting against the root causes.
Here are several examples of best practices for avoiding downtime from people
and process problems:
System administrators need to be well-trained and dedicated full-time to their
systems, so that they are less likely to commit pilot errors.
The applications running on the system must be designed with great care to
avoid possible application crashes or other failures.
Exception handling, both by administrators and application programs, must
be well thought-out, so that problems are anticipated and handled efficiently
and effectively when they occur.
Comprehensive testing and staging of the system is paramount to avoiding
production downtime. Testing of the system under a simulated production
workload is critical to avoiding downtime when the system is stressed with a
peak load during production. Downtime on a test system does not affect
availability of the production system, so make sure to wring out all the
problems before taking a new system, software release, service pack, or
even software patch into production.
Deploying a new application into production must likewise be planned and
tested carefully to minimize the possibilities of adversely affecting production
due to an overlooked deployment complication.
Thorough user training will help keep the system performing well within the
bounds for which it was designed. Users who abuse a system due to
ignorance can affect overall system performance or even cause a system
failure.
Make sure that all sources of downtime are addressed, if high availability is truly
to be achieved. After the fundamental people-related and process-related
problems have been addressed, you need to consider hardware and software
availability next.
As we have already discussed in 3.2, “Scalability” on page 32, the key concept
for a server farm is to distribute the incoming user workload across two or more
active, cloned servers. This distribution is commonly called “load balancing,”
which can be implemented either in hardware or software.
In a load-balanced server farm, clients of that server see one virtual server, even
though there are actually two or more servers behind the load-balancing
hardware or software. The applications or services that are accessed by the
server’s clients are replicated, or cloned, across all the servers in the farm. And
all those servers are actively providing the application or service all the time.
The load-balancing software or hardware receives each request and uses any
one of a variety of approaches for distributing the request workload over the
servers in the farm. This can be a simple round-robin approach, which sends
requests to the servers in a predefined order. A more sophisticated load balancer
might use dynamic feedback from the servers in the farm to choose the server
with the lightest current load or the fastest average response time, for example.
In any case, the load balancer keeps track of the state of each server in the farm,
so that if a server becomes unavailable, the load balancer can direct all future
requests to the remaining servers in the farm and avoid the down server, thereby
masking the failure.
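The round-robin approach and the masking of a failed server can be illustrated with a short sketch. This is not how any particular load-balancing product is implemented; the class and method names are hypothetical, and health status is set by hand here rather than by real health checks.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hypothetical sketch: send each request to the next healthy server."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._order = cycle(self.servers)  # predefined, repeating order

    def mark_down(self, server):
        """Record that a server is unavailable so that it is skipped."""
        self.healthy.discard(server)

    def mark_up(self, server):
        """Return a repaired server to the rotation."""
        if server in self.servers:
            self.healthy.add(server)

    def next_server(self):
        """Pick the next server in order, skipping any that are down."""
        if not self.healthy:
            raise RuntimeError("no healthy servers in the farm")
        for _ in range(len(self.servers)):
            candidate = next(self._order)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy servers in the farm")


farm = RoundRobinBalancer(["server1", "server2", "server3"])
print([farm.next_server() for _ in range(4)])
farm.mark_down("server2")  # the failure is masked from clients
print([farm.next_server() for _ in range(3)])
```

A balancer that uses dynamic feedback would replace the fixed order with a choice based on measured load or response time, but the bookkeeping of healthy servers stays the same.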
The key enabler for a server farm is the load balancer. IBM FileNet leverages
IBM and third-party load-balancing hardware and software products. Microsoft,
for instance, includes its Microsoft Network Load Balancer with every copy of
Windows Server® 2000 and Windows 2003 Server. All the Java application
server vendors provide software to balance the Java application workload
running in their Java 2 Platform, Enterprise Edition (J2EE) environments. For
example, IBM WebSphere Network Deployment and Extended Deployment
application server products include built-in software load balancing. J2EE
application server vendors, including IBM, use the term cluster for their
load-balancing software feature.
Figure 9-1 shows a logical diagram of a load-balanced server farm. This figure
shows a pair of hardware load balancers and multiple servers in the server farm.
Redundancy is essential to prevent the failure of one load balancer from taking
down the server farm.
This concept of no single point of failure is key to high availability. Every link in
the chain, that is, every element in the hardware and software, must have an
alternate element available to take over in case it fails. Software load balancers,
for example, are designed to avoid any single point of failure; therefore, each
server in the farm has a copy of the load-balancing software running on it in
configurations using software instead of hardware for load balancing.
Note that the software running on each server in a farm is functionally identical.
As changes are made to any server in the farm, you must replicate those
changes to all the servers in the farm.
Load balancing offers a good solution: Any client calling into a load-balanced
server farm can be directed to any server in the farm. The load can be evenly
distributed across all the servers for the best possible response time and server
usage. However, load balancing can be a problem if the servers in the farm
Load balancers can be configured for session-based load balancing to solve this
session state problem. This is also known as sticky sessions, session affinity, or
stateful load balancing. The load balancer keeps track of which server it selected
at the beginning of a user session and directs all the traffic for that session to the
same physical server. Session-based load balancing is required for the
Application Engine, but not for the Content Engine or Process Engine.
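Session-based load balancing can be sketched as a lookup table from session to server. The following is a simplified illustration under assumed names (`StickyBalancer`, `route`, and the server labels are hypothetical); real load balancers typically identify sessions by cookies or similar means.

```python
class StickyBalancer:
    """Hypothetical sketch of session affinity (sticky sessions)."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.assignments = {}  # session ID -> server chosen for that session
        self._next = 0

    def route(self, session_id):
        """Return the server for a session, choosing one on first contact."""
        if not self.servers:
            raise RuntimeError("no servers in the farm")
        server = self.assignments.get(session_id)
        if server is None or server not in self.servers:
            # New session, or its server has left the farm: pick round-robin.
            server = self.servers[self._next % len(self.servers)]
            self._next += 1
            self.assignments[session_id] = server
        return server

    def remove_server(self, server):
        """Take a failed server out of the farm."""
        self.servers.remove(server)


farm = StickyBalancer(["ae1", "ae2"])
print(farm.route("alice"), farm.route("bob"), farm.route("alice"))
```

In this sketch, a session whose server is removed is simply reassigned to another server. Whether the user's session state survives that move depends on the application, which is outside the scope of the sketch.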
Now, we turn to server clusters and explore how they differ from farms.
Business logic and data tier servers all differ from Web and presentation servers
in that they directly manage substantial dynamic data, such as content or
process data. A stream of dynamic data, by definition, is a stream of new or
rapidly changing data. It is common for a single server to manage a dynamic
data set, rather than a set of servers that need to cooperate to manage the data
jointly.
Because of that single server architecture, a server farm with two or more active
servers does not fit well with servers that have not been designed for cooperative
data management. Yet a second server is still needed for continued availability,
in case the first server fails. The solution in this case is an active-passive server
cluster, where the second server stands by until the first server fails, before
stepping in to take over the data management.
The second server needs to have access to the data that was being managed by
the first server, either the same exact copy, or a copy of its own. The common
solution allows both servers to have access to the same copy of data either via a
network file share or, more commonly, a Storage Area Network (SAN) device
that both servers can access, but only one at a time. The active server owns the
SAN storage, and the passive server has no access. Sharing the SAN storage in
this way is a simpler solution than replicating the data to a second storage device
accessed by the second server.
So, shared data storage is a key concept for server clusters. Figure 9-2 shows
two servers in a server cluster with access to the same shared storage. Recall
that server farms typically do not have this requirement for shared storage, so
this is an essential difference between server farms and server clusters. Oracle
RAC and the Content Engine are exceptions, in that they exhibit both server farm
and server cluster characteristics. They take advantage of load balancing,
combined with cooperative data management using storage that is shared by all
the RAC or Content Engine servers. In the case of a load-balanced server farm
with shared storage, all the servers are active and thus need to access the
storage in parallel, so a network file share is required. An active-passive server
cluster, however, is designed to allow only the active server to access the
storage, so the single-owner model of SAN storage works well. The typical
server cluster does not support load balancing, but it does support shared
storage via SAN. Note that the storage is shared in the sense that both servers
are connected to the same storage, so they share access to the same storage,
but never concurrently in the case of SAN storage.
As with server farms, clients of a server cluster see one virtual server, even
though the physical server they interact with will change if the primary server
fails. If the primary server fails, a failover occurs, and the second server takes
over the data copy and starts up the software to manage the stored data. It also
Both triggering a failover and actually accomplishing the failover are the
responsibility of clustering software running on both servers. This software is
configured on the secondary server to monitor the health of the primary server
and initiate a failover if the primary server fails.
After the failed server is repaired and running again, a failback is initiated to shift
the responsibility back to the primary server and put the secondary server in
waiting mode again. This failback is necessary to get back to a redundant state
that can accommodate another server failure.
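The failover and failback cycle can be sketched as a small state machine driven by heartbeats. This is an illustration only, not the behavior of any specific clustering product; the class name, the timeout value, and the use of plain numbers for time are assumptions made to keep the example deterministic.

```python
class ClusterMonitor:
    """Hypothetical sketch of an active-passive failover monitor."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self.active = "primary"
        self.last_heartbeat = 0.0

    def record_heartbeat(self, now):
        """Called whenever the primary server reports that it is healthy."""
        self.last_heartbeat = now

    def check(self, now):
        """Fail over to the secondary if the primary's heartbeat is overdue."""
        overdue = now - self.last_heartbeat > self.timeout_seconds
        if self.active == "primary" and overdue:
            # Real cluster software would also take ownership of the shared
            # storage and start the application software at this point.
            self.active = "secondary"
        return self.active

    def failback(self, now):
        """Return responsibility to the repaired primary server."""
        self.active = "primary"
        self.last_heartbeat = now


monitor = ClusterMonitor(timeout_seconds=5)
monitor.record_heartbeat(now=10)
print(monitor.check(now=12))  # heartbeat is recent
print(monitor.check(now=16))  # heartbeat is overdue
monitor.failback(now=20)
print(monitor.check(now=21))
```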
In certain cases, intentional failovers can be used to mask planned downtime for
software or hardware upgrades or other maintenance. You can upgrade and test
the secondary server offline, and then, you can trigger a failover and apply the
upgrade to the primary server while the secondary server is standing in for the
primary server.
This type of configuration, in which the second server is inactive or passive until it
is called to step in for the active server, is called an active-passive server cluster.
Several clustering software products, if not all, also support an active-active
cluster configuration, which is similar to a server farm where all servers are
active. An active-active cluster configuration is useful for data managing servers
that are designed to share the management across more than one server.
However, IBM FileNet P8 products that use clustering software for high
availability all require an active-passive configuration. IBM FileNet P8 products
that work with an active-active configuration always use a server farm and load
balancing rather than clustering software. (Server farms are always
active-active.)
Server cluster software requires agents or scripts that are configured to manage
key server processes on a particular server. These agents or scripts are
configured so that they can monitor the health of the application software, as well
as start and stop the application software on that server.
Cluster software typically comes with predefined agents or scripts for common
server types, such as database servers. In addition, you can develop custom
agents when there is no predefined agent, or when you want more granular
control of the processes during a failover.
9.3.3 Geographically dispersed server clusters
Most server clusters consist of two side-by-side servers. However, certain
software vendors also support geographically dispersed clusters. Symantec’s
Veritas Cluster Server, for instance, supports both stretch clusters and replicated
data clusters. A stretch cluster is defined as two servers in a cluster separated by
as much as 100 km (62 miles). The distance limitation is due to the requirement
to connect both servers via fiber to the same Storage Area Network device for
shared storage and also due to the maximum amount of time allowed for the
heartbeat protocol exchange between the two servers. The two servers in a
stretch cluster always share the same SAN storage device, just as though they
were side by side, and they operate identically to a local server cluster.
You can use a stretch cluster as a disaster recovery solution as long as there is
an offline copy of the data at the second site. It requires only two servers total,
rather than the more typical three servers that are needed for HA plus DR: two in
a local cluster in one site for HA and a third server in the other site in the event of
the loss of the first site.
A replicated data cluster is similar to a stretch cluster, but the remote server
always has its own replicated copy of the data. In the event of a failover, the
second server comes up on its local copy of the data. In certain cases (but not all
cases), this capability removes the need for an expensive fiber connection
between the two sites, because neither server needs the speed of fiber to access
storage at the other site. Data replication can be done over an IP network. There
is still a 100 km (62 miles) distance limitation to ensure that the heartbeat
between servers will not time out due to transmission delays and to allow for
synchronous replication. See 9.5.1, “Replication” on page 231 for an explanation
of synchronous and asynchronous replication.
Like a stretch cluster, a replicated data cluster can act as a DR solution, as well
as an HA solution. However, a replicated data cluster cannot provide the same
level of availability as a local cluster, because of the additional downtime
required for a data resync to the primary site on a site failback. In addition, the
communication requirements between the two sites are typically much more
expensive and substantially more prone to failure than the local communication
requirements between two servers in a local cluster. In order to support a
replicated data cluster, the two sites need to be connected by a dedicated and
redundant high-speed network, and their physical separation must be no more
than 100 km (62 miles).
Asymmetric 1-to-1
The simplest of these configurations adds a passive server to be paired with
each active server. See Figure 9-3 on page 225. This asymmetric 1-to-1
configuration doubles the number of servers, assuming active-passive
clustering, and half of those servers are idle until an active server fails. Luckily,
there are more efficient server cluster configurations.
1 These variants are described here using Symantec Veritas terminology.
Symmetric
A symmetric server cluster uses two active servers as backups for each other,
thereby avoiding any idle servers. In the example in Figure 9-4, part A on the left
shows one server running a database and the other server running a file server.
If the file server fails, the clustering software on the database server detects the
failure and starts up a copy of the file server on the database server as shown in
part B of Figure 9-4. Note that both servers must have both the database and file
server software installed for this to work.
[Figure 9-4: A. Initial state and state after failback. B. State after failure and failover.]
Symmetric clusters require the same operating system on all servers in the
cluster. Also, the two servers need to support coresidency for this configuration
to work. In this example, the database and file server support coresidency; that
is, the database and file server can be installed on the same server and coexist
there. When the failed server is fixed or replaced, a failback is required to get
back to an HA configuration (shown in part A of Figure 9-4) in order to handle a
subsequent failure.
Asymmetric N+1
Another asymmetric configuration, this one called N+1, limits the number of idle
servers to one passive server that is shared by N active servers. In the example
in Figure 9-5 on page 227, part A shows three servers in a 2+1 configuration: a
database server (1), a file server (2), and one extra server (3) that acts as the
passive server for both of the active servers. If the file server (2) fails, for
instance, the file server function fails over to the extra server (3) as shown in part
B.
One of the advantages of this approach is that the failed server (2) simply
becomes the new passive server when it is repaired, so no failback (and the
associated downtime) is required for N+1 clusters. Another advantage is that,
like asymmetric 1-to-1 clusters, there is no change in server performance after a
failover, assuming all three servers are identical in capacity. In contrast,
symmetric clusters cannot guarantee unchanged performance after a failover,
because two independent servers with independent workloads have to coexist
on the same physical server after a failover.
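The N+1 behavior described above, where the repaired server becomes the new passive server and no failback is needed, can be sketched as follows. The class and the server and role names are hypothetical and only model the role bookkeeping, not real cluster software.

```python
class NPlusOneCluster:
    """Hypothetical sketch of an asymmetric N+1 cluster's role assignments."""

    def __init__(self, roles, spare):
        self.roles = dict(roles)  # role name -> server currently running it
        self.spare = spare        # the single passive server, or None if in use
        self.under_repair = set()

    def fail(self, server):
        """Move the failed server's role to the passive server."""
        role = next((r for r, s in self.roles.items() if s == server), None)
        if role is None:
            raise ValueError(f"{server} is not running an active role")
        if self.spare is None:
            raise RuntimeError("no passive server available for failover")
        self.roles[role] = self.spare
        self.spare = None
        self.under_repair.add(server)

    def repair(self, server):
        """A repaired server becomes the new passive server; no failback."""
        self.under_repair.remove(server)
        self.spare = server


cluster = NPlusOneCluster({"database": "server1", "file": "server2"}, "server3")
cluster.fail("server2")
cluster.repair("server2")
print(cluster.roles, cluster.spare)
```

After the repair, the file server role stays on server 3 and server 2 waits as the passive server, which matches the 2+1 example in the text.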
[Figure 9-5: A. Initial state before server 2 fails. B. State after File Server fails over from server 2 to 3, and subsequent repair to 2.]
Characteristic | Server farms | Server clusters
Instantaneous failover | Yes, all servers active all the time | No, must wait for software to be started after failover
Shared storage between the servers | Not necessarily, but can include a network file share for parallel accesses from all the servers in the farm | Yes, typically SAN storage, which allows just the active server to access the storage
Now that we have covered the differences between server farms and server
clusters, we explore the advantages of farms over clusters and the advantages
of clusters over farms. Server farms have no idle servers, by definition, because
all servers in a farm are active, whereas asymmetric server clusters always have
one or more idle servers in a steady state. Even more importantly, you can
expand server farms by simply adding a server clone, thereby scaling out the
farm to handle larger workloads. This horizontal scalability is not possible with
active-passive server clusters. The last advantage of a farm over a cluster is
faster recovery time. Server cluster failovers are delayed by the time that it takes
to start up the FileNet software on the passive server on a failover, whereas all
the servers in a server farm are active and immediately available to accept work
that has been redirected away from failed servers.
Many clients prefer clustering IBM FileNet P8 servers over farming in order to
standardize on clustering for all servers in their data center. They anticipate
lower total cost of ownership through this standardization, because there are
fewer technologies to learn, support, and maintain.
farming configurations clusters. As we have seen, farms and clusters, under our
definition of those terms, are quite different, hence the emphasis here on distinct
terms for these HA approaches.
For recovery from the loss of an entire production system in a disaster, a full
remote system with its own up-to-date copy of the data is needed. All users and
operations must be switched over to the remote system. By contrast, the
optimal high availability solution is an automated, localized, and limited
substitution of a single replacement component for the failed component. Server
farms and clusters substitute a single replacement component with minimal
disruption to the rest of the system and its users. Disaster recovery solutions are
much more drastic, disruptive, time-consuming, and heavyweight, because they
have to replace an entire system or data center, not just a single failed
component. Therefore, disaster recovery solutions are an inappropriate choice
for high availability.
In certain cases, the most recent data changes at the production site, which
stretch back to a point in time prior to the disaster, do not make it to the recovery
site because of a time lag that is inherent in how the data is replicated. The
magnitude of this time lag is dependent on the particular type of data replication
technology that you choose. Assuming a disaster occurs, the recovery point is
the point in time before the disaster that represents the most recently replicated
data. How far back in time is the business willing to go after the disaster
happens? That is, the Recovery Point Objective translates to how much recent
data the business is willing to lose in a disaster.
The duration of time that passes before the systems can be made operational at
the recovery site is called the recovery time. The Recovery Time Objective is the
business’s time requirement for getting the system back online. That is, how
much downtime can the business endure?
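The two objectives can be treated as simple thresholds when comparing candidate solutions. The following is a minimal sketch, not part of any IBM tooling: the class name, function name, and all numbers are illustrative assumptions.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoverySolution:
    """Hypothetical description of a disaster recovery solution."""
    name: str
    worst_case_data_loss: timedelta  # recovery point the solution achieves
    worst_case_downtime: timedelta   # recovery time the solution achieves


def meets_objectives(solution, rpo, rto):
    """Return True when the solution satisfies both business objectives."""
    return (solution.worst_case_data_loss <= rpo
            and solution.worst_case_downtime <= rto)


# Illustrative numbers only; real values come from measurement and DR tests.
daily_tape = RecoverySolution(
    name="daily tape backup",
    worst_case_data_loss=timedelta(hours=24),
    worst_case_downtime=timedelta(days=3),
)

# A business that tolerates one hour of lost data and eight hours of downtime.
acceptable = meets_objectives(daily_tape,
                              rpo=timedelta(hours=1),
                              rto=timedelta(hours=8))
```

With these assumed figures, the daily tape backup fails both objectives, which is the kind of gap that leads to the replication options discussed next.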
9.5.1 Replication
Backing up to tape or other removable media is the minimum for copying data for
use after a disaster. You must ship the media off-site to a location outside of the
projected disaster impact zone. The greater the distance of the location from the
production site, the lower the risk that both production and recovery sites will be
impacted by the same disaster. One general rule is that a backup tape vault and
recovery site must be at least 30 miles away from the production system, which
in most cases, is sufficient to avoid a flood or fire disabling both sites. However,
sites that are that close together can still be in the same impact zone for earthquakes,
hurricanes, or power grid failures, so more cautious organizations separate their
production and recovery sites by hundreds, if not thousands, of miles.
Companies usually perform backups once a day, which meets only a 24-hour
recovery point objective; that is, as much as 24 hours of data can be
lost. The recovery time required for data restoration from tape can be days due to
the need to restore a series of tapes that represents a full backup and
For a better RPO, that is, to reduce the potential data loss in a disaster, you need
to periodically replicate the data to a remote disk. Periodic replication
can be done more often than tape backup, which effectively reduces the window
of data loss. Continuous replication that is done in real time can avoid data
loss altogether.
Note that there are several levels at which you can perform replication: the
application level, the host level, and the storage level. Database replication is the
best example of application-based replication. Host-based replication is beneath
the application level, but it still resides on the server host and typically runs at the
file system or operating system level. Storage-level replication is implemented by
the storage subsystem itself, frequently a Storage Area Network (SAN) device or
a Network-Attached Storage (NAS) device.
Application-based replication
Application-level software that understands the structure of data and
relationships between data elements can copy the data intelligently, so that the
structure and relationships are preserved in the replica. Database and
object-based replication are examples. Database replication ensures that the
replica database is always in a consistent state with respect to database
transactions. Object-based replication ensures that content objects that include
both content and properties are replicated as an atomic unit, so that the content
and properties are always consistent with each other in the replica.
Each database vendor has replication products that replicate just the database,
but not other data. Examples include IBM DB2 high availability disaster recovery
(HADR) and Oracle Data Guard. Database replication products are typically
based on shipping database logs to the recovery site to be applied to a database
copy there. The advantage of these products is that they keep the database
replica in a fully consistent state at all times, with no incomplete transactions,
which reduces the recovery time required when bringing up the database after a
disaster. The disadvantage of these products is that they have no means to
replicate anything other than databases. File systems that need to be kept
consistent with the database, for instance, have to be replicated by a different
replication mechanism, which introduces the possibility of inconsistency between
the database and file system replicas.
Host-based replication
In contrast to application-based replication, host-based replication has no
understanding of the data content, structure, or interrelationships. It detects
when a file or disk block has been modified and copies that file or block to the
replica. NetApp ReplicatorX™, Symantec Veritas Volume Replicator, and
Double-Take Software’s Double-Take are examples of host-based replication
products. Unlike application-based replication, they can be used to replicate all
forms of data, whether it is in a database, a file system, or even a raw disk
partition. Several of these products use the concept of consistency groups, which
tie together data in different volumes and allow all the data to be replicated
together, thereby maintaining consistency across related data sets, such as
databases and file systems. In contrast to application-based replication,
however, the replica is not guaranteed to be in a clean transactional state,
because the replication mechanism has no visibility into database or file system
transactions. Recovery can take longer, because incomplete transactions must
be cleaned up prior to making the data available again.
Storage-based replication
All of the storage vendors offer storage-based replication for their SAN and NAS
products. The storage products themselves provide storage-based replication
and do not use server host resources. Examples include IBM Metro Mirror
(PPRC) and Global Mirror (XRC), EMC SRDF and MirrorView, Hitachi Data
Systems TrueCopy, and Network Appliance SnapMirror®.
NAS products replicate changes at the file level, whereas SAN products replicate
block by block. In both NAS and SAN replication, as with host-based replication,
there is no knowledge of the structure or semantics of the stored data. So,
databases replicated in that way can be in any transient state with regard to
database transactions and hence might require significantly more database
recovery time when the replica is brought online. That increases the overall
recovery time.
NAS replication covers any data in the file system, whereas SAN replication,
which is at the lower level of disk blocks, covers all data stored on the disk.
For sites that are separated by more than 60 miles (96.5 km), asynchronous
replication is the choice. Asynchronous replication is not done in lock step, the
way that synchronous replication is. Instead, the local disk write is allowed to
complete before the write is completed to the second site. The update to the
second site is said to be done “asynchronously” from the local update, that is, not
in the same logical operation. This method frees the production system from the
performance drag of waiting for each disk write to occur at the remote site.
However, it opens up a time window during which the production site data differs
from the recovery site copy. That difference represents the data that is lost in a
disaster when asynchronous replication is used. In exchange for that data loss,
the two sites can be any distance apart, although the further apart they are, the
greater the typical data loss.
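The difference between the two modes can be illustrated with a toy model. This is a sketch under simplifying assumptions, not a description of any storage product: the class and method names are hypothetical, and a "write" is just a string.

```python
from collections import deque


class ReplicatedVolume:
    """Toy model of a production volume with one remote replica."""

    def __init__(self, synchronous):
        self.synchronous = synchronous
        self.local = []            # writes completed at the production site
        self.remote = []           # writes applied at the recovery site
        self.in_flight = deque()   # completed locally, not yet applied remotely

    def write(self, block):
        self.local.append(block)
        if self.synchronous:
            # Lock step: the write is complete only when the remote copy has it.
            self.remote.append(block)
        else:
            # The local write completes first; the remote update follows later.
            self.in_flight.append(block)

    def drain(self, count):
        """Ship up to `count` queued writes to the recovery site."""
        for _ in range(min(count, len(self.in_flight))):
            self.remote.append(self.in_flight.popleft())

    def lost_in_disaster(self):
        """Writes that exist only at the production site."""
        return list(self.in_flight)


async_volume = ReplicatedVolume(synchronous=False)
sync_volume = ReplicatedVolume(synchronous=True)
for block in ("w1", "w2", "w3"):
    async_volume.write(block)
    sync_volume.write(block)

# Only two of the three asynchronous writes reach the recovery site in time.
async_volume.drain(2)
```

In this model, the content of `in_flight` at the moment of the disaster is the data loss window described above; the synchronous volume never has one, at the price of waiting for the remote site on every write.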
Storage vendors have devised a way to ensure no data loss over any distance,
however, by a configuration involving a third copy as shown in Figure 9-6 on
page 235. This solution requires a nearby synchronous replica and a remote
asynchronous replica. The data from the production site is replicated
synchronously to a backup site within 60 miles (96.5 km), which is Site 2 in
Figure 9-6 on page 235, and replicated asynchronously to a remote site, Site 3,
any distance away. As long as only one of the three sites is lost in a disaster, it is
always possible to recover all the data from the remaining two sites. In the
diagram in Figure 9-6, if Site 1 is lost in a disaster, the synchronous copy at Site
2 holds all the data up to the moment of the disaster. From there, the data can be
replicated asynchronously to Site 3, the actual recovery site, thereby extending
zero data loss all the way to Site 3. It works, but the added replica and site can
be expensive.
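The single-site-loss argument can be checked with a few lines of code. The sketch below is a deliberately simplified model with a hypothetical function name: Site 2 always holds every acknowledged write (synchronous copy), and Site 3 trails by a given number of writes (asynchronous copy).

```python
def surviving_writes(writes, async_lag, lost_site):
    """Return the most complete copy that remains after losing one site.

    writes    -- ordered list of writes acknowledged at Site 1
    async_lag -- number of most recent writes not yet applied at Site 3
    lost_site -- 1, 2, or 3
    """
    lag = min(async_lag, len(writes))
    copies = {
        1: list(writes),                      # production copy
        2: list(writes),                      # synchronous replica
        3: list(writes[:len(writes) - lag]),  # asynchronous replica
    }
    remaining = [copy for site, copy in copies.items() if site != lost_site]
    # Recovery uses whichever surviving site holds the most data.
    return max(remaining, key=len)


acknowledged = ["w1", "w2", "w3", "w4"]
results = {site: surviving_writes(acknowledged, async_lag=2, lost_site=site)
           for site in (1, 2, 3)}
```

Whichever single site is removed, at least one of the two remaining copies is complete in this model, which is the property that the three-site configuration is designed to provide.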
[Figure 9-6: three-site replication configuration, with one link labeled Synch and two links labeled Asynch.]
Several vendors support an optimized version of the second site called a “bunker
site” where only the blocks not yet replicated are stored and no others. The list of
the blocks that have not yet been replicated is typically a small list, so a bunker
site can be configured with minimal storage space, which reduces the overall
cost of this solution. IBM Asynchronous Cascading Peer-to-Peer Remote Copy
(PPRC) is an example of this three-site zero data loss solution.
Most organizations prefer to have at least one manual decision step before
declaring a disaster, because of the gravity and cost of switching all operations
and users to a recovery site. But after that decision has been made, a global
cluster manager can automate the rest of the process. This is advantageous,
because automating the process reduces the chances of human error, makes
the process repeatable and testable, and thus increases the chances of a
successful site failover in the highly stressful period following a disaster. IBM
High Availability Cluster Multiprocessing/Extended Distance (HACMP/XD) is one
example of a global cluster manager with these capabilities for the AIX platform.
Symantec Veritas Global Cluster Option is another example that runs on a
variety of platforms.
9.5.3 Disaster recovery approaches
IBM FileNet Lab Services has defined three common approaches for disaster
recovery:
Build it when you need it.
Third-party hot site recovery service.
Redundant standby system.
For an RTO of three days or more, the minimum level of data replication, namely
backup to tape, is sufficient. As we noted earlier, a form of point-in-time backup,
such as tape backup, is always required, regardless of RTO, as a means of
recovering from data corruption or accidental deletion. In that case, you retrieve
the latest backup tape or other point-in-time backup from the off-site storage
location and restore the data to a point in time prior to the corruption or deletion
of the data. Full data restoration from tape is a slow and laborious process. It
typically involves a full backup tape followed by a number of incremental backup
tapes, and it can take days to complete. Backups are done periodically,
usually once a day, possibly multiple times a day, so the RPO for this minimum
solution is hours to days of lost data.
Periodic replication to off-site storage characterizes the next two solutions up the
cost curve. Both add cost for communications links, but they provide an
RPO and RTO of hours, not days. Periodic point-in-time backup to remote
storage, usually disk storage, is the first step up from standard local tape backup.
The next step up consists of shipping database or file system update logs to the
remote recovery site, where they are applied to a copy of the data to bring it up to
date with that log. These are both done on a periodic basis, but as the period is
shortened, it approaches the limit of continuous replication, which is the next step
up the cost curve.
Table 9-5 Range of disaster recovery solutions
Recovery time Recovery point Cost Technologies
The cost now starts to accelerate upward. As the name implies, continuous
replication is the process of replicating data to the recovery site as it changes,
that is, on a continuous basis. Near continuous and continuous replication
greatly decrease the potential for data loss when compared to periodic
replication, which brings the RPO down to seconds' worth of data loss, or even
zero data loss in the case of synchronous replication.
[Figure 9-7 diagram labels: DNS server; Web/presentation tier; FileNet server farms (AE, eF, RM, TCM, FSP); data tier server farms and clusters; data replication.]
At the business logic tier, sometimes also called the services tier, the HA best
practices shown in Figure 9-7 are a mix of load-balanced server farms and
active-passive server clusters. The P8 Content Engine and Process Engine
servers must both be deployed in load-balanced server farms (see footnote 2). The Content
Engine has been qualified with both hardware and software load balancers; the
Process Engine requires a hardware load balancer as of P8 4.0.
IBM FileNet Content Services and IBM FileNet Image Services repositories can
be federated with the P8 Content Manager via Content Federation Services.
Both of these older products must be deployed in active-passive server clusters
for high availability; they do not support being deployed in load-balanced server
farms.
At the data tier, all the database servers can be deployed in active-passive
server clusters for HA. Oracle can also be deployed in its load-balanced RAC
configuration for HA. The Content Engine makes use of network file shares for
file storage areas for content storage and index areas for content-based search
indexes, so the network file servers or NAS devices underlying the Content
Engine file storage areas and index areas need to be highly available as well. For
a network file server, the typical HA configuration is an active-passive server
cluster; NAS devices typically have either active-active or active-passive
configurations for HA.
The best practice for redirecting the user community to the replacement systems
at the recovery site is to use DNS updates. The users’ client computers must use
DNS aliases (CNAMEs) to locate the P8 Content Manager services, so
that the aliases can be redirected after a disaster through DNS updates. This
redirection allows reconnection to the recovery site without making any client
Footnote 2: Prior to P8 4.0, the Content Engine supported both farming and clustering for its Object Store
Services component, but only active-passive clustering for its File Store Services component. For
4.0, these components were unified and now support farming across the board. Prior to P8 4.0, the
Process Engine required active-passive server clustering for high availability.
Why does this approach not work with Content Manager? The key is the nature
of the data and how it must be managed. P8 Content Manager, as the name
suggests, is designed to manage rapidly changing and growing collections of
data that are being accessed and modified in parallel by users across an
enterprise. Unlike the largely static data of a corporate Web site, which is
published or released to the site in a carefully controlled authoring and
information publication process, content in a typical P8 Content Manager object
store is being collaboratively authored, enhanced, deleted, created, and
processed in a dynamic manner under transaction control to avoid conflicting
changes. As a result, only a single active copy of the data can be online and
changeable at any point in time so that transaction locking can be enforced and
changes are saved in a safe, consistent manner. This means that the basic idea
of two sites, in which each site has an active copy of all the content, is not the
best practice for a transactional system. It is not supported by the P8 Content
Manager.
How about using geographically dispersed farms and clusters, that is, with the
farms and clusters split between the two sites? If one server fails, the server at
the other site takes over, either coming up at the time of failure in the case of an
active-passive server cluster or simply taking on redirected client requests in the
case of server farms. Again, there is an availability trade-off because of the
added risk of communication problems between the two sites. As we noted
earlier, we do not recommend geographically dispersed farms and clusters as
best practice because of the added risk and higher networking costs.
So the best practice is to deploy local server farms and clusters for high
availability in order to provide for continuing service in the event of local
component failures and to deploy a second site with data replication and,
optionally, global clustering, to provide for rapid recovery from disasters. The
best practice is to locate the recovery site outside the disaster impact zone of the
production site.
Two documents (downloadable from this Web site) are devoted to high
availability and disaster recovery:
FileNet P8 High Availability Technical Notice
FileNet P8 Disaster Recovery Technical Notice
If you are interested in the details of P8 Content Manager deployment rather than
in concepts and background information, you can skip to the last
section, 10.6, “P8 Content Manager deployment” on page 264, for specific
information.
When you isolate phases of the software development cycle into different
environments, maintaining the different states becomes
challenging. Deployment must be maintained in an organized manner.
Larger companies tend to add these additional environments to the basic three
environments identified earlier:
Performance testing
Training
Staging
The more environments that you have, the more important it is to maintain and
synchronize them properly.
[Figure 10-1 diagram labels. Release management: release planning, software design, build and configure, quality review, release accepted, rollout plan, implement release, verify release. Testing: unit test, system testing, system integration testing, integration tests, performance test, user acceptance test, regression tests.]
The illustration in Figure 10-1 also shows the change and configuration
management processes. Although the configuration management database (CMDB) maintains
the state of the software and hardware assets involved, typical client situations
need a more detailed CMDB just for IBM FileNet Content Manager deployment.
You can maintain a few spreadsheets with the detailed information,
which allows you to track every change. It is absolutely crucial to have a good
change management process and to track the same level of detail for the
non-production environments.
In the next few sections, we present more details about release, change, and
configuration management, as well as testing, before we dive into a discussion of
moving the applications from development to production.
10.2.1 Release management
Over the past several decades, we have seen an evolution in enterprise
architecture. The evolution has gone from monolithic architectures
(COBOL-based programs running on mainframes) to component-based
architectures (J2EE and .NET applications) and toward service-oriented
architectures (SOA). The changes transform the enterprise into a highly
interoperable and reusable collection of services that are positioned to better
adapt to ever-changing business needs.
A software release manager is responsible for handling the following tasks and
requests:
Risk assessment
Deployment and packaging
Patch management (commercial or customized bug fixes)
Commercial patches for the runtime environments (for example, for operating
systems and application servers)
From the software development area:
– Software change requests (modifications)
– New function requests (additional features and functions)
From the quality assurance (quality of code) area:
– Software defects of custom code/commercial code
– Testing (code testing)
Software configuration management (the rollout of new releases)
The release manager for an IBM FileNet Content Manager solution might find the
following documentation and information helpful in performing release
management tasks:
Hardware and software compatibility matrix from the customer support site
Available export and import options to deploy the solution between
development and production environments
Search and replace scripts used to prepare exported assets for use in the
target environment where object stores, users, or groups differ from the
source environment
Deployment guidelines from the customer support site
Online help
In addition, the Rational® product line from IBM can be helpful in supporting
release management, change management, and testing. For reference, go to:
http://www.ibm.com/software/rational
Another aspect of release management deals with objects that have been
created in production that affect the configuration of the solution and might
impact deployment. In IBM FileNet Content Manager solutions, these types of
objects include folder structures, add entry templates, search templates, and
others. Release management must have a strategy in place to handle or restrict
bidirectional deployment between multiple environments.
With IBM FileNet Content Manager solutions, there are two issues associated
with change management that you must consider:
The number and details of configuration items needed for a proper
deployment might overwhelm a configuration management database (which
supports the change management process for all IT-related changes, not only
changes for IBM FileNet Content Manager).
When the development system is not part of a change management process,
situations can occur in which changes are applied to the development
environment without being documented or in an uncontrolled manner.
Consider the following areas related to P8 Content Manager when managing the
change process:
Commercial code and assets (versions of P8 Content Manager, as well as
individual patches and levels of its components, such as Content Engine,
Application Engine, and Process Engine)
At the beginning of deployment, you must not handle commercial code and
custom code separately. For the targeted solution release, everything must be
assembled via an automated process if possible. The combining of custom and
commercial code must be handled by release management and documented by
configuration management.
Figure 10-2 shows the areas for which you distinguish between custom (yellow
area) and commercial code (green area):
[Diagram labels: Workplace; object store; your WAR file; Application Engine; document classes; documents; property templates.]
Figure 10-2 Custom Code at the level of object store and Workplace
For more information regarding the separation of custom and commercial code
and the build process for Workplace, refer to 10.6.4, “Exporting and importing
other components” on page 269. For more information regarding the separation
of custom and commercial assets during repository design, refer to Chapter 5,
“Basic repository design” on page 85.
Retain a zip/tar file of all release-specific data, including code, exported assets,
and documentation, in a central datastore. Typically, you maintain
release-specific data by using a code version control system. IBM clients can use
Rational ClearCase, for example.
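Packaging the release-specific data can itself be scripted. The following sketch uses only the Python standard library; the function name, directory layout, and file names are assumptions made for illustration, and the archive is written outside the release directory so that it does not include itself.

```python
import tarfile
import tempfile
from pathlib import Path


def archive_release(release_dir, archive_path):
    """Pack everything under release_dir into a gzip-compressed tar file.

    Returns the sorted list of archived file names so that the content
    can be recorded alongside the release.
    """
    release_dir = Path(release_dir)
    with tarfile.open(str(archive_path), "w:gz") as tar:
        # arcname keeps archive paths relative to the release name.
        tar.add(str(release_dir), arcname=release_dir.name)
    with tarfile.open(str(archive_path), "r:gz") as tar:
        return sorted(m.name for m in tar.getmembers() if m.isfile())


# Demonstration with a throwaway directory tree (hypothetical file names).
workspace = Path(tempfile.mkdtemp())
release = workspace / "release-1.0"
(release / "code").mkdir(parents=True)
(release / "code" / "app.txt").write_text("custom code placeholder")
(release / "README.txt").write_text("release notes placeholder")

archived = archive_release(release, workspace / "release-1.0.tar.gz")
```

The returned file list can be stored with the release record, which makes it easier to confirm later that code, exported assets, and documentation were all captured.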
10.2.4 Testing
There are multiple ways to address environments associated with testing. One
way is to split testing into two major phases, which typically happen in different
environments:
Development environment
Regression tests
For all environments, regression testing must be implemented to enable a quick
functional test to see whether all components are up and running. A baseline
version might include one test object for each relevant aspect, such as a test
document class, a test search template, a test folder, and a test workflow.
The regression test must be used after having modified software (either
commercial or custom code) for any change in functionality or any fix for defects.
A regression test reruns previously passed tests on the modified software to
ensure that the modifications do not unintentionally cause a regression of
previous functionality. Regression testing can be performed at any or all of the
previously mentioned test levels. The regression tests are often automated.
Automating the regression test can be an extremely powerful and efficient way to
ensure basic readiness.
Best practice: Establish a small suite of regression tests in each
environment. The best synergies are achieved by having the deployment of
the test assets and the test script as automated as possible. One side effect is
that this automation of regression tests affects the repository design.
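A small automated regression suite of this kind might look like the following sketch. It uses Python's standard unittest module. The FakeRepository class and its exists method are hypothetical stand-ins, not a FileNet API; in practice they would be replaced by a thin wrapper around whatever client interface your solution uses. The object names are also invented for illustration.

```python
import unittest


class FakeRepository:
    """Stand-in for a real repository client (hypothetical interface)."""

    def __init__(self):
        self._objects = {
            ("document_class", "TestDocumentClass"),
            ("search_template", "TestSearchTemplate"),
            ("folder", "/RegressionTest"),
            ("workflow", "TestWorkflow"),
        }

    def exists(self, kind, name):
        return (kind, name) in self._objects


class BaselineRegressionTest(unittest.TestCase):
    """One quick check per relevant aspect of the solution."""

    repository = FakeRepository()  # swap in a real client wrapper here

    def test_baseline_objects_are_present(self):
        baseline = [
            ("document_class", "TestDocumentClass"),
            ("search_template", "TestSearchTemplate"),
            ("folder", "/RegressionTest"),
            ("workflow", "TestWorkflow"),
        ]
        for kind, name in baseline:
            # subTest reports each missing object separately.
            with self.subTest(kind=kind, name=name):
                self.assertTrue(self.repository.exists(kind, name))


def run_baseline():
    """Run the suite and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(BaselineRegressionTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Because the same test assets are expected in every environment, the suite can run unchanged after each deployment, with only the repository connection differing between environments.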
Test automation
Two areas of consideration for automating tests are:
The load and performance test
The regression test
While the load and performance test might be executed only on major version
changes (commercial or custom releases), the effort to maintain the code for the
automation might be substantial.
The regression test must be generic enough that the scripts are written once,
perhaps updated for minor changes, and otherwise remain stable
over time. We recommend that you store the scripts together with the supporting
version of the test application in one location. Typically, this location is your
source code version control system.
Test automation tools are available from IBM and other vendors. For example,
refer to the IBM Rational products Web site at:
http://www.ibm.com/software/rational
Best practice: Distinguish load and performance tests from regression tests.
Each area has its own characteristics.
You can typically use the existing testing infrastructure for load and
performance tests. For regression testing, it typically makes no sense to use a
centralized large and complex infrastructure. It is more important that the tests
can be executed and will quickly show simple results.
Test documentation
Before we move to a discussion of the actual deployment, we must discuss the
testing documentation and its importance.
Several of the tests might fail. It is crucial to document not only the behavior but also the
resolution. The knowledge base must be part of problem management.
Combining test documentation with a searchable interface to find known
problems is very advantageous.
Best practice: Carefully document your tests with sufficient detail before the
tests are executed. Make the test documentation database searchable to
search for problems previously seen by users. Create a knowledge database.
There are three ways to deploy (transport) changes from one environment to
another:
Cloning
Exporting, transforming, and importing
Scripted generation of all the necessary documents and structures
10.3.1 Cloning
You can deploy changes from one environment to another by cloning the source
environment and bringing it alive as a new but identical instance of the source
environment.
You can use local VMware-based images to clone a system. For a large system,
however, this might not be a workable solution. Large systems are often not as
flexible as small systems, or there is a lack of powerful machines that can be
made available in a timely manner for cloning. Sometimes, the security and
networking policies do not allow these virtual environments to connect to
back-end machines.
The next logical step is to use virtual farms that host applications at larger client
sites. This approach might not be practical for the following reasons:
They cannot be accessed from the corporate network except through remote
desktop applications. Direct interaction is not possible because the
same host names and IP addresses are used multiple times in the same network.
Single virtual images are typically not powerful enough for the full stack of
components that are needed for a solution (which includes directory server,
database, application server, and other P8 Content Manager components).
In the past, we have seen projects struggle for months when using a manual
process to move J2EE applications that include IBM FileNet Content Manager
components. Today, we can deploy similar projects within one to two days. The
following factors contribute to the improvements:
Introduction of a solid release management process
Separation of commercial code from custom code and automation of the build
process mainly for J2EE-based or .NET-based applications
Adherence to the proposed guideline of stable GUIDs to reduce
dependencies (as described later in this chapter)
Implementation of a central datastore (database-based or file-based) in which
environment-specific information is stored
Automation wherever possible
The activity for transformation can take place as described in the IBM FileNet P8
Platform Planning and Deployment Guideline, before or just after import. Custom
scripts can be called to make the necessary transformation. The transformation
can also be conducted on the exported files before importing.
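A transformation on exported files is often little more than a controlled search and replace. The sketch below assumes that the export is text (for example, XML) and that the environment-specific values are known in advance; the function name, the mapping, and the sample element names are hypothetical and do not reflect the actual export format.

```python
import re


def transform_export(text, mapping):
    """Replace every source-environment value with its target value.

    A single regular-expression pass avoids re-replacing text that an
    earlier substitution produced.
    """
    if not mapping:
        return text
    # Longest keys first, so a key that is a prefix of another cannot win.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(key) for key in keys))
    return pattern.sub(lambda match: mapping[match.group(0)], text)


# Hypothetical source-to-target mapping for a development-to-test move.
dev_to_test = {
    "DevObjectStore": "TestObjectStore",
    "cn=devadmins,dc=example,dc=com": "cn=testadmins,dc=example,dc=com",
}

exported = ("<owner>cn=devadmins,dc=example,dc=com</owner>"
            "<objectstore>DevObjectStore</objectstore>")
transformed = transform_export(exported, dev_to_test)
```

Keeping the mapping in a central datastore, one entry set per target environment, lets the same script serve every stage of the deployment path.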
This approach has been proven to work, but the effort to maintain this type of
script is huge and every change must be put into the script code. All of the
benefits of using a tool, such as IBM FileNet Enterprise Manager, are lost with
this approach. There is very little benefit in using this approach unless it is to
overcome limitations where there is no alternative. This approach can be used to
create marking sets or to maintain application roles. We will not discuss this
approach further due to its limitations.
10.4.1 Topology
Figure 10-3 on page 260 illustrates a clonable topology with three identical
environments using VMware images. Every domain is formed by a collection of
servers, which are part of multiple VMware images. All images of one domain are
connected over a private network to a special image called the router. The router
implements network address translation (NAT) and virtual private network (VPN)
gateway functionality. This can be done using Microsoft Remote Access Server
or other products. The other network link of the router is mapped to a network
card of the brick, which is accessible by the corporate network.
To clone the environment, only the router image has to be modified and the
public interface needs to be set up correctly. The Application Engine resolves the
public Domain Name System (DNS) of the router image.
Even though a large group of developers has all of the tools necessary to
perform their tasks, developers might prefer to have a pre-configured image to
run on their individual workstations. If Microsoft Active Directory is used, use a
VMware image that was initially part of the same Active Directory.
10.5 Deployment by export, transform, and import
In this section, we discuss deployment by export, transform, and import either for
a full or incremental deployment.
A full deployment for a P8 Content Manager solution means that both the
structure information and the documents are deployed in one iteration. The
target environment gets everything with the assumption that the target object
store is empty.
Full deployment is a very powerful vehicle to move a project the first time through
the various stages of deployment. You only perform a full deployment one time
for a project.
There are multiple ways to figure out the differences between the two releases:
Manually
By strictly rolling forward changes from the source environment to the target
environment and preventing any changes to the target environment between
releases
Automated discovery of the differences
Manually detecting the differences between the source environment and the
target environment is time-consuming and error-prone. This option is only valid
for small deployments.
Clients typically choose the second option with the consideration that someone
has manually verified both environments. In a multi-stage environment, there is a
good chance that mistakes in this approach will be detected in the first
deployment step from the development environment to the test environment.
When errors are detected at this point, there is an opportunity to fix the
underlying problems and retry the same procedure. As soon as the deployment
to the test environment passes testing (and is documented), the future
deployment to production most likely works smoothly.
The third option is extremely difficult to achieve and potentially too expensive.
There are a lot of exceptions when just comparing date times between the
various environments. A development or source environment might include more
objects than will be used for the target deployment. So, a selective tagging of
objects that are part of a release seems to be mandatory.
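The combination of automated discovery and selective tagging can be sketched as a comparison of two object inventories. The following Python sketch is illustrative only: the `ObjectInfo` record, the `release_tag` field, and the `release_delta` function are assumptions made for this example and are not part of any IBM FileNet API. How the inventories are obtained from each environment is left open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectInfo:
    """Minimal description of one object in an environment (hypothetical)."""
    last_modified: str   # ISO 8601 UTC timestamp, compared as text
    release_tag: str     # tag assigned by the project team; "" if untagged


def release_delta(source, target, release):
    """Return the GUIDs to add and to update for the given release tag."""
    to_add, to_update = [], []
    for guid, info in source.items():
        if info.release_tag != release:
            continue  # present in the source but not part of this release
        if guid not in target:
            to_add.append(guid)
        elif info.last_modified > target[guid].last_modified:
            to_update.append(guid)
    return sorted(to_add), sorted(to_update)


source = {
    "guid-a": ObjectInfo("2008-01-10T09:00:00Z", "R2"),
    "guid-b": ObjectInfo("2008-02-01T12:00:00Z", "R2"),
    "guid-c": ObjectInfo("2008-02-03T08:30:00Z", ""),
}
target = {"guid-a": ObjectInfo("2007-12-01T10:00:00Z", "R1")}

added, updated = release_delta(source, target, "R2")
```

The tag filter is what keeps untagged development objects, such as `guid-c` here, out of the release even though their timestamps are newer than anything in the target.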
When moving objects between multiple environments, you must consider
dependencies. Objects are often dependent on other objects in the object store
or on external resources. Examples:
An Add Entry Template references a folder.
An application’s Stored Search definition is an XML document in an object
store. The XML content references multiple object stores by name and ID.
A document references an external Web site that contains its content.
While there is no work-around for external dependencies, there is one for inter-
object store dependencies by keeping the identification of these objects (GUIDs)
consistent across the various environments. This is not in contradiction to the
previously mentioned uniqueness of GUIDs, but it is rather a consequence for
two reasons:
The objects, which are considered to be kept consistent with the same GUIDs
across object stores, have configuration characteristics, such as document
classes, folders, property templates, add entry templates, search templates,
and others.
The predefined population of an object store after you run the object store
creation wizard follows the same pattern.
In Figure 10-4 on page 264, we show two options of how to deploy a search
template that has a dependency on a folder structure. While you might argue that
there are better ways to reference folders by referring to a full path, you might
discover similar situations where there are good reasons to depend on a GUID.
In the first option, we followed the practice of using stable GUIDs which we did
not do in the second option:
Deploying the folder with the same GUID means that no additional corrections
are needed when deploying the search template.
Deploying the folder and letting the system generate a new GUID leads to a
situation where the search template must be changed to refer to the deployed
folder.
You can avoid the extra effort of maintaining the dependencies in the target
environment by following the pattern of having stable GUIDs.
Figure 10-4 on page 264 illustrates the deployment from development with stable
GUIDs in the top box on the right and not applying stable GUIDs in the lower box
on the right. Not following the stable GUID pattern results in maintaining the
dependencies with additional deployment logic.
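When stable GUIDs are not used, the additional deployment logic amounts to rewriting GUID references in the exported files. The following Python sketch shows one way this might look. The braced GUID format, the sample XML attributes, and the `remap_guids` function are assumptions for illustration; they do not reflect the actual export schema.

```python
import re

# Matches GUIDs written as {xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx} (assumed format).
GUID_PATTERN = re.compile(
    r"\{[0-9A-Fa-f]{8}(?:-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}\}"
)


def remap_guids(xml_text, guid_map):
    """Replace every GUID listed in guid_map; leave unknown GUIDs untouched."""
    normalized = {old.upper(): new for old, new in guid_map.items()}

    def substitute(match):
        return normalized.get(match.group(0).upper(), match.group(0))

    return GUID_PATTERN.sub(substitute, xml_text)


# Hypothetical search template fragment that references a folder by GUID.
template = (
    '<folder id="{11111111-2222-3333-4444-555555555555}"/>'
    '<class id="{AAAAAAAA-BBBB-CCCC-DDDD-EEEEEEEEEEEE}"/>'
)
guid_map = {
    "{11111111-2222-3333-4444-555555555555}":
        "{99999999-8888-7777-6666-555555555555}",
}
result = remap_guids(template, guid_map)
```

The mapping table itself has to be produced and maintained for every target environment, which is the extra effort that the stable GUID pattern avoids.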
[Figure 10-4: Object Store A in development (GUIDs 456 and 123) deployed to Test with stable GUIDs (456 and 123) and without stable GUIDs (885 and 789)]
There are three major types of objects to be exported:
Structure (such as document classes and folders)
Configuration documents (such as templates and workflow definitions)
Business documents (such as faxes, e-mails, and images)
10.6.1 CE-Export
When preparing an export, you need to consider the granularity of the export. It is
usually unnecessary to include everything in one export run. If you need to fix a
problem, we recommend having multiple exports addressing smaller chunks of
data rather than one huge XML file that describes everything.
Hierarchy of exports
Build a logical hierarchy of exports, which can help you to test the imports
sequentially and fix dependencies more easily.
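One way to derive such a hierarchy is to record which export package depends on which and let a topological sort produce the import order. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later). The package names and the dependencies between them are hypothetical and must be replaced by the ones in your solution.

```python
from graphlib import TopologicalSorter

# Each export package maps to the packages it depends on (hypothetical names).
dependencies = {
    "property_templates": set(),
    "document_classes": {"property_templates"},
    "folders": {"document_classes"},
    "entry_templates": {"document_classes", "folders"},
    "search_templates": {"folders"},
}

# static_order() yields every package after all of its dependencies.
import_order = list(TopologicalSorter(dependencies).static_order())
```

If two packages depend on each other, `static_order()` raises `graphlib.CycleError`, which is a useful early signal that the export granularity needs to be reconsidered.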
Certain objects, which include all P8 Content Manager domain level objects,
including marking sets, cannot be exported. There are Application
Engine-related objects that cannot be exported as well. See the discussion of
AE-based export in 10.6.4, “Exporting and importing other components” on page 269.
If marking sets have been used, you have to create them manually in the target
system. The export and import sequence is:
Exporting content
When exporting configuration documents, you have an option to specify an
external directory to which the actual content is written. Choosing this option gives
you a better starting point for your future transformations. If you do not choose
this option, the content is encoded and embedded in a CDATA section of the
exported XML file.
Best practice: Choose an external subfolder for content when exporting your
configuration documents.
groups correctly by coding the specific users and groups into only one script
instead of into hundreds of objects. In this approach, you handle user and group
information in access control lists.
There is another aspect of setting user and group information within the Add
Entry Templates. These objects contain user and group information that is used
to grant permissions to new objects (similar to the default instance security of
document classes). We discuss this topic in the section about AE-based content
deployment in 10.6.4, “Exporting and importing other components” on page 269.
Automation of export
The description of the detailed P8 Content Manager deployment is based on the
experiences gained with larger deployments under IBM FileNet P8 V3.5x.
Note: The ability to export and import assets from the object store has been
changed between IBM FileNet P8 CE Version 3.5x and 4.x.
In Version 3.5x, IBM FileNet Enterprise Manager is the preferred method for
interactive export and import. The CE COM API also offers export/import
methods, which can be used when automation is a goal. These methods
operate on individual objects, leaving it up to the developer to import and
export objects in the correct sequence and accounting for dependencies
between objects, which is a non-trivial task.
In Version 4.x, IBM FileNet Enterprise Manager is still available for interactive
export and import. You can use a command line export and import utility that
is new in Version 4.x when automation is a goal. Unlike the 3.x API calls, the
new command line utility handles object sequencing for you. The utility takes
an XML manifest file as its instruction set. This manifest file can be generated
using IBM FileNet Enterprise Manager (the file can be generated once,
interactively, in IBM FileNet Enterprise Manager, and then used multiple times
in automated deployments). The Version 3.x export/import API calls are not
available in Version 4.x.
Business documents: Transformation is not required. Do not use the same GUID
unless they have the configuration characteristic.
Note: If user and group information must be transformed to suit the target
environment, the transformation applies to all objects, including those where
this table says transformation is “Not required”.
Configuration documents must be exported using the option to put their content
into a separate folder. If you do this, the XML file that describes each document
object holds an absolute pointer to the content in the configured folder.
Note: IBM is not responsible for testing and supporting your exported files and
objects. Always test them in a non-production environment before deploying
them in a production environment.
10.6.3 CE-Import
Importing the exported and transformed objects using IBM FileNet Enterprise
Manager is straightforward when you follow the order that is presented in
“Hierarchy of exports” on page 265.
Database
All changes to rows in the object store database are covered by exporting and
importing objects as previously explained.
In addition, you can consider propagating changes that have been applied at the
database level, such as adding additional indexes, changing server options, and
others. You can typically accomplish this by rerunning the SQL-based scripts
that were written to configure the database in the source environment. Check
whether the scripts depend on infrastructural information, such as user ID,
password, server name, IP addresses, and database name.
In any case, you must map the users, groups, and memberships to the target
environment, which depends on the security settings in your company. If
possible, use the same scripts and transform them based on a naming
convention. This step needs to happen prior to the CE/PE Import.
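A naming-convention-based mapping of users and groups can be sketched as follows. This is a minimal Python example under the assumption that environment-specific principals carry an environment prefix such as `DEV_` or `TEST_`. The convention, the function names, and the sample principals are hypothetical.

```python
def map_principal(name, source_env, target_env):
    """Map a user or group name by swapping an environment prefix.

    Assumes a convention such as 'DEV_Approvers' -> 'TEST_Approvers'.
    Names that do not follow the convention are returned unchanged.
    """
    prefix = source_env + "_"
    if name.startswith(prefix):
        return target_env + "_" + name[len(prefix):]
    return name


def map_memberships(memberships, source_env, target_env):
    """Apply the mapping to a {group: [members]} dictionary."""
    return {
        map_principal(group, source_env, target_env): [
            map_principal(member, source_env, target_env) for member in members
        ]
        for group, members in memberships.items()
    }


dev = {"DEV_Approvers": ["DEV_svc_workflow", "jsmith"]}
test = map_memberships(dev, "DEV", "TEST")
```

Keeping the mapping in one function means that the same rule can be applied to the directory setup scripts and to the exported files, so that both stay consistent.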
The underlying Process Engine APIs contain all the required methods to move
Queues, Rosters, EventLogs, and to validate Workflow Definitions.
You can export the Process Engine configuration by a call to the Process Engine
Java API. The method VWXMLConfiguration.exportConfigurationToFile
(apiObjects [], outputFile) takes a list of objects to be exported to an XML file.
The import into the Process Engine works in a similar way. The method
VWXMLConfiguration.importConfigurationFromFile (session, inputFile, option)
imports the XML file’s content into an existing session by either overwriting
existing items or merging them.
If you currently use other BPM features or services, refer to the FileNet P8
Planning and Deployment Guide, which you can download from:
http://www-1.ibm.com/support/docview.wss?rs=3278&uid=swg27010422
Application Engine
Whenever Workplace applications have to be moved between environments,
there are business assets and application configuration assets to be deployed.
Workplace stores various objects in an object store, such as:
Site Preferences
User Preferences
Add Entry Templates
Stored Searches
Search Templates
Application Roles
We have already discussed the business assets under the export, transform, and
import process. We do not need further explanations from a methodology point of
view.
We need to further explain the application configuration object, the Add Entry
Template. The Add Entry Template is used at the moment when you add a
document to the object store. It describes the metadata information that is used
to choose the document class, containing folder, initial values, and initial object
permissions.
When exporting the Add Entry Template, it is important to understand that the
user and group information, which is embedded in the exported XML file, has
nothing to do with the IBM FileNet Enterprise Manager option “Export security.”
This information is used by the Workplace application, and you need to either
manipulate this section as part of the transformation or remove this section from
the XML export and import. Then, you can modify all Add Entry Templates
manually by editing them. If you have a large number of Add Entry Templates,
you might consider editing the section in the XML file automatically, which has
been done successfully.
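The option of removing the section from the XML can be automated along the following lines. This Python sketch works on a simplified, hypothetical document structure: the element names `entrytemplate` and `permissions` are invented for illustration and do not reflect the real Add Entry Template format. Note also that `xml.etree.ElementTree` does not preserve CDATA markers, so verify the output against your own exports before relying on such a script.

```python
import xml.etree.ElementTree as ET


def strip_elements(xml_text, tag):
    """Remove every element named `tag`; return the new XML and a count."""
    root = ET.fromstring(xml_text)
    removed = 0
    # Copy the iterators into lists so that removal does not disturb traversal.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == tag:
                parent.remove(child)
                removed += 1
    return ET.tostring(root, encoding="unicode"), removed


sample = (
    "<entrytemplate>"
    "<name>Invoice</name>"
    "<permissions><grantee>DEV_Approvers</grantee></permissions>"
    "</entrytemplate>"
)
cleaned, count = strip_elements(sample, "permissions")
```

Returning the number of removed elements makes it easy to log, per template, whether the transformation found the section it was expected to remove.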
Workplace is a Web application spanning one war file, which contains the
relevant Java APIs to connect to Content Engine and Process Engine.
Workplace is built on top of the Workplace toolkit that can be leveraged to
customize Workplace. When Workplace is deployed as an application, there are
two areas for consideration:
Custom code incorporated into the same Workplace war file
Custom code incorporated into a different ear/war file that accesses the
commercial Workplace application
It is beyond the scope of this book to explore how to achieve a good build
process in detail, but tools such as Ant from the Apache Software Foundation can
help to facilitate the build process. Many clients have used such tools
successfully to adapt Workplace and to automate its deployment.
RemoteServerUrl, RemoteServerDownloadUrl, RemoteServerUploadUrl, and CryptoKeyFile: /WEB-INF/WcmApiConfig.properties
11
The help system uses framed Web pages. The left frame contains links to details.
In this document, we point to additional details by selecting ecm_help → Help
Directory → How to use Help. On your ecm_help system, expand Help
Directory → How to use Help to find additional details. See Figure 11-1.
You can obtain product documentation for the IBM FileNet P8 Platform from the
following Web site:
http://www.ibm.com/support/docview.wss?rs=3278&uid=swg27010422
You can obtain technical notices from this Web site in the Technical Notices
section, including:
IBM FileNet P8 Performance Tuning Guide
IBM FileNet P8 High Availability Technical Notice
IBM FileNet Content Engine Query Performance Optimization Guidelines
Technical Notice
IBM FileNet Application Engine Files and Registry Keys Technical Notice
IBM FileNet P8 Asynchronous Rules Technical Notice
IBM FileNet Content Engine Component Security Technical Notice
IBM FileNet P8 Directory Service Migration Guide
IBM FileNet P8 Disaster Recovery Technical Notice
IBM FileNet P8 Extensible Authentication Guide
IBM FileNet P8 Process Task Manager Advanced Usage Technical Notice
IBM FileNet P8 Recommendations for Handling Large Numbers of Folders
and Objects Technical Notice
IBM FileNet P8 DB2 Large Object (LOB) Data Type Conversion Procedure
Technical Notice
Although several technical notices were written for IBM FileNet P8 3.5, much of
the content provided is useful for the 4.0 version as well.
11.2.2 Dashboard
Administrative personnel can use this tool to routinely monitor system
performance. The Dashboard provides a means to generate detailed reports
regarding performance. It displays the details and also has the ability to save the
information in various formats.
The Dashboard is a Java utility that can be installed and run on Windows or
UNIX/Linux clients. It is installed separately from the server installation. It can
also be installed and run on the P8 Content Manager servers. On Windows
machines, run the Dashboard utility. On UNIX, you must have an XWindows
display exported and run the P8Manager shell script. The Dashboard installs a
local copy of its online help; it can be accessed from the help menu option.
When the Dashboard is first run, you need to create clusters of P8 Content
Manager components to monitor. These clusters are not used for high availability
but are simply a user-defined logical collection or cluster of servers to monitor.
The cluster contains servers and monitoring frequency. Select the Cluster tab
and click New. Enter a name for the cluster, which is typically the application
system name or location, and click OK. See Figure 11-2 on page 277.
Figure 11-2 Dashboard: New Cluster
Click Edit to add servers and timing details. See Figure 11-3 on page 278. The
Interval sets, in seconds, how often the Dashboard polls the server listeners for
details. For a 15 minute interval, enter 900 seconds. The number of
datapoints sets the maximum number of interval details that the Dashboard
keeps in the display.
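The relationship between the two settings is simple arithmetic, shown here as a short Python sketch. The function names are made up for this example, and the assumption is that the display covers at most the interval multiplied by the number of datapoints.

```python
def polling_interval_seconds(minutes):
    """Convert a polling interval in minutes to the seconds value to enter."""
    return int(minutes * 60)


def display_window_hours(interval_seconds, datapoints):
    """Time span covered by the display when every data point is filled."""
    return interval_seconds * datapoints / 3600


interval = polling_interval_seconds(15)          # 15 minutes
window = display_window_hours(interval, 96)      # 96 datapoints
```

With a 15 minute interval, 96 datapoints therefore keep roughly one day of history in the display.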
Click OK. At this point, the Dashboard tool begins querying for System Manager
Listeners on the servers and populates details in the Dashboard tool’s various
windows. It finds all listeners running on each server; individual servers only
need to be defined once. You can save the cluster details for future use or open
existing ones from the file menu. The cluster file is an XML-formatted file that is
saved on the local computer. You can copy the cluster.xml file to other
computers where the Dashboard is installed for use on other workstations.
The Dashboard summary tab simply shows a graph of the cluster’s performance.
The Details tab contains counter details for all listeners. You can expand the P8
Content Manager applications on each server and view the following items:
CPU, Network, and Disk utilization
Environmental details, such as OS level, P8 Content Manager version, and
Java virtual machine (JVM) settings
Remote Procedure Call (RPC) activity shows how the P8 Content Manager
subsystems are performing and it details the count and average time
consumed (duration) by the calls during the interval
Figure 11-4 on page 279 shows RPC count details and the number of items
processed per interval.
Figure 11-4 RPC Count details window
Running the archiver.jar can be automated through host scripts. The archiver.jar
writes to files with one file per listener in a log directory. The archived files are
binary files that can be opened via the Dashboard’s File → Open Archive menu.
The same view and report options apply as in a live system monitoring session.
Table 11-1 lists the archiver.jar parameter options.
Parameter      Description
-t hh:mm       Total amount of time in hours and minutes that the archiver process must run
-n hh:mm       The interval at which the current archived files must be closed and new ones opened
-i integer     The interval, which is specified in seconds, at which to poll for data from the specified machines
-d file path   The path to the location at which to place the archive log files
FileName.xml   The complete path to the saved cluster file that specifies which machines to poll
Best practice: Start the archiver.jar immediately before your system activity
picks up during the peak times (for example, in the morning) and run it until
activity slows down (for example, in the evening).
If you restart your system while the archiver.jar is running, you must restart the
archiver.
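Because the archiver can be started from host scripts, the options in Table 11-1 lend themselves to a small wrapper that validates the values before the process is launched. The Python sketch below only assembles the argument list. The `java -jar` invocation, the order of the options, and the sample paths are assumptions that are not confirmed by this book; check the Dashboard online help for the exact command line on your system.

```python
import re

HHMM = re.compile(r"^\d{1,2}:[0-5]\d$")


def archiver_arguments(total, rollover, poll_seconds, log_dir, cluster_file):
    """Build the option list described in Table 11-1, with basic validation."""
    for label, value in (("-t", total), ("-n", rollover)):
        if not HHMM.match(value):
            raise ValueError(f"{label} expects hh:mm, got {value!r}")
    if poll_seconds <= 0:
        raise ValueError("-i expects a positive number of seconds")
    return ["-t", total, "-n", rollover, "-i", str(poll_seconds),
            "-d", log_dir, cluster_file]


# Assumption: the archiver is started with 'java -jar archiver.jar'.
command = ["java", "-jar", "archiver.jar"] + archiver_arguments(
    "12:00", "01:00", 900, "/var/log/p8archive", "/opt/dashboard/cluster.xml"
)
```

A scheduler entry that runs such a wrapper shortly before the morning peak, with `-t` set to the length of the business day, follows the best practice described above.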
FSM features a Web interface that authorized personnel use to monitor and
manage your system. It features a knowledge base of faults and possible
corrective actions. You can customize this knowledge base to offer
application-specific corrective actions. When a fault is encountered, support
personnel can quickly identify and correct the failing component. Figure 11-6
shows the main window of FSM.
The rapid fault isolation and corrective action database make FSM a must-have
for mission critical systems. FSM reduces manual efforts in the daily
administration of P8 Content Manager and helps to increase system availability.
FSM can help reduce your operational costs and help you meet your
Service Level Agreements more efficiently.
For more information about IBM FileNet System Monitor, go to:
http://www.ibm.com
Application Engine does not have a message log. Messages and exceptions are
written to the J2EE application server’s log.
Note: Trace logging all components can create enormous log files, even with
very little system activity. Performance might also be impacted. Turn on the
minimum trace logging necessary to collect the required information in relation
to the problem that you are investigating.
You can monitor the Application Engine activity through Content Engine’s API
trace logging. You can obtain additional information in 7.3.10, “Logging” on
page 177.
You can turn Content Engine trace logging on or off without recycling the server.
It can be enabled at several different levels, for example, to include all Content
Engine servers or only one server. It can be enabled for specific components or
all components. 11.4.2, “Trace logs” on page 284 contains additional details for
enabling trace logging.
P8 Administration → Process Engine Administration → Administrative
tools.
The Content Engine provides a log4j.xml.server file that must be edited to enable
logging, and it must be copied into a directory specified in the Content Engine’s
CLASSPATH.
To:
WebSphere\AppServer\profiles\default\installedApps\hqdemo1Node01Cell\FileNetEngine.ear\APP-INF\lib\log4j.xml
Or:
bea\user_projects\domains\mydomain\myserver\.wlnotdelete\Engine-wl\APP-INF\lib\log4j.xml
To:
WebSphere\AppServer\profiles\default\installedApps\hqdemo1Node01Cell\Workplace.ear\app_engine.war\WEB-INF\lib\log4j.properties
Or:
bea\user_projects\domains\mydomain\myserver\.wlnotdelete\extract\myserver_Workplace_Workplace\jarfiles\WEB-INF\lib\log4j.properties
Note: Improper log4j settings can cause system problems. Always test on a
development system before you implement logging in a production
environment.
Best practice: Rename system logs to a date format name to keep them for a
brief period, and then delete them. The log maintenance timing depends on
how busy your system is and how large the logs grow over time.
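The rename-then-delete routine can be scripted. The following Python sketch is one possible shape for it. The date-stamped naming scheme, the retention period, and the function names are assumptions for illustration, and the script must only be applied to logs that the server is not currently holding open.

```python
import time
from pathlib import Path


def archive_log(log_path, now=None):
    """Rename a log file to '<stem>-YYYYMMDD<suffix>' and return the new path."""
    log_path = Path(log_path)
    stamp = time.strftime("%Y%m%d", time.gmtime(now))  # now=None means current time
    archived = log_path.with_name(f"{log_path.stem}-{stamp}{log_path.suffix}")
    log_path.rename(archived)
    return archived


def prune_archives(directory, pattern, keep_days, now=None):
    """Delete archived logs whose modification time is older than keep_days."""
    now = time.time() if now is None else now
    cutoff = now - keep_days * 86400
    removed = []
    for candidate in Path(directory).glob(pattern):
        if candidate.stat().st_mtime < cutoff:
            candidate.unlink()
            removed.append(candidate.name)
    return sorted(removed)
```

For example, `archive_log("server.log")` followed by `prune_archives(".", "server-*.log", keep_days=7)` keeps about one week of renamed logs; adjust the retention to how quickly your logs grow.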
Best practice: If your system has a Process Engine or you use Content
Engine audit logs, maintain the logs weekly, which ensures
that the database tables are kept as small as possible. If left unchecked, audit
logs can become very large and impact system performance.
11.5 Reporting
To get reports about your system, you can use queries. Two distinct models are
available for performing queries against object store objects. In this section, we
describe both models and give examples of their use.
properties represented as SQL columns, but you have the full benefit of Content
Engine access controls, data conversions, and other internal optimizations.
There are a number of ways to perform queries using the object model:
The IBM FileNet Enterprise Manager provides a guided user interface called
Query Builder to assist in creating and running queries. Refer to 11.9.1,
“Search using IBM FileNet Enterprise Manager” on page 299, for more
information.
The Java API, the .NET API, and Content Engine Web Services all provide
programmatic query interfaces.
The Java API provides a Java Database Connectivity (JDBC) driver for use
with commercial reporting packages. Refer to 7.3.7, “Using the JDBC
interface for reporting” on page 173 for details.
Regardless of the method used, the SQL syntax of the queries is identical. The
syntax is generally a subset of SQL-92 (only the SELECT statement) with
several IBM FileNet P8-specific extensions.
Suppose you need to find the largest content in your repository. You plan to run
this query periodically and therefore are interested in finding only the content that
was created or updated since the query was last run. This is a sample query to
accomplish this task:
SELECT TOP 500 Creator, ContentSize, Id FROM Document d
WHERE ContentSize > 50000000.0 AND DateLastModified > 20071031T040000Z
ORDER BY ContentSize DESC
In this example, we searched for content larger than 50 MB that has been
modified after 4 A.M. in Coordinated Universal Time (UTC time) on the last day
of October 2007. We selected the Document properties Creator, ContentSize, and
Id, although we can use any other Document properties of interest. The results
are ordered in descending size. To prevent retrieving too many results, we
constrain the SELECT statement with a TOP modifier that limits the result set to
a maximum of 500.
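Because the query is meant to run periodically, the timestamp of the previous run has to be inserted each time. The Python sketch below generates the query text shown above from a stored last-run time. The function names are made up for this example, and the timestamp layout is copied from the sample query rather than from a format specification. Submitting the query through one of the programmatic interfaces is not shown.

```python
from datetime import datetime, timezone


def p8_timestamp(moment):
    """Format an aware datetime in the compact UTC form used in the sample query."""
    return moment.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")


def large_content_query(last_run, min_bytes=50_000_000, limit=500):
    """Return the sample query with the thresholds filled in."""
    return (
        f"SELECT TOP {limit} Creator, ContentSize, Id FROM Document d "
        f"WHERE ContentSize > {float(min_bytes)} "
        f"AND DateLastModified > {p8_timestamp(last_run)} "
        "ORDER BY ContentSize DESC"
    )


last_run = datetime(2007, 10, 31, 4, 0, 0, tzinfo=timezone.utc)
query = large_content_query(last_run)
```

Only numbers and a generated timestamp are interpolated here; if user-supplied text were ever added to the WHERE clause, it would need proper escaping.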
Best practice: Use the object model whenever possible for queries.
The P8 Content Manager classes and properties do not map one-to-one with
tables and columns in the underlying database. As an example, the Document
class objects are stored in the DocVersions database table. The Document
properties Creator, ContentSize, Id, and DateLastModified are mapped to
DocVersions columns creator, content_size, object_id, and
modify_date, respectively.
As you deploy and begin using your application, monitor and record these server
statistics:
Disk usage
CPU and memory utilization
Database usage
Your database administrators can provide database details. The most important
information is actual database size, but it is also good to know if specific tables or
data fields are growing rapidly.
P8 Content Manager systems tend to grow over time. Content is added daily,
additional applications are developed, and users are added. By monitoring and
recording these statistics, you can measure how your system is performing
against the initial model. More importantly, you can track how quickly you are
using resources and determine the impact to the system when an increase in
system usage is planned.
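Recorded statistics can be turned into a rough forecast. The following Python sketch estimates how many days remain before a storage area is full, using the average growth between the first and the last sample. The linear growth model, the sample figures, and the function name are assumptions for illustration; real systems often grow unevenly.

```python
def days_until_full(samples, capacity_gb):
    """Estimate days until capacity is reached from (day, used_gb) samples.

    Uses the average growth between the first and last sample and returns
    None when usage is flat or shrinking.
    """
    (first_day, first_used), (last_day, last_used) = samples[0], samples[-1]
    elapsed = last_day - first_day
    if elapsed <= 0:
        raise ValueError("samples must span at least one day")
    growth_per_day = (last_used - first_used) / elapsed
    if growth_per_day <= 0:
        return None
    return (capacity_gb - last_used) / growth_per_day


# Hypothetical measurements: (day number, used space in GB).
usage = [(0, 400.0), (30, 430.0), (60, 460.0)]
remaining = days_until_full(usage, capacity_gb=1000.0)
```

Repeating the calculation with the growth rate expected after a planned increase in usage shows how much earlier additional capacity is needed.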
Figure 11-8 on page 291 shows the IBM FileNet Enterprise Manager main
window. IBM FileNet Enterprise Manager allows you to create object stores,
assign security to all P8 Content Manager items, define and run searches, and
turn trace logging on or off. All P8 Content Manager administrative functions are
performed via IBM FileNet Enterprise Manager.
Figure 11-9 shows the Server Cache setup at the IBM FileNet P8 domain level
properties page. The Server Cache setup is also shown on the Properties page
for server instance, virtual server, and site objects.
Figure 11-9 IBM FileNet Enterprise Manager domain level Properties page
Figure 11-10 on page 293 shows a server level Properties page. Note that the
server level has fewer tabs; it includes only the values that pertain to the specific
server.
Figure 11-10 IBM FileNet Enterprise Manager server level Properties page
Figure 11-11 on page 294 shows the main IBM FileNet Enterprise Manager
window.
In our example, shown in Figure 11-11, it does not matter where we enable trace
logging, because we only have one server. P8 Content Manager systems can
have literally hundreds of servers. Turn on the minimum logging on the fewest
servers necessary to investigate a problem.
Important information: When using trace logging, enable it for the fewest
possible number of servers. Depending on what logging is enabled and how
busy your system is, all of your servers can produce large logs if trace logging
is enabled at the site or domain level. Performance can also be impacted on
busy systems if all trace logging is enabled.
Double-click trace logging or right-click the virtual server or domain level and
select properties to open the Properties page. The Properties page is shown in
Figure 11-12 on page 295.
Figure 11-12 IBM FileNet Enterprise Manager Properties page
In the Properties page, which is shown in Figure 11-12, you select the
subsystems to monitor and the level of detail to log. If the property page is from
the server or virtual server level, you must select the Override inherited settings
check box.
11.8 Auditing
P8 Content Manager provides audit logging to monitor event activity of objects.
Audit data is stored in a table in the object store’s database. Auditing is controlled
in the IBM FileNet Enterprise Manager.
Best practice: When you enable audit logging, try to enable it for the fewest
objects and for the shortest amount of time that you reasonably can.
In the General tab, select Auditing Enabled? (see Figure 11-14 on
page 297).
Figure 11-14 Object Store General Tab
Select a Content Access Recording Level, and then click OK. Content Access
Recording levels are:
None: Specifies that updates to the DateContentLastAccessed property are
disabled (which is the default behavior). The value for this constant is -1.
Immediate: Specifies that the DateContentLastAccessed property is updated
as soon as content is accessed. The value for this constant is 0.
Hourly: Specifies that the DateContentLastAccessed property is updated
only when an hour (3600 seconds) has elapsed since the last update of the
DateContentLastAccessed property. Any access of content within an hour of
the last update is not recorded.
Daily: Specifies that the DateContentLastAccessed property is updated only
when a day (86400 seconds) has elapsed since the last update of the
DateContentLastAccessed property. Any access of content within a day of
the last update is not recorded.
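The effect of the four levels can be expressed as a small decision function. This Python sketch models the behavior described above and is not Content Engine code. It assumes that the first access of content is always recorded when recording is enabled, and that an access exactly at the interval boundary counts as elapsed; both points are assumptions.

```python
NONE, IMMEDIATE = -1, 0        # constant values given in the text
HOURLY, DAILY = 3600, 86400    # interval lengths in seconds given in the text


def should_record_access(level, last_recorded, access_time):
    """Decide whether DateContentLastAccessed would be updated for an access.

    Times are seconds since an arbitrary epoch; last_recorded is None when
    the property has never been set.
    """
    if level == NONE:
        return False
    if level == IMMEDIATE or last_recorded is None:
        return True
    return access_time - last_recorded >= level
```

For example, with the Hourly level an access 30 minutes after the last recorded one leaves the property unchanged, which is why coarser levels cause fewer updates on frequently read content.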
The searches discussed in this section are IBM FileNet Enterprise Manager
searches for maintenance purposes only. The P8 Content Manager application
layer has similar stored search capability. The application searches are stored in
a different location. While similar, IBM FileNet Enterprise Manager searches
cannot be directly promoted for application use. Refer to 8.3, “P8 Content
Manager searches” on page 201 for an additional search discussion.
IBM FileNet Enterprise Manager provides a Query Builder to create, save, and
run search functions. Predefined search templates are provided with each P8
Content Manager installation. These templates are provided to assist you with
managing the size of your audit log and for managing entries in the QueueItem
table.
Searches can:
Find objects using property values as search criteria
Create, save, and run simple searches
Create and save search templates that prompt for criteria when launched
Create, save, and run SQL queries
Searches can perform bulk operations, such as the operations in the following
list, on content that meets the search criteria:
Delete objects
Add objects to an export manifest
Undo document checkout
Perform life cycle actions, such as set exception, clear exception, promote,
demote, and reset
Perform containment actions, such as file into a folder and unfile from a folder
Run VBScripts or JScripts
Edit security permissions
11.9.1 Search using IBM FileNet Enterprise Manager
In this section, we demonstrate a simple search using IBM FileNet Enterprise
Manager. We build a search for all content created by Joe User on or after
11 September 2007.
Launch IBM FileNet Enterprise Manager (refer to 11.7, “IBM FileNet Enterprise
Manager” on page 289). Expand an object store tree view and click Saved
Searches.
Figure 11-15 on page 300 shows the supplied search templates. Templates are
represented by a binocular icon with a shaded bottom and right edge.
To create a new search, right-click Search Results and click New. This starts
the Query Builder application.
Figure 11-16 Query Builder
Query Builder provides a point and click interface to create searches. Query
Builder allows you to perform bulk mode actions against the result set. You can
also use VBScripts or JScripts to build queries.
In our example shown in Figure 11-16, we created a search for all items that are
created by Joe User on or after 11 September 2007.
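The same criteria can also be expressed as a SQL query, which Query Builder lets you create, save, and run. The following is a minimal sketch, not the product's generated query: it assumes that the Document class exposes the Creator, DateCreated, and DocumentTitle properties, and that timestamps can be written as compact UTC literals. The helper name build_created_by_query is hypothetical. Verify the exact syntax against the product documentation before use.

```python
from datetime import datetime, timezone


def build_created_by_query(creator: str, created_on_or_after: datetime) -> str:
    """Build a query string for documents created by one user on or after a date."""
    # Double any single quotes, the usual SQL string-literal escaping convention.
    safe_creator = creator.replace("'", "''")
    # Assumption: the query language accepts compact ISO 8601 UTC timestamps.
    stamp = created_on_or_after.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return (
        "SELECT Id, DocumentTitle, Creator, DateCreated "
        "FROM Document "
        f"WHERE Creator = '{safe_creator}' AND DateCreated >= {stamp}"
    )


# "Joe User" stands in for whatever short name your directory assigns to the user.
query = build_created_by_query("Joe User", datetime(2007, 9, 11, tzinfo=timezone.utc))
print(query)
```

Building the string in one place makes it easier to review the criteria before pasting the query into a saved search.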
When you have completed all search criteria, click OK to run the query. If the
search that you created might be needed again, you must select File → Save
before running the query. After clicking OK, the query runs and a results window
appears indicating the progress and when the query completes.
When the query completes, click OK in the Query Status window. Content that
matches your search appears in the IBM FileNet Enterprise Manager Search
Results window.
You can right-click the items to view their properties in order to validate that they
meet your criteria.
Figure 11-18 Content Properties
To perform bulk operations, your IBM FileNet Enterprise Manager logon ID must
have sufficient security privileges to perform the desired actions.
Figure 11-19 Query Builder Security tab page with the selected modify options
6. When we click OK on the Query Builder Security tab page, our query status
returns, but this time, it indicates that it has successfully updated two items.
The same content items appear in the query results window. If we check the
properties, we now see that AUserGroup with our selected permissions has
been added to the content.
Figure 11-20 shows that the modified properties now include AUserGroup.
Figure 11-21 on page 306 shows additional bulk operations that can be
performed on a query.
To update an object store with new users or groups, use IBM FileNet Enterprise
Manager’s Security Script Wizard to run the OSecurityUpdate.xml script.
Note: While you might find other ways to apply security, using the Security
Script Wizard is the only way to ensure that it is set correctly in the object store.
Failure to use the wizard can cause problems when users attempt to access or
create content.
To update an object store with new users or groups, follow these steps:
1. In the IBM FileNet Enterprise Manager, right-click the object store node,
choose All Tasks and run the Security Script Wizard.
2. When prompted to select an XML security script information file, browse to
and select OSecurityUpdate.xml. It is installed in the installation base
directory:
FileNet\ContentEngine\Scripts\Component Library\
3. When prompted to define security roles, you see two roles under Security
Role: Object Store Administrators and Object Store Users.
Use Add to add security participants for the selected role. The Select Users
and Groups dialog box opens. Click OK when you have added the
participants for that particular role. See Figure 11-22, which shows the
Security Script Wizard.
4. Click Finish when you are done. The wizard generates a prompt informing
you where its log file will be located. The wizard proceeds to apply the
security permissions to the objects in the object store. This process can take
time, depending on the number of objects that need to be updated. The
wizard informs you when the process of applying security is complete.
5. If you added groups to only one Security Role, a notice appears. Simply click
OK, because no current Security Roles will be deleted; only the new roles will
be added by the wizard. See Figure 11-23 on page 308.
Examine the new permissions on IBM FileNet Enterprise Manager’s root folder.
Depending on how you have configured the inheritance from the root folder and
all generations of child folders, these new permissions might not yet have been
inherited. You need to configure the folder security parentage as appropriate.
Chapter 9, “Business continuity” on page 213, discusses types of events that can
require a system restoration. It focuses on building a highly available
environment with protection against catastrophic system or site loss to ensure
that your system is always available. If you are responsible for system recovery,
familiarize yourself with business continuity methods whether your budget
permits a hot site or not. You might be able to use some business continuity
methods to reduce backup and restore times in your data center. If your budget
permits a hot site, you still need a backup and restore mechanism to recover
from human errors, such as deleted or modified files. A mirrored hot site mirrors
all activity; it lacks a means to differentiate an intentional or accidental change.
Best practice: Store your backup media off-site away from your primary
servers. You must make sure that the media is moved to the off-site location
as soon as possible after the backup completes.
The longer your backup media is stored near your primary servers, the greater
the chance that a catastrophic event can destroy both your servers and your
ability to restore your systems to operational condition.
P8 Content Manager does not provide backup software. You must use backup
utilities that are supplied with your operating system or database or by third
parties.
Note: If your system uses Fixed File Storage areas for compliance or
Image Manager applications, you need a normal file storage area for
temporary staging of content. If your application performs content
reservations or uses annotations, that metadata is stored in the
“temporary” file store. This file storage area must be included in your
backup and recovery strategy.
A backup window is the amount of time that your system can be down for
backup. If your system has users running from 6:00 A.M. to 11:00 P.M., you have
a seven-hour backup window. A best practice is to allot time before and after
users require the system to accommodate late workers or a backup that runs
longer than usual. We recommend allotting one-half hour to one hour before and
after users expect the system to be operational. In this example, allotting one
hour before and after gives you a total backup window of five hours in which to
stop the servers, perform the backup, and start the servers.
The amount of time required for the longest component’s backup must fit within
your backup window. Your content storage area usually consumes the greatest
amount of backup time.
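The window arithmetic above can be sketched in a few lines. This is an illustration only: the function names and the per-component duration estimates are hypothetical, and the fit check assumes that components are backed up in parallel, so that the longest one governs.

```python
from datetime import timedelta


def backup_window(users_start_hour, users_end_hour, buffer_hours):
    """Usable daily backup window, given user hours on a 24-hour clock."""
    in_use = users_end_hour - users_start_hour      # hours the users need the system
    raw_window = 24 - in_use                        # hours the system may be down
    usable = raw_window - 2 * buffer_hours          # buffer before and after users
    return timedelta(hours=max(usable, 0))


def fits_window(component_backup_hours, window):
    """True when the longest component backup fits within the window."""
    longest = max(component_backup_hours.values())
    return timedelta(hours=longest) <= window


# Users from 6:00 A.M. to 11:00 P.M. with one-hour buffers, as in the example.
window = backup_window(6, 23, 1)
# Hypothetical duration estimates, in hours.
estimates = {"database": 1.5, "content storage area": 4.0, "configuration files": 0.25}
print(window, fits_window(estimates, window))
```

Remember that the window must also cover the time needed to stop and start the servers, so leave headroom beyond the raw backup duration.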
There are a few steps that you can take to decrease backup time to fit your
window:
Use a combination of full and incremental backups. Incremental backups
simply capture information that has changed since the last backup. This can
greatly reduce time spent backing up data. During a restore, you must restore
from your last full backup and apply the incremental backups before starting
your system, which increases the amount of time necessary to restore your
system. A best practice is to perform full backups weekly when a larger
backup window is available and perform incremental backups during the
week when your backup window is smaller.
If you use tape as your backup media, a faster alternative is to back up your
data to disk files. When the backup to disk completes, transfer the backup
files to tape, which allows your P8 Content Manager system to run while the
transfer to tape occurs.
Section 11.11.3, “Online backup” on page 311 discusses potential methods to
run online backups. Those techniques can safely be used for offline backups.
Simply stop your P8 Content Manager servers, run the copy and restart your
system. This approach provides the fastest possible offline backup.
If your backups cannot be completed within your backup window, you need to
look at the online backup methods discussed next.
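When you combine full and incremental backups, the restore order matters: the last full backup first, then every later incremental backup in sequence. The following sketch shows one way to select that chain. The Backup record and the restore_chain function are hypothetical names, and the dates are sample data.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Backup:
    taken_on: date
    kind: str  # "full" or "incremental"


def restore_chain(backups, restore_to):
    """Return the backups to restore, in order, to reach the restore_to date."""
    candidates = sorted(
        (b for b in backups if b.taken_on <= restore_to),
        key=lambda b: b.taken_on,
    )
    fulls = [b for b in candidates if b.kind == "full"]
    if not fulls:
        raise ValueError("no full backup available on or before the restore date")
    last_full = fulls[-1]
    # Only incrementals taken after the chosen full backup are needed.
    incrementals = [
        b for b in candidates
        if b.kind == "incremental" and b.taken_on > last_full.taken_on
    ]
    return [last_full] + incrementals


backups = [
    Backup(date(2007, 9, 9), "full"),
    Backup(date(2007, 9, 10), "incremental"),
    Backup(date(2007, 9, 11), "incremental"),
    Backup(date(2007, 9, 12), "incremental"),
    Backup(date(2007, 9, 16), "full"),
]
chain = restore_chain(backups, date(2007, 9, 12))
print([(b.taken_on.isoformat(), b.kind) for b in chain])
```

The length of the chain is a rough indicator of restore time, which is the trade-off against the shorter nightly backup.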
11.11.3 Online backup
You need to investigate online backup alternatives if your system must run 24x7,
your backup time exceeds your backup window, or your Service Level
Agreements (SLAs) require a higher frequency than a nightly backup.
There are options on the market that can help resolve this situation. Disk,
volume, or storage area network (SAN) mirroring techniques are available that
permit time slice backups or snapshots of your data. These options typically work
similarly to the disk mirroring that has been used for many years. Where they
differ is that they mirror several disks or volumes in groups and permit adding
time slice details. Restoring involves copying the mirror back to the last good
time slice. Several techniques offer offline tape backup of the mirror and time
slice copies. Ideally, the utilities provide a means of capturing consistent time
slices across all disk drives and servers used by your application.
Section 9.4.1, “Disaster recovery concepts” on page 230 discusses methods that
use these techniques to copy your data to a remote facility. The same techniques
can provide copies in your primary data center. Most storage vendors offer local
and remote mirroring capabilities for this copying. It might be called a time slice,
snapshot, or flash backup capability. Most storage vendors also provide tape
backup solutions to move the data off-site.
Check to see if your database vendor has any special requirements for using
these techniques for system backups; most vendors do have special
requirements. Consider using an online database backup for additional safety.
Note: At the time of writing this book, the P8 Content Manager engineering
group has not tested or supported these techniques. Many clients are
currently using these techniques for online backups. The suggestions
provided here are for your reference.
If you need to perform online backups, you must perform due diligence and
validate that the techniques used can create a restorable backup data set.
Your P8 Content Manager system needs to be down during the restore process.
If you used incremental backups, restore all incremental backups before starting
your P8 Content Manager system. After all restores have completed, start your
P8 Content Manager system normally. Check for individual component errors.
Refer to 12.3.1, “Quick checks” on page 323 for system test procedures.
After a system restore, if your P8 Content Manager system uses file stores,
perform a consistency check using the Consistency Check utility.
To run the Consistency Check utility, start IBM FileNet Enterprise Manager and
select an object store. A set of object store tasks displays in the right pane for the
selected object store. (See Figure 11-24 on page 313 in which an object store is
selected).
Figure 11-24 Enterprise Manager with object store selected
Click Start Consistency Check. The consistency check progress window shows
status, start time, approximate completion time, and when complete.
Best practice: Consistency checks can run for a very long time depending on
the amount of content in your system. Limit the amount of time that the
consistency check runs. Set the check to start a few hours before the major
event that requires its use.
where you need to restore your P8 Content Manager system and external
systems cannot be restored to the same point in time. In those cases, you need a
means to validate that content references in the external systems are on the P8
Content Manager system.
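One simple way to validate content references after a restore is to compare two identifier lists: the references held by the external system and the identifiers present in the restored repository. The sketch below assumes that you can export both lists; how you obtain them depends on your systems, and the identifiers shown are made up.

```python
def missing_references(external_refs, repository_ids):
    """Return external references that have no matching object in the repository."""
    present = set(repository_ids)
    return sorted(set(external_refs) - present)


# Hypothetical identifiers exported from each system after the restore.
external = ["doc-001", "doc-002", "doc-003", "doc-002"]
restored = ["doc-001", "doc-003"]
missing = missing_references(external, restored)
print(missing)
```

Any identifier in the result points to content that the external system expects but that the restored repository does not hold.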
Note 1: Log maintenance must include all operating system, application server,
and P8 Content Manager product error and trace log files. Log maintenance
must also include the Content Engine audit log and the Process Engine log
database tables, if used. All log files can grow quite large over time; on busy
systems, you might need to increase the maintenance frequency. Low use
systems might be able to reduce the frequency.
Note 3: IBM FileNet Fix Packs are produced at regular intervals. Fix Packs, as
well as the latest documentation, are available at:
http://www.ibm.com
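For the log maintenance described in Note 1, a small report of oversized log files can help you decide whether to increase the maintenance frequency. The following is a sketch under stated assumptions: the `*.log` pattern, the size threshold, and the function name oversized_logs are illustrative, and the demonstration writes to a temporary directory rather than to real product log locations.

```python
import tempfile
from pathlib import Path


def oversized_logs(log_dir, max_bytes, pattern="*.log"):
    """Return (path, size) pairs for log files larger than max_bytes, largest first."""
    found = []
    for path in Path(log_dir).rglob(pattern):
        if path.is_file():
            size = path.stat().st_size
            if size > max_bytes:
                found.append((path, size))
    return sorted(found, key=lambda item: item[1], reverse=True)


# Demonstration against a temporary directory with two sample files.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "small.log").write_bytes(b"x" * 10)
    (Path(tmp) / "big.log").write_bytes(b"x" * 5000)
    report = [(p.name, size) for p, size in oversized_logs(tmp, max_bytes=1000)]
print(report)
```

Database-resident logs, such as the audit log tables, are outside the scope of this file-based sketch.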
Store your backup media off-site. For reference, go to 11.11, “System backup
and restore” on page 308.
Allot free time before and after the backup. For reference, go to 11.11.2,
“Offline backup” on page 310.
If using incremental backups, perform full backups weekly. For reference, go
to 11.11.2, “Offline backup” on page 310.
When running the Consistency Check utility, configure it to start checking a
few hours before the major event. For reference, go to 11.11.5, “Consistency
Check utility” on page 312.
Automated tools can greatly reduce troubleshooting time, because they can alert
you to a major component failure or problem, such as a disk or file system full.
IBM FileNet Enterprise Manager and Workplace are also tools (applications) that
help you identify problems. Having access to the user applications will be very
helpful in determining what parts of the application are working.
Note: When enabling trace logging for troubleshooting, only enable the
subsystems that are necessary to diagnose the issue. Unconditionally
enabling all levels of all subsystems will have a negative impact on
performance.
[Figure: basic P8 Content Manager architecture, showing human interaction, the Application Engine, the directory server, the Content Engine, the object store, and the database server]
In the basic P8 Content Manager application, the user points their browser to the
Application Engine running on the Java 2 Platform, Enterprise Edition (J2EE)
application server:
1. The user receives a logon window and enters the user ID and password.
2. The Application Engine passes the user details to the LDAP server. If correct,
the user’s credentials are obtained by the Application Engine.
At this point, the user can view or create new content. The benefit of logging on
this way is that as your system grows you can quickly and easily increase power
by scaling vertically (adding more power to your server) or horizontally (adding
more servers). This approach allows your P8 Content Manager system to grow
and support hundreds of applications with thousands of users working on an
enormous amount of content. The N-Tier J2EE architecture (server-client) is
what allows us to scale from small systems to very large enterprise systems with
minimal effort.
This might be oversimplified, but as you approach a problem and think about it in
client/server terms, finding the failing client/server section allows you to quickly
rule out what is working and focus on the component that is not working.
We will look at common problems and break them down to a client/server style
approach in the following chapters.
IBM FileNet Enterprise Manager is also a great tool for problem isolation, and it is
similar to Workplace or Workplace XT. In this section, we offer tips to assist with
the problem isolation process.
b. To check if the Process Engine server is running, use its ping page by
pointing your browser to:
http://<PE host machine>:32776/IOR/ping
Example:
http://hqdemo1:32776/IOR/ping
Your browser will display a window similar to Figure 12-3.
3. Log on to FileNet Enterprise Manager. If you are able to log on and view
configuration details, the Content Engine is running.
4. Log on to Workplace. Test your system with it. Depending on your
applications, browse folders and view configuration details. If Workplace
works fine, your Application Engine is running.
If the Process Engine is installed, Workplace can run the Process
Configuration Console to determine if your Process Engine is running. This
assumes your logon ID is a member of the Process Engine Administrators
group.
5. If you have access to your users’ application, run it from your workstation.
Remember, you probably have administrative privileges. If you can, test with
user privileges to quickly validate if there are invalid security settings.
If these tests work, your primary P8 Content Manager server components are
functioning. If one of these tests fails, you have a good place to start
troubleshooting.
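The quick checks can be partly automated. The sketch below builds the Process Engine ping URL shown earlier and runs a list of checks in order, reporting the first one that fails. Treating an HTTP 200 response as "running" is an assumption, and the helper names (pe_ping_url, http_ok, first_failure) are hypothetical. The checks that require an interactive logon are not covered.

```python
from urllib.request import urlopen

PE_PING_PORT = 32776  # port used in the ping URL shown in this section


def pe_ping_url(host):
    """Build the Process Engine ping page URL for a host."""
    return f"http://{host}:{PE_PING_PORT}/IOR/ping"


def http_ok(url, timeout=5.0, opener=urlopen):
    """True when the URL answers with HTTP 200; any connection error counts as down."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:  # covers URLError, HTTPError, refused connections, and timeouts
        return False


def first_failure(checks):
    """Run (name, check) pairs in order; return the first failing name, or None."""
    for name, check in checks:
        if not check():
            return name
    return None


# Example wiring (not executed here because it needs a reachable server):
# failing = first_failure([
#     ("Process Engine ping", lambda: http_ok(pe_ping_url("hqdemo1"))),
# ])
```

Ordering the checks from the back end toward the user interface mirrors the client/server style of isolation used in this chapter.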
In the next sections, we look at common problems that users report, and we
diagnose the problems in a client/server fashion.
If a user gets the logon window but cannot log on to the application:
Verify that the user’s credentials on the LDAP system are correct:
– Is the user’s ID locked?
– Does the user have the correct group memberships for the system that the
user is attempting to access?
– Was the Content Engine security recently added or changed? Did you use
the Security Script Wizard (see 11.10, “Adding security” on page 306)?
If the logon appears to work, but the application does not appear or does not
work correctly:
Did the user receive any new applications or system patches?
A new non-related application might have updated files on the operating
system, loaded a different Java runtime version, or altered system settings.
Is the user accessing a different part of the application than the parts that
work?
Is the user using a piece of the application that requires a server or external
system that other parts do not?
Are any special permissions required to access this portion of the
application?
Were there any recent changes to the application that might impact only this
portion?
Was the Content Engine security recently added or changed? Did you use the
Security Script Wizard (see 11.10, “Adding security” on page 306)?
12.3.3 Many users report an issue
We assume that the checks in 12.3.1, “Quick checks” on page 323 have failed,
and many users are reporting issues.
– Administrative functions (performed within IBM FileNet Enterprise
Manager and also through APIs) must be performed during non-peak
hours, because they can consume a lot of CPU resource:
• Metadata authoring, for example, creating and updating classes and
properties.
• Administrative object updates, for example, creating new object stores,
IBM FileNet P8 Domain-level objects, or other Global Configuration
Database (GCD) objects.
Application considerations:
– Applications that use inefficient queries or folder schemes can cause
performance problems. We describe application design considerations in
other chapters of this book. We also have two technical notices on the
Web site that discuss ways to efficiently use queries and folders:
• The FileNet Content Engine Query Performance Optimization
Guidelines Technical Notice describes query details.
• The FileNet P8 Recommendations for Handling Large Numbers of
Folders and Objects Technical Notice describes folder use.
You can read the Technical Notices at:
http://www.ibm.com
Select Support & downloads → Documentation → Choose support
type → Information Management → Choose a product → FileNet
Content Manager → Go → Learn → Product documentation →
FileNet P8 Platform.
Or use this hot link:
http://www.ibm.com/support/docview.wss?rs=3273&uid=swg27010422
The technical notices are located in the section FileNet P8 Platform
Technical Notices.
To submit PMRs via the Web site, the Site Technical Contact at your company
must authorize you to submit PMRs electronically to IBM.
Note: Only use electronic PMRs for minor problems. If your production system
is down, call IBM Support.
On more difficult problems, you might also need to have the following items:
Application architecture diagram that details how all application components
are designed to work
Network topology diagram, including servers, routers, firewalls, and network
load balancers
If your problem is performance-related, performance archive files
Table 12-1 Problem severity descriptions and examples
Severity level Further definitions Examples
When speaking with a software support specialist, also mention the following
items if they apply to your situation:
You are under business deadline pressure.
Your availability, or when you will be able to work with IBM Software Support.
You can be reached at more than one phone number.
You can designate a knowledgeable alternate contact with whom the IBM
support representative can speak.
You have other open problems (PMRs) with IBM regarding this service
request.
You are participating in an early support program.
You have researched this situation prior to calling IBM and have detailed
information or documentation to provide for the problem.
After we start forcing errors by stopping the database, and then we try to use it,
we get the error messages shown in Example 12-2.
com.filenet.api.exception.EngineRuntimeException: DB_ERROR: An error
occurred accessing the database. ErrorCode: 0, Message: 'Connection
reset by peer: socket write error'
...
→
com.filenet.engine.dbpersist.DBMSSQLContext.throwEngineException(DBMSSQ
LContext.java:186)
...
Caused by: com.ibm.websphere.ce.cm.StaleConnectionException: Connection
reset by peer: socket write error
→ at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
Method)
...
2007-09-17 10:52:01,203 INFO [ImportAgentDispatcher_EVTFS_#6] -
Retrying Connection to:FNGCDDS Caused by:The TCP/IP connection to the
host has failed. java.net.ConnectException: Connection refused:
connectDSRA0010E: SQL State = 08S01, Error Code = 0DSRA0010E: SQL State
= 08S01, Error Code = 0
2007-09-17 10:52:01,437 INFO [CacheUpdateDispatcher_EVTFS_#3] -
Retrying Connection to:FNGCDDS Caused by:The TCP/IP connection to the
host has failed. java.net.ConnectException: Connection refused:
connectDSRA0010E: SQL State = 08S01, Error Code = 0DSRA0010E: SQL State
= 08S01, Error Code = 0
The exception log for this error contains over 200 lines, and the logs can grow
extremely large even during normal operation. The log clearly indicates that the
Content Engine cannot communicate with the database (see the text highlighted
in bold in Example 12-2 on page 334). After we started the database, everything
returned to normal operation.
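Because an exception log of this size is tedious to read by hand, a small filter can surface the lines that point to a database connectivity problem. The sketch below searches for marker strings taken from Example 12-2. The marker list is illustrative rather than exhaustive, and the function name database_error_lines is hypothetical.

```python
import re

# Marker strings that appear in Example 12-2 when the database is unreachable.
DB_FAILURE_MARKERS = [
    r"DB_ERROR",
    r"StaleConnectionException",
    r"Connection refused",
    r"Connection reset by peer",
]
_PATTERN = re.compile("|".join(DB_FAILURE_MARKERS))


def database_error_lines(lines):
    """Yield (line_number, text) for lines that contain a database failure marker."""
    for number, text in enumerate(lines, start=1):
        if _PATTERN.search(text):
            yield number, text.rstrip()


sample = [
    "com.filenet.api.exception.EngineRuntimeException: DB_ERROR: An error",
    "occurred accessing the database. ErrorCode: 0, Message: 'Connection",
    "reset by peer: socket write error'",
    "Caused by: com.ibm.websphere.ce.cm.StaleConnectionException: Connection",
    "host has failed. java.net.ConnectException: Connection refused:",
]
hits = list(database_error_lines(sample))
for number, text in hits:
    print(number, text)
```

Note that a message wrapped across two lines, as in the second and third sample lines, is not matched by a line-by-line search.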
Repository Design
These major components form the building blocks of ECM solution design.
Solution building blocks are application tools that ECM solution designers can
specify and combine to build out each of the four components of an ECM
solution: ingestion, storage, process, and presentation. The IBM FileNet suite of
products contains applications and tools that offer designers a wide range of
features and functions for the design of each of the major components of an ECM
implementation.
Figure 13-2 on page 339 shows several of the IBM tools that are available to
ECM designers and the place for these blocks within the four major design
phases of an ECM solution.
[Figure: ECM solution building blocks, with columns for Content Ingestion, content and workflow management, and Presentation and Delivery. Legible block labels: Bind documents together, Applications, Browsing, SMTP Send.]
Document type    IBM FileNet P8 application tool    Comments and features
The following storage design visual aid will help designers build a repository
storage scheme that will store content with the correct class, with appropriate
metadata properties, on the right storage device, and with the proper security
settings.
Class hierarchy
In an IBM FileNet P8 repository, content is stored according to class. As
described in Chapter 5, “Basic repository design” on page 85, document classes
are object-oriented containers that hold content, properties that describe the
content, security descriptors, and pointers to storage areas on disk where the
content will be written.
able to construct a search for any document in the
Finance division, in the Accounting department, and in the
Accounts Receivable group.
Document Class The document class level describes individual document
types. The properties at this level are unique to
individual document types. A “loan” document,
for example, will have a loan number, while a “contract”
document will have a contract number and a contract
date.
A complex repository can have more than three levels of class hierarchy, but
regardless of the number of levels, the spreadsheet arranges the classes in a
hierarchy that makes the inheritance pattern clear. Figure 13-3 shows the class
hierarchy spreadsheet.
Document Class Hierarchy    Class Storage Policy    Class Security Policy
BaseDocumentClass
Document ID
Division
Department
Group
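The inheritance pattern that the spreadsheet makes visible can be modeled in a few lines. This sketch is a simplification for design discussions, not the Content Engine object model: the DocClass type is hypothetical, only two levels are shown, and the property names are taken from the examples in this section.

```python
class DocClass:
    """A document class that inherits property definitions from its parent."""

    def __init__(self, name, own_properties, parent=None):
        self.name = name
        self.own_properties = list(own_properties)
        self.parent = parent

    def all_properties(self):
        """Inherited properties first, then the properties defined at this level."""
        inherited = self.parent.all_properties() if self.parent else []
        return inherited + self.own_properties


base = DocClass("BaseDocumentClass", ["Document ID", "Division", "Department", "Group"])
loan = DocClass("Loan", ["Loan Number"], parent=base)
contract = DocClass("Contract", ["Contract Number", "Contract Date"], parent=base)

print(loan.all_properties())
print(contract.all_properties())
```

Because every subclass carries the base properties, a search on Division, Department, and Group works across all document types, while the type-specific properties remain available only where they apply.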
The spreadsheet also shows two additional columns, “Class Storage Policy”
and “Class Security Policy”. These columns define two other aspects of
document class design.
The storage policy column labels need to refer to a storage policy description as
shown in Figure 13-4.
Class: Legal Contracts
Groups: All Employees, Contract Clerks, Contract Officers, CM Administrator
Permissions
View Document x x x x
Add Document x x x
Modify Properties/Annotations x x x
Modify Security x x
Change Document Status x x x
Publish Document x x x
Delete x
Create new Classes & SubClasses x
Table 13-2 on page 346 describes IBM FileNet P8 application tools for actively
managing content.
IBM FileNet P8 application tool Features and capability list
Published documents:
Can continue to exist after the source document has been deleted
Can be automatically deleted when the source document is deleted
Are not changed when their source documents are changed
Can exist in a different folder than the source documents
Can have a different file format than the source documents; for example, the source document might be a Word document, while the publication document might be an HTML document. Publishing options are defined by individual templates
Can originate as Microsoft Office (for example, Word, Excel, and PowerPoint®) documents and be rendered to PDF or HTML
IBM FileNet P8 presentation tool Features and capability list
Annotations:
Are independently securable. Default security is provided by the class and by the annotated object. Can optionally have a security policy assigned to it
Can have subclasses
Can have zero or more associated content elements, and the content does not need to have the same format as its annotated object
Are uniquely associated with a single document version and thus are not versioned when a document version is updated
Can be modified and deleted independently of the annotated object
Can be searched for and retrieved with an ad hoc query
Can subscribe to server-side events that launch when an action (such as creating an annotation) occurs
Can be audited
Table 13-4 E-mail design patterns
Design pattern: EmailOutSMTP
Description: Sending e-mail from an event in CM/Business Process Management (BPM)
Challenges:
– Legacy systems do not contain the recipient’s e-mail addresses for outbound e-mails
– No spell checker available
– No customized text
Design pattern: RelateEmail
Description: Storing e-mails either as individual files (body and attachments) or as one file
Challenges:
– Linkage between body and attachments for searches
See more details about relating documents in “Relate and bind document design patterns” on page 375.
Design pattern: RestrictEmails
Description: How are the e-mails secured
Challenges:
– By recipients from the messaging system (users/groups)
– By technical users/groups
– Functional mailboxes
Table 13-5 E-mail management use cases and applicable design patterns
Use case    Value propositions    Applicable design patterns
In this section, we assume that the document does not contain pages and the
document is not a compound document. Support for compound documents is
very new, and therefore, we do not address compound documents in this book.
All design patterns share the same challenges. The repository design must be
completed before the ingestion method can be used.
Design pattern: ElecDocHighVolOnce
Description: Documents are available on a file system for a single ingestion.
Challenges:
– The existing file system is not structured ideally
– The folders-versus-searches question
– Usability of accessing application after ingestion
– Source files are stored multiple times in various versions
Design pattern: ElecDocLowManually
Description: Documents are ingested and typically indexed by a user.
Challenges:
– Due to the manual step, this process can only be done for low volume ingestion
– User has to authenticate at the ingestion application (usability)
– Office Integration for non-Microsoft Office products
– Distribution of Office Integration if used
– User has to provide meaningful index information
– Lack of drag and drop support
– Education of users
While Workplace, Office Integration, and WebDav are mainly suited for low
volume ingestion with human interaction for the indexing part, Records Crawler
and other alternative tools can be leveraged for high volume ingestion. You can
use the tools for high volume ingestion for image ingestion, as well, which we
discuss in “Image (scanned paper and fax) ingestion design patterns” on
page 361.
Table 13-8 on page 359 summarizes the design patterns for content ingestion by
Business Process Management and the design challenges.
Table 13-8 BPM ingests content design patterns
Design pattern: OperationsBasedDoc
Description: All document-based activities in BPM are delegated to P8 Content Manager using a library of functions that are called operations. The default operations are called CEOperations and implement the most needed interactions between BPM and CM.
Challenges:
– Custom object operations are available in BPM operations or need to be crafted according to your specification.
Design pattern: CreateDocRelationship
Description: When storing documents in the context of a process, they often need to be related to each other.
Challenges:
– Finding the most flexible mechanism to bind the documents together, using folders, properties, custom objects, links, or containments
Use case: Any event in a workflow can generate a custom object and store it in an object store.
Value proposition: Maintain persistency after the workflow has ended
Applicable design pattern: OperationsBasedCustomobject
Before writing your own application, it makes sense to check whether the
functionality is already available either by a third party or whether there are
quicker ways to enhance your application by either using the Web Application
Toolkit or Workplace Portlets. (See ecm_help → developer road map.)
Table 13-10 Content ingestion by application design pattern

Application design pattern: DocApiUploadBatch
Description: The APIs are used with the batch mode to be able to achieve high
throughput
Challenges: Transactional behavior can be achieved in a limited way

Application design pattern: DocApiUploadEnduser
Description: The user is authenticated in the API call
Challenges:
– Store user ID and password in the application
– Single sign-on

Application design pattern: DocApiUploadTechUser
Description: The user is not used in the API call but a technical user is
Challenges:
– Direct access to CM must become very restricted
– Introduction of BPM and Records Management (RM) is very difficult, because
  the content-based security is not in place
– If you are using this pattern in a Web application, consider securing the
  link by an additional hash key to prevent malicious access to arbitrary
  content

Use case: Enhance existing application to delegate content management to CM
Value propositions:
– Unlock the data and content silos
– Do not change the bespoke application too much and integrate easily with CM
Applicable design patterns: DocApiUploadTechUser, DocAPIUploadSingle
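
The hash key recommended for the DocApiUploadTechUser pattern can be sketched as follows. This is a minimal illustration, assuming an HMAC-SHA256 signature over the document ID and an expiry timestamp. The parameter names (docId, expires, sig), the secret, and the function names are hypothetical and are not part of any IBM FileNet API.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical server-side secret; it must never be sent to the browser.
SECRET_KEY = b"replace-with-a-server-side-secret"


def _signature(doc_id: str, expires: int) -> str:
    """HMAC-SHA256 over the document ID and the expiry timestamp."""
    message = f"{doc_id}|{expires}".encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def sign_link(base_url: str, doc_id: str, expires: int) -> str:
    """Append the document ID, expiry, and hash key to the link."""
    query = urlencode({"docId": doc_id, "expires": expires,
                       "sig": _signature(doc_id, expires)})
    return f"{base_url}?{query}"


def verify_link(url: str, now: int) -> bool:
    """Accept the link only if it is unexpired and the hash key matches."""
    params = parse_qs(urlparse(url).query)
    try:
        doc_id = params["docId"][0]
        expires = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False
    if now > expires:
        return False
    expected = _signature(doc_id, expires)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8"))
```

With a check of this kind in front of the technical user, a caller who alters the document ID in a link cannot reach arbitrary content, because the signature no longer matches.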
Table 13-12 summarizes the design patterns associated with fixed content and
images and their challenges.
Design pattern: ScanDelegateAnnot2CM
Description: Annotations are done after ingestion to P8 Content Manager:
– Allows you to annotate when the actual committed documents have not been
  indexed and visually verified
– Dependent on the viewer, can be used to use annotations in a consistent way
  regardless of the document format (TIFF, PDF, Office, and so forth)
Challenges:
– This is not needed at all.
– Annotation as P8 Content Manager annotations as opposed to PDF annotations
– Additional license cost for the full-featured viewer

Design pattern: Scan2Tiff
Description: Images are saved as TIFF files:
– Ingestion page-wise
– Ingestion document-wise
Challenges: Where to put the OCR text

Design pattern: Scan2Pages
Description: Each page of a scanned document is saved as a file
Challenges:
– Ingest the various files into same document using different content
  elements
– Export the document as one file
Table 13-13 Image ingestion use cases and the applicable design patterns
Use case Value propositions Applicable design
pattern
You can distinguish the use cases by the level of automation and whether the
paper to electronic conversion occurs in an early or late stage in the business
process.
Table 13-14 on page 367 summarizes the design patterns associated with
repositories that are connected to Content Federated Services (CFS) and their
challenges.
Table 13-14 Design patterns for repositories connected through CFS

Design pattern: APICFS
Description:
– Existing applications need more functionality
– Changing the access path to the repository leveraging the IBM FileNet P8
  API cannot be done in a timely manner
Challenges: All new functionality can only be made available loosely coupled
and in the background, because existing applications are not going to change

Design pattern: ActiveCFS
Description: Additional need for a notification after ingestion or launching
workflows for existing repositories
Challenges: Data field mapping might constrain the data model in P8 Content
Manager

Design pattern: OpticalCFS
Description: There is a business need for support of optical media
Challenges:
– Data field mapping might constrain the data model in P8 Content Manager
– Optical media support only for Image Manager available
New clients to IBM FileNet P8 products must consider a direct ingestion path into
P8 Content Manager to ensure that they can leverage the newest features of the
IBM FileNet P8 Platform.
No custom applications
Table 13-16 on page 369 summarizes the SAP ingestion design patterns and
their challenges.
Table 13-16 SAP ingestion design patterns

Design pattern: SAPInbReo
Description: Optimizing volume of data held in the SAP underlying database.
Unneeded documents are archived to P8 Content Manager and can be put under
Records Management if needed
Challenges: Huge volumes of data. Ask for appropriate storage media
Best practice: Try to avoid large code blocks in events. Instead, use a
delegate pattern that allows you to test code separately from P8 Content
Manager.
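
The delegate pattern in this best practice can be sketched as follows. This is an illustrative outline only: the class and function names, the property names (CustomerId, DocType), and the folder layout are hypothetical, and the event entry point stands in for whatever event action mechanism the repository provides. The point is that the business logic lives in a plain function that can be tested without a repository connection.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class DocumentSnapshot:
    """Plain data copied out of the repository object (hypothetical shape)."""
    doc_id: str
    properties: Dict[str, str] = field(default_factory=dict)


def derive_folder_path(doc: DocumentSnapshot) -> str:
    """Business logic: testable without any repository connection."""
    customer = doc.properties.get("CustomerId", "unassigned")
    doc_type = doc.properties.get("DocType", "misc")
    return f"/Customers/{customer}/{doc_type}"


class FilingEventHandler:
    """Thin event entry point: it only forwards to the delegate."""

    def __init__(self,
                 delegate: Callable[[DocumentSnapshot], str],
                 file_into: Callable[[str, str], None]) -> None:
        self._delegate = delegate      # the separately tested logic
        self._file_into = file_into    # the only call that touches the repository

    def on_event(self, doc: DocumentSnapshot) -> None:
        self._file_into(doc.doc_id, self._delegate(doc))
```

In a unit test, the repository call is replaced by a stub that records its arguments, so the filing rule can be verified on a developer workstation.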
Table 13-17 Adjustment of classification-related design patterns

Design pattern: AdjFileInFolder
Description: Folder structure was not known at ingestion time.
Challenges: If you save certain properties, which are common to other
documents on a folder level and then file the document in just a folder, the
logic can be put in this type of an event. The event can, as well, deal with
the folder creation.

Design pattern: AdjDeclareAsRecord
Description: Ingestion tool cannot declare as records. (Third-party tools).
BPM is not needed.
Challenges: While BPM offers you an easier way to declare a content as record,
in a P8 Content Manager only use case, you can decide to call the declare as
record method (part of the RM API) from an event.

Design pattern: AdjDataMapping
Description: There is a data mapping problem, and the workaround is that
property content gets assigned to target property.
Challenges: You have a standard for your meta information but the need to
federate another repository might end up with data type mapping problems. For
these cases, you can map the data to additional properties, which can be
mapped, and then map the values back in the event. This removes the technical
complexity and allows you to enforce a common data model.

Design pattern: AdjSecByValues
Description: Marking Sets
Challenges: Users do not worry about security; they just want to select
another value for a property in a choice list.

Design pattern: AdjSecByLifeCycle
Description: Life Cycle Policy (LCP)
Challenges: The life cycle goes through many different types of status, and
users do not want to think about minor and major versions; therefore, Life
Cycle Policies are great from a user perspective. For the programmer, they are
additional work. Promoting and Demoting is the mechanism to leverage LCPs.
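
The AdjDataMapping pattern can be illustrated with a small sketch. It assumes that the federated repository delivers values as strings into additional staging properties and that an event converts them into the typed target properties of the common data model. All property names and formats (Stg_InvoiceDate, Stg_Amount, the date format) are hypothetical.

```python
from datetime import date, datetime
from typing import Any, Callable, Dict, Tuple


def to_date(raw: str) -> date:
    """Convert a yyyymmdd string from the source repository to a date."""
    return datetime.strptime(raw, "%Y%m%d").date()


def to_amount(raw: str) -> float:
    """Convert a decimal string that uses a comma as decimal separator."""
    return round(float(raw.replace(",", ".")), 2)


# Hypothetical mapping: staging property -> (target property, converter).
MAPPING: Dict[str, Tuple[str, Callable[[str], Any]]] = {
    "Stg_InvoiceDate": ("InvoiceDate", to_date),
    "Stg_Amount": ("Amount", to_amount),
}


def map_back(properties: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the properties with the target values filled in."""
    result = dict(properties)
    for staging_name, (target_name, convert) in MAPPING.items():
        raw = result.get(staging_name)
        if raw is None or raw == "":
            continue  # nothing was delivered for this property
        result[target_name] = convert(raw)
    return result
```

Keeping the mapping in one table means that a further federated repository only adds entries and does not change the common data model.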
Table 13-19 Design patterns associated with moving content

Design pattern: AdjStoragelocation
Description: Change the storage location for content.
Challenges: You can distinguish between short-term storage and long-term
storage. As long as the object store has two different file stores associated,
the content can be relocated to a different file store. If the content must be
relocated, including the metadata, see the copy content option.

Design pattern: ReplicateByProxyObject
Description: Create another object for indexing purposes of a second business
context. The new object refers back to the same physical content.
Challenges: The content needs to be left in its original place but made
available through a different object store, maybe even in a different IBM
FileNet P8 domain. This is possible by writing code that generates a
contentless object that includes a URL to the actual physical content. Take
care with security. The accessing user must have the security for the proxy,
as well as for the physical content.

Design pattern: ReplicateByCopyContent
Description: Create a duplicate in another object store.
Challenges: There are rare situations where you need a document in two or more
business contexts, and you do not have a “binding” strategy, so you are
tempted to copy. From a compliance perspective, this is not allowed. See
“Relate and bind document design patterns” on page 375.

Design pattern: ReplicateByRenderContent
Description: Create a duplicate in a different content representation.
Challenges: There are formats that need certain platform specific components,
such as Microsoft Office, on a server platform.

Design pattern: RouteByWorkflow
Description: After a content is stored, a workflow is launched.
Challenges: Multiple documents belonging to the same context each can launch a
workflow. The launched workflow must have the ability to wait for a certain
amount of time, and other documents belonging to the same context must be able
to attach to the already launched workflow. This behavior is implemented in
the Business Process Framework (BPF).
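
The wait-and-attach behavior described for RouteByWorkflow can be sketched as follows. This is not how the Business Process Framework implements it; it is a simplified in-memory model, assuming a fixed wait window per business context. The class names and the context identifier are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Workflow:
    context_id: str
    launched_at: float
    attachments: List[str] = field(default_factory=list)


class WorkflowRouter:
    """Launch one workflow per context and attach later documents to it."""

    def __init__(self, wait_seconds: float) -> None:
        self.wait_seconds = wait_seconds
        self._open: Dict[str, Workflow] = {}

    def on_document_stored(self, context_id: str, doc_id: str,
                           now: float) -> Workflow:
        workflow = self._open.get(context_id)
        # Launch a new workflow if none is waiting or the window has passed.
        if workflow is None or now - workflow.launched_at > self.wait_seconds:
            workflow = Workflow(context_id, launched_at=now)
            self._open[context_id] = workflow
        workflow.attachments.append(doc_id)
        return workflow
```

Documents of the same context that arrive inside the window share one workflow; a document that arrives later starts a new one.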
There are more activities that can be part of the first moment that an event is
launched. We describe other popular situations, such as relating documents, in
the next section.
All four mechanisms are triggering events, but, other than the first one, the rest of
the mechanisms do not need coding in the event to allow users to interact with
the system.
“Processing-related design patterns” on page 375
“Relate and bind document design patterns” on page 375
Design pattern: ProcByEvent
Description: Event triggers and subscripted code or workflow are launched
Challenges: There are almost no constraints in the functionality that you can
achieve by calling external code or launching workflows. Think carefully which
event must do what. While BPM gives you, for many situations, an easier
interface for defining the functionality graphically, it might be appropriate
to call external code directly. If you have a complicated system, the BPM
approach might give you an easier documentation path rather than generating
graphs for all the dependencies of your code manually.

Design pattern: ProcByPropertyValueChange
Description: Update event launches and the associated value is changed. This
can also trigger security changes if a marking set is associated.
Challenges: This option is very powerful and, from a usability perspective,
very beneficial. Without any code written, security can be changed by
leveraging marking sets. Defining the marking sets needs a bit of education.
From an operations point of view, it might be difficult to maintain, because
marking sets are globally available to the IBM FileNet P8 domain. A naming
convention for the marking sets might be useful.

Design pattern: ProcByVersionStatusChange
Description: Version status changes and associated security policy change
permissions
Challenges: Education of users to use minor and major versions might be sort
of a challenge. After this is done, security policies might be extremely
beneficial. In addition, they provide a powerful vehicle, which prevents
changing security on documents over the life cycle.

Design pattern: ProcByLifeCycleStateChange
Description: ChangeState event launches and the associated life cycle actions
are executed. For every change, code can be executed and security changed.
Challenges: Very flexible but more labor intensive for the programmers. From a
usability perspective, this is user friendly. A user can just demote or
promote and does not have to worry about security or minor and major versions
at all.

Design pattern: RelateDocByProperty
Description: Documents of the same business context are loosely coupled by the
same value in a specific property.
Challenges:
– How to trigger the user to find related documents? A stored search or a
  search template might help.
– How to prevent deletions of some of the documents?

Design pattern: RelateDocByAssociation
Description: Documents of the same business context are tightly coupled using
the association property, ensuring referential integrity.
Challenges: You can only administer the foreign entity and not the primary
entity. For example, each claim has an associated policy. You can add a policy
number to a claim, but you cannot add a claim to a policy. So, the claim
(foreign entity) references the policy (primary entity).

Design pattern: RelateDocByCompoundDoc
Description: Documents of the same business context can be linked using the
compound document feature. This is a very tight coupling mechanism which takes
care of every change.
Challenges: With this approach, single documents can be changed and the
collection of documents that are treated as an entity can be refreshed. So, a
major version of a compound document can be the collection of all major
versions of its children.
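
The one-directional nature of RelateDocByAssociation (the claim references the policy, never the other way around) can be sketched with a small in-memory model. The class and method names are hypothetical; the sketch only shows that the reference is stored on the foreign entity and that the list of claims for a policy is derived by lookup.

```python
from typing import Dict, List, Set


class AssociationStore:
    """Claims (foreign entity) hold a reference to a policy (primary entity)."""

    def __init__(self) -> None:
        self._policies: Set[str] = set()
        self._claim_policy: Dict[str, str] = {}

    def add_policy(self, policy_id: str) -> None:
        self._policies.add(policy_id)

    def add_claim(self, claim_id: str, policy_id: str) -> None:
        # Referential integrity: a claim may only point at an existing policy.
        if policy_id not in self._policies:
            raise KeyError(f"unknown policy: {policy_id}")
        self._claim_policy[claim_id] = policy_id

    def claims_of(self, policy_id: str) -> List[str]:
        # Nothing is stored on the policy; the claims are found by lookup.
        return sorted(c for c, p in self._claim_policy.items() if p == policy_id)
```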
Design pattern: AnnotDocByDeja
Description: Use DejaViewer and annotate for all file formats consistently.
Challenges: The annotations can only be made available while having access to
IBM FileNet Content Manager. As soon as the content is checked out, there is
no way to access these annotations.

Design pattern: AnnotDocByNative
Description: Use PDF Annotations and Office Version Tracking.
Challenges: Many users do not have the license to annotate PDFs. For Office
documents, often it makes more sense to track changes natively in the
Microsoft Office applications.

Design pattern: SignDocByNativeApp
Description: Use the native environment to sign a document, for example, PDF
documents
Challenges: Loosely coupled information about signed documents is a challenge.

Design pattern: DeleteDocManual
Description: Allow some superusers to delete documents.
Challenges: Audit the deletes to make sure that the users who can delete
cannot change audit levels.

Design pattern: RMSweep
Description: Delegate the deletion to RM.
Challenges: Not every content is part of a record.

Design pattern: DeleteDocByHide
Description: Do not delete, but hide the documents by removing permissions or
by unfiling from folders.
Challenges: This does not cope with the storage growth.

Design pattern: DeleteDocAutoStorage
Description: The retention period ends on the storage tier.
Challenges: Make sure that the content is not just removed from the storage;
this leaves the file store in an inconsistent state and the metadata can still
be found.
“Restricting design patterns” on page 381
“External linking design patterns” on page 381
Table 13-25 Finding content and metadata design patterns and their challenges

Design pattern: FindByStoredSearch
Description: Uses a similar structure as a folder (Workplace and Workplace XT)
but executes a search. This is extremely user friendly and powerful.
Challenges: The effort to create the stored searches and to maintain them
might be substantial.

Design pattern: FindByReport
Description: Reports can be done by using the IBM FileNet Content Manager JDBC
driver, including security settings, or by reading the database natively (no
security).
Challenges: Depending on the nature of the queries, a system administrator or
a user can run them. Think about the security implications.

Design pattern: DeliverStream
Description:
– This is the default behavior when accessing a document through Workplace.
– The MIME type is relevant to launch an associated application at the
  client.
– Workplace can be customized to behave differently for certain MIME types.
Challenges: Multiple Content Elements might not be supported by the configured
consuming application. Workplace (DejaViewer) allows you to step through
single page TIFFs if they are saved in multiple content elements (page-wise
ingestion).

Design pattern: DeliverZipped
Description: A limited number of documents can be selected in Workplace and
marked for download. A zip file is delivered containing selected documents.
Challenges:
– Only the first content element is used.
– The number of files is limited.
Restricting design patterns
Design patterns related to restricting content delivery are summarized in
Table 13-27.

Design pattern: RestrictByProperty
Description: Changing a property value removes the candidate document from the
results of a certain search and therefore restricts access to the document for
a given search criteria. Using a marking set for the choice of values makes
this pattern even more powerful by combining the strength of the
RestrictBySecurity pattern with the ease of use of the RestrictByProperty
pattern.
Challenges: Make sure the property is indexed correctly on the database. There
is no guarantee that the document will not show up, because it can be found by
a different search, unless the marking set sets permission more restrictively.

Design pattern: RestrictByFolder
Description: This is filing a document or unfiling a document from a folder.
Challenges: The document is not browsable any longer, but it is still
searchable.

Design pattern: LinkByDocGuid
Description: This is a version stable document ID.
Challenges: This is not really linking to the content at run time; the content
was linked in the past.

All external linking patterns are challenged by the authentication problem that
needs to be addressed by either implementing a single sign-on mechanism or by
leveraging a technical user pattern “DocApiUploadTechUser” on page 361.
The parameter vsId points version-agnostic to the document. For each version,
there is a different id. The parameter Id can be overridden by a literal current or
release.
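
The relationship between vsId and Id can be illustrated with a small in-memory model. The sketch assumes that “current” means the latest version of the series and that “release” means the latest major version; both interpretations, as well as the data structures and names, are assumptions made for illustration and are not taken from the product documentation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Version:
    id: str      # differs for every version
    major: int
    minor: int

    @property
    def is_major(self) -> bool:
        return self.major > 0 and self.minor == 0


def resolve(series: Dict[str, List[Version]], vs_id: str,
            id_param: str) -> Optional[Version]:
    """Resolve a version from the version series ID and the Id parameter."""
    versions = sorted(series.get(vs_id, []), key=lambda v: (v.major, v.minor))
    if not versions:
        return None
    if id_param == "current":        # assumed: latest version of the series
        return versions[-1]
    if id_param == "release":        # assumed: latest major version
        majors = [v for v in versions if v.is_major]
        return majors[-1] if majors else None
    # Otherwise the parameter names one specific version.
    return next((v for v in versions if v.id == id_param), None)
```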
Customizing the ViewOne component is described in ViewOne HTML and the
Installation Manual, which is available from the customer support page at
Product Documentation for IBM FileNet P8 Platform:
http://www.ibm.com/support/docview.wss?rs=3278&uid=swg27010422
The requirements mentioned can now be mapped directly in the design patterns
found in the relevant section for content ingestion, content and workflow
management, and delivery and presentation management.
Figure 13-6 on page 384 illustrates the document review and approval process.
[Figure: minor versions 0.1 and 0.2 and major version 1.0 in the repository]
Figure 13-6 Document review and approval require minor and major versions
Going through the list of all of the design patterns that we presented
earlier, we marked potentially relevant patterns and summarized them in the
following shaded box.
Discussion of the chosen design patterns
By reading the list of all ingestion patterns, remember that you have to deal with
the task of importing existing documents, as well as with the daily document
interactions after importing is completed. ElecDocHighVolOnce was selected to
ensure that the choice of the import tool and the preparation of the document
import are addressed in the solution description. Because most of the documents
have an electronic document character, there is no need for a tool addressing
the page-wise ingestion. So, our tool can be Records Crawler, which can handle
document-based ingestion well. For the import, the project can do an analysis of
the target folder structure or might just use properties instead of folders at all.
The techniques that allow the users to still file in their existing folder
structure while migrating documents to P8 Content Manager are described in
13.3.3, “Information capture supporting call center operation” on page 391.
The choice of the tool for the daily interactions with P8 Content Manager is not
yet clear. There are no restrictions yet. It might be Workplace and Office
Integration so far. This might become clearer when the delivery management
patterns are understood.
We have not chosen the alternative approach where security and routing are
solved by promoting and demoting. One reason was that we did not want to use
a programmer to code the actions for the purpose of a simple use case. In reality,
you might decide on the alternative approach depending on the skill set available
to the project.
The delivery patterns are guiding you through the choice of application.
You compare the capabilities that are offered by Workplace. If they satisfy your
requirements, you might use WorkplaceXT. If you need more specific features,
you might consider comparing the two approaches, customize Workplace as
Solution details
The four different states that a document can represent typically are a key
indicator to use ProcessByLifeCycleStateChange. To keep the solution design
simple, we translated the functional requirements into a design, which allows
users to toggle a status flag on the document. Depending on its status (draft, to
review, reviewed, and approved), a property can be toggled by the user who
has the needed permissions.
Each change to the status of the document is audited, and a notification is
sent to the reviewer group as soon as the status has changed to “to review”.
This is an attractive alternative for reviewers, who do not work with Workplace on
a daily basis. As soon as a document is ready for review, the reviewers get a
notification. Later, when reviewers are using P8 Content Manager more often,
this notification is not needed any longer.
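
The status toggle, the audit entry, and the reviewer notification can be sketched together. This is a minimal illustration, not the solution's implementation: in the scenario, the permissions come from marking sets and the audit trail from the repository. The role-to-status table and all names below are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple

STATUSES = ("draft", "to review", "reviewed", "approved")

# Hypothetical permission model: the statuses each role may set.
MAY_SET: Dict[str, Set[str]] = {
    "author": {"draft", "to review"},
    "reviewer": {"draft", "reviewed"},
    "approver": {"approved"},
}


@dataclass
class Document:
    doc_id: str
    status: str = "draft"
    # Each audit entry records (user, old status, new status).
    audit_log: List[Tuple[str, str, str]] = field(default_factory=list)


def set_status(doc: Document, user: str, role: str, new_status: str,
               notify_reviewers: Callable[[str], None]) -> None:
    """Toggle the status if the role allows it, audit it, and notify."""
    if new_status not in STATUSES:
        raise ValueError(f"unknown status: {new_status}")
    if new_status not in MAY_SET.get(role, set()):
        raise PermissionError(f"role {role} may not set {new_status}")
    old_status = doc.status
    doc.status = new_status
    doc.audit_log.append((user, old_status, new_status))
    # Reviewers are notified only when a document becomes ready for review.
    if new_status == "to review" and old_status != "to review":
        notify_reviewers(doc.doc_id)
```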
Each role (author, reviewer, approver, or user) will have a preconfigured portlet,
which shows a list of objects for which the user has permission and the status of
those objects.
A user typically has one portlet that shows all approved documents. A reviewer
sees all documents that are ready for review and perhaps the reviewed
documents. In addition, the user needs to see approved documents.
Authors need to see their own documents and the documents ready for review,
as well as the reviewed and approved documents.
The portlets can consume a stored search as their browsing “folder”. This is a
good way to combine property-based searches with the look and feel of folders.
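
The per-role portlet content described above can be expressed as a filter over the status property. The sketch below is illustrative only: a real portlet would consume a stored search, whereas this model filters plain dictionaries. The property names (status, creator) and role names are assumptions.

```python
from typing import Callable, Dict, List

# Statuses each role sees, following the description in the text.
VISIBLE: Dict[str, List[str]] = {
    "user": ["approved"],
    "reviewer": ["to review", "reviewed", "approved"],
    "author": ["to review", "reviewed", "approved"],
}


def portlet_filter(role: str, user_name: str) -> Callable[[dict], bool]:
    """Build the predicate that a role's portlet applies to each document."""
    statuses = set(VISIBLE.get(role, []))

    def matches(doc: dict) -> bool:
        if role == "author" and doc.get("creator") == user_name:
            return True  # authors always see their own documents
        return doc.get("status") in statuses

    return matches
```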
According to Figure 13-6 on page 384, only major versions are
approved. This is not really needed technically, but it is a common way to
implement access to documents. Typically, users will only see major versions.
The system will be implemented on an “all-in-one” approach where all engines
are installed in one machine.
Two hundred users only generate a small number of documents relative to what
other clients are achieving using P8 Content Manager. Therefore, we decided to
implement everything in one object store but use separate document classes for
each functional aspect.
The life cycle of the documents was implemented by a status property with
associated marking sets. This addresses two issues at the same time: Control
the access to change the security, and simplify the function of changing security
by toggling a value. The marking sets will be implemented per document class.
From a security perspective, different groups are needed for each functional
aspect of the document type and per user role, such as author, reviewer, and
approver. There might be more sophisticated ways to achieve the combination of
functionality and roles that we did not include in this simple scenario.
Figure 13-8 on page 389 illustrates the insurance claim processing use case.
[Figure 13-8, recoverable labels: 1. Insurance claim arrives from a field
office by fax. 2. Fax Capture adds the claim to the repository. 3. A workflow
launches automatically. 8. Claim documents are sent to the client.
Components: Capture, Repository, BPM.]
Going through the list of all design patterns that we presented earlier, we
marked potentially relevant patterns and summarized them in the following
shaded box.
Annot@Scan was selected to make sure that annotations are only used for
describing bad images or corrections to images that cannot be rescanned.
There is no plan to allow users to annotate the images after ingestion. For this
purpose, we want to use a property that allows us to store long text.
ScanDelegateVal2ActCont expresses the capability that validation of certain
assigned index values will be performed as part of a workflow, which might
include human interaction.
Going through the list of all design patterns that we presented earlier in the
chapter, we marked potentially relevant design patterns and summarized them in
the following shaded box.
[Figure, recoverable labels: Input Files; Records Crawler Servers;
Server-Farmed Repositories; Load-Balanced Application (Web) Servers.
3. Load-balanced Web servers provide fast response times required by a large
call center.]
The EmailFullText pattern describes the capability to fulltext index the e-mails
(attachments and body), which complements the ingestion of electronic
documents.
In this scenario, we consider a simple class model, which holds all of the
necessary information of a case regardless of the ingestion channel. We do not
distinguish between body text and attachments. The object store needs to be
fulltext indexed to serve the information provided in the stream to the user.
ElecDocHighVolMulti in this context describes mostly the nature of versioning.
Every single piece of information, which is ingested, is understood as a new
major version 1. There is no versioning needed at all. Performance with high
volumes is another characteristic of this pattern. The original files were removed
from the file system and from the e-mail journal files respectively as soon as
e-mails were archived.
We assume that at ingestion time, the rules have been powerful enough that the
case numbers have been defined all along.
[Figure, recoverable labels: e-mail Server; Inbox; Records File Plan.]
Going through the list of all design patterns that we presented earlier in the
chapter, we marked potentially relevant design patterns and summarized them in
the following shaded box.
Applicable design patterns for the use case:
Content ingestion:
– Ingestion Pattern, “EmailAutoRules” on page 353
– Stubbing Pattern, “EmailStub” on page 353
– Indexing Pattern, “EmailIndex” on page 354
– Classification Pattern, “EmailDocClass” on page 354
– Ingestion Pattern, “EmailDeclRecord” on page 354
– Classification Pattern, “EmailClassificationAutomation” on page 355
Content and workflow management:
– Processing Pattern, “AdjDeclareAsRecord” on page 371
– Security Pattern, “AdjSecByValues” on page 372
– Security Pattern, “RestrictEmails” on page 354
Delivery and presentation:
– Delivery Pattern, “FindByStoredSearch” on page 379
– Delivery Pattern, “DeliverInterceptedStream” on page 380
– Delivery Pattern, “RestrictBySecurity” on page 381
– Deletion Pattern, “RMSweep” on page 378
Without this assumption, we might defer the records declaration to the CM and
BPM part, AdjDeclareAsRecord, which typically requires human interaction to be
able to complete the classification.
The pattern EmailIndex concerns when we will complete the index information
describing the ingested documents. If we are able to classify the documents
correctly, we can also assume that we can complete indexing at the time of
ingestion. In reality, this might be different. Think carefully about how you
want to index the multi-item values (to:, cc:, and bcc: fields): either by the
underlying database indexing mechanisms or by using Verity fulltext indexing.
This greatly reduces the overhead of running queries against recipient lists.
In our situation, we decided to use the database, and we make sure that we have
an index applied over the ListofString table.
Consider switching object stores based on time intervals, for example, every
year. With this organizational approach, clients with huge e-mail ingestions can
get a more convenient method of improving search times and improving
throughput for deletion.
Before you delete anything, identify documents with a legal hold and move them
to a different object store. This approach helps you to easily drop the object store
and delete the file store, which might sound extremely pragmatic but can help to
speed up the deletion process tremendously.
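
The yearly object store switch and the legal hold step can be sketched as follows. This is a simplified in-memory model under stated assumptions: the naming scheme (MAIL_2006), the hold store name, and the data structures are hypothetical, and a dictionary of lists stands in for the object stores.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass
class Email:
    doc_id: str
    received: date
    legal_hold: bool = False


def object_store_for(received: date, prefix: str = "MAIL") -> str:
    """Pick the object store by the year in which the e-mail was received."""
    return f"{prefix}_{received.year}"


def retire_store(stores: Dict[str, List[Email]], name: str,
                 hold_store: str) -> int:
    """Move held e-mails away, drop the store, and return the dropped count."""
    if name == hold_store:
        raise ValueError("the hold store itself cannot be retired")
    held = [mail for mail in stores.get(name, []) if mail.legal_hold]
    stores.setdefault(hold_store, []).extend(held)
    # Dropping the whole store replaces many single-document deletions.
    dropped = len(stores.pop(name, [])) - len(held)
    return dropped
```

Because the documents under legal hold are moved first, the expired store can be dropped as a whole instead of deleting its documents one by one.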
Related publications
The publications listed in this section are considered particularly suitable for a
more detailed discussion of the topics covered in this book.
Online resources
These Web sites are also relevant as further information sources:
IBM FileNet Content Manager support Web site:
http://www.ibm.com/software/data/content-management/filenet-content-manager/support.html
Product documentation for IBM FileNet P8 Platform:
http://www.ibm.com/support/docview.wss?rs=3278&uid=swg27010422
You can obtain technical notices from the previous product documentation
Web site, in the Technical Notices section, including:
– IBM FileNet P8 Performance Tuning Guide
– IBM FileNet P8 High Availability Technical Notice
– IBM FileNet Content Engine Query Performance Optimization Guidelines
Technical Notice
– IBM FileNet Application Engine Files and Registry Keys Technical Notice
– IBM FileNet P8 Asynchronous Rules Technical Notice
– IBM FileNet Content Engine Component Security Technical Notice
– IBM FileNet P8 Directory Service Migration Guide
– IBM FileNet P8 Disaster Recovery Technical Notice
– IBM FileNet P8 Extensible Authentication Guide
– IBM FileNet P8 Process Task Manager Advanced Usage Technical Notice
– IBM FileNet P8 Recommendations for Handling Large Numbers of Folders
and Objects Technical Notice
– IBM FileNet P8 DB2 Large Object (LOB) Datatype Conversion Procedure
Technical Notice
Although several technical notices were written for the 3.5 version, much of
the content provided is useful for Version 4.0 as well.
Index
Application Program Interface (API) 20, 26, 154,
Symbols 157, 160, 163–164, 175–176, 195, 250, 252
.NET 160
Application Role 272
.NET API 154
application server 29, 68, 79, 104–105, 158, 165,
249, 257
A application-based replication 232
access control 132 architecture 5, 24, 218, 322, 385
Access Control Entry (ACE) 100 architecture and design sessions 93
access control entry (ACE) 135 architecture requirement capture session 93
Access Control List (ACL) 31, 100, 139 archiver.jar 280, 316
access control list (ACL) 135 association property 180
Access control matrix (ACM) 137 asymmetric 1-to-1 224
access data 132 asymmetric N 224, 226
access role 155 asymmetric server cluster configuration 226
ACL asynchronous
object 143 event subscription 174
Active content 5 asynchronous replication 234
active content 5, 16–17, 88, 174, 347, 350 audit log 286, 316
example 7 Content Engine 315
Active Directory 260 audit logging 295
Add Entry Template 263, 272 auditing 316
AddOn event 172 AUTHENTICATED_USERS 144
administration 282 authentication 31, 139, 167–168
AJAX 29 authorization 31, 139
alter data 132 automated system monitoring 320
annotation 16, 123–124 availability 216
annotation class 123 avoid downtime 217
annotations 123
Apache Foundation 272
Apache log4j 285
B
backup 231, 317
API batch 172
data 237
API call 361
offline 310
APIs 162, 360, 365
online 311
applet 156, 158, 202
point-in-time 232, 238
application container 158
required system components 309
application crash 217
backup and restore 308
application design 20, 25, 160
backup tape 238
impact on performance 328
backup window 310
Application Engine 44, 57, 270, 322
best practice
checking 325
folder structure 189
exporting and importing 270
best practices
multiple instances 46
disaster recovery 241
Application Engine (AE) 29–30, 50, 220, 240, 265,
high availability 240
283
198, 200, 218, 220 catalog 195
message logs 283 Content Engine 28, 30
Content Federated Services (CFS) 10, 81 exporting and importing 269
content ingestion row limit 181
sizing questionnaires 68 database schema 287
Content Manager database store 109, 194, 196
full deployment 261 database view schema 208
incremental deployment 261 database-based replication 235, 241
content object 25, 28, 56, 86, 93, 115, 186–187, default instance security 145
197, 232 Demilitarized Zone (DMZ) 137
single version 187 deployment 252, 256
content storage 108–109, 197–198, 241 cloning 259
load-balancing capabilities 198 full 261
single logical target 198 incremental 262
content store 96, 109 deployment approach 256
content-based retrieval (CBR) 205 design
control changes 133 document class, based on content 112
coupling 87 document class, based on function 112
CPU utilization document class, based on organization 111
Dashboard 278 impact on performance 328
crash logging 177
application 217 repository 93
cross repository search 203 repository, bottom up 89
custom application 20, 48, 56, 69, 156–157, 240, repository, interviewing process 92
361 repository, top-down 90
Custom object design methodology 13, 22
class 117–118 design pattern 361, 365, 368
classes characteristic 118 design patterns
custom object 95–96, 117, 181, 205, 207, 261, 266, definition 351
360 development 253
class 117 direct ingestion 367
custom object class Direct Internet Message Encapsulation (DIME) 166
design recommendation 118 directory server 31, 53
custom property 70, 99, 113, 116, 173, 195, 202 directory service 29, 31, 35, 141, 267, 353, 357,
381
different organizational unit 47
D Directory Service Provider
Dashboard 276, 278–279, 281, 323
exporting and importing 270
data
disaster recovery 37, 214, 223, 229–230, 239, 242
access 132
best practices 241
data backup 237
common approaches 237
data integrity 132
disk utilization
data loss 215, 232, 234, 239
Dashboard 278
data model
Display name 94, 99
creating 179
disruption 230
data privacy 133
disruptive event 214
data replication 223
critical business functions 214
data segregation 44–47, 52
distributed system 52–53, 57
database
DNS server entry 240
Index 403
document reason for 247
revision cycle, example 6 event 126
revision process 14 AddOn 172
state 8 design recommendation 127
document class 19, 86, 110–111, 181, 194, 196, On Add 16
198, 252, 259, 263, 265, 342–344, 362 event action scripts 8
actual creation 113 event subscription
database storage 196 asynchronous 174
design 112 synchronous 174
design based on content 112 event subscription model 175
design based on function 112 explicit object security 146
design based on organization 111 export 267
design recommendation 113 Export security feature 267
document content 111, 196, 200 export sequence 265
document life cycle 125 exporting 258
Document type 126 exporting and importing components 270, 272
document type 188–190, 362, 387
domain 53, 101–102, 206
Domain Name Server (DNS) 241
F
facility management 214
downtime 216
failback 222
avoid 217
failover 221–223
dynamic privacy 133
farm 34, 218, 221
server farm 227
E farming 34
eForm 71, 360, 377 Fax Capture 17–18
EJB transport 157, 159 Fax capture 16
disable transaction propagation 166 fetch 169
good model 179 fetchless instantiation 169
Java API 171 file storage area 241
reverse proxy 179 file store 28, 44, 80, 96, 109, 196–197, 199–200
workable reverse proxy 179 file store device 200
EJB™ transport layer 60 file system 24, 28, 35, 96, 113, 115, 186, 200, 232,
electronic document 86, 119, 340, 356–358 356–357, 363, 379
email 349, 353–354 filed
Email Manager 11, 20–21, 82, 210, 341, 357, 392, folder option 186
397 FileNet Enterprise Manager
e-mail message 187 annotation wizard 123
encapsulation 87 document object properties dialog 119
engine 28 wizard interface 112, 116
engines FileNet Enterprise Manager (FEM) 44, 50, 94, 101,
communication 44 149, 154–155, 157, 194–195, 289–290, 294, 325
enterprise configuration management database enable trace logging 293
253 installation 206
Enterprise Java Bean (EJB) 159 tree view 195
entry template 24, 251, 258, 271 FileNet P8
environment 135, 246 domain 32, 54
testing 254 FileNet P8 Platform
environments 246 planning 258, 264
FileNet System Monitor 283 Distance (HACMP/XD) 236
FileNet System Monitor (FSM) 282 horizontal scalability 33, 228
final time 108, 113 horizontal scaling 33
fixed content store 109, 200 host-based replication 232–233, 235
consideration, vs file store 200 hot site
fixed storage device 200 third party recovery services 237
folder 128, 265
design 116
hierarchy 197
I
IBM Customer Number (ICN) 329
inherited security 191
IBM FileNet
folder class 95–96, 113–114, 116, 164, 181
Content Service 241
actual creation 116
Fax Capture 17
folder hierarchy 191
support side 264
folder object
system capacity planning tool 65
recommendation 128
IBM FileNet P8
folder option
Business Process Manager 5
un-files and filed 186
Content Engine 82
folder structure 186, 188, 191, 251, 263, 356, 358,
Content Federated Service 10
379, 385
documentation 161, 167
best practice 189
eForms 156
full deployment 261
Enterprise Manager (FEM) 44, 50
fulltext
family 28
exporting and importing 270
Forms Manager 9
functional area 111
Image Manager 10
functional design 22–24, 383, 389, 391
Records Manager 12
functional requirement 23
system 65, 79
IBM FileNet P8 Platform 3
G IBM FileNet P8 platform 28
generic object system properties 98 IBM FileNet P8 system 28, 35, 39
geographic cluster manager 236 IBM Metro Mirror (PPRC) 233
geographically-dispersed farm 243 IBM support 329
global cluster manager 236 Image Server 74
Global Configuration Database (GCD) 44, 53, 105 import 267
Global Configuration database (GCD) 29, 32, 94, import sequence 265
97, 172 importing
Globally Unique Identifier (GUID) 262 objects 269
grantee 143 inbound documents 368
GUID 253, 258, 262–263 incremental deployment 262
index 354, 356, 362
index area 206, 241
H Information capture 14, 392
heartbeat 34
information capture 68
help 274
ingestion
hierarchy
content 68
folders 197
content, sizing questionnaires 68
high availability 218–219, 229, 242
direct 367
best practices 240
ingestion rate 4
high availability (HA) 214–215
inheritance
High Availability Cluster Multiprocessing/Extended
object class 180 Lightweight Directory Access Protocol (LDAP) 31,
instantiation hierarchy 129 100, 108
integration 358, 362–363, 366 LineItem 180
integration testing 254 LineItems property 180
integrity lines of business (LOB) 111
data 132 load balance 18
interviewing process load balancer 19–20, 34, 50, 218, 220
repository design 92 layer 50
isolated region 29, 44–46 product 219
virtual IP address 240
load balancing 34, 219, 221
J session-based 220
J2EE application
load testing 254
development 159
load-balanced server farm 218–220
server 30
local area network (LAN) 102–103
J2EE application server 105, 158, 174
log
instance 105
Process Engine 315
message logs 284
log4j 285
vendor 218
setting 285
J2EE container 171, 178
log4j.xml.server 285
J2EE environment 218
logging 177
J2EE servlet container 159
design 177
J2EE specification 157
logs
Java API 80, 156, 159, 182
message logs 283–284
reference material 161
long string 182
Java APIs
class type 164
Java applet 158 M
Java Authentication and Authorization Service maintenance
(JAAS) 31, 150–151, 167–168 best practices summary 316
Java Server Faces (JSF) 29 logs 316
Java Virtual Machine (JVM) 44, 105 maintenance planning 26
Java Virtual Machine (JVM™) 30 major version 381, 393
JDBC interface 173 version
major 15
many-to-many relationship 181
K marking set 127
knowledge base 256
recommendation 128
knowledge worker
maximum downtime 216
business role 92
memory 279
message log
L maintenance
LDAP 134 logs
life cycle maintenance 316
design recommendation 126 message logs 283–284
document 125 Message Transmission Optimization Mechanism
policy 374 (MTOM) 166
Life cycle policy 126 meta information 271
lifecycle 5 metadata class 161, 163
metadata elements multiple fixed content stores 109
organizational 188 multiple object stores 262
minimal disruption 230 name 268
minor version property template folder 121
version repository 206
minor 15 search 196
mission critical system 282 security 45–46, 143
monitoring services component 241
system 320 tab 203
multiple folders 113 underlaying database scheme 262
multiple locations 128 various objects 271
multi-repository search 53 object store gate 144
multiselect operations 208 object store security 143
object-oriented design (OOD) 86
ObjectStore 161
N object-valued properties 180
NAS replication 233
object-valued property 169–170, 208
NetApp® Snaplock 200
On Add event 16
network address translation (NAT) 257
on-line help 274
network device 219
operation
network topology 133
bulk 298
Network utilization
bulk operation 208
Dashboard 278
Oracle RAC 221, 228
Network-Attached Storage (NAS) 232
organizational metadata elements 188
non-functional requirement 23
organizational metadata properties 188
O
object P
P8 Content Manager 154, 156–157
ACL 143
Administration section 206
generic, system properties 98
APIs 165, 177
object class
architect technical role 91–92
inheritance 180
catalog database 195
object gate 144
client 104
object security 146
configuration information 194
object store 28, 44, 80, 94–95, 154, 193–194, 250,
content transaction 187
252–253, 369–370, 387, 392
database view schema 208
actual creation 108
document life cycle 125
administrator 108
folder 113, 115
box population 263
foldering concept 128
configuration 197
help file 208
creation wizard 263
product documentation 163
database 108, 195
release 161
design 106
repository 186–187, 193
design recommendation 108
repository element 88
GUID 253
search 201
import assets 267
search tool 205
initial ACL 108
solution 110, 181, 218
maintenance activity 208
support area 161
multiple file stores 109
update 172 message logs 283
P8 Platform 3 Process Simulator 155
P8 platform 28 properties
P8 system 28, 35, 39 object 100
parent folder 192 property template 95–96, 119, 259
pattern 87, 382 actual creation 122
PDF rendition design recommendation 122
feature record 17 Special considerations 96
peak hours 69 property value
pending change 170 constraint 180
perf_mon 73 PropertyDefinitionString 182
performance 79 PropertyDescriptionString 182
application design impact 328 PropertyTemplateString 182
monitoring 275
trace log 284
troubleshooting tips 327
Q
queries
performance archiver 280
creating and running 287
performance data
report 316
capture 316
using database schema 287
performance issue 40
Query Builder 206–207, 287, 298, 301, 304
performance test 56, 80, 255
Query Builder Script 207
performance testing 254
questionnaires
physical security 133
content ingestion 68
PMR
sizing, user activities 69
open by calling IBM 330
QueueItem table 298
open via Web 330
point-in-time backup 232, 238
policy R
life cycle 374 Real Application Cluster (RAC) 34, 218
security 344, 351, 375 recommendation
post-install script 172 choice list design 123
pre-fetch 56 custom object class design 118
preload 56 document class design 113
preloaded cache 56 event action and subscription design 127
primary function 188 folder design 116
privacy folder object 128
data 133 life cycle design 126
problem 326 marking set 128
isolation 323 object store design 108
Problem Management Record (PMR) 329 property template design 122
Process Designer 156 site design 104
Process Engine 44 virtual server 104
checking 325 Records Crawler 18–19, 83, 340, 358, 385, 392
exporting and importing 270 records management 71
isolated regions 46 records management (RM) 68, 71, 92, 127
log database 315 Records Manager 20, 48, 71, 240, 398
statistics log 286 separate, database object store 210
transaction rates 83 recovery 230, 237
Process Engine (PE) 29, 78, 82, 220, 240, 269 disaster 214, 223
disaster, common approaches 237 process, document 14
Recovery Point Objective (RPO) 230–231 round-trip 162, 169–170
recovery service Content Engine 170
third-party hot site 237 multiple objects 162, 170
recovery site 230–231 round-trips
replacement systems 241 minimizing 169
Recovery Time Objective (RTO) 230–231 route control 8
recursion level 170
Redbooks Web site 400
Redundant Array of Independent Disks (RAID) 215
S
scalability 35, 52
redundant standby system 237
scaling
references
horizontal 33
Webservices 268
vertical 35
referential integrity 180
scaling scenario 63
referential integrity mechanisms 169
schema
reflective property 180–181
database 287
Container 181
Scout 65–66, 73
mechanism 180
output 71
regression test 254–255
sample output 72
performance test 255
use cases 67
small suite 255
utilization chart 71
regression testing 254
search 114, 201, 298
relational database (RDBMS) 194
cross repository 203
release 250
folder design recommendation 116
release management 250
Workplace 205
release manager 250
search criteria expression 203
Remote Method Invocation (RMI) 166, 179
search criterion 114, 187, 202
Remote Procedure Call (RPC) 278
search paradigm 114
Rendition Engine 29
search server 206
replicated data clusters 223
Search Template 272
replication 232
search template 24, 155, 202, 206, 208, 251, 254,
application-based 232
258, 263
database-based 235–236, 241
searches
host-based 232–233, 235
stored 155
storage-based 233, 235–236
security 31, 306
replication choice 237
default instance 145
report
explicit object security 146
Dashboard 280
inherited, folder 191
report queries 316
object store 143
repository 186–187, 193
security changes 150
design 89, 93
security features 132
design goal 86
security granularity 86
naming standard 93
security policy 15, 208, 344, 351, 375
repository design 25, 85, 88, 98, 253, 255, 356, 363
identifying ID 208
request forwarding 57
security verification 134
restore
server capacity 328
system 308, 312
server cluster 221
reverse proxy 179
active-active 222
revision
active-passive 220 SQL View 207–208
configuration 224 SSO framework 168
software products 224 full discussion 168
server clusters standby system
comparing with server farms 227 redundant 237
server farm 18–19, 217–219, 221 state
essential difference 221 document 8
key enabler 218 static privacy 133
load balancing 219 statistics log 286
load-balanced 218–220 Storage Area
server farms Network 232
comparing with server clusters 227 storage area 28, 95, 113, 124, 197–199
server instance 35, 54–55, 101, 104–105 design 108
Service Level Storage Area Network (SAN) 221, 223
Agreement 215 storage farm 198
Service Level Agreements 282 storage policy 113, 124, 198–199
Service Oriented Architecture (SOA) 159, 162, 249 storage-based replication 233, 235
Service-Oriented Architecture (SOA) 159 database-based replication 236
servlet container 159 emerging specialization 233
session-based load balancing 220 stored search 202
shared infrastructure 25, 27, 43, 49 Stored Search definition 263
shared storage 221 stored searches 155
short string 182 stretch cluster 223
single round-trip 170 string-valued property 182
Content Engine 170 subject matter experts 89
multiple objects 162, 170 subscription
single server event 127
architecture 220 event, design recommendation 127
instance 104 swap death 328
single sign-on (SSO) 167–168 symbolic name 94, 99
site 54, 101 symmetric cluster 225
recommendation 104 symmetric server cluster 225
Site Preference 271 synchronous
sites 103 event subscription 174
sizing 74 synchronous replication 234
disk space 78 system
hardware 78 backup and restore 308
system 72 restore 312
system, user activity questionnaires 69 sizing 68
sizing questionnaires system architecture 24, 385
content ingestion 68 System Capacity Planning Tool 66
sizing system 68 system components
software release manager requiring backup 309
challenges 249 system integration testing 254
software service 28–29 system log
solution building blocks maintenance 286
definition 338 System Manager 275, 281
Scout 74 System Manager 275, 281
SQL database 62 Dashboard 276
System Manager Dashboard 72 content 187
System Manager server transaction load
Listener 276 handling 5
system monitoring 320 transaction rate 187
system properties transformation 268
generic object 98 transforming 258
system testing 254 transport 165
transports
comparing 166
T troubleshoot 319
table row 181
troubleshooting
relatively wide areas 182
performance 327
taxonomy 95
performance, tips 327
technical user
pattern 382
technology community 176 U
template un-filed
entry 251, 258, 271 folder option 186
property 119, 259 unit testing 254
search 155, 202, 206, 208, 251, 254, 258, 263, use case 43, 60, 154, 176, 187, 209, 355, 357, 371,
298 383
search template 24 software module 176
test use cases
automation 255 Scout 67
performance 255 user acceptance testing 254
performance test 255 user activities
regression 254–255 sizing questionnaires 69
regression test 255 user experience 123, 158
test environment 80 unpleasant aspect 158
testing 253 user interaction 353, 357, 360, 379
regression testing 254 User Preference 271
thick client 158 user-interface component 94
thin application 158 UsesLongColumn 182
threshold 74 utilization 73
toolkit 162–163 Dashboard 278
top-down approach 90
repository design 90
top-level directory 197
V
VBScript 207
topology
verification
cloning 259
security 134
trace log 284
version
maintenance 286
major 381, 393
trace logging 55
versioning 68, 350, 374, 393
enable 293
content 14
tracing
vertical scalability 35
capture SQL syntax 328
vertical scaling 35
transaction
virtual 38
behavior 171
virtual machine 29, 38
client-side 171–172
virtual machines monitor (VMM) 38
virtual memory 328
virtual private network (VPN) 257
virtual server 39, 50, 53–54, 93, 101, 104–105,
206, 218, 221
horizontal scaling 39
recommendation 104
virtualization 25, 35–36, 52
operating system-level 39
VMware 257
W
WAN 58
Web service 157, 160, 167, 268, 358, 360
common implementation technology 160
external references 268
Web services description language (WSDL) 162
Web Services Extensible Authentication Framework
(WS-EAF) 167
Web Site Voice 123
wide area network (WAN) 102–103, 211
wizard display 108, 112
workflow 7, 11, 16–17, 155–156, 338, 347, 350
workflow activity 45, 174
workflow definition 155, 265, 268
workflow management
requirement 24
workflow step 16
workload 74
modeling 66
workload modelling 73
Workplace 44, 271–272, 325
exporting and importing 272
Workplace search 205
CBR feature 205
Workplace XT 155, 157, 379
WS transport 165–166, 179
WSDL file 162–163
X
XML 263, 278
XML file 4, 19, 265
XML manifest file 267
Back cover

Use system architecture, capacity planning, and business continuity

Design the repository, security, application, and solution

Learn to deploy, administer, and maintain

IBM FileNet Content Manager provides full content life cycle and extensive document management capabilities for digital content. IBM FileNet Content Manager is tightly integrated with the family of IBM FileNet P8 products and serves as the core content management, security management, and storage management engine for IBM FileNet P8 family of products.

This IBM Redbooks publication covers the implementation best practices and recommendations for IBM FileNet Content Manager solutions. It introduces the functions and features of IBM FileNet Content Manager, common use cases of the product, and a design methodology that provides implementation guidance from requirements analysis through deployment and administration planning.

The book addresses various implementation topics including system architecture design, capacity planning, business continuity, repository design, security, and application design. Administrative topics covered include deployment, system administration and maintenance, and troubleshooting. We also discuss solution building blocks that you can specify and combine to build a solution.

This book is intended to be used in conjunction with the product manual and online help to provide guidance to architects and designers about implementing IBM FileNet Content Manager solutions.

INTERNATIONAL TECHNICAL SUPPORT ORGANIZATION

BUILDING TECHNICAL INFORMATION BASED ON PRACTICAL EXPERIENCE

IBM Redbooks are developed by the IBM International Technical Support Organization. Experts from IBM, Customers and Partners from around the world create timely technical information based on realistic scenarios. Specific recommendations are provided to help you implement IT solutions more effectively in your environment.

For more information:
ibm.com/redbooks