Understanding the WebSphere config service

Andreas Veithen
Revision 1

Table of Contents
1. Repository structure
2. Metadata
3. Configuration object serialization; root and inner configuration objects
4. Configuration IDs
5. Parent-child relationships
    5.1. How parent-child relations are defined
    5.2. Navigating parent-child relations
    5.3. Navigating from child to parent
6. getid vs. list
    6.1. getid
    6.2. list
    6.3. Metadata inconsistencies
7. Configuration objects and WebSphere variables
A. Correspondence between MBean operations and scripting methods

1. Repository structure
At a low level, the WebSphere configuration repository is a collection of folders and documents. Although these resources are persisted one-to-one as directories and files under the config directory in the WebSphere profile, the configuration repository service in WebSphere is more than a simple file access service. It allows a client (such as wsadmin) to prepare a set of changes in a so-called workspace [1] and to save these changes at once (or to discard them if something goes wrong). The repository service detects conflicts that arise when clients attempt to modify the same documents concurrently. It also tracks changes so that node agents can incrementally replicate the master configuration repository from the deployment manager to individual nodes.

WebSphere uses the configuration repository for various purposes; for example, application binaries are stored there. However, most of the documents in the repository contain configuration objects. To manage this configuration data, WebSphere adds another layer on top of the configuration repository service: the config service.

The config service uses the following concepts:

Repository context

A repository context is a structure that can contain documents and nested repository contexts. Each repository context has a type and a name. Contexts are mapped to two-level folder structures, where the first level identifies the context type and the second level identifies the context name [2]. Figure 1, “Repository context example” shows an example of a repository structure representing two contexts of type servers, one named nodeagent and one named server1.

[1] Workspaces are stored in the wstemp directory in the WebSphere profile.

[2] Note that there are actually exceptions to this rule. In fact, it appears that some repository contexts have no name and are represented as a one-level folder structure. This applies to the ExtendedRepositoryService object stored in the repository.xml file under cells/name/repository.

Figure 1. Repository context example

Document

At the repository level, a document can have any content. The config service, on the other hand, has a narrower definition of a document, namely an XML file that contains a set of configuration objects in serialized form.

Configuration object

Configuration objects are described using the Eclipse Modeling Framework (EMF) [http://www.eclipse.org/modeling/emf/].
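The two-level folder mapping of repository contexts described above can be illustrated with a small sketch. The function names are hypothetical and not part of any WebSphere API, and the code assumes that every context follows the type/name rule (the unnamed contexts mentioned in the footnote are not handled).

```python
def split_context_path(path):
    """Split a context path such as 'cells/test/nodes/lx03' into (type, name) pairs.

    Assumes the regular two-level layout: a folder for the context type
    followed by a folder for the context name.
    """
    parts = [p for p in path.split("/") if p]
    if len(parts) % 2 != 0:
        raise ValueError("expected alternating type/name segments: %r" % path)
    return list(zip(parts[0::2], parts[1::2]))


def parent_context_path(path):
    """Return the path of the enclosing context ('' for a top-level context)."""
    pairs = split_context_path(path)
    return "/".join("%s/%s" % pair for pair in pairs[:-1])


if __name__ == "__main__":
    print(split_context_path("cells/test/nodes/lx03/servers/server1"))
    print(parent_context_path("cells/test/nodes/lx03/servers/server1"))
```

The parent context computed this way is the one whose root document provides the parent configuration object for the root documents of the nested context.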

2. Metadata
To manage the configuration data, the config service needs metadata that describes the supported types of contexts, documents and configuration objects, as well as their relationships. This metadata comes from two sources, which are described in this section.

As noted earlier, configuration objects are defined using EMF. In EMF, classes and their relationships are described using Ecore models. Ecore models are conceptually similar to (but simpler than) UML models, and they can be used to generate Java code. The generated code contains not only the Java classes corresponding to the classes defined in the Ecore model, but also code that allows the model to be introspected at runtime. This code is contained in implementations of the org.eclipse.emf.ecore.EPackage interface that are registered in the EMF package registry (EPackage.Registry.INSTANCE). The config service uses this EMF metadata to discover configuration object types, their attributes and their relationships with each other. Figure 2, “EMF model for the cluster configuration” shows an example of an EMF package used in the WebSphere configuration, namely in the cluster configuration.

Figure 2. EMF model for the cluster configuration

Starting with version 6.1, the WebSphere runtime has a modular design based on the Eclipse platform. It is split into a set of plug-ins, and each of these plug-ins can contribute EMF packages to the configuration model. In EMF, the standard mechanism to discover packages in the deployed plug-ins and to populate the package registry uses the org.eclipse.emf.ecore.generated_package extension point. However, WebSphere uses its own extension point for this, namely com.ibm.wsspi.extension.server-model-init.

WebSphere uses EMF as a facility to discover configuration object types and also to store graphs of configuration objects in XML documents. However, EMF provides no way to describe the overall structure of the repository, e.g. which document is used to store which type of configuration object, and which documents are expected in a given repository context. That kind of metadata is provided by a set of XML documents stored in the .repository folder in the repository itself. All the files in that folder have the same structure, as shown in Figure 3, “Structure of the metadata files in the .repository folder”.

Figure 3. Structure of the metadata files in the .repository folder

One can see that these files define the hierarchy of repository contexts as well as the documents that can exist in each context. The root-refobject-type elements also establish links between document types and the types of configuration objects they contain.

Files in the .repository folder are parsed by a class called com.ibm.ws.sm.workspace.metadata.impl.MetaDataParser. Metadata coming from different files is simply merged together by that parser. This means, for example, that a repository context is not necessarily completely defined in a single file; the definition may be scattered over multiple files. The reason for this is evidently to allow a product extension to modify an existing context definition by simply adding a file to the .repository folder.

At runtime, the metadata is modelled as shown in Figure 4, “Repository metadata”. Note that EClass represents the EMF metadata for a given configuration object type.

Figure 4. Repository metadata

3. Configuration object serialization; root and inner configuration objects
As mentioned earlier, WebSphere uses EMF not only to describe configuration objects, but also to serialize them to XML. This XML data is then stored as documents in the repository. More precisely, the format used is XMI (XML Metadata Interchange). It makes it possible to serialize EMF model instances to XML in a standard way without additional metadata (i.e. without the need to explicitly define the mapping to XML).

EMF explicitly supports the notion of containment. E.g. the ServerCluster class has an attribute members, which is a reference of type ClusterMember with multiplicity 0..*. This reference is modelled as a containment relation. In XMI, contained objects are serialized as child elements of the XML element representing the containing object, as shown in the following sample:

Example 1. Sample serialization of a ServerCluster object
<topology.cluster:ServerCluster xmi:version="2.0"
    xmlns:xmi="http://www.omg.org/XMI"
    xmlns:topology.cluster="http://www.ibm.com/websphere/appserver/schemas/5.0/topology.cluster.xmi"
    xmi:id="ServerCluster_1362990710886" name="cluster0" preferLocal="true"
    nodeGroupName="DefaultNodeGroup">
  <members xmi:id="ClusterMember_1362990716560" memberName="server1" weight="2"
      uniqueId="1362990714239" nodeName="ulx00003"/>
  <members xmi:id="ClusterMember_1362990717935" memberName="server2" weight="2"
      uniqueId="1362990716833" nodeName="ulx00004"/>
</topology.cluster:ServerCluster>
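To make the containment structure of the sample concrete, the following sketch reads it with Python's standard XML parser and lists the root object together with its contained objects. This is only an illustration of the document layout; it is not how WebSphere or EMF loads XMI, and the describe function is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

XMI_NS = "http://www.omg.org/XMI"

SAMPLE = """<topology.cluster:ServerCluster xmi:version="2.0"
    xmlns:xmi="http://www.omg.org/XMI"
    xmlns:topology.cluster="http://www.ibm.com/websphere/appserver/schemas/5.0/topology.cluster.xmi"
    xmi:id="ServerCluster_1362990710886" name="cluster0" preferLocal="true"
    nodeGroupName="DefaultNodeGroup">
  <members xmi:id="ClusterMember_1362990716560" memberName="server1" weight="2"
      uniqueId="1362990714239" nodeName="ulx00003"/>
  <members xmi:id="ClusterMember_1362990717935" memberName="server2" weight="2"
      uniqueId="1362990716833" nodeName="ulx00004"/>
</topology.cluster:ServerCluster>"""


def describe(xml_text):
    """Return (root type, root xmi:id, [(reference name, child xmi:id, memberName)])."""
    root = ET.fromstring(xml_text)
    xmi_id = "{%s}id" % XMI_NS
    # ElementTree reports tags as "{namespace}LocalName"; keep the local part.
    root_type = root.tag.split("}")[-1]
    children = [(child.tag, child.get(xmi_id), child.get("memberName"))
                for child in root]
    return root_type, root.get(xmi_id), children


if __name__ == "__main__":
    root_type, root_id, children = describe(SAMPLE)
    print(root_type, root_id)
    for reference, child_id, member_name in children:
        print("  %s -> %s (%s)" % (reference, child_id, member_name))
```

Note that the child elements are named after the containing reference (members), not after the type of the contained object (ClusterMember).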

Note that an XMI document can have one or more root objects. If there is more than one, the serialized objects are wrapped in an xmi:XMI element. E.g. a resources.xml document contains a collection of J2EEResourceProvider objects.

The existence of containment relations makes it possible to distinguish between two categories of configuration objects:

Root configuration object

A configuration object that is not contained in another configuration object and that appears at the top level of an XMI document.

Inner configuration object

A configuration object that is contained in another configuration object (which may be a root or an inner configuration object).

In principle, the same configuration object type could be used both for root configuration objects and for inner configuration objects (depending on the part of the configuration where it is used). However, this possibility is not actually used, which means that the distinction between root and inner applies not only to configuration objects, but also to configuration object types [3]. E.g. ServerCluster is a root configuration object type, and ClusterMember is an inner configuration object type.

4. Configuration IDs
A configuration object is uniquely identified by an identifier with the following structure:

name(context|file#id)

context identifies the repository context and is equal to the relative path of the corresponding folder in the repository, file identifies the document in that folder, and id identifies the configuration object inside that document. name is not a mandatory part of the configuration ID. It is returned by WebSphere for objects that have a name attribute, and its only purpose is to improve human readability. If the name contains spaces, then it is enclosed in quotes. For configuration IDs passed to WebSphere, the name is ignored. Example:

lx03(cells/test/nodes/lx03|node.xml#Node_1)

The format of the configuration ID is evidently chosen to allow WebSphere to load a configuration object quickly without scanning the entire repository.
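As an illustration of this structure, the following sketch splits a configuration ID into its parts. The ConfigId tuple and the parse_config_id function are hypothetical helpers, not a WebSphere API. Because the exact quoting rules are not spelled out above, the sketch tolerates quotes either around the name or around the whole ID; this is an assumption.

```python
from collections import namedtuple

ConfigId = namedtuple("ConfigId", "name context document xmi_id")


def parse_config_id(text):
    """Split 'name(context|file#id)' into its four parts (name may be None)."""
    text = text.strip()
    # Assumption: quotes may enclose the whole ID; strip them if present.
    if len(text) >= 2 and text[0] == '"' and text[-1] == '"':
        text = text[1:-1]
    if not text.endswith(")"):
        raise ValueError("not a configuration ID: %r" % text)
    # Use the last '(' so that names containing parentheses still work.
    open_idx = text.rfind("(")
    if open_idx < 0:
        raise ValueError("not a configuration ID: %r" % text)
    name = text[:open_idx].strip('"')
    context, sep, rest = text[open_idx + 1:-1].partition("|")
    if not sep:
        raise ValueError("missing '|' in configuration ID: %r" % text)
    document, sep, xmi_id = rest.partition("#")
    if not sep:
        raise ValueError("missing '#' in configuration ID: %r" % text)
    return ConfigId(name or None, context, document, xmi_id)


if __name__ == "__main__":
    cid = parse_config_id("lx03(cells/test/nodes/lx03|node.xml#Node_1)")
    print(cid)
    # Relative path of the document that holds the object:
    print(cid.context + "/" + cid.document)
```

The context and document parts together give the relative path of the file to load, which is what makes a lookup possible without scanning the repository.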

5. Parent-child relationships
5.1. How parent-child relations are defined
Configuration objects are linked by parent-child relations. They are defined in two different ways:

1. By containment relations in the EMF models (cf. previous section). E.g. a ClusterMember is a child of a ServerCluster.

2. Figure 4, “Repository metadata” shows that for every repository context, there is a single root document. E.g. for a cells/*/nodes/*/servers/* context, the root document is server.xml. Each of these root documents is expected to contain a single root configuration object. E.g. for a server.xml document this is a Server object. The root configuration object in the root document of a given context is considered the parent of the following set of configuration objects:

   • The root configuration objects in all documents in that context other than the root document.

   • The root configuration objects in all root documents of all child contexts.

   E.g. a JDBCProvider object stored in the resources.xml document in a cells/*/nodes/*/servers/* context is a child of the Server object in the server.xml document in that context. That Server object itself is a child of the Node object stored in the node.xml document of the parent context. This is illustrated by Figure 5, “Cross-document parent-child relations”.
[3] There are actually some exceptions to that rule. E.g. the PMIModule type is used both for root and inner configuration objects in pmiconfig.xml.

Figure 5. Cross-document parent-child relations

It is easy to see that with these definitions, a given configuration object can have at most one parent. Also note that the root context (corresponding to the repository root and not to be confused with the cells/* context) doesn't define a root document type. This means that the repository will contain multiple configuration objects that have no parents. This includes the Cell object (corresponding to the unique cells/* context) as well as one or more Policy objects (corresponding to the waspolicies/* contexts).

We have seen that documents are simply containers for configuration objects and that WebSphere defines parent-child relations that connect every configuration object to a limited set of configuration objects without parents (of which usually only one is relevant, namely the Cell object). This means that it is a good approximation to simply view the WebSphere configuration data as a tree of configuration objects, and that in practice one rarely has to deal directly with the concepts of repository context and document.

For a given configuration object type, the metadata described in Section 2, “Metadata” determines which types of configuration objects are allowed as parents and as children. This information can be queried using the getRelationshipsMetaInfo operation on the ConfigService MBean or the parents and contents methods of the AdminConfig object in wsadmin. E.g. for the CacheProvider type, the allowed parent types are Cell, Node, Server and ServerCluster.

5.2. Navigating parent-child relations
The AdminConfig object in wsadmin has several methods for navigating parent-child relations. For navigation from parents to children (or ancestors to descendants), these are:

showAttribute

This method can be used to navigate from a parent object to a child object if the parent-child relation is derived from a containment relation in EMF, i.e. if the child object is part of the same document as the parent object. This type of relation is always defined using an attribute on the parent object, and in this case the showAttribute method returns the configuration ID of the child object (or a list of configuration IDs for the child objects, depending on the multiplicity of the relation).

getid, list

These methods can be used to navigate any type of parent-child relation. Note that list actually performs a search in a subtree, i.e. it makes it possible to navigate directly from an ancestor to a descendant without following the intermediate parent-child relations. More on this in Section 6, “getid vs. list”.

5.3. Navigating from child to parent
While wsadmin has several methods to navigate from a configuration object to its children (or descendants), there is no universal method to navigate from a configuration object to its parent.

If the parent-child relation is derived from a containment relation in EMF, i.e. if the parent object is part of the same document as the child object, then it may be possible to navigate from the child to the parent. Some containment relations defined in the WebSphere configuration models have corresponding opposite references that point from the contained objects to the containing objects. An example can be seen in Figure 2, “EMF model for the cluster configuration”: the cluster attribute in the ClusterMember type defines a reference that is the opposite of the reference defined by the members attribute in ServerCluster. This means that it is possible to navigate from a ClusterMember back to the parent ServerCluster by querying the value of the cluster attribute (e.g. using showAttribute).

As can be seen in Example 1, “Sample serialization of a ServerCluster object”, the cluster attribute is not serialized in the XMI files. The reason is that, since it is an opposite reference, serializing it would be completely redundant.

6. getid vs. list
As mentioned in Section 5.2, “Navigating parent-child relations”, the getid and list methods of the AdminConfig object in wsadmin can both be used to navigate parent-child relations. They differ not only in usage and features, but also in the way they locate configuration objects.

6.1. getid
getid locates configuration objects matching a given containment path. A containment path is a sequence of type/name pairs, the name part being optional. That sequence is passed to getid as a single string with the following syntax:

/Node:lx03/Server:server1/JDBCProvider:/

Note that the corresponding operation on the ConfigService MBean (see Appendix A, Correspondence between MBean operations and scripting methods) uses a different syntax:

Node=lx03:Server=server1:JDBCProvider=

The first step in the containment path is evaluated using a subtree search starting from the root repository context, using the same approach as the list method (see below). The result of the evaluation of the first step is a set of configuration objects. Subsequent steps are evaluated iteratively, and each time a new set of configuration objects is constructed by locating the matching child objects of the configuration objects returned by the previous step. The result of the getid method is the set of configuration objects (more precisely their configuration IDs) returned by the last step.

Child objects are located using the definitions described in Section 5.1, “How parent-child relations are defined”. This process always needs to consider parent-child relations defined by containment relations in EMF. Note that such relations never cross document boundaries. For a configuration object that is the root configuration object in the root document of a repository context, additional parent-child relations need to be considered, and the situation is more complex because these relations cross document and repository context boundaries. Since the type of the child configuration objects is known in advance, getid can use the metadata described in Figure 4, “Repository metadata” to narrow down the documents (and contexts) that need to be examined. For example, to evaluate the second step of the containment path shown above, WebSphere will use its knowledge that a Server object can only exist as a child object of a Node object if it is contained in a server.xml document in a subcontext of type servers (see Figure 5, “Cross-document parent-child relations”). Therefore it only needs to examine a very limited number of documents (corresponding to the number of WebSphere instances defined on that node).
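The relation between the two containment path syntaxes shown above can be sketched as a simple conversion. The helper names are hypothetical, and the sketch assumes that object names contain neither "/" nor ":"; how such names would have to be escaped is not covered here.

```python
def parse_containment_path(path):
    """Turn '/Node:lx03/Server:server1/JDBCProvider:/' into (type, name) steps.

    An empty name means that the step matches objects of that type
    regardless of their name.
    """
    steps = []
    for segment in path.strip("/").split("/"):
        if not segment:
            continue
        type_name, _, name = segment.partition(":")
        steps.append((type_name, name))
    return steps


def to_mbean_syntax(path):
    """Rewrite a wsadmin-style containment path in the type=name:... form."""
    return ":".join("%s=%s" % step for step in parse_containment_path(path))


if __name__ == "__main__":
    wsadmin_path = "/Node:lx03/Server:server1/JDBCProvider:/"
    print(parse_containment_path(wsadmin_path))
    print(to_mbean_syntax(wsadmin_path))
```

Each (type, name) step corresponds to one iteration of the evaluation described above: the first step is a subtree search, and every later step selects matching children of the previous result set.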

6.2. list
list performs a subtree search for configuration objects of a given type. The one-argument variant of the list method always starts from the root repository context. The two-argument variant takes a scope object from which to start the subtree search. If the scope object is an inner configuration object or a root configuration object that doesn't correspond to a root document, then by definition, the search is limited to a single document. On the other hand, if no scope object is given or if the scope object is the root configuration object of a root document, then the search scope corresponds to an entire repository context (including all descendants). This may cover a very large number of documents. To avoid scanning all these documents, the list method uses a specific algorithm to address that case [4]:

1. Determine the document types that may contain objects of the type being searched for. If the type being searched for is a root configuration object type, then the metadata shown in Figure 4, “Repository metadata” provides this information directly. If it is an inner configuration object type, then the EMF metadata is used to determine the corresponding root configuration object type first.

2. Recursively search the repository context for candidate documents. This simply uses the fact that all documents of a given type have the same file name.

3. Search each document found for configuration objects of the expected type.
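Steps 2 and 3 of the algorithm above can be sketched on a plain directory tree as follows. This is an illustration only, not WebSphere's implementation: the function names are hypothetical, the outcome of step 1 (the file name to look for) is passed in as a parameter, and only root configuration objects are matched, since the element name of an inner object is the name of the containing reference rather than its type.

```python
import os
import xml.etree.ElementTree as ET

XMI_ID = "{http://www.omg.org/XMI}id"


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.split("}")[-1]


def find_candidate_documents(context_dir, file_name):
    """Step 2: collect every file with the expected name below context_dir."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(context_dir):
        if file_name in filenames:
            found.append(os.path.join(dirpath, file_name))
    return sorted(found)


def root_objects(document_path):
    """Return the root objects of an XMI document, unwrapping xmi:XMI if present."""
    root = ET.parse(document_path).getroot()
    if local_name(root.tag) == "XMI":
        return list(root)
    return [root]


def list_root_objects(context_dir, type_name, file_name):
    """Step 3: return the xmi:id of each root object of the given type."""
    ids = []
    for path in find_candidate_documents(context_dir, file_name):
        for obj in root_objects(path):
            if local_name(obj.tag) == type_name:
                ids.append(obj.get(XMI_ID))
    return ids
```

For example, list_root_objects(config_dir, "Server", "server.xml") would only open files named server.xml, however many other documents the tree contains.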

6.3. Metadata inconsistencies
WebSphere doesn't validate that the repository structure conforms to the metadata described in Figure 4, “Repository metadata”. This implies that the repository may contain documents that, according to the repository metadata in .repository, are not expected to be there. E.g. .repository/metadata_cei.xml specifies that a resources-cei.xml file with root object type EventInfrastructureProvider can appear in repository contexts of type cells, nodes and servers, but not clusters. Nevertheless, when a cluster is created, WebSphere also creates a resources-cei.xml document in the corresponding context. [5]

Such inconsistencies have interesting consequences because getid and list use fairly different algorithms to locate configuration objects, and these algorithms are affected differently by inconsistencies with respect to the repository metadata:

• getid with containment path /ServerCluster:/EventInfrastructureProvider:/ will return an empty list, even if one or more clusters exist.

• getid with containment path /EventInfrastructureProvider:/ will return a list that contains EventInfrastructureProvider objects in cluster scope.

• list will also return EventInfrastructureProvider objects in cluster scope.

The reason for these discrepancies is as follows. In the first case, the config service would have to navigate from the ServerCluster object to a child object of type EventInfrastructureProvider. However, according to the repository metadata, such a parent-child relation doesn't exist; according to that metadata, an EventInfrastructureProvider object can only exist as a child of a Cell, Node or Server object. Therefore, the EventInfrastructureProvider object is not found.

In the other cases, the config service performs a subtree search using the optimized algorithm described in Section 6.2, “list”. From the metadata it knows that EventInfrastructureProvider objects are stored in documents named resources-cei.xml, and it will recursively search for documents with that name. This search will return resources-cei.xml documents stored in repository contexts of type clusters, even if according to the metadata they should not exist. That is why in these cases, the config service is able to locate the corresponding EventInfrastructureProvider objects.

[4] To see this in action, use the following trace spec:

com.ibm.ws.management.configservice.*=all:com.ibm.ws.sm.workspace.impl.*=all

This will show the calls to the com.ibm.ws.sm.workspace.impl.RepositoryContextImpl#getAllList method, which returns the list of all documents in a given context. That call is followed by a call to com.ibm.ws.management.configservice.WorkspaceHelper#getDocAccessor for each document with a matching name.

[5] According to IBM support, this is a known defect that will not be fixed because (1) it only affects getid and a workaround is to use list instead, (2) IBM believes that correcting the metadata may have side effects causing problems elsewhere and (3) CEI is being deprecated.
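The difference between the two lookup strategies can be reproduced with a toy model. Everything in the following sketch is hypothetical: the metadata excerpt and the repository contents are reduced to plain Python data, and the two functions only mimic the behaviour described above; they are not WebSphere code.

```python
# Hypothetical excerpt of the repository metadata: for each document name,
# the context types in which the metadata says the document may appear.
ALLOWED_CONTEXT_TYPES = {
    "resources-cei.xml": {"cells", "nodes", "servers"},
}

# Toy repository contents as (context type, context name, document name).
REPOSITORY = [
    ("cells", "test", "resources-cei.xml"),
    ("servers", "server1", "resources-cei.xml"),
    ("clusters", "cluster0", "resources-cei.xml"),  # not foreseen by the metadata
]


def children_via_metadata(parent_context_type, document_name):
    """getid-style step: follow only parent-child relations the metadata declares."""
    allowed = ALLOWED_CONTEXT_TYPES.get(document_name, set())
    if parent_context_type not in allowed:
        return []
    return [entry for entry in REPOSITORY
            if entry[0] == parent_context_type and entry[2] == document_name]


def search_by_name(document_name):
    """list-style search: pick up every document with the expected file name."""
    return [entry for entry in REPOSITORY if entry[2] == document_name]


if __name__ == "__main__":
    print(children_via_metadata("clusters", "resources-cei.xml"))
    print(search_by_name("resources-cei.xml"))
```

In this model the metadata-driven lookup never visits the clusters context, while the name-based search finds the document there, which mirrors the three observations listed above.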

7. Configuration objects and WebSphere variables
The product documentation defines WebSphere variables as follows:

    WebSphere variables are name and value pairs that are used to provide settings for any of the string data type attributes that are used to configure the product.

This implies that WebSphere variables can be used in any attribute of type string of any configuration object. As a matter of fact, at runtime [6], variable substitution occurs before the config service returns the value of an attribute and is therefore supported everywhere in the configuration. This can be seen in Figure 6, “WebSphere variable expansion”.

Figure 6. WebSphere variable expansion

In this case, the EndPointMgr component looks up the value of the host attribute of a configuration object. That results in a call to the VariableMap component to substitute WebSphere variables in that attribute value.

[6] That is, when a WebSphere component requests configuration data; wsadmin will always see the unexpanded value.
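The effect of this substitution step can be sketched as follows. The sketch assumes the ${NAME} reference syntax and a plain dictionary of variables; it is not the VariableMap implementation, and the expand function and its max_depth parameter are hypothetical.

```python
import re

# Assumption: variables are referenced as ${NAME}.
_VARIABLE = re.compile(r"\$\{([^}]+)\}")


def expand(value, variables, max_depth=10):
    """Replace ${NAME} references in value, resolving nested references.

    Unknown variables are left untouched. A ValueError is raised if the
    value still changes after max_depth passes (e.g. a cyclic definition).
    """
    for _ in range(max_depth):
        expanded = _VARIABLE.sub(
            lambda match: variables.get(match.group(1), match.group(0)), value)
        if expanded == value:
            return expanded
        value = expanded
    raise ValueError("variable references nested too deeply or cyclic")


if __name__ == "__main__":
    variables = {
        "USER_INSTALL_ROOT": "/opt/was/profiles/AppSrv01",
        "LOG_ROOT": "${USER_INSTALL_ROOT}/logs",
    }
    print(expand("${LOG_ROOT}/server1/trace.log", variables))
```

A component that reads its configuration without going through such a step sees the raw attribute value, which is the situation described for wsadmin in the footnote.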

One important thing to understand is that not all of the WebSphere configuration is stored in configuration objects managed by the config service. Some WebSphere components store configuration data as documents in the repository, but without letting the config service manage them as documents containing configuration objects serialized using EMF. An example of a WebSphere subsystem that does this is the Virtual Member Manager (VMM). It stores a number of documents with configuration data in the repository under cells/*/wim, but these are not XMI documents, and there are no configuration object types corresponding to the contents of these documents, i.e. they can't be managed using the config service.

WebSphere variable substitution doesn't automatically apply to this type of configuration data. It is only supported if the WebSphere component or subsystem takes care of it itself. For VMM this is not the case. This explains, for example, why the LDAP host in a federated user repositories configuration (which is part of VMM and therefore doesn't use the config service) cannot be specified using a WebSphere variable, while for a stand-alone LDAP registry (which uses the config service), this is supported.

A. Correspondence between MBean operations and scripting methods
Table A.1. Correspondence between AdminConfig methods and ConfigService operations

AdminConfig method (wsadmin)    ConfigService MBean operation
parents                         getRelationshipsMetaInfo
contents                        getRelationshipsMetaInfo
list                            queryConfigObjects
getid                           resolve
