On Demand Router common problems
A Step-by-Step guide to troubleshoot
Author:
Premium Support Analysts /XD Team
12/7/2015
Table of Contents
Business Objectives
Common Problems
ODR Tracing
Bypass ODR
ODR Resources
Business Objectives
The objectives of this document are to address several of the most common problems
observed by IBM Premium Support Analysts while supporting ODR, to help Wal-Mart
engineers avoid unnecessary outages, respond to common problems efficiently, enhance
customer skills, share lessons learned and provide great tips. This document also provides
step-by-step guidance for resolution of common problems and links to IBM resources for
WebSphere Virtual Enterprise On-Demand Router (ODR).
Note that an entry will appear in local.log with a 404 return code if the Virtual
Host of the target application is not configured to work with the ODR. A Host
Alias must exist for the ODR that is associated with the Virtual Host for the target
application; use the ODR PROXY_HTTP_ADDRESS (non-secure) and/or
PROXY_HTTPS_ADDRESS (secure) ports for the aliases. Normally, the ODR
ports are 80 and 443, and by default there are Host Aliases for those ports on
the default_host Virtual Host. So, if you are using the default_host for your
application and your ODR ports are 80 and 443, you shouldn't have to add any
aliases.
If you are seeing HTTP 302 return codes in the local.log file, it's possible that
you have not configured the ODR to trust the Web Server. This can happen
when running the ODR as non-root. Go to the XD Information Center and
search on "trusted web server" to see how this configuration is done. You can
also search the WAS Information Center for "trusted web server", but the XD Info
Center gives a better description.
An alternative to configuring the ODR to trust the Web Server is to configure
rewriting rules for the ODR, which can also resolve the 302 return codes. This is
another ODR feature inherited from the WAS Proxy Server. To find information on how to
configure it, go to the WAS Information Center and search on "Rewriting rules
configuration".
A particular request will appear in only one of the log files, so you may have to look in
all three to determine whether the request was received. If the request doesn't appear in
any of the log files, one of four things has happened:
The ODR did not receive the request.
This can happen if the host name or port number for the ODR is incorrect on the
request. The port to use is the PROXY_HTTP_ADDRESS (non-secure) or
PROXY_HTTPS_ADDRESS (secure).
The ODR received the request, has routed the request to a remote server but has
not (yet) received a response.
If you think this may have happened, there are a couple of things to do. First,
check the ODR outbound request timeout setting (Servers -> On Demand Routers
-> ODR Name -> On Demand Router Properties -> On Demand Router settings
-> Outbound request timeout). It's possible that the ODR has routed the request,
the remote server hasn't responded, and you have checked the log file before the
request has timed out. An entry is not made in an access log until the ODR
returns a response to the client.
You can verify that the ODR has received and routed a request by configuring
ODR tracing. Refer to the next section of this document where it shows how to
enable the trace and shows an example trace of a routed request.
The ODR has received the request, but a defect in the ODR has prevented the
request from being logged.
This is a very unlikely possibility, but not out of the question. If you think this
may be happening, configure ODR tracing (refer to the next section of this
document) and verify that the request was received by the ODR and routed.
Access logging has been disabled in the ODR settings.
Access logging is enabled for the ODR by default. Check this setting by going to
Servers->On Demand Routers -> ODR Name -> On Demand Router Properties ->
On Demand Router settings -> Logging -> Enable access logging.
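Because a request is written to only one of the access logs, checking for it can be scripted. The following is a minimal Python sketch, not an IBM-supplied tool: it does a plain substring search and assumes you pass it the paths of the ODR access logs yourself (the function name and the example paths are illustrative).

```python
from pathlib import Path


def find_request(log_paths, needle):
    """Return {log path: [matching lines]} for every log that mentions needle."""
    hits = {}
    for path in log_paths:
        p = Path(path)
        if not p.is_file():
            continue  # a log that does not exist cannot contain the request
        with p.open(errors="replace") as handle:
            matches = [line.rstrip("\n") for line in handle if needle in line]
        if matches:
            hits[str(p)] = matches
    return hits
```

For example, `find_request(["logs/odr1/local.log", "logs/odr1/proxy.log"], "/A/CpuLoad")` returns an empty dictionary when none of the listed logs contains the URI, which is the situation the four cases above describe.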
The access logging settings can be changed at the ODR settings page (Servers->On
Demand Routers -> ODR Name -> On Demand Router Properties -> On Demand Router
settings -> Logging -> Enable access logging). On this page you can also configure the
maximum size of the access log plus the names and locations of the log files.
Note that you may see access log configuration settings associated with the web container
(Servers->On Demand Routers -> ODR Name -> Container Settings -> Web Container
Settings -> Web container transport chains -> WCInboundDefault -> HTTP inbound
channel (HTTP 1)). The Enable access and error logging setting and any configuration
settings from Related Items -> HTTP error and NCSA access logging have no effect on
the ODR access logging. One result of this is that the access logging format for the
ODR cannot be configured for NCSA combined as it can for the WAS internal HTTP
servers.
Common Problems
1 HTTP Response Code 404
Symptom
A 404 means the requested resource was not found. The URL does not match any
application known to the ODR.
Analyzing
The things of interest are the request's host, port, context root, and URI. The ODR's
target.xml will prove very useful. Using the target.xml, first check that a vhost exists that
corresponds to the requested URL's host and port. If they match, then look at the web
modules linked to this vhost. Those are the applications which the request may be
mapped to.
<vHostGroup name="default_host">
<!-- vHost section -->
<vHost name="myhost.com:9098">
<property name="port" priority="0" value="9098"/>
</vHost>
<vHost name="*:80">
<property name="port" priority="0" value="80"/>
</vHost>
<!-- webModule section -->
<link
name="/cell/dabtcCell07/application/dabtcNode03/odr1/GenericApplication/webModule/
Default_default_host_HTTP_WC"/>
<link name="/cell/dabtcCell07/application/A/webModule/microwebapp.war"/>
<link name="/cell/dabtcCell07/application/A/webModule/microsipapp.war"/>
<link
name="/cell/dabtcCell07/application/DefaultApplication/webModule/DefaultWebApplica
tion.war"/>
</vHostGroup>
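The vhost check described above can be sketched in Python. This is an illustration only: it assumes the vHostGroup fragment has been saved as well-formed XML, that a vHost name has the form host:port, and that `*` matches any host. The function name and the trimmed sample are not part of the ODR.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the vHostGroup excerpt, closed so that it parses on its own.
SAMPLE = """
<vHostGroup name="default_host">
  <vHost name="myhost.com:9098"/>
  <vHost name="*:80"/>
  <link name="/cell/dabtcCell07/application/A/webModule/microwebapp.war"/>
</vHostGroup>
"""


def modules_for(vhost_group_xml, host, port):
    """Return the linked web module paths if host:port matches a vHost, else []."""
    group = ET.fromstring(vhost_group_xml)
    for vhost in group.findall("vHost"):
        alias_host, _, alias_port = vhost.get("name").rpartition(":")
        if alias_port == str(port) and alias_host.lower() in ("*", host.lower()):
            return [link.get("name") for link in group.findall("link")]
    return []
```

An empty result for the request's host and port points to a missing Host Alias rather than a missing application.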
Now look at each of those available applications and see whether the URI matches any of their
context roots and URI patterns. If it does, then the 404 may not be appropriate, and the ODR
must-gather should be analyzed.
<application name="A">
<property name="isWebSphere" value="true"/>
<property name="root" priority="0" value="A"/>
<property name="edition" priority="0" value=""/>
<property name="state" priority="0" value="ACTIVE"/>
<!-- webModule section -->
<webModule name="microwebapp.war">
<property name="id" priority="0" value="dabtcCell07/A/microwebapp.war"/>
<property name="contextRoot" priority="0" value="/A"/>
<property
name="fileServingEnabled" priority="0" value="true"/>
<property name="serveServletsByName" priority="0" value="true"/>
<property name="protocolMap" priority="0" value="direct"/>
<property name="routingEnabled" priority="0" value="true"/>
<!-- uri section -->
<uri name="/"/>
<uri name="/*.jsp"/>
<uri name="/*.jsv"/>
<uri name="/*.jsw"/>
<uri name="/servlet/*"/>
<uri name="/CpuLoad"/>
<uri name="/IOBound"/>
</webModule>
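The context root and URI pattern comparison can be approximated as follows. This is a rough sketch under stated assumptions: it uses Python's shell-style `fnmatch` matching, which is only an approximation of how the ODR evaluates URI patterns, and the function name is illustrative.

```python
from fnmatch import fnmatchcase

# URI patterns copied from the microwebapp.war excerpt above.
PATTERNS = ["/", "/*.jsp", "/*.jsv", "/*.jsw", "/servlet/*", "/CpuLoad", "/IOBound"]


def matches_module(request_path, context_root, uri_patterns):
    """True if request_path is under context_root and matches a URI pattern."""
    root = context_root.rstrip("/")
    if request_path != root and not request_path.startswith(root + "/"):
        return False  # request is not under this context root at all
    remainder = request_path[len(root):] or "/"
    return any(fnmatchcase(remainder, pattern) for pattern in uri_patterns)
```

With the excerpt's values, `matches_module("/A/CpuLoad", "/A", PATTERNS)` is true, while a path such as `/A/unknown` matches none of the listed patterns, so a 404 for it would be expected.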
Behavior
1. If the plug-in detects an issue, it marks the ODR as down and does not forward
requests to it. This results in an error code 500, and no logs are generated on the
ODR side.
2. If an error code 500 is generated and logged in local.log, something is
wrong on the ODR side.
3. If an error code 500 is logged in proxy.log, the problem is usually on the back-end
application server.
4. If the 500 is logged in local.log, IBM will need the ODR must-gather, collected with the
channel strings enabled on the ODR.
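The four points above amount to a small decision table keyed on where the 500 was logged. A minimal Python sketch of that table follows; the function name and the wording of the returned hints are illustrative.

```python
def triage_500(in_local_log, in_proxy_log):
    """Map where a 500 was logged to the likely problem area described above."""
    if in_local_log:
        return "ODR side: collect the ODR must-gather for IBM"
    if in_proxy_log:
        return "back-end application server"
    # Nothing logged on the ODR: the request probably never reached it.
    return "plug-in may have marked the ODR down and not forwarded the request"
```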
Analyzing
The following are some common things to look for, which may be seen in the ODR's
target.xml:
Are the applications available and started on the servers?
Is the application server in maintenance mode?
Is the application server reachable?
Does the application server have a weight of 0?
Here is an example segment from target.xml.
<server name="TestClusterA_dabtcNode03">
<property name="type" priority="1" value="APPLICATION_SERVER"/>
<property name="state" priority="1" value="STARTED"/>
<property name="reachable" priority="0" value="true"/>
<property name="weight" priority="1" value="20"/>
<property name="server.maintenancemode" priority="1" value="false"/>
<!-- serverApplication section -->
<serverApplication name="A">
<property name="state" priority="1" value="STARTED"/>
</serverApplication>
If nothing jumps out, then ODR must-gather should be analyzed.
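The checks listed above can be run against a saved server fragment with a short script. The following Python sketch is an illustration, assuming the fragment is well-formed XML with the property names shown in the example segment; the function names and the closed sample are not part of the product.

```python
import xml.etree.ElementTree as ET

# Copy of the example segment, closed so that it parses on its own.
SAMPLE = """
<server name="TestClusterA_dabtcNode03">
  <property name="state" priority="1" value="STARTED"/>
  <property name="reachable" priority="0" value="true"/>
  <property name="weight" priority="1" value="20"/>
  <property name="server.maintenancemode" priority="1" value="false"/>
  <serverApplication name="A">
    <property name="state" priority="1" value="STARTED"/>
  </serverApplication>
</server>
"""


def props(element):
    """Collect the direct <property> children of an element as a dict."""
    return {p.get("name"): p.get("value") for p in element.findall("property")}


def server_problems(server_xml, app_name):
    """Return a list of reasons the server might not receive requests."""
    server = ET.fromstring(server_xml)
    p = props(server)
    problems = []
    if p.get("state") != "STARTED":
        problems.append("server not started")
    if p.get("reachable") != "true":
        problems.append("server not reachable")
    if p.get("weight") == "0":
        problems.append("server weight is 0")
    if p.get("server.maintenancemode", "false") != "false":
        problems.append("server in maintenance mode")
    apps = {a.get("name"): props(a) for a in server.findall("serverApplication")}
    if apps.get(app_name, {}).get("state") != "STARTED":
        problems.append("application %s not started on this server" % app_name)
    return problems
```

An empty list for every server that hosts the application corresponds to the "nothing jumps out" case.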
Condition # 2) If (avg. RT - avg. SRT ) > 1 sec for ALL WCM ODRs
I.
J.
K.
L.
M.
N.
activate
Condition # 3)
If adding additional WCM ODRs does not resolve the problem
Step #1)
Invoke the odrPMIStats.py script again to confirm that the WCM ODR is adding multiple seconds to the
request flow
Step #2)
Perform a rolling restart of all WCM ODRs
Step #3)
Provide the trace, cores, netstat, and PMI output to IBM
Possible Cause
If the ODR cannot receive requests from the Web Server, check your trusted secure proxy
server configuration. If your Web Servers are not defined as trusted secure proxy servers,
configure them as such.
The configuration details can be found at the link below.
http://publib.boulder.ibm.com/infocenter/wxdinfo/v6r1/index.jsp?topic=/com.ibm.websphere.ops.doc/info/odoe_task/tccgodrscen.html
6 Errors occur when attempting to start the ODR from the V6.0.2.x Admin Console
Symptom
The ODR does not start from the administrative console for WebSphere Extended
Deployment Version 6.0.2.x when Firefox 2.0.x on SUSE Linux Enterprise Server
(SLES) is used.
Cause
A request to start an ODR from the admin console times out when Firefox 2.0.x is used.
review your work class default actions and rule actions, and apply and save your changes
directly in the console. For more information, see the Creating routing policies for
application editions topic in the WebSphere Virtual Enterprise Information Center.
Cause
The node listed in the ARFM0173W message might not have been augmented with the
WebSphere Extended Deployment template.
The node listed in the ARFM0173W log entry needs to be augmented with a WebSphere
Extended Deployment template.
Symptom
Rewrite Rules do not work as expected
Routing rules do not work as expected
Symptom
503 responses are returned
Unexpected system outages
Symptom
Web Server Definitions are not supported
On Demand Router (ODR) is used to manage Web Servers
Having an application mapped to a Web Server definition could cause unexpected
exceptions
Symptom
If multiple products are installed on the same WAS at different maintenance levels
Requests will fail
Servers will not start
Unexpected exceptions will be thrown
Symptom
XD uses reports and visualization logs to monitor the state of the cell
PMI is enabled by default on the back end servers to gather statistics and for health
monitoring.
Symptom
Symptom
How to use CIM without internet connection
SSH is used by default on Linux/UNIX platforms
Incompatibility between SSH protocols
Symptom
If core groups are configured incorrectly, it can cause loss of service
Too many members in a core group
Multiple core groups not bridged together
Core Bridge automatic configuration not disabled
Symptom
APC can place servers in maintenance mode
Max heap size on the nodeAgents is not increased
ODR Tracing
If a problem persists, follow the must-gather link to capture the tracing data
that matches the symptoms. IBM Support occasionally modifies the trace strings, so it is a
good idea to visit the link below.
MustGather: Read first for WebSphere Extended Deployment
http://www-01.ibm.com/support/docview.wss?rs=3023&uid=swg21224033
WebSphere Extended Deployment
http://www-01.ibm.com/support/search.wss?rs=3023&apar=include&q1=MustGatherDocument&tc=SSPPLQ&loc=en_US&cs=utf8&lang=en&sort=rk&p=3
manageODC.py script
The manageODC.py script manages the ODC tree. The ODC tree is an in-memory
representation of the state of a WebSphere Application Server cell.
Purpose
The manageODC.py script can add and remove nodes and edges, or modify the value of
properties on a node. You can also use the script when troubleshooting routing policy
errors for the on demand router.
Location
The manageODC.py script is located in the install_root/bin directory.
Usage
The script usage for general help follows:
./wsadmin.sh|bat -lang jython -f manageODC.py
The script usage for operation-specific help follows:
./wsadmin.sh|bat -lang jython -f manageODC.py operation help
Generate a target.xml file to determine the ODC names to plug into the script
There are three ways to generate a target.xml file
Operation
You can perform the following operations with the manageODC.py script:
manageODC.py can be found under $WAS_HOME/bin
Operation: removeODCNode Removes a node.
Parameters:
odcNodePath Specifies the full ODC tree path of the node to remove.
nodeName Specifies the name of the WebSphere node containing the server that
initiates the removal.
serverName Specifies the name of the server to initiate the removal.
Operation: addODCNode Creates a new node.
Parameters:
odcParentNodePath Specifies the full ODC tree path for the parent of the new
node to be created.
odcNodeType Specifies the ODC node type of the new node to be created.
newNodeName Specifies the name of the new node to be created.
nodeName Specifies the name of the WebSphere node containing the server that
initiates the addition.
serverName Specifies the name of the server to initiate the addition.
[--p odcPropertyDescriptor priority::value] odcPropertyDescriptor specifies the name of the
ODC property to be set on the new node; priority::value is the priority and value to set the
ODC property to on the new node. Priority can be omitted if the default value is used.
[--l linkOdcNodePath] Specifies the full ODC tree path of the node for which an
edge is to be created.
Operation: removeODCEdge Removes the link between two nodes.
Parameters:
odcNodePathA odcNodePathB Specifies the full ODC tree paths of the nodes to be unlinked.
nodeName Specifies the name of the WebSphere node containing the server that initiates
the removal.
serverName Specifies the name of the server to initiate the removal.
Operation: addODCEdge Links one node to another node.
Parameters:
odcNodePathA odcNodePathB Specifies the full ODC tree paths of the nodes to be linked.
nodeName Specifies the name of the WebSphere node containing the server that initiates
the addition.
serverName Specifies the name of the server to initiate the addition.
Operation: modifyODCProperty Modifies a specified ODC property of a node.
Parameters:
odcNodePath Specifies the full ODC tree path of the node whose property is to be
modified.
odcPropertyDescriptor Specifies the name of the ODC property to be modified.
priority::value Specifies the priority and value to set the ODC property to. Priority can
be omitted if the default value is used.
nodeName Specifies the name of the WebSphere node containing the server that initiates
the modification.
serverName Specifies the name of the server to initiate the modification.
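To avoid typing long ODC paths by hand, the wsadmin invocation can be assembled in a small helper. The sketch below is illustrative only. It assumes that the parameters are passed positionally in the order they are listed above, and the ODC path, property name, node name, and server name are hypothetical placeholders that must be replaced with values taken from your own target.xml.

```python
def manage_odc_command(operation, *args, wsadmin="./wsadmin.sh"):
    """Build the wsadmin argument list for a manageODC.py operation."""
    return [wsadmin, "-lang", "jython", "-f", "manageODC.py", operation, *args]


# All five values below are hypothetical placeholders, not real ODC names.
cmd = manage_odc_command(
    "modifyODCProperty",
    "/cell/Cell1/node/my_node/server/my_server",  # odcNodePath (placeholder)
    "weight",                                     # odcPropertyDescriptor (placeholder)
    "1::0",                                       # priority::value (placeholder)
    "my_node",                                    # nodeName (placeholder)
    "my_ODR",                                     # serverName (placeholder)
)
print(" ".join(cmd))
```

The helper only builds the argument list. Review the printed command, and check the operation-specific help (`manageODC.py operation help`) for the exact argument order, before running it from the install_root/bin directory.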
Generate target.xml
Generate a target.xml file to determine the ODC names to include in the script. The
following code example shows a shortened version of a target.xml file, where parameters
for the cell, node, and server that you want to use in the script are located.
Command to generate target.xml
./wsadmin.sh -lang jython -f odrDebug.py setHttpDebug my_node my_ODR 503 true 2
(The above command generates a full target.xml in the ODR's SystemOut.log when you hit a 503
routing failure through that ODR.)
<cellGroup name="target">
<!-- cell section -->
<cell name="Cell1">
cfg.xml file to redirect specific requests that you do not want to route through the ODR.
Instead, the requests are routed directly from the Web server to the back-end server.
Alternatively, you can reset the custom property so that the ODR resumes intercepting
requests. The format of the custom property value is a comma-separated list of module
paths, such as cell/appName/edition/moduleName=value.
Procedure
In the administrative console,
click System administration > Cell > Custom Properties > New.
Type ODR_Module_Routing_Policy as the name of the custom property.
Type the value of the custom property.
Set the value to cell/appName/edition/moduleName=direct to route requests directly to the
back-end server.
Set the value to cell/appName/edition/moduleName=ODR to route requests through the ODR
before the back-end server receives the requests.
For example, if you set the value to
cell/app/edition/module=direct,cell/app2/edition/module=ODR ,
each module is configured independently as to whether requests for that module are sent
through the ODR or directly to the back-end server. You can use a wildcard (*) in place
of the appName, edition, and moduleName variables.
Click Apply and save your changes.
Example
In the following example, the custom property is set to route requests to a back-end
application server. A wildcard is used in place of the appName, edition, and moduleName
variables.
myCell/*/*/*=direct
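Before saving the custom property, it can help to check what a given value would do for a particular module. The following Python sketch parses the comma-separated format described above. Two points are assumptions rather than documented behavior: the first matching entry wins, and a module that matches no entry is routed through the ODR. The function names are illustrative.

```python
from fnmatch import fnmatchcase


def parse_policy(value):
    """Split 'cell/app/edition/module=action,...' into (pattern, action) pairs."""
    rules = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        path, sep, action = entry.rpartition("=")
        if not sep or len(path.split("/")) != 4:
            raise ValueError("malformed entry: %r" % entry)
        rules.append((path, action))
    return rules


def routing_for(rules, module_path, default="ODR"):
    """Return the action of the first rule whose four segments match module_path."""
    parts = module_path.split("/")
    for pattern, action in rules:
        pats = pattern.split("/")
        if len(parts) == len(pats) and all(
            fnmatchcase(part, pat) for part, pat in zip(parts, pats)
        ):
            return action
    return default  # assumption: unmatched modules go through the ODR
```

For example, `routing_for(parse_policy("myCell/*/*/*=direct"), "myCell/A/1.0/microwebapp.war")` returns `direct`; the edition `1.0` is a made-up value for the illustration.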
Step2. Ensure ports are open between Web Server and Application Server
Step3. Ensure SSL files are copied to Web Servers, if required
Step4. Propagate newly generated plugin-cfg.xml file on Web Servers
Step5. Restart the Web Servers
Step6. Enable plug-in tracing in the httpd.conf file
Step7. Perform the validation of your application.
Step8. If everything is working as expected, disable the plug-in trace in the httpd.conf file
ODR Resources
IBM Information Center
http://publib.boulder.ibm.com/infocenter/wxdinfo/v6r1/index.jsp?topic=/com.ibm.websphere.ops.doc/info/welcome_61_ooxd.html
WebSphere Extended Deployment Support
http://www-01.ibm.com/software/webservers/appserv/extend/support/
Recommended fixes for WebSphere Extended Deployment
http://www-01.ibm.com/support/docview.wss?rs=3023&uid=swg27005709
developerWorks
http://www.ibm.com/developerworks/wikis/display/xdoo/Home
If the problem persists, please open a PMR by accessing the link below or call 1-800-IBM-SERV
http://www-01.ibm.com/software/support/probsub.html