
Best Practice

Subject: Sybase ASE Monitoring


Author(s): Tom Oorebeek, Staff DBA, Sybase IT
Reviewer(s): Hema Seshadri, Sr. DBA Manager, Sybase IT

Abstract: In today's information-driven world, high availability and optimal performance of the database infrastructure are more important than ever. Many aspects of a business rely on being able to retrieve real-time data from its databases. Monitoring this critical infrastructure helps maximize uptime. Monitoring is important not just from an overall availability and performance perspective, but also from the perspective of the end user, so that business productivity is not compromised. This document discusses best practice ideas on which aspects of a production ASE environment should be monitored.

Sybase, Inc. 2009


Table of Contents
Introduction
1 Best Practices: ASE Monitoring
    1.1 ASE versions
    1.2 Operating systems
    1.3 Tools
    1.4 Monitoring aspects
    1.5 What next?
2. ASE resources
    2.1 The ASE itself
    2.2 Licenses
    2.3 Database Availability
    2.4 Data Storage
    2.5 Disk Space
    2.6 User Activity
    2.7 Error Logs
    2.8 Blocking
    2.9 Data Consistency
Appendix A


Introduction
This Best Practices document is intended to help a Sybase DBA understand the various aspects of ASE monitoring and to provide a quick start for setting up such monitoring.


1 Best Practices: ASE Monitoring


1.1 ASE versions
This document applies to all currently supported ASE versions. Features that are specific to a particular version are identified as such where they are described.

1.2 Operating systems


The monitoring aspects are generic across H/W platforms, although the sample scripts in Appendix A were written for the UNIX platform.

1.3 Tools
Various tools are available in the market for monitoring an ASE, e.g. Sybase SCC, Sybase Central, sysmon, MDA Tables, DB Artisan, DB Virtualizer, or Nimbus. This document focuses on what to monitor rather than recommending a tool or method to monitor. As necessary, references are made to automatic/unattended monitoring through scripts and cronjobs (Unix environments only). The ASE server software includes software like Historical Server and Monitor Server, but these are separate products with their own manuals. As such these products are not described here either.

1.4 Monitoring aspects


What aspects of an ASE should be monitored? How often? Although the type, frequency and threshold levels for monitoring a system vary from environment to environment, some aspects are essential to all ASE environments:
- ASE availability
- Licenses
- Database availability
- Data storage
    o Devices
    o Database segments
    o Log
- User activity
- Error logs
- Blocking
- Data consistency

1.5 What next?


You follow the best practices and set up extensive monitoring. Now what? Whether you choose to page an on-call DBA, email the DBA or an entire alias, or fix the problem proactively is up to you. This document does not spell out the actions to take after an alert from your monitoring tool, although certain recommendations may be made.

Typical Sybase environments have multiple ASEs that cater to a variety of business applications. It is recommended to buy or build a tool that provides an enterprise view of these systems, with the ability to drill down to the issues for any given system. A dashboard that is tied in to the monitoring system lets the monitoring script or feature display the server status based on the alerts just received. For example, a server down status might mark the server icon red. An 1105 error might mark the server orange. A server with no monitoring alerts would stay green, and so on. As alerts are attended to, the server status changes accordingly.
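The colour rules in the dashboard example can be sketched as a small shell function. This is only an illustration: the alert labels SERVER_DOWN and ERR_1105 are hypothetical names, and treating every other open alert as orange is an assumption, since the text only gives the three examples.

```shell
#!/bin/sh
# Map the alerts currently open for one server to a dashboard colour.
# The labels SERVER_DOWN and ERR_1105 are hypothetical names for this sketch.
server_colour() {
    alerts="$1"                        # space-separated list of open alerts
    if [ -z "$alerts" ]; then
        echo green                     # no monitoring alerts
        return 0
    fi
    case " $alerts " in
        *" SERVER_DOWN "*) echo red ;;     # server down outranks everything
        *" ERR_1105 "*)    echo orange ;;  # 1105 error reported
        *)                 echo orange ;;  # assumption: any other alert
    esac
}
```

A monitoring job could call server_colour with the alerts it has just raised and write the result to whatever the dashboard reads.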


2. ASE resources
2.1 The ASE itself
a. Is the server up and running?
Ping the database server frequently to ensure it is alive and responding to commands. Check for the process at the O/S level and also run a simple command against the database. Ensure it returns the expected results. This confirms the basic availability of your server. Page the on-call DBA if the server process does not show up or does not respond. Do the same for your Backup Server process.

b. If it was just rebooted, did it start up correctly with all options enabled?
Is the license valid? Did it come up with async I/O? Did all the databases come up correctly? Check for anything abnormal.

Most ASE startup messages are standard server messages, such as the opening of the allocated devices and databases. Others, depending on configuration and/or traceflags, show extra information about the running ASE, such as connections (successful and failed logins) and licenses. See the sample startserver script file in Appendix A for built-in error checking.

Once a server has started, the errorlog should be checked for startup behavior and non-standard messages. The errorlog of a running server should be scanned automatically and periodically, with non-standard messages emailed to the DBA(s) concerned. To decide whether a message is important enough to report, set up search strings per type of message, such as errors (fatal or not), warnings and operational messages. A separate category consists of messages that would be picked up as errors or warnings but can be skipped during normal processing. This includes messages that mention object names containing the word error or msg.

Be sure to include keywords such as Error (or error), stack, infected, warning, msg, instead, will shutdown, failed and any other keywords or phrases applicable to your environment. Filter out errors you know to be informational. See your ASE Error Messages Guide for details on informational errors. Error numbers of the pattern 6?? (6 followed by two digits) tend to be severe. Set up special alerts for these, for example email messages with the word PANIC or SEVERE ALERT in them.
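The special handling of 6?? error numbers could look roughly like the sketch below. It assumes errorlog lines carry the error number in the form "Error: N, Severity: N"; the function name alert_subjects and the wording of the subjects are hypothetical.

```shell
#!/bin/sh
# Read errorlog lines on stdin and print one alert subject per line that
# carries an error number. Numbers matching 6?? get the SEVERE ALERT prefix.
# Assumes the "Error: N, Severity: N" layout of errorlog entries.
alert_subjects() {
    sed -n 's/.*Error: \([0-9][0-9]*\), Severity: \([0-9][0-9]*\).*/\1 \2/p' |
    while read errno sev
    do
        case "$errno" in
            6[0-9][0-9]) echo "SEVERE ALERT: ASE error $errno (severity $sev)" ;;
            *)           echo "ASE error $errno (severity $sev)" ;;
        esac
    done
}
```

Each subject line could then be handed to the site's mail or paging command.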

2.2 Licenses


Many Sybase products are offered in several editions and with several license types. Some features are not part of the core product and therefore require active licenses. Most customers use SySAM to manage licenses. Some may have a central licensing server. Monitor your license server process. Ensure it is up and running. Look for errors. If you use standalone licensing, check the errorlog for keywords such as grace, will shutdown, Failed to obtain and expire. You can specify the email address to which licensing warnings and errors are sent during ASE installation, or afterwards with the sp_lmconfig command.

2.3 Database Availability


The server itself may be up and running, but are the individual databases available to the users? For example, after a scheduled downtime during which a particular database was taken offline for users, set up monitoring that alerts a DBA if the database is unintentionally left in dbo use only or read only mode. SQL queries may also be timing out on a particular user database. Or the database may be available to users, but in log suspend or refusing connections for some reason. Set up alerts for free log space per user database. Set up your alerts so that they escalate in priority if the situation does not improve. You may set your last chance thresholds to dump the log or abort the transaction, but there can be scenarios where the last chance threshold (LCT) action does not have a chance to kick in. Be sure to check for 1105 and log suspend errors.

In addition to these alerts, check the general response time: alert the DBA if a simple query such as sp_who does not return in a reasonable amount of time. Be sure to use a non-privileged account when checking database availability as discussed in this section.
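A response time check of this kind might be wrapped as follows. This is a sketch that assumes the GNU coreutils timeout utility is installed (it is not part of every UNIX); it exits with status 124 when the limit is reached. The function name check_response is hypothetical.

```shell
#!/bin/sh
# Run a command with a time limit and report whether it answered in time.
# Assumes GNU coreutils `timeout`, which exits with status 124 on timeout.
check_response() {
    limit="$1"; shift                  # limit in seconds, then the command
    timeout "$limit" "$@" > /dev/null 2>&1
    rc=$?
    if [ $rc -eq 124 ]; then
        echo "SLOW: no answer within ${limit}s"
        return 2
    elif [ $rc -ne 0 ]; then
        echo "FAILED: exit status $rc"
        return 1
    fi
    echo "OK"
}
```

The command passed in would typically be an isql call that runs sp_who under the non-privileged monitoring login; how the password is supplied is left to the site's own conventions.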

2.4 Data Storage


Device space and database sizes are more or less static and pre-allocated. Information about them is provided by the standard procedures sp_helpdevice (with ASE 15.x also showing free space) and sp_helpdb (also showing free space per fragment). However, these procedures generally either lack the detail required for continuous monitoring or show too much of it, for example one line per fragment, when we are only interested in overall data, log and/or index segment usage and device space usage.

The most important space figures to monitor are the current level of segment (data and/or log) space usage and its growth per day, week or month. Keeping historical space and growth information allows for better disk space capacity planning and allocation. See the devusage and listseg current space usage monitoring scripts in the list of useful scripts in Appendix A.

Devices
In general, database devices are created on raw partitions and are therefore not monitored with O/S commands. However, if an ASE also uses filesystems to store (some of) its database devices, these filesystems should be monitored as well. Examples are the devices for tempdb and/or for development and test ASEs. Device files are more or less static, but ASE 12.5 introduced the disk resize command, so even existing devices can grow and fill up the filesystem they are stored on. Monitor growth on these devices.
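Keeping a history of space usage and deriving growth from it can be sketched as below. The history file layout (one line per daily sample: date, segment name, used MB) and both function names are assumptions made for this example; how the used MB figure is obtained from the ASE is not shown.

```shell
#!/bin/sh
# Append one sample to the history file: "date segment used_mb".
record_usage() {       # record_usage FILE DATE SEGMENT USED_MB
    echo "$2 $3 $4" >> "$1"
}

# Average growth in MB per sample for one segment, assuming one sample a day.
avg_daily_growth() {   # avg_daily_growth FILE SEGMENT
    awk -v seg="$2" '
        $2 == seg { if (n == 0) first = $3; last = $3; n++ }
        END {
            if (n < 2) { print "n/a"; exit }   # need two samples for a trend
            printf "%.1f\n", (last - first) / (n - 1)
        }' "$1"
}
```

Dividing the remaining free space by this growth figure gives a rough number of days until the segment fills up, which is the input capacity planning needs.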

2.5 Disk Space


Depending on the operating system, disk space usage can be monitored with the df (Solaris, AIX) or bdf (HP-UX) command. Important directories to monitor at the O/S level for a Sybase installation are:
- The Sybase software installation tree (Sybase directory)
- Errorlog directory
- Possible device directories
- Database dump directories (if dumping to disk)

The Sybase software directory is more or less static, apart from the ever-growing errorlog and the configuration files. As a best practice, consider separating these files from the main installation directory for easier maintenance. Each ASE has its own startup files, such as the RUNSERVER startup script, its configuration files and its errorlog. Keeping these files under their own folder structure makes them easier to handle and monitor, and also makes future upgrades of the ASE software easier. As an example, the software and startup files could be stored in the following tree (using PROD_ASE as the example server name):

Directory:                Contents:
------------------------  ----------------------------------------------------------
/sybase/PROD_ASE          Startup script(s) and sub directories.
/sybase/PROD_ASE/cfg      Configuration files
/sybase/PROD_ASE/log      ASE error logs
/sybase/sybase_12.5.4     Version 12.5.4 software tree.
/sybase/sybase_15.0.3     Version 15.0.3 software tree.
/sybase                   Link to the actual version directory currently being used.
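The df-based check described in this section could be automated along these lines. The sketch assumes df supports the POSIX -P option, whose output has the use percentage in column 5 and the mount point in column 6; on HP-UX the bdf output would need its own parsing. The function name fs_over_threshold is hypothetical.

```shell
#!/bin/sh
# Read `df -P` output on stdin and print every filesystem whose use
# percentage is at or above the given threshold.
fs_over_threshold() {  # fs_over_threshold PERCENT
    awk -v limit="$1" '
        NR > 1 {                           # skip the header line
            pct = $5; sub(/%/, "", pct)    # "95%" -> "95"
            if (pct + 0 >= limit + 0) print $6 " " pct "%"
        }'
}
```

A cron job might run df -P on the errorlog, device and dump directories, pipe the result into fs_over_threshold 90 and mail any lines that come back.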

Database & Transaction Log dumps


To provide for easy recovery, it is recommended that you back up your Sybase databases periodically. Backups may be done directly to tape, or first to disk filesystems that are later backed up at the O/S or network level. With the latter method, both the software being used (Sybase directory, startup directories, scripts and log files) and the database dumps can be stored on one tape or set of tapes.

Directories used for database dumps have to be checked at least once a day, after the database dumps have been made, because the dump files grow along with the data volume. For performance reasons it is possible to dump databases to parallel stripes, meaning that multiple dump directories (and/or filesystems) can be used for one database dump. Be sure to monitor space in each of the dump directories, so that your nightly backups succeed.

Transaction log dumps are taken periodically to allow point-in-time recovery. Keep these transaction log dump files in a separate directory from your full backups and monitor this directory for free space as well.

Set up your monitoring to confirm successful backups every day. Lack of disk space is one possible issue, but there can be other failures, including disk failures or errors. Check the backup server errorlog for error keywords.
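One simple way to confirm that a dump was written is to look for a recent, non-empty file in the dump directory. The sketch below assumes dumps go to disk and that any file younger than 24 hours counts as the latest dump; the function name recent_dump_exists is hypothetical. It does not replace checking the backup server errorlog, since a file can exist and still be incomplete.

```shell
#!/bin/sh
# Succeeds when DIR holds at least one non-empty regular file that was
# modified within the last 24 hours.
recent_dump_exists() {  # recent_dump_exists DIR
    found=$(find "$1" -type f -mtime -1 -size +0 2>/dev/null | head -1)
    [ -n "$found" ]
}
```

A daily job could call this once per dump directory (and per stripe directory) and alert when it fails.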


2.6 User Activity


User activity is significant to system performance and health. It needs to be monitored from a historical perspective as well as in real time, to identify performance problems as they happen. Gather information on active user sessions, CPU cycles, cache hits and disk I/O over a period of time. This will help you set up alert thresholds.

User connections
Monitor the total number of currently connected clients/users on a regular basis. This helps establish a baseline. Regularly checking whether a server is close to its maximum number of user connections prevents users from being locked out (signaled by Error 1601 messages in the log). One way to check the number of current connections is the standard procedure sp_who, but scripts and procedures that provide more detail and summaries are referenced in this document. See the sp_w procedure and the db_spy and cnt_sessions scripts in Appendix A.

CPU, Disk I/O, cache hits
A standard ASE procedure that can be used to monitor for contention and bottlenecks in real time is sp_sysmon. Running this procedure on a regular basis for a reasonable interval (e.g. 5-15 minutes per run) gives information about various ASE counters and behavior for that interval. Combined with information gathered on active processes (see the sample db_spy script in Appendix A), this should make it possible to trace server misbehavior to specific processes and/or configuration issues. Several third-party monitoring tools gather their information this way. For details on using and interpreting sp_sysmon, see the standard documentation set: Performance and Tuning Guide, chapter: Monitoring Performance with sp_sysmon. Tools are also available at the O/S level to check for disk I/O bottlenecks on the devices that house the database and log files. See the sample script run_sysmon in Appendix A.

Problem SQL/stack trace
Typical problems faced by DBAs include sudden slow performance, hung queries, queries consuming 100% CPU and the like. Tools such as MDA Tables can be a big help here. Since problems don't always occur when a DBA is at their desk, it helps to collect historical information that can be used as a baseline. Some of the things the MDA tables collect that can help you monitor a system and find the root cause of a problem:
- Process activity: CPU usage, I/O activity, resource usage
- Resource usage: data cache, procedure cache, engines
- Object usage: tables, partitions, indexes, stored procedures
- Query history: SQL text, statement metrics, query plans, errors

See Practical Use of MDA tables for examples and more information.
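The user connection check described earlier in this section comes down to comparing the current count with the configured maximum. The sketch below shows only that comparison; the function name conn_alert and the warning percentage are assumptions, and collecting the two numbers (for example from sp_who output and the number of user connections configuration parameter) is left to the calling script.

```shell
#!/bin/sh
# Warn when the share of user connections in use reaches a threshold.
conn_alert() {         # conn_alert CURRENT MAX WARN_PERCENT
    pct=$(( $1 * 100 / $2 ))           # integer percentage in use
    if [ "$pct" -ge "$3" ]; then
        echo "WARNING: $1 of $2 user connections in use (${pct}%)"
        return 1
    fi
    echo "OK: $1 of $2 user connections in use (${pct}%)"
}
```

Logging the OK lines as well as the warnings builds the baseline that the text recommends.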

2.7 Error Logs


All logfiles associated with your ASE (ASE errorlog, backup server log, monitoring or maintenance jobs output logs) should be scanned for errors, warnings and other important messages. At a minimum this scanning should report the existence of any messages (found/not found or success/failure). One step further would be to also report the object (e.g. database or segment) the messages are related to. See Unix Scripts in Appendix A for a template script for error checking.


Be sure to include keywords such as Error (or error), stack, infected, warning, msg, instead, will shutdown, failed and any other keywords or phrases applicable to your environment. Filter out errors you know to be informational. See your ASE Error Messages Guide for details on informational errors. Error numbers of the pattern 6?? (6 followed by two digits) tend to be severe. Set up special alerts for these, for example email messages with the word PANIC or SEVERE ALERT in them.

Be sure to monitor space for your ASE errorlogs. In the default setup, ASE always appends to the current errorlog, so this file keeps growing and can become too large to search or to open in an editor. It is therefore advisable to rename the old errorlog at every ASE reboot (adding a date/timestamp), so that the ASE creates a new errorlog each time it starts.
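A keyword scan with a filter for known informational messages might be sketched as follows. The keyword list is the one given above; the optional ignore file (one extended regular expression per line) and the function name scan_errorlog are assumptions made for this example.

```shell
#!/bin/sh
# Print errorlog lines that contain one of the alert keywords, minus any
# line matching a pattern in the optional ignore file.
scan_errorlog() {      # scan_errorlog LOGFILE [IGNORE_FILE]
    keywords='[Ee]rror|stack|infected|warning|msg|instead|will shutdown|failed'
    if [ -n "$2" ] && [ -s "$2" ]; then
        grep -E "$keywords" "$1" | grep -v -E -f "$2"
    else
        grep -E "$keywords" "$1"
    fi
}
```

The ignore file is where object names containing error or msg, and error numbers known to be informational, would be listed so they stop generating alerts.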

2.8 Blocking
Most blocking conflicts are temporary in nature and resolve themselves within a very short period of time. However, poor application design or careless ad hoc user transactions can cause massive blocking that affects multiple users of the database. At such times the server may be up and running, but for all practical purposes it is unavailable to the user. Set up your monitoring to check for blocking that is non-transient. Any process (spid) that has been blocking others for more than, say, 2 minutes should be watched. Depending on the tolerance level of your user base, set up the alert to page the on-call DBA upon reaching a certain threshold. Often, agreements with the business allow spids that block for more than x minutes to be killed by the monitoring script. See script sp_block in Appendix A for an example.
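Filtering out transient blocking can be sketched as below. The function expects one line per blocked process with three columns, spid, blocking spid and seconds blocked, such as the spid, blocked and time_blocked columns of master..sysprocesses would supply; producing that input with isql is not shown, and the function name long_blocks is hypothetical.

```shell
#!/bin/sh
# stdin: "spid blocking_spid seconds_blocked", one blocked process per line.
# Prints the processes that have been blocked longer than the given limit.
long_blocks() {        # long_blocks SECONDS
    awk -v limit="$1" '
        $1 ~ /^[0-9]+$/ && $3 + 0 > limit + 0 {   # skip headers, apply limit
            printf "spid %s blocked by spid %s for %s seconds\n", $1, $2, $3
        }'
}
```

With a limit of 120 this implements the 2 minute rule of thumb; whether the result pages the DBA or kills the blocking spid depends on the agreement with the business.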

2.9 Data Consistency


Data consistency must be checked periodically. dbcc checkdb, dbcc checkcatalog and dbcc checkalloc may take a while to run on large databases, possibly blocking regular user access to the objects being checked. dbcc checkstorage is an alternative that includes many of the dbcc checks, as is an archive database (ADA), which provides for offline dbcc checks. Neither will fix the errors reported. Set up your monitoring to check, first, that the job ran and completed successfully and, second, to report on any faults found.


Appendix A
Each sample script below executes the tasks it is written for, stores the output of the job in the script's logfile, writes status information to a central logfile (per server) and, where possible, also writes timing and status information to a central database.

Sample directory structure
Using a standard setup of script and log directories makes it easier to use the scripts and to copy them to other hosts/ASEs.

Directory:     Contents:
-------------  -------------------------------------------------------
/dba/jobs      DBA scripts for standard ASE tasks and monitoring.
/dba/input     DBA input files, used by the DBA scripts
/dba/output    DBA output files, like *.bcp etc.
/dba/log       DBA script logfiles and central server logfile(s).
/dba/sysmon    DBA script logfiles for a specific job, e.g.: run_sysmon

The above mentioned directories are used in the script samples, shown below.

Sample dataserver startup script:

#!/bin/sh
# -------------------------------------------------------------------------------
# Start dataserver: PROD_ASE
# -------------------------------------------------------------------------------
. /sybase/sybase_12.5/SYBASE.sh            # Set Sybase environment

SERVER=PROD_ASE                            # Name of the ASE
MASTER=/dev/rdsk/sybase/PROD_ASE.master    # Its master device
CFGDIR=/sybase/$SERVER/cfg                 # Configuration directory
LOGDIR=/sybase/$SERVER/log                 # Log directory
CFGFIL=$CFGDIR/$SERVER.cfg                 # Configuration file
LOGFIL=$LOGDIR/errorlog                    # Errorlog

TRACES=""                                  # Traceflags
TRACES="$TRACES -T1204"                    # Print deadlock info in log
# TRACES="$TRACES -T4013"                  # Show login records in log
                                           # (now config setting)

# -------------------------------------------------------------------------------
# Check if server is already running (prevent logfile rename)
# -------------------------------------------------------------------------------
if [ `/usr/bin/ps -ef | grep dataserver | grep $SERVER | wc -l` -gt 0 ]
then
    echo "WHOA ... $SERVER is already running !!!"
    exit 1
else
    # ------------------------------------------------------------
    # Check Async IO setting
    # ------------------------------------------------------------
    asyncERR=`grep -c "allow sql server async i/o = 0" $CFGFIL`
    if [ $asyncERR -gt 0 ]
    then
        echo ------------------------------------------------------------
        echo " ERROR: Async IO not configured !!!"
        echo ------------------------------------------------------------
        exit 1
    fi

    # Rename log
    [ -f $LOGFIL ] && { mv $LOGFIL $LOGFIL.`date +%y%m%d.%H%M` ; }

    # ------------------------------------------------------------
    # Start ASE, specifying special directories and options
    # ------------------------------------------------------------
    $SYBASE/$SYBASE_ASE/bin/dataserver -s$SERVER \
        -e$LOGFIL \
        -d$MASTER \
        -c$CFGFIL \
        -i$SYBASE \
        -M$CFGDIR \
        $TRACES > /dev/null &
fi


Filename: db_spy
Purpose:  Save current processing information.

#!/bin/sh
# --------------------------------------------------------------------------------------
# db_spy  Collect ASE information about currently running processes
# --------------------------------------------------------------------------------------
# RUNSQL, USRPWD and LOGFIL must be set before this point (not shown in this excerpt).
$RUNSQL <<-EOF | egrep -v "return status" >> $LOGFIL
$USRPWD
SET NOCOUNT ON
go
print " -------------------------"
print " Current processes"
print " -------------------------"
EXEC sp_who
go
print " -------------------------"
print " Current blocked processes"
print " -------------------------"
EXEC sp_block      -- Shows blocking info
go
print " -------------------------"
print " Current locks"
print " -------------------------"
EXEC sp_lock
go
print " -------------------------"
print " Heavy hitters"
print " -------------------------"
EXEC sp_hogs       -- Shows cpu and IO info from sysprocesses
go
print " -------------------------"
print " Monitor info"
print " -------------------------"
EXEC sp_monitor    -- Standard proc
go
EOF


Filename: run_sysmon
Purpose:  Runs sp_sysmon for the given nr of times and duration

#!/bin/sh
# ------------------------------------------------------------------------------
# Runs sp_sysmon X times against the given server for a given time period
# ------------------------------------------------------------------------------
SCRIPT=`basename $0`
SERVER=`echo $1 | tr "[a-z]" "[A-Z]"`      # Server name
RUNMAX="$2"                                # nr of times to run
PERIOD="$3"                                # Duration to run in hh:mm:ss format

[ $# -lt 3 ] && {
    echo "------------------------------------------------------"
    echo "Usage: $SCRIPT server times period "
    echo " "
    echo "Where: server = name of the dataserver to connect to "
    echo "       times  = nr of times to run sp_sysmon "
    echo "       period = hh:mm:ss "
    echo "------------------------------------------------------"
    exit 1 ; }

. /sybase/sybase_15/SYBASE.sh              # Set SYBASE environment
SYBUSR=`getserverusr $SERVER`
SYBPWD=`getserverpwd $SERVER`

FILNAM="/dba/sysmon/sysmon.out.$SERVER.${RUNMAX}x$PERIOD"
RUNSQL="$SYBASE/$SYBASE_OCS/bin/isql -U$SYBUSR -S$SERVER"
RUNCNT=1

while [ $RUNCNT -le $RUNMAX ]
do
    OUTFIL=$FILNAM.$RUNCNT.`date +"%y%m%d.%H%M"`
    # Zero-pad the run counter so that output files sort correctly
    if [ $RUNMAX -ge 10 -a $RUNCNT -lt 10 ]
    then
        OUTFIL=$FILNAM.0$RUNCNT.`date +"%y%m%d.%H%M"`
    fi

    $RUNSQL <<- ENDSQL | egrep -v "Password|return status = 0" > $OUTFIL
$SYBPWD
exec sp_echotime "`basename $OUTFIL`"
print 'Server: %1!', @@servername
print 'Version: %1!', @@version
exec sp_sysmon "$PERIOD", @dumpcounters='Y'
exec sp_echotime "`basename $OUTFIL`"
go
ENDSQL

    compress $OUTFIL
    RUNCNT=`expr $RUNCNT + 1`
done
