
How Postilion calculates the uptime of installed applications

Precursor to task uptime calculation

When a Postilion application is installed, an entry is written to the 'task' table of the Postilion database. This entry contains information that can be used to calculate the uptime of the application.

A SQL job, "Postilion - Monitor - Postilion Tasks", runs every minute. This job executes the stored procedure 'task_set_down', which is responsible for setting the amount of time that all Postilion applications have been up since the 'start_date'. The start_date is set to the current time (i.e. now) the first time that this stored procedure runs, that is, once Transaction Manager (Realtime Framework) has been installed for the first time and the stored procedure has been created. Subsequent runs of the stored procedure use this start_date to determine the percentage uptime of all installed Postilion applications. Note that the entire task uptime computation relies on SQL Server Agent running.

For the stored procedure to determine whether a Postilion application is running, the application must update its status in the task table. It does this while it is running: every 30 seconds (if all is well), each Postilion application updates its entry in the task table. The application is responsible for updating its Status, as well as the time at which the status was last updated (last_checked).

Calculation of the uptime

When the stored procedure (task_set_down) runs every minute, it goes through the list of all applications in the task table and sets the State to DOWN for each application where the difference between the current time and the last_updated time is greater than 150 seconds, i.e. where the application has not updated its status in the last two and a half minutes (45 seconds from 4.1 on). This is reflected in the Postilion Monitor, where the state of the application changes to down and the monitor goes red.
(The default refresh period of the monitoring console is 1 minute, so the change can take a while to appear visually.)

For all tasks (applications) that have now been marked as DOWN, the next step performed by the stored procedure is to calculate the total number of seconds that the application has been down since the start_date. This is the time difference between the time that the stored procedure last checked the application's status (last_checked) and the current time (typically 60 seconds), plus the number of seconds that the application has already been down, i.e.

total_time = old_down_time + new_down_time

(The number of seconds that an application has already been marked as down is stored in the task table for each application.)

Once the total number of seconds that the application has been down is known, the application's uptime is simple to calculate. Uptime is defined as the amount of time that an application has not been DOWN since the start_date, presented as a percentage of the total time between the start_date and the current time. So an application's uptime, as a percentage, is calculated as:

100 * (no_of_seconds_between_start_date_and_now - no_of_seconds_application_has_been_down) / (no_of_seconds_between_start_date_and_now)
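The arithmetic described above can be illustrated with a minimal Python sketch. It is not the code of 'task_set_down'; the function names, parameter names and the handling of a zero-length interval are assumptions made for illustration only. The 150-second threshold is the one quoted in the text (45 seconds from 4.1 on).

```python
# Illustrative sketch only: these helpers are hypothetical and are not part
# of Postilion. They mirror the arithmetic described in the text.

DOWN_THRESHOLD_SECONDS = 150  # per the text; 45 seconds from 4.1 on


def is_down(seconds_since_last_update, threshold=DOWN_THRESHOLD_SECONDS):
    """An application counts as DOWN when its last status update is
    strictly older than the threshold."""
    return seconds_since_last_update > threshold


def total_down_seconds(old_down_time, new_down_time):
    """total_time = old_down_time + new_down_time"""
    return old_down_time + new_down_time


def uptime_percent(seconds_since_start, seconds_down):
    """100 * (seconds since start_date - seconds down) / seconds since start_date"""
    if seconds_since_start <= 0:
        # Assumption: the sketch rejects an empty interval instead of guessing.
        raise ValueError("seconds_since_start must be positive")
    return 100.0 * (seconds_since_start - seconds_down) / seconds_since_start


# Worked example: start_date was 24 hours ago (86400 s). The application had
# already been down for 804 s, and the latest run adds another 60 s.
down = total_down_seconds(804, 60)   # 864
print(is_down(160))                  # True
print(uptime_percent(86400, down))   # 99.0
```

In the worked example, 864 seconds of downtime over 24 hours gives an uptime of 99%.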