
MONITORING ISSUES: A GUIDE FOR BEGINNERS.
BY: GLOBAL EAGLE NOC TEAM AT CONTACTA S.A.S.
BARRANQUILLA, SEPTEMBER 2019.
CONTENT

PURPOSE
OVERVIEW
    ABOUT THE COMPANY
    ABOUT THE PLATFORM
ITEMS TO BE MONITORED
DATADOG
    OVERVIEW
INTRODUCING JIRA
    OVERVIEW
MICROSOFT TEAMS
REPORTING ISSUES
    TIPS FOR RAISING A JIRA INCIDENCE
    DIALING AND ANSWERING THE PHONE
PURPOSE

On August 12th, 2019, Global Eagle, a multimedia-management company, started
a Network Operation Center (NOC) at Contacta S.A.S. (formerly Tecnologías y
Servicios Contacta S.A.S.), a contact center and BPO company based in
Barranquilla. During the remaining days of that month, four people were chosen as
the first members of this NOC, along with Norma Theran, the current Bilingual
Coordinator and NOC supervisor.

This manual was created in collaboration with these NOC members and is based
on Global Eagle documents and manuals, particularly the DCSC Runbook. It is
meant to help new members of this NOC understand how to raise tickets when
issues occur on the platforms Global Eagle works with.

Please note that this document should not be static but a “living” one: it
should be updated every time a new kind of incident happens. It is also not
meant to be the ultimate guide to the Global Eagle platforms. If deeper
knowledge of a certain topic is needed, the trainee should rely on the official
Global Eagle documents.
OVERVIEW.

ABOUT THE COMPANY.

Global Eagle Entertainment is a company that manages the streaming of multimedia
content via VOD (Video-On-Demand) to customers ranging from internet
carriers like Movistar to airlines around the world. It also has other
business lines related to monitoring.

ABOUT THE PLATFORM.

Currently the platform consists of four SaaS vendors, integrated by an on-premise
Enterprise Service Bus (ESB). Vendor integrations occur via AWS SQS queues
(for events) and REST APIs. In addition, S3 buckets are used to store assets.
Below is a high-level diagram of this:

Support: Vendor Queues/APIs, ESB Application/HW, AWS S3 Buckets.

SYSTEM DETAILS: Below is the diagram for the queues, APIs and server details
for the production server.
The company relies on the Amazon AWS cloud and the APIs of four companies for
managing and sorting multimedia content: RightsLine, used for file management;
BeBanjo, which provides two APIs for managing VOD platforms (Movida and
Sequence); SDVI, used for managing media supply chains; and Hybrik, used for
media and job tracking.

Given that the health of the platform depends on those APIs working properly,
you need to monitor them and report any issue that occurs with them.
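As an illustration of what "monitoring availability" means for the vendor APIs, here is a minimal Python sketch that probes each API base URL listed under ITEMS TO BE MONITORED. It is not part of the Global Eagle tooling (Datadog performs the real checks). The function names and the rule "any HTTP answer below 500 counts as available" are assumptions made for this example.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable

# Vendor API base URLs as listed in this guide.
VENDOR_APIS = [
    "https://api.rightsline.com/v3",
    "https://movida.bebanjo.net/api/",
    "https://sequence.bebanjo.net/api/",
    "https://api.hybrik.com/v1",
]


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for a GET on url, or 0 if no answer arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, just not with a success code
        # (for example 401 when no credentials are sent).
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, TLS error, etc.
        return 0


def is_available(status: int) -> bool:
    """Assumption for this sketch: any answer below 500 means the service is up."""
    return 0 < status < 500


def check_endpoints(
    urls: Iterable[str], probe: Callable[[str], int] = http_status
) -> Dict[str, bool]:
    """Map each URL to True (reachable) or False (down or failing)."""
    return {url: is_available(probe(url)) for url in urls}
```

Calling `check_endpoints(VENDOR_APIS)` returns a dictionary such as `{url: True, ...}`. The `probe` parameter exists so the logic can be tried without network access by passing a stand-in function.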
ITEMS TO BE MONITORED

VENDOR QUEUES (Availability)

• arn:aws:sqs:US_WEST_2:013474081760:v2_div82.fifo
• arn:aws:sqs:EU_WEST_1:051915741463:external-gee-production
• arn:aws:sqs:US_EAST_2:130984441189:globaleagle_workorder_status
• arn:aws:sqs:US_EAST_2:130984441189:globaleagle_workorder_submit

VENDOR APIs (Availability)

• https://api.rightsline.com/v3
• https://movida.bebanjo.net/api/
• https://sequence.bebanjo.net/api/
• https://api.hybrik.com/v1

VENDOR PORTALS (Availability)

• https://admin.rightsline.com
• https://id.bebanjo.net
• https://globaleagle.sdvi.com
• https://admin.hybrik.com

ESB APPLICATION (DCSC Bundle inside JBoss Container)

• App is active.
• App can access SQS queues.
• App can access vendor APIs.

ESB HW (Availability/Uptime)

• IP Addr.: 10.32.10.88
• Port 443

AWS S3 BUCKETS (Availability)

• com.globaleagle.contentit.dcsc.prod.raw
• com.globaleagle.contentit.dcsc.prod.mezzanine
• com.globaleagle.contentit.dcsc.prod.transcoded

APPLICATION LOGS: DATADOG (Management and report)

• Data issue (source-target).
• App. configuration mismatch (source-target).
• Connectivity/Network issues (timeouts, service denials, etc.)
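The queue identifiers above pack three pieces of information into one string: region, AWS account number and queue name. The following Python sketch splits them apart, which can help when matching an alert to a queue. The names `SqsQueue` and `parse_sqs_arn` are made up for this example. Rewriting the region from the `US_WEST_2` style used in this guide to the `us-west-2` style used by AWS tools is an assumption about how the values map.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SqsQueue:
    region: str      # normalized form, e.g. "us-west-2"
    account_id: str  # 12-digit AWS account number, kept as text to keep leading zeros
    name: str        # queue name, e.g. "v2_div82.fifo"

    @property
    def is_fifo(self) -> bool:
        # SQS FIFO queue names end with the ".fifo" suffix.
        return self.name.endswith(".fifo")


def parse_sqs_arn(arn: str) -> SqsQueue:
    """Split 'arn:aws:sqs:<region>:<account>:<queue>' into its parts."""
    parts = arn.strip().split(":")
    if len(parts) != 6 or parts[:3] != ["arn", "aws", "sqs"]:
        raise ValueError(f"not an SQS ARN: {arn!r}")
    region, account_id, name = parts[3], parts[4], parts[5]
    if not (account_id.isdigit() and len(account_id) == 12):
        raise ValueError(f"unexpected account id in {arn!r}")
    # Assumption: US_WEST_2 in this guide corresponds to the AWS region us-west-2.
    return SqsQueue(region.lower().replace("_", "-"), account_id, name)
```

For example, `parse_sqs_arn("arn:aws:sqs:US_WEST_2:013474081760:v2_div82.fifo")` yields a queue in region `us-west-2` whose `is_fifo` property is true.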
DATADOG

OVERVIEW

Datadog is a platform for monitoring the status of systems via dashboards.
For Global Eagle, there are several dashboards that need to be monitored.
You can view each dashboard over the last 15 minutes, the last hour, the last 4
hours, yesterday, the last week, the last month or a user-defined range.

The main dashboards for the Global Eagle platforms are listed below:

Custom:
• OPEN L1-Support Production Dashboard.
• ESB-JVM
Hosts:
• LA1MID-PRD

OPEN L1-Support Production Dashboard Overview

This is the main dashboard. The status of the APIs, page logins, reported log
errors and graphs related to the host LA1MID-PRD should be displayed here.
ESB-JVM

This dashboard shows the status of the Java Virtual Machine on which the ESB
platform runs.

LA1MID-PRD

This dashboard shows the status of the host located in Los Angeles. Critical Usage
Percentage thresholds will be defined later.

LOG EXPLORER

This dashboard shows events as they happen within the platforms Global Eagle
manages. Basic knowledge of HTML or Java should be enough to understand
these messages.
INTRODUCING JIRA

JIRA System Dashboard

OVERVIEW

JIRA is an online tool for task management, bug tracking and project management
that is used by organizations such as Fedora, NASA and Skype. This is the
platform where incidents and errors raised by Datadog alerts must be reported. A
screenshot of the dashboard where Global Eagle incidents are tracked is shown
below.

How to access the Defect Tracker: Click on Projects>Open>Kanban Board


PARTS OF A JIRA TICKET: The most important parts of a JIRA ticket are the
following:

• TITLE: Name of the issue you are reporting. It should be short and
concise.

• ISSUE TYPE: The nature of the issue (Defect, Training, Data, Business
Process or Report Request). Data-related issues are usually the most common.

• DESCRIPTION: A summary of the issue you are reporting. Write this part
clearly.

• ASSIGNEE: The person who will work on solving the issue. The right
assignee depends on the issue you are reporting.

• OPEN APPLICATION: The platform where the issue is happening. Leave this
blank if you don’t know where it is happening.

• PRIORITY: The severity of the issue you are reporting (Critical, High,
Medium, Low or Hold). If you are new, always set this to “Medium”; the assignee
will normally change the priority as needed.

• ACTIVITY/COMMENTS: The steps that the assignee and others take to solve
the issue you are reporting. Any corrections made to your ticket also appear
here.

• STATUS: This field indicates whether:

- The assignee is working on the issue (FIX IN PROGRESS).
- The assignee is not working on the issue yet, or the ticket was just sent
to JIRA (BACKLOG).
- The assignee is testing a possible solution (CONTENT OP TEST).
- The issue is solved (UAT ISSUE SOLVED).
- The ticket is closed (UAT ISSUE CLOSED).

For more information regarding JIRA tickets, please refer to this link.
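The ticket fields described above can be summarized as a small data structure. The Python sketch below only helps to memorize the fields and their allowed values; it does not talk to JIRA. The class names and the 80-character title limit are assumptions made for this example (the guide only asks for a short and concise title).

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional

MAX_TITLE_LENGTH = 80  # hypothetical limit; the guide only says "short and concise"


class IssueType(Enum):
    DEFECT = "Defect"
    TRAINING = "Training"
    DATA = "Data"
    BUSINESS_PROCESS = "Business Process"
    REPORT_REQUEST = "Report Request"


class Priority(Enum):
    CRITICAL = "Critical"
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"
    HOLD = "Hold"


@dataclass
class TicketDraft:
    title: str
    description: str
    issue_type: IssueType = IssueType.DATA   # data issues are the most common
    priority: Priority = Priority.MEDIUM     # recommended default for newcomers
    assignee: Optional[str] = None           # None = leave the assignee untouched
    open_application: Optional[str] = None   # None = leave blank if unknown

    def problems(self) -> List[str]:
        """Return the reasons this draft is not ready (empty list = ready)."""
        found = []
        if not self.title.strip():
            found.append("title is empty")
        elif len(self.title) > MAX_TITLE_LENGTH:
            found.append("title is too long")
        if not self.description.strip():
            found.append("description is empty")
        return found
```

A draft built with only a title and a description defaults to issue type Data and priority Medium, matching the advice given for newcomers.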
MICROSOFT TEAMS

This is the platform where most of the news regarding the NOC appears. The
channels used for this are:

• General: The least used one. Used only to notify when a user is
added/deleted.
• OPEN: The usual channel for news and recommendations regarding the NOC.
• OPEN_DataDog: Another channel, where issues regarding the platform are
reported.
REPORTING ISSUES

Most issues must be reported via JIRA. The steps to report an issue are
described below:

- Check the GTS-NOC inbox in the Outlook desktop app for any Datadog e-mail
alert. Its subject should begin with “DCSC Operational Alert: [Monitor
Alert] …”. An example of these messages is shown below.

- Click on “Related logs” to see more information about the issue.

- After that, click on “Search” to switch from the Graphical View to the Log
Explorer.

- Now, click on the error message(s) and then click on the “View in Context”
button above. You will then be able to pinpoint the exact cause(s).

- This is an example of an error message viewed in context. The error
message is highlighted in yellow, and below it are the ID of the title (46788)
and the specific asset that caused the error (C46788A46604). This message
indicates that the distributor “The Special Treats Production Company Ltd.” is
not yet in BeBanjo Movida.

- Then log in to JIRA and create a ticket describing the issue by clicking on
the “+” sign. Fill in the following fields: Title, Issue Type, Description,
Assignee, Open Application and Priority. When you have finished, click on
“Create”.

If you are going to specify the OPEN platform where the issue is happening,
assign the ticket as follows:

• DATA, RightsLine and BeBanjo issues: Chris Land.
• SDVI issues: Dan Thurlow.
• Training and Business Processes: Zoe Tierney.
• Reports: Heather Jewell.
• ESB: Raul Pavia.

If you are not (usually the case the first time you create an incident ticket),
leave the assignee as “automatic”.

For a more complete list of issues and actions, please refer to this link.

- When you have submitted the ticket to JIRA, it should look like this.

Both the title and the description of the issue should be clear enough for the
OPEN Team to work on the issue.
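Two decisions in the steps above are mechanical: recognizing a Datadog alert by its e-mail subject, and choosing the assignee from the area of the issue. A minimal Python sketch of both follows. The function names and the lowercase area keys are hypothetical; the subject prefix and the assignee names come from this guide.

```python
from typing import Optional

# Subject prefix of Datadog alert e-mails, as described in the first step.
ALERT_SUBJECT_PREFIX = "DCSC Operational Alert: [Monitor Alert]"

# Assignee per area, as listed in this guide. The keys are hypothetical labels.
ASSIGNEES = {
    "data": "Chris Land",
    "rightsline": "Chris Land",
    "bebanjo": "Chris Land",
    "sdvi": "Dan Thurlow",
    "training": "Zoe Tierney",
    "business process": "Zoe Tierney",
    "reports": "Heather Jewell",
    "esb": "Raul Pavia",
}


def is_datadog_alert(subject: str) -> bool:
    """True if the e-mail subject starts with the Datadog alert prefix."""
    return subject.strip().startswith(ALERT_SUBJECT_PREFIX)


def suggest_assignee(area: Optional[str]) -> Optional[str]:
    """Return the listed assignee, or None to leave JIRA on "automatic"."""
    if not area:
        return None
    return ASSIGNEES.get(area.strip().lower())
```

When the area is unknown, `suggest_assignee` returns `None`, which corresponds to leaving the assignee as “automatic”.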

TIPS FOR RAISING A JIRA INCIDENCE

• If there are two or more alerts regarding the same issue in the same
hour/day, you don’t have to create one ticket per alert. Raise a single
ticket and update it every time a new alert regarding that issue appears.

• Some APIs or platforms may have issues that resolve themselves after a
certain amount of time (e.g. 15 minutes). Wait for the “Recovered”
message. If it doesn’t arrive, submit the ticket as usual.

• Every time you submit a ticket, notify the team via Microsoft Teams by
commenting on the corresponding Datadog alert in the OPEN_DataDog channel.
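The first two tips combine into one rule: one ticket per issue, and only if the issue has not recovered after a waiting period. The Python sketch below expresses that rule. The `Alert` structure and the `issue` key are hypothetical, and the 15-minute grace period is just the example value from the tip, not an official threshold.

```python
from datetime import datetime, timedelta
from typing import Dict, Iterable, List, NamedTuple

GRACE_PERIOD = timedelta(minutes=15)  # example value taken from the tip above


class Alert(NamedTuple):
    issue: str       # hypothetical key identifying the issue, e.g. the monitor name
    at: datetime     # when the alert e-mail arrived
    recovered: bool  # True for a "Recovered" message, False for a triggered alert


def issues_needing_ticket(alerts: Iterable[Alert], now: datetime) -> List[str]:
    """Return each issue (once) that is still open after the grace period."""
    first_open: Dict[str, datetime] = {}
    for alert in sorted(alerts, key=lambda a: a.at):
        if alert.recovered:
            # The issue resolved itself, so no ticket is needed for it.
            first_open.pop(alert.issue, None)
        else:
            # Repeated alerts for the same issue do not restart the clock.
            first_open.setdefault(alert.issue, alert.at)
    return [
        issue for issue, since in first_open.items()
        if now - since >= GRACE_PERIOD
    ]
```

Each issue appears at most once in the result no matter how many alerts it produced, which mirrors the advice to raise a single ticket and update it.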
DIALING AND ANSWERING THE PHONE.

Sometimes you will need to escalate high-severity issues to other
departments. At other times, GTS members may call to check how things are
going. In the latter case, answer the phone like this:

“Hi! You’ve reached the Global Eagle Network Operation Center! [Name] speaking!
How may I help you?”

Whether you are making a call or answering the phone, please be polite and use
the right intonation and stress.
EDIT HISTORY

DATE        CHANGES
09-06-2019  First draft made.
09-10-2019  First corrections made.
09-12-2019  Title changed, added purpose and cover.
09-29-2019  New section added, corrections made.
