Uptime Institute
Accredited Operations Specialist

Copyright ©2010-2018 by Uptime Institute, LLC
20 West 37th Street - 6th Floor
New York, NY 10018
All rights reserved.

The Uptime Institute's (Institute) Publications are protected by international copyright law. The Institute requires written requests at each and every occasion that the Institute's intellectual property or portions of the Institute's intellectual property are reproduced or used. The Institute copyright extends to all media (paper, electronic, and video content) and includes use in other publications, internal company distribution, company Web sites and marketing materials, and handouts for seminars and courses. For more information, please visit www.uptimeinstitute.com/resources to download a Copyright Reprint Permission Request Form.

Table of Contents
Tier Standard: Tab 1
Course Schedule: Tab 2
Session 1 Notes: Tab 3
Session 2 Notes: Tab 4
Session 3 Notes: Tab 5
Session 4 Notes: Tab 6
Session 5 Notes: Tab 7
Session 6 Notes: Tab 8
Session 7 Notes: Tab 9
Session 8 Notes: Tab 10
Session 9 Notes: Tab 11
Session 10 Notes: Tab 12
Session 11 Notes: Tab 13
Abbreviations: Tab 14
Uptime Institute Professional Services Contacts: Tab 15

Course Schedule
9:00 a.m. Introduction
10:00 a.m. Session 7: Capacity Management
11:00 a.m. Session 8: Commissioning
1:00 p.m. Building Characteristics
2:00 p.m. Session 10: Site Location Risk
12:00 p.m. Examination Ends & Course Adjourns
[Figure: Factors That Influence Concept of Operations & Maintenance. Legible labels: Uptime Objective, Responsiveness, Corporate Policies, Shift Presence, Staffing, Training, Corporate Resources, Operating Company, Vendor, Maintenance.]

4.2.3 Informed
Is knowledge of an effective behavior held by the organization or an individual? Does all staff have knowledge of and access to all processes and procedures for any activity they might be requested to perform? For example, does the maintenance technician required to perform a specific activity: 1) know that a method of procedure (MOP) is available for that activity, 2) know where to find it, and 3) have granted access to it?

4.3 Prioritization
The prioritization of the Management & Operations and Building Characteristics behaviors is based on analysis of the AIRs database. Within each element, the categories and components are listed in the tables in order of decreasing importance. Site Location risks are of equal importance, but specific criteria identify the risk level as higher or lower based on the magnitude of potential impact. The level of mitigation in place will reduce potential impact to operations.

5. Summary
The behaviors established in Tier Standard: Operational Sustainability combined with the infrastructure requirements in the Tier Standard: Topology are essential for a site to achieve its uptime potential. The installed infrastructure alone cannot ensure the long-term viability of the site unless Operational Sustainability behaviors are addressed. Site management teams that incorporate the principles of both Standards will have notably better results in realizing or exceeding the full uptime potential of the installed infrastructure.

6. Certification
The Uptime Institute reserves the exclusive right to rate and Certify data centers according to Tier Standard: Operational Sustainability.

Modifications
This Standard incorporates wording and organizational changes to clarify select behaviors. The Operational Sustainability Rating information is available at www.uptimeinstitute.com.
Table 1.1 Management & Operations: Staffing and Organization Category

Staffing
1. Individual(s) assigned [...] to oversee critical facility operations
2. Staff and/or vendors to support the business presence objectives: minimum of 1 qualified FTE [...] 2 qualified FTEs of facility support per shift

Qualifications
1. Vendor support for designated critical systems and equipment
2. Engineering trades (e.g., electrical, mechanical, controls, building management system) [...] based on operations and maintenance requirements
3. Appropriate trade licenses required by governmental regulation
4. Experience and technical training required to properly maintain and operate the installed infrastructure
5. Shift personnel qualified for specific shift operations individually and as a shift team

Organization
1. Organization chart showing reporting chain and all interfaces between the Facility, Information Technology (IT), and Security groups, available and in use
2. Critical facility job descriptions [...] data center, available
3. Roles and responsibilities matrix covering all activities at the data center, available and in use
4. Key individuals and alternates are designated
5. Integrated approach to operational management, including all facets of the data center operation (Facilities, IT, and Security)

Table 1.2 Management & Operations: Maintenance Category

Preventive Maintenance Program
1. Effective preventive maintenance (PM) program including list of maintenance actions, due dates, and record of completion
2. PM program encompasses original equipment manufacturer (OEM) maintenance recommendations
3. [...]
4. Detailed procedures (e.g., method of procedures [MOPs]) for PM activities [...]
5. Quality control process in place that validates a) the proper completion of [...] and b) the [...]

Housekeeping Policies
1. Computer room floor and underfloor free of dirt and debris
2. Data center free of combustibles, cleaning equipment, shipping boxes, or personal convenience items (e.g., coffee pots, microwaves)
3. Housekeeping policies available and enforced [...] data center environment

Maintenance Management System
1. Effective maintenance management system (MMS) (paper-based or computerized) to track status of all maintenance [...], available and in use
2. Maintains list of installed equipment (make, model, year of manufacture, year of installation, operating hours, warranty information)
3. Work orders list special tools and parts required to complete PMs
4. Maintain performance trend data on equipment and history of maintenance activities
5. Track calibration requirements [...]
6. Maintain list of critical spares and reorder points

Vendor Support
1. List of qualified vendors [...] available for normal and emergency work
2. Service level agreements (SLAs) outlining scope of work, [...]
3. Vendor call-in process and points-of-contact for pre-approved and qualified vendors

Deferred Maintenance Program
1. PM accomplishment rate greater than 90%
2. PM accomplishment rate 100%
3. [...]

Predictive Maintenance
[...]

Life-Cycle Planning
[...] major infrastructure components

Failure Analysis Program
1. Maintains list of all outages including dates, times, infrastructure equipment/systems involved and specific computing outages, root-cause analysis, and lessons learned
2. Effective process to determine root cause, identify lessons learned, and implement [...]
3. Trend analysis process

Table 1.3 Management & Operations: Training Category

Data Center Staff Training
1. On-the-job training (OJT) program for each new employee on a) the systems they will be responsible for operating and maintaining, and b) the rules of working in the data center
2. Formal classroom [...] operations [...]
3. Site policies [...] the normal operation of the data center
4. Standard operating procedures (SOP) [...] changed during normal operations
5. [...]

Vendor Training
1. List of training required before a vendor is allowed to work in the data center
2. [...]
3. Part-time (support): training on data center processes and procedures with respect to the work to be performed
4. Training programs include [...], required reference materials, and records of attendance

Table 1.4 Management & Operations: Planning, Coordination, and Management Category

Site Policies
1. Formal documented policies and procedures for the following:
+ Site staff performs all site infrastructure operations
+ Standard Operations: changes to normal operating configuration (e.g., shifting chillers)
+ Site Configuration: site infrastructure configuration for normal operations (normal, emergency, or abnormal conditions)
+ Emergency Operations: control of the site during abnormal circumstances/events
+ Change Management: review and approval of changes to the site baseline and evaluation of risk as related to planned changes
+ Mitigation plan for site risks

Financial Process
1. Process to ensure that operating and capital funding levels are consistently sufficient and available to support the business objective
2. Operating and capital budgets managed separately from non-critical facilities and not pooled with other buildings or groups of buildings

Reference Library
4. Process ensuring master copies are maintained current with additional copies available to site operational personnel, vendors, designers, etc.
+ As-built drawings
+ [...]
+ Automation sequences of operation
5. Reference documents located in centralized location (library) available to site operational personnel

Capacity Management
1. Process for managing the installation and removal of IT equipment from the computer room
2. Computer room floor plan developed and regularly reviewed/updated
3. Process for forecasting future space, power, and cooling growth requirements on a periodic basis (e.g., [...] month)
4. Tracking mechanism for current space, power, and cooling capacity and utilization, reviewed periodically
5. Effective process for a) computer room airflow management and b) electrical power monitoring, management, and analysis

Table 1.5 Management & Operations: Operating Conditions Category

Load Management
1. Process to ensure the maximum loads are not exceeded and capacity is reserved for switching between components

Operating Set Points
1. Consistent operating set points (e.g., temperature, pressure, volumetric flow, etc.) established based on both risk to continuous availability and cost of operation

Alternating Use of Infrastructure Equipment
1. Process for alternating the use of redundant infrastructure equipment as part of the site maintenance program

Table 2.1 Building Characteristics: Pre-Operational Category

Commissioning
1. Factory witness testing (FWT) of critical infrastructure equipment
2. Pre-installation, installation, and pre-functional testing of critical infrastructure components, including stand-alone testing and pre-system startup
3. Functional [...] configuration
4. System start-up, OEM test, and individual system test (ST)
5. Integrated systems operational test (ISOT)

Table 2.2 Building Characteristics: Building Features Category

Purpose Built
1. Purpose-built data center
2. Stand-alone building
3. [...] operations physically separated from other corporate facilities on site
4. Data center built to standards exceeding local building codes to ensure continued operation following a natural event

Support and Specialty Spaces
1. Adequate space separate from computer room for hardware receiving, storing, staging, building, and testing
2. Adequate space separate from computer room for the following functions:
+ BMS/Building Automation System (BAS) control center
+ Command Center/Disaster Recovery
+ Parts and tool storage
+ Engineering and Facility shop activities
+ Meeting and training purposes

Security and Access Control
1. Controlled site access
2. Controlled building [...]
3. Periodic review of [...]

Setbacks
1. Adequate space around the data center to minimize impacts from adjacent facilities

Table 2.3 Building Characteristics: Infrastructure Category

Flexibility for Incremental Capacity Increases
1. Designed and constructed so that computer room space can be reconfigured with reasonable effort, and incremental increases in space, power, and cooling can be accomplished with minimal risk to the critical load
2. Connection points for future [...]

Infrastructure to Support Operations
1. Consistent labeling of infrastructure equipment and [...]
2. Mechanical systems installed to facilitate ease of operations

Ease of Maintenance
1. Adequate space for the safe conduct of all normal maintenance activities on infrastructure equipment
2. Adequate space (sufficient [...] loading, lifting points, and in/out pathway) for safe conduct of rapid removal and replacement of infrastructure equipment
3. Equipment access [...] for large components [...]

Space, Power, and Cooling Exhaust Points
1. Data center design coordinates space, power, and cooling capacity exhaust points

Table 3.1 Site Location: Natural Disaster Risk Category

Flooding (river, lake, reservoir, canal, pond, etc.)
Higher: <100-Year Flood Plain
Lower: >100-Year Flood Plain
Hurricanes, Tornadoes, and Typhoons
Higher: High [...]
Seismic Activity
[...]
Active Volcanoes
Higher: High [...]

Table 3.2 Site Location: Man-Made Disaster Risk Category

Airport/Military Airfield
Higher: <3 miles from any active runway, inside a 1x5-mile runway extension
Lower: [...] outside a 1x5-mile runway extension
Adjacent Properties
Higher: [...]
Lower: Office building, factory or undeveloped land, etc.
Transportation Corridors
[...]

* Risk evaluation with regional or [...] flood [...] map [...]
* Peak Ground Acceleration (meters per second squared) that can be expected during the next 50 years with 10% probability.

About the Uptime Institute
Uptime Institute is an unbiased advisory organization focused on improving the performance, efficiency, and availability of business critical infrastructure through innovation, collaboration, and independent certifications.
Uptime Institute serves all stakeholders responsible for IT service availability through industry leading standards, education, peer-to-peer networking, consulting, and award programs delivered to enterprise organizations and third-party operators, manufacturers, and providers. Uptime Institute is recognized globally for the creation and administration of the Tier Standards & Certifications for Data Center Design, Construction, and Operational Sustainability along with its Management & Operations reviews, FORCSS® methodology, and Efficient IT Stamp of Approval.

Questions? Please contact your regional representative (http://uptimeinstitute.com/contactus) or email us at info@uptimeinstitute.com.

Select portions of the curriculum presentation are not provided in this booklet due to permissions granted or withheld by companies providing actual examples for educational purposes only. Please refrain from taking pictures or videotaping any portion of this course without the instructor's prior written permission.

Accredited Operations Specialist
Session 1: Introduction

Data Center Career Advancement Track
[Figure: career track diagram and accreditation logos. The CPD Standards Office, Independently Accredited CPD, www.cpdstandards.com]

Course Goals
+ Achieve an understanding of the concepts and criteria for developing a Management and Operations program for a critical facility
> Improved oversight of each component of operations
> Ensure operational performance meets the site business objectives
+ Evolve critical facility management practices in order to maximize the potential of installed infrastructure
+ Minimize the leading cause of critical facility outages: human error

Tier Certification
+ Tier Certification of Design Documents (TCDD)
> In-depth review of design to ensure full compliance
+ Tier Certification of Constructed Facility (TCCF)
> On-site verification of installed infrastructure
+ Tier Certification of Operational Sustainability (TCOS)
> On-site verification of ongoing operations
Uptime Institute also offers preliminary and progress reviews of designs and operations

Benefits of Tier Certification
+ Insurance for data center facilities investment
> Protects against 'loss' or weakness in infrastructure
> Ensures consistent solution
+ Enterprise: Recognizes organizational accomplishment
> Demonstrate to upper management that performance capability is there
+ Managed Service Provider: Awards industry achievement
> Competitive differentiation
> Reduces or eliminates need for due diligence

Data Center Operations Reviews
Uptime Institute has conducted Operational Sustainability and M&O Assessments since 2010, based upon decades of site operations knowledge and experience
+ Operational Sustainability Certifications: Tier + Gold, Silver, or Bronze
+ Management & Operations (M&O) Stamp of
Approval

Operational Sustainability Certification
+ Certification Process
> Need Design Documents and Constructed Facility Tier Certification first
> Operational Sustainability Certification based on Tier Certification
  * Gold, Silver, Bronze
> Operational Sustainability Certification becomes suffix to Tier Certification
  + Tier III Gold

Operational Sustainability Certification
Gold
> Low Site Location risks
> Full uptime potential of the installed infrastructure realized or exceeded
> Evident Management & Operations and Building Characteristics behaviors
Silver (valid 2 years)
> Low Site Location risks
> Opportunities for improvement in order to achieve the full potential of the installed infrastructure
Bronze
> Significant opportunities for improvement in order to achieve the full potential of the installed infrastructure

Management & Operations (M&O) Stamp of Approval
+ Recognition for achieving a high level of management and operations effectiveness, independent of the infrastructure
+ For non-Tier Certified critical facilities
+ Does not assess building infrastructure or outside site location risks that may impact operations
+ Based on Behaviors from the Tier Standard: Operational Sustainability, Management & Operations Element
+ 80 points out of 100 is the passing level
+ Valid 2 years

Why outages occur
Downtime happens...
+ According to 2015 Survey Data: Nearly 50% of enterprise IT organizations experienced a business-impacting data center outage in their own data center during the previous 12 months
+ In
2016, one-third of enterprise IT organizations experienced an outage from a colocation provider in the previous 12-month period

Outages due to "Human Error"

Conventional Wisdom is Wrong
+ Conventional wisdom blames human error for the majority of outages: operator mistakes. The responsibility falls to the operator for failing to rescue a situation
+ But failure, in most cases, can be attributed to poor management decisions (design compromises, budget cuts, staff reductions, vendor selection, a lack of appropriate procedures and resources) disconnected from the incident
What decisions led to a situation where front-line operators were unprepared or untrained to respond to an incident and mishandled it?

Definition
Operational Sustainability is defined as the behaviors and risks beyond Design Topology that impact the ability of a data center to meet its Business Objectives or Mission Imperatives over the long term.
[Figure: Operational Sustainability elements. Legible labels: Management & Operations (highest impact, greatest opportunity for change), Building Characteristics.]

Management and Operations Deficiencies
Based on over 100 assessments
[Figure: chart of Percent of Behaviors Ineffective]

Staffing and Organizational Deficiencies
Staffing
+ Inadequate staffing
+ Excessive overtime
+ No escalation process
Qualification
+ No list of required qualifications
+ No experience with data center specific equipment
Organization
+ Roles and responsibilities not documented
+ Data center organization not integrated

Maintenance Findings
Preventive Maintenance (PM)
+ No list of required PM activities
+ [...]
+ No quality control process
Housekeeping
+ Combustibles in data [...]
Maintenance Management System (MMS)
+ [...]

Management Deficiencies
Vendor Support
+ Contracts missing response times, call-in process, detailed SOW, or technician qualifications
+ No briefing for escorted vendors
Failure Analysis
+ No [...] outages [...]
+ No predictive maintenance program
Training
+ No formal training [...]
+ Undocumented on-the-job (OJT) programs
+ No list of training required by position

Operations Deficiencies
Operating Conditions
+ [...]
Documentation
+ [...]
Capacity Planning
+ No process to forecast future space, power, and cooling requirements
+ [...]

Purpose of the Tier Standard: Operational Sustainability
+ Address the gap to site operations based on the Standard
+ Document Operational Sustainability behaviors and risks
+ Prioritize those behaviors and risks to focus on the factors that
will most improve the performance of the critical facility
+ Provide a tool to measure a critical facility's Operational Sustainability
+ Encourages doing it 'your way': results oriented
> Behaviors, not requirements or prescriptions
> Behaviors definition: The way or method that is used to accomplish a task

Management & Operations Categories
Staffing and Organization
> Staffing
> Qualifications
> Organization
Maintenance
> Preventive Maintenance Program
> Housekeeping Policies
> Maintenance Management System
> Vendor Support
> Deferred Maintenance Program
> Predictive Maintenance
> Life-Cycle Planning
> Failure Analysis Program

Management & Operations Categories
Training
> Critical Facility Staff
> Vendors
Operating Conditions
> Load Management
> Operating Set Points
> Alternating Use of Infrastructure Equipment
Planning, Coordination, and Management

Building Characteristics Categories
Pre-Operational
Building Features
> Specialty Spaces
Infrastructure
> Space, Power, and Cooling Exhaust Points

Site Location Categories
Natural Disasters
> Flooding
> Hurricanes, Typhoons, Tornadoes
> Volcanoes
> Earthquakes
Man-made Disasters
> Adjacent Property Exposures
> Airports
> Transportation Corridors

How Do I Get There?
+ This course will guide you through the major aspects of building an Operations Program
+ An operations program is unique to each site; no two are exactly alike
+ The program is specific to the business objectives of the site
+ An effective program results in reducing the risk of human errors and outside impacts
+ It all starts with taking calculated steps to reach the desired goal
+ Use the three key attributes of effective behaviors for all processes and procedures developed

Key Characteristics of Effective Behaviors
Proactive
> Anticipate an issue or risk rather than react to it
> Continuous improvement
Practiced
> Disciplined approach to achieve desired result
> Processes and procedures in place and followed
Informed
> Site operations resources are known and available
> Knowledge is maintained by the organization, not the individual
> Relational approach

Operational Sustainability Summary
+ Operational Sustainability behaviors are your main defense in avoiding human errors
+ Effective Operational Sustainability behaviors lead to a more energy efficient critical facility
+ The Management and Operations program development needs to start long before your critical facility becomes operational
+ The Management and Operations program should constantly evolve and adapt as your critical facility changes
> Technology evolves
> Load fluctuates
> Assets age
> Staff changes

Accredited Operations Specialist
Session 2: Concept of Operations and Maintenance

Resource Options
+ Corporate (internal resources)
> Staff employees of the company that owns the critical facility
+ Operating Company (outsourced, 3rd party)
> Staff employees of the company hired to operate/maintain the critical facility for the owner
+ Vendor (
Schneider, Vertiv)
> Company that either the owner or operating company uses for specific maintenance

Concept of Operations & Concept of Maintenance
+ Concept of Operations and Concept of Maintenance are the two concepts that drive all other decisions in developing an Operational Sustainability Program
+ First decisions that need to be made before the development of the facility management and operations process
+ Many factors influence these concepts

Concept of Operations and Maintenance
[Figure: Factors That Influence Concept of Operations & Maintenance]

What is Operations?
+ System-wide knowledge (overall)
+ Typical operations duties
> Facility daily checks
> Performing pre-preventive maintenance activity isolations and switching
> Monitoring critical building alarms
> Minor maintenance and repair activities that are primarily quarterly or monthly checks
+ Performed by
> Corporate Resources
> Operating Company
> Not Vendors

Concept of Operations
[Figure: Concept of Operations]

What is Maintenance?
+ Component-level knowledge
+ Typical maintenance duties
> Monthly, quarterly, semi-annual, and annual maintenance activities for infrastructure
  + Uninterruptible power supply (UPS) systems
  + Engine generators
  + Chillers and computer room air handler (CRAH) units
> Provide technical knowledge support for key critical components
+ Performed by
> Corporate Resources
> Operating Company
> Vendor
> Combination

Concept of Maintenance
[Figure: Concept of Maintenance]

Maintenance versus Operations
+ Although there may be some overlap between the positions of an "operator" and a "maintenance technician," they are philosophically different positions
+ The only way there can be overlap is if the job skills of key personnel allow them to be capable of both operations and maintenance

Uptime Objective and Responsiveness
+ Staffing presence
+ Rigor of maintenance activities
+ Responsiveness
+ Tolerance for risk

Corporate Policies
+ Union or nonunion
+ Shift presence
+ Hiring policies
> Job descriptions
> Recruitment
> Pay scales
+ Safety requirements
+ Security

Availability of Qualified Personnel
+ Are there qualified local persons available for corporate resource staff positions?
+ Are there qualified local vendors to maintain equipment and meet response times?
+ Are there operating companies available locally or willing to move to your area?

System Complexity and Fault Tolerance
+ Qualification and experience required by technicians
+ Training requirements
+ Processes and procedures

Organizational Structure
+ Integrated organization vs.
Functional (traditional) organization
+ Goal is to create a high-reliability organization
> Improves communication
> Focus only on data center operations
> Single accountability chain
> Faster response to changing technology
> Greater ability to manage the whole critical environment

Traditional Organization
[Figure: traditional organization chart]

Integrated Organization
[Figure: integrated organization chart]

Corporate Resources versus Operating Company
+ Operating Company
> Most Operational Sustainability development/implementation moved to operating company
> Operating company assumes some of the risk of operations and maintenance
> Owner in contract management role
+ Corporate Resources
> Total control
> More adaptable throughout critical facility life cycle

All Decisions Affect Cost - Risk - Control
+ Costs associated with
> Corporate Resources
> Operating Company
> Vendors
> Combination
+ Risks
> Operations
> Maintenance
+ Control
> Less control with Operating Company or Vendor
> Accountability and ownership

Speaking of decisions, a brief [...]

FORCSS® Overview
A method to capture, compare, and prioritize the various impacts to the many IT deployment alternatives
+ FORCSS takes a holistic view of financial, risk, performance, and other key determinants. The system provides a consistent approach to evaluating a variety of solutions
+ Apply FORCSS to any major operations decision
> Internal Corporate Staffing vs. Outsource Model
> Build vs. Colocation deployment
> Vetting cloud computing providers
> Comparing vendors in major infrastructure investments
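To make the idea of a consistent comparison concrete, here is a minimal sketch in Python. It is not the FORCSS method itself: the course material does not describe a numeric scoring scheme, so the 1-to-5 rating scale, the weights, and the option names below are all hypothetical. Only the six factor names (financial, opportunity, risk, compliance, sustainability, service quality) are taken from the FORCSS acronym as used in this course.

```python
from dataclasses import dataclass

# Factor names follow the FORCSS acronym; everything else in this sketch
# (rating scale, weights, option names) is a hypothetical illustration.
FACTORS = (
    "financial",
    "opportunity",
    "risk",
    "compliance",
    "sustainability",
    "service_quality",
)


@dataclass(frozen=True)
class Option:
    """One deployment alternative with a rating per factor.

    Ratings use an assumed scale of 1 (unfavorable) to 5 (favorable).
    """
    name: str
    ratings: dict


def weighted_score(option, weights):
    """Return the weighted average rating of an option across all factors."""
    missing = [f for f in FACTORS if f not in option.ratings or f not in weights]
    if missing:
        raise ValueError(f"missing rating or weight for: {missing}")
    total_weight = sum(weights[f] for f in FACTORS)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(option.ratings[f] * weights[f] for f in FACTORS) / total_weight


def rank_options(options, weights):
    """Sort options from highest to lowest weighted score."""
    return sorted(options, key=lambda o: weighted_score(o, weights), reverse=True)


if __name__ == "__main__":
    # Hypothetical weights: this organization counts risk twice as heavily.
    weights = {f: 1.0 for f in FACTORS}
    weights["risk"] = 2.0

    in_house = Option("in_house", {
        "financial": 2, "opportunity": 3, "risk": 4,
        "compliance": 4, "sustainability": 3, "service_quality": 4,
    })
    outsourced = Option("outsourced", {
        "financial": 4, "opportunity": 4, "risk": 2,
        "compliance": 3, "sustainability": 3, "service_quality": 3,
    })

    for option in rank_options([outsourced, in_house], weights):
        print(f"{option.name}: {weighted_score(option, weights):.2f}")
```

The point of the sketch is the consistency, not the arithmetic: every alternative is rated against the same factors with the same weights, so a change in ranking can be traced to a specific rating or weight rather than to who happened to prepare the comparison.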
18-point Holistic Framework
FINANCIAL
+ Net Revenue Impact
+ Comparative Cost of Ownership
+ Cash and Funding Commitment
OPPORTUNITY
+ Time to Value
+ Scalable Capacity
+ Business Leverage and Synergy
RISK
+ Cost of Downtime vs. Availability
+ Acceptable Security Assessment
+ Supplier Flexibility
COMPLIANCE
+ Government Mandates
+ Corporate Policies
+ Compliance & Certifications to Industry Standards
SUSTAINABILITY
+ Carbon and Water Impact
+ Green Compliance & Certifications
+ PUE Reporting
SERVICE QUALITY
+ Application Availability
+ Application Performance
+ End-User Satisfaction

Why FORCSS?
Typical organizational decision-making challenges
+ Incomplete requirements information
+ Difficulty collecting information across all stakeholder constituencies
+ Lack of cost and performance insight into third-party services
+ Challenge of capacity planning for >6 months out
+ Avoiding over-provisioning
+ Inconsistency in reporting structures across geographies, divisions, internal and external resources, and providers

FORCSS Responds to Business Needs
+ FORCSS was developed to be applied at the Application or Physical layers.
+ FORCSS process adapts to unique characteristics of the organization
> Business need
> Deployment alternatives
> Organizational structure
> On-staff resources and capabilities
> Geography
> Schedule

Summary
+ Concept of Operations and Maintenance
> There are many factors that can influence who will perform operations and who will perform maintenance
> These decisions will establish the framework of your operations program
+ Organization structure can complicate communication if not well defined
+ FORCSS is an example of a decision making process that aids in consistency when evaluating solutions

AOS Exercise Scenarios

Scenario #1
+ Enterprise, corporate owned
+ 300,000 [...] 700,000 ft²
+ New build under construction; 2 years to Operations
+ Remote area without much other industry nearby; no airports or military bases within 10 miles; close to major highway for main route for data center access
+ 8 engine generators
+ 4 battery UPS modules
+ 4 chilled water system for cooling
+ Policy for some in-house, onsite shift presence
+ Non-union
+ [...] corporate owned data centers [...]
+ Site is in a flood plain
+ Site experiences moderate earthquakes occasionally
+ None
+ Gusts up to [...] mph
+ Moderate temperature zone, below freezing winter temperatures, near 100°F summer temperatures
+ Low crime in the remote area

Scenario #2
+ [...] Colocation
+ 275,000 [...] 30,000 ft²
+ Newly purchased existing data center being retrofitted; 7 months to Operations.
+ Tier I with N+1 redundancy
+ Near major metropolitan area with multiple roads accessing the site; railway line 2 miles to the south; major airport less than 3 miles away
+ 2 engine generators
+ 1 rotary UPS
+ 41.DX cooling system
+ Corporation owns other data centers, some run by in-house staff and others by operating companies
+ Less costly option is a high priority
+ Little to no risk of flooding
+ Area experienced a strong earthquake in the recent past
+ Relivo, 60 miles upwind
+ Gusts up to 60 mph
+ Desert region with highs of 15°F
+ Moderate, with vandalism in the area typically occurring at ...

Uptime Institute Accredited Operations Specialist
Session 3: Staffing

Staffing and Organization
+ Management & Operations is the most important Element of Operational Sustainability
  > Staffing and Organization is the most important Category
+ The right number of qualified, trained personnel, organized correctly, define success

Staffing
+ As uptime objectives increase, so does the staffing presence requirement
+ More complex infrastructure and high availability objectives require faster response times
  > Leads to increased staffing
  > Importance of escalation procedures
+ Determining staffing levels and justifying them is critical to achieving the full potential of your installed infrastructure

Critical Facility Staffing
+ Full critical facility staffing encompasses
  > Facility Operations
  > IT Operations
  > Security Operations
  > Client Services
  > Sales
  > Network Support
+ Facility Operations staffing consists of
  > Management, engineering, and administrative support
  > Building operations
    + Shift presence
    + Maintenance
    + Vendor support
+ This session focuses on building operations but the same principles apply
Variables Affecting Staffing
+ Business objective
+ Shift coverage requirements
+ Tolerance for risk presented by staffing overtime
+ Critical facility size, age, and configuration
+ Complexity of operations of infrastructure
+ Level of IT hardware turnover
+ Qualified vendor availability
+ Maintenance workload
+ Learning curve
+ Site support requirements (projects, tenant, etc.)

Staffing Development Process
[Process flow diagram]
Hours must be broken out by trade

Determining Trades
+ Labor required for each activity must then be broken out by trade
  > Some activities require specific trades and specialties
  > Others can be done by more generalists
+ Shift personnel must operate and sometimes perform maintenance on both mechanical and electrical systems
  > Instead of having both an electrician and a mechanic on each shift, using generalists on shift may provide more flexibility
+ Maintenance activities that require specific trades or skills are good candidates to be outsourced
+ Choosing the right trades can reduce the overall headcount
  > Full-time equivalent (FTE)

Maintenance Hours
[Diagram: Preventive Maintenance (PM), Corrective Maintenance (CM), Vendor Support, Project Support, Tenant Work Orders]

Preventive Maintenance
Maintaining Infrastructure in Like-New Condition
+ Examples
  > UPS battery maintenance
  > UPS module maintenance
  > Generator maintenance
  > Replacing filters and belts
  > Chiller maintenance
+ Hours derived from
  > Original equipment manufacturer (OEM) and vendor recommendations
  > Historical PM data
  > Trending data from MMS
  > Warranty information
Must break out by trade

Corrective Maintenance
Repairing Existing Infrastructure
+ Examples
  > Repairing a computer room air conditioning (CRAC) unit
  > Replacing defective electrical
    breakers
  > Replacing failed seals in pumps
+ Hours derived from
  > Historical information
  > Age and condition of infrastructure
  > Predictive maintenance
Must break out by trade

Vendor Support
Time Assisting an Outside Vendor
+ Examples
  > Checking vendor onto and off the job
  > Reviewing vendor work scripts
  > Escorting a vendor
  > Supporting a portion of vendor task
  > Monitoring vendor work
+ Hours derived from
  > Historical information
  > Labor estimates
Must break out by trade

Project Support
Major Expansion or Replacement of Infrastructure
+ Examples
  > Installing a new electrical ...
  > Replacing a chiller, engine generator, or UPS
  > Renovating computer room floor
  > Installing additional PDUs or CRACs
+ Hours derived from
  > Project labor estimates
  > Historical information
Must break out by trade

Tenant Work Orders
Support to the Tenants of the Critical Facility
+ Examples
  > Hot and cold calls
  > Fixing plugged toilets
  > Replacing light bulbs
  > Minor electrical installs, etc.
+ Hours derived from
  > Other similar facilities
  > Historical information
Must break out by trade

Staffing Exercise
+ Determine maintenance hour requirements by trade
+ Determine shift presence requirements
+ Calculate productive work hours per employee
+ Determine staffing requirements by trade

AOS Exercise - Staffing Scenario
+ Data Center Size ~ 5,000 (0,000 f, 2,000 Ke
+ Shifts are 8 hours per shift with no overlap. Typical workweek is 40 hours/week
+ Uptime Objective: 24x7 with on-site shift presence of 1 electrical and 1 mechanical onsite at all times. Shift rounds leave 60% of electrical's time and 50% of mechanical's time available for maintenance.
+ Four primary trades cover the daily operations of the data center. Annual hours by task (the first two trade columns are Electrical and Mechanical; the names of the third and fourth trades and the values marked "..." are not legible):

  Task                          Electrical   Mechanical   Trade 3   Trade 4
  Preventive Maintenance (PM)   ...          ...          ...       ...
  Corrective Maintenance (CM)   ...          ...          ...       ...
  Vendor Support                300          350          300       300
  Project Support               800          600          400       150
  Tenant Work Orders            200          300          500       250
  Total                         ...          ...          ...       ...

+ 5 sick days per year totaling 40 hours
+ 26 hours of off-site training per year
What should the FTE count be by individual trade to adequately staff for operations and maintenance?

Example: Maintenance Hours
[Worked table with rows Preventive Maintenance, Corrective Maintenance, Vendor Support, Project Support, Tenant Work Orders, and Total; the handwritten values are not legible]

Shift Presence
[Process flow diagram]

Shift Presence
+ Number of personnel required by shift determined by
  > Corporate or site policies
  > Response time
  > Shift work requirements
+ Shift activities include
  > Watch standers (BMS, rounds)
  > Operating equipment and responding to alarms
+ Remaining hours used for other maintenance activities

Shift Presence Calculations

  Coverage                      Calculation                         Hours
  5 days/week x 8 hours/day     5 days x 8 hours/day x 52 weeks     2,080
  5 days/week x 16 hours/day    5 days x 16 hours/day x 52 weeks    4,160
  5 days/week x 24 hours/day    5 days x 24 hours/day x 52 weeks    6,240
  7 days/week x 8 hours/day     365 days x 8 hours/day              2,920
  7 days/week x 16 hours/day    365 days x 16 hours/day             5,840
  7 days/week x 24 hours/day    365 days x 24 hours/day             8,760

+ The number of hours of coverage required must be calculated by trade
+ Table provides the hours for sample typical coverage requirements (per person)

Example: Shift Presence Requirements

                                         Electrical   Mechanical   Trade 3   Trade 4
  Coverage Time                          8,760        8,760        0         0
  Percentage Available for Maintenance   60%          50%          0         0
  Hours Available for Maintenance        5,256        4,380        0         0

Assumptions
+ One person 24x7x365 for both Electrical and Mechanical
+ Electrical: 60% of time available for maintenance activities
+ Mechanical: 50% of time available for maintenance activities

Defining Total Hours Required
[Process flow diagram]

Total Hours Required Calculation
+ Total hours required by trade
  > Hours required for maintenance (MH) plus
  > Hours required for shift presence (SP) minus
  > Shift hours available for maintenance (SHM)
+ The more maintenance done by shift presence staff, the lower the total hours of staffing required
MH + SP - SHM = Total Staffing Hours

Example: Total Hours Required
Maintenance + Shift Presence - Reduction

Determining Productive Hours
[Process flow diagram]

Productive Hours
+ Productive hours are those hours per person per year available to perform work (maintenance or shift presence)
+ Dependent on
  > Typical work week schedule
  > Vacation
  > Sick leave
  > Training
  > Holidays
+ Human Resources may have data

Example: Productive Hours

  Item                         Hours    Basis
  Work Year                    2,080    Based on 40 hours/week schedule
  Holidays                     80       10 holidays/year
  Vacation and Other           96       12 days/year
  Sick Time                    40       5 sick days/year
  Training                     26       0.5 hours/week
  Total (not available)        242
  Total Available Hours/Year   1,838    2,080 - 242

Calculating Staffing Requirements
[Process flow diagram]

Calculating Staff Requirements
+ FTE required by trade is calculated by dividing the hours required by the productive hours
+ This calculation will most likely produce fractional FTE requirements, which can be dealt with in several ways
Total Hours Required / Productive Hours = Staffing Requirement

Example: Staffing Requirements
[Worked table; values not legible]
Shift Schedule May Require More FTEs

Handling Fractional FTEs
[Process flow diagram]

Solutions for Fractions
of FTEs
+ Consider overtime
  > Use overtime to cover the fractions of FTEs required
  > Overtime <10% is reasonable; however, overtime >10% can sometimes lead to an increase in human error that causes outages
+ Outsource the fractional work
+ Hire additional staff (round up)
+ Try to move some of the work from one trade to another (more overlap)

Case Study: Shift Schedule Impact on FTEs
+ Requirement: 2 people per shift
  > 24 hours/day x 365 days = 8,760 x 2 = 17,520 hours of coverage each year
  > Productive hours per year: 1,849 hours
  > FTEs required with no shift overlap: 9.47 FTEs
  > Current staffing: 10 FTEs
+ Shift schedule was
  > Monday-Thursday: (3) 10-hour shifts (2 hours of overlap)
  > Friday-Sunday: (2) 12-hour shifts
  > This shift schedule required 11.28 FTEs
  > Average overtime rate: 11.32%

Potential Impacts to Staffing
+ Availability of labor
  > Local pool of people with a specific trade may be limited
  > Local labor quality may be poor
+ Company preferences
  > Company preferences and policies on what trades can be hired
  > Only certain job classifications may exist
+ Availability of outside vendor support
  > Site access by vendors may be limited or difficult
  > The specific vendor support required may not be available locally
+ Union or nonunion workforce
  > Union rules may determine what work certain trades can perform

Staffing Justification
+ Any documented staffing process will justify staffing requirements
  > Hours required by activity
  > Productive hours available
  > Headcount required by trade
  > Cost comparisons between in-house and outsourced work
+ Accurate historical data of labor hours by activity
  > A summary of major upcoming projects with labor estimates based on historical data
  > Hours trending will support staffing level adjustments as the site ages and PM and CM work increases

Fighting Off Staffing Cuts
+ Communicate the
  risks to continuous operations if
  > PMs and CMs are not completed
  > Old infrastructure equipment cannot be replaced/upgraded
  > Shift personnel are not available to monitor infrastructure and respond to events
+ Document what will not get done
  > Which infrastructure expansion projects to support IT installs cannot be supported due to man-hour restrictions
  > How direct support to tenant requirements will decrease due to man-hour priorities

Qualifications
+ In addition to having the right number of personnel, they must also be technically qualified to perform their assigned duties
+ The qualifications of personnel increase with the sophistication of the critical facility (Tier I-IV)
+ A qualified individual will have at a minimum
  > The required government licensure for their trade and job description
  > The appropriate experience with critical facility operations
+ Qualifications should be current and documented

Impact of Lack of Qualified Personnel
+ Personnel not having the technical qualifications to perform their assigned duties results in
  > Lack of knowledge on infrastructure equipment
  > Cost of hiring vendors that are qualified
  > Maintenance not performed correctly
  > Poor quality of work
  > Higher incidence of human error
  > Delays in responding to infrastructure concerns
  > Inability to react to critical facility issues
+ Lack of qualifications increases the likelihood of failures and loss of uptime
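The staffing arithmetic from this session (MH + SP - SHM, productive hours, and the FTE division) can be sketched in Python as follows. This is a minimal illustration, not part of the course material: the function names are made up for the example, and the 6,000 maintenance hours for the electrical trade is a hypothetical figure. The other inputs reuse numbers quoted in the session (a 2,080-hour work year with 242 hours unavailable, one electrician on site 24x7x365 with 60% of that time usable for maintenance, and the case study's 17,520 coverage hours against 1,849 productive hours).

```python
def total_staffing_hours(mh, sp, shm):
    """Total Staffing Hours = MH + SP - SHM (hours per year, one trade)."""
    return mh + sp - shm


def productive_hours(work_year, holidays, vacation, sick, training):
    """Hours per person per year actually available to perform work."""
    return work_year - (holidays + vacation + sick + training)


def fte_required(total_hours, productive):
    """Staffing requirement in FTEs; usually a fractional number."""
    return total_hours / productive


# Productive hours, using the figures from the session's example table.
productive = productive_hours(2080, holidays=80, vacation=96, sick=40, training=26)

# Shift presence for one electrician on site 24x7x365,
# with 60% of that time available for maintenance.
sp_electrical = 24 * 365
shm_electrical = sp_electrical * 60 / 100

# Hypothetical annual maintenance hours for the electrical trade
# (an assumption for illustration, not a figure from the course).
mh_electrical = 6000

total_electrical = total_staffing_hours(mh_electrical, sp_electrical, shm_electrical)
fte_electrical = fte_required(total_electrical, productive)

# Case study inputs: 2 people per shift, 24 hours/day, 365 days/year,
# and 1,849 productive hours per person per year.
case_study_fte = fte_required(2 * 24 * 365, 1849)

print(productive, total_electrical, fte_electrical, case_study_fte)
```

With the case-study inputs the division comes out at roughly 9.5 FTEs, in line with the figure quoted in the case study. The fractional part of any result is what the overtime, outsourcing, or round-up options then have to absorb.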
Organization
+ An effective organizational structure requires that the reporting chain and individual roles and responsibilities are understood by everyone
+ The following should be available, current, and understood
  > Organizational Chart
    + Showing reporting chain and interfaces between Facilities, IT, and Security
  > Key Individuals
    + Backup for key individuals

Roles and Responsibilities
+ Organizational charts vary based on many factors
  > Focus on service delivery to maximize critical facility operations
+ Optimal critical facility operations clearly define roles and responsibilities for every key action in a critical facility (e.g., move, add, change)
+ Advantages
  > Improves communication
  > Focus only on service delivery
  > Clearly defined accountability chain
  > Faster response to changing technology
  > Greater ability to manage the whole critical environment

Roles and Responsibilities Matrix
+ Roles and responsibilities of all critical facility activities should be documented and available to all personnel
+ Documenting this information leads to accountability
+ A roles and responsibilities matrix should identify at a minimum
  > Who is Responsible
  > Who is Accountable
  > Who is Consulted
  > Who do I need to Inform
+ A common method used to document critical facility roles and responsibilities is the RACI Model

RACI Model
Responsible
+ Who is/will be doing this task?
+ Who is assigned to work on this task?
Accountable
+ Whose head will roll if this goes wrong?
+ Who has the authority to make decisions?
Consulted
+ Anyone who can tell me more about this task?
+ Any stakeholders already identified?
Informed
+ Anyone whose work depends on this task?
+ Who has to be kept updated about the progress?
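A RACI matrix is easy to keep as structured data and check automatically. The sketch below is an illustration only: the role names and activities are hypothetical, and the rule it enforces (at least one Responsible and exactly one Accountable per activity) is a common RACI convention assumed here, not something stated in the course material.

```python
from collections import Counter

RACI_CODES = {"R", "A", "C", "I"}


def check_raci(matrix):
    """Return a list of problems found in a {activity: {role: code}} matrix."""
    problems = []
    for activity, assignments in matrix.items():
        counts = Counter(assignments.values())
        # Flag anything that is not one of the four RACI letters.
        for code in sorted(set(counts) - RACI_CODES):
            problems.append(f"{activity}: unknown code {code!r}")
        # Assumed convention: someone must do the work.
        if counts["R"] == 0:
            problems.append(f"{activity}: nobody is Responsible")
        # Assumed convention: exactly one person answers for the outcome.
        if counts["A"] != 1:
            problems.append(
                f"{activity}: needs exactly one Accountable, found {counts['A']}"
            )
    return problems


# Hypothetical roles and activities, for illustration only.
matrix = {
    "Work order close-out": {
        "Technician": "R",
        "Shift lead": "A",
        "Facility manager": "I",
    },
    "Analyze failure reports": {
        "Engineer": "R",
        "Facility manager": "C",
    },
}

problems = check_raci(matrix)
for line in problems:
    print(line)
```

Here the second activity is reported because no one is marked Accountable, which is the kind of gap the matrix is meant to expose.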
Example: Maintenance Crew KPI RACI Chart
[RACI table; legible activity rows: Inputting Failure Data, Work Order Completion, Work Order Close Out, Analyze Failure Reports, Implementing New Strategies. Role columns and cell values are not legible]

Swim Lane Diagram
[Diagram]

Staffing and Organization Summary
+ Following the process provided in this session allows you to both determine and justify the staffing required to effectively operate your critical facility
+ Having the right number of qualified personnel organized correctly is the cornerstone to achieving your uptime objective
+ Documenting roles and responsibilities leads to greater accountability and better ability to fight staffing cuts

Uptime Institute Accredited Operations Specialist
Session 4: Maintaining a Critical Facility

Maintenance
+ Second most important category to meeting uptime objectives
+ Keeps equipment in
  like-new condition
+ Identifies potential problems before they cause a failure
+ Extends the life of the equipment
An effective maintenance program reduces opportunity for failure

Preventive Maintenance Program
+ Purpose: Keep equipment in like-new condition
  > Reduces equipment breakdowns
  > Identifies potential problems before they occur
+ As the critical facility sophistication increases, so do the requirements for maintenance and the processes and procedures to support it
+ Program includes
  > List of equipment requiring maintenance
  > Decision on who will do the maintenance (in-house or vendor)
  > Detailed processes and procedures on how to perform the work
  > Schedule of when the maintenance needs to be accomplished

Building a Preventive Maintenance Program
[Process flow diagram]

Equipment Identification
[Process flow diagram]

Equipment Identification
+ Need inventory of all equipment in the critical facility
  > Description
  > Make
  > Model
  > Identification (location, serial number, etc.)
  > Installation Date
  > Warranty Information
+ This information is needed to make the decision on what requires preventive maintenance

Equipment Maintenance Scope
[Process flow diagram]

Equipment Maintenance Scope
+ Sources for determining what needs PM
  > Original equipment manufacturer (OEM) recommendations
  > Industry standards (IEEE, NETA, ASHRAE, etc.)
  > Equipment installation records and documentation
  > Warranty information
  > Building commissioning records
  > Design engineer recommendations
  > Historical information

Equipment Maintenance Schedule
[Process flow diagram]

Equipment Maintenance Schedule
+ Sources for PM frequency requirements
  > OEM recommendations and warranties
  > NETA, IEEE, ASHRAE documentation
  > Design engineer recommendations
  > MMS: adjust frequencies based on history of the equipment
+ Balance the maintenance schedule
  > Availability of in-house labor and vendors
  > Budget
  > Operational cycles (risk windows)

OEM Schedule Example
[Sample OEM maintenance schedule table]
Typical operation and maintenance (O&M) manuals contain a recommended maintenance schedule

Methods of Procedure

Method of Procedure
[Sample MOP form: CRAC Quarterly PM. Initial steps: Review MOP; Parts & Tools on hand]
Tools or Parts Required (legible entries): Belt Tension Gauge, Shop Vacuum, Ladder

Method of Procedure
Procedures
   2  Go to CRAC 1.
   3  On local screen, verify unit is ON; check for vibrations or noises. (Expected Result: Unit is ON)
   4  Go to CRAC 2.
   5  On local screen, verify unit is OFF. (Expected Result: Unit is OFF)
   6  Listen for unusual noises and feel for unusual vibrations by touching the exterior panels.
   7  Proceed to BMS Workshop.
   8  At the BMS computer, turn CRAC 1 off by commanding at BMS Point 70000. (Expected Result: CRAC 2 status is ON and CRAC 1 status is OFF)
   9  Proceed to Room 125.
  10  Go to CRAC 1; verify unit is OFF. (Expected Result: CRAC 1 is OFF)
  11  Go to CRAC 2; verify unit is ON.
  12  Proceed to Switchboard A Room.
  13  Isolate the unit at switchboard "A" by opening the "CRAC-1" breaker. LOTO breaker. (Expected Result: CRAC 1 has no power)
  14  Proceed to Room 125.
  15  Go to CRAC 1; verify the unit is without power by seeing the front screen is blank.
  16  Using a 5/16" Allen wrench, open all exterior panels on CRAC 1.
  17  On a ladder, visually inspect the filter bank located at the top of the unit (there are 2 layers with a total of 16 filters).
  [Steps 18-27 are only partly legible. Recoverable fragments, in order:]
      ... exceed 1/4". If tension exceeds ..., ... PART # ...
      Remove wye strainer on left side of the humidifier pan. Check for debris. Clean and rinse.
      Open domestic water valve to the humidifier pan.
      With 5/16" Allen wrench, close all exterior panels on CRAC 1.
  28  Proceed to Switchboard A Room.
  29  Remove LOTO locks. Un-isolate the unit at switchboard "A" by closing the CRAC 1 breaker. (Expected Result: CRAC 1 has power)
  30  Proceed to Room 125.
  31  Go to CRAC 1; verify power has been restored to the unit by seeing the front panel light up. (Expected Result: CRAC 1 power has been restored)
  32  Proceed to BMS Workshop.

Method of Procedure
Procedures cont.
  33  At the BMS computer, turn CRAC 1 ON by commanding ON at BMS Point ... (Expected Result: CRAC 2 status is OFF and CRAC 1 status is ON)
  34  Proceed to Room 125.
  35  Go to CRAC 1.
  36  On local screen, verify ... (Expected Result: ...)
  37  Go to CRAC 2.
  38  On local screen, verify unit is OFF. (Expected Result: Unit is OFF)
  PM COMPLETE

Method of Procedures
+ MOPs are critical to reducing human-error outages
+ Reduces potential for mistakes by providing detailed step-by-step procedures
+ All maintenance/switching activities should be scripted
+ As uptime objectives increase, so does the sophistication required in the scripts

Method of Procedures
+ Step-by-step procedures for each PM activity
  > Every step well documented
  > Cross reference SOPs to begin and end the MOP
+ Establish the normal operating configuration of each infrastructure system
+ Switch between redundant components or from one system configuration to another
+ Exercise immediate and controlling actions when abnormal circumstances occur
  > MOPs need to be continually reviewed and updated (incorporate any changes to infrastructure)

Method of Procedures
+ Labor Requirements
  > Trades required
  > Hours by trade
+ Tools and Parts
  > Personal Protective Equipment (PPE) for safety
  > Proper tools reduce risk and potential damage to equipment
  > Availability of the right parts reduces maintenance period
+ Resources for developing MOPs
  > Craftsmen
  > OEM recommendations
  > Historical data from MMS
  > Similar sites

Implement Preventive Maintenance Program
+ This process will provide all the information for a PM program
+ Implementation and tracking for a PM program is through an MMS
+ An effective PM program also requires a quality control program

Quality Control Program
Ensures the PM activities are
+ Being conducted
+ Performed correctly
Develop a methodology that works for your site
Various metrics derived from the MMS can point to areas needing quality
improvement

Maintenance Management System
+ Center of an effective maintenance program
+ System to track status and trends of all maintenance activities
+ Repository for historical data
+ Library of continuously improving processes and procedures
+ Computerized...or not

Process to Implement Maintenance Management System
[Process flow diagram]

Maintenance Management System: Typical Data
1. Unique asset identification number
2. Equipment data
   > Manufacturer
   > Make
   > Model number
   > Serial number
   > Size (capacity)
   > Year of manufacture
   > Date installed
3. Location
4. Warranty information
5. Operating specifications
   > Normal operating specifications
   > History data on actual operating specifications

Typical Data in MMS
6. PM requirements
   > Maintenance actions required
   > Responsible for performing activity (in-house, vendor, etc.)
   > Frequency (schedule)
   > Parts required
   > Any special tools required
   > Calibration requirements
   > Estimated labor hours
   > Associated MOPs and SOPs
   > Safety precautions
7. Record of all maintenance activities
   > Date
   > Description of activity and results
   > Labor (estimated and actual)
   > Parts used and cost

Using the Data
+ Production of work orders
+ Justification of resources (people, tools, parts)
+ Continuous improvement of maintenance practices
+ Typical Reports
  > Maintenance hours tracking reports for estimated vs.
    actual
  > Asset/component maintenance history
  > Maintenance costs for labor and materials
  > Work order completion tracking
  > Deferred maintenance tracking to verify all maintenance work is being completed as scheduled
  > Future maintenance requirements tracking to schedule vendors
Unused data is useless data

Preventive Maintenance Work Orders
+ Work order number
+ Equipment description, location, and identification number
+ List of required SOP and MOP to be performed
+ Duration for all work activities associated with the work order
+ Materials and tools required to perform work
+ Warranty information
+ Safety precautions
+ Notification instructions
+ Technician/Engineer assigned
+ Date assigned and completed
+ Description of the work to be performed
+ History of work performed on the asset (including predictive, preventive, and corrective)
+ Operating specification data to collect

Importance of Tools and Parts Identification on Work Orders
+ You spend 30 minutes isolating an electrical component in preparation for a PM activity
+ You realize a tool or part to complete the maintenance activity is missing
+ You have 2 options
  > Option 1: Back out of the procedure and set the system back to its normal operating state, thereby wasting at least 1 hour of a technician's time
  > Option 2: Extend the maintenance window to retrieve the tool or part, thereby extending the amount of time that the critical facility is at risk
+ Neither of these options is desired!
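One way to avoid both options is to check the work order's tool and parts list against what is actually staged before any equipment is isolated. The sketch below assumes a hypothetical `WorkOrder` record and made-up item names; it is not the schema of any particular MMS.

```python
from dataclasses import dataclass, field


@dataclass
class WorkOrder:
    """Hypothetical work order record; field names are illustrative."""
    number: str
    equipment: str
    required_tools: set = field(default_factory=set)
    required_parts: set = field(default_factory=set)


def missing_items(order, on_hand):
    """Tools and parts listed on the work order that are not staged yet."""
    return sorted((order.required_tools | order.required_parts) - set(on_hand))


order = WorkOrder(
    number="WO-0001",  # hypothetical identifier
    equipment="CRAC 1",
    required_tools={"belt tension gauge", "shop vacuum", "ladder"},
    required_parts={"filter set"},
)

# Items physically staged at the job site before the maintenance window opens.
staged = {"ladder", "shop vacuum", "filter set"}

missing = missing_items(order, staged)
ready_to_isolate = not missing

print(missing, ready_to_isolate)
```

In this example the belt tension gauge is missing, so the check fails before the 30 minutes of isolation work begins rather than after.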
Importance of Maintenance History on Work Orders
+ You have a UPS module that consistently experiences cooling fan failures
+ If the maintenance history of that UPS module is identified on work orders, then the maintenance personnel will know to specifically check the operation of the cooling fans
+ The identification of the maintenance history of specific assets on work orders can help to reduce the instances of unplanned component failures

Warranty Information
+ PM or Warranty
  > A PM action maintains the equipment in like-new condition
  > A warranty action repairs the equipment
+ Do not void the warranty
  > Maintenance work is usually required to maintain the warranty
  > Work done improperly or by unqualified technicians may void the warranty
+ Should be part of the work order to ensure craftsmen know repairs need to be done under warranty

Critical Spare Parts
+ Inventory will vary based on
  > Business objectives (system and equipment redundancy)
  > Availability of parts locally
  > Cost of acquiring and maintaining
  > Storage space
  > Age of equipment
+ Continuous process
  > Identify requirements
    + Initially from OEM information
    + Trending
  > Tracking status of stock
  > Reorder at specific points

In-House vs. Vendor Decision
+ Decision should be based on
  > Availability of labor
  > Costs
  > Risk
+ Availability of labor
  > In-house
    + Does or should in-house expertise exist?
    + Are individuals with required skills available locally to hire?
  > Vendor
    + Are qualified local vendors available to meet SLA requirements?
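The critical spare parts process described earlier (track the status of stock, reorder at specific points) reduces to a simple comparison per part. The sketch below is illustrative only: the part names, quantities, and reorder points are invented, and a real reorder point would come from OEM information and trending.

```python
def parts_to_reorder(stock):
    """Names of parts whose on-hand quantity is at or below the reorder point.

    `stock` maps a part name to a (on_hand, reorder_point) pair.
    """
    return sorted(
        name
        for name, (on_hand, reorder_point) in stock.items()
        if on_hand <= reorder_point
    )


# Hypothetical inventory snapshot: part -> (on hand, reorder point).
stock = {
    "CRAC fan belt": (1, 2),
    "UPS cooling fan": (4, 2),
    "Air filter": (16, 16),
}

to_reorder = parts_to_reorder(stock)
print(to_reorder)
```

Treating "at the reorder point" as a trigger, rather than only "below it", is a design choice made here so that an order goes out before stock drops further.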
Process to Develop Vendor Program
[Process flow diagram]

Service Level Agreement
Detailed Scope of Work for the Vendor PMs
+ Equipment to be maintained and approved schedule
+ Details on work to be performed and response times
+ Responsibilities for tools and parts
+ Qualification of technicians
+ Work rules
+ Requirement for vendor
  > MOPs submitted for approval
  > Linked to in-house MOPs or site operating procedures

Vendor Support
+ List of qualified vendors by system should be available for normal and emergency work
  > Call-in process for emergencies
  > Points-of-contact list
  > Approved and qualified technicians

Deferred Maintenance
+ Deferred maintenance is any maintenance activity postponed for any reason
+ Risk of infrastructure failure is increased
+ MMS should provide a deferred maintenance report
  > List of all maintenance activities that have been deferred
  > Reason for deferment
  > Required resource to complete deferred activities
    + Justification for getting necessary resources
+ For critical facilities with high-availability objectives, deferred maintenance should be zero
+ Maintenance windows for deferred maintenance need to be scheduled
+ If deferred maintenance does occur, monitor performance closely

Predictive Maintenance
+ Various techniques are used to evaluate the condition of equipment and modify maintenance activities accordingly
+ Typical critical facility predictive maintenance activities
  > Infrared (IR) analysis
  > Machinery oil analysis
  > Vibration analysis
  > Ultrasonic analysis

Life-Cycle Planning
+ The end-of-life expectancy date for each piece of equipment should be in MMS
  > This information can then be used to build a life-cycle plan and capital replacement budget
  > Dates should be reviewed and updated periodically based on the condition of equipment
+ Effective life-cycle
  planning is a careful balance
  > Cost
  > Risk to continuous availability business objectives
  > Ongoing operational requirements

Life-Cycle Planning
+ Equipment life expectancy can be extended
  > Quality preventive maintenance program
  > Operating equipment within design specifications
  > Operating in average or nonextreme environmental conditions
  > Alternating use of like equipment
  > Keeping deferred maintenance backlog at zero

Failure Analysis
+ Every outage or save should be logged and analyzed
+ Analysis
  > Detailed documentation of the incident and impacts to operation
  > Review of actions taken before and after incident
  > Determine root cause
  > Lessons learned
  > Recommended corrective actions
+ Corrective actions need to be implemented
+ Review failure logs periodically to identify any trends

Housekeeping
+ Poor housekeeping exacerbates risk
  > Combustibles
  > Contaminants
  > Safety trip hazards
+ Applies to overhead, above floor area, and underfloor
+ Housekeeping policies should address
  > Convenience items (cigarettes, beverages, food)
  > Job cleanup practices (including installs/de-installs)
+ Requires awareness by in-house staff and vendors
+ Enforcement of policies is essential to maintaining a clean critical facility

Maintenance Summary
+ MMS is the center of an effective maintenance program because it tracks all activities and their status
+ PMs are required to keep equipment in like-new condition
+ The more complex the facility, the more sophisticated the MOPs to minimize human-error incidents and outages
+ Poor housekeeping results in introduction of combustibles, contaminants, and safety trip hazards, which increase risk of outages

[PM schedule table; the frequency marks (Monthly, Quarterly, Semiannually) are not legible]
  System               PM activity
  Engine Generators    Engine maintenance
  Engine Generators    Run against load
  Engine Generators    Paralleling switchgear
  Engine Generators    Start battery maintenance (where present)
  UPS                  Automatic static transfer switches
  UPS                  UPS batteries
  UPS                  Battery monitoring system
  UPS                  System cabinet maintenance, incl. ... switches (where present)
  UPS                  Hydrogen detection system (for vented batteries)
  Electrical           Transformers
  Electrical           Automatic ...
  Electrical           Power distribution units
  Electrical           Remote power panels
  Electrical           Branch circuit monitoring panels
  Electrical           Manual transfer switches
  Electrical           Station battery system
  Electrical           Static transfer switches
  Mechanical           Air handling units
  Mechanical           Supply/exhaust fans
  Mechanical           Computer room air conditioner units
  Mechanical           Inspect fans
  Mechanical           Dampers
  Mechanical           Expansion tanks
  Mechanical           Inspect water cooled chillers
  Mechanical           Sump pumps
  Mechanical           Control valves and actuators
  Legend: Monthly, Quarterly, Semiannually

Uptime Institute Accredited Operations Specialist
Session 5: Operational Documentation

Planning, Coordination, and Management
+ Site Policies and Procedures
+ Escalation Procedures
+ Financial Process
+ Reference Library

Site Policies and Procedures
+ Formal documented policies and procedures ensure consistency
+ Site Configuration Procedures (SCPs)
  > Establish the normal site configuration
+ Standard Operating Procedures (SOPs)
  > Spell out how changes are performed during normal operations through the utilization of step-by-step instructions
+ Method of Procedure (MOP)
+ Emergency Operating Procedures (EOPs)
  > Address control during an abnormal event
Using Policies and Procedures Leads to Consistency

Site Policies
+ A change management program must be in place to address permanent and temporary changes to the site configuration
  > Changes to the site infrastructure in terms of redundancy, availability, or operating characteristics and parameters
  > Changes to the critical load as manifested by IT changes to normal operations of hardware, software, and applications
  > Changes to the security of the site
+ A formal approval process must be in place
  > Change board
+ Any change is formally documented to ensure any existing processes are modified to account for the new configuration or change
