
Open Source Management Options

September 30th, 2008 Jane Curry Skills 1st Ltd www.skills-1st.co.uk

Jane Curry, Skills 1st Ltd, 2 Cedar Chase, Taplow, Maidenhead SL6 0EU, 01628 782565, jane.curry@skills1st.co.uk

Synopsis
Nuts and bolts network and systems management is currently unfashionable. The emphasis is far more on processes that implement service management, driven by methodologies and best practices such as the Information Technology Infrastructure Library (ITIL). Nonetheless, all service management disciplines ultimately rely on a way to determine some of the following characteristics of systems and networks:

Configuration management
Availability management
Problem management
Performance management
Change management
Security management

The commercial marketplace for systems and network management offerings tends to be dominated by the big four: IBM, HP, CA and BMC. Each has large, modular offerings which tend to be very expensive. Each has grown its portfolio by buying up other companies and then performing some level of integration between their respective branded products. One can argue that the resulting offerings tend to be "marketechtures" rather than architectures.

This paper looks at Open Source software that addresses the same requirements. Offerings from Netdisco, Cacti and The Dude are examined briefly, followed by an in-depth analysis of Nagios, OpenNMS and Zenoss.

This paper is aimed at two audiences. For a discussion on systems management selection processes and an overview of three main open source contenders, read the first few chapters. The last few chapters then provide a product comparison. For those who want lots more detail on Nagios, OpenNMS and Zenoss, the middle sections provide in-depth discussions with plenty of screenshots.

Table of Contents
1 Defining Systems Management
1.1 Jargon and processes
1.2 Systems Management for this paper
2 Systems management tools
2.1 Choosing systems management tools
2.2 The advantages of Open Source
3 Open Source management offerings
4 Criteria for Open Source management tool selection
4.1 General requirements
4.1.1 Mandatory Requirements
4.1.2 Desirable Requirements
4.2 Defining network and systems management
4.2.1 Network management
4.2.2 Systems management
4.3 What is out-of-scope?
5 A quick look at Cacti, The Dude and netdisco
5.1 Cacti
5.2 netdisco
5.3 The Dude
6 Nagios
6.1 Configuration - Discovery and topology
6.2 Availability monitoring
6.3 Problem management
6.3.1 Event console
6.3.2 Internally generated events
6.3.3 SNMP TRAP reception and configuration
6.3.4 Nagios notifications
6.3.5 Automatic responses to events - event handlers
6.4 Performance management
6.5 Nagios summary
7 OpenNMS
7.1 Configuration - Discovery and topology
7.1.1 Interface discovery
7.1.2 Service discovery
7.1.3 Topology mapping and displays
7.2 Availability monitoring
7.3 Problem management
7.3.1 Event console
7.3.2 Internally generated events
7.3.3 SNMP TRAP reception and configuration
7.3.4 Alarms, notifications and automations
7.4 Performance management
7.4.1 Defining data collections
7.4.2 Displaying performance data
7.4.3 Thresholding
7.5 Managing OpenNMS
7.6 OpenNMS summary
8 Zenoss
8.1 Configuration - Discovery and topology
8.1.1 Zenoss discovery
8.1.2 Zenoss topology maps
8.2 Availability monitoring
8.2.1 Basic reachability availability
8.2.2 Availability monitoring of services - TCP/UDP ports and windows services
8.2.3 Process availability monitoring
8.2.4 Running commands on devices
8.3 Problem management
8.3.1 Event console
8.3.2 Internally generated events
8.3.3 SNMP TRAP reception and configuration
8.3.4 email / pager alerting
8.3.5 Event automations
8.4 Performance management
8.4.1 Defining data collection, thresholding and graphs
8.4.2 Displaying performance data graphs
8.5 Zenoss summary
9 Comparison of Nagios, OpenNMS and Zenoss
9.1 Feature comparisons
9.1.1 Discovery
9.1.2 Availability monitoring
9.1.3 Problem management
9.1.4 Performance management
9.2 Product high points and low points
9.2.1 Nagios goodies and baddies
9.2.2 OpenNMS goodies and baddies
9.2.3 Zenoss goodies and baddies
9.3 Conclusions
10 References
11 Appendix A Cacti installation details

1 Defining Systems Management


1.1 Jargon and processes
Every organisation and individual has their own perspective on systems management requirements; the first essential step when looking for systems management solutions is to define what those requirements are. This gives a means to measure success of a project.

There are many different methodologies and disciplines for systems management, from the International Organization for Standardization (ISO) FCAPS acronym (Fault, Configuration, Accounting, Performance and Security) through to the Information Technology Infrastructure Library (ITIL), which divides the ITIL V2 framework into two categories:

Service Support which includes the:
  Service Desk function
  Incident management process
  Problem management process
  Configuration management process
  Change management process
  Release management process

Service Delivery which includes the:
  Service Level management process
  Capacity management process
  IT Service Continuity management process
  Availability management process
  Financial management for IT services

Key to the core of configuration management and the entire ITIL framework is the concept of the Configuration Management Database (CMDB), which stores and maintains Configuration Items (CIs) and their inter-relationships.

The art of systems management is defining what is important: what is in scope and, perhaps more importantly, what is currently out of scope. The science of systems management is then to effectively, accurately and reliably provide data to deliver your systems management requirements. The devil really is in the detail here. A "comprehensive" systems management tool that delivers a thousand metrics out of the box but which is unreliable and/or not easily configurable is simply a recipe for a project that is delivered late and over budget.

For smaller projects or Small / Medium Business (SMB) organisations, a pragmatic approach is often helpful. Many people will want a say in the definition of management. Others, whose requirements may be equally valuable, may not know the art of the possible. Hence, combining top-down requirements definition workshops with a bottom-up approach of demonstrating "top 10" metrics that can easily be delivered by a tool can result in an iterative process that fairly quickly delivers at least a prototype solution.

1.2 Systems Management for this paper


For the purposes of this paper, I shall define systems management as spanning:

Configuration management
Availability management
Problem management
Performance management

I shall further define "systems" to include local and wide-area networks, as well as PCs and Unix-like systems. In my environment, I do not have mainframe or proprietary midrange systems. PCs run a variety of versions of Windows. "Unix-like" tends to mean a flavour of Linux rather than a vendor-specific Unix, though there is some legacy IBM AIX and Sun Solaris.

2 Systems management tools


There are no systems management solutions for sale. The successful implementation of systems management requirements is a combination of:

Appropriate requirements definition
Appropriate tools
Skills to translate the requirements into customisation of tools
Project management
User training
Documentation

In theory, the choice of tool should be driven by the requirements. In practice, this is often not the case and a solution for one aspect of systems management in one area of a business may become the de facto standard for a whole organisation.

There are good reasons why this might come about. It is not practical to run a centralised Service Desk with a plethora of different tools. A Framework-based tool with a centralised database, and a common look-and-feel across both Graphical User Interface (GUI) and Command Line Interface (CLI), offering modules that deliver the different systems management disciplines, is a much more cost-effective solution than different piecemeal tools for different projects, especially when the cost of building and maintaining skills and educating users is taken into account.

Tool integration is a large factor in the successful rollout of systems management. The concept of a single Configuration Management Database (CMDB) that all tools feed and use is key to this.

A good tool delivers "useful stuff" easily out of the box and provides a standard way to then provide local customisation.

At its most basic, the tool is a compiler or interpreter (C, bash, ...) and the customisation is writing programs from scratch. At the complex end of the spectrum, the tool may be a large suite of modules from one of the big four commercial suppliers, IBM, HP, CA and BMC. At the really complex end is where you have several of the big commercial products involved in addition to home-grown programs.

2.1 Choosing systems management tools


Every organisation has different priorities for the criteria that drive tool selection. For the moment, let's leave aside the technical metrics and look at some of the other decision factors:

Ease of use - not just what demos well but what implements well in your environment
Skills necessary to implement the requirements versus skills available
Requirements for and availability of user training
Cost - all of it, not just licences and tin: evaluation time, maintenance, training, ...
Support from supplier and/or communities
Scalability
Deployability - management server(s) ease of installation and agent deployment
Reliability
Accountability - the ability to sue / charge the vendor if things go wrong

If accountability is high in your priorities and the software cost is a relatively low priority then you are likely to choose one of the commercial offerings; however, if you have a well-skilled workforce, or one prepared and able to learn quickly, and overall cost is a limiting factor, then Open Source offerings are well worth considering. Interestingly, you can find offerings that suit all the other bullets above from both the commercial and the Open Source stables.

2.2 The advantages of Open Source


One attraction of Open Source to me is that you don't actually have to fund salesfolk. Some costs do need to be invested in your own people to investigate the offerings available, research their features and requirements, and participate in the online fora that share experience around the globe. These costs may not be small but at least the investment stays within the company and hopefully those people who have done the research will then be a key part of the team implementing the solution. This is often not the case if you purchase from a commercial supplier.

Open Source does not necessarily mean "you're on your own, pal!". Most of the Linux distributions have a free version and a supported version, where a support contract is available to suit your organisation and budget. Several of the Open Source management offerings have a similar model, but do ensure that the free version has sufficient features for your requirements and is not just a well-featured "demo".

All software has bugs in it. Ultimately, if you go Open Source, you have the source code so you have some chance of fixing problems with local staff or buying in global expertise, and that doesn't necessarily mean transporting a guru from Australia to Paris. Open Source code is available to everyone so remote support and consultancy is a distinct possibility. With the best will in the world, commercial organisations will prioritise problem reports according to their criteria, not yours.

There are some excellent fora and discussion lists for commercial products (I have participated in several of them for many years); some even have input from the support and development teams; however, the source code is not open for discussion or community development. With a very active Open Source offering, there tends to be a much larger pool of developers and testers (ie. us) and the chance of getting problems fixed may be higher, even if you cannot fix it yourself. I would emphasise very active Open Source offerings: unless you really do have some very highly skilled local staff that you are sure you are going to keep, it may be a risky choice to participate in a small Open Source project.

3 Open Source management offerings


There are lots of different Open Source management offerings available. Many of them rely on the Simple Network Management Protocol (SNMP), which defines both a protocol for an SNMP manager to access a remote SNMP agent and the data that can be transferred. SNMP data values that an SNMP manager can request are defined in Management Information Bases (MIBs), which can either be standard (MIB-2) or "enterprise-specific"; in other words, each manufacturer can provide different data about different types of device. Events emanating from an agent (typically problems) are SNMP traps. There are three versions of the SNMP standard:

V1 (1988) - still most prevalent. Significant potential security and performance issues.
V2 (1993) - solved some performance issues. Never reached full standard status.
V3 (2002) - significantly improved performance and security issues. Much more complex.

Of the Open Source management solutions available, some are excellent point solutions for specific niche requirements. MRTG (Multi Router Traffic Grapher), written by Tobi Oetiker, is an excellent example of a compact application that uses SNMP to collect and log performance information and display it graphically. If that satisfies your requirement, don't look any further, but it will not help you with defining and collecting problems from different devices and then managing those problems through to resolution.

An enhancement of MRTG is RRDTool (Round Robin Database Tool), again from Tobi Oetiker. It is still fundamentally a performance tool, gathering periodic, numeric data and displaying it, but RRDTool has a database at its heart. The size of the database is predetermined on creation and newer data overwrites old data after a predetermined interval. RRD can be found embedded in a number of other Open Source management offerings (Cacti, Zenoss, OpenNMS).

A further enhancement from RRDTool is Cacti which provides a complete front end to RRDTool. A backend MySQL relational database can be used behind the Round Robin databases; data sources can be pretty well any script in addition to SNMP; and there is user management included. This is still a performance data collection and display package, not a multi-discipline, framework, systems management solution.

Moving up the scale of features and complexity, some offerings are slanted more towards network management (netdisco, The Dude); others towards systems management (Nagios).

Some aim to encompass a number of systems management disciplines with an architecture based around a central database (Nagios, Zenoss, OpenNMS).

Some are extremely active projects with hundreds of appends to mail lists per month (Nagios, Zenoss, OpenNMS, Cacti); others have a regular but smaller community with hundreds of mail list appends per year (netdisco).

Some are purely Open Source projects, typically licensed under the GNU GPL (MRTG, RRDTool, Cacti) or BSD license (netdisco); some have free versions (again typically under GPL) with extensions that have commercial licences (Zenoss). In addition to free licences, several products offer support contracts (Zenoss, Nagios, OpenNMS).

Most are available on several versions of Linux; MRTG, RRDTool and Cacti are also available for Windows. The Dude is basically a Windows application but can run under WINE on Linux.

Most have a web-based GUI supported on Open Source browsers. OpenNMS can only display maps by using Internet Explorer.
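The round-robin storage described above for RRDTool (a database whose size is fixed at creation, with newer data overwriting the oldest) can be illustrated with a minimal Python sketch. This is only an illustration of the idea: the class and method names are invented for this example, and it does not reproduce RRDTool's file format, its consolidation of samples, or its command interface.

```python
class RoundRobinArchive:
    """Fixed-size store: once full, each new sample overwrites the oldest one."""

    def __init__(self, slots):
        if slots <= 0:
            raise ValueError("slots must be positive")
        self._values = [None] * slots   # size is fixed at creation time
        self._next = 0                  # index of the slot to write next
        self._count = 0                 # number of samples written so far

    def update(self, value):
        """Store one sample, overwriting the oldest once the archive is full."""
        self._values[self._next] = value
        self._next = (self._next + 1) % len(self._values)
        self._count += 1

    def fetch(self):
        """Return the stored samples, oldest first."""
        size = len(self._values)
        if self._count < size:
            return self._values[:self._count]
        return self._values[self._next:] + self._values[:self._next]


archive = RoundRobinArchive(slots=3)
for sample in [10, 20, 30, 40]:
    archive.update(sample)
print(archive.fetch())   # the first sample (10) has been overwritten
```

The storage never grows, which is why disk usage for this style of database is known in advance; the price is that detail older than the archive length is lost.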

4 Criteria for Open Source management tool selection


It is essential to define what is in scope and what is out of scope for a systems management project. A prioritised list of mandatory and desirable requirements is helpful.

4.1 General requirements


For the purposes of this paper, here are my selection criteria.

4.1.1 Mandatory Requirements


Open Source free software
Very active fora / mail lists
Established history of community support and regular fixes and releases
Integrated network and systems management including:
  Configuration management
  Availability management
  Problem management
  Performance management
Centralised, open database
Both Graphical User Interface (GUI) and Command Line Interface (CLI)
Easy deployment of agents
Scalability to several hundred devices
Adequate documentation

4.1.2 Desirable Requirements


Support for SNMP V3
User management to limit aspects of the tool to certain individuals
Graphical representation of network
Controllable remote access to discovered devices
Easy server installation
No requirement for proprietary web browsers
Scalability to several thousand devices
Good documentation
Availability of (chargeable) support


4.2 Defining network and systems management


The "Integrated network and systems management" requirement needs some further expansion:

4.2.1 Network management

Configuration
  Automatic, controllable discovery of network Layer 3 (IP) devices
  Topology display of discovered devices
  Support for SNMP V1, V2 and preferably, V3
  Ability to discover devices that do not support ping
  Ability to discover devices that do not support SNMP
  Central, open database to store information for these devices
  Ability to add to this information
  Ideally, ability to discover and display network Layer 2 (switch) topology

Availability monitoring
  Customisable "ping test" for all discovered devices and interfaces
  SNMP availability test for devices that do not respond to ping (eg. comparison of SNMP Interface administrative status with Interface operational status)
  Simple display of availability status of devices, preferably both tabular and graphical
  Events raised when a device fails its availability test
  Ability to monitor infrastructure of network devices (eg. CPU, memory, fan)
  Differentiation between device / interface down and network unreachable

Problem
  Events to be configurable for any discovered device
  Central events console with ability to prioritise events
  Ability to categorise events for display to specific users
  Ability to receive and format SNMP traps for SNMP V1, V2 and preferably, V3
  Customisation of actions in response to events, both manual actions and automatic responses
  Ability to correlate events to find root-cause problems (eg. failure of a router device is root cause of all interface failure events for that device)

Performance
  Regular, customisable monitoring of SNMP MIB variables, both standard and enterprise-specific, with data storage and ability to threshold values to generate events
  Ability to import any MIB
  Ability to browse any MIB on any device
  Customisable graphing of performance data
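One of the availability criteria above is an SNMP test that compares an interface's administrative status with its operational status. A minimal Python sketch of that comparison follows. The numeric codes are the standard interface status values (1 = up, 2 = down); the function name is invented for this example, and retrieving the two values from a device by SNMP is assumed to happen elsewhere.

```python
IF_UP = 1     # status value meaning "up"
IF_DOWN = 2   # status value meaning "down"


def interface_fault(admin_status, oper_status):
    """True when an interface is meant to be up but is not operationally up."""
    return admin_status == IF_UP and oper_status != IF_UP


# (interface name, administrative status, operational status); sample data only
interfaces = [
    ("eth0", IF_UP, IF_UP),       # healthy
    ("eth1", IF_UP, IF_DOWN),     # should be up but is not: raise an event
    ("eth2", IF_DOWN, IF_DOWN),   # deliberately shut down: not a fault
]
for name, admin, oper in interfaces:
    print(name, "FAULT" if interface_fault(admin, oper) else "ok")
```

The point of the comparison is that an interface an administrator has switched off is not reported as a problem. A real tool would probably treat some other operational states (for example, dormant) more leniently than this sketch does.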

4.2.2 Systems management


Many of the criteria for systems management are similar to the network management bullets above but they are repeated here for convenience.

Configuration
  Automatic, controllable discovery of Windows and Unix devices
  Topology display of discovered devices
  Support for SNMP V1, V2 and preferably, V3
  Ability to discover devices that do not support ping
  Ability to discover devices that do not support SNMP
  Central, open database to store information for these devices
  Ability to add to this information

Availability monitoring
  Customisable "ping test" for all discovered devices
  Availability test for devices that do not respond to ping (eg. comparison of SNMP Interface administrative status with Interface operational status, support for ssh tests)
  Ability to monitor customisable ports on a device (eg. tcp/80 for http servers)
  Ideally the ability to monitor "applications" (eg. ssh / snmp access to monitor for processes, wget to retrieve web pages)
  Simple display of availability status of devices, preferably both tabular and graphical
  Events raised when a device fails any availability test
  Ability to monitor basic system metrics: CPU, memory, disk space, processes, services (eg. the SNMP Host Resources MIB)

Problem
  Events to be configurable for any discovered device
  Central events console for network and systems management events with ability to prioritise events
  Ability to categorise events for display to specific users
  Ability to receive and format SNMP traps for SNMP V1, V2 and preferably, V3
  Ability to monitor Unix syslogs and Windows Event Logs and generate customisable events
  Ideally the ability to monitor any text log file and generate customisable events
  Customisation of actions in response to events, both manual actions and automatic responses
  Ability to correlate events to find root-cause problems (eg. single-point-of-failure router is root cause of availability failure for all devices in a network)

Performance
  Regular, customisable monitoring of SNMP MIB variables, both standard and enterprise-specific, with data storage and ability to threshold values to generate events
  Ability to import any MIB
  Ability to browse any MIB on any device
  Ability to gather performance data by methods other than SNMP (eg. ssh)
  Customisable graphing of performance data

4.3 What is out-of-scope?


In my environment, some things are specifically out of scope:

Software distribution
Remote configuration
Remote control of devices
High availability of management servers
Application response time

In the next few sections of this document I will explore some of the niche products briefly and then take a slightly more in-depth look at OpenNMS, Nagios and Zenoss. These sections are not intended to be a full analysis of the products, more an initial impression and a comparison of strengths and weaknesses. Subsequent documents will investigate Nagios, OpenNMS and Zenoss in more detail.


5 A quick look at Cacti, The Dude and netdisco


Cacti, The Dude and netdisco do not meet my mandatory requirements; however, they are interesting niche solutions that were investigated during the tools evaluation process. Cacti and netdisco were installed; The Dude was only researched on the Internet.

5.1 Cacti
Cacti is a niche tool for collecting, storing and displaying performance data. It is a comprehensive front end to RRDTool, including the concept of user management. Although the default method of data collection is SNMP, other data collectors, typically scripts, are possible.

Data collection is very configurable and is driven by the Cacti Poller process which is called periodically by the Operating System scheduler (cron for Unix). The default polling interval is 5 minutes.

Devices need to be manually added using the Cacti web-based GUI. Basic information such as hostname, SNMP parameters and device type should be supplied. Depending on the device type selected (eg. ucd/net SNMP Host, Cisco Router), one or more default graph templates can be associated with a device along with one or more default SNMP data queries. In addition to the web-based GUI, configuration of Cacti can be done by Command Line, using PHP which is a general-purpose scripting language especially suited for web development.

Cacti now has support for SNMP V3.

For high-performance polling, Spine (used to be cactid) can replace the base cmd.php polling engine. The user manual suggests that Spine could support polling intervals of less than 60 seconds for at least 20,000 data sources.

Cacti is supported on both Unix and Windows platforms.

Get the Cacti User Manual from http://www.cacti.net/downloads/docs/pdf/manual.pdf.

Cacti has a very active user forum with hundreds of appends per month. There is also a documented release roadmap going forward to 2nd quarter 2009.

Here are a few screenshots of Cacti to give a feel for the product.

Figure 1: Cacti main Devices panel

Figure 2: Cacti graph of interface traffic

Figure 3: Cacti graph of memory for device bino

5.2 netdisco
netdisco was created at the University of California, Santa Cruz (UCSC), Networking and Technology Services (NTS) department. It is interesting as a network management configuration offering. It uses SNMP and Cisco Discovery Protocol (CDP) to try and automatically discover devices. Unlike most other management offerings, netdisco is Layer 2 (switch) aware and can both display switch ports and optionally provide access to control switch ports.

It provides an inventory of devices that you can sort either by OS or by device model, displaying all ports for a device. It also has the ability to provide a network map. User management is included so you can restrict who is allowed to actively manage devices. There is good provision of both command line interface and web-based GUI. netdisco is supported on various platforms; it was originally developed on FreeBSD; I built it on a Centos 4 platform.


If your requirement is strictly for network configuration management and your devices respond suitably to netdisco then this might be worth a try. I found it very quirky as to what it would discover. It appears very dependent on the SNMP system sysServices variable to decide whether a device supports network layer 2 and 3 protocols; if a device did not provide sysServices or didn't indicate layer 2/3, then netdisco would not discover it. I also had very few devices supporting Cisco CDP so the automatic discovery didn't work well for me. Although there is a file where you can manually describe the topology, this would be a huge job in a sizeable network if you had to hand-craft a significant amount of the network topology.

This project is not nearly so active as some of the other offerings discussed here (around 500 appends to the users mail list in 2007) but there seems to be a steady flow.

Building the system was a fair marathon but the documentation is reasonably good.

Here are some screenshots of the main device inventory panel, plus the details of a router and the details of a switch.

Figure 4: Netdisco main device inventory display

Figure 5: Netdisco details of router device

Figure 6: Netdisco details of a switch device, including ports
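The dependence on sysServices noted above is easier to follow with the encoding in view. In the standard MIB-2 system group, sysServices is a sum in which each layer L that the device offers services at contributes 2^(L-1). The following Python sketch decodes such a value; the function names are invented for this example, and the layer 2/3 test is only my reading of the behaviour observed above, not netdisco's actual code.

```python
def layers_from_sys_services(value):
    """Return the layer numbers (1 to 7) whose bits are set in a sysServices value."""
    return [layer for layer in range(1, 8) if value & (1 << (layer - 1))]


def offers_layer_2_or_3(value):
    """True if the value advertises layer 2 (switching) or layer 3 (routing)."""
    return any(layer in (2, 3) for layer in layers_from_sys_services(value))


print(layers_from_sys_services(78))   # 78 = 2 + 4 + 8 + 64
print(offers_layer_2_or_3(72))        # 72 = 8 + 64: layers 4 and 7 only
```

A host that reports only layers 4 and 7, or an agent that does not populate sysServices at all, would fail a test of this kind, which is consistent with the discovery gaps described above.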

5.3 The Dude


I put some research into The Dude as it apparently provides auto-discovery of a network with graphical map layout, something that is hard to find done well. From the Open Source perspective though, it really doesn't qualify. It is basically a Windows application though it can apparently run under WINE on Linux. It comes from a company called MikroTik and their website says it is "free" but it is unclear what the licensing arrangement is for The Dude. It has a very active forum.

It offers more than simply discovery and configuration as it can apparently monitor links and devices for availability and graph link performance. It can also generate notifications.


6 Nagios
Nagios evolved in 2002 out of an earlier systems management project called NetSaint, which had been around since the late 1990s. It is far more a systems management product than a network management product. It is available to build on most flavours of Linux / Unix and the installation has become much easier over the years. The Nagios "Quickstart" document is reasonably comprehensive (although it misses a few prerequisites that I found necessary like gd, png, jpeg, zlib, net-snmp and their related development packages). I downloaded and built Nagios 3.0.1 on a SuSE 10.3 platform (hostname nagios3), and had it working inside half a day.

To start the Web Interface, point your browser at http://nagios3/nagios/. The Quickstart document has you create some user ids and passwords; the default logon for the Web console is nagiosadmin with the password you specified during installation.

Here is a screenshot of the Nagios Tactical Overview display.

Figure 7: Nagios Tactical Overview screen

6.1 Configuration - Discovery and topology


Nagios uses a number of files to configure discovery; out of the box it will find nothing. Samples are available, by default, in /usr/local/nagios/etc. The main configuration file is nagios.cfg which defines a large number of parameters, most of which you can leave alone at the outset.

Typically the main things to discover are hosts and services. These are defined in an object-oriented way such that you can define host and service top-level classes with particular characteristics and then define sub-classes and hosts that inherit from their parent classes. Rather than having a single, huge nagios.cfg, it can reference other files (typically in the objects subdirectory), where definitions for hosts, services and other object types can be kept. So, for example, /usr/local/nagios/etc/nagios.cfg may contain lines such as:
cfg_file=/usr/local/nagios/etc/objects/hosts.cfg
cfg_file=/usr/local/nagios/etc/objects/services.cfg
cfg_file=/usr/local/nagios/etc/objects/commands.cfg

Definitions of hosts are built up in a hierarchical manner so the top-level definitions may look like the following screenshot. Note the "use" stanza to denote inheritance of characteristics from a previous definition.

22

Figure8:Nagioshosts.cfgtopleveldefinitions

Host availability parameters are shown in the screenshot above:

check_period          24x7
check_interval        5 mins
retry_interval        1 min
max_check_attempts    10
check_command         check_host_alive (which is based on check_ping)

Figure 9: Nagios hosts.cfg showing host template definitions

Subsequent definitions of subgroups and real hosts will follow. Note the use of the parents stanza to denote the network node that provides access to the device. This means that Nagios can tell the difference between a node that is down and a node that is unreachable because its access router is down.

Figure 10: Nagios hosts.cfg file showing real host definitions
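
The inheritance described above, where a definition pulls in characteristics through its use stanza and may override them, can be illustrated with a minimal sketch. It is not how Nagios itself is implemented; the dictionary layout and the resolve function are hypothetical, and the sample values are only loosely modelled on the hosts.cfg examples in this section.

```python
"""Illustrative sketch of 'use'-style inheritance; not Nagios's own code."""

# Hypothetical definitions keyed by name. Each may name one parent via 'use'.
DEFINITIONS = {
    "generic_host": {
        "check_period": "24x7",
        "check_interval": 5,
        "retry_interval": 1,
        "max_check_attempts": 10,
        "check_command": "check_host_alive",
    },
    "host_172.31.100": {"use": "generic_host", "parents": "group-100-r3"},
    "group-100-a1": {
        "use": "host_172.31.100",
        "parents": "group-100-r2",
        "check_command": "check_ifstatus",
    },
}


def resolve(name, definitions):
    """Return the effective settings for one object, nearest definition winning."""
    chain = []
    seen = set()
    while name is not None:
        if name in seen:
            raise ValueError(f"inheritance loop at {name!r}")
        seen.add(name)
        chain.append(definitions[name])
        name = definitions[name].get("use")
    effective = {}
    # Apply the most distant ancestor first so that children override parents.
    for definition in reversed(chain):
        effective.update(definition)
    effective.pop("use", None)
    return effective


if __name__ == "__main__":
    for key, value in sorted(resolve("group-100-a1", DEFINITIONS).items()):
        print(f"{key:20} {value}")
```

The point of the sketch is only that values set nearest to the real host win, which is why check_command can be defined once at the top of the hierarchy and overridden for a single device.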

Hosts can be defined to be a member of one or more host groups. This then makes subsequent configuration more scalable (for example, a service can be applied to a host group rather than to individual hosts). Host groups are typically defined in hosts.cfg.

Figure 11: Nagios hosts.cfg host group definitions

Host groups are also used in the GUI to display data based on host groups.

Figure 12: Nagios Hostgroup summary

Whenever changes have taken place to any configuration file, the command:
/etc/init.d/nagios reload
should be used. This does not stop and start the Nagios processes (use stop | start | restart | status to control the background processes); the reload parameter simply re-reads the configuration file(s). There is also a handy command to verify that your configuration files are legal and consistent, before actually performing the reload:
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

All objects to be managed need defining in the Nagios configuration files; there is no form of automatic discovery. However, the ability to create object templates, and thus an object hierarchy, makes definitions flexible and easy once you have defined your hierarchies.

A great benefit of this configuration file is the ability to denote the network devices that provide access to specific nodes (parent/child relationship). This means that a map hierarchy can be displayed and also means that node reachability is encoded. If, for example, all nodes on the 172.31.100.32 network inherit from a template that includes a parents group-100-r3 stanza, then when group-100-r3 goes down Nagios knows that all nodes in that network are unreachable (rather than down). Defining multiple parents for a meshed network seemed problematical though.

Nagios automatically generates a topology map, based on the parents stanzas in the configuration files. Colour coding provides status for nodes.

Figure 13: Nagios Status map
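
To make the down versus unreachable distinction concrete, here is a small sketch of the reasoning that the parents stanza makes possible. It is my own illustration rather than Nagios code: the classify function, the parents mapping and the node-a host are hypothetical, and it assumes there are no circular parent definitions.

```python
def classify(host, parents, responding):
    """Return 'UP', 'DOWN' or 'UNREACHABLE' for one host.

    parents maps a host to the list of devices that provide access to it
    (missing or empty means it is attached directly to the monitoring server).
    responding is the set of hosts that currently answer their checks.
    """
    if host in responding:
        return "UP"
    host_parents = parents.get(host, [])
    if not host_parents:
        return "DOWN"
    # If any access path is still up, the fault lies with the host itself.
    if any(classify(p, parents, responding) == "UP" for p in host_parents):
        return "DOWN"
    return "UNREACHABLE"


if __name__ == "__main__":
    # Hypothetical topology: group-100-r3 sits behind group-100-r1.
    parents = {"group-100-r3": ["group-100-r1"], "node-a": ["group-100-r3"]}
    for name in ("group-100-r1", "group-100-r3", "node-a"):
        print(name, classify(name, parents, responding=set()))
```

With nothing responding, only group-100-r1 is reported as DOWN; everything behind it is UNREACHABLE, which is the behaviour described above for a network single point of failure.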

6.2 Availability monitoring


Nagios availability monitoring focuses much more on systems than on networks. Nagios provides a large number of official plugins for monitoring; in addition there are other community plugins available, or you can write your own. The official plugins should be installed alongside the base Nagios. The executables can be found in /usr/local/nagios/libexec (use <plugin name> --help for usage on each plugin). The official plugins include:

check_ping        configurable ping test with warning & critical thresholds
check_snmp        generic SNMP test to get MIB OIDs & test return values
check_ifstatus    check SNMP ifOperStatus against ifAdminStatus for all Administratively up interfaces
check_ssh         check that the ssh port can be contacted on a remote host
check_by_ssh      use ssh to run command on remote host
check_nt          check Windows parameters (disk, cpu, services, etc.). Needs NSClient++ agent installed on Windows targets
check_nrpe        check remote Linux parameters (disk, cpu, processes, etc.). Needs NRPE agent installed on Unix/Linux target

Nagios has two separate concepts, host monitoring and service monitoring, and there is a known relationship between the state of the host and the state of its services.

Host monitoring is a reachability test and will generally use the check_ping Nagios plugin. If you have devices that support SNMP but do not support ping (perhaps because there is a firewall in the way that blocks ping), then the check_ifstatus plugin works well to test all interfaces on a device, comparing the SNMP administrative status with the operational status. Host monitoring is defined in the Nagios configuration files with the check_command stanza; typically this is defined at a high level of the host definition hierarchy but can be overridden for subgroups or specific hosts. For example, in hosts.cfg:
define host {
    host_name       group-100-a1
    use             host_172.31.100                 ;Inherits from this parent class
    parents         group-100-r2                    ;This is n/w route to device
    alias           group-100-a1.class.example.org
    address         group-100-a1.class.example.org
    check_command   check_ifstatus                  ;SNMP status check, not ping
}

A summary of host status is given on the Tactical Overview display. The Host Detail display then gives further information for each device. The hosts monitored using check_ping show the Round Trip Average (RTA). Note that group-100-a1 is monitored using the check_ifstatus plugin, so shows different Status Information.

Figure 14: Nagios Host Detail display

Availability monitoring, especially for computers rather than network devices, can mean many things. Nagios provides many plugins for port monitoring, including generic TCP and UDP monitors. The check_snmp plugin could be used to check SNMP parameters from the Host Resources MIB (if a target supports this). Nagios also provides remote agents, NSClient++ for Windows and NRPE for Unix/Linux systems, which provide a much more customisable definition of system monitoring.

Services are typically defined in services.cfg. As with host definitions, services can be defined in a class hierarchy where characteristics of an object are inherited from its parent.

Figure 15: Nagios services.cfg top-level objects

Again, note the check_period, max_check_attempts, normal_check_interval and retry_check_interval stanzas. More specific service definitions can then be defined, inheriting characteristics of parents through the use stanza:

Figure 16: Nagios services.cfg showing specific services

Note that services can be applied either to groups of hosts (hostgroup_name) or to specific hosts (host_name).

As with hosts, it is possible to create groups of services to improve the flexibility of configuration and the display of services.

Also note that some services run commands that are inherently local to the Nagios system, e.g. check_local_disk. The check_dns command runs nslookup on the Nagios system but the host_name parameter can be used to specify the DNS server to query from. The commands are actually specified in the configuration file commands.cfg, which, in turn, calls executable plugins in /usr/local/nagios/libexec.

Figure 17: Nagios Service detail

Service dependencies are an advanced feature of Nagios that allow you to suppress notifications and active checks of services based on the status of one or more other services (that may be on other hosts).

Both host and service monitoring can be configured to generate events on failure (and this is the default).

6.3 Problem management


Nagios's event system displays events generated by Nagios's own host and service monitors. There is no built-in capability to collate events received as SNMP TRAPs or syslog messages. When an event is generated, it can be configured so that notification(s) are generated to one or more users or groups of users. It is also possible to create automated responses to events (typically scripts).

Note that Nagios tends to use the terms event and alert interchangeably.

6.3.1 Event console


The Nagios Event Log is displayed from the left-hand menu:

Figure 18: Nagios Event Log

By default, the event log is displayed in one-hourly sections. The log shows the event status and also shows whether a Notification has been generated (the megaphone symbol). This display is effectively simply showing /usr/local/nagios/var/nagios.log.

Under the Reporting heading on the left-hand menu, there are further options to display information on events (alerts). The Alert History is effectively the same as the Event Log. The Alert Histogram produces graphs for either a host or service with customisable parameters.

Figure 19: Nagios Configuration for Alert Histogram

Note in the figure above that a host/service selection has already been prompted for and, having selected host, the specific host has been supplied. The following figure shows the resulting graph. Note the blue links towards the top left of the display providing access to a filtered view of the events log (View History for this Host) and to notifications for this host.

Figure 20: Nagios Alert Histogram for host group-100-r1

The Alert Summary menu option can provide various reports, specific to hosts or services.

Figure 21: Nagios Alert Summary configuration options

Limiting the report to a specific host, group-100-r1, produces the following report.

Figure 22: Nagios Alert Summary for group-100-r1

6.3.2 Internally generated events


Nagios has the concept of soft errors and hard errors to allow for occasional glitches in host and service monitoring. Any host or service monitor can specify or inherit parameters for the check interval under OK conditions, the check interval under non-OK conditions and the number of check attempts that will be made.

Host parameters

check_interval           default 5 mins (check interval when host OK)
retry_interval           default 1 min (check interval when host non-OK)
max_check_attempts       default 4 (number of attempts before HARD event)

Service parameters

normal_check_interval    default 10 mins
retry_check_interval     default 2 mins
max_check_attempts       default 3 (number of attempts before HARD event)

When a non-OK status is detected, a soft error is generated for each sampling interval until max_check_attempts are exhausted, after which a hard event will be generated. At this point, the polling interval reverts to the check_interval rather than the retry_interval.

Figure 23: Nagios Event Log showing hard and soft events

Note from the earlier figure showing the topology layout that group-100-r3 sits behind group-100-r1. Each of these host devices is being polled every 5 minutes when in an OK state (or when max_check_attempts has been exceeded) and every 1 minute when a problem has arisen. The actual problem that has caused the event log shown above is that group-100-r1 has failed; however, group-100-r3 is polled first and results in the first event for this device with a status of DOWN and a state type of SOFT. Subsequently, group-100-r1 is polled and found to be DOWN, which results in the associated poll to group-100-r3 receiving a status of UNREACHABLE and a state type of SOFT. The third poll of group-100-r3 again has a status of UNREACHABLE and a state type of SOFT.

The next event for group-100-r3 is a service ping monitor (which runs every 5 minutes for this device). Note that this event has a state type of HARD; this is because Nagios knows that the host status associated with this service monitor is already UNREACHABLE (or DOWN).

The fourth event results in a state type of HARD and the status of UNREACHABLE. The hard event also generates a notification.
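
The soft/hard behaviour described in this section can be sketched as a small state machine. This is a simplified illustration, not Nagios's own algorithm: the state_types function is hypothetical, it considers one host or service in isolation, and it ignores the way a host's state influences its services.

```python
def state_types(results, max_check_attempts):
    """Label each check result in a sequence as a SOFT or HARD state.

    results is a list of status strings for a single host or service;
    anything other than 'OK' is treated as a problem.
    """
    attempt = 0  # consecutive non-OK results seen so far, capped at the maximum
    labelled = []
    for status in results:
        if status == "OK":
            # Recovering before the retries ran out counts as a SOFT recovery.
            soft_recovery = 0 < attempt < max_check_attempts
            labelled.append((status, "SOFT" if soft_recovery else "HARD"))
            attempt = 0
        else:
            attempt = min(attempt + 1, max_check_attempts)
            is_hard = attempt == max_check_attempts
            labelled.append((status, "HARD" if is_hard else "SOFT"))
    return labelled


if __name__ == "__main__":
    polls = ["OK", "DOWN", "UNREACHABLE", "UNREACHABLE", "UNREACHABLE", "OK"]
    for status, state_type in state_types(polls, max_check_attempts=4):
        print(f"{status:12} {state_type}")
```

With max_check_attempts set to 4, the first three problem results are SOFT and the fourth is HARD, matching the sequence seen for group-100-r3 in the event log above.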

6.3.3 SNMP TRAP reception and configuration


Nagios's own documentation says that it is not a replacement for a full-blown SNMP management application. It has no simple way to receive SNMP TRAPs or to parse them.

It is possible to integrate SNMP TRAPs by sending them to Nagios as passive checks but this will require significant effort. The documentation suggests using a combination of net-snmp and the SNMP TRAP Translator (SNMPTT) packages.

6.3.4 Nagios notifications


In Nagios, the terms event and alert are used interchangeably.

There is a comprehensive mechanism for notifications which is driven by parameters on the host and service checks. There is also configuration for notifications on a per-contact basis; each check can have a contact_groups stanza specifying who to contact. Contacts can appear in several different contact groups (although only a single notification will be sent to any individual). Notifications are only generated for HARD status type events, not SOFT ones.

Whether notifications are sent depends on the following parameters/characteristics (in this order):

notifications_enabled                             global on/off parameter
Each host/service can have scheduled downtime     no notifications in downtime
Each host/service can be flapping                 no notifications if flapping
Host notification_options (d,u,r)                 specifies notifications on down, unreachable, recovery events
Service notification_options (w,u,c,r)            specifies notifications on service warning, unknown, critical, recovery events
Host/service notification_period                  notifications only sent during this period (e.g. 24x7, workdays, ...)
Host/service notification_interval                if notification already sent, problem still extant and notification_interval exceeded then send another notification

Once each of these filters for notification has been tested and passed, contact filters are then applied for each contact in the group(s) indicated in the host or service contact_groups stanza. Here is the default definition:

Figure 24: Nagios Default contact definition

Notifications for hosts and services can be sent 24x7. They are sent for all types of events and use a Nagios command that drives the email system. As with all other Nagios configurations, more specific users and groups of users can be defined which change any of these parameters.

An event has to satisfy the global criteria, the specific host/service criteria and the contact criteria before a notification is actually sent.

Remember from the Alert Histogram report, it is possible to see notifications for a particular host.

Figure 25: Nagios Host Notifications
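
The ordered notification filters in section 6.3.4 can be read as a chain of early exits. The sketch below is an illustration of that reading only: the Event class and should_notify function are hypothetical names of mine, and the sketch deliberately leaves out re-notification (notification_interval) and the per-contact filters.

```python
from dataclasses import dataclass


@dataclass
class Event:
    state: str                  # one-letter option code, e.g. 'd', 'u', 'r', 'w', 'c'
    state_type: str             # 'HARD' or 'SOFT'
    in_downtime: bool = False
    flapping: bool = False
    in_notification_period: bool = True


def should_notify(event, notifications_enabled, notification_options):
    """Apply the host/service filters in order; return (decision, reason)."""
    if event.state_type != "HARD":
        return False, "SOFT state"
    if not notifications_enabled:
        return False, "notifications disabled globally"
    if event.in_downtime:
        return False, "scheduled downtime"
    if event.flapping:
        return False, "flapping"
    if event.state not in notification_options:
        return False, "state not in notification_options"
    if not event.in_notification_period:
        return False, "outside notification_period"
    return True, "passed host/service filters"


if __name__ == "__main__":
    host_down = Event(state="d", state_type="HARD")
    print(should_notify(host_down, True, {"d", "u", "r"}))
    print(should_notify(Event(state="d", state_type="SOFT"), True, {"d", "u", "r"}))
```

Returning the reason alongside the decision is a design choice for the sketch; when a notification is unexpectedly missing, knowing which filter rejected the event is usually the first thing you want to find out.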

6.3.5 Automatic responses to events event handlers


Nagios can run automatic actions (event handlers) when a service or host:

Is in a SOFT problem state
Initially goes into a HARD problem state
Initially recovers from a SOFT or HARD problem state

There is a global parameter, enable_event_handlers, which must take the value 1 (true) before any automation can take place.

There are two global parameters, global_host_event_handler and global_service_event_handler, which can be used to run commands on all host/service events. These might be used, say, to log all events to an external file.

In addition, individual hosts and services (or groups of either) can have their own event_handler directive and their own event_handler_enabled directive. Note that if the global enable_event_handlers is off then no individual host/service will run event handlers. Individual event handlers will run immediately after any global event handler.

Typically, an event handler will be a script or program, defined in the Nagios commands.cfg file, to run any external program. The following parameters will be passed to the event handler:
For Services: $SERVICESTATE$, $SERVICESTATETYPE$, $SERVICEATTEMPT$
For Hosts: $HOSTSTATE$, $HOSTSTATETYPE$, $HOSTATTEMPT$

Event handler scripts will run with the same user privilege as that which runs the nagios program.

Sample event handler scripts can be found in the contrib/eventhandlers/ subdirectory of the Nagios distribution. Here is the sample submit_check_result command:

Figure 26: Nagios Sample submit_check_result command for event handler from contrib directory
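
As an illustration of how an event handler might use the three service parameters listed above, here is a minimal sketch in Python. It is not one of the contrib scripts: the decide and main functions, the "restart" action and the choice of the third SOFT attempt as the trigger are all assumptions made for the example, and the sketch only prints its decision rather than acting on it.

```python
import sys


def decide(state, state_type, attempt, soft_attempt_trigger=3):
    """Return the action a handler might take for one service event."""
    if state != "CRITICAL":
        return "none"  # OK, WARNING and UNKNOWN need no repair action here
    if state_type == "SOFT":
        # Give the service a chance to recover by itself before intervening.
        return "restart" if int(attempt) == soft_attempt_trigger else "none"
    if state_type == "HARD":
        return "restart"
    return "none"


def main(argv):
    # Expected arguments: $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$
    if len(argv) != 3:
        return "usage: handler.py <state> <state_type> <attempt>"
    return decide(*argv)


if __name__ == "__main__":
    print(main(sys.argv[1:]))
```

A real handler would replace the printed decision with whatever corrective command is appropriate, bearing in mind that it runs with the privileges of the nagios user.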

6.4 Performance management


Nagios does not have performance data collection and reporting out of the box; however, it does provide configuration parameters such that any host check or service check may also return performance data, provided the plugin supplies such data. This data can then either be processed by a Nagios command, or the data can be written to a file to be processed asynchronously, either by a Nagios command or by some other mechanism; mrtg, RRDTool and Cacti may all be contenders for the post-processing.

There are a number of global parameters that control the collection of performance data, typically in /usr/local/nagios/etc/nagios.cfg:

process_performance_data                     global on/off switch
host_perfdata_command                        Nagios command to be executed on data
service_perfdata_command                     Nagios command to be executed on data
host_perfdata_file                           datafile for asynchronous processing
service_perfdata_file                        datafile for asynchronous processing

Note: either use the command parameter for data processing when the data is retrieved, or use the datafile for later processing.

host_perfdata_file_template                  format of datafile
host_perfdata_file_processing_interval       process datafile every <n> seconds
host_perfdata_file_processing_command        Nagios command to process data
service_perfdata_file_template               format of datafile
service_perfdata_file_processing_interval    process datafile every <n> seconds
service_perfdata_file_processing_command     Nagios command to process data

Figure 27: Nagios Performance parameters in nagios.cfg

The default is that process_performance_data=0 (i.e. off) and all the other parameters are commented out.

In addition to the global parameters, each host and service needs to either explicitly configure or inherit a definition for:

process_perf_data=1    1 = data collection on, 0 = data collection off

By default, the generic_host and generic_service template definitions set these parameters to 1 (on).

If a Nagios plugin is able to provide performance data, it is returned after the usual status information, separated by a | (pipe) symbol. It can be retrieved as the $HOSTPERFDATA$ or $SERVICEPERFDATA$ macro. It is then up to your Nagios commands to interpret and manipulate that data.

The next figure shows performance data that has been gathered into /tmp/service-perfdata using the default service_perfdata_file_template, where the last field is the $SERVICEPERFDATA$ value (if the plugin delivers performance data).

Figure 28: Nagios Performance data collected into /tmp/service-perfdata

The most recent performance data gathered for hosts and services can also be seen from the Host Detail or Service Detail menu options.

Figure 29: Nagios Performance data highlighted DNS Check service
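
Since Nagios leaves the interpretation of performance data to your own commands, a short sketch of splitting plugin output at the pipe symbol may be useful. The functions below are hypothetical helpers of mine, the sample line is illustrative rather than captured from a real system, and the parser assumes the common label=value[unit];warn;crit;min;max layout with labels that contain no spaces.

```python
import re

# A number (optionally signed, optionally with a decimal part) followed by its unit.
_NUMBER = re.compile(r"^(-?\d+(?:\.\d+)?)(.*)$")


def split_output(line):
    """Split one line of plugin output into (status text, raw performance data)."""
    status, _, perf = line.partition("|")
    return status.strip(), perf.strip()


def parse_perfdata(perf):
    """Turn 'label=value[unit];warn;crit;...' items into a dictionary."""
    metrics = {}
    for item in perf.split():
        label, sep, rest = item.partition("=")
        if not sep:
            continue  # not a label=value pair, so skip it
        fields = rest.split(";")
        match = _NUMBER.match(fields[0])
        if match is None:
            continue
        metrics[label.strip("'")] = {
            "value": float(match.group(1)),
            "unit": match.group(2),
            # Thresholds are kept as text because they may be ranges.
            "warn": fields[1] if len(fields) > 1 and fields[1] else None,
            "crit": fields[2] if len(fields) > 2 and fields[2] else None,
        }
    return metrics


if __name__ == "__main__":
    sample = ("PING OK - Packet loss = 0%, RTA = 0.80 ms"
              "|rta=0.800000ms;100.000000;500.000000;0.000000 pl=0%;20;60;0")
    status, perf = split_output(sample)
    print(status)
    for name, metric in parse_perfdata(perf).items():
        print(name, metric)
```

Output in this shape could then be handed to RRDTool or a similar tool for graphing, which is the post-processing step that Nagios itself does not provide.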

6.5 Nagios summary


Nagios is a mature systems management tool whose documentation is much better than the other open source offerings. Its strength is in checking availability of hosts and services that run on those hosts. Support for network management is less strong as there is no automatic discovery; however it is possible to configure simple network topologies and it includes the concept of a set of devices being UNREACHABLE (rather than DOWN) if there is a network single point of failure. Handling meshed networks with multiple routing paths to a network is problematical.

Since all monitoring is performed by plugins, some of which come with the product and some of which are available as community contributions, the tool is as flexible as anyone requires. There are a large number of plugins available and you can also write your own.

One of the standard plugins is check_snmp, which can be used to query any host for any SNMP MIB variable; this obviously requires the target to support SNMP and the MIB in question.

It is also possible to run checks on remote hosts by installing the NRPE agent (available for both Unix/Linux and Windows hosts) and the required Nagios plugins on the remote system. The check_nrpe plugin must also be installed on the Nagios system. This allows plugins designed to be run local to the Nagios system to be run on remote hosts. With NRPE agents, checks are run on a scheduled basis, initiated from the Nagios system.

Another alternative is to install the NSCA add-on to remote systems. This permits remote machines to run their own periodic checks and report the results back to Nagios, which can be defined as passive service checks.

The event subsystem of Nagios is less powerful and configurable than some of the other offerings; it has less focus on an event console but includes more information about host and service events from other menus. Nagios has no easy built-in way to collect and process SNMP TRAPs.

If you want lots of performance graphs then Nagios alone is not going to deliver easily.

In summary, Nagios seems good for monitoring a relatively small number of systems, provided you don't need historical performance reporting.

7 OpenNMS
OpenNMS presents itself as the first Enterprise-grade network management platform developed under the Open Source model. It is a Java application that runs under several flavours of Linux. A VMware Virtual Machine (VM) is also available with the latest release of OpenNMS, which makes initial evaluation very easy without having to go through a full build process. There is also an online demo system, which appears to be monitoring real kit and gives a good first taste of the product.

The following section is based on the VM download, which is OpenNMS 1.5.93 based on Mandriva; it worked very easily. The VM was set up for DHCP but I modified the Operating System files to use a local fixed address, with the VM network bridged to my local environment.

To access the OpenNMS Web Console, point your browser at http://opennms:8980/opennms/. The default logon id is admin with a password of admin.

Here is a screenshot of the main default window of OpenNMS.

Figure 30: Main default window for OpenNMS

The following sections will describe how to configure different aspects of OpenNMS by editing xml configuration files. It is possible to configure many aspects of OpenNMS using GUI-driven menus. See section 7.5 Managing OpenNMS for a brief description.

7.1 Configuration Discovery and topology


7.1.1 Interface discovery
OpenNMS uses a straightforward file for interface discovery; by default this is /opt/opennms/etc/discovery-configuration.xml. It comes with some commented-out defaults, so by default it discovers nothing! This file needs modifying to specify include-ranges and exclude-ranges to ping; specific IP addresses for discovery can also be configured. The first stanza specifies the characteristics of the ping discovery mechanism. If there is a response within the timeout, a "newsuspect" event is generated.
<discovery-configuration threads="1" packets-per-second="1"
        initial-sleep-time="300000" restart-sleep-time="86400000"
        retries="3" timeout="800">
    <include-range retries="2" timeout="3000">
        <begin>10.0.0.1</begin>
        <end>10.0.0.254</end>
    </include-range>
    <include-range>
        <begin>172.30.100.1</begin>
        <end>172.30.100.10</end>
    </include-range>
    <specific>10.191.101.1</specific>
</discovery-configuration>

In the above example, ping discovery will start 300,000 ms (5 minutes) after OpenNMS has started up; the discovery process will be restarted every 86,400,000 ms (24 hours); 1 ping will be sent per second; the timeout for a ping will be 800 ms and there will be 3 ping retries before the discovery process gives up on an address. All devices on the Class C 10.0.0.0 network will be polled (with only 2 retries but a 3 second timeout). The 10 devices 172.30.100.1 through 10 will be polled for with the default characteristics. The specific node 10.191.101.1 will be polled.

All that the discover process does is to generate newsuspect events that are then used by other OpenNMS processes. If the device does not respond to this ping polling then it will not be added to the OpenNMS database.

Another way to generate such events (say for a box that does not respond to ping) is to use a provided Perl script:
/opt/opennms/bin/send-event.pl --interface <ipaddr> uei.opennms.org/internal/discovery/newSuspect
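
The arithmetic in the walkthrough above (milliseconds to minutes and hours, range sizes, and per-range overrides of retries and timeout) can be checked with a short script. This is a sketch under assumptions: the summarise function is a hypothetical helper, the embedded XML is a tidied copy of the example rather than a shipped file, and it assumes the file carries no XML namespace.

```python
import ipaddress
import xml.etree.ElementTree as ET

SAMPLE = """<discovery-configuration threads="1" packets-per-second="1"
    initial-sleep-time="300000" restart-sleep-time="86400000"
    retries="3" timeout="800">
  <include-range retries="2" timeout="3000">
    <begin>10.0.0.1</begin><end>10.0.0.254</end>
  </include-range>
  <include-range>
    <begin>172.30.100.1</begin><end>172.30.100.10</end>
  </include-range>
  <specific>10.191.101.1</specific>
</discovery-configuration>"""


def summarise(xml_text):
    """Report start-up delay, restart period and the effective ping settings."""
    root = ET.fromstring(xml_text)
    default_retries = int(root.get("retries"))
    default_timeout = int(root.get("timeout"))
    targets = []
    for rng in root.findall("include-range"):
        begin = ipaddress.ip_address(rng.findtext("begin").strip())
        end = ipaddress.ip_address(rng.findtext("end").strip())
        targets.append({
            "range": f"{begin}-{end}",
            "addresses": int(end) - int(begin) + 1,
            # Attributes on the range override the global defaults.
            "retries": int(rng.get("retries", default_retries)),
            "timeout_ms": int(rng.get("timeout", default_timeout)),
        })
    for spec in root.findall("specific"):
        targets.append({
            "range": spec.text.strip(),
            "addresses": 1,
            "retries": int(spec.get("retries", default_retries)),
            "timeout_ms": int(spec.get("timeout", default_timeout)),
        })
    return {
        "initial_sleep_minutes": int(root.get("initial-sleep-time")) / 60000,
        "restart_sleep_hours": int(root.get("restart-sleep-time")) / 3600000,
        "targets": targets,
    }


if __name__ == "__main__":
    summary = summarise(SAMPLE)
    print(summary["initial_sleep_minutes"], "minutes before first discovery")
    print(summary["restart_sleep_hours"], "hours between discovery runs")
    for target in summary["targets"]:
        print(target)
```

Running a summary like this against your own discovery file is a quick way to confirm how many addresses will be pinged before restarting OpenNMS.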

7.1.2 Service discovery


When a newsuspect event has been generated by the discovery process, it is the capabilities daemon, capsd, that takes over and discovers services on a system. capsd is configured using /opt/opennms/etc/capsd-configuration.xml. Thus, discovery in OpenNMS consists of two parts: discovering an IP address to monitor (the discover process) and then discovering the services supported by that IP address (the capsd process).

The basic monitored element is called an "interface", and an interface is uniquely identified by an IP address. Services are mapped to interfaces, and if a number of interfaces are discovered to be on the same device (either via SNMP or SMB) then they may be grouped together as a "node".

capsd uses a number of plugins supplied with OpenNMS to discover services. Each service has a <protocol-plugin> stanza in capsd-configuration.xml. For example:
<protocol-plugin protocol="SSH" class-name="org.opennms.netmgt.capsd.TcpPlugin"
        scan="on" user-defined="false">
    <property key="banner" value="SSH"/>
    <property key="port" value="22"/>
    <property key="timeout" value="3000"/>
    <property key="retry" value="1"/>
</protocol-plugin>

This defines a service (protocol) called SSH that tests TCP port 22 using the TCP plugin. It will look for the string SSH to be returned. Timeout is 3 seconds with 1 retry.

The first protocol entry in capsd-configuration.xml is for ICMP.
<protocol-plugin protocol="ICMP" class-name="org.opennms.netmgt.capsd.IcmpPlugin"
        scan="on" user-defined="false">
    <property key="timeout" value="2000"/>
    <property key="retry" value="1"/>
</protocol-plugin>

It is possible to apply protocols to specific address ranges or exclude protocols from address ranges (the default is inclusion).
<protocol-plugin protocol="ICMP" class-name="org.opennms.netmgt.capsd.IcmpPlugin"
        scan="on" user-defined="false">
    <protocol-configuration scan="off" user-defined="false">
        <range begin="172.31.100.1" end="172.31.100.15"/>
        <property key="timeout" value="4000"/>
        <property key="retry" value="3"/>
    </protocol-configuration>
</protocol-plugin>

Note the scan="off" for IP addresses 172.31.100.1-15.

The SNMP protocol is special in that, if supported, it provides a way to collect performance data as well as poll for availability management information. SNMP parameters for different devices and ranges of devices are specified in /opt/opennms/etc/snmp-config.xml. Here is a sample:
<snmp-config retry="3" timeout="800" version="v1" port="161"
        read-community="public" write-community="private">
    <definition version="v2c">
        <specific>10.0.0.121</specific>
    </definition>
    <definition retry="2" timeout="1000">
        <range begin="172.31.100.1" end="172.31.100.254"/>
    </definition>
    <definition read-community="fraclmye" write-community="rrwatr">
        <range begin="10.0.0.1" end="10.0.0.254"/>
    </definition>
</snmp-config>

The first stanza in snmp-config.xml provides global default parameters for SNMP access. Variations in any of these global parameters can be made using a definition stanza and either a range or a specific statement. This file is used both for discovery and for collecting performance data.
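
The way definition stanzas vary the global defaults can be pictured as a lookup per IP address. The sketch below is illustrative only: effective_snmp is a hypothetical function, the data is transcribed from the sample above, and it assumes that the first matching definition wins when an address (such as 10.0.0.121) falls into more than one definition. I have not confirmed how OpenNMS itself resolves such overlaps.

```python
import ipaddress

# Global defaults, as in the opening <snmp-config> stanza of the sample.
DEFAULTS = {
    "retry": 3,
    "timeout": 800,
    "version": "v1",
    "port": 161,
    "read-community": "public",
    "write-community": "private",
}

# (first address, last address, overrides); a <specific> is a one-address range.
DEFINITIONS = [
    ("10.0.0.121", "10.0.0.121", {"version": "v2c"}),
    ("172.31.100.1", "172.31.100.254", {"retry": 2, "timeout": 1000}),
    ("10.0.0.1", "10.0.0.254",
     {"read-community": "fraclmye", "write-community": "rrwatr"}),
]


def effective_snmp(address):
    """Return the SNMP parameters that would apply to one address."""
    ip = ipaddress.ip_address(address)
    params = dict(DEFAULTS)
    for first, last, overrides in DEFINITIONS:
        if ipaddress.ip_address(first) <= ip <= ipaddress.ip_address(last):
            params.update(overrides)
            break  # ASSUMPTION: the first matching definition wins
    return params


if __name__ == "__main__":
    for addr in ("10.0.0.121", "172.31.100.7", "10.0.0.5", "192.168.1.1"):
        print(addr, effective_snmp(addr))
```

If overlapping definitions matter in your environment, it is worth testing against a real device rather than relying on the ordering assumption made here.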

When testing SNMP, capsd makes an attempt to receive the sysObjectID MIB-2 variable (.1.3.6.1.2.1.1.2.0). If successful, then extra discovery processing takes place.

First, three threads are generated to collect the data from the SNMP MIB-2 system tree and the ipAddrTable and ifTable tables. If, for some reason, the ipAddrTable or ifTable are unavailable, the process stops (but the SNMP system data may show up on the node page).

Second, all of the IP addresses in the ipAddrTable are run through the capsd capabilities scan. Note that this is regardless of how management is configured in the configuration file. This only happens on the initial scan and on forced rescans. On normal rescans (by default, every 24 hours), IP addresses that are "unmanaged" in capsd are not polled.

Third, every IP address in the ipAddrTable that supports SNMP is tested to see if it maps to a valid ifIndex in the ifTable. If this is true, the IP address is marked as a secondary SNMP interface and is a contender for becoming the primary SNMP interface.

Figure 31: OpenNMS node detail for a switch showing switch ports

The first stanza in capsd-configuration.xml defines service polling parameters:
<capsd-configuration rescan-frequency="86400000"
        initial-sleep-time="300000"
        management-policy="managed"
        max-suspect-thread-pool-size="6"
        max-rescan-thread-pool-size="3"
        abort-protocol-scans-if-no-route="false">

This defines that capsd will wait 5 minutes after OpenNMS starts before starting the capsd discovery process. It will rescan to discover services every 24 hours. The default management policy for all IP addresses found in newsuspect events will be to scan for each of the services. This managed parameter can be overridden at the end of capsd-configuration.xml by unmanaged range stanzas:
<ip-management policy="unmanaged">
    <specific>0.0.0.0</specific>
    <range begin="127.0.0.0" end="127.255.255.255"/>
</ip-management>

When a newsuspect event is generated, provided the IP address is in a managed management-policy range, the IP address is checked for each of the services in capsd-configuration.xml, starting from the top.

If the device does not respond to any configured service then, even if triggered with send-event.pl, it will not be added to the OpenNMS database. Look in /opt/opennms/logs/daemon/discovery.log for debugging information.

7.1.3 Topology mapping and displays


OpenNMS does not use a topology mapping function in the core code (indeed, some of its proponents are vociferous that you do not need a mapping ability). There is a mapping capability if you use an Internet Explorer web browser with a specific Adobe Scalable Vector Graphics (SVG) plugin; this is only supported in IE and did not work for me. There is also a maps-on-firefox code branch but performance is said to be poor and the mail lists suggest that neither mapping capability is heavily used.

A Node List is available from the main menu where each node name is a link to a detailed node page.

Figure 32: OpenNMS Node List of discovered nodes

Figure 33: OpenNMS node detail for group-100-r1

Note the services that have been discovered for the node. The list of services per interface are those that have been actually detected; whether they are Monitored or not will be discussed in the next section.

7.2 Availability monitoring


OpenNMS performs availability monitoring by polling devices with processes known as monitors, which connect to a device and perform a simple test. Polling only happens to an interface that has already been discovered by capsd.

The configuration file for polling is /opt/opennms/etc/poller-configuration.xml. There are many similarities between this and capsd-configuration.xml; however the monitors are defined with monitor service stanzas (rather than protocol stanzas), which define the Java class to use for monitoring.
<monitor service="DominoIIOP" class-name="org.opennms.netmgt.poller.DominoIIOPMonitor"/>
<monitor service="ICMP" class-name="org.opennms.netmgt.poller.IcmpMonitor"/>
<monitor service="Citrix" class-name="org.opennms.netmgt.poller.CitrixMonitor"/>
<monitor service="LDAP" class-name="org.opennms.netmgt.poller.LdapMonitor"/>
<monitor service="HTTP" class-name="org.opennms.netmgt.poller.HttpMonitor"/>
<monitor service="HTTP-8080" class-name="org.opennms.netmgt.poller.HttpMonitor"/>
<monitor service="HTTP-8000" class-name="org.opennms.netmgt.poller.HttpMonitor"/>
<monitor service="HTTPS" class-name="org.opennms.netmgt.poller.HttpsMonitor"/>
<monitor service="SMTP" class-name="org.opennms.netmgt.poller.SmtpMonitor"/>
<monitor service="DHCP" class-name="org.opennms.netmgt.poller.DhcpMonitor"/>
<monitor service="DNS" class-name="org.opennms.netmgt.poller.DnsMonitor"/>
<monitor service="FTP" class-name="org.opennms.netmgt.poller.FtpMonitor"/>
<monitor service="SNMP" class-name="org.opennms.netmgt.poller.SnmpMonitor"/>
<monitor service="Oracle" class-name="org.opennms.netmgt.poller.TcpMonitor"/>
<monitor service="Postgres" class-name="org.opennms.netmgt.poller.TcpMonitor"/>
<monitor service="MySQL" class-name="org.opennms.netmgt.poller.TcpMonitor"/>
<monitor service="Sybase" class-name="org.opennms.netmgt.poller.TcpMonitor"/>
<monitor service="Informix" class-name="org.opennms.netmgt.poller.TcpMonitor"/>
<monitor service="SQLServer" class-name="org.opennms.netmgt.poller.TcpMonitor"/>
<monitor service="SSH" class-name="org.opennms.netmgt.poller.TcpMonitor"/>
<monitor service="IMAP" class-name="org.opennms.netmgt.poller.ImapMonitor"/>
<monitor service="POP3" class-name="org.opennms.netmgt.poller.Pop3Monitor"/>
<monitor service="NSClient" class-name="org.opennms.netmgt.poller.NsclientMonitor"/>
<monitor service="NSClientpp" class-name="org.opennms.netmgt.poller.NsclientMonitor"/>
<monitor service="Windows-Task-Scheduler" class-name="org.opennms.netmgt.poller.Win32ServiceMonitor"/>

Preceding the monitor service stanzas in poller-configuration.xml are the definitions of services. These look very similar to the entries in capsd-configuration.xml (which makes sense as these are the regular polling definitions for the same services that capsd has already found); however parameters in the poller file may well take different values (for example, the discovery service may be allowed longer timeouts and more retries than the polling service).
<service name="ICMP" interval="300000" user-defined="false" status="on">
    <parameter key="retry" value="2"/>
    <parameter key="timeout" value="3000"/>
</service>
<service name="SNMP" interval="300000" user-defined="false" status="off">
    <parameter key="retry" value="2"/>
    <parameter key="timeout" value="3000"/>
    <parameter key="port" value="161"/>
    <parameter key="oid" value=".1.3.6.1.2.1.1.2.0"/>
</service>

Note that the default poller-configuration.xml has the SNMP monitor service turned off.

Services may be defined several times with different parameters; each service will obviously require a unique name. This is so that different devices can receive availability monitoring with different characteristics.

For availability polling, devices are grouped together in packages, where a package defines:

target interfaces
services, including the polling frequency
a downtime model (which controls how the poller will dynamically adjust its polling on services that are down)
an outage calendar that schedules times when the poller is not to poll (i.e. scheduled downtime).

There are two packages defined in the default poller-configuration.xml file: example1 and a separate package, strafer, to monitor StrafePing. A package definition must include a single filter stanza; it may also have specific, include-range and exclude-range stanzas. Here is the start of the default, as shipped:
<package name="example1">
    <filter>IPADDR != '0.0.0.0'</filter>
    <include-range begin="1.1.1.1" end="254.254.254.254"/>

It is then followed by the list of services pertinent to that package; example1 includes many of the services, with each service set to status="on" except SNMP.

The opening stanza in poller-configuration.xml controls the overall behaviour of polling:
<poller-configuration threads="30" serviceUnresponsiveEnabled="false"
        nextOutageId="SELECT nextval('outageNxtId')" xmlrpc="false">
    <node-outage status="on" pollAllIfNoCriticalServiceDefined="true">
        <critical-service name="ICMP"/>
    </node-outage>

30 threads are available for polling. The basic event that is generated when a poll fails is called "NodeLostService". If more than one service is lost, multiple NodeLostService events will be generated. If all the services on an interface are down, instead of a NodeLostService event, an "InterfaceDown" event will be generated. If all the interfaces on a node are down, the node itself can be considered down, and this section of the configuration file controls the poller behaviour should that occur. If a "NodeDown" event occurs and node-outage status=on then all of the InterfaceDown and NodeLostService events will be suppressed and only a NodeDown event will be generated. Instead of attempting to poll all the services on the down node, the poller will attempt to poll only the critical service. Once the critical service returns, the poller will then resume polling the other services.

Note in the following screenshot that six services have been discovered on the 10.0.0.95 interface of the node called deodar.skills-1st.co.uk, of which four are monitored. The two interfaces on the 172.16 network have been detected through SNMP queries but there is no monitoring of any services on these networks. There are no current issues with deodar and availability has been 100% over the last 24 hours.

Figure 34: OpenNMS node detail with monitored services
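
The node-outage rules described above (NodeLostService, InterfaceDown and NodeDown) can be sketched in a few lines of Python. This is only an illustration of the suppression logic as the text describes it, not OpenNMS code: the function name, the data layout and the behaviour when node-outage is switched off are assumptions made for the example.

```python
def events_for_poll(node_services, node_outage_on=True):
    """Return the events one poll cycle would raise for a node.

    node_services maps an interface IP to a dict of service name -> True (up)
    or False (down). Names and data layout are hypothetical.
    """
    per_interface = []
    interfaces_down = 0
    for ip, services in sorted(node_services.items()):
        lost = sorted(name for name, up in services.items() if not up)
        if services and len(lost) == len(services):
            # Every service on this interface failed: one InterfaceDown event.
            interfaces_down += 1
            per_interface.append(("interfaceDown", ip, None))
        else:
            # Otherwise one NodeLostService event per failed service.
            per_interface.extend(("nodeLostService", ip, name) for name in lost)

    all_down = bool(node_services) and interfaces_down == len(node_services)
    if node_outage_on and all_down:
        # node-outage status="on": the lesser events are suppressed.
        return [("nodeDown", None, None)]
    # Assumption: with node-outage off, the per-interface events are reported.
    return per_interface


print(events_for_poll({"10.0.0.95": {"ICMP": True, "SSH": False}}))
print(events_for_poll({"10.0.0.95": {"ICMP": False, "SSH": False}}))
```

The second call reports a single nodeDown event because the only interface has lost all of its services.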

OpenNMS includes a standard set of Availability reports. They can be selected from the Reports menu:

Figure 35: OpenNMS Availability reports menu

Here is a sample:

Figure 36: OpenNMS Overall service availability report

Note that there is an /opt/opennms/etc/examples directory with extra samples of all the OpenNMS configuration files.

Also note that OpenNMS needs recycling if any configuration files have been modified. Use:

/etc/init.d/opennms stop
/etc/init.d/opennms start

7.3 Problem management


For problem management, OpenNMS has the concepts of:

Events           all sorts of both good and bad news
Alarms           important events
Notifications    typically email or pager but could be other methods

The events subsystem is driven by the eventd process which listens on port 5817. Out-of-the-box, eventd receives internal events from OpenNMS (such as newSuspect events) and SNMP TRAPs. It is possible to also configure for other event sources (such as from syslogs).

7.3.1 Event console


Events can be viewed from the web GUI by selecting the Events option.

Figure 37: OpenNMS Events menu

The Advanced Search option provides several ways to filter events. By default Outstanding events are displayed (ie. events that have not been Acknowledged).

Figure 38: OpenNMS Advanced Event Search options

Note that if you wish to search on severity, you have to specify an exact severity; you cannot specify "severity greater than ...".

Figure 39: OpenNMS display of All events

The column headers can be clicked on to use as sort keys (ascending / descending). The Ack box can be ticked to Acknowledge one or more events; they will then disappear from this display which only shows Outstanding events. Click on the symbol beside Event(s) outstanding to see Event(s) Acknowledged, including the name of the user that acknowledged the event.

The various [+] and [-] links can be used to filter in / out on the parameter (such as node, interface, or service). The [<] and [>] beside the Time can be used to filter for events before or after this time.

To see the event detail, click on the ID link.

Figure 40: OpenNMS Event detail for event 139192

7.3.2 Internally generated events


Events (and indeed alarms) are configured in /opt/opennms/etc/eventconf.xml, where the first match for an event defines its characteristics. For this reason, the ordering of stanzas in eventconf.xml is very important. Any individual event is identified by a Universal Event Identifier (uei).

Events are bracketed by <event> </event> tags. Within the event definition, the following tags can also be used:

uei              a label to uniquely identify the event
event-label      a text label for the event used in the Web GUI
descr            description of the event
logmsg           summary of the event where the dest parameter is one of:
                   logndisplay     log to events database and display in web GUI
                   logonly         log to database but don't display in web GUI
                   suppress        don't log to database or web GUI
                   donotpersist    don't log or display but do pass to other daemons (eg. for notification)
                   discardtraps    trapd to discard TRAPs, no processing whatsoever
severity         severity of the event
alarm-data       create an alarm for this event with:
                   reduction-key   fields to compare to determine duplicate event
                   alarm-type      1=problem, 2=resolution. alarm-type=2 also takes a clear-key parameter defining the problem event this resolves
                   auto-clean      true or false
operinstruct     optional instructions for operators using the web GUI
mouseovertext    text to display when mouse positioned over this event
autoaction       absolute pathname to executable program executed every event instance
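
Because the first matching definition wins, a later stanza for the same event is never reached. The following Python sketch illustrates that ordering rule only; the list-of-dicts layout and the lookup function are hypothetical stand-ins for the parsed eventconf.xml, and the severities are arbitrary example values.

```python
def lookup_event(definitions, uei):
    """Return the first definition whose uei matches, or None.

    definitions is an ordered list of dicts standing in for the <event>
    stanzas of eventconf.xml, in file order (hypothetical layout).
    """
    for definition in definitions:
        if definition["uei"] == uei:
            return definition
    return None


# Two stanzas for the same uei: only the first is ever used.
definitions = [
    {"uei": "uei.opennms.org/nodes/nodeLostService", "severity": "Minor"},
    {"uei": "uei.opennms.org/nodes/nodeLostService", "severity": "Critical"},
]
match = lookup_event(definitions, "uei.opennms.org/nodes/nodeLostService")
print(match["severity"])
```

Swapping the two entries changes the result, which is why site-specific definitions need to sit above the shipped ones they are meant to override.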

Many of the tags can use data substituted from the event. These are documented on the OpenNMS wiki:

Figure 41: OpenNMS event parameters that can be substituted

Here is an example event from the default eventconf.xml:

Figure 42: OpenNMS event definition for nodeLostService

The different severities available can be seen by selecting the Severity Legend option from the top of an events list.

Figure 43: OpenNMS event severity legend

Note that there is no separate file to configure alarms; it is simply done with the <alarm-data> tag in eventconf.xml.

OpenNMS comes with a huge number of events predefined. To make eventconf.xml much more manageable, inclusion files can be specified at the end, such as:

<event-file>events/NetSNMP.events.xml</event-file>

The events subdirectory currently has around 100 files in it! For performance reasons, it makes sense to edit eventconf.xml and remove any <event-file> stanzas that are not relevant for your organisation.

Also note that the whole OpenNMS system must be recycled in order for changes to eventconf.xml to take effect!

7.3.3 SNMP TRAP reception and configuration


OpenNMS will automatically monitor the SNMP TRAP port (UDP/162) with the trapd process. The /opt/opennms/etc/events directory contains around 100 files which specify SNMP TRAP translations into OpenNMS events. If a TRAP is sent to OpenNMS that it has no configuration for, then it will use a default mapping found in default.events.xml.

Figure 44: OpenNMS Unknown trap appears in the Events list

Clicking on the event ID gives the detail of the event which shows all the information that arrived with the TRAP.

Figure 45: OpenNMS Event detail for an unformatted TRAP

TRAPs are configured in eventconf.xml (or an include file), using the <mask> tag. This tag specifies mask elements with name / value pairs that must match data delivered by the TRAP, in order for this particular event configuration to match.

Figure 46: OpenNMS Definition in default.events.xml for an unknown specific trap

This example event will match any TRAP whose generic field is equal to 6. Note, as with other configurations in eventconf.xml, that this definition will only match the incoming TRAP if no previous definition higher in the file (or include files) had already matched it.

The mask element name tag must be one (or more) of the following:


uei
source
host
snmphost
nodeid
interface
service
id (OID)
specific
generic

It is possible to use the "%" symbol to indicate a wildcard in the mask values.

SNMP TRAPs often have additional data with them, known as varbinds. This data can be accessed using the <parm> element, where:

Each parameter consists of a name and a value.

%parm[all]%: Will return a space-separated list of all parameter values in the form parmName1="parmValue1" parmName2="parmValue2" etc.
%parm[values-all]%: Will return a space-separated list of all parameter values associated with the event.
%parm[names-all]%: Will return a space-separated list of all parameter names associated with the event.
%parm[<name>]%: Will return the value of the parameter named <name> if it exists.
%parm[##]%: Will return the total number of parameters.
%parm[#<num>]%: Will return the value of parameter number <num>.
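
To make the substitution forms above concrete, here is a small Python sketch that expands them in a message template. It is an illustration only and not the OpenNMS implementation: the function name is hypothetical, parameter numbering is assumed to start at 1, and an unknown name or number is assumed to expand to an empty string.

```python
import re


def expand_parms(template, parms):
    """Expand %parm[...]% tokens in template.

    parms is the ordered list of (name, value) string pairs carried by the
    event. Numbering from 1 and the empty-string fallback are assumptions.
    """
    by_name = dict(parms)

    def replace(match):
        key = match.group(1)
        if key == "all":
            return " ".join(f'{name}="{value}"' for name, value in parms)
        if key == "values-all":
            return " ".join(value for _, value in parms)
        if key == "names-all":
            return " ".join(name for name, _ in parms)
        if key == "##":
            return str(len(parms))
        if key.startswith("#") and key[1:].isdigit():
            index = int(key[1:]) - 1
            return parms[index][1] if 0 <= index < len(parms) else ""
        return by_name.get(key, "")

    return re.sub(r"%parm\[([^\]]+)\]%", replace, template)


parms = [("ifIndex", "2"), ("ifDescr", "eth0")]
print(expand_parms("%parm[##]% parms: %parm[all]%", parms))
print(expand_parms("Interface %parm[#2]% (index %parm[ifIndex]%)", parms))
```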

Any of this data can be used in the message or description fields.

In addition, the varbind data can also be used to filter the event within the <mask> tags, following the <maskelement> tags. It is possible to match more than one varbind, and more than one value per varbind. For example:

<varbind>
    <vbnumber>3</vbnumber>
    <vbvalue>2</vbvalue>
    <vbvalue>3</vbvalue>
</varbind>
<varbind>
    <vbnumber>4</vbnumber>
    <vbvalue>2</vbvalue>
    <vbvalue>3</vbvalue>
</varbind>

The above code snippet will match if the third parameter has a value of "2" or "3" and the fourth parameter has a value of "2" or "3". It is also possible to use regular expressions when matching varbind values.

Again, note that the order in which events are listed is very important. Put the most specific events first.

Here is an example definition that includes matching a varbind with a regular expression. Note the <vbvalue> matches any string that contains either Bad or bad. Extra stanzas have also been added for <operinstruct> help (which provides a web link on one line and plain text on the second), a <mouseovertext> tag (which doesn't appear to work) and a tag to run an automatic action (a shell script) whenever this event occurs.

Figure 47: OpenNMS Configuration of specific TRAP with varbind matching a regular expression
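
The mask and varbind matching rules described in this section can be sketched as follows. This Python fragment is a simplified model written for this explanation: the dictionary layout and function names are hypothetical, each mask element carries a single value with "%" as a wildcard, and varbind values are compared exactly (the regular-expression form is not modelled).

```python
import re


def mask_value_matches(pattern, actual):
    """Compare a mask value against TRAP data, treating "%" as a wildcard."""
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    return re.fullmatch(regex, actual) is not None


def trap_matches(definition, trap):
    """True if every mask element and every varbind constraint is satisfied."""
    for name, pattern in definition.get("mask", {}).items():
        if not mask_value_matches(pattern, trap.get(name, "")):
            return False
    values = trap.get("varbinds", [])
    for number, allowed in definition.get("varbinds", {}).items():
        # vbnumber counts from 1; any one of the listed vbvalues may match.
        if number > len(values) or values[number - 1] not in allowed:
            return False
    return True


# Mirrors the snippet above: varbinds 3 and 4 must each be "2" or "3".
definition = {
    "mask": {"generic": "6", "id": ".1.3.6.1.4.1.%"},
    "varbinds": {3: ["2", "3"], 4: ["2", "3"]},
}
trap = {"generic": "6", "id": ".1.3.6.1.4.1.123.4", "varbinds": ["x", "y", "2", "3"]}
print(trap_matches(definition, trap))
```

A list of such definitions would be scanned in file order and the first one for which trap_matches returns True would be used, which is why the most specific definitions belong at the top.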

If you have SNMP TRAP definitions in a MIB file, the open source utility mib2opennms can be obtained to convert SNMP V1 TRAPs and SNMP V2 NOTIFICATIONs into an OpenNMS event configuration XML file. For a source file vcs.mib in /home/jane, use:

mib2opennms -f /opt/opennms/etc/events/vcs.events.xml -m /home/jane vcs.mib

7.3.4 Alarms, notifications and automations


In OpenNMS you can add an <alarm-data> tag to an event configuration to create an alarm. Alarms are defined as Important Events and have a separate display. It is similar to the Events display in that you can select All Alarms or you can specify a search to filter for particular alarms.

Figure 48: OpenNMS Alarms display

Alarms are defined as part of an event definition in eventconf.xml and its include files. It uses the <alarm-data> tag where:

reduction-key    fields to compare to determine duplicate event
alarm-type       1=problem, 2=resolution. alarm-type=2 also takes a clear-key parameter defining the problem event this resolves
auto-clean       true or false. True ensures that all events other than the latest one, that match the reduction-key, are removed (very useful for clearing out duplicate events)

One of the key characteristics of an alarm that differentiates it from an event is the reduction-key field, which should ensure that duplicate events are treated as one event with multiple instances, rather than as multiple events.

Most of the information provided with an event is also available in the Alarm display. The new field is Count which shows the number of duplicate events that have been integrated into this alarm. To see the individual events, click on the number in the Count column.
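
The effect of the reduction-key can be shown with a short Python sketch that folds duplicate events into one alarm and keeps a count. The field names, the ":" separator and the dictionary layout are assumptions chosen for the example, not the OpenNMS schema.

```python
def reduce_events(events, key_fields):
    """Fold events that share the same reduction key into a single alarm.

    key_fields names the event fields that make up the key (hypothetical
    field names); the returned dict maps each key to its alarm record.
    """
    alarms = {}
    for event in events:
        key = ":".join(str(event[field]) for field in key_fields)
        alarm = alarms.get(key)
        if alarm is None:
            alarms[key] = {"reduction_key": key, "count": 1, "last_event": event}
        else:
            # A duplicate: bump the Count column and remember the latest event.
            alarm["count"] += 1
            alarm["last_event"] = event
    return alarms


events = [
    {"uei": "nodeLostService", "nodeid": 21, "service": "SSH", "id": 1},
    {"uei": "nodeLostService", "nodeid": 21, "service": "SSH", "id": 2},
    {"uei": "nodeLostService", "nodeid": 21, "service": "HTTP", "id": 3},
]
alarms = reduce_events(events, ("uei", "nodeid", "service"))
for alarm in alarms.values():
    print(alarm["reduction_key"], alarm["count"])
```

Three events produce two alarms here; the SSH alarm has a count of 2 because both SSH events share the same key.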

70

At present (July 10th, 2008), acknowledging events has no effect on related alarms, and vice versa. Note that the concepts of Acknowledging and Clearing are completely different. An operator can acknowledge an event or an alarm, and then owns it. This does not clear the event (ie. remove it entirely from the events database).

Automatic actions can be configured for an event using the <autoaction> tag but this can only run an executable and it runs on every occurrence of the event (which may not be what you want!).

OpenNMS's concept of automation, however, is triggered from alarms rather than events. Automation is the concept of actions being performed on a scheduled basis, provided the correct triggers exist. An <automation> tag includes:

name            the name of the automation
interval        the frequency in milliseconds at which the automation runs
trigger-name    a string that references a trigger definition
action-name     a string that references an action definition

The triggers and actions are SQL statements that operate on the events database. Automation is defined in /opt/opennms/etc/vacuumd.xml where there are a number of useful rules, by default:

Figure 49: OpenNMS Default definitions for automations in vacuumd.xml

Note that automations always require an action-name but do not necessarily need a trigger-name.

The cosmicClear automation is the means by which an <alarm-data> alarm-type=2 tag in eventconf.xml can clear bad news events when good news events arrive. Here is the definition of the selectResolvers trigger-name:

Figure 50: OpenNMS Definition of selectResolvers trigger in vacuumd.xml

...and the clearProblems action:

Figure 51: OpenNMS Definition of clearProblems action in vacuumd.xml

The trigger is keyed on the field alarmType=2. Note that the first version of the action is commented out; the clear-uei element is now deprecated in the <alarm-data> tag and only the clear-key element on the good news event is used to match against the reduction-key element of the bad news event, setting the severity to 2 (ie. Cleared). Also note from the <automation> tag that cosmicClear will run every 30 seconds.

If users need to be notified of an event then OpenNMS provides email and pager notifications out-of-the-box, run by the notifd daemon. It is also possible to create other notification methods such as SNMP TRAPs or an arbitrary external program. There are several related configuration files in /opt/opennms/etc:

destinationPaths.xml                 who, when, how to notify / escalate
notifd-configuration.xml             global parameters for notifd
notificationCommands.xml             notification methods: email, http, page
notifications.xml                    what events generate notifications, where
javamail-configuration.properties    configuration for java emailer (default)

The main files that will need attention are destinationPaths.xml, notifd-configuration.xml and notifications.xml. Here is part of the examples file provided in /opt/opennms/etc/examples/destinationPaths.xml:

Figure 52: OpenNMS Example entries in destinationPaths.xml

The <name> tag specifies a user or group of users defined in OpenNMS. The <command> tag specifies a method that must be defined in notificationCommands.xml. Note that escalations are possible.

When an event is received for which a notification is required, OpenNMS "walks" the destination path. We say that the destination path is "walked" because it is often a series of actions performed over time and not necessarily just a single action (although it can be). The destination path continues to be walked until all notifications and escalations have been sent or the notification is acknowledged (automatically or by manual intervention).

Out-of-the-box, the only destinationPath that is configured is for javaEmail to the Admin group of users.

The notifications.xml file specifies what events trigger notifications and to whom. Here is an example from the default file:

Figure 53: OpenNMS Extract of notifications from notifications.xml

The notification called interfaceDown is turned on; it applies to all interfaces other than 0.0.0.0; the notification is sent to the destination Email-Admin (defined in destinationPaths.xml) and the text message of the email includes 3 parameters from the event; 4 parameters are included on the email subject. The default notifications.xml generates email to the Admin group for the following events:

interfaceDown
nodeDown
nodeLostService
nodeAdded
interfaceDeleted
HighThreshold
LowThreshold
HighThresholdRearmed
LowThresholdRearmed

Nothing, so far, has handled acknowledging notifications. This can either be done manually by a user or can be performed automatically. Either way, when a notification is acknowledged, it stops the destination path being walked for the original notification. It will also create a new notification to tell users that the original issue is resolved. Automatic acknowledgements are configured in /opt/opennms/etc/notifd-configuration.xml where <auto-acknowledge> tags specify the uei resolution / problem events, along with the parameters on the event which must also match for the notification to be automatically acknowledged.

Figure 54: OpenNMS notifd-configuration.xml with auto-acknowledgements for notifications

Note that at present (July 2008) notifications are driven by events, not alarms. Also note that acknowledging notices has no effect on their associated events or alarms.

It would appear that there has been a discussion of a change in architecture around events, alarms and notifications, at least throughout 2008. In the future, it is suggested that alarms will be where most automation is driven from, including notifications, and that events will become more of a background log.

7.4 Performance management


7.4.1 Defining data collections
There are several parallels between the capability discovery subsystem and the performance data collection subsystem. Each uses the snmp-config.xml file, described in section 7.1.2, to get SNMP parameters for each device such as SNMP version, port number, community names.

The capability discovery process, capsd, uses the protocol definitions in capsd-configuration.xml to determine what services (capabilities) to discover; these are things like SNMP, DNS, ICMP, SSH. The performance data collection process, collectd, uses 2 files to define what data to collect:

datacollection-config.xml specifies collection names (just the snmp-collection called default out-of-the-box), which defines (typically MIB) values to collect
collectd-configuration.xml specifies packages for collection. A package combines filters and ranges to determine which interfaces collections should be applied to, with services which reference collections in datacollection-config.xml. collectd-configuration.xml can also specify data collection intervals and whether the collection is active.

Note that if a device has several interfaces that:

Support SNMP
Have a valid ifIndex
Are included in a collection package in collectd-configuration.xml

then the lowest IP address is marked as primary and will be used by default for all performance data collection.

collectd is triggered when capsd generates a NodeGainedService event. The discovered protocol name (eg. SNMP, SSH) is passed from capsd to collectd, along with the primary interface from the event. These are checked against the configuration in collectd-configuration.xml to see whether any collection packages are valid (there should be at least one, by definition!) and data collection is started.
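
The primary-interface rule described above (lowest IP address among the interfaces that meet all three conditions) can be sketched in Python as below. The record layout and function name are hypothetical, and only IPv4 addresses are assumed; the point is that addresses are compared numerically, not as strings.

```python
import ipaddress


def primary_interface(interfaces):
    """Pick the lowest qualifying IP address, or None if nothing qualifies.

    Each interface is a dict with hypothetical keys: "ip", "snmp" (supports
    SNMP), "if_index" (None when there is no valid ifIndex) and "in_package"
    (covered by a collection package).
    """
    candidates = [
        ipaddress.ip_address(interface["ip"])
        for interface in interfaces
        if interface["snmp"]
        and interface["if_index"] is not None
        and interface["in_package"]
    ]
    # ip_address objects compare numerically, so 9.x sorts below 10.x.
    return str(min(candidates)) if candidates else None


interfaces = [
    {"ip": "172.16.1.1", "snmp": True, "if_index": 3, "in_package": True},
    {"ip": "10.0.0.95", "snmp": True, "if_index": 2, "in_package": True},
    {"ip": "9.9.9.9", "snmp": False, "if_index": 4, "in_package": True},
]
print(primary_interface(interfaces))
```

Here 9.9.9.9 is numerically lowest but does not support SNMP, so 10.0.0.95 is chosen.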

Figure 55: OpenNMS collectd-configuration.xml as shipped

There is only one package specified in collectd-configuration.xml, as shipped, which applies to all interfaces other than 0.0.0.0 and in the range 1.1.1.1 through 254.254.254.254. As with poller-configuration.xml, you must have one filter statement per package and can then use multiple <specific>, <include-range> and <exclude-range> statements to define which interfaces this package applies to. You can also use the <include-url> tag to specify a file with a list of interfaces.

There is only one data collection service defined for OpenNMS out-of-the-box, in collectd-configuration.xml: the SNMP service. It will run every 5 minutes (300,000 ms) and will collect the MIB variables specified in the collection called default, specified in datacollection-config.xml. The <service> stanza can also specify values for SNMP timeouts, retries and port number which would override the default values in snmp-config.xml.

The package definition can also use the <outage-calendar> tag to specify scheduled downtime for devices, during which data collection will be suspended. This should be used to prevent lots of failed SNMP collection events. Outage periods are defined in the poll-outages.xml file.

Obviously you can specify different packages with different address ranges, collection intervals and with different collection keys. You can also specify data collectors other than SNMP, such as NSClient, JMX and HTTP. See http://blogs.opennms.org/?p=242 for a note on using an HTTP data collector.

The datacollection-config.xml file defines one or more SNMP data collections that Tarus Balog (the prime developer behind OpenNMS) calls a "scheme", to differentiate it from the package defined in the collectd configuration file. These schemes bring together OIDs for collection into groups, and the groups are mapped to systems. The systems are mapped to interfaces by a device's system OID. In addition, each "scheme" controls how the data will be collected and stored.

Fundamentally, OpenNMS uses RRDTool (Round Robin Database Tool) to store performance data. This paper is not a tutorial on RRDTool so please follow the reference to RRD at the end of this paper for more information.

The basis of RRD is that a fixed amount of space is allocated for a given database when it is created. It holds data for a given period of time, say 1 month, 1 year, etc. The sampling interval is known so you know how many data points will go into the database and hence how much space is required. Once the database is full, newer data points will replace the oldest ones, cycling around.

Figure 56: OpenNMS datacollection-config.xml collection and RRD parameters

The <rrd> stanza specifies how data will be stored in a Round Robin Archive (RRA). The snapshot shown in the figure above specifies:

<rrd step="300">
    data to be saved every 5 minutes, per step

RRA:AVERAGE:0.5:1:2016
    create an RRA with values AVERAGE'd over 1 step (ie. this data is raw, not consolidated). The RRA will have 2016 rows representing 7 days of data (5 minute steps = 12 / hour * 24 hours * 7 days = 2016). Consolidate the samples provided 0.5 (half) of them are not UNKNOWN (otherwise the consolidated value will be UNKNOWN)

RRA:AVERAGE:0.5:12:1488
    create an RRA with values AVERAGE'd over 12 steps (ie. this data is consolidated over 1 hour). The RRA will have 1488 rows representing 2 months of data (1 hour consolidations * 24 hours * 62 days = 1488). Consolidate the samples provided 0.5 (half) of them are not UNKNOWN (otherwise the consolidated value will be UNKNOWN)

RRA:AVERAGE:0.5:288:366
    create an RRA with values AVERAGE'd over 288 steps (ie. this data is consolidated over 288 * 5 min steps = 1 day). The RRA will have 366 rows representing 1 year of data (1 day consolidations * 366 days = 366). Consolidate the samples provided 0.5 (half) of them are not UNKNOWN (otherwise the consolidated value will be UNKNOWN)

RRA:MAX:0.5:288:366
    create an RRA with MAX values averaged daily and keep 1 year of data

RRA:MIN:0.5:288:366
    create an RRA with MIN values averaged daily and keep 1 year of data
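
The retention arithmetic above is easy to check: the time covered by one row is the step multiplied by the number of steps per row, and the total retention is that multiplied by the number of rows. The following Python sketch does the calculation for the RRA strings shown; the function name is hypothetical and it only parses the five-field RRA form used here.

```python
def rra_retention(step_seconds, rra):
    """Return (consolidation function, seconds per row, days retained).

    rra is a string of the form RRA:CF:xff:steps:rows, as in the <rrd>
    stanza above.
    """
    _, function, _xff, steps, rows = rra.split(":")
    seconds_per_row = step_seconds * int(steps)
    days_retained = seconds_per_row * int(rows) / 86400
    return function, seconds_per_row, days_retained


for rra in ("RRA:AVERAGE:0.5:1:2016",
            "RRA:AVERAGE:0.5:12:1488",
            "RRA:AVERAGE:0.5:288:366"):
    function, seconds_per_row, days = rra_retention(300, rra)
    print(f"{rra}: one row per {seconds_per_row} s, {days:g} days retained")
```

With a 300 second step this gives 7 days, 62 days and 366 days respectively, matching the figures quoted in the text.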

The top of datacollection-config.xml defines where the RRD repositories are kept and how many variables can be retrieved by an SNMP V2 GETBULK command (10 is the default). Within the repository directory, for each node, there will exist a directory that consists of the node number. Thus, if the system was collecting data on node 21, there would be a directory called /opt/opennms/share/rrd/snmp/21 containing a datafile for each MIB OID being collected. Filenames will match the alias parameter for a MIB OID, in datacollection-config.xml.

The node number can be found by going to the detailed node information for a device and choosing the Asset Info link:

Figure 57: OpenNMS Asset Info link for a device

The resulting page includes the Node ID at the top.

Figure 58: OpenNMS Asset information page, including Node ID

The snmpStorageFlag parameter in the snmp-collection stanza of datacollection-config.xml defines for which interfaces of a device data will be stored. Possible values are:

all        (the old default)
primary    the primary SNMP interface
select     collect from all IP interfaces and can use Admin GUI to select additional non-IP interfaces to collect data from (new default since OpenNMS 1.1.0)

Figure 59: OpenNMS GUI Admin page for specifying interfaces to collect data from

Most of the contents of datacollection-config.xml is defining groups and systems:

groups     define groups of SNMP MIB OIDs to collect
systems    use a device's System OID as a mask to determine which groups of OIDs should be collected

Figure 60: OpenNMS group definitions in datacollection-config.xml

Unfortunately OpenNMS does not have a MIB compiler so all MIB OIDs need to be manually specified in this file (the good news is that there are lots there out-of-the-box). Once groups of MIB variables are declared, system stanzas say which group(s) are to be collected for any device whose system OID matches a particular pattern.

Each SNMP MIB variable consists of an OID plus an instance. Usually, that instance is either zero (0) or an index to a table. At the moment, OpenNMS only understands a small number of table indices (for example, the ifIndex index to the ifTable and the hrStorageIndex to the hrStorageTable). All other instances have to be explicitly configured.

The ifType parameter can be used to specify the sort of interfaces to collect from. Legal values are:

all                    collect from all interface types
ignore                 used when the value would be the same for all interfaces eg. CPU utilisation for a Cisco router
<i/f type number>      used to denote one or more specific interface types. For example ifType=6 for ethernetCsmacd. See http://www.iana.org/assignments/ianaiftype-mib for a comprehensive list.

OpenNMS understands four types of variables to collect on: gauge, timeticks, integer, octetstring. Note that RRD only understands numeric data.

Figure 61: OpenNMS systems definitions in datacollection-config.xml

In the figure above, any device which has satisfied the filtering in collectd-configuration.xml and has a system OID starting with .1.3.6.1.4.1 (the start of the Enterprise MIB tree), will collect performance data for MIB-2 interfaces, tcp and icmp, as specified in the earlier <group> stanzas.

Note that the defaults in collectd-configuration.xml and datacollection-config.xml mean that a large number of SNMP data collections will be activated out-of-the-box. This is good in providing lots of samples in small environments but it could be a serious performance and disk usage factor if these defaults are left unchanged, where a large number of interfaces are monitored by OpenNMS.
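
As a rough illustration of how system definitions select groups, the Python sketch below matches a device's system OID against each system's mask and gathers the groups to collect. The data layout, the group names and the second (Net-SNMP style) system are assumptions made for the example, as is the idea that a device matching several systems collects the union of their groups.

```python
def groups_for(sys_oid, systems):
    """Return the groups to collect for a device, in first-seen order.

    systems is an ordered list of dicts with hypothetical keys "name",
    "sysoid_mask" (a prefix the device's system OID must start with) and
    "groups".
    """
    selected = []
    for system in systems:
        if sys_oid.startswith(system["sysoid_mask"]):
            for group in system["groups"]:
                if group not in selected:
                    selected.append(group)
    return selected


systems = [
    {"name": "Enterprise", "sysoid_mask": ".1.3.6.1.4.1.",
     "groups": ["mib2-interfaces", "mib2-tcp", "mib2-icmp"]},
    # Hypothetical second system, for illustration only.
    {"name": "Net-SNMP", "sysoid_mask": ".1.3.6.1.4.1.8072.3.",
     "groups": ["mib2-interfaces", "net-snmp-disk"]},
]
print(groups_for(".1.3.6.1.4.1.8072.3.2.10", systems))
print(groups_for(".1.3.6.1.2.1", systems))
```

Ending each mask with a dot avoids a prefix such as .1.3.6.1.4.1.80 accidentally matching .1.3.6.1.4.1.8072.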

84

7.4.2 Displaying performance data


OpenNMS provides a large number of reports out-of-the-box, based on the default data collection parameters. Use the Reports main menu to see the options.

Figure 62: OpenNMS Report categories available out-of-the-box

Resource Graphs                    provide lots of standard reports
KSC Performance, Nodes, Domains    allows users to customise own reports
Availability                       availability reports for interfaces & services
Statistics Reports                 shows Top 20 ifInOctets across all nodes

Following the Resource Graphs link provides access to many standard reports.

85

Figure63:OpenNMSStandardperformancereports

Thestandardperformancereportsdisplayvariouscollectedvaluesforoneparticular nodewhichyouchoosefromthemenuprovided.Thedifferentcategoriesprovide:

NodelevelperformancedatasuchasTCPconnections,CPU,memory Interfacedataforeachinterfacesuchasbitsin/out ResponsetimedataforservicessuchasICMP,DNS,SSH DiskspaceinformationfromtheucdsnmpMIB

86

Figure64:OpenNMSStandardResourcegraphsavailableforaselectednode

Hereispartofthenodelevelperformancedatasetofgraphs.

87

Figure65:OpenNMSpartialdisplayofthenodelevelperformancedatagraphs

Ifyouwishtocreatemoreselectivesetsofgraphsforotherpeopletouse,theKey SNMPCustomized(KSC)Reportsmenutocreateyourownreportswhichcaninclude graphsofselectedMIBvariablesfromonedeviceorcanselectMIBvariablesfrom differentdevices.UsingtheCreateNewbuttonwillpromptfornodesthathavedata collectionsconfiguredasChildResources.

88

Figure66:OpenNMSKSCReportsmenu

SelectinganodeandclickingViewchildresourcesresultsinamenuofreport categories.

89

Figure67:OpenNMSReportcategoriesavailableforcustomisedreports

IfyouselecttheNodelevelPerformanceDataoptionandtheChoosechildresource buttontheneachoftheMIBvariablescollectedcanbedisplayedandselected.

90

Figure68:OpenNMSSelectingprefabricatedreportstoincludeinacustomisedreport

ThedropdownalongsidethePrefabricatedReportfieldallowsyoutoselectanyof thedefaultreportstoincludeinyourowncustomisedreports.Youcanincludeseveral differentgraphs,fromthesameordifferentnodes,inyourKSCreport.

7.4.3 Thresholding
The thresholding capability in OpenNMS has changed fairly significantly over time; see http://www.opennms.org/index.php/Thresholding#Merge_into_collectd. for a good explanation.

Pre OpenNMS 1.3.10, collectd collected data and threshd performed thresholding: two separate processes. This design used a range parameter in threshd-configuration.xml to get around problems caused by the asynchronous nature of collectd and threshd.

OpenNMS 1.3.10 merged the thresholding functionality into collectd and introduced a new parameter into collectd-configuration.xml:

<parameter key="thresholding-group" value="default-snmp"/>

where the value of the thresholding-group matched a definition in threshd-configuration.xml. The need for the range parameter disappeared. However, to define different filters for thresholding, different packages had to be defined in collectd-configuration.xml.

From OpenNMS 1.5.91 (this paper is based on version 1.5.93), filters can be defined in threshd-configuration.xml so that packages in collectd-configuration.xml can be kept simple. The parameter in collectd-configuration.xml changes; the thresholding-group key disappears and is replaced by:

<parameter key="thresholding-enabled" value="true"/>

Here is the default collectd-configuration.xml:

Figure 69: OpenNMS Default collectd-configuration.xml

The lack of any thresholding parameter implies that thresholding is disabled.

...and the default threshd-configuration.xml:

Figure 70: OpenNMS Default threshd-configuration.xml

The default threshd-configuration.xml is set up for the interim design between versions 1.3.10 and 1.5.90. For OpenNMS 1.5.93, collectd-configuration.xml should be changed as shown below:

Figure 71: OpenNMS Modified collectd-configuration.xml to enable thresholds

threshd-configuration.xml can be modified with different packages of thresholding to apply to different ranges of nodes.

Figure 72: OpenNMS Modified threshd-configuration.xml

Different filters are applied to each package. The thresholding-group parameter is required here and the value points to a matching definition in thresholds.xml, where the MIBs to threshold and the threshold values are specified.

Figure 73: OpenNMS Modified thresholds.xml for CCsnmpgroup and raddlesnmpgroup

The attributes of a threshold are:

type: A "high" threshold triggers when the value of the data source exceeds the "value", and is rearmed when it drops below the "rearm" value. Conversely, a "low" threshold triggers when the value of the data source drops below the "value", and is rearmed when it exceeds the "rearm" value. "relativeChange" is for thresholds that trigger when the change in data source value from one collection to the next is greater than "value" percent.
expression: A mathematical expression involving data source names which will be evaluated and compared to the threshold values. This is used in "expression" thresholding (supported from 1.3.3).
ds-name: The name of the variable to be monitored. This matches the name in the alias parameter of the MIB statement in datacollection-config.xml.
ds-type: Data source type. "node" for node-level data items, and "if" for interface-level items.
ds-label: Data source label. The name of the collected "string" type data item to use as a label when reporting this threshold. Note: this is a data item whose value is used as the label, not the label itself.
value: The value that must be exceeded (either above or below, depending on whether this is a high or low threshold) in order to trigger. In the case of relativeChange thresholds, this is the percent that things need to change in order to trigger (e.g. 'value="1.5"' means a 50% increase).
rearm: The value at which the threshold will reset itself. Not used for relativeChange thresholds.
trigger: The number of times the threshold must be "exceeded" in a row before the threshold will be triggered. Not used for relativeChange thresholds.
triggeredUEI: A custom UEI to send into the events system when this threshold is triggered. If left blank, it defaults to the standard thresholds UEIs.
rearmedUEI: A custom UEI to send into the events system when this threshold is rearmed. If left blank, it defaults to the standard thresholds UEIs.
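
The interplay of value, rearm and trigger for a "high" threshold can be seen in a small state machine. The Python class below is a sketch of the behaviour as described in the attribute list, not the OpenNMS implementation; the class name and the returned strings are placeholders, and resetting the consecutive count when a sample falls back under the value is an assumption based on the phrase "in a row".

```python
class HighThreshold:
    """Minimal model of a "high" threshold with value, rearm and trigger."""

    def __init__(self, value, rearm, trigger):
        self.value = value
        self.rearm = rearm
        self.trigger = trigger
        self.exceeded_count = 0  # consecutive samples above value
        self.armed = True        # False once triggered, until rearmed

    def evaluate(self, sample):
        """Return "triggered", "rearmed" or None for one collected sample."""
        if self.armed:
            if sample > self.value:
                self.exceeded_count += 1
                if self.exceeded_count >= self.trigger:
                    self.armed = False
                    self.exceeded_count = 0
                    return "triggered"
            else:
                # Assumption: the run must be unbroken, so start counting again.
                self.exceeded_count = 0
        elif sample < self.rearm:
            self.armed = True
            return "rearmed"
        return None


threshold = HighThreshold(value=90, rearm=75, trigger=2)
results = [threshold.evaluate(sample) for sample in (95, 80, 95, 96, 97, 70, 95)]
print(results)
```

In this run the single spike of 95 does nothing, two consecutive high samples trigger the threshold, and it only rearms once a sample drops below 75.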

By default, standard threshold and rearm events will be generated but it is also possible to create customised events with the threshold attributes. This would then make it easier to generate notifications for specific thresholding / rearm events.

Here is a screenshot with standard events generated by thresholds on the raddle network:

Figure 74: OpenNMS Threshold events from various devices in the raddle network

For those who prefer not to edit XML configuration files, the OpenNMS Admin menu provides a GUI way to create and modify thresholds.

Figure 75: OpenNMS Admin menu

Selecting the Manage Thresholds option displays all thresholds currently configured in thresholds.xml.

Figure 76: OpenNMS Configuring thresholds through the Admin menu

Using the Edit button permits modification of an existing threshold.

Figure 77: OpenNMS Modifying thresholds through the Admin GUI

7.5 Managing OpenNMS


So far, this description of OpenNMS has focused very much on configuration by editing XML files. It is well worth mentioning that there is now an Admin menu (touched on in the Thresholding section previously), which means many of the configuration tasks can be driven by a menu-based, fill-in-the-blanks GUI. Refer back to Figure 75: OpenNMS Admin menu for a list of the areas which can be configured this way.

7.6 OpenNMS summary


OpenNMS is a mature and very capable systems and network management product. It satisfies most requirements for discovery, availability monitoring, problem management and performance management.

It has a clean architecture for configuration with everything being defined in XML files. It has an excellent mechanism for collecting and configuring SNMP TRAPs. For those who prefer to customise through a GUI, the Admin menu provides access to configure some of these files without needing to know an editor or XML.

It feels like a solid, reliable product and is designed (say the developers) to scale to truly large enterprises. There are lots of good samples provided and the default configurations provide rich functionality.

Areas where it is weak are around formal documentation and the lack of a usable topology map. That said, the help that is provided with OpenNMS panels is very good. Data collection and thresholding is strong. The addition of a MIB compiler and browser would improve matters enormously. It is also short of a way to discover applications that do not support port-sniffing or SNMP.

There are two large problems with OpenNMS that give me great concern. The first is that you have to bounce the whole OpenNMS system if you change any configuration files!

The second big issue, known to be under review, is the association between events, alarms and notifications. Currently, notifications are driven from events whereas driving them from alarms would seem preferable. There is also no link between acknowledging events, alarms and notifications.

I have two personal negative feelings with OpenNMS. The first is that it is written in Java. Sorry, but I hate Java applications! To be fair, OpenNMS does not suffer from performance issues that affect so many other Java applications but its logfiles are Java logfiles and life is just too short to find anything useful in them! My second personal non-preference is that OpenNMS is very wordy. The important information never seems to hit the eye on most screens.

8 Zenoss
Zenoss is a third Open Source, multi-function systems and network management tool. Unlike Nagios and OpenNMS, there is a free, core offering (which does seem to have most things you need), and Zenoss Enterprise that has extra add-on goodies, high availability configurations, distributed management server configurations and various support contract offerings which include some education. For a comparison of the free and fee alternatives, try http://www.zenoss.com/product/#subscriptions.

Zenoss offers configuration discovery, including layer 3 topology maps, availability monitoring, problem management and performance management. It is based around the ITIL concept of a Configuration Management Database (CMDB), the Zenoss Standard Model. Zope Enterprise Objects (ZEO) is the back-end object database that stores the configuration model, and Zope is the web application development environment used to display the console. The relational MySQL database is used to hold current and historical events.

Zenoss 2.2 has recently been released which provides stack builds: complete bundles including Zenoss and all its prerequisites. These stack installers are available for a wide variety of Linux platforms; standard RPM and source formats are also available. For easy evaluation, a VMware appliance can be downloaded, ready to go.

I tried both the VMware build and the 2.2 stack install for SuSE 10.3; both were relatively painless. The rest of this section is based on the 2.2 stack installation on a machine whose hostname is zenoss.

To access the Web console, point your browser at http://zenoss:8080. The default user is admin with a password of zenoss. The default dashboard is completely configurable but this screenshot is close to the default.

Figure 78: Zenoss default dashboard

8.1 Configuration Discovery and topology


There is a good Zenoss Quickstart document available from http://www.zenoss.com/community/docs. Similar to OpenNMS, the architecture is based on object-oriented techniques.

8.1.1 Zenoss discovery


zProperties can be defined for devices, services, processes, products and events. Objects can be grouped and sub-grouped with zProperties being refined and changed throughout the hierarchy. So, for example, the Device object class has default subclasses for different device types, as shown below.

Figure 79: Zenoss device classes

The class of Devices has a zProperties page as do the classes Network, Server, Printer, etc. Devices will initially be added to the Discovered class and can then be moved to a more appropriate class.

Figure 80: Zenoss Server Device classes

Figure 81: Zenoss Linux Server devices

Discovery and monitoring is largely controlled by the combination of zProperties applied to a device, of which there are a large number (most with sensible defaults). Initially, basic SNMP and ping polling parameters should be configured in the zProperties page for Devices.

Figure 82: Zenoss zProperties for the Device class (part 1)

Figure 83: Zenoss zProperties for the Device class (part 2)

Figure 84: Zenoss zProperties for the Device class (part 3)

The left-hand menus of the web console provide an Add Device option (nothing is discovered automatically, out of the box).

Figure 85: Zenoss Add Devices dialogue

Once a device has been discovered (which by default uses ping), if the discovery protocol is set to SNMP then the device will be queried for its SNMP routing table. Any networks that the device has routes to will then be added to the object class of networks.

Figure 86: Zenoss Networks class with dropdown menu

Once the presence of a network has been discovered, devices can automatically be discovered on that network; this uses a spray ping mechanism. There is a dropdown menu from the top left corner of the Networks page (which works fine for simple Class C networks). Although the GUI does manage to display subnetworks accurately, even if the subnet mask is not on a byte boundary, the Discover Devices menu does not honour the subnet mask. However, a good feature of Zenoss is that there is a command line (CLI) for virtually everything and the CLI for device discovery on a network does honour supplied netmasks. For example:

    zendisc run --net 10.0.0.0/24

Note that the Zenoss discovery algorithm is very dependent on getting routing tables using SNMP and the Zenoss server must support SNMP itself.

Devices that do not support ping but do support SNMP can be added manually with the Add Device menu. The zProperties of the device (or class of devices if you create a subclass) should have zPingMonitorIgnore=True and zSnmpMonitorIgnore=False.

There are three Zenoss processes that implement discovery:

• zenmodeler can use SNMP, ssh and telnet to discover detailed information about devices. zenmodeler will only be run against devices that have already been discovered by zendisc. By default, zenmodeler runs every 6 hours.
• zenwin detects Windows (WMI) services.
• zendisc is a subclass of zenmodeler. It traverses routing tables using SNMP and then uses ping to detect devices on discovered networks.
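The earlier point about netmasks can be made concrete with a short sketch. This is not Zenoss code; it only uses Python's standard ipaddress module to show how the set of candidate addresses for a ping sweep depends on the supplied prefix length. The function name sweep_targets and the second example network are assumptions made for illustration.

```python
import ipaddress

def sweep_targets(cidr):
    """Return the usable host addresses of a network given in CIDR notation."""
    network = ipaddress.ip_network(cidr, strict=True)
    return [str(host) for host in network.hosts()]

# A /24 (the old "Class C" size) gives 254 candidate addresses.
print(len(sweep_targets("10.0.0.0/24")))

# A mask that is not on a byte boundary gives a much smaller, different set.
subnet = sweep_targets("10.0.0.64/27")
print(len(subnet), subnet[0], subnet[-1])
```

A discovery tool that ignores the supplied mask and assumes a byte boundary would ping addresses outside the second network, which is why a CLI that honours the netmask matters.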

8.1.2 Zenoss topology maps


Zenoss has an automatic topology mapping option which can display up to 4 hops from a selected device. It even seems to be able to understand networks served by several routers!

Figure 87: Zenoss Network Map showing 4 hops from group100r1

8.2 Availability monitoring


Availability monitoring in Zenoss can use 3 different methods:

• ping tests
    implemented via zenping
    detects device availability
• service tests
    implemented via zenstatus
    detects services as defined by TCP / UDP ports
• process tests and Windows Services tests
    implemented via zenprocess
    detects processes using the SNMP Host Resources MIB using the snmp.IpServiceMap zCollectorPlugin driven by zenmodeler
    detects Windows services using WMI using the WinServiceMap driven by zenwin

8.2.1 Basic reachability availability


Basic availability monitoring is controlled by Collectors. These are also known as Monitors (and the documentation can be confusing!). The Collectors menu can be found on the left-hand side.

Figure 88: Zenoss Collectors (Monitors) overview

The devices being monitored are shown at the bottom of the screen. To change any of these parameters, use the Edit tab. The defaults for availability monitoring are:

Ping cycle time polling                            60 sec
Ping timeout                                       1.5 sec
Ping retries                                       2
Status (TCP / UDP service) polling interval        60 sec
Process (SNMP Host Resources) polling interval     180 sec
SNMP performance cycle interval                    300 sec

What availability checks are carried out on a device is controlled by the zProperties of that device, remembering that zProperties can be set at any level of the object hierarchy. By default the /Devices class has zPingMonitorIgnore=False and zSnmpMonitorIgnore=False so every device will get ping polling at 1 minute intervals and SNMP polling at 5 minute intervals.


8.2.2 Availability monitoring of services - TCP / UDP ports and windows services
Service monitoring for TCP / UDP ports and Windows services is configured through the Services menu.

Figure 89: Zenoss Services menu

A very large number of Windows services are preconfigured out of the box. These services are actually monitored by the zenwin daemon which uses (and requires) WMI on the Windows target machine. Note the Count column showing on how many devices these services have been detected.

Figure 90: Zenoss Windows services

Even more IP services come configured out of the box. There are two subclasses of IP services, Privileged and Registered; either can monitor either TCP or UDP ports.

Figure 91: Zenoss Privileged IP services

Again, note the Count column. Clicking on the service name shows where the service has been detected:

Figure 92: Zenoss devices running the domain (DNS) service on TCP 53 or UDP 53

The fact that a service has been detected does not imply that it is being monitored for availability (the default, out of the box, is that nothing is monitored). The Monitor column for devices shows whether active monitoring is taking place (and hence events potentially being generated). The Monitor field in the top part of the window shows the global default for this service.

To turn on service monitoring globally for a particular service, use the Services menu to find the service in question. You can then use either the zProperties tab or the Edit tab to change the Monitor global default to True (the default, as shipped, is False).

To turn on service monitoring for a specific device, access the main page for a device and open the OS tab. Under the IP Services section, click on the Name column header to see services detected. Click on the service name which brings up the service status window for the device where the Monitor field can be changed; don't forget to click the Save button. Note that the Monitored box in the IP Services heading bar can be used to toggle the display between detected services and monitored services. Note that the dropdown menu to Add IpService is driven by typing in a partial match of the service name you want; the subsequent dropdown then shows configured services that match your selection.

8.2.3 Process availability monitoring


Unix / Linux process monitoring relies on the SNMP Host Resources MIB on the target device. Processes to be monitored can be flexibly defined using regular expressions. Start from the Processes menu to see processes defined (there are none out of the box). Use the dropdown menu to Add process.
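As a rough illustration of defining processes by regular expression, the sketch below matches patterns against a list of process names. It is not how zenprocess is implemented; the process names and the helper matching_processes are hypothetical, standing in for names that an SNMP agent might report from the Host Resources MIB.

```python
import re

# Hypothetical process names, as an SNMP agent might report them.
running = [
    "/usr/sbin/sshd",
    "/usr/lib/firefox/firefox-bin",
    "/usr/sbin/httpd2-prefork",
]

def matching_processes(pattern, process_names):
    """Return the names that the regular expression matches anywhere."""
    regex = re.compile(pattern)
    return [name for name in process_names if regex.search(name)]

print(matching_processes(r"firefox", running))
print(matching_processes(r"httpd2?-", running))
print(matching_processes(r"^/usr/sbin/", running))
```

A loose pattern such as firefox catches any path containing that string, while an anchored pattern such as ^/usr/sbin/ restricts the match to one directory.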

Figure 93: Zenoss Processes with dropdown menu

Supply a process name and it will be added to the list. To modify the definition of the process, click on the process name and select the Edit tab.

Figure 94: Zenoss dialogue for modifying process definition

To modify the zProperties of a process, use the zProperties tab.

Figure 95: Zenoss zProperties for the firefox process

To apply process monitoring to a device, from the OS tab of the device page, select the dropdown menu and use the Add OSProcess menu. Defined processes are selectable from the dropdown window.

Figure 96: Zenoss Add OSProcess monitoring to a specific device

Note that there are currently (July 4th, 2008) a couple of bugs to do with process monitoring whereby processes disappear from the OS tab of a device and / or show the wrong status (tickets #3408, #3399, #3270). To mitigate against these, the zenprocess daemon should be stopped and restarted whenever modifications have been made to do with processes. You can use the GUI by choosing Settings and selecting the Daemons tab.

Temporarily, it would also be wise to use the menu for the process and select to Lock the process from Deletion.

More sophisticated availability monitoring can be implemented using standard zCollectorPlugins; note that these are modelling plugins as distinct from performance plugins. zCollector plugins are applied to device classes or devices through the zProperties tab; use the Edit link alongside zCollectorPlugins to show or modify the plugins applied and available.

Figure 97: Zenoss zCollectorPlugins

Note that the Add Fields / Hide Fields appears greyed out but does actually work. The plugins shown on the left in the screenshot above are the default for the /Devices class. The /Devices/Server class has several more SNMP-based plugins by default and the /Devices/Server/Windows class has an extra wmi.WinServiceMap plugin. Documentation on these plugins seems a little sparse but here are a few clues:

Figure 98: Zenoss default plugins for class /Devices/Server/Windows

zenoss.snmp.InterfaceMap     uses SNMP to query for interface info
zenoss.snmp.IpServiceMap     zenstatus daemon queries TCP / UDP port info
zenoss.snmp.HRSWRunMap       uses SNMP to get process info from Host Resources MIB
zenoss.wmi.WinServiceMap     zenwin daemon uses WMI to query for Windows services

One way to find what plugins are applied by default to device classes is to inspect the migration script supplied in /usr/local/zenoss/zenoss/Products/ZenModeler/migrate/zCollectorPlugins.py.

To see what plugins are active on a specific device, use the device's main page menu and select the More menu to find the Collector Plugins menu.

Figure 99: Zenoss zCollectorPlugins for device group100r1.class.example.org

When modifying characteristics for specific devices, do note that the main page menu (from the arrow dropdown at the top left corner) has both a More submenu (which includes zProperties among other things) and a Manage submenu.

Figure 100: Zenoss Device More submenu

Figure 101: Zenoss Device Manage submenu

8.2.4 Running commands on devices


A few Commands are defined out of the box and can be seen using the left-hand Settings menu and then selecting the Commands tab. New commands can be added using the Add User Command dropdown menu.

Figure 102: Zenoss Commands provided out of the box

From a device's main page, there is a submenu to Run Commands.

Figure 103: Zenoss Run Commands for a particular device

Although much of the availability monitoring that has been demonstrated so far relies on SNMP, it is also possible to use ssh or telnet to contact remote devices and run monitoring scripts on them.

8.3 Problem management


The Zenoss event management system can collect events from syslogs, Windows event logs, SNMP TRAPs and XML-RPC, in addition to managing events generated by Zenoss itself (such as availability and performance threshold events).

When an event arrives in the Status table of the events database, the default state of the event is set to New. The event can then be Acknowledged, Suppressed or Dropped. From there, an event will be archived into the Event History database in one of four ways:

• Manually moved to the historical database (historifying)
• Automatic correlation (good event clears bad event)
• An event class rule
• A timeout

Events automatically have a duplication detection rule applied so that if an event of the same class, from the same device, with the same severity arrives, then the repeat count of an existing event will simply be incremented.

Global configuration parameters for the event system can be configured from the Event Manager left-hand menu.

By default, status events of severity below Error are aged out to the Event History database after 4 hours. Historical events are never deleted.
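The duplication detection rule described above can be sketched in a few lines. This is a minimal illustration, not the Zenoss implementation: the dictionary standing in for the Status table, the function record_event and the event class name used in the example are all assumptions.

```python
# Minimal sketch of repeat-count de-duplication; not the Zenoss implementation.
def record_event(status_table, device, event_class, severity):
    """Add an event, or bump the repeat count of a matching existing one."""
    key = (device, event_class, severity)   # the three fields named in the text
    if key in status_table:
        status_table[key]["count"] += 1
    else:
        status_table[key] = {"count": 1, "state": "New"}
    return status_table[key]

status = {}
record_event(status, "bino", "/Status/Ping", "Critical")
record_event(status, "bino", "/Status/Ping", "Critical")  # duplicate: count goes to 2
record_event(status, "bino", "/Status/Ping", "Warning")   # different severity: new row
print(len(status), status[("bino", "/Status/Ping", "Critical")]["count"])
```

The key point is that a repeated problem produces one row with a rising count rather than a flood of identical rows.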

Figure 104: Zenoss Event Manager configuration

8.3.1 Event console


The main Event Console is reached from the Event Console menu on the left. The default is to show all status events with a severity of Info or higher, sorted first by severity and then by time (most recent first). Events are assigned different severities:

Critical    Red
Error       Orange
Warning     Yellow
Info        Blue
Debug       Grey
Clear       Green

The events system has the concept of active status events and historical events (two different database tables in the MySQL events database).

Events in the console can be filtered by Severity (Info and above by default) and by State (New, Acknowledged and Suppressed, where New and Acknowledged are shown by default). Any event which has been Acknowledged changes to a wishy-washy version of the appropriate colour. There is also a Search box at the top right for filtering events.
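The default console behaviour (filter by severity and state, then sort by severity and recency) can be sketched as follows. This is only an illustration of the rules stated above; the numeric ranking simply follows the order in which the severities are listed, and the event dictionaries and the function console_view are hypothetical.

```python
# Rank follows the order in which the severities are listed above (an assumption).
SEVERITY_RANK = {"Clear": 0, "Debug": 1, "Info": 2,
                 "Warning": 3, "Error": 4, "Critical": 5}

def console_view(events, min_severity="Info", states=("New", "Acknowledged")):
    """Filter by severity and state, then sort by severity, newest first."""
    floor = SEVERITY_RANK[min_severity]
    visible = [e for e in events
               if SEVERITY_RANK[e["severity"]] >= floor and e["state"] in states]
    return sorted(visible,
                  key=lambda e: (-SEVERITY_RANK[e["severity"]], -e["time"]))

events = [
    {"severity": "Warning",  "state": "New",          "time": 100},
    {"severity": "Critical", "state": "Acknowledged", "time": 50},
    {"severity": "Debug",    "state": "New",          "time": 120},
    {"severity": "Critical", "state": "Suppressed",   "time": 130},
    {"severity": "Critical", "state": "New",          "time": 90},
]
for event in console_view(events):
    print(event["severity"], event["state"], event["time"])
```

With these defaults the Debug event and the Suppressed event are hidden, and the two remaining Critical events are listed before the Warning.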

Figure 105: Zenoss Event Console

From the Console, events can be selected by checking the box alongside the event and the dropdown can be used for various functions including Acknowledge and Move to History. The dropdown can also be used to generate any test event with the Add Event option (if you are a CLI person rather than a GUI person, the zensendevent command is also available).

The column headers of the Event Console can be used to change the sorting criteria and the icon at the far right of the event can be used to display the detailed data of fields.

8.3.2 Internally generated events


Events are automatically generated by Zenoss if an availability metric is missed (such as a ping check failing or a service check failing). Similarly, if performance sampling is set up along with thresholds, then events will be generated if the threshold is breached. Reasonable defaults for such events are configured out of the box.

Events are organised in class hierarchies which have zProperties, just like Devices. To modify the properties of an event, select the Events option from the left-hand menu.

Figure 106: Zenoss Event classes and subclasses

To modify the context of any event, select the event and use the zProperties tab.

Figure 107: Zenoss zProperties for the event class /Event/Status/OSProcess

Events are mapped to Event Classes by Event Class instances. Event Class instances are looked up by a non-unique key called EventClassKey. When an event arrives it is:

• Parsed
• Assigned to the appropriate class and class key
• Context is then applied:
    • Event context is defined in the zProperties of an event class
    • After the event context has been applied, then the device context is applied whereby the ProductionState, Location, DeviceClass, DeviceGroups, and Systems are all attached to the event in the event database.

Once these properties have been associated with the event, Zenoss attempts to update the zEventProperties. This allows a particular device or class of devices to override the default values for any given event.

To change the event mapping, select the event class and use the Mappings tab.

Figure 108: Zenoss Event mapping

The Edit tab allows editing of any of these fields.

8.3.3 SNMP TRAP reception and configuration


Zenoss automatically listens for SNMP TRAPs on UDP/162 (the well-known trap port) using the zentrap process. Some generic TRAPs (2, 3 and 4 for Link Down, Link Up and Authentication Failure) are automatically mapped to defined classes. Other generic TRAPs (such as 0, 1 for Cold Start and Warm Start) appear as the /Unknown event class, as will any specific TRAPs. It is simple to map such events to an already configured event class by selecting the occurrence of the event and using the pulldown menu to select Map Events to Class; pick the correct class from the scrollable list.

It is also possible to create new event classes. Starting from Events on the left menu, navigate to the place in the event class hierarchy under which you want to create a new class and use the dropdown menu to Add New Organizer and give the class a unique name.
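For reference, the generic TRAP numbers mentioned above come from the SNMPv1 specification (RFC 1157). The sketch below lists them and applies the out-of-the-box behaviour as this paper describes it; the function initial_handling and its return strings are hypothetical and do not name real Zenoss event classes other than /Unknown.

```python
# Generic trap numbers as defined for SNMPv1 in RFC 1157.
GENERIC_TRAPS = {
    0: "coldStart",
    1: "warmStart",
    2: "linkDown",
    3: "linkUp",
    4: "authenticationFailure",
    5: "egpNeighborLoss",
    6: "enterpriseSpecific",
}

# Per the text above, only generic traps 2, 3 and 4 are mapped out of the box.
MAPPED_OUT_OF_THE_BOX = {2, 3, 4}

def initial_handling(generic_trap):
    """Describe how an unconfigured system would first classify a trap."""
    name = GENERIC_TRAPS[generic_trap]
    if generic_trap in MAPPED_OUT_OF_THE_BOX:
        return name + ": mapped to a defined event class"
    return name + ": appears in /Unknown until mapped by hand"

for number in sorted(GENERIC_TRAPS):
    print(number, initial_handling(number))
```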

Figure 109: Zenoss menu to create a new event class

8.3.4 email / pager alerting


Alerting Rules are Zenoss's way of sending email and / or paging notifications. These are configured on a per-user basis, starting from the Preferences menu towards the top right of the web console. The Alerting Rule tab then shows existing rules and permits rule creation / deletion.

Figure 110: Zenoss menu to create Alerting Rule

Using the Edit tab permits changes of existing alerting rules. Different rules can be applied based on a combination of severity, event state, production state and a more generic filter. The Production State is assigned to a device or device class:

• Production
• Pre-Production
• Test
• Maintenance
• Decommissioned

The Production State can be set or changed using the Edit tab from a device main page. The default is Production. The Production State attribute can be used to control whether a device is monitored at all, whether alerts are sent and whether a device is represented on the Zenoss main dashboard. It is very simple to modify the Production State to put a device or class of devices into maintenance, for example.

Figure 111: Zenoss Editing alerting rule

The email or pager message of the Alerting Rule is configured by the Message tab and the Schedule tab can be used to create different alerting rules at different times.

Figure 112: Zenoss Alerting rule message format

Global parameters for email and paging, along with other useful parameters, can be defined from the Settings left-hand menu.

Figure 113: Zenoss Settings parameters

The out-of-the-box email notifications provide handy links back to Zenoss to manipulate the event that is being reported on.

Figure 114: Zenoss email generated by event notification, including links

8.3.5 Event automations


Any event can be configured to run an automatic script. This can be in addition to the email / pager alerting rules described above. Such automation scripts are known as Zenoss Commands and are run by the zenactions daemon. They are configured from the Event Manager left-hand menu using the Commands tab.

Figure 115: Zenoss Event Command definition

8.4 Performance management


Zenoss can collect performance data and threshold it using either SNMP (through the zenperfsnmp daemon) or by commands (typically ssh), using the zencommand daemon. The data is stored and displayed using RRDTool.

8.4.1 Defining data collection, thresholding and graphs


Configuration of performance data collection, thresholding and display is done through templates. As with other Zenoss objects, templates can be applied to a specific device or to a higher level in the device class object hierarchy. To see all the defined templates, navigate to the Devices page and use the left-hand dropdown menu and the More submenu to choose All Templates.

Figure 116: Zenoss All Templates showing all defined performance templates

With the exception of the templates with HRMIB in the name, the above figure shows the default templates as shipped. Note that these are defined templates; there is no indication here as to which are active on what objects.

Note in the screenshot above that there are several templates called Device. Templates can be bound to a device or device class to make them active. When determining what data to collect, the zenperfsnmp (or zencommand) daemon first determines the list of Template names that are bound to this device or component. For device components this is usually just the meta type of the component (e.g. FileSystem, CPU, HardDisk, etc.). For devices, this list is the list of names in the device's zDeviceTemplates zProperty.

Figure 117: Zenoss zProperties showing zDeviceTemplate

The default, out of the box, is that the device template called Device is bound to each device discovered. As noted in the previous screenshot, there are several templates called Device. The Device template for the class /Devices simply collects sysUpTime. The template called Device for /Devices/Server collects a number of parameters supported by the net-snmp MIB. The template called Device for /Devices/Server/Windows collects various MIB values from the Informant MIB.

For each template name Zenoss searches first the device itself and then up the Device Class hierarchy looking for a template with that name. Zenoss uses the first template that it finds with the correct name, ignoring others with the same name that might exist further up the hierarchy.
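The search order just described (device first, then up the class hierarchy, first match wins) can be sketched as below. This is an illustrative model only, not Zenoss code: the dictionary CLASS_TEMPLATES, the function find_template, the example class paths below /Devices/Server and the template descriptions are assumptions based on the three Device templates described above.

```python
# Hypothetical data: class path -> {template name: what it collects}.
CLASS_TEMPLATES = {
    "/Devices": {"Device": "sysUpTime only"},
    "/Devices/Server": {"Device": "net-snmp MIB values"},
    "/Devices/Server/Windows": {"Device": "Informant MIB values"},
}

def find_template(name, device_class, local_templates=None):
    """Return the first template called `name`, searching the device's own
    local copies first and then each class from most to least specific."""
    if local_templates and name in local_templates:
        return local_templates[name]
    path = device_class
    while path:
        if name in CLASS_TEMPLATES.get(path, {}):
            return CLASS_TEMPLATES[path][name]
        path = path.rsplit("/", 1)[0]   # move one level up the hierarchy
    return None

print(find_template("Device", "/Devices/Server/Windows"))
print(find_template("Device", "/Devices/Server/Linux"))
print(find_template("Device", "/Devices/Network/Router"))
print(find_template("Device", "/Devices/Server", {"Device": "local copy"}))
```

Because the first match wins, a Windows server never sees the /Devices/Server template of the same name, and a device with a local copy never sees any class-level template of that name.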

So, the zenperfsnmp daemon will collect net-snmp MIB information for Unix / Linux servers and will collect Informant MIB information for Windows servers (as /Devices/Server/Windows is more specific than /Devices/Server). Any actual device can have a local copy of a template and change parameters to suit that specific device.

Template bindings can either be modified by changing the zProperties zDeviceTemplates field or there is a Bind Templates menu dropdown from the templates display of any device. (Do remember that, for a device, both the Templates menu and the zProperties menu are off the More dropdown submenu).

Figure 118: Zenoss Bind Templates menu

Be aware that when selecting templates to bind, you need to select all the templates you want bound (use the Ctrl key to select multiples).

So, what do these templates actually provide? Templates contain three types of sub-objects:

• Data sources: what data to collect and method to use, eg. MIB OID
• Thresholds: expected bounds for data and events to raise if breached
• Graph definitions: how to graph the data points

Figure 119: Zenoss Device template for /Devices/Server

Zenoss provides two built-in types of Data Sources, SNMP and COMMAND. Other types can be provided through ZenPacks. Clicking on the Data Source displays details which can then be modified. Typically an SNMP Data Source will provide a single Data Point (a MIB OID value). Typically the name of the data point will be the same as the name of the data source. This means that when you come to select threshold values or values to graph, you will be selecting names like ssCpuRawWait_ssCpuRaw_wait.

Figure 120: Zenoss Data Source memAvailReal

Note that there is a useful Test button to check your OID against a node that Zenoss knows about. However, beware that this Test button appears to use snmpwalk under the covers so if a MIB OID has multiple instances then the snmpwalk will return values successfully. When zenperfsnmp actually collects data, it requires the correct instance as well as the correct MIB OID. If your test is successful but you subsequently see empty graphs with a message of Missing RRD file then the problem is likely to be that the MIB instance is incorrect.

Data sources can be added or deleted with the dropdown Add DataSource and Delete DataSource menus.

Thresholds can be applied to any of the data points collected, along with events to generate if the threshold is breached.

Figure 121: Zenoss Threshold on CPU collected data

All of the data points defined in the data sources section are supplied in the top selection box. If an event is to be generated, dropdowns are provided to select the event class and severity. You can also specify an escalation count.

Thresholds can be added or deleted from the Thresholds dropdown menu.
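The basic idea of a threshold (a data point, expected bounds, and an event class and severity to use when the bounds are breached) can be sketched as follows. This is a simplified illustration rather than the Zenoss threshold implementation; it ignores the escalation count, and the names check_threshold, threshold_event, cpu_wait and /Perf/CPU are hypothetical.

```python
def check_threshold(value, minimum=None, maximum=None):
    """Return a short description of the breach, or None if within bounds."""
    if minimum is not None and value < minimum:
        return "below minimum %s" % minimum
    if maximum is not None and value > maximum:
        return "above maximum %s" % maximum
    return None

def threshold_event(datapoint, value, event_class, severity,
                    minimum=None, maximum=None):
    """Build an event dictionary if the value breaches the bounds."""
    breach = check_threshold(value, minimum, maximum)
    if breach is None:
        return None
    return {
        "eventClass": event_class,   # chosen from a dropdown in the GUI
        "severity": severity,        # likewise chosen per threshold
        "summary": "%s is %s (value %s)" % (datapoint, breach, value),
    }

print(threshold_event("cpu_wait", 97, "/Perf/CPU", "Warning", maximum=90))
print(threshold_event("cpu_wait", 12, "/Perf/CPU", "Warning", maximum=90))
```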

Figure 122: Zenoss Dropdown menu for data thresholds

Note that this dropdown menu (as is also true of the Data Sources dropdown) has an option to Add to Graphs.

Graphs can be defined for a wide combination of the collected data points and thresholds. The menu panels are basically a front end to the RRD graphing tool and, with lots of samples provided, you don't need to get into the details of RRDTool; however if you wish to, there is plenty of scope to do so.

Graphs can be added, deleted or resequenced using the dropdown. Existing graphs are modified by clicking on the graph name.

Figure 123: Zenoss Performance template graph definition

Note that graphs can display both data points and thresholds.

All graphs are stored, by default, under /usr/local/zenoss/zenoss/perf/Devices. There is a subdirectory for each device. Component data rrd files are under the os subdirectory with further subdirectories for filesystems, interfaces and processes.

8.4.2 Displaying performance data graphs


To view performance graphs, the Operating System component graphs can be seen from the OS page of a device, by clicking on the relevant interface, filesystem or process. The rest of the performance graphs can be found under the Perf tab.

Figure 124: Zenoss Performance graphs for eth1 interface on bino

You can change the range of data with the Hourly dropdown (to daily, weekly, monthly or yearly). Data can be scrolled using the < > bars at either side and the + and - magnifiers can be used to zoom in / out. By default, all graphs on the page are linked (so that if you change the range on one, it changes for all). They can be decoupled with the Link Graphs? check box.

Here is a partial screenshot of the graphs for bino under the Perf tab.

Figure 125: Zenoss Performance graphs available under the Perf tab for bino

Note that the Reports left-hand menu also provides access to various reports, including performance reports.

Figure 126: Zenoss Reports menu

Following the Performance Reports link provides access to all performance reports for all devices.

Figure 127: Zenoss Performance Reports menu

8.5 Zenoss summary


Zenoss is an extremely comprehensive systems and network management product, satisfying most of my requirements. One feels that the object-oriented architecture is extremely flexible and powerful with most things you require already configured out of the box. The automatic discovery and topology mapping options are the most powerful of the products discussed here. It can accommodate Nagios and Cacti plugins and has its own add-on architecture in the form of ZenPacks.

Zenoss will use SNMP to gain status and performance information from a device but it also has ssh and telnet as alternatives, for those devices where SNMP is inappropriate.

The Quick Start Guide gets you running fast and the Admin Guide provides what it says: a reasonably comprehensive Administrator's Guide. There is also a book by Michael Badger, published June 2008, Zenoss Core Network and System Monitoring, which is well worth the investment (available both in paper and in electronic format). However, one feels that there is so much more in the detail of Zenoss that one needs to know and can find no information on!

My only real negative comment on Zenoss, other than the lack of detailed technical information, is that it is a rapidly evolving product and it feels rather buggy. The current (August 2008) poll on the zenoss-users forum for input to Zenoss 2.3 has many requesters with code reliability and better documentation at the top of their lists!

9 Comparison of Nagios, OpenNMS and Zenoss


Necessarily, comparisons are based on a mixture of fact and feeling and you need a clear definition of what features are important to your environment before comparisons can be valid for you.

Nagios is an older, more mature product. It evolved from the NetSaint project, emerging as Nagios in 2002. OpenNMS also dates back to 2002 but feels like the lead developer, Tarus Balog, has learned some lessons from observing Nagios. Zenoss is a more recent offering, evolving from an earlier project by developer Erik Dahl and emerging to the community as Zenoss around 2006.

All the products expect to use SNMP; OpenNMS and Zenoss use SNMP as the default monitoring protocol. They all provide other alternatives: Zenoss supports ssh and telnet along with customised ZenPacks; Nagios has NRPE and NSCA agents (both of which, of course, require installing on remote nodes); OpenNMS doesn't have much else to offer out of the box but it can support JMX and HTTP as well as having support for Nagios plugins.

All the products have some user management to define users, passwords and roles with customisation of what a user sees.

OpenNMS and Zenoss use RRDTool to hold and display performance data; Nagios doesn't really have a performance data capability, so Cacti might be a good companion product.

Most surprisingly, given that they all rely on SNMP, none of the products has an SNMP MIB Browser built in to assist with selecting MIBs for both status monitoring and performance data collection.

There are advocates for and against agentless monitoring. Personally, I don't believe in agentless. Once you have got past ping then you have to have some form of agent to do monitoring. The question is, should a management paradigm use an agent that is typically part of a box build (like ssh, SNMP or WMI for Windows), or should the management solution provide its own agent, like Nagios provides NRPE (and most of the commercial management products come with their own agents). If your management system wants its own agents, you then have the huge problem of how you deploy them, check they are running, upgrade them, etc, etc. OpenNMS and Zenoss have a strong dependency on SNMP although Zenoss also supports ssh and telnet monitoring, out of the box (if your environment permits these). SNMP may be old and Simple, but all three products support SNMP V3 (for those who are worried about the security of SNMP) and virtually everything has an SNMP agent available.

The other form of agentless monitoring basically comes down to port sniffing for services. Whilst this can work fine for smaller installations, the n-squared nature of lots of devices and lots of services doesn't scale too well. All three products do port sniffing so it comes down to how easy it is to configure economic monitoring.
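The scaling concern with port sniffing is simple arithmetic: the number of checks grows with the product of devices and services. The sketch below is only a back-of-envelope calculation; the device and service counts are invented for illustration, and the 60 second default is borrowed from the Zenoss status polling interval quoted earlier in this paper.

```python
def checks_per_hour(devices, services_per_device, interval_seconds=60):
    """Number of port checks per hour if every service is polled each interval."""
    polls_per_hour = 3600 // interval_seconds
    return devices * services_per_device * polls_per_hour

# A small site versus a large one (both sets of figures are invented examples).
print(checks_per_hour(50, 5))
print(checks_per_hour(2000, 20))
```

Growing both the device count and the services per device multiplies the load, which is why being able to restrict monitoring to the services that matter is so important.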

9.1 Feature comparisons


The following tables start with my requirements definition and compare the three products on a feature-by-feature basis. (OOTB = Out Of The Box).

9.1.1 Discovery
Feature | Nagios | OpenNMS | Zenoss
Node discovery | Config file for each node | Config file with include / exclude ranges | GUI, CLI and batch import from text or XML file
Automatic discovery | No | Yes, nodes within configured n/w ranges | Yes, networks & nodes
Interface discovery | Possible through config file | Yes, including switch ports | Yes, including switch ports
Discover nodes that don't support ping | Yes, use check_ifstatus plugin | Yes, send_event.pl | Yes, use SNMP, ssh or telnet
SQL Database | No | PostgreSQL | mySQL & Zope ZEO
Service (port) discovery | Yes, use plugin (TCP, UDP, ....) | Yes, various out of the box | Yes, TCP and UDP
Application discovery | Yes, define service | Not without extra agent eg. NRPE | Yes, with ssh, zenPacks or plugins
Supports NRPE / NSClient | Yes | Yes | Possible
SNMP support | V1, 2 & 3 | V1, 2 & 3 | V1, 2 & 3
L3 topology map | Yes | No | Yes, up to 4 hops
L2 topology map | No | No | No (but maybe in plan!)

9.1.2 Availability monitoring


Feature | Nagios | OpenNMS | Zenoss
Ping status monitoring | Yes | Yes | Yes
Alternatives to ping status | Yes, any plugin eg. check_ifstatus | Nagios plugins | Yes, ssh, telnet, ZenPacks, Nagios plugins
Port sniffing | Yes | Yes | Yes
Process monitoring | Yes, with plugins | Nagios plugins | Yes, Host Resources MIB
Agent technology | Generally relies on Nagios plugins deployed | SNMP out of the box; customised plugins possible | SNMP, ssh client, WMI for Windows, ZenPacks to be deployed
Availability reports | Yes | Yes | Yes

9.1.3 Problem management


Feature | Nagios | OpenNMS | Zenoss
Configurable event console | No | Yes | Yes
Severity customisation | Yes | Yes | Yes
Event configuration | No | Flexible. Lots OOTB | Flexible. Lots OOTB
SNMP TRAP handling | No | Flexible. Lots OOTB | Flexible. Lots OOTB
email / pager notifications | Yes | Yes, with configurable escalation | Yes
Automation |  | auto actions on events; good news / bad news correlation on alarms and notifications | auto actions on events; good news / bad news correlation on events and notifications
De-duplication | No automatic repeat count mechanism but events do not continue to be raised for existing problems | Yes | Yes
Service / host dependencies | Yes |  | No
Root cause analysis | UNREACHABLE status for devices behind network single point of failure. Also, host / service dependencies. | Outages / Path outages | No

9.1.4 Performance management


Feature | Nagios | OpenNMS | Zenoss
Collect performance data using SNMP | No | Yes | Yes
Collect performance data using other methods | No | NSClient, JMX, HTTP | ssh, telnet, other methods using ZenPacks
Threshold performance data | No | Yes | Yes
Graph performance data | No | Yes, lots provided OOTB | Yes, lots provided OOTB
MIB compiler | No | No | Yes
MIB Browser | No | No | No (though a MIB Browser ZenPack is said to be available for 2.2)

9.2 Product high points and low points


This section is far more subjective; your mileage may vary!

9.2.1 Nagios goodies and baddies


Good points:
• Good, stable code for systems management
• Good correlation between service events and host events
• Command to check validity of config files
• Command to reload config files without disrupting Nagios operation
• Good documentation

Bad points:
• No auto-discovery
• Weak event console
• No OOTB collection or thresholding of performance data
• No easy way to receive and interpret SNMP TRAPs
• No MIB compiler or browser

9.2.2 OpenNMS goodies and baddies


Good points:
• Good OOTB functionality
• Code feels solid
• Clean, standard configuration through well-organised xml files
• Single database (PostgreSQL)
• LOTS of trap customisation OOTB
• Ability to do some configuration through web Admin menu
• Easy import of TRAP MIBs (mib2opennms)
• Chargeable support available from The OpenNMS Group
• Supports Nagios plugins
• Some good How-to documents for basic configuration on the wiki

Bad points:
• Written in Java; log files hopeless! Difficult to get individual daemon status
• No map (that works reasonably)
• GUI is wordy; difficult for the eye to focus on the important things
• Need to bounce entire OpenNMS when almost any config file is changed
• Event / alarm / notification architecture is currently a mess (under review)
• No way to change colours of events
• No MIB compiler or browser
• No pdf documentation. Wiki: hard to find detailed information.
• Lots of things undocumented when you get down to details.

9.2.3 Zenoss goodies and baddies


Good points:
• Good OOTB functionality
• Architecture good, based around object-oriented CMDB database
• Topology map (up to 4 hops)
• Lots of plugins & zenPacks available
• email notifications include URL links back to Zenoss
• Commercial version available
• Good Quick Start manual, Administrators manual and book
• Supports Nagios & Cacti plugins

Bad points:
• No correlation between service events and host events
• Implementation feels buggy
• No MIB browser
• No way to change colours of events
• Commercial version available
• Lots of things undocumented when you get down to details

9.3 Conclusions
What to choose? Back to your requirements!

For smallish systems management environments, Nagios is well tested and reliable with a huge community behind it. For anything more than simple ping checks plus SNMP checks, bear in mind that you may need a way to install remote plugins on target hosts. Notifications are fairly easy to set up but if you need to produce analysis on your event log then Nagios may not be the best choice.

OpenNMS and Zenoss are both extremely competent products covering automatic discovery, availability monitoring, problem management and performance management and reporting. Zenoss has some topology mapping and has better documentation but the code feels less reliable. OpenNMS currently has a rather messy architecture around events, alarms and notifications, though this is said to be under review. I also struggle to believe that you have to recycle the whole of OpenNMS if you have changed a configuration file! The code feels very stable though.

My choice, hoping fervently that code reliability and documentation improve, is Zenoss.

10 References
1. itSMF Pocket Guide: IT Service Management, a Companion to ITIL, IT Service Management Forum
2. Multi Router Traffic Grapher (MRTG) by Tobi Oetiker, http://oss.oetiker.ch/mrtg/
3. RRDtool, high performance data logging and graphing system for time series data, http://oss.oetiker.ch/rrdtool/
4. netdisco, network management application, http://www.netdisco.org/
5. The Dude network monitor by MikroTik, http://www.mikrotik.com/thedude.php
6. nagios, host, service and network monitoring program, http://www.nagios.org/
7. Zenoss, network, systems and application monitoring, http://www.zenoss.com/
8. OpenNMS, distributed network and systems management platform, http://www.opennms.org/
9. cacti, network graphing solution, http://www.cacti.net/
10. SNMP Requests For Comment (RFCs), http://www.ietf.org/rfc.html
11. V1 RFCs 1155, 1157, 1212, 1213, 1215
12. V2 RFCs 2578, 2579, 2580, 3416, 3417, 3418
13. V3 RFCs 2578-2580, 3416-18, 3411, 3412, 3413, 3414, 3415
14. SNMP Host Resources MIB, RFCs 1514 and 2790, http://www.ietf.org/rfc.html
15. PHP scripting language, http://www.php.net/
16. Zenoss Core Network and System Monitoring by Michael Badger, published by PACKT Publishing, June 2008, ISBN 9781847194282.

11 Appendix A Cacti installation details


Cacti 0.8.6j-64.4 was installed on an OpenSuSE 10.3 Linux system. Prerequisites are:

- A web server (Apache 2.2.4-70)
- PHP (5.2.5-8.1)
- RRDTool (1.2.23-47)
- net-snmp (5.4.1-19)
- MySQL (5.0.45-22)
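Before starting the installation it can be worth confirming that the prerequisites are actually present. The following is a minimal sketch only: the function name missing_tools is hypothetical, and the command names in the example (apache2ctl, php, rrdtool, snmpget, mysql) are assumptions about what each package puts on the PATH. It checks for presence only and does not compare version numbers.

```shell
#!/bin/sh
# Hypothetical helper: print each named command that is NOT found on the PATH.
# Returns 0 when every command was found, 1 otherwise.
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            rc=1
        fi
    done
    return "$rc"
}

# Example call (command names are assumptions, adjust to your distribution):
#   missing_tools apache2ctl php rrdtool snmpget mysql
```

No output means that every command was found; any name printed points at a package that probably still needs to be installed.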

Cacti, as well as all of the prerequisites, was available on the OpenSuSE 10.3 standard distribution DVD.

Use the "Installation under Unix" instructions available from http://www.cacti.net/downloads/docs/html/install_unix.html. A few modifications were required, such as:

- No PHP5 configuration was done, as the files documented in the installation guide did not exist
- Configuration of Apache2 required no modifications in /etc/apache2/conf.d/php5.conf
- Cacti was installed using the standard SuSE YaST mechanism
- Create the MySQL database by:

      cd /usr/share/cacti
      mysql --user=root -p    (and supply the root password when prompted)
      create database cacti;
      use cacti;              (selects the new database so that cacti.sql loads into it)
      source cacti.sql;
      GRANT ALL ON cacti.* TO cactiuser@localhost IDENTIFIED BY 'cacti';

  (Note that cacti in the above command is the password for the user cactiuser)
- You need to manually create the Operating System user cactiuser with password cacti
- When pointing your web browser at http://<yourserver>/cacti/ ensure that you include the trailing slash. Use a web logon of admin, password admin.
- Ensure that apache2 and mysql are either started manually (/etc/init.d/<name> start) or started automatically at system start using chkconfig
- Ensure that the cactiuser userid can execute the /usr/share/cacti/poller.php script that is run by /etc/crontab. Also ensure that the directory that the RRD data is written to (/var/lib/cacti) is writeable by this user.
- cacti.log is in /var/log/cacti
- I found (through /var/log/messages) that poller.php was being run twice, once in /etc/crontab as cactiuser and once in /etc/cron.d/cacti as user wwwrun. Comment out the line in /etc/cron.d/cacti and check again that cactiuser can write to the data files in /var/lib/cacti.


The initial console page is a good starting point to add devices to monitor and associated graphs.
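The poller checks in the list above (access for cactiuser, and the poller being scheduled twice) can be sketched as a couple of shell functions. This is an illustration under stated assumptions, not part of the Cacti distribution: the function names are hypothetical, the default paths are the ones quoted in this appendix, and the access check tests that the script is readable, on the assumption that cron runs it through the php interpreter.

```shell
#!/bin/sh
# Hypothetical post-install checks for the Cacti setup described above.
# Run them as the user that /etc/crontab uses for the poller (cactiuser here).

# Report whether the poller script is readable and the RRD directory is
# writeable by the current user. Returns 0 only when both checks pass.
check_cacti_access() {
    poller=${1:-/usr/share/cacti/poller.php}
    rrd_dir=${2:-/var/lib/cacti}
    rc=0
    if [ ! -r "$poller" ]; then
        echo "cannot read $poller"
        rc=1
    fi
    if [ ! -d "$rrd_dir" ] || [ ! -w "$rrd_dir" ]; then
        echo "cannot write to $rrd_dir"
        rc=1
    fi
    return "$rc"
}

# Print every active (uncommented) cron line that mentions poller.php,
# prefixed with the file it came from. More than one line of output
# suggests that the poller is scheduled twice.
find_poller_jobs() {
    [ "$#" -gt 0 ] || set -- /etc/crontab /etc/cron.d/cacti
    for f in "$@"; do
        [ -r "$f" ] || continue
        grep -v '^[[:space:]]*#' "$f" | grep 'poller\.php' |
            while IFS= read -r line; do
                printf '%s: %s\n' "$f" "$line"
            done
    done
}
```

Called with no arguments, both functions fall back to the paths used in this appendix. Passing the paths explicitly keeps the checks usable on systems where Cacti was installed somewhere else.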

About the author


Jane Curry has been a network and systems management technical consultant and trainer for 20 years. During her 11 years working for IBM she fulfilled both pre-sales and consultancy roles spanning the full range of IBM's SystemView products prior to 1996 and then, when IBM bought Tivoli, she specialised in the systems management products of Distributed Monitoring & IBM Tivoli Monitoring (ITM), the network management product, Tivoli NetView, and the problem management product Tivoli Enterprise Console (TEC). All these products are based around the Tivoli Framework product and architecture.

Since 1997 Jane has been an independent businesswoman working with many companies, both large and small, commercial and public sector, delivering Tivoli consultancy and training. Over the last 5 years her work has been more involved with Open Source offerings.

