
ArcSight System Monitoring Content

Created and Presented by

Balahasan V. | SIEM Engineer


Contents
ArcSight System Monitoring Content
Useful System Monitor DataMonitors
    Active List
    Session List
    Smart Connector
    Asset
    Resource Memory Usage
    Task Scheduler
    Database Transactions
    Active Channel
    Data Monitor
    License
    Error Log
    Task Scheduler
    Rules
    User
    Report
ESM Resource Audit Events
    General Resource Updates
    Smart Connector
    User Authentication
    Actor
    Archive
    Active Channel
    License
    Content Management
    Group Management
    Manager and Database
    Notification
    Pattern Discovery
    Query Viewer
    Report
    Trend
    Resource Quota
    Rule
    Scheduler
    Session List
    User Login
ESM Health Monitoring Example Scenarios
    ESM Manager
    Database
    Logger
    Connector Appliance
    Connector

Useful System Monitor DataMonitors
Choose your Data Monitor Type as System Monitor Attribute and provide one of the possible Attribute
Names listed below.
The same attributes serve both monitoring and troubleshooting.
Active List
Monitor Type : ActiveListMonitor
Attribute Name : ActiveCacheInformation

Session List
Monitor Type : SessionListMonitor
Attribute Name : SessionCacheInformation

Smart Connector
Monitor Type : AgentStateTracker
Attribute Name : HeartbeatState
Attribute Name : AgentStatuses
Attribute Name : AgentsFilter
Attribute Name : ManagerStatistics
Attribute Name : ManagerThroughputs
Monitor Type : SeededJsseListener
Attribute Name : OngoingSessions

Asset
Resource Memory Usage
Monitor Type : CapsManager
Attribute Name : MemoryUsageInfo
Attribute Name : MemoryLimit

Monitor Type : HostSystemInfo
Attribute Name : CPUStatisticsPercent

Monitor Type : QueryHistoryManager
Attribute Name : FinishedQueries

Task Scheduler
Monitor Type : CheckScheduler
Attribute Name : ScheduledChecks

Database Transactions
Monitor Type : DBSecurityEventBroker
Attribute Name : SideObjectCacheStatistics
Attribute Name : SideObjectManagerStatistics
Attribute Name : SideObjectFloodStatistics
Attribute Name : SideObjectPerAgentStats

Monitor Type : DatabaseInfoBroker
Attribute Name : DatabaseFreeSpaceSummary
Attribute Name : EventTableCompressionInfo

Active Channel
Monitor Type : DynaChannelImplRegistry
Attribute Name : ChannelsByClient

Monitor Type : DynaChannelImpl
Attribute Name : ChannelInformation

Data Monitor
Monitor Type : FilterOptimizedXCPUDMPC
Attribute Name : ProbeStats
Attribute Name : ProbeTypeStats
License
Monitor Type : LicenseInfo
Attribute Name : LicenseInfoSummary

Error Log

Task Scheduler
Monitor Type : Scheduler
Attribute Name : TaskQueue
Rules
Monitor Type : RulesEngine
Attribute Name : LoadedRules

User

Report
ESM Resource Audit Events
General Resource Updates

Smart Connector
User Authentication
Actor

Archive
Active Channel
License

Content Management
Group Management

Manager and Database


Notification

Pattern Discovery
Query Viewer

Report

Trend
Resource Quota

Rule
Scheduler
Session List

User Login
ESM Health Monitoring Example Scenarios
ESM Manager
‘Event Throughput’ Dashboard Check
• Regularly compare the ‘current’ event rates (EPS/EPD) with what the architecture was ‘originally sized’ for.
• If the customer has outgrown the architecture, make recommendations in the Architecture Review section of the Health Check Report.
• Check for any network latency issues between your Agents and the Manager, and update your Manager/Agent settings to avoid caching and dropped events.
• Enable load balancing in the Agent setup to avoid performance conflicts and data loss.

‘Current Event Sources’ Check


• Are there any ‘Unknown’ Vendor/Products listed?
o If yes, there may be a parsing problem to investigate.
o Upgrade to a newer version or install updates (.aup) to reduce unparsed events.
o Are the ‘Unknown’ Vendor/Products devices of no value that should be excluded?
• Which Vendor/Products have the highest EPS?
o This helps to prioritize which device types should be tuned first.
o Apply a Filter on the Agent to remove noise and other unwanted events.
o Use the information to work on new Use Cases.

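The two questions above can be sketched in a few lines of Python. This is only an illustration: the input format (a list of vendor, product, EPS tuples) and the function name summarize_sources are assumptions, not an ArcSight export format or API.

```python
from collections import defaultdict


def summarize_sources(samples):
    """Rank Vendor/Product pairs by total EPS and list the 'Unknown' ones.

    samples: iterable of (vendor, product, eps) tuples. This layout is a
    hypothetical export format chosen for the example.
    """
    totals = defaultdict(float)
    for vendor, product, eps in samples:
        # Treat empty vendor or product fields as 'Unknown'.
        totals[(vendor or "Unknown", product or "Unknown")] += eps
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    unknown = [key for key, _ in ranked if "Unknown" in key]
    return ranked, unknown


# Made-up sample data, for illustration only.
samples = [
    ("Cisco", "PIX", 420.0),
    ("Microsoft", "Windows", 910.0),
    ("", "", 35.0),
    ("Cisco", "PIX", 80.0),
]
ranked, unknown = summarize_sources(samples)
for (vendor, product), eps in ranked:
    print(f"{vendor}/{product}: {eps:.0f} EPS")
print("Investigate parsing for:", unknown)
```

The ranked list suggests which device types to tune first, and the unknown list points at candidates for a parsing investigation or exclusion.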
Hardware and OS Check
• Are there sufficient CPU cores and memory?
• Is there sufficient disk space for Archives?
• Does the Operating System require any patching/upgrade?

CPU and Memory Utilization Check


• Check for high CPU and Memory utilization
o Linux/Unix: Execute top
o Windows: Task Manager or Performance Monitor
• If the utilization is high, is it ArcSight or a third-party process that’s causing it?

ESM Manager JVM Utilization Check


• Review the ‘ESM System Information’ Dashboard
• Review .../manager/logs/default/server.std.log to determine the frequency of Full GCs
o Healthy = a Full GC about once an hour or less often
o Unhealthy = a Full GC every 5 to 10 minutes or more often
o Determine how long each Full GC takes to complete
• Review CAPS Manager and the Rules Status Dashboard to determine which resources consume the most memory
• Configure the Manager's JVM heap size to twice the average heap usage
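As a rough sketch of the Full GC rule of thumb and the heap sizing guideline above, assuming the Full GC timestamps and heap usage samples have already been extracted (the log format is not covered here, and the function names and the 'borderline' label are illustrative assumptions):

```python
from datetime import datetime, timedelta


def classify_full_gc(timestamps):
    """Classify Full GC frequency from a sorted list of datetimes.

    Thresholds follow the checklist: an average gap of one hour or more
    is healthy, ten minutes or less is unhealthy. 'borderline' is a
    label added for this sketch.
    """
    if len(timestamps) < 2:
        return "insufficient data"
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    average_gap = sum(gaps, timedelta()) / len(gaps)
    if average_gap >= timedelta(hours=1):
        return "healthy"
    if average_gap <= timedelta(minutes=10):
        return "unhealthy"
    return "borderline"


def suggested_heap_mb(heap_usage_samples_mb):
    """Return twice the average heap usage, per the sizing guideline."""
    return 2 * sum(heap_usage_samples_mb) / len(heap_usage_samples_mb)


# Made-up timestamps: one Full GC every 5 minutes.
start = datetime(2024, 1, 1)
frequent = [start + timedelta(minutes=5 * i) for i in range(4)]
print(classify_full_gc(frequent))
print(suggested_heap_mb([1000, 2000, 3000]))
```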
Data Monitor Utilization Check
• Review the Data Monitor section of Caps Manager to reveal which Data Monitors are consuming the most memory
• Disable all unused default Data Monitors, and use ACLs to deploy Data Monitors only to the users who need them.
• Tune Data Monitors that are used in Use Cases
o Avoid using generalized Filters in Data Monitors (too many matches).
o If possible, adjust the number of buckets (samples) and the seconds for each bucket (sample size) to reduce memory utilization.
• Additional details for each Data Monitor can be found in the ProbeStats section of FilterOptimizedXCPUDMPC

Active List/Session List Utilization Check


• Review the Active Lists section of Caps Manager to reveal which Active Lists are consuming
the most memory
o If an Active List is only used to look up a value after a Rule, consider changing the
Active List to Partially Loaded to reduce memory consumption.
• Review the ActiveCacheInformation/SessionCacheInformation section of Active List Monitor
o Fix Active/Session Lists that are at or near 100% capacity.
o The Queries and Changes per Second columns may help determine how heavily the
Active/Session Lists are used.
Rules Engine Check
• Review the ‘Rules Status’ Dashboard
o Tune or disable Rules with excessive partial matches (these can cause high memory utilization in the Manager’s JVM)
o Fix Rules with errors, loops, and auto-disabled Rules, or other areas of concern.
o Review the ‘Top Firing Rules’ Data Monitor for excessive Rule fires and tune or disable as needed.
o Fine-tune Rules with Aggregation, Complex Conditions and Rule Trigger Action Methods.
• Tip: Utility Rules (Example: Rules used to update an Active List) should be configured as Lightweight Rules to prevent unnecessary Rule fires (High aggregation).

Event Persistence Performance Check


• Review .../manager/logs/default/server.std.log or LogFu for event persistence performance.
o Benchmark = 1 event in 1 ms
• Event Insertion performance can be negatively impacted by poorly written content, network
latency to the Database, or Disk I/O contention on the SAN attached to the Database.

Error Check
• Review both .../manager/logs/default/server.std.log and server.log for ERROR and WARN messages
o tail -f server.log | grep -v INFO (exclude INFO messages)
o Review the MostRecentErrorLogRecords of LogManager for the most recent errors logged
o Utilize the ‘arcsight exceptions’ command:
<ARCSIGHT_HOME>/bin/arcsight exceptions -n <ARCSIGHT_HOME>/logs/default/*.log*
o Review the ‘System Events’ Active Channel for High and Very-High system events
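For a quick offline summary of a copied log file, the ERROR and WARN review can be sketched as below. The sample lines are made up for the example; the real server.log layout may differ, so the pattern only looks for the bare level keywords.

```python
import re
from collections import Counter

# Match the level keyword as a whole word anywhere in the line.
LEVEL_RE = re.compile(r"\b(ERROR|WARN)\b")


def count_levels(lines):
    """Count ERROR and WARN lines; all other lines are ignored."""
    counts = Counter()
    for line in lines:
        match = LEVEL_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical log lines, not copied from a real Manager log.
sample_lines = [
    "[2024-01-01 10:00:00][INFO ] Manager started",
    "[2024-01-01 10:00:05][ERROR] Database connection failed",
    "[2024-01-01 10:00:09][WARN ] Slow event insertion",
    "[2024-01-01 10:00:12][ERROR] Database connection failed",
]
counts = count_levels(sample_lines)
print(counts["ERROR"], "errors,", counts["WARN"], "warnings")
```

To run it against a real file, pass an open file object instead of the sample list, for example count_levels(open(path, encoding="utf-8", errors="replace")), where path is wherever the log was copied to.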
Scheduled Task Check
• Verify that scheduled tasks don’t conflict with each other
• Heavy Tasks should be scheduled during off hours
• Are there any failed jobs?

Agent and Console Threads Check


• Check the thread logs for performance and other parameters (activities).
Database
‘Database Performance Statistics’ Check
• Database Free Space
o If the Event Data Free Space is low (below 10% free), there are three ways to fix this situation:
• Increase the ‘online’ event storage size or Event Data Table and extend the database using the xts command.
• Reduce the ‘online’ retention period
• Reduce the event volume
o Check that scheduled Event Archiving and Partition Archival are keeping pace.
• If un-indexed columns are used in queries, expect slower query performance.
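The 10% free space threshold mentioned above can be expressed as a small check. The function name and the choice of gigabytes as the unit are assumptions made for the sketch; the figures would come from the DatabaseFreeSpaceSummary attribute or the database itself.

```python
def free_space_status(free_gb, total_gb, low_threshold=0.10):
    """Return ('low' or 'ok', free ratio) for the Event Data space.

    low_threshold defaults to the 10% guideline from the checklist.
    """
    if total_gb <= 0:
        raise ValueError("total_gb must be positive")
    free_ratio = free_gb / total_gb
    status = "low" if free_ratio < low_threshold else "ok"
    return status, free_ratio


status, ratio = free_space_status(free_gb=5, total_gb=100)
print(f"Event Data free space: {ratio:.0%} ({status})")
```

A 'low' result is the trigger for one of the three remedies: extend the online storage, shorten the online retention period, or reduce the event volume.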

Reports and Trend Jobs Check


• Check the Reporting Subsystem Statistics Dashboard for report runtimes and queued reports, and fine-tune the longest-running report queries.
• Check the ‘Trends Status’ Dashboard
o Failed Trend runs and disabled Trends
o Trends that appear to take longer than others to complete
o Are problems caused by poorly written Trends or pre-existing Database performance problems?
ESM Database Storage Check
• Check for I/O contention
o Linux/Unix: Execute iostat and look for high I/O Waits
o Windows: Use Performance Monitor and check for high Disk Queue Length
• Is there sufficient free Disk Space to extend the ‘online’ database?
• Is there sufficient free Disk Space for the offline archives?

Logger

CPU, Memory, and EPS In/Out Check


• Check the monitor dashboards and Summary for high CPU and Memory utilization.
• Check that the EPS In/Out ratio on each Receiver and Device matches the architecture.
• Check the System Process Status.

Search Performance Check


• Check the runtime of a standard search query (check each type: text, field, regex, ...).
• Check the Peer Search, Search Options and Export performance.
• Configure and use the appropriate Field Sets and Storage/Device Groups in searches.

Custom Report Performance Check


• Check the runtime of a standard EPS report
• Note: Use only indexed fields in the WHERE and ORDER BY clauses, and limit the number of fields in the GROUP BY clause.

Receivers and Forwarders Check


• Check that the configured Receivers (watch for EPS 0) and Forwarders (watch for caching) are receiving and sending logs, by checking the Smart Connector Destination Settings.
• Check the Forwarder and Events Forwarded Filter configurations.
Storage Group Check
• Check each Storage Group's available space, configured Receivers and utilization, and reduce the event rate at the Agent level if needed.
• Check the configured Device Groups and Storage Rules to see whether any changes are required to load-balance the storage.

Index Configuration Check


• Check the indexing options and the fields used for indexing
• Check the Field Search Options for optimizing search performance
• Validate any custom Field Sets created to make searches quicker.

Configured Alerts Check


• Check the Alerts configured for health monitoring and storage errors.

Scheduled Task Check


• Verify that scheduled tasks don’t conflict with each other and disable unwanted tasks.
• Heavy tasks should be scheduled during off hours
• Are there any failed jobs, such as Backup & Archival or Reports? If so, investigate them.
Event Archive and Configuration Backup Check
• Check whether the Archive process is running regularly and Archive Events are generated.
• Check the Retention Policies, Storage Rules/Groups and Archive Schedules
• Check on a regular basis that the Configuration Backup is taken and that disk space is sufficient

Logger System Health and Audit Event Forwarding Check


• Perform the Logger maintenance activities regularly to improve performance.
• Check the System Health of the Logger in the Dashboard Summaries.
• Check that the Audit Events are being forwarded to ESM.

Network Configuration Check


• Check that the NTP settings are correct on every Logger and in sync with the time zone.
• Check that the network interface and duplex settings are working properly.
• Check the DNS settings, including Search Domains.
Connector Appliance
CPU and Memory Check
• Review the following for excessive utilization:
o CPU utilization continuously above 70-80% in the CA Dashboard
o EPS In details
o Unusual peaks or drops in the Monitor Dashboards
o The System Process Status
• At regular intervals, SSH to the Connector Appliance
o Run commands such as top, df and ifconfig to check at the OS level
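One way to turn "continuously above 70-80%" into a concrete test is to look for a run of consecutive samples over the threshold. This is a minimal sketch: the sampling interval, the run length of five, and the function name are assumptions, and the samples would have to be collected separately (for example from top output over time).

```python
def sustained_high_cpu(samples, threshold=70.0, min_consecutive=5):
    """Return True if at least min_consecutive samples in a row exceed threshold."""
    run = 0
    for value in samples:
        # Extend the current run, or reset it when a sample drops back.
        run = run + 1 if value > threshold else 0
        if run >= min_consecutive:
            return True
    return False


# Made-up CPU percentages sampled at a fixed interval.
print(sustained_high_cpu([50, 75, 80, 85, 90, 95]))
print(sustained_high_cpu([75, 80, 60, 85, 90, 95, 72]))
```

A short spike resets the run, so only sustained load is reported, which matches the intent of separating continuous utilization from unusual peaks.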

Connector Appliance Version Check


Are there any ‘known issues’ with the current version? If yes, maintain a separate tracker and
update it regularly.

Network Settings Check


• Common problems to check:
o Incorrect duplex settings on network interface
o DNS or NTP not configured properly
Configuration Backup Check
• The daily Configuration Backup job should be scheduled on all Connector Appliances.
• Perform the Audit Log Check to identify any modifications made during a given period of time.
• Use the Diagnostic Tools to check the CA base machine.

Connector
Up/Down Check and Version Check
• Check the Connector Status Dashboard for Connector up/down status and the latest Connector version.

Connector Event Rate Check


• Are there any Connectors receiving a high event rate?
o Syslog: >= 1,500 EPS
o WUC: > 500 to 1,000 EPS
o DB-based: >= 200 EPS
• Is the high EPS Connector stable?
o If not, recommend adding another Connector for load balancing
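The per-type thresholds above can be kept in a small lookup table. In this sketch the type keys ('syslog', 'wuc', 'db') and the function name are made up for illustration, and the WUC range is read as "anything above 500 EPS", which is an interpretation of the checklist.

```python
import operator

# (comparison, limit) per Connector type, taken from the checklist.
HIGH_EPS_RULES = {
    "syslog": (operator.ge, 1500),
    "wuc": (operator.gt, 500),   # assumption: high starts above 500 EPS
    "db": (operator.ge, 200),
}


def is_high_eps(connector_type, eps):
    """Return True if eps counts as a high event rate for this Connector type."""
    compare, limit = HIGH_EPS_RULES[connector_type.lower()]
    return compare(eps, limit)


# Hypothetical Connectors and rates.
for name, kind, eps in [("fw-syslog", "syslog", 1800), ("dc-wuc", "wuc", 450)]:
    label = "high" if is_high_eps(kind, eps) else "normal"
    print(f"{name}: {eps} EPS ({label})")
```

Connectors flagged as high are the ones to watch for stability and the first candidates for load balancing.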
Cache Check
• All Connectors should have 0 events in cache
o If most are ‘continuously’ caching = possible ESM ‘Event Insertion’ issue (check thread)
o If one or two are ‘continuously’ caching = possible Connector problem or network issue.
o You can clear the cache data in AgentHome/Agentdata/ of the Agent that is caching.
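The cache triage rule can be sketched as follows. The input is a hypothetical mapping of Connector name to cached event count, "most" is taken to mean more than half, and a single snapshot is used here even though the checklist speaks of continuous caching, so in practice the check should be repeated over time.

```python
def diagnose_caching(cache_sizes):
    """Suggest where to look based on how many Connectors are caching.

    cache_sizes: dict of Connector name -> events currently in cache
    (hypothetical input format for this sketch).
    """
    if not cache_sizes:
        return "no connectors"
    caching = sorted(name for name, count in cache_sizes.items() if count > 0)
    if not caching:
        return "healthy: no connector is caching"
    if len(caching) > len(cache_sizes) / 2:
        # Most Connectors caching points at the ESM side.
        return "possible ESM event insertion issue"
    # Only a few Connectors caching points at those Connectors or their network path.
    return "possible connector or network problem: " + ", ".join(caching)


print(diagnose_caching({"fw-1": 0, "fw-2": 12000, "wuc-1": 0, "db-1": 0}))
print(diagnose_caching({"fw-1": 900, "fw-2": 12000, "wuc-1": 300, "db-1": 0}))
```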

Logs Check
../current/logs/agent.out.wrapper.log:
• Java Heap Memory Utilization
o Memory utilization
o Frequency of Full GCs
o Memory in Red Zone alerts
• Unexpected Restarts
• Time zone errors
• Connectivity Errors
o End Devices
o ArcSight Destinations
o Certificate Errors
../current/logs/agent.log:
o Parsing errors
o DOSProtector
o WARN and ERROR messages
o Custom Override Details

Configuration Check
• Destination Settings
• Common problems found:
o No Networks and Customer configured
o Poor field-based Aggregation
o No Filter applied on high EPS Connectors
o Inconsistent settings
Reference:
ESM Administration Guide
ESM User Guide Audit Events

Tip: A lot of useful content can be found in the Community Forum, such as the document below:
https://protect724.hp.com/docs/DOC-1877
The next release will cover more on fine-tuning steps, advanced ESM System Content management, and
detailed troubleshooting steps with individual resource breakdowns.
