IBM QRadar Fundamentals Administration

IBM Security Systems
IBM QRadar Fundamental Administration

Faisal Rafiq
Security Analyst | Blue Teamer | Threat Hunter | Trainer
faysalrafeeq@gmail.com
© 2012 IBM Corporation

1 © 2013 IBM Corporation
What is Security Operations Center (SOC)

Definition
• A Security Operation Center (SOC) is a centralized function within an organization employing people, processes, and technology to continuously
monitor and improve an organization's security posture while preventing, detecting, analyzing, and responding to cybersecurity incidents.
• A SOC acts like the hub or central command post, taking in telemetry from across an organization's IT infrastructure, including its networks,
devices, appliances, and information stores, wherever those assets reside.
• Essentially, the SOC is the correlation point for every event logged within the organization that is being monitored.

Security Operations and Organizational Structure

Staffing and Structure
• The function of a security operations team and, frequently, of a security operations center (SOC), is to monitor, detect, investigate, and respond to
cyberthreats around the clock.
• Security operations teams are charged with monitoring and protecting many assets, such as intellectual property, personnel data, business
systems, and brand integrity.
• Security operations teams act as the central point of collaboration in coordinated efforts to monitor, assess, and defend against cyberattacks.
• SOCs have been typically built around a hub-and-spoke architecture.
• The SOC is usually led by a SOC manager, and may include incident responders, SOC Analysts (levels 1, 2 and 3), threat hunters and incident
response manager(s). The SOC reports to the CISO, who in turn reports to either the CIO or directly to the CEO.

Key functions performed by the SOC

SOC Functions
Protection of assets
What the SOC How the SOC

Protects Protects


SOC Functions
Preparation and
Preventative
Maintenance
Preventative
Preparation
Maintenance


SOC Functions
• Continuous Proactive • Alert Ranking and

Monitoring Management
SIEM/EDR Monitoring
Tools
Root Cause
Log Management
Investigation
Remediation What, When,
& Forensics How & Why
Security Refinement Compliance

and Improvement Management
Red, Blue & GDPR, HIPAA &

Purple teaming PCI DSS
• Threat Response • Recovery and
Remediation
Incident Restore/Backup
Response

IBM QRadar Architecture

All-in-One Deployment
QRadar All-in-one QRadar All-in-one
Hardware Specification Hardware Specification

Up to 5000 EPS & 200,000 FPM Up to 15,000 EPS & 300,000 FPM
Model No. 3105 Model No. 3128
User Interface
Log Events
The QRadar All-in-One appliance performs the following tasks:
• Collects event and network flow data, and then normalizes the data in to a data
format that QRadar can use.
NetFlows
• Analyzes and stores the data, and identifies security threats to the company.
• Provides access to the QRadar web application.
As your data sources grow, or your processing or storage needs increase, you can add
Hosts Apps appliances to expand your deployment.


QRadar Distributed Deployment
QRadar Distributed – Console
Console
Unlimited Flow & Event Processors
Model No. 3100
3 User Interface
Log Events Net Flows
Event Processor Flow Processor

Up to 80,000 EPS Up to 3.6 million FPM
Model No. 1600 Model No. 1700
2 Data compress & Encrypted
Event Collector Flow Collector
8 Log Sources © 2013 IBM Corporation


QRadar Add-on Components
QRadar Distributed – Console
Console
Unlimited Flow & Event Processors
Model No. 3100
App Host
128 GB RAM
User Interface
Log Events Net flows
Event Processor Flow Processor

Data Node Up to 3.6 million FPM
100 TB storage, 20 CPU Model No. 1700
128 GB RAM

QRadar Architecture Overview
QRadar Console

High-level Component Architecture & Data Store

High-level Component Architecture & Data Store

Flow Collector Architecture

Event Collector Architecture

Event Processor Architecture

Console Architecture

Three Core Layers Functionality of QRadar System
Data Data
Collection Processing
Data Searches

QRadar Events and Flows

Functions of Event Collector Component

Functions of Event Processing Component

Magistrate on the QRadar Console

Functions of QRadar Flows

High Availability Deployment

High Availability Deployment

Context and Correlation Drive Deepest Insight
Security Devices
Servers & Mainframes

Event Correlation True Offense
• Logs • IP Reputation
Network & Virtual Activity
• Flows • Geo Location
Data Activity Offense Identification

Activity Baselining & Anomaly • Credibility
Application Activity
Detection • Severity
• Relevance
• User Activity
Configuration Info • Database Activity
• Application Activity
Vulnerability & Threat • Network Activity
Users & Identities Suspected Incidents
Extensive Data Deep Exceptionally Accurate and

Sources + Intelligence = Actionable Insight

IBM QRadar Deplyment Overview
Single OR Multiple hosts deployments
• IBM QRadar architecture supports deployments of varying sizes and topologies.

• Single host deployment, where all the software components run on a single system.
• Multiple hosts, where appliances such as Event Collectors, and Flow Collectors, Data
Nodes, an App Host, Event Processors, and Flow Processors, have specific roles.
• QRadar deployment depend on the capacity of your chosen deployment to both
process and store all the data that you want to analyze in your network.
Before you plan your deployment, consider the following

questions:
• How does your company use the Internet? Do you upload as much as you
download? Increased usage can increase your exposure to potential security
issues.
• How many events per second (EPS) and flows per minute (FPM) do you need to
monitor?
• EPS and FPM license capacity requirements increase as a deployment grows.
• How much information do you need to store, and for how long?

An Integrated, unified architecture in a single console

Dashboard View


Offenses View


Log Activity View


Network Activity View


Assets View


Reports View


Admin View

Fully Integrated Security Intelligence
• Turn-key log management and reporting

Log
• SME to Enterprise
Management
• Upgradeable to enterprise SIEM
• Log, flow, vulnerability & identity correlation

SIEM • Sophisticated asset profiling
• Offense management and workflow
Configuration • Network security configuration monitoring

& Vulnerability • Vulnerability prioritization
Management • Predictive threat modeling & simulation
Network
• Network analytics
Activity &
• Behavioral anomaly detection
Anomaly
• Fully integrated in SIEM
Detection
Network and • Layer 7 application monitoring

Application • Content capture for deep insight & forensics
Visibility • Physical and virtual environments

Fully Integrated Security Intelligence
• Turn-key log management and reporting

Log
• SME to Enterprise
Management
• Upgradeable to enterprise SIEM
One Console Security
• Log, flow, vulnerability & identity correlation
SIEM • Sophisticated asset profiling
• Offense management and workflow
Configuration • Network security configuration monitoring

& Vulnerability • Vulnerability prioritization
Management • Predictive threat modeling & simulation
Network
• Network analytics
Activity &
• Behavioral anomaly detection
Anomaly
• Fully integrated in SIEM
Detection
Network and • Layer 7 application monitoring

Application
Visibility
Built on a Single Data Architecture
• Content capture for deep insight & forensics
• Physical and virtual environments

Windows Collection Options

Microsoft Security Event Log over MSRPC

WinCollect Managed Deployment
The managed WinCollect deployment has the following capabilities:

• Central management from the QRadar Console or managed host.
• Automatic local log source creation at the time of installation.
• Event storage to ensure that no events are dropped.
• Filters events by using XPath queries or exclusion filters.
• Supports virtual machine installations.
• Console can send software updates to remote WinCollect agents
without you reinstalling agents in your network.
• Forwards events on a set schedule (Store and Forward)
Note: In a managed deployment, WinCollect is designed to work with up to 500 Windows agents per Console and managed host. For example, if you have a deployment with a Console,
an Event Processor, and an Event Collector, each can support up to 500 Windows agents, for a total of 1,500. If you want to monitor more than 500 Windows agents per Console or managed
host, use the stand-alone WinCollect deployment.

WinCollect stand-alone Deployment
Stand-alone WinCollect mode has the following capabilities:

• You can configure each WinCollect agent by using
the WinCollect Configuration Console.
• You can update WinCollect software with the software update
installer.
• Event storage to ensure that no events are dropped.
• Filters events by using XPath queries or exclusion filters.
• Supports virtual machine installations.
• Send events to QRadar using TLS Syslog.
• Automatically create a local log source at the time of agent
installation.

Installation prerequisites for WinCollect
Local Collection:
The WinCollect agent collects events only for the host on which it is installed. You can use this collection method on
a Windows host that is busy or has limited resources, for example, domain controllers.
Important:
QRadar Support recommends local collection on Domain Controllers and other high EPS servers, as it is more stable
than remote collection. If you are remote polling logs on potentially high EPS servers, QRadar Support might require
you to install an agent locally on the server.

Remote collection for WinCollect agents
Remote Collection:
The WinCollect agent is installed on a single host and collects events from multiple Windows systems. Use remote
collection to easily scale the number of Windows log sources that you can monitor.

WinCollect agents remotely polling Windows event sources

WinCollect agents that remotely poll other Windows operating systems require extra ports. The following table describes the ports that are used.
Port Protocol Usage

135 TCP Microsoft Endpoint Mapper
137 UDP NetBIOS name service
138 UDP NetBIOS datagram service
139 TCP NetBIOS session service
445 TCP Microsoft Directory Services for file transfers that use Windows share
49152 – 65535 TCP Default dynamic port range for TCP/IP
Note: Exchange servers are configured for a port range of 6005 – 58321 by default.
Important: To limit the number of events that are sent to QRadar, administrators can use exclusion filters for an event based on the Event ID or Process.

Deploy Changes and Deploy Full Configuration
Examples of QRadar changes that require a Deploy Changes: Examples of QRadar changes that require Deploy Full Configuration:
• Adding or editing a new user or user role. • Adding or removing a host in the deployment editor that has an EC, EP, or MPC component.
• Adding or updating network hierarchy. • Adding, removing, or editing the values on an EC/EP component or offsite source or target
• Adding a new security profile. component in the deployment editor.
• Creating a new authorized service token. • Adding or updating a license that changes the EPS or FPM (flows per minute).
• Adding a centralized credential (security descriptor) • Enabling or disabling encryption (Tunneling) on a "managed host".
• Adding a new log source.
• Setting a password for another user.
• User changing their own password.
• Change a users' user role and/or security profile.

IBM QRadar License Management
Define features of license management:
• Viewing license details.

• Uploading a license key.
• Allocating a license key to a host.
• Deleting expired licenses.
• Exporting license information.

Viewing a license details

View the license details to review the following:
• Status
• Expiration Date
• Event and Flow rate limit
- Navigate to the Admin console and execute the System and License Management application
- Select a license from the Display list and then select Licenses

License details
View license status and expiration:

Status of QRadar Licenses
Unallocated Undeployed Deployed

• The license is uploaded but not allocated • The license is allocated to a QRadar host • The license is allocated and active in
to a QRadar host. but is not deployed. The license is not yet your deployment.
active in your deployment.
• The events per second (EPS) and flow • The EPS and FPM are included in the
per minute (FPM) of the license do not • The EPS and FPM are included in the license pool.
contribute to the license pool. license pool.

When you uploading and assigning license keys
• To update and expired QRadar license.

• To increses the event per second (EPS) or flow per minute (FPM) limits.
• To add a QRadar product to your deployment, like QVM or QRM etc.
• In QRadar 7.3.0 or above, you do not need to upload a new license when you add an Event or Flow Processor to your
deployment.
• Event and Flow Processors are automatically assigned a perpetual appliance license, and you can allocate EPS or FPM capacity
from the license pool to the appliance.
Note:
• License keys determine your entitlements to QRadar products and feature, and the system's capacity for handling event and flows.
• You must upload a license key to your QRadar console when you have to update expired licenses.
• A License upload is also required when you want to increases the event and flow rate capacities for your deployed collectors,
processors, or console.

Upload and allocate a license process

Example license key file

Allocating a license key to a managed host
Prerequisite: • Allocating a license key to a managed • Multiple licenses can be allocated to a

host when you want to: QRadar Console
• You must upload a license key before
allocating it. • Replace an existing license • For example, you can allocate
• Add new QRadar products to the license keys that add QVM and QRM
system to your QRadar console.
• Increase the event or flow capacity in
the shared license pool

Allocating a license key to a managed host

The following steps illustrate the process to allocate a license to a managed host

When to delete an expired license
When you mistakenly allocated a license to a wrong managed host

When you want to stop the daily system license expiration notifications

How to restore a deleted license
To restore a deleted license:
1. Uploaded the same license

2. Allocate the license to a host
3. Deploy the full configuration
• You cannot revert a license key allocation after you have assigned it to a managed host.
• If you have mistakenly allocated a license to the wrong host, you must deploy the change, and then delete the license from the system.
• After the license is deleted, you can upload the license again, and then reallocate it.
• After the license has bee allocated to the correct host, deploy the changes again.

Exporting license information

Explain Core Services of IBM QRadar
Service Purpose Runs on Impact

Runs on each appliance in a deployment. Runs Hostcontext is the manager for all the other
the "ProcessManager" component that is services except ecs-ingress. All services
responsible for starting, stopping and verifying controlled by hostcontext would be inactive till
status for each component within the deployment. they restarted.
It is responsible for the packaging (console) and
the download/apply (MH) of our DB replication
bundles. Responsible for requesting,
downloading, unpacking, and notifying other
Console
hostcontext components within an appliance of updated
Managed Hosts
configuration files. Responsible for monitoring
postgresql transactions and restarting any
process that exceeds the pre-determined time
limit. This portion is referred to as "TxSentry".
Responsible for disk maintenance routines for
disk cleanup. Also responsible for starting tunnels,
ecs, accumulator, Ariel_proxy, Ariel_query, Qflow,
reporting, Asset_profiler.
Web container used to hold our UI and UI would not be available.
webservices/RPC calls.
tomcat Console


Runs as an on-going daemon. It keeps track of 2 The data base stops working as well as IMQ. This
other running processes, Message Queues (IMQ), also impacts Hostcontext and Tomcat.
which opens up communication ports between
QRadar Components and PostgreSQL.
Console
hostservices
Managed Hosts
Event Correlation Service - Event Collector This service when stopped would not allow events
Ingress: collects the events in a buffer while ecs- to collected in a buffer and spooled to the other
ec and ecs-ep are being restarted. Then it spools ecs services.
the data back to ecs-ec and ecs-ep
Console
ecs-ec-ingress
Managed Hosts


Event Correlation Service - Event Collector: Impacts Parsing and normalizing events and flows
parse, normalize and coalesce the events. would stop
Console
ecs-ec
Managed Hosts
Event Correlation Service - Event Processor:

correlates events (Custom Rule Engine), stores Impacts Correlation. parsing and event storage
events in the Ariel database and forwards events
matching rules within CRE to the Magistrate
component. Console
ecs-ep
Managed Hosts


Responsible for counting. In order to generate the Aggregate Data, Reports, Searches
time series graphs in the QRadar UI in a
reasonable time, the accumulator creates by-
minute, by-hour and by-day counts of data within
the system for quickly populating these time
series graphs.
Console
accumulator
Managed Hosts
Responsible for persisting the assets and identity Assets would not be added or updated until this
models into the database. service came back online.
asset_profiler Console


The ariel proxy server is responsible for proxying All requests to managed host for data from
search requests from different processes to the searches would stop until this service restarts.
various ariel query servers. Once the results are
returned from the query servers, ariel proxy can
transform and aggregate data into various
orderings and store into server-side cursors for
later processing and retrieval.
ariel_proxy_server Console Only
Ariel query server is responsible for reading the All searches of the Ariel Data base would stop on
ariel database on the Managed hosts and sends the managed hosts.
the data matching the request back to the proxy
server for further processing.
ariel_query_server Managed Hosts


Runs the scheduler for reporting. All running reports would be canceled and would
need to be restarted. New scheduled reports
would not run till this service starts.
reporting_executor Console
Provides the ability to create offense based on Impacts historical searches on Offense data.
historical data. E.g. Bulk loading data, one-time
rule testing, etc.
Console
historical_correlation_server
Managed Hosts

How to resolve disk space usage problems
Note: By default, the QRadar disk sentry check runs every 60 seconds and looks for high disk usage across all partitions. If the critical partition reaches 95%
capacity, it will stop the QRadar critical services.

Note: By default, the QRadar disk sentry check runs every 60 seconds and looks for high disk usage across the all partitions. If the critical partition reaches 95%
Important: Sometimes files are written to the mount points before other partitions are mounted, for example, if files are written
to /store, /store/transient, /store/backups but the partitions are not yet mounted, they will be written to the "/" partition under
the /nfs/backups, /store/transient, /store/backups directories.

Note: The first step in diagnosing the problem is determining which partition has the problem. Using the df -h command, you can get the output of the partitions. An
example output is seen below:

Disk usage notifications
Notification Description
Disk Sentry: Disk Usage exceeded warning threshold. Disk usage is at 90% for a monitored partition. QRadar is not
affected when the partition reaches this threshold. Continue to
monitor your partition levels.
Disk Sentry: Disk Usage exceeded max threshold. Disk usage is at 95% for a monitored partition. QRadar data
collection and search processes are shut down to protect the file
system from reaching 100%.
Disk sentry: System disk usage back to normal levels. After disk usage reaches a threshold of 95%, it must return to 92%
before QRadar automatically restarts data collection and search
processes.
Note: By default, the QRadar disk sentry check runs every 60 seconds and looks for high disk usage across the all partitions. If the critical partition reaches 95%
Important: Sometimes files are written to the mount points before other partitions are mounted, for example, if files are written
to /store, /store/transient, /store/backups but the partitions are not yet mounted, they will be written to the "/" partition under
the /nfs/backups, /store/transient, /store/backups directories.

Troubleshooting disk space usage problems

By default, the QRadar disk sentry check runs every 60 seconds and looks for high disk usage across the following partitions:
QRadar 7.3.x Critical Services Stop QRadar 7.2.x Critical Services Stop
/ Yes, at 95% / Yes, at 95%
/store Yes, at 95% /store Yes, at 95%
/transient Yes, at 95% /store/transient Yes, at 95%
/storetmp Yes, at 95% /store/tmp Yes, at 95%
/opt Yes, at 95% - -
/var No - -
/var/log No /var/log No
/var/log/audit No - -
/tmp No - -
/home No - -

Backup and Recovery
You can use two types of backups: configuration backups and data backups.
Configuration backups include the following components: Data backups include the following information:
• Application configuration • Audit log information

• Assets • Event data
• Custom logos • Flow data
• Custom rules • Report data
• Device Support Modules (DSMs) • Indexes
• Event categories
• Flow sources
• Flow and event searches
• Groups
• Index management information
• License key information
• Log sources
• Offenses
• Reference set elements
• Store and Forward schedules
• User and user roles information
• Vulnerability data (if IBM QRadar Vulnerability Manager is installed)
Note: By default, at midnight QRadar creates a daily backup archive of your configuration information. The backup archive includes configuration information, data, or both from the previous day. The
size of your backup will depend on the amount of event data from that day.
Important: Each managed host in your deployment, including the QRadar Console, creates and stores all backup files in the /store/backup/ directory. Your system might include
a /store/backup mount from an external SAN or NAS service. External services provide long term, offline retention of data, which is commonly required for compliancy regulations, such as PCI.

Using QRadar SIEM backup management
A particular backup management functions
On-
Back
demand
Archives
backup
Missing Nightly
backup backup
Import
Backup

QRadar Index Management
Key Points:
• Use Index Management to control database indexing on event and flow properties.
• Data indexes are built in real-time as data is streamed or are built upon request after data is collected.
• When a search starts in QRadar, the search engine first filters the data set by indexed properties.
• To improve the speed of searches in IBM® QRadar®.
• To narrow the overall data by adding an indexed field in your search query.
• An index is a set of items that specify information about data in a file and its location in the file system.
• Searching is more efficient because systems that use indexes don't have to read through every piece of
data to locate matches.
• The index contains references to unique terms in the data and their locations.
• Because indexes use disk space, storage space might be used to decrease search time.
Note: By default, QRadar stores full text indexing for the past 30 days.

Index Management
LAB Activities:
Enabling indexes
Enabling payload indexing to

optimize search times
Configuring the retention period for

payload indexes
Restriction: Payload indexing increases disk storage requirements and might affect system performance. Enable payload indexing if your deployment meets the following conditions:
• The event and flow processors are at less than 70% disk usage.
• The event and flow processors are less than 70% of the maximum events per second (EPS) or flows per interface (FPI) rating.
Note: Your virtual and physical appliances require a minimum of 24 GB of RAM to enable full payload indexing. However, 48 GB of RAM is suggested.
The minimum and suggested RAM values applies to all QRadar systems.

QRadar Data Retention
Retention Buckets
• QRadar collects event and flow data, which is physically stored in the Ariel database on the EP and FP
• Retention buckets handle data retention in QRadar
• As QRadar receives events and flows, each one is compared against the retention bucket filter criteria
• Matching offenses and flows & stored in a retention bucket until they reach the deletion policy time period
• By default, retention buckets have
o A retention period of 30 days
o A Sequenced priority
o A set of 10 buckets can be configured per shared or tenant
• A record is stored in the bucket that matches the filter criteria with highest priority
• If you don’t configure retention buckets for the tenant, the data goes into the default retention bucket for the tenant.

IBM QRadar Upgrade Plan – Health Check
Technical Part Is the system working correctly?
1. System error’s and notifications (console logs, System Notification log source)
2. Do we have dropped events / hit license limits
3. Disk, CPU and memory usage on managed hosts
4. Do we have enough storage
5. Do all log sources / flow sources / VA sources are sending data to QRadar
6. Do we get events that are not parsed correctly
7. Updates, Backups, Network Hierarchy, Server Discovery
Security Part Does the system complies with our security requirements?
1.System settings (e.g., log hashing, retention periods

2.Permission settings (and compare with external lists) Flow data
3.Verify audit logs
4.Data privacy settings
5.Authorized services, password ages
System is running according to the defined used cases and

Organizational Part procedures:
1.Are all required log sources configured
2.Are we getting the expected events (are sources configured correctly)
3.Do we generate the required reports and deliver them to the right persons
4.Are offenses handled correctly / timely
5.What is our false-positive rate
6.Are all modifications / tunings documented correctly
7.Console summary

Some Useful Scripts
Mini scripts to display Memory usage, Disk usage and CPU Load
Memory Usage:
free -m | awk 'NR==2{printf "Memory Usage: %s/%sMB (%.2f%%)\n", $3,$2,$3*100/$2 }’
Disk Usage:
df -h | awk '$NF=="/"{printf "Disk Usage: %d/%dGB (%s)\n", $3,$2,$5}’
CPU Load:
top -bn1 | grep load | awk '{printf "CPU Load: %.2f\n", $(NF-2)}'

Before you upgrade – IBM QRadar
Ensure that you take the following precautions:
• Back up your data before you begin any software upgrade and verify that you have recent configuration backups that match your existing Console version. If required,
take an on demand configuration backup before you begin. For more information about backup and recovery, see the IBM Security QRadar Administration Guide.
• HA appliances should have primaries in the online state and secondary as standby for their HA pair status.
• If you have off-board storage configured, see the QRadar Upgrade Guide. There are special instructions for administrators with /store using off-board storage.
• If you installed QRadar as a software installation using your own hardware, see the QRadar Upgrade Guide.
• All appliances in the deployment must be at the same software and patch level in the deployment.
• Verify that all changes are deployed on your appliances. The update cannot install on appliances that have undeployed changes.
• To avoid access errors in your log file, close all open QRadar user interface sessions.
• If you are unsure of the IP addresses or hostnames for the appliances in the deployment, run the /opt/qradar/support/deployment_info.sh utility to get a CSV file
with information about the QRadar deployment. The CSV file will contain a list of IP addresses for each managed host.
• If you are unsure of how to proceed when reading these instructions or the documentation, it is best to ask before starting your upgrade. To ask a question in our
forums, see: http://ibm.biz/qradarforums.

Monitor QRadar Error Notifications

Error notifications in QRadar product require a response by administrator.
Out Of Memory Error
• 38750004 - Application ran out of memory
Explanation: When QRadar components attempt to use more than the amount allocated for memory, the application or service can stop
working. Out of memory issues are caused by software, or user-defined queries and operations that exhaust the available memory.
User Response
Review the following resolutions:
• Review the error message that is written to the /var/log/qradar.log file to determine which component failed.
• If the Ariel proxy server is searching through large amounts of data or is using a grouping option that generates unique values in
the search results, reduce the number of unique values or reduce the time frame of the search.
• If the accumulator is generating a time series graph with many aggregated unique values, reduce the size of the query.
• If a protocol-based log source is recently enabled, decrease the polling period to reduce the data queried. If multiple protocol-
based log sources are running at the same time, stagger the start times.


Disk Usage Exceeded Threshold
• 38750038 - Disk Sentry: Disk Usage Exceeded Max Threshold
Explanation: At least one disk on your system is 95% full. To prevent data corruption, some processes are shut down. Event collection
is suspended until the disk usage falls below 92%.
User Response
Identify which partition is full, such as the / and /store file systems. Free disk space by deleting files that are not required. For
example, remove debug output and patch files from the / file system. If the /store file system is near capacity, reduce your retention
settings for events and flows.
You can also manually delete older data in the /store/ariel/ directories. The system automatically restarts processes after you
free enough disk space to fall below a threshold of 92% capacity.


Process Monitor Must Lower Disk Usage
• 38750045 - Process Monitor: Disk usage must be lowered
Explanation: The process monitor is unable to start processes because of a lack of system resources. The storage partition on the
system is likely 95% full or greater.
User Response
Free some disk space by manually deleting files or by changing your event or flow data retention policies. The system automatically
restarts system processes when the used disk space falls below a threshold of 92% capacity.


Event Pipeline Dropped Events
• 38750060 - Events/Flows were dropped by the event pipeline
Explanation: If there is an issue with the event pipeline or you exceed your license limits, an event or flow might be dropped.
Dropped events and flows cannot be recovered.
User Response
Review the following options:
• Verify the incoming event and flow rates on your system. If the license is exceeded and the event pipeline is dropping events,
expand your license to handle more data.
• Review the recent changes to rules or custom properties. Rule or custom property changes can cause changes to your event or
flow rates and might affect system performance.
• Determine whether the issue is related to SAR notifications. SAR notifications might indicate that queued events and flows are in
the event pipeline. The system usually routes events to storage, instead of dropping the events.
• Tune the system to reduce the volume of events and flows that enter the event pipeline.
SELECT LOGSOURCENAME(logsourceid) AS "Log Source", SUM(eventcount) AS "Number of Events in Interval", SUM(eventcount) / 300 AS "EPS in Interval" FROM events GROUP BY
"Log Source" ORDER BY "EPS in Interval" DESC LAST 5 MINUTES


Auto Update Installed with Errors
• 38750067 - Automatic updates installed with errors. See the Auto Update Log for details
Explanation: The most common reason for automatic update errors is a missing software dependency for a DSM, protocol, or scanner
update.
User Response
Select one of the following options:
• In the Admin tab, click the Auto Update icon and select View Update History to determine the cause of the installation error. You
can view, select, and then reinstall a failed RPM.
• If an auto update is unable to reinstall through the user interface, manually download and install the missing dependency on your
console. The console replicates the installed file to all managed hosts.


Standby High-availability (HA) System Failure
• 38750080 - Standby HA System Failure
Explanation: The status of the secondary appliance switches to failed and the system has no HA protection.
User Response
• Restore the secondary system.

• Click the Admin tab, click System and License Management, and then click Restore System.
• Inspect the secondary HA appliance to determine whether it is powered down or experienced a hardware failure.
• Use the ping command to check the communication between the primary and standby system.
• Check the switch that connects the primary and secondary HA appliances.
• Verify the IP tables on the primary and secondary appliances.
• Review the /var/log/qradar.log file on the standby appliance to determine the cause of the failure.


Active High-availability (HA) System Failure
• 38750081 - Active HA System Failure
Explanation: The active system cannot communicate with the standby system because the active system is unresponsive or failed. The
standby system takes over operations from the failed active system.
User Response
• Inspect the active HA appliance to determine whether it is powered down or experienced a hardware failure.
• If the active system is the primary HA, restore the active system.
• Click the Admin tab and click System and License Management. From the High Availability menu, select the Restore
System option.
• Review the /var/log/qradar.log file on the standby appliance to determine the cause of the failure.
• Use the ping command to check the communication between the active and standby system.
• Check the switch that connects the active and standby HA appliances.
• Verify the IP tables on the active and standby appliances.


Disk Storage Unavailable
• 38750092 - Disk Sentry has detected that one or more storage partitions are not accessible
Explanation: The disk sentry did not receive a response within 30 seconds. A storage partition issue might exist, or the system might be
under heavy load and not able to respond within the 30-second threshold.
User Response
• Verify the status of your /store partition by using the touch command.
• If the system responds to the touch command, the unavailability of the disk storage is likely due to system load.
• Determine whether the notification corresponds to dropped events.
• QRadar drops events when it cannot write events to disk. Investigate the status of storage partitions.


Insufficient Disk Space to Export Data
• 38750096 - Insufficient disk space to complete data export request
Explanation: If the export directory does not contain enough space, the export of event, flow, and offense data is canceled.
User Response
• Free some disk space in the /store/exports directory.

• Configure the Export Directory property in the System Settings window to use to a partition that has sufficient disk space.
• Configure an offboard storage device.


Accumulator is Falling Behind
• 38750099 - The accumulator was unable to aggregate all events/flows for this interval
Explanation: This message appears when the system is unable to accumulate data aggregations within a 60-second interval.
Every minute, QRadar creates data aggregations for each aggregated search. The data aggregations are used in time-series graphs and
reports and must be completed within a 60-second interval. If the count of searches and unique values in the searches are too large, the
time that is required to process the aggregations might exceed 60 seconds. Time-series graphs and reports might be missing columns
for the time period when the problem occurred. You do not lose data when this problem occurs. All raw data, events, and flows are still
written to disk. Only the accumulations, which are data sets that are generated from stored data, are incomplete.
User Response
The following factors might contribute to the increased workload that is causing the accumulator to fall behind:
• Frequency of the incomplete accumulations--If the accumulation fails only once or twice a day, the drops might be caused by
increased system load due to large searches, data compression cycles, or data backup. Infrequent failures can be ignored. If the
failures occur multiple times per day, during all hours, you might want to investigate further.
• High system load--If other processes use many system resources, the increased system load can cause the aggregations to be
slow. Review the cause of the increased system load and address the cause, if possible. For example, if the failed accumulations
occur during a large data search that takes a long time to complete, you might prevent the accumulator drops by reducing the size
85 of the saved search. (Cont…) © 2013 IBM Corporation

Accumulator is Falling Behind
User Response
The following factors might contribute to the increased workload that is causing the accumulator to fall behind:
• Large accumulator demands--If the accumulator intervals are dropped regularly, you might need to reduce the workload.
The workload of the accumulator is driven by the number of aggregations and the number of unique objects in those
aggregations. The number of unique objects in an aggregation depends on the group-by parameters and the filters that are
applied to the search.
For example, a search that aggregates for services filters the data by using a local network hierarchy item, such as DMZ area.
Grouping by IP address might result in a search that contains up to 200 unique objects. If you add destination ports to the
search, and each server hosts 5 - 10 services on different ports, the new aggregate of destination.ip + destination.
Port can increase the number of unique objects to 2000. If you add the source IP address to the aggregate and you have
thousands of remote IP addresses that hit each service, the aggregated view might have hundreds of thousands of unique
values. This search creates a heavy demand on the accumulator.
To review the aggregated views that put the highest demand on the accumulator:
1. On the Admin tab, click Aggregated Data Management.
2. Click the Data Written column to sort in descending order and show the largest views.
3. Review the business case for each of the largest aggregations to see whether they are still required.


CRE Failed to Read Rules
• 38750107 - The last attempt to read in rules (usually due to a rule change) has failed.
Please see the message details and error log for information on how to resolve this
Explanation: The custom rules engine (CRE) on an event processor is unable to read a rule to correlate an incoming event. The
notification might contain one of the following messages:
• If the CRE was unable to read a single rule, in most cases, a recent rule change is the cause. The payload of the notification
message displays the rule or rule of the rule chain that is responsible.
• In rare circumstances, data corruption can cause a complete failure of the rule set. An application error is displayed and the rule
editor interface might become unresponsive or generate more errors.
User Response
For a single rule read error, review the following options:
• To locate the rule that is causing the notification, temporarily disable the rule.
• Edit the rule to revert any recent changes.
• Delete and re-create the rule that is causing the error.


Disk Failure
• 38750110 - Disk Failure: Hardware Monitoring has determined that a disk is in failed state
Explanation: On-board system tools detected that a disk failed. The notification message provides information about the failed disk and
the slot or bay location of the failure.
User Response
If the notification persists, contact IBM Support or replace the parts.


Predictive Disk Failure
• 38750111 - Predictive Disk Failure: Hardware Monitoring has determined that a disk is in
predictive failed state
Explanation: The system monitors the status of the hardware on an hourly basis to determine when hardware support is required on the
appliance. The on-board system tools detected that a disk is approaching failure or end of life. The slot or bay location of the failure is
identified.
User Response
Schedule maintenance for the disk that is in a predictive failed state.


Magistrate is Unable to Persist Offense Updates
• 38750147 - Magistrate encountered serious errors that may prevent offenses from being
updated
Explanation: The system detected an exception when it wrote offense updates to the database. Events are processed and stored, but
they might not contribute to offenses.
User Response
Conduct a soft clean of the SIM data model with Deactivate offenses unchecked.
1. Click the Admin tab.

2. On the toolbar, click Advanced >Clean SIM Model.
3. Click Soft Clean to set the offenses to inactive.
4. Ensure that Deactivate offenses is not checked.
5. Click the Are you sure you want to reset the data model? check box and click Proceed.
When you clean the SIM model, all existing offenses are closed. Cleaning the SIM model does not affect existing events and flows.

Monitor QRadar Warning Notifications

Warning notifications in QRadar product require a response by administrator.
Unable to Determine Associated Log Source
• 38750007 - Unable to automatically detect the associated log source for IP address
Explanation: When events are sent from an undetected or unrecognized device, the traffic analysis component needs a minimum of 25
events to identify a log source.
If the log source is not identified after 1,000 events, the system abandons the automatic discovery process and generates the system
notification. The system then categorizes the log source as SIM Generic and labels the events as Unknown Event Log.
User Response
• Review the IP address in the system notification to identify the log source.
• Review the Log Activity tab to determine the appliance type from the IP address in the notification message and then manually
create a log source. Ensure that the Log Source Identifier field matches the host name in the original payload syslog header. Verify
that the events are appearing on the device by deploying the changes and searching on the manually created log source.
(Cont…)


Unable to Determine Associated Log Source
• 38750007 - Unable to automatically detect the associated log source for IP address
User Response
• Review any log sources that forward events at a low rate. Log sources that have low event rates commonly cause this notification.
• To properly parse events for your system, ensure that automatic update downloads the latest DSMs.
• Review any log sources that provide events through a central log server. Log sources that are provided from central log servers or
management consoles might require that you manually create their log sources.
• Verify whether the log source is officially supported. If your appliance is supported, manually create a log source for the events and
add a log source extension.
• If your appliance is not officially supported, create a universal DSM to identify and categorize your events.


Maximum Active Offenses Reached
• 38750050 - MPC: Unable to create new offense. The maximum number of active offenses has
been reached
Explanation: The system is unable to create offenses or change a dormant offense to an active offense. The default number of active
offenses that can be open on your system is limited to 2500. An active offense is any offense that continues to receive updated event
counts in the past five days or less.
User Response
• Change low security offenses from open or active to closed, or to closed and protected.
• Tune your system to reduce the number of events that generate offenses. To prevent a closed offense from being removed by
your data retention policy, protect the closed offense.


Maximum Total Offenses Reached
• 38750051 - MPC: Unable to process offense. The maximum number of offenses has been reached
Explanation: By default, the process limit is 2500 active offenses and 100,000 overall offenses.
If an active offense does not receive an event update within 30 minutes, the offense status changes to dormant. If an event update
occurs, a dormant offense can change to active. After five days, dormant offenses that do not have event updates change to inactive.
User Response
• Tune your system to reduce the number of events that generate offenses.
• Adjust the offense retention policy to an interval at which data retention can remove inactive offenses.
• To prevent a closed offense from being removed by your data retention policy, protect the closed offense. To free disk space for
important active offenses, change offenses from active to dormant.


SAR Sentinel Threshold Crossed
• 38750073 - SAR Sentinel: threshold crossed
Explanation: The system activity reporter (SAR) utility detected that your system load is above the threshold. Your system can
experience reduced performance.
User Response
• In most cases, no resolution is required.

• For example, when the CPU usage over 90%, the system automatically attempts to return to normal operation.
• If this notification is recurring, increase the default value of the SAR sentinel. Click the Admin tab, then click Global System
Notifications. Increase the notification threshold.
• For system load notifications, reduce the number of processes that run simultaneously. Stagger the start time for reports,
vulnerability scans, or data imports for your log sources. Schedule backups and system processes to start at different times to
lessen the system load.

Plan QRadar upgrade and migration
QRadar Upgrade Guide
https://www.ibm.com/support/knowledgecenter/SS42VS_7.4/com.ibm.qradar.doc/b_qradar_upgrade.pdf
QRadar SIEM Hardware Migration Scenarios
https://www.ibm.com/support/pages/qradar-siem-hardware-migration-scenarios

All domains feed Security Intelligence
Correlate new threats based on Hundreds of 3rd party

X-Force IP reputation feeds information sources
Guardium Identity and Access Management

Database assets, rule logic and Identity context for all security
database activity information domains w/ QRadar as the dashboard
IBM Security Network

Tivoli Endpoint Manager AppScan Enterprise
Intrusion Prevention System
Endpoint Management AppScan vulnerability results feed
Flow data into QRadar turns NIPS
vulnerabilities enrich QRadar’s QRadar SIEM for improved
devices into activity sensors
vulnerability database asset risk assessment

Statement of Good Security Practices: IT system security involves protecting systems and information through prevention, detection and response
to improper access from within and outside your enterprise. Improper access can result in information being altered, destroyed or misappropriated
or can result in damage to or misuse of your systems, including to attack others. No IT system or product should be considered completely secure
and no single product or security measure can be completely effective in preventing improper access. IBM systems and products are designed to
be part of a comprehensive security approach, which will necessarily involve additional operational procedures, and may require other systems,
products or services to be most effective. IBM DOES NOT WARRANT THAT SYSTEMS AND PRODUCTS ARE IMMUNE FROM THE
MALICIOUS OR ILLEGAL CONDUCT OF ANY PARTY.
ibm.com/security
© Copyright IBM Corporation 2012. All rights reserved. The information contained in these materials is provided for informational purposes
only, and is provided AS IS without warranty of any kind, express or implied. IBM shall not be responsible for any damages arising out of the use
of, or otherwise related to, these materials. Nothing contained in these materials is intended to, nor shall have the effect of, creating any
warranties or representations from IBM or its suppliers or licensors, or altering the terms and conditions of the applicable license agreement
governing the use of IBM software. References in these materials to IBM products, programs, or services do not imply that they will be available in
all countries in which IBM operates. Product release dates and/or capabilities referenced in these materials may change at any time at IBM’s sole
discretion based on market opportunities or other factors, and are not intended to be a commitment to future product or feature availability in any
way. IBM, the IBM logo, and other IBM products and services are trademarks of the International Business Machines Corporation, in the United
States, other countries or both. Other company, product, or service names may be trademarks or service marks of others.

IBM QRadar Fundamentals Administration

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

IBM QRadar Fundamentals Administration

Uploaded by

Copyright:

Available Formats

IBM Security Systems

IBM QRadar Fundamental Administration

© 2012 IBM Corporation

What is Security Operations Center (SOC)

2 © 2013 IBM Corporation

Security Operations and Organizational Structure

• SOCs have been typically built around a hub-and-spoke architecture.

3 © 2013 IBM Corporation

Key functions performed by the SOC

What the SOC How the SOC

4 © 2013 IBM Corporation

Key functions performed by the SOC

5 © 2013 IBM Corporation

Key functions performed by the SOC

• Continuous Proactive • Alert Ranking and

Security Refinement Compliance

Red, Blue & GDPR, HIPAA &

6 © 2013 IBM Corporation

IBM QRadar Architecture

QRadar All-in-one QRadar All-in-one

Hardware Specification Hardware Specification

• Provides access to the QRadar web application.

7 © 2013 IBM Corporation

IBM QRadar Architecture

QRadar Distributed – Console

Log Events Net Flows

Event Processor Flow Processor

2 Data compress & Encrypted

Event Collector Flow Collector

8 Log Sources © 2013 IBM Corporation

IBM QRadar Architecture

QRadar Distributed – Console

Log Events Net flows

Event Processor Flow Processor

9 © 2013 IBM Corporation

QRadar Architecture Overview

10 © 2013 IBM Corporation

High-level Component Architecture & Data Store

11 © 2013 IBM Corporation

High-level Component Architecture & Data Store

12 © 2013 IBM Corporation

Flow Collector Architecture

13 © 2013 IBM Corporation

Event Collector Architecture

14 © 2013 IBM Corporation

Event Processor Architecture

15 © 2013 IBM Corporation

16 © 2013 IBM Corporation

Three Core Layers Functionality of QRadar System

17 © 2013 IBM Corporation

QRadar Events and Flows

19 © 2013 IBM Corporation

Functions of Event Collector Component

20 © 2013 IBM Corporation

Functions of Event Processing Component

21 © 2013 IBM Corporation

Magistrate on the QRadar Console

22 © 2013 IBM Corporation

Functions of QRadar Flows

23 © 2013 IBM Corporation

High Availability Deployment

24 © 2013 IBM Corporation

High Availability Deployment

25 © 2013 IBM Corporation