ATRG ClusterXL R6x R7x R8x (09-June-2020)

ClusterXL

Advanced Technical

Reference Guide

9 June 2020

Classification: [Protected]

©2017 Check Point Software Technologies Ltd. All rights reserved.


Classification: [Protected] — All rights reserved | P. 1
Latest Documentation

The latest version of this document is at:

• http://supportcontent.checkpoint.com/solutions?id=sk93306
• http://supportcontent.checkpoint.com/documentation_download?id=25321

For additional technical information, visit the Check Point Support Center.

Revision History

Date Description
09 June 2020 Updated requirements for latency (≤100 msec) and packet loss (≤5%)
21 Feb 2019 Corrected a typo in the cphaprob mmagic command examples
25 Feb 2018 Updated section (1-4-B) Requirements for number of cluster members:
1) Changed from "Up to 8 cluster members are supported in ClusterXL"
(a theoretical limit) to "Up to 5 cluster members are supported in
ClusterXL" (a practical limit)
2) Added: "Up to 13 cluster members are supported in VSLS"
11 June 2017 Added links to the relevant R80.10 documents
Added CCP versions for R80.10
Added CCP versions for R76SP.50
Added a note that in Gaia R80.10, the 5th byte of the Source MAC
address (MAC magic) in all types of CCP packets is assigned
automatically
Added new cluster kernel debug flags in R80.10: arp, mmagic, trap
Added new command in R80.10: cphaprob mmagic
Added new flag in R80.10: cphaprob -m if
01 Feb 2017 Updated and corrected CCP OpCode information
Improved design of tables
18 Jan 2017 Added the number of supported cluster members in VRRP on Gaia
21 Aug 2016 Replaced "Crossbeam" with "X-Series"
Updated explanation about multiple Sync networks
Added a note that Solaris OS is not supported since R70
10 July 2016 Added section (3-11-C) "ARP Forwarding"
22 June 2016 Improved explanation about configuration of synchronization network
16 May 2016 Improved Table of Contents
Added CCP versions for R76SP.40
03 Feb 2016 Improved calculation of Destination MAC address for CCP packets when
there is no VIP configured on the involved interface
14 Jan 2016 Updated the definition of 3rd party cluster to include VRRP on Gaia
27 Dec 2015 Added additional related solution
24 Nov 2015 Added a note about Cluster Under Load (CUL) mechanism
15 Nov 2015 Added CCP versions for R76SP.10_VSLS and R76SP.30
14 Nov 2015 Clarified the explanation about pingable host
Added the explanation that CCP packets are not encrypted

07 Sep 2015 Improved description of Destination MAC Address of CCP packets
17 Aug 2015 Added a note that throughput of Sync interface does not depend on
throughput of traffic interfaces
06 Aug 2015 Added additional related solutions
Clarified the explanation about probing mechanism
Clarified the explanation about cpha_bond_ls_config.conf file
28 July 2015 Added location of new cluster log files (starting in R77.20):
$FWDIR/log/cphaconf.elg
$FWDIR/log/cphastart.elg
$FWDIR/log/cphamcset.elg
23 July 2015 Added description of Critical Device in VSX cluster - 'Instances'
Removed links to ClusterXL Guides R55, R60, R61, R62, R65
Added additional related solutions
12 June 2015 Added additional related solutions
30 Mar 2015 Added a note that starting from Gaia R75.47, R77.20, the file
$FWDIR/conf/discntd.if is not needed anymore
02 Mar 2015 Updated the information when the Full Sync is performed
10 Feb 2015 Added additional related solutions
05 Feb 2015 Corrected explanation of "Flush and ACK"
02 Feb 2015 Added additional related solutions
21 Dec 2014 Corrected the description of CCP FWHAP_IF_PROBE_RPLY packet
28 Oct 2014 Added a note that Legacy HA mode is not supported on Gaia OS
26 Oct 2014 Corrected the information about Source MAC address of CCP
Added CCP versions for R77.30 and R76SP.10
Added additional related solutions
01 Oct 2014 Corrected the information about Source MAC address of CCP
03 Sep 2014 Corrected the information about Legacy HA mode
26 Aug 2014 Corrected the note: more than one Sync Network is not supported
05 Aug 2014 Added new debug flag for 'cluster' module
Added documents with kernel debug flags
30 July 2014 Added additional related solutions
29 July 2014 Added additional related solutions
01 July 2014 Added additional related solutions
30 June 2014 Corrected the syntax for 'cphaconf -t ... add' command
17 Mar 2014 Added CCP versions for R77.20 and R76SP
Added additional related solutions
09 Feb 2014 Added additional related solutions
21 Jan 2014 Added additional related solutions
19 Dec 2013 Added additional related solutions
15 Dec 2013 Added additional related solutions
Added links to R77 Administration Guides
08 Oct 2013 Added additional related solutions
29 Sep 2013 Minor spelling corrections
22 Sep 2013 Added additional related solutions
09 Sep 2013 Added description of new Critical Device in R77 ClusterXL - 'ted'
15 Aug 2013 Added a note about Crossbeam DBHA to ClusterXL requirements
29 July 2013 Minor spelling corrections
18 July 2013 Added additional related solutions
16 July 2013 Added additional related solutions

Added additional information about MILS
Added additional flags for 'cphaconf ... start' command
10 July 2013 Minor spelling corrections
Added some minor clarifications
Added additional related solutions
03 July 2013 First release of this document

Table of Contents:

1. Introduction to ClusterXL
(1-1) The need for gateway clusters
(1-2) Check Point cluster solution
(1-3) ClusterXL definitions and terms
(1-4) ClusterXL requirements for hardware and software
2. State Synchronization in ClusterXL
(2-1) Introduction
(2-2) State Synchronization modes
(2-3) Restrictions
(2-4) Synchronization network
(2-5) Configuring synchronization network
(2-6) Configuring a service to be non-synchronized
3. ClusterXL Modes
(3-1) Mode comparison table
(3-2) Example cluster topology
(3-3) High Availability New mode
(3-4) Full High Availability mode
(3-5) High Availability Legacy (Traditional) mode
(3-6) Load Sharing Multicast mode
(3-7) Load Sharing Unicast mode
(3-8) VRRP
(3-9) Bridge
(3-10) Sticky Decision Function
(3-11) Forwarding
4. ClusterXL Configuration
(4-1) Clock synchronization
(4-2) Preparing cluster members
(4-3) Configuring cluster object in SmartDashboard
(4-4) Configuring routing on networks around the cluster
(4-5) CCP mode
(4-6) ClusterXL High Availability for IPv6
(4-7) Defining 'Disconnected' interfaces
(4-8) SecureXL
(4-9) CoreXL
(4-10) VPN
(4-11) NAT
(4-12) VLAN
(4-13) Link Aggregation (Bonding)
(4-14) Monitoring the Interface Link State (MILS)
(4-15) Configuring cluster addresses on different subnets
(4-16) Moving from a single gateway to a cluster
(4-17) Adding another member or interface to an existing cluster
(4-18) Proxy ARP
(4-19) ISP Redundancy
(4-20) Dynamic Routing
(4-21) SNMP
(4-22) Enhanced enforcement of the TCP 3-way handshake
5. Cluster state transitions
(5-1) Special notes for state transitions
(5-2) Policy installation
(5-3) State transitions of the cluster member
(5-4) State transitions due to 'FWHAP_MY_STATE' packet
(5-5) State transitions due to a Critical Device (Pnote)
(5-6) State transitions due to the 'Interface Active Check' Critical Device (Pnote)
(5-7) Actions performed by a cluster member following a state transition
6. Cluster Control Protocol (CCP)
(6-1) Introduction
(6-2) CCP and security policy rule base
(6-3) CCP internal timers
(6-4) CCP modes
(6-5) CCP and VLAN interfaces
(6-6) CCP packet header
(6-6-A) External Header
(6-6-B) CCP Header
7. ClusterXL Monitoring and Troubleshooting
(7-1) SmartView Tracker
(7-2) SmartView Monitor
(7-3) Clock synchronization
(7-4) CCP mode
(7-5) SecureXL
(7-6) CoreXL
(7-7) VPN
(7-8) NAT
(7-9) VLAN
(7-10) Link Aggregation (Bonding)
(7-11) Adding another member or interface to an existing cluster
(7-12) ISP Redundancy
(7-13) Dynamic Routing
(7-14) SNMP
(7-15) Policy Installation
(7-16) Full Sync
(7-17) Delta Sync
(7-18) Traffic
(7-19) Flapping
(7-20) 'fw ctl pstat' command
(7-21) 'cphaprob' command
(7-22) 'cphastart' and 'cphastop' commands
(7-23) 'cphaconf' command
(7-24) 'cpstat' command
(7-25) $FWDIR/bin/clusterXL_admin script
(7-26) $FWDIR/bin/clusterXL_monitor_ips script
(7-27) $FWDIR/bin/clusterXL_monitor_process script
8. ClusterXL Debugging
(8-1) Debugging Check Point Security Gateway
(8-2) Debugging modules and flags
9. Working with kernel parameters
10. ClusterXL Error Messages
11. Additional related solutions

1. Introduction to ClusterXL
(1-1) The need for gateway clusters

Gateways and VPN connections are business-critical. The failure of a Security
Gateway or VPN connection can result in the loss of active connections and of access to
critical data. The gateway between the organization and the world must remain open under
all circumstances.

(1-2) Check Point cluster solution

A ClusterXL cluster is a group of identical Check Point Security Gateways connected in
such a way that if one fails, another immediately takes its place.

ClusterXL is a software-based High Availability and Load Sharing solution that
distributes network traffic between clusters of redundant Security Gateways and provides
transparent failover between machines in a cluster:
• A High Availability cluster ensures Security Gateway and VPN connection
redundancy by providing transparent failover to a backup Security Gateway in the
event of failure.
• A Load Sharing cluster provides reliability and increases performance, as all cluster
members are active.

ClusterXL uses unique physical IP addresses and MAC addresses for the cluster
members, and Virtual IP addresses to represent the cluster itself on the attached networks.
Virtual IP addresses do not belong to an actual machine interface (except in High
Availability Legacy (Traditional) mode).

ClusterXL provides an infrastructure that ensures that data is not lost due to a failure, by
ensuring that each cluster member is aware of connections passing through the other
members. Passing information about connections and other Security Gateway states
between the cluster members is known as State Synchronization.

Security Gateway Clusters can also be built using OPSEC certified High Availability and
Load Sharing products. OPSEC certified clustering products use the same State
Synchronization infrastructure as ClusterXL. Refer to http://www.checkpoint.com/opsec/.

Note: This applies to ClusterXL in Security Gateway mode only. For more on VSX
mode, see the VSX Administration Guide (VSX NGX, VSX NGX Scalability Pack, VSX
NGX R65, VSX NGX R67, VSX NGX R68, R75.40VS, R76, R77.X, R80.10).

(1-3) ClusterXL definitions and terms

Different vendors give different meanings to terms that relate to clusters, High
Availability and Load Sharing (Load Balancing).

Check Point uses the following definitions and terms (they are listed in an order
chosen for ease of learning):

Cluster - A group of machines that work together in a redundant configuration.

Cluster member - A machine that is part of a Cluster.

ClusterXL - Cluster of Check Point Security Gateways that work together in a redundant
configuration. These Check Point Security Gateways are installed on Gaia /
SecurePlatform / X-Series / Solaris (R6X and lower) / Windows OS.
• Up to 8 cluster members are supported in ClusterXL.
• Up to 5 cluster members are supported in a 3rd party cluster (IPSO / X-Series).
• Up to 2 cluster members are supported in a VRRP cluster on Gaia OS (sk105170).
Notes:
• In ClusterXL Load Sharing mode, configuring more than 4 members significantly
decreases cluster performance due to the amount of Delta Sync traffic.
• In an X-Series chassis, configuring more than 4 members (APMs) significantly
decreases cluster performance due to the amount of Delta Sync traffic.
• In an X-Series DBHA configuration, the above requirement applies to a single chassis
(Check Point code is not aware of DBHA).

3rd party cluster - Cluster of Check Point Security Gateways that work together in a
redundant configuration. These Check Point Security Gateways are installed on X-Series
XOS or IPSO OS.
Note: VRRP Cluster on Gaia OS is also considered a 3rd party cluster.

Cluster Mode - configuration of cluster members to work in either High Availability / VRRP
(one cluster member processes all the traffic), or Load Sharing (all traffic is processed in
parallel by all cluster members).

Failure - A hardware or software problem that causes a machine to be unable to serve as a
cluster member (for example, one of the cluster interfaces has failed; one of the monitored
daemons has crashed).
A cluster member that suffered a failure is declared as failed, and its state is
changed to 'Down' (a physical interface is considered 'Down' only if all configured VLANs
on that physical interface are 'Down').
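The VLAN rule in the definition above can be sketched in a few lines of Python. This is only an illustration of the stated rule, not ClusterXL code: the function name and the state strings 'Up' and 'Down' as dictionary values are assumptions made for the example.

```python
def physical_interface_is_down(vlan_states):
    """Apply the rule from the text: a physical interface counts as 'Down'
    only if every configured VLAN on it is 'Down'.

    vlan_states is a hypothetical mapping such as {"eth1.10": "Up"}.
    """
    if not vlan_states:
        raise ValueError("no VLANs configured on this interface")
    return all(state == "Down" for state in vlan_states.values())


# One VLAN still up: the physical interface is not considered 'Down'.
print(physical_interface_is_down({"eth1.10": "Down", "eth1.20": "Up"}))
# All VLANs down: the physical interface is considered 'Down'.
print(physical_interface_is_down({"eth1.10": "Down", "eth1.20": "Down"}))
```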

Failover / Fail-over - Transfer of control over traffic (packet filtering) from a cluster
member that suffered a failure to another cluster member (based on internal cluster
algorithms).

Failback / Fallback - Recovery of a cluster member that suffered from a failure. The state
of a recovered cluster member is changed from 'Down' to either 'Active', or 'Standby'
(depending on Cluster Mode).

State Synchronization (a.k.a. Sync) - Technology that synchronizes the relevant
information about the current connections (stored in various Check Point kernel tables)
among all cluster members over the Synchronization Network.
Due to State Synchronization, the current connections are not cut off during cluster
failover.

Synchronization (a.k.a. Sync, Secured, Trusted) Network - A set of interfaces on cluster
members that were configured as the interfaces over which State Synchronization
information is passed (as Delta Sync packets).
In R5X and R6X versions, up to three Synchronization Networks can be configured
per cluster (SmartDashboard - cluster object - 'Topology' pane - 'Network
Objective'). All configured Synchronization Networks work in parallel - i.e., the same
information is passed in parallel over all configured Synchronization Networks.
In R70 and above, the use of more than one Synchronization Network for redundancy is
not supported because the CPU load increases significantly due to the duplicate tasks
performed by all configured Synchronization Networks. Refer to sk92804 (Sync
Redundancy in ClusterXL).

Cluster topology - set of interfaces on all members of a cluster and their settings (Network
Objective, IP address/Net Mask, Topology, Anti-Spoofing, etc.).

Network Objective - defines how the cluster will configure and monitor an interface -
Cluster, Sync, Cluster+Sync, Monitored Private, Non-Monitored Private.
Configured in SmartDashboard - cluster object - 'Topology' pane - 'Network Objective'.

Cluster interface - An interface that was configured as a part of cluster topology:
SmartDashboard - cluster object - 'Topology' pane - 'Network Objective' column - set to
'Cluster'.

Sync (a.k.a. Secured, Trusted) interface - An interface that was configured as a part of
cluster topology for State Synchronization: SmartDashboard - cluster object - 'Topology'
pane - 'Network Objective' column - set to 'Sync' or 'Cluster+Sync'.
Up to three Sync interfaces can be configured per cluster.

Monitored Private interface - An interface that was configured not to be a part of
cluster topology: SmartDashboard - cluster object - 'Topology' pane - 'Network Objective'
column - set to 'Monitored Private'.
However, this interface is monitored by the cluster, and a failure on this interface
causes the member to change its state to 'Down'.

Non-Monitored Private interface - An interface that was configured not to be a part of
cluster topology: SmartDashboard - cluster object - 'Topology' pane - 'Network Objective'
column - set to 'Non-Monitored Private'.
This interface is not monitored by the cluster, and a failure on this interface does not
cause any changes in the member's state.

Private interface - An interface that was configured not to be a part of cluster topology:
SmartDashboard - cluster object - 'Topology' pane - 'Network Objective' column - set to
'Private'.
This interface is not monitored by the cluster, and a failure on this interface does not
cause any changes in the member's state. Applies only to 3rd party clusters.
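The relationship between Network Objective and interface monitoring described in the definitions above can be summarized as a small lookup. This is a hedged sketch: the table and function names are hypothetical, and treating 'Cluster', 'Sync' and 'Cluster+Sync' interfaces as monitored is an assumption inferred from the 'Failure' definition rather than an explicit statement in this section.

```python
# Hypothetical summary table: does the cluster monitor an interface
# with the given Network Objective?
MONITORED_BY_OBJECTIVE = {
    "Cluster": True,                 # assumption, see lead-in
    "Sync": True,                    # assumption, see lead-in
    "Cluster+Sync": True,            # assumption, see lead-in
    "Monitored Private": True,       # stated in the text
    "Non-Monitored Private": False,  # stated in the text
    "Private": False,                # stated in the text (3rd party clusters only)
}


def failure_changes_member_state(network_objective):
    """Return True if a failure on an interface with this Network Objective
    would cause the member to change its state to 'Down'."""
    return MONITORED_BY_OBJECTIVE[network_objective]


print(failure_changes_member_state("Monitored Private"))
print(failure_changes_member_state("Non-Monitored Private"))
```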

Disconnected interface - An interface that was set as 'Non-Monitored Private'
interface in SmartDashboard, or was explicitly added into $FWDIR/conf/discntd.if file
(e.g., pre-requisite for configuring Bond interface). This interface state appears in the output
of 'cphaprob -a if' command.

Full Sync - Complete synchronization of the relevant kernel tables by a cluster member that
tries to join the cluster against the working cluster member(s). This process is meant to
fetch a “snapshot” of the relevant kernel tables of the already Active cluster member(s).
Full Sync is performed during initialization of Check Point software (during the boot
process, the first time the member runs policy installation, during 'cpstart'). Until the
Full Sync process completes successfully, this member remains in 'Down' state, because it
cannot function as a cluster member until it is fully synchronized with the other cluster
members.
Meanwhile, the Delta Sync packets continue to arrive and are stored in kernel memory
until Full Sync completes.
The whole Full Sync process is performed by FWD daemons on TCP port 256 and is
always done over.
The FWD daemons send the information in chunks, and the receiving side must confirm
each chunk before the next chunk is sent.
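The chunk-by-chunk confirmation described above resembles a stop-and-wait transfer. The following Python sketch shows only that general pattern. It is not the FWD daemon protocol: the function name, the chunk size, and the `send_and_wait_ack` callback are hypothetical placeholders for whatever transport carries the data.

```python
def send_in_chunks(snapshot, chunk_size, send_and_wait_ack):
    """Send a byte string in fixed-size chunks, stop-and-wait style.

    send_and_wait_ack(chunk) is a hypothetical callback that delivers one
    chunk and returns True once the peer has confirmed receiving it.
    The next chunk is sent only after the previous one is confirmed.
    """
    chunks_sent = 0
    for offset in range(0, len(snapshot), chunk_size):
        chunk = snapshot[offset:offset + chunk_size]
        if not send_and_wait_ack(chunk):
            raise RuntimeError("chunk at offset %d was not confirmed" % offset)
        chunks_sent += 1
    return chunks_sent


# Usage with an in-memory "peer" that confirms every chunk.
received = []
count = send_in_chunks(b"abcdefghij", 4, lambda c: received.append(c) or True)
print(count, b"".join(received))
```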

Delta Sync - Synchronization of kernel tables between all working cluster members:
an exchange of CCP packets that carry pieces of information about different connections and
the operations that should be performed on these connections in the relevant kernel tables.
The Delta Sync process is performed directly by the Check Point kernel.
While Full Sync is in progress, the Delta Sync updates are not processed; they are saved in
kernel memory. After Full Sync is complete, the Delta Sync packets stored during the Full
Sync phase are applied in order of arrival.
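The interplay between Full Sync and Delta Sync on a joining member can be sketched as follows. This is an illustrative model only, assuming a kernel table can be represented as a dictionary and a delta as a (key, value) pair where a value of None means "remove the entry". The class and method names are invented for the example.

```python
class JoiningMember:
    """Toy model of a member that buffers deltas until Full Sync completes."""

    def __init__(self):
        self.table = {}              # stand-in for a kernel table
        self.full_sync_done = False
        self.pending = []            # deltas saved in order of arrival

    def _apply(self, key, value):
        if value is None:
            self.table.pop(key, None)
        else:
            self.table[key] = value

    def on_delta(self, key, value):
        # During Full Sync, deltas are stored, not processed.
        if self.full_sync_done:
            self._apply(key, value)
        else:
            self.pending.append((key, value))

    def complete_full_sync(self, snapshot):
        # Load the snapshot, then replay the stored deltas in arrival order.
        self.table = dict(snapshot)
        for key, value in self.pending:
            self._apply(key, value)
        self.pending.clear()
        self.full_sync_done = True


member = JoiningMember()
member.on_delta("conn1", "SYN_SENT")
member.on_delta("conn1", "ESTABLISHED")
member.complete_full_sync({"conn0": "ESTABLISHED", "conn1": "OLD"})
print(member.table)
```

Replaying in arrival order matters here: the later delta for the same connection must win over both the snapshot value and the earlier delta.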

Delta Sync retransmission - It is possible that Delta Sync packets will be lost or corrupted
during the Delta Sync operations. In such cases, it is required to make sure the Delta Sync
packet is re-sent. The cluster member request the sending member to retransmit the
lost/corrupted Delta Sync packet.
Each Delta Sync packet has a sequence number.
The sending member has a queue of sent Delta Sync packets.
Each cluster member has a queue of packets sent from each of the peer cluster
members.
If, for any reason, a Delta Sync packet was not received by a cluster member, it can ask
for a retransmission of this packet from the sending member.
The Delta Sync retransmission mechanism is somewhat similar to a TCP Window and
TCP retransmission mechanism.
When a member requests retransmission of a Delta Sync packet that no longer exists
on the sending member, the member prints a console message that the sync is not
complete.
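
A minimal sketch of the sequence-number and queue idea follows. It is an illustration under assumptions, not the actual kernel implementation: the class names, the queue size, and the gap-detection rule are hypothetical.

```python
from collections import OrderedDict


class DeltaSyncSender:
    """Keeps a bounded queue of sent packets, keyed by sequence number."""

    def __init__(self, queue_size=4):  # queue size is an arbitrary example value
        self.queue_size = queue_size
        self.next_seq = 0
        self.sent = OrderedDict()

    def send(self, payload):
        seq = self.next_seq
        self.next_seq += 1
        self.sent[seq] = payload
        while len(self.sent) > self.queue_size:
            self.sent.popitem(last=False)  # drop the oldest queued packet
        return seq, payload

    def retransmit(self, seq):
        # Returns None when the packet no longer exists in the queue,
        # which is the case where the sync is reported as not complete.
        return self.sent.get(seq)


class DeltaSyncReceiver:
    """Detects gaps in the sequence numbers received from one peer."""

    def __init__(self):
        self.expected_seq = 0

    def missing_before(self, seq):
        # Sequence numbers that were skipped before this packet arrived.
        missing = list(range(self.expected_seq, seq))
        self.expected_seq = max(self.expected_seq, seq + 1)
        return missing
```

The receiver would ask the sender to retransmit every sequence number returned by `missing_before`; a `None` answer models a packet that has already left the sender's queue.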

Cluster Control Protocol (CCP) - Proprietary Check Point protocol that runs between
cluster members on UDP port 8116, and has the following roles (refer to the 'Cluster Control
Protocol (CCP)' section):
 State Synchronization (Delta Sync)
 Health checks (state of cluster members and of cluster interfaces):
o Health-status Reports
o Cluster-member Probing
o State-change Commands
o Querying for Cluster Membership
Note: CCP is located between the Check Point kernel and the network interface
(therefore, only TCPdump should be used for capturing this traffic).

Preconfigured mode - Cluster Mode, where cluster membership is enabled on all
members-to-be, but no policy has yet been installed on any of the members - none of
them is actually configured to be primary, secondary, etc. The cluster cannot function if one
machine fails. In this scenario, the “preconfigured mode” takes place.
The preconfigured mode also comes into effect when no policy is yet installed, right
after the machines come up after boot, or when running the 'cphaconf init' command.

Blocking mode - Cluster Mode, where cluster member does not forward any traffic (e.g.,
caused by a failure).

Non-blocking mode - Cluster Mode, where cluster member keeps forwarding all traffic.

High Availability (a.k.a. Active/Standby) mode - Cluster Mode, where only one cluster
member ('Active' member) processes all the traffic, while other cluster members ('Standby'
members) are ready to be promoted to 'Active' state if 'Active' member fails.
In High Availability New Mode, the cluster Virtual IP address (that represents the cluster
on that network) is associated:
 with physical MAC Address of 'Active' member
 with virtual MAC Address (refer to sk50840 (How to enable ClusterXL Virtual MAC
(VMAC) mode))
In High Availability Legacy (Traditional) Mode, there are no Virtual IP addresses - the
cluster members share identical IP and MAC addresses, so that the Active cluster member
receives from a hub or switch all the packets that were sent to the cluster IP address.

Load Sharing (a.k.a. Active/Active, Load Balancing) mode - Cluster Mode, where all
traffic is processed by all cluster members in parallel.

Load Sharing Multicast mode - Load Sharing Cluster Mode, where all traffic is processed
by all cluster members in parallel - each member is assigned the equal load of [ 100% /
number_of_members ].
The cluster Virtual IP address (that represents the cluster on that network) is associated
with Multicast MAC Address 01:00:5E:X:Y:Z (which is generated based on last 3 bytes
of cluster Virtual IP address on that network).
A ClusterXL decision algorithm (Decision Function) on all cluster members decides
which cluster member should process the given packet.
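
The address mapping and the equal load split described above can be sketched as follows. This is a sketch under assumptions: it copies the last three bytes of the Virtual IP address into the MAC address exactly as the text states, and it does not model any additional bit masking that the product may apply. The function names are hypothetical.

```python
import ipaddress


def vip_to_multicast_mac(vip):
    """Build 01:00:5E:X:Y:Z from the last 3 bytes of an IPv4 Virtual IP."""
    octets = ipaddress.IPv4Address(vip).packed
    x, y, z = octets[1], octets[2], octets[3]
    return "01:00:5E:%02X:%02X:%02X" % (x, y, z)


def equal_share_percent(number_of_members):
    """Load assigned to each member: 100% / number_of_members."""
    return 100.0 / number_of_members
```

For example, `vip_to_multicast_mac("10.20.30.40")` gives `01:00:5E:14:1E:28`, and `equal_share_percent(4)` gives `25.0`.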

Load Sharing Unicast mode - Load Sharing Cluster Mode, where all traffic is accepted by
one member (called Pivot), and then the traffic is either processed by this member (Pivot),
or forwarded to one of the peer members (called non-Pivot).
The traffic load is assigned to cluster members based on the hard-coded formula per
the value of 'Pivot_overhead' attribute (refer to sk34668 (How to modify the assigned
load between the members of ClusterXL in Load Sharing Unicast mode)).
The cluster Virtual IP address (that represents the cluster on that network) is associated
with:
 Physical MAC Address of 'Pivot' member
 Virtual MAC Address (refer to sk50840 (How to enable ClusterXL Virtual MAC
(VMAC) mode))

Full High Availability (a.k.a. Full HA) mode - Special Cluster Mode (supported only on
Check Point appliances running Gaia OS or SecurePlatform OS) where each cluster
member also runs as a Security Management Server. This provides redundancy both
between Security Gateways (only High Availability is supported) and between Security
Management Servers (only High Availability is supported). Refer to sk101539 (ClusterXL
Load Sharing mode limitations and important notes) and sk39345 (Management High
Availability restrictions).

Decision Function - Special cluster algorithm applied by each cluster member to the
incoming traffic in order to decide which member should process the given packet - each
cluster member maintains a table of hash values generated based on the connection's tuple
(source and destination IP addresses/Ports, and Protocol number).
In order to see the decision process, run kernel debug of 'cluster' module with flag
'df' (also recommended to enable the flag 'select').
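
The general idea of a hash-based decision can be sketched as shown below. The real hash function and table layout are not described in this guide, so everything here is an assumption made for illustration: the use of SHA-256, the modulo step, the sorting of the two endpoints (so that both directions of a connection give the same result), and the function name.

```python
import hashlib


def decide_member(src_ip, src_port, dst_ip, dst_port, protocol, member_ids):
    """Pick the member that should process a packet (illustrative only)."""
    # Assumption: order the two endpoints so that the original packet and
    # the reply packet produce the same key.
    a, b = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    key = "%s:%d|%s:%d|%d" % (a[0], a[1], b[0], b[1], protocol)
    digest = hashlib.sha256(key.encode("ascii")).digest()
    index = int.from_bytes(digest[:4], "big") % len(member_ids)
    return member_ids[index]
```

Because every member computes the same value from the same tuple, all members reach the same decision without exchanging any messages about the individual packet.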

Sticky Decision Function (SDF) - Special cluster algorithm in Load Sharing mode that
allows the user to control which parameters the Decision Function uses when it is
applied to the incoming connections:
 IPs, Ports, SPIs
 IPs, Ports
 IPs

Selection - The packet selection mechanism is one of the central and most important
components in the ClusterXL product and State Synchronization infrastructure for 3rd party
clustering solutions. Its main purpose is to correctly decide (select) what has to be done to
the incoming and outgoing traffic on the cluster machine.
In order to see the selection process, run kernel debug of 'cluster' module with flag
'select' (also recommended to enable the flag 'df').
 In ClusterXL - the packet is selected by cluster member(s) depending on the cluster
mode:
o In HA modes - by Active member
o In LS Unicast mode - by Pivot member
o In LS Multicast mode - by all members.
Then the member applies the Decision Function (and SDF).
 In 3rd party / OPSec cluster - the 3rd party software selects the packet, and Check
Point code just inspects it (and performs State Synchronization).

HA not started - Output of 'cphaprob state' command on the given cluster member -
means that Check Point clustering software is not started on this Security Gateway (e.g.,
this machine is not a part of a cluster, or 'cphastop' command was run, or some failure
occurred that prevented the ClusterXL product from starting correctly).

Initializing - State of a cluster member during initialization of Check Point software (this
state can be seen only in cluster debug). An initial and transient state of the cluster member
- the ClusterXL product is already running, but not all ClusterXL Critical Devices are
initialized yet and FireWall product is not ready yet.

Ready - State of a cluster member after initialization and before promotion to the
next required state - Active/Standby/Master/Backup (depending on Cluster Mode).
A member in this state does not process any traffic passing through cluster. A member
might be stuck in this state due to several reasons - refer to sk42096 (Cluster member is
stuck in 'Ready' state).

Active - State of a cluster member that is fully operational:
 In ClusterXL - state of the Security Gateway component
 In 3rd party / OPSec cluster - state of the State Synchronization mechanism

Active attention - In ClusterXL - state of the 'Active' cluster member that suffers from a
failure (and failover is not possible because there are no other available members, e.g.,
while Standby member of an HA cluster reboots).

Standby - State of a cluster member that is ready to be promoted to 'Active' state (if Active
member fails) in ClusterXL configured in High Availability mode.

Master - State of a cluster member that processes all traffic in ClusterXL configured in
VRRP mode.

Backup - State of a cluster member that is ready to be promoted to 'Master' state (if Master
member fails) in ClusterXL configured in VRRP mode.

Active Up - ClusterXL in High Availability mode that was configured as 'Maintain
current active Cluster Member'.
This means the following:
 If the current Active member fails for some reason, or is rebooted (e.g., Member_A),
then failover occurs between cluster members - another Standby member will be
promoted to be Active (e.g., Member_B).
 When former Active member (Member_A) recovers from a failure, or boots, the
former Standby member (Member_B) remains in Active state (and
Member_A assumes the Standby state).

Primary Up - ClusterXL in High Availability mode that was configured as 'Switch to
higher priority Cluster Member'.
This means the following:
 Each cluster member is given a priority (SmartDashboard - cluster object - 'Cluster
Members' pane) - member with highest priority appears at the top of the table, and
member with lowest priority appears at the bottom of the table.
 The member with highest priority will assume the Active state.

 If the current Active member with highest priority (e.g., Member_A), fails for some
reason, or is rebooted, then failover occurs between cluster members - the member
with the next highest priority will be promoted to be Active (e.g., Member_B).
 When the member with highest priority (Member_A) recovers from a failure, or
boots, then additional failover occurs between cluster members - the member with
highest priority (Member_A) will be promoted to Active state (and Member_B will
return to Standby state).
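
The difference between Active Up and Primary Up can be summarized in a short sketch. This is a simplified model, not the product logic: the function name and mode strings are hypothetical, and the choice of the highest-priority healthy member when the current Active member fails in Active Up mode is an assumption (the text only says "another Standby member").

```python
def choose_active(priority_order, healthy, current_active, mode):
    """Return the member that should be Active, or None if none is healthy.

    priority_order: member names, highest priority first.
    healthy:        set of members that are currently able to be Active.
    mode:           'active_up' or 'primary_up' (hypothetical labels).
    """
    candidates = [m for m in priority_order if m in healthy]
    if not candidates:
        return None
    # Active Up: keep the current Active member as long as it is healthy.
    if mode == "active_up" and current_active in healthy:
        return current_active
    # Primary Up (or no usable current Active): highest priority wins.
    return candidates[0]
```

With members `["Member_A", "Member_B"]`, after Member_A fails and later recovers, `active_up` keeps Member_B as Active, while `primary_up` returns the Active role to Member_A.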

Down - State of a cluster member during a failure:
 In ClusterXL - state of the Security Gateway component
 In 3rd party / OPSec cluster - state of the State Synchronization mechanism

Dead - State reported by a cluster member when it goes out of the cluster (due to
'cphastop' command (which is a part of 'cpstop'), or reboot).

Dying - State of a cluster member as assumed by peer members if it did not report its state
for 0.7 sec.

ClusterXL is inactive, or the machine is down - Such state is reported by the given
member regarding the peer member after the peer member notifies (via CCP) that it goes
out of the cluster (due to 'cphastop' command (which is a part of 'cpstop'), or reboot).

Critical Device (a.k.a. Problem Notification, Pnote) - Special software device on each
cluster member through which the critical aspects for cluster operation are monitored.
When the critical monitored component on a cluster member fails to report its state on
time, or when its state is reported as problematic, the state of that member is immediately
changed to 'Down'.
The complete list of the configured critical devices (pnotes) is printed by the 'cphaprob
-ia list' command.

Restrictions:
 Total number of critical devices (pnotes) on cluster member is limited to 16.
 Name of any critical device (pnote) on cluster member is limited to 16 characters.

There are several predefined built-in critical devices (pnotes):

 Device Name: Problem Notification


o Current state: OK - none of the Critical Devices reports its state as 'problem'
o Current state: problem - at least one of the Critical Devices reports its
state as 'problem'

 Device Name: Interface Active Check


o Current state: OK - all cluster interfaces are up (CCP packets are sent and
received on all cluster interfaces)
o Current state: problem - at least one of the cluster interfaces is down (CCP
packets are not sent / received on time)

Note:
o The transmit state of an interface (as monitored by this pnote) is refreshed once
a FWHAP_IF_PROBE_RPLY packet is received in acknowledgment of a
FWHAP_IF_PROBE_REQ packet.
o The receive state of an interface (as monitored by this pnote) is refreshed once
any CCP packet (UDP on port 8116) is received.

 Device Name: HA Initialization


o Current state: OK - "HA module" was initialized successfully (sk36372)

 Device Name: Load Balancing Configuration


o Current state: OK - Pnote is currently not used (sk36373)

 Device Name: Recovery Delay


o Current state: OK - state of a Virtual System can be changed (sk92353)
o Current state: problem - state of a Virtual System can not be changed yet
(sk92353)
Note: Recovery Delay mechanism is disabled by default on 3rd party clusters.

 Device Name: IPSO member status


o Current state: OK - IPSO machine joined the cluster, all interfaces are up
o Current state: problem - IPSO machine left the cluster, fewer interfaces
than expected in UP state

There are several predefined registered critical devices (pnotes):

 Device Name: Synchronization


o Current state: OK - Full Sync has completed successfully
o Current state: problem - Full Sync has failed

 Device Name: Filter


o Current state: OK - Security Policy was installed successfully
o Current state: problem - Security Policy is not currently installed

 Device Name: VSX


o Current state: OK - On VS0 means that states of all VSs are not 'Down';
On other VSs means that VS0 is alive
o Current state: problem - minimum of blocking states of all VSs is not
"active" (the VSIDs will be printed on the line 'Problematic VSIDs:')
Note: This pnote appears on Gaia OS since R75.40VS.

 Device Name: fwd


o Current state: OK - FWD daemon reported its state on time (i.e., FWD is
up)
o Current state: problem - FWD daemon did not report its state on time
(e.g., FWD is not running)

 Device Name: cphad
o Current state: OK - CPHAMCSET daemon reported its state on time (i.e.,
CPHAMCSET is up)
o Current state: problem - CPHAMCSET daemon did not report its state on
time (e.g., CPHAMCSET is not running)

Note: Does not exist on VSX cluster R6x.
Note: Starting in R77.20, refer to $FWDIR/log/cphamcset.elg

 Device Name: cvpnd


o Current state: OK - CVPND daemon reported its state on time (i.e., CVPND
is up)
o Current state: problem - CVPND daemon did not report its state on time
(e.g., CVPND is not running)
Note: This pnote appears if Mobile Access Blade is enabled.

 Device Name: FIB


o Current state: OK - FIBMGRD daemon reported its state on time (i.e.,
FIBMGRD is up) and it is able to send/receive its packets (on TCP port 2010)
o Current state: problem - FIBMGRD daemon did not report its state on
time (e.g., FIBMGRD is not running) or it is not able to exchange its packets with
peer members
Note: This pnote appears only on SecurePlatform Pro OS when Advanced Dynamic
Routing is enabled.

 Device Name: routed


o Current state: OK - RouteD daemon reported its state on time (i.e., RouteD
is up)
o Current state: problem - RouteD daemon did not report its state on time
(e.g., RouteD is not running)
Note: This pnote appears on Gaia OS since R76.

 Device Name: ted


o Current state: OK - Threat Emulation Daemon reported its state on time
(i.e., TED is up)
o Current state: problem - Threat Emulation Daemon did not report its state
on time (e.g., TED is not running)
Note: This pnote appears since R77.

 Device Name: Instances


o Current state: problem - mismatch between the number of CoreXL FW
instances in the received CCP packet and the number of loaded CoreXL FW
instances on this VSX member / this Virtual System (refer to sk106912).
Note: This pnote appears since R75.40VS in VSX HA mode (not VSLS) cluster.

Additional critical devices (pnotes) can be registered by using Check Point shell scripts:
 '$FWDIR/bin/clusterXL_admin' shell script registers the admin_down device
(sk55081)
 '$FWDIR/bin/clusterXL_monitor_ips' shell script registers the
host_monitor device (sk35780)
 '$FWDIR/bin/clusterXL_monitor_process' shell script registers devices with
the names of processes that are specified in the
$FWDIR/conf/cpha_proc_list file (sk92904)

Additional critical devices (pnotes) can be registered by using the following syntax:
  cphaprob -d Device_Name -t TimeOut_in_Sec -s State [-p] register
Important Note: For R76 and above, refer to sk92878 (User Space process monitoring
mechanism in R76 ClusterXL).
Note: On Security Gateway in VSX mode, global pnotes can be registered only from the
context of VS0.

Any critical device (pnote) can be unregistered by using the following syntax:
 cphaprob -d Device_Name [-p] unregister
Note: On Security Gateway in VSX mode, global pnotes can be unregistered only
from the context of VS0.
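
The behavior of critical devices (registration, periodic state reports, timeouts, and the two restrictions listed earlier) can be modeled with a short sketch. This is an illustration only and does not call the cphaprob command: the class and method names are hypothetical, the "OK" return value is the model's own label, and treating a timeout of 0 as "never times out" is an assumption.

```python
MAX_DEVICES = 16    # limit on the number of critical devices per member
MAX_NAME_LEN = 16   # limit on the length of a critical device name


class CriticalDevices:
    """Toy model of the pnote table on one cluster member."""

    def __init__(self):
        self.devices = {}  # name -> [timeout_sec, state, last_report_time]

    def register(self, name, timeout_sec, state, now):
        if len(name) > MAX_NAME_LEN:
            raise ValueError("device name is limited to 16 characters")
        if name not in self.devices and len(self.devices) >= MAX_DEVICES:
            raise ValueError("number of critical devices is limited to 16")
        self.devices[name] = [timeout_sec, state, now]

    def report(self, name, state, now):
        entry = self.devices[name]
        entry[1] = state
        entry[2] = now

    def unregister(self, name):
        del self.devices[name]

    def member_state(self, now):
        # One late or problematic device is enough to bring the member down.
        for timeout_sec, state, last_report in self.devices.values():
            timed_out = timeout_sec > 0 and now - last_report > timeout_sec
            if state == "problem" or timed_out:
                return "Down"
        return "OK"
```

Time is passed in explicitly (the `now` arguments) so that the model can be exercised without waiting for real timeouts.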

Subscribers - User Space processes that are made aware of the current state of the
ClusterXL state machine and other clustering configuration parameters. List of such
subscribers can be obtained by running the cphaconf debug_data command.

Sticky connection - A connection is called 'sticky' if all packets are handled by a single
cluster member (in High Availability mode, all packets reach the 'Active' machine, so all
connections are sticky).

Non-sticky connection - A connection is called 'non-sticky' if the reply packet returns via a
different cluster member than the original packet (e.g., if network administrator has
configured asymmetric routing; in Load Sharing mode, all cluster members are 'Active', and
in Static NAT and encrypted connections, the Source and Destination IP addresses
change, therefore, Static NAT and encrypted connections through a Load Sharing cluster
may be non-sticky).

Flush and ACK (a.k.a. FnA, F&A) - Cluster member forces sending of the Delta Sync packet about
the incoming packet, waits for acknowledgements from all other Active members, and
only then allows the incoming packet to pass through.
In some scenarios, it is required that some information, written into the kernel tables, will
be Sync-ed promptly, or else a race condition can occur. The race condition may occur if a
packet that caused a certain change in kernel tables left cluster Member_A toward its
destination and then the return packet tries to go through cluster Member_B.
In general, this kind of situation is called asymmetric routing. What may happen in this
scenario is that the return packet arrives at cluster Member_B before the changes induced
by this packet were Sync-ed to this Member_B.
Example of such a case is when a SYN packet goes through cluster Member_A,
causing multiple changes in the kernel tables and then leaves to a server. The SYN-ACK
packet from a server arrives at cluster Member_B, but the connection itself was not
Sync-ed yet. In this condition, the cluster Member_B will drop the packet as an Out-of-State
packet ("First packet isn't SYN"). In order to prevent such conditions, it is possible
to use the “Flush and Ack” (F&A) mechanism.
This mechanism can send the Delta Sync packets with all the changes accumulated so
far in the Sync buffer to the other cluster members, hold the original packet that induced
these changes and wait for acknowledgement from all other (Active) cluster members that
they received the information in the Delta Sync packet. When all acknowledgements
have arrived, the mechanism releases the held original packet.
This ensures that by the time the return packet arrived from a server at the cluster, all
the cluster members are aware of the connection.
F&A is being operated at the end of the Inbound chain and at the end of the Outbound
chain (it is more common at the Outbound).
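
The hold-until-acknowledged behavior can be sketched as follows. This is a simplified model under assumptions: it holds a single packet at a time, it has no timeouts, and all names are hypothetical.

```python
class FlushAndAck:
    """Toy model of Flush and ACK on one cluster member."""

    def __init__(self, peers):
        self.peers = set(peers)  # the other Active cluster members
        self.sync_buffer = []    # changes accumulated so far
        self.flushed = []        # Delta Sync packets that were sent out
        self.held = None         # (packet, peers that have not acknowledged)
        self.released = []       # packets that were allowed to pass

    def record_change(self, change):
        self.sync_buffer.append(change)

    def flush_and_hold(self, packet):
        # Send everything accumulated in the Sync buffer, then hold the
        # packet that induced the changes.
        self.flushed.append(list(self.sync_buffer))
        self.sync_buffer.clear()
        self.held = (packet, set(self.peers))
        self._release_if_done()

    def on_ack(self, peer):
        if self.held is not None:
            self.held[1].discard(peer)
            self._release_if_done()

    def _release_if_done(self):
        # Release the held packet only after every peer has acknowledged.
        if self.held is not None and not self.held[1]:
            self.released.append(self.held[0])
            self.held = None
```

In the SYN example above, the SYN packet would be held on Member_A until Member_B acknowledges the Delta Sync packet, so that the SYN-ACK cannot reach Member_B before the connection is known there.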

Forwarding - Process of transferring incoming traffic from one cluster member to
another cluster member for processing.
There are two types of forwarding the incoming traffic between cluster members:
 Packet forwarding
 Chain forwarding
Refer to Forwarding section.

Packet Selection - Distinguishing between different kinds of packets coming from the
network, and selecting, which member should handle a specific packet (Decision Function
mechanism):
 CCP packet from another member of this cluster
 CCP packet from another cluster or from a cluster member with another version
(usually older version of CCP)
 Packet is destined directly to this member
 Packet is destined to another member of this cluster
 Packet is intended to pass through this cluster member
 ARP packets

CPHA - General term that stands for Check Point High Availability (historic fact: the first
release of ClusterXL supported only High Availability) that is used only for internal
references (e.g., inside kernel debug) to designate ClusterXL infrastructure.

Probing - If a cluster member fails to receive status for another member (does not receive
CCP packets from that member) on a given segment, the cluster member will probe that
segment in an attempt to elicit a response.
The purpose of such probes is to detect the nature of possible interface failures, and to
determine which module has the problem.
The outcome of this probe will determine what action is taken next (change the state of
an interface, or of a cluster member).
Refer to Cluster Control Protocol (CCP) section.

IP tracking - Collecting and saving of Source IP addresses and Source MAC addresses
from incoming IP packets during the probing.
This information is saved in IP tracking tables according to IP tracking policy:
 host_ip_addrs_all, id 8125
 host_ip_addrs, id 8177
IP tracking is useful for members within a cluster to determine whether the network
connectivity of the member is acceptable.

IP tracking policy - Setting that controls, which IP addresses should be tracked during IP
tracking:
 Only IP addresses from the subnet of cluster VIP, or from subnet of physical cluster
interface (fwha_track_ip_policy=1; default value)
 All IP addresses, also outside the cluster subnet (fwha_track_ip_policy=0)

Pingable host - Some host (i.e., some IP address) that cluster members can ping during
probing mechanism. Pinging hosts in an interface's subnet is one of the health checks that
ClusterXL mechanism performs. This pingable host will allow the cluster members to
determine with more precision what has failed (which interface on which member).
On Sync network, usually, there are no hosts. In such case, if switch supports this, an
IP address should be assigned on the switch (e.g., in the relevant VLAN).
The IP address of such pingable host should be assigned per this formula:
IP_of_pingable_host = IP_of_physical_interface_on_member + ~10

Assigning the IP address to pingable host that is higher than the IP addresses of
physical interfaces on the cluster members will give some time to cluster members to
perform the default health checks.
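
The formula can be applied mechanically as in the sketch below. It is only an illustration: the function name is hypothetical, the offset of 10 is a literal reading of the approximate "~10" in the formula, the highest member address is used as the base, and no check is made that the result stays inside the subnet.

```python
import ipaddress


def suggest_pingable_host(member_ips, offset=10):
    """Suggest an address about `offset` above the highest member address."""
    highest = max(ipaddress.IPv4Address(ip) for ip in member_ips)
    return str(highest + offset)
```

For members 10.20.30.41 and 10.20.30.42, the function returns `10.20.30.52`.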

Example:
 IP address of physical interface on a given subnet on Member_A is 10.20.30.41
 IP address of physical interface on a given subnet on Member_B is 10.20.30.42
 IP address of pingable host should be at least 10.20.30.50

Flapping - Consecutive changes in the state of either cluster interfaces (cluster interface
flapping), or cluster members (cluster member flapping). Such consecutive changes in the
state are seen in SmartView Tracker (if in SmartDashboard in cluster object, the cluster
administrator set 'Track changes in the status of cluster members' to 'Log').

VMAC - Virtual MAC address (available since R71). When this feature is enabled on cluster
members, all cluster members in High Availability New mode / Load Sharing Unicast mode
(Note: any VSX cluster works in High Availability mode) associate the same Virtual MAC
address with Virtual IP address.
This allows avoiding issues when Gratuitous ARP packets sent by cluster during failover
are not integrated into ARP cache table on switches surrounding the cluster.
Refer to sk50840 (How to enable ClusterXL Virtual MAC (VMAC) mode).

HTU - Stands for "HA Time Unit". All internal time in ClusterXL is measured in HTUs (the
times in cluster debug also appear in HTUs).
Formula in the code:
1 HTU = 10 x fwha_timer_base_res = 10 x 10 milliseconds = 100 ms
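
When reading times in cluster debug, the conversion can be written out as a small helper. The constant and function names below are hypothetical; the values come from the formula above.

```python
FWHA_TIMER_BASE_RES_MS = 10            # value of fwha_timer_base_res in the formula
HTU_MS = 10 * FWHA_TIMER_BASE_RES_MS   # 1 HTU = 100 ms


def htu_to_ms(htu):
    """Convert HA Time Units to milliseconds."""
    return htu * HTU_MS


def ms_to_htu(ms):
    """Convert milliseconds to HA Time Units."""
    return ms / HTU_MS
```

For example, `htu_to_ms(1)` is 100, and an interval of 0.7 sec corresponds to `ms_to_htu(700)`, which is 7 HTU.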

(1-4) ClusterXL requirements for hardware and software

Note: ClusterXL product is part of the standard Security Gateway installation.

(1-4-A) Open Servers vs. Check Point appliances

  If ClusterXL is installed on Open Servers, then it must be installed in a distributed
configuration, in which the cluster members and the Security Management Server
are installed on different machines.

  If ClusterXL is installed on Check Point appliances, then it can be installed in
either a distributed configuration (in which the cluster members and the Security
Management Server are installed on different machines), or in a Full High
Availability configuration (in which the cluster members and the Security
Management Server are installed on the same machines).

(1-4-B) Requirements for number of cluster members

  Up to 5 cluster members are supported in ClusterXL.
  Up to 5 cluster members are supported in 3rd party cluster (IPSO / X-Series).
  Up to 2 cluster members are supported in VRRP cluster on Gaia OS (sk105170).
  Up to 13 cluster members are supported in Virtual System Load Sharing (VSLS).
Notes:
  In ClusterXL Load Sharing mode, configuring more than 4 members significantly
decreases cluster performance due to the amount of Delta Sync traffic.
  In X-Series chassis, configuring more than 4 members (APMs) significantly
decreases cluster performance due to the amount of Delta Sync traffic.
  In X-Series DBHA configuration, the above requirement applies to a single chassis
(Check Point code is not aware of DBHA).

(1-4-C) Requirements for hardware

ClusterXL operation completely relies on internal timers and calculation of internal
timeouts, which are based on hardware clock ticks.

Therefore, in order to avoid unexpected behaviour, ClusterXL is supported only
between machines with identical CPU characteristics.

In addition, in order to avoid unexpected fail-overs due to issues with CCP packets on
cluster interfaces, it is strongly recommended to pair only identical physical interfaces as
cluster interfaces - even when connecting the cluster members via a switch:
 Intel 82598EB on Member_A with Intel 82598EB on Member_B
 Broadcom NeXtreme on Member_A with Broadcom NeXtreme on Member_B

Note: There is no requirement for throughput of Sync interface to be identical to / larger
than throughput of traffic interfaces (although, to prevent a possible bottleneck, a good
practice is for throughput of Sync interface to be at least identical to throughput of traffic
interfaces).

(1-4-D) Requirements for software

ClusterXL is supported only between identical operating systems (all cluster members
must be installed on the same operating system) and between identical Check Point
software versions (all cluster members must be installed with identical Check Point
software, including OS build and hotfixes).

All ClusterXL modes are supported on all operating systems (except High Availability
Legacy (Traditional) mode).

All Check Point software components must be identical on all cluster members.
Meaning that identical Software Blades and features must be enabled on all cluster
members:
 SecureXL status - SecureXL on all members has to be either enabled, or disabled
 Number of CoreXL FW instances - number of instances on all members must be
identical
 Advanced Dynamic Routing - on all members has to be either enabled, or disabled
Otherwise, traffic might not be processed as expected and/or state of cluster members
might change unexpectedly. In addition, Full Sync will fail.

Refer to this solution:


 sk41023 ('fwsync: there is a different installation of Check Point's products on each
member of this cluster' error in /var/log/messages)

(1-4-E) Requirements for switches and routers

Cluster interfaces can be connected only via Layer 2 networking devices - hubs and
switches. Connecting cluster interfaces via Layer 3 networking devices (routers) is not
supported.
Cluster networks must meet the requirements for latency (no more than 100 milliseconds)
and packet loss (no more than 5%).
Note: Latency cannot be measured correctly by a simple tool such as Ping or Traceroute -
more sophisticated tools are required that measure electrical signals on the wire.

(1-4-E-i) High Availability New mode and Load Sharing Unicast Mode

When running the CCP in Multicast mode (default), the Layer 2 Destination MAC
address of CCP packets is a Multicast MAC address 01:00:5E:X:X:X.

Configure the following settings:

Switch settings:

IGMP and Static CAMs - By default, ClusterXL does not support IGMP registration (also
known as IGMP Snooping).
Either disable IGMP registration in switches that rely on IGMP packets to configure
their ports, or enable IGMP registration on ClusterXL members per sk33221.
In situations where disabling IGMP registration in switches is not acceptable, it is
necessary to configure static CAMs in order to allow multicast traffic on specific ports.

Disabling multicast limits - Certain switches have an upper limit on the number of
broadcasts and multicasts that they can pass, in order to prevent broadcast storms.
This limit is usually a percentage of the total interface bandwidth.
It is possible either to turn off broadcast storm control, or to allow a higher level of
broadcasts or multicasts through the switch.
If the connecting switch is incapable of having any of these settings configured, it is
possible, though less efficient, for the switch to use broadcast to forward traffic, and to
configure the cluster members to run CCP in broadcast mode per sk20576.

Router settings:

Unicast MAC - When working in High Availability New mode (without VMAC) /
Load Sharing Unicast mode, the Cluster Virtual IP address is mapped to a physical MAC
address of the 'Active' / 'Pivot' member.
In case of fail-over, another member will be promoted to 'Active' / 'Pivot'. As a result,
the Cluster Virtual IP address will be mapped to a new physical MAC address.
In order to update the surrounding networking devices, the 'Active' / 'Pivot' member
sends Gratuitous ARP packets.
The router needs to be able to learn this MAC through these ARP packets (otherwise, it
will route the traffic to the "old" MAC address, which will cause traffic outage on the
network).

(1-4-E-ii) Load Sharing Multicast Mode

In Load Sharing Multicast mode, the cluster Virtual IP address (that represents the cluster
on that network) is associated with Multicast MAC Address 01:00:5E:X:Y:Z, which is
generated based on the last 3 bytes of cluster Virtual IP address.
Refer to Mode comparison table.

When working in Load Sharing Multicast mode, the router must support sending unicast
IP packets with multicast MAC addresses. This is required so that all cluster members will
receive the data packets.

Configure the following settings:

Switch settings:

CCP in Multicast mode - Multicast mode is the default Cluster Control Protocol mode in
Load Sharing Multicast.

Port Mirroring - ClusterXL does not support the use of unicast MAC addresses
with Port Mirroring for Multicast Load Sharing solutions.

Router settings:

Static MAC - Most routers can map the following ARP entries automatically
using the ARP mechanism:
  unicast Layer 3 IP address
  multicast Layer 2 MAC address
If you have a router that is not able to learn this type of mapping dynamically, you will
have to configure these mappings as static MAC entries.

IGMP and static CAMs - Some routers require disabling of IGMP snooping or
configuration of static CAMs in order to support sending packets with unicast Layer 3
IP address and multicast Layer 2 MAC address.

Disabling multicast limits - Certain routers have an upper limit on the number of
broadcasts and multicasts that they can pass, in order to prevent broadcast storms.
This limit is usually a percentage of the total interface bandwidth.
It is possible either to turn off broadcast storm control, or to allow a higher level of
broadcasts or multicasts through the router.

Disabling forwarding multicast traffic to the router - Some routers will send multicast
traffic to the router itself. This may cause a packet storm through the network, and
should be disabled.

Refer to ClusterXL Administration Guide (R70, R70.1, R71, R75, R75.20, R75.40,
R75.40VS, R76, R77.X, R80.10) - Example Configuration of a Cisco Catalyst Routing
Switch.

Refer to these solutions related to IGMP snooping:


 sk31934 (ClusterXL IGMP Membership)
 sk33221 (Using ClusterXL with IGMP Snooping-enabled switches)
 sk22495 (Interface flapping (down/up) in a ClusterXL environment)
 sk93327 (IGMP groups are not learned on cluster member)

(1-4-F) ClusterXL licenses

To use ClusterXL, each Security Gateway in the cluster configuration must have a
regular Security Gateway license and the Security Management Server must have a
license for each cluster defined. There are separate licenses for cluster High Availability
mode and for cluster Load Sharing mode.

It does not matter how many Security Gateways are included in the cluster. If the proper
licenses are not installed, the policy installation operation will fail.

Refer to these solutions:


 sk11054 (Check Point License Guide)
 sk10200 ('too many internal hosts' error in /var/log/messages on Security Gateway)

For assistance with licenses, contact Check Point Customer Account Services
(http://www.checkpoint.com/form/contact_account.html, AccountServices@checkpoint.com,
+1-972-444-6600 ext 5).

2. State Synchronization in ClusterXL

(2-1) Introduction

A failure of a firewall results in an immediate loss of active connections in and out of the
organization. Many of these connections, such as financial transactions, may be
mission critical, and losing them will result in the loss of critical data.

ClusterXL supplies an infrastructure that ensures that no data is lost in case of a failure,
by making sure each cluster member is aware of the connections going through the other
members. Passing information about connections (stored in various Check Point kernel
tables) and other Security Gateway states between the cluster members is called State
Synchronization.

Every IP-based service (including ICMP, TCP and UDP) recognized by the Security
Gateway is synchronized (unless configured otherwise in SmartDashboard).

State Synchronization is used both by ClusterXL and by 3rd party OPSEC-certified
clustering products.

ClusterXL modes and state synchronization:


 ClusterXL High Availability configuration does not require state synchronization,
though if it is not enabled, connections will be lost upon failover.
 ClusterXL Load Sharing configuration requires state synchronization (it is enabled
automatically and can not be disabled).

(2-2) State Synchronization modes

State Synchronization uses the following two synchronization modes (since NG with
Application Intelligence):

 Full Sync - Complete synchronization of relevant kernel tables by a cluster member
that tries to join the cluster against the working cluster member(s). This process is
meant to fetch a “snapshot” of the relevant kernel tables of the already Active cluster
member(s).

Full Sync is performed during initialization of Check Point software (during the boot
process, the first time the member runs policy installation, during 'cpstart'). Until the
Full Sync process completes successfully, this member remains in the 'Down' state,
because it can not function as a cluster member until it is fully synchronized with the
other cluster members.

Meanwhile the Delta Sync packets continue to arrive and are stored in kernel
memory until Full Sync completes.

The whole Full Sync process is performed by FWD daemons on TCP port 256 and is
always done over SIC (the information is written into relevant kernel tables via IOCTL):

o The member that tries to join the cluster starts to serve as Full Sync Client.
$FWDIR/log/fwd.elg log file shows:
    fwsync: Connected to Sync Server Decimal_IP_Address_of_Peer_Member. Starting full sync
    fwsync: Full sync connection finished successfully
    fwsync: End Sync Connection successfully

o A member chosen for Full Sync starts to serve as Full Sync Server.
$FWDIR/log/fwd.elg log file shows:
    fwd_syncn_handler: got new full sync connection request from peer Hex_IP_Address_of_Peer_Member

The FWD daemons send the information in chunks, and wait for the peer to confirm
receipt of each chunk before sending the next chunk.

 Delta Sync - Synchronization of kernel tables between all working cluster members
- exchange of CCP packets that carry pieces of information about different
connections and operations that should be performed on these connections in
relevant kernel tables.

This Delta Sync process is performed directly by Check Point kernel.

While Full Sync is in progress, the Delta Sync updates are not processed; they are saved in
kernel memory. After Full Sync is complete, the Delta Sync packets stored during the
Full Sync phase are applied in order of arrival.

Whenever an operation is performed on a kernel table that is marked as "sync"-ed
(in $FWDIR/conf/table.def file on Security Management Server), the Delta Sync
mechanism duplicates this action into a buffer of its own.
When this Delta Sync buffer is full, and also at every Sync timer interval, the Delta Sync
buffer is sent to all cluster members over the Synchronization Network. The receiving
member duplicates those actions into its kernel tables.
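
The buffering behavior described for Delta Sync can be illustrated with a minimal Python sketch. This is not Check Point code: the class names DeltaSyncSender and JoiningMember, the buffer capacity, the timer interval, and the (action, key, value) operation format are all hypothetical and chosen only for illustration. The sender flushes its buffer when the buffer is full and at every timer interval; the joining member holds incoming deltas until Full Sync completes, and then applies them in order of arrival.

```python
from collections import deque


class DeltaSyncSender:
    """Collects table operations and sends them in batches (illustrative only)."""

    def __init__(self, capacity, interval, send):
        self.capacity = capacity    # hypothetical buffer size, in operations
        self.interval = interval    # hypothetical Sync timer interval, in seconds
        self.send = send            # callback that delivers one batch to the peers
        self.pending = []
        self.last_flush = 0.0

    def record(self, operation, now):
        # Duplicate the table operation into the buffer; flush when it is full.
        self.pending.append(operation)
        if len(self.pending) >= self.capacity:
            self.flush(now)

    def tick(self, now):
        # Called periodically; flush once a full timer interval has elapsed.
        if now - self.last_flush >= self.interval:
            self.flush(now)

    def flush(self, now):
        if self.pending:
            self.send(list(self.pending))
            self.pending.clear()
        self.last_flush = now


class JoiningMember:
    """Holds deltas during Full Sync, then replays them in arrival order."""

    def __init__(self):
        self.table = {}
        self.full_sync_done = False
        self.queued = deque()

    def on_delta_batch(self, batch):
        for operation in batch:
            if self.full_sync_done:
                self._apply(operation)
            else:
                self.queued.append(operation)

    def on_full_sync_complete(self, snapshot):
        # Install the snapshot first, then replay everything that was queued.
        self.table = dict(snapshot)
        self.full_sync_done = True
        while self.queued:
            self._apply(self.queued.popleft())

    def _apply(self, operation):
        action, key, value = operation
        if action == "set":
            self.table[key] = value
        elif action == "delete":
            self.table.pop(key, None)
```

Passing the current time explicitly (the now argument) keeps the sketch deterministic; a real implementation would be driven by kernel timers rather than by caller-supplied timestamps.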

(2-3) Restrictions

1. Refer to ClusterXL Requirements for Hardware and Software section above.

2. State synchronization is supported only between cluster members that meet the
following requirements:
 identical operating systems
 identical Check Point software components
 latency on synchronization network is less than 100 milliseconds and packet loss
is less than 5%
 Note: There is no requirement for throughput of Sync interface to be identical to /
larger than throughput of traffic interfaces.

In addition, some connections can not be synchronized by design:
 Connections that use User Authentication can not be synchronized (because
user authentication state is maintained on Security Servers, which are User
Space processes, and thus cannot be synchronized on different machines in the
way that kernel data can be synchronized).
 Connections that use Resources can not be synchronized (because state of
such connections is maintained on Security Servers, which are User Space
processes, and thus cannot be synchronized on different machines in the way
that kernel data can be synchronized).
 Accounting information can not be synchronized (because it is accumulated in
each cluster member and reported separately to the Security Management
Server, where the information is aggregated).
 Broadcasts and multicasts can not be synchronized by design.
 When DHCP Server is enabled on cluster members, the DHCP Server lease
database is not synchronized by design.
 In R6x versions, Web Intelligence features on a ClusterXL cluster do not survive
failover. This means that if ClusterXL is providing Web Intelligence protections
and a cluster member fails, HTTP connections passing through the failed
member are lost.

Refer to sk92909 (How to debug ClusterXL to understand why a connection is not
synchronized).

(2-4) Synchronization network

The Synchronization Network is the set of interfaces on cluster members that were
configured to pass State Synchronization information (as Delta Sync packets).

Important Notes:
1. In R70 and above, the use of more than one Synchronization Network for
redundancy is not supported because the CPU load will increase significantly due to
duplicate tasks performed by all configured Synchronization Networks.
If a redundancy of Synchronization Networks is required, Check Point recommends
using Link Aggregation - configure several physical interfaces as a Bond interface,
and then configure single dedicated Synchronization Network over this single Bond
interface.
Refer to Link Aggregation (Bonding) section.
Refer to sk92804 (Sync Redundancy in ClusterXL).
2. State Synchronization information (payload of Delta Sync packets) is not encrypted.
It is up to cluster administrator to make sure that the Sync network is secured and
isolated.

(2-5) Configuring synchronization network

It is strongly recommended to configure the Synchronization Network on a dedicated interface
- meaning, configuring 'Network Objective' for the interface only as '1st sync' and not as
'Cluster + 1st sync'.

Adding the 'Sync' objective to the interfaces that pass the production traffic creates
additional load on that interface and on the CPU, and makes troubleshooting much more
difficult.

In ClusterXL (including VSX), the Synchronization Network is supported only on
the lowest VLAN tag of a VLAN interface. For example, if three VLANs with tags 10, 20
and 30 were configured on interface eth1, then only interface eth1.10 may be used for
State Synchronization.

If 'Sync' Network Objective was configured on any VLAN tag other than the lowest tag,
then cluster members will reject such configuration, and the output of 'cphaprob -a if'
command will explicitly show that no synchronization interfaces were configured.
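
The lowest-VLAN-tag rule can be checked before the configuration is applied. The following Python sketch is only an illustration: the function name sync_capable_vlan and the assumption that VLAN interfaces are named in the form physical.tag (for example, eth1.10) are not taken from Check Point tooling.

```python
def sync_capable_vlan(interface_names, physical):
    """Return the only VLAN interface on `physical` that may carry Sync, or None.

    Assumes VLAN interfaces are named "<physical>.<tag>", for example "eth1.10".
    """
    prefix = physical + "."
    tags = [
        int(name[len(prefix):])
        for name in interface_names
        if name.startswith(prefix) and name[len(prefix):].isdigit()
    ]
    if not tags:
        return None
    # Only the lowest VLAN tag on the physical interface is supported for Sync.
    return f"{physical}.{min(tags)}"
```

For the example above, sync_capable_vlan(["eth1.20", "eth1.10", "eth1.30"], "eth1") returns "eth1.10".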

(2-6) Configuring a service to be non-synchronized

Refer to ClusterXL Administration Guide (R70, R70.1, R71, R75, R75.20, R75.40,
R75.40VS, R76, R77.X, R80.10) - Chapter 'Synchronizing Connection Information Across
the Cluster' - Configuring State Synchronization - Configuring a Service Not to Synchronize.

By default, all connections are synchronized (i.e., for each processed connection, a
Delta Sync packet is created, sent by the member that processed this connection, received
and processed by peer members).

If the amount of traffic is high, the amount of Delta Sync packets will cause noticeable
load on the CPU. This load will increase significantly, if more than one Synchronization
Network is configured.

In order to increase the performance of cluster members, synchronization of some
connections can be set in the following way:
 disabled completely (e.g., for DNS UDP, ICMP)
 delayed, so that it starts after a pre-defined delay (e.g., for HTTP downloads)

In order to change the relevant settings for a specific service:
1. Locate the service (either in the rulebase, or on the 'Services' tab)
2. Right-click on the service - select 'Edit...' - click on 'Advanced...' button
3. Change the relevant settings:
A. To disable the synchronization completely, uncheck the box 'Synchronize
connections on Cluster'.

B. To start the synchronization after a pre-defined delay, check the box
'Synchronize connections on Cluster', check the box 'Start
synchronizing [ ] seconds after connection initiation' and
enter the desired number of seconds.

4. Click 'OK' to apply the changes.
5. Save the settings: 'File' menu - 'Save'.
6. Install policy onto cluster object.

It is possible to have both a synchronized and a non-synchronized definition of the
service, and to use them selectively in the Security Rule Base:
1. Create a new service (of type TCP, UDP or Other) and give it a name that
distinguishes it from the existing service.
2. Copy all the definitions from the existing service into the Service Properties window
of the new service.
3. In the new service, click on 'Advanced...' button.
4. Copy all the advanced definitions from the existing service into the Advanced
Service Properties window of the new service.
5. Set the 'Synchronize connections on Cluster' in the new service, so that it
is different from the setting in the existing service.
6. Save the settings: 'File' menu - 'Save'.

Disable the Synchronization for a service if ALL of the following conditions are true:
1. A significant portion of the traffic crossing the cluster uses a particular service. If you
do not synchronize this service, then the amount of synchronization traffic is reduced
and cluster performance is enhanced.
2. The service usually opens short connections, whose loss may not be noticed. DNS
(over UDP) and HTTP are typically responsible for most connections, and generally
have very short life and inherent recoverability at the application level. However,
services, which typically open long connections, such as FTP, should always be
synchronized.
3. Configurations that ensure bi-directional stickiness for all connections do not require
synchronization to operate (only to maintain High Availability). Such configurations
include:
 Any cluster in High Availability mode (for example, ClusterXL New HA or VRRP)
 ClusterXL in a Load Sharing mode with clear connections (no VPN or static NAT)
 OPSEC clusters that guarantee full stickiness (refer to the OPSEC cluster's
documentation)

3. ClusterXL Modes
 Up to 8 cluster members are supported in ClusterXL.
 Up to 5 cluster members are supported in a 3rd party cluster (IPSO / X-Series).
 Up to 2 cluster members are supported in a VRRP cluster on Gaia OS (sk105170).
Notes:
 In ClusterXL Load Sharing mode, configuring more than 4 members significantly
decreases cluster performance due to the amount of Delta Sync.
 In X-Series chassis, configuring more than 4 members (APMs) significantly
decreases cluster performance due to amount of Delta Sync.
 In X-Series DBHA configuration, the above requirement applies to a single chassis
(Check Point code is not aware of DBHA).

Explanation: Cluster performance is decreased due to the high load on CPU that is
caused by the amount of Delta Sync packets, which increases significantly with the number
of cluster members (the whole cluster might suffocate, depending on the production traffic,
of course).

(3-1) Mode comparison table

Refer to ClusterXL Administration Guide (R70, R70.1, R71, R75, R75.20, R75.40,
R75.40VS, R76, R77.X, R80.10) - Chapter 'High Availability and Load Sharing in ClusterXL'
- ClusterXL Modes - Mode Comparison Table:

Feature | High Availability New Mode | High Availability Legacy Mode (1) | VRRP | Load Sharing Multicast Mode (2) | Load Sharing Unicast Mode (2)
High Availability | Yes | Yes | Yes | No | No
Load Sharing | No | No | No | Yes | Yes
Assigned Traffic Load per Member | 100% | 100% | 100% | 100% / N | sk34668
Performance | Good | Good | Good | Excellent | Very Good
HW Support | All | All | All | Not all routers (3) | All
SecureXL Support | Yes | Yes | Yes | Yes (2) | Yes (2)
State Synchronization is Mandatory | No | No | No | Yes | Yes
VLAN Tagging Support | Yes | Yes | Yes | Yes | Yes
Number of members that deal with network traffic | 1 | 1 | 1 | N | N
Number of members that receive packets from router | 1 | 1 | 1 | N | 1
How cluster answers ARP requests for a VIP address (4) | Unicast MAC address of Active | Unicast shared MAC address | Virtual VRRP MAC address | Multicast MAC address (5) | Unicast MAC address of Pivot
CCP mode (also refer to sk36644) | Multicast / Broadcast | Broadcast only | Multicast / Broadcast | Multicast / Broadcast | Multicast / Broadcast

Notes:

1. High Availability Legacy Mode is not supported on Gaia OS (sk103083).
2. Refer to sk101539 (ClusterXL Load Sharing mode limitations and important notes).
3. Refer to Requirements for switches and routers section.
4. Refer to sk31782 (ClusterXL association between cluster Virtual IP addresses and
MAC Addresses).
5. In Load Sharing Multicast mode, the Multicast MAC address that is associated with
Virtual IP address is calculated automatically (and can be changed manually).

In order to see / change this Multicast MAC address:

A. Open SmartDashboard
B. Open cluster object
C. Go to 'Topology' pane - click on 'Edit...'
D. Select the relevant VIP interface - click on 'Edit...'
E. On 'General' tab, click on 'Advanced...'
F. If you need to change this address, then
select 'User defined:' and enter new Multicast MAC address
G. Click 'OK' in all windows to apply the changes
H. Save the changes: go to 'File' menu - click on 'Save'
I. Install policy

Automatic algorithm for generating a Multicast MAC address to be associated with
cluster Virtual IP address of a format "A"."B"."C"."D":

o If 2nd octet "B" ≤ 127, then
Final MAC = 01:00:5E:("B"hex):("C"hex):("D"hex)

Example:
VIP = 192.50.204.20
Final MAC = 01:00:5E:("50"hex):("204"hex):("20"hex) =
= 01:00:5E:32:CC:14

o If 2nd octet "B" > 127, then
Final MAC = 01:00:5E:("B-128"hex):("C"hex):("D"hex)

Example:
VIP = 192.168.204.20
Final MAC = 01:00:5E:("168-128"hex):("204"hex):("20"hex) =
= 01:00:5E:28:CC:14
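
The calculation can be reproduced with a short Python sketch, which is useful when preparing static MAC entries on a router. The function name multicast_mac_for_vip is hypothetical. The sketch keeps the 2nd octet unchanged when it is 127 or lower and subtracts 128 otherwise; it covers only the automatic calculation, not a user-defined address.

```python
def multicast_mac_for_vip(vip):
    """Derive the automatic Multicast MAC address for a cluster Virtual IP."""
    octets = [int(part) for part in vip.split(".")]
    if len(octets) != 4 or any(o < 0 or o > 255 for o in octets):
        raise ValueError(f"not a dotted-quad IPv4 address: {vip!r}")
    _, b, c, d = octets
    if b > 127:
        # Drop the top bit of the 2nd octet (equivalent to B - 128).
        b -= 128
    return "01:00:5E:{:02X}:{:02X}:{:02X}".format(b, c, d)


print(multicast_mac_for_vip("192.50.204.20"))   # 01:00:5E:32:CC:14
print(multicast_mac_for_vip("192.168.204.20"))  # 01:00:5E:28:CC:14
```

Both printed values match the two worked examples.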

Refer to sk25977 (Connecting multiple clusters to the same network segment (same
VLAN, same switch)) - section about the "Destination MAC address" of the Cluster
Control Protocol.

(3-2) Example cluster topology

This diagram can be used as example of cluster topology for these modes:
o High Availability New mode
o Full High Availability mode
o Load Sharing Multicast mode
o Load Sharing Unicast mode

Note: The example diagram for High Availability Legacy (Traditional) mode appears in
the corresponding section.

(3-3) High Availability New mode

Refer to ClusterXL Administration Guide (R70, R70.1, R71, R75, R75.20, R75.40,
R75.40VS, R76, R77.X, R80.10) - Chapter 'High Availability and Load Sharing in ClusterXL'
- ClusterXL Modes - High Availability Mode.

The High Availability New mode provides basic High Availability capabilities in a cluster
environment. The cluster provides Firewall services even when it encounters a problem,
which on a StandAlone module would have resulted in a complete loss of connectivity.
When combined with Check Point's State Synchronization, ClusterXL in High Availability
New mode can maintain connections through failover events, in a user-transparent manner,
allowing a flawless connectivity experience.

ClusterXL High Availability New mode designates one of the cluster members as the
Active machine, while the rest of the members are kept in a Standby mode.

High Availability New mode uses unique, real IP addresses for the cluster members'
interfaces. The cluster Virtual IP addresses are associated with the physical network
interfaces of the Active machine (by matching the Virtual IP address with the unique MAC
address of the appropriate physical interface).
The cluster members' physical IP addresses do not have to be routable on the Internet.
Only the cluster Virtual IP addresses must be routable.

All traffic directed at the cluster is actually routed (and filtered) by the Active member -
assigned traffic load is 100%. The role of each cluster member is chosen according to its
priority, with the Active member being the one with the highest priority (and lowest Member
ID).

Whenever the cluster detects a problem in the Active member that is severe enough to
cause a failover event, it passes the role of the Active member to one of the Standby
machines (the member with the next highest priority).

If State Synchronization is applied, any open connections are recognized by the new
Active machine, and are handled according to their last known state.

Upon the recovery of a member with a higher priority, the role of the Active machine
may or may not be switched back to that member, depending on the cluster configuration.
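
The role selection described above can be sketched in Python. This is a simplified illustration and not ClusterXL logic: the function name elect_active and the switch_back flag are hypothetical, and the sketch assumes that a lower Member ID means a higher priority.

```python
def elect_active(member_ids, healthy, current_active=None, switch_back=True):
    """Pick the Active member among healthy members (lowest ID = highest priority).

    switch_back=False models a configuration where a recovered higher-priority
    member does not take the Active role back from the current Active member.
    """
    candidates = [m for m in sorted(member_ids) if m in healthy]
    if not candidates:
        return None  # no member is able to become Active
    if not switch_back and current_active in candidates:
        return current_active
    return candidates[0]
```

For example, with members 1, 2 and 3, a failure of member 1 yields elect_active([1, 2, 3], {2, 3}, current_active=1) == 2. When member 1 recovers, the result depends on switch_back: 1 if the role is switched back, 2 if the current Active member keeps the role.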

(3-4) Full High Availability mode

Refer to ClusterXL Administration Guide (R70, R70.1, R71, R75, R75.20) - Chapter
'UTM-1 Clustering'.

Full High Availability (a.k.a. Full HA) mode is a special Cluster Mode that is supported
only on Check Point appliances running Gaia OS or SecurePlatform OS, where each
cluster member also runs as a Security Management Server.
This mode provides redundancy both between Security Gateways (only High Availability
is supported - assigned traffic load is 100%) and between Security Management Servers
(only High Availability is supported and there is no failover between the Security
Management server components).

Refer to these solutions:


 sk39345 (Management High Availability restrictions)
 sk61580 (How to rebuild a Full HA cluster after primary member fails and backup is
not available)
 sk98831 ("Execution finished with errors" message on migrate import / export
command failure)
 sk67681 (How to migrate from distributed environment to a UTM-1 Full HA cluster)
 sk44201 (How to migrate Full HA environment to Distributed)
 sk69627 (How to migrate from StandAlone configuration on Open Server to Full HA
cluster on UTM-1 appliances)
 sk60443 (How To Install UTM-1 Appliances in Full HA cluster)
 sk79200 (SmartView Monitor shows 'Log Server is not responding' error for a Full
HA cluster on Gaia OS)
 sk36863 (Limitations of UTM-1 cluster that consists of different UTM-1 appliance
models)

Refer to 'cp_conf fullha' command.

Notes:
 Unlike between the Security Gateway components, there is no failover between the
Security Management server components. If the Primary Security Management
Server goes down, the Secondary Security Management Server does not take over.
However, the database on the Secondary Security Management Server is fully
synchronized with the database on the Primary Security Management Server, so no
data is lost.

 The members of Full HA cluster can be configured either together (both Check Point
appliances are linked before the First Time Configuration Wizard is opened), or
separately (the user chooses to configure a cluster consisting of a single, Primary
member, and configure the Secondary member later).
 Even if you decide not to install a Secondary cluster member during the initial
configuration, it is still worth your while to configure a cluster composed of a single
Primary member. A Full HA cluster is visible to the external network through its
Virtual IP addresses, not the actual physical addresses of its members. If at some
point you do decide to add a Secondary member, you will not have to alter the Layer
3 topology of your networks.

(3-5) High Availability Legacy (Traditional) mode

Refer to ClusterXL Administration Guide (R70, R70.1, R71, R75, R75.20, R75.40,
R75.40VS, R76, R77.X, R80.10) - Appendix A 'High Availability Legacy Mode'.

Note: High Availability Legacy Mode is not supported on Gaia OS (sk103083).

Important Note: Although it is still supported, High Availability Legacy mode is
considered obsolete. There is no technical reason to use this mode anymore. If the cluster
administrator needs High Availability mode and the association of Cluster Virtual IP
addresses with constant unicast MAC addresses, then the cluster should be configured in
High Availability New mode with enabled VMAC (refer to sk50840).

In High Availability Legacy mode, the cluster members share identical IP and MAC
addresses, so that the Active cluster member receives from a hub or switch all the packets
that were sent to the cluster IP address - assigned traffic load is 100%. A shared interface
is an interface with MAC addresses and IP addresses that are identical to those of another
interface on the peer members.

The Security Management Server must not be connected to these shared interfaces. It
must reach the cluster members through a non-shared network - in other words, the
synchronization network of the cluster, or a dedicated management network.

Configuring this mode is complicated, and must be performed in a precise sequence in
order to be successful.

Configuration instructions:
1. On Cluster Members
A. Disconnect the members from any switches / hubs
B. Install the same version of Check Point Security Gateway
C. Enable cluster membership
D. Configure identical IP addresses for shared interfaces
E. Do NOT reboot the machines
F. Connect the members through switch / hub - connect each network (internal,
external, Synchronization, DMZ, etc.) to a separate VLAN, switch or hub
G. Reboot the machines - for MAC Address configuration to take place

2. In SmartDashboard
A. Create a cluster object and select High Availability Legacy mode
B. Add members' objects - assign IP address from dedicated synchronization
network of the cluster, or from a dedicated management network
C. Initialize SIC
D. Define Topology:
 No Virtual IP addresses for shared interfaces
 Network Objective for shared interfaces is 'Monitored Private'
 Sync interfaces, Network Objective for dedicated management interfaces
is 'Monitored Private' or 'Non-Monitored Private'
E. Install policy
F. Reboot the machines - MAC Address configuration will take place

Example:

(3-6) Load Sharing Multicast mode

Refer to ClusterXL Administration Guide (R70, R70.1, R71, R75, R75.20, R75.40,
R75.40VS, R76, R77.X, R80.10) - Chapter 'High Availability and Load Sharing in ClusterXL'
- ClusterXL Modes - Load Sharing Multicast Mode.

ClusterXL in Load Sharing Multicast mode distributes traffic within a cluster of Security
Gateways, so that the total throughput of multiple machines is increased.
In Load Sharing configurations, all functioning machines in the cluster are Active, and
handle network traffic (Active/Active operation) - the assigned traffic load is 100% divided
equally among the active members (100% / N).
If there is a failure in one of the machines, its connections are redistributed amongst the
remaining operational machines in the cluster.

If any individual Check Point Security Gateway in the cluster becomes unreachable,
transparent failover occurs to the other machines, thus providing High Availability. All
connections are shared between the remaining Security Gateways without interruption.

Load Sharing Multicast mode uses unique, real IP addresses for the cluster members'
interfaces. The cluster Virtual IP addresses are associated with the Multicast MAC address
(created based on the Virtual IP addresses).
The cluster members' physical IP addresses do not have to be routable on the Internet.
Only the cluster Virtual IP addresses must be routable.

ClusterXL uses the Ethernet Multicast mechanism to associate the cluster Virtual IP
addresses with all cluster members. By binding these Virtual IP addresses to Multicast
MAC addresses, ClusterXL ensures that all packets sent to the cluster, acting as a gateway,
will reach all members in the cluster.
Distribution of the traffic between cluster members is performed by applying a Decision
Function to each packet - each member decides whether it should or should not process
the packet.
This decision is the core of the Load Sharing mechanism: it has to ensure that at least
one member will process each packet (so that traffic is not blocked), and that no two
members will handle the same packet (so that traffic is not duplicated).
If it is required that specific connections are always processed by a particular member,
then an additional decision algorithm can be enabled - Sticky Decision Function.
Refer to sk101539 (ClusterXL Load Sharing mode limitations and important notes).
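
The two requirements on the Decision Function (every packet is processed by at least one member, and no packet is processed by two members) can be illustrated with a hash-based Python sketch. The actual ClusterXL Decision Function is not described in this guide, so everything here is an assumption made for illustration: the function names, the packet tuple layout, and the use of SHA-256 over the connection endpoints.

```python
import hashlib


def owner_index(src_ip, dst_ip, src_port, dst_port, proto, n_members):
    """Map a connection to exactly one member index in range(n_members)."""
    # Sort the endpoints so that both directions of a connection hash alike.
    first, second = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    key = f"{first[0]}:{first[1]}|{second[0]}:{second[1]}|{proto}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_members


def should_process(member_index, packet, n_members):
    """Each member runs this on every received packet and keeps only its share."""
    return owner_index(*packet, n_members) == member_index
```

Because every member evaluates the same deterministic function on the same packet fields, exactly one member index matches for each packet, which satisfies both requirements as long as all members agree on n_members.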

(3-7) Load Sharing Unicast mode

Refer to ClusterXL Administration Guide (R70, R70.1, R71, R75, R75.20, R75.40,
R75.40VS, R76, R77.X, R80.10) - Chapter 'High Availability and Load Sharing in ClusterXL'
- ClusterXL Modes - Load Sharing Unicast Mode.

The Load Sharing Unicast mode was developed in order to meet the needs of
customers, who use legacy routers that do not support the use of a multicast MAC address
for a unicast IP address (refer to Requirements for switches and routers section).
Customers, who work in such environment require a Load Sharing solution, but do not wish
or cannot afford to replace their existing hardware.
Historical fact: This mode was introduced in NG FP4.

One of the cluster members - the machine with the highest priority, called the Pivot - is the
only machine that communicates with the router. In this scheme, the router has to know and
deal with a single unicast MAC address only - the Pivot’s MAC address. The Pivot
communicates with the router "on behalf" of the cluster, thus enabling the use of
unicast traffic.
The Pivot is responsible for forwarding and distributing the traffic throughout the cluster,
while implementing both load sharing and redundancy solutions.

Load Sharing Unicast mode uses unique, real IP addresses for the cluster members'
interfaces. The cluster Virtual IP addresses are associated with the physical network
interfaces of the Pivot machine (by matching the Virtual IP address with the unique MAC
address of the appropriate physical interface).

The cluster members' physical IP addresses do not have to be routable on the Internet.
Only the cluster Virtual IP addresses must be routable.

Distribution of the traffic by the Pivot member is performed by applying a Decision Function
to each packet, the same way it is done in Load Sharing Multicast mode. The difference is
that only one member (Pivot) performs this selection: any non-Pivot member that receives
a forwarded packet will handle it, without applying the Decision Function.
If it is required that specific connections are always processed by a particular member,
then an additional decision algorithm can be enabled - Sticky Decision Function.

Note that non-Pivot members are still considered as “Active”, since they perform routing
and Firewall tasks on their share of the traffic (although they do not perform decisions).

Default traffic load assignment:


Cluster size % of traffic handled % of traffic handled by
(including Pivot) by the Pivot each of the other members
1 100 0
2 33 67
3 20 40
4 10 30
5 0 20
6 and more 0 100 / cluster size

Refer to these solutions:


o sk101539 (ClusterXL Load Sharing mode limitations and important notes)
o sk34668 (How to modify the assigned load between the members of ClusterXL in
Load Sharing Unicast mode)
o sk61331 (ClusterXL Load Sharing in Unicast (Pivot) mode - after second reboot of
Pivot member, output of 'cphaprob stat' on non-Pivot member shows wrong
assigned load as 0% for Pivot and 100% for non-Pivot)

(3-8) VRRP

Refer to Gaia Administration Guide (R75.40, R75.40VS, R76, R77.X, R80.10) - Chapter
'High Availability'.

Virtual Router Redundancy Protocol (VRRP, RFC 3768) provides dynamic failover of IP
addresses from one router (Master) to another router (one of the Backup routers) in the
event of failure. VRRP allows you to provide alternate router paths for end hosts.

The Check Point VRRP implementation on Gaia OS includes functionality called
Monitored Circuit VRRP. Monitored Circuit VRRP prevents connection issues caused by
asymmetric routes created when only one interface on the Master router fails (as opposed
to the Master itself).

Each VRRP cluster, known as a Virtual Router, has a unique identifier, known as the
VRID (Virtual Router Identifier). A Virtual Router can have one or more virtual IP addresses
(VIP) to which other network nodes connect as a final destination or the next hop in a route.

By assigning a Virtual IP address (VIP), you can define alternate paths for nodes
configured with static default routes. Only the Master router is assigned a VIP. The Backup
router is assigned a VIP upon failover when it becomes the new Master. Nodes can have
alternate paths with static default routes in the event of a failure.
Static default routes minimize configuration and processing overhead on host
computers.

Important Note: You cannot have a standalone deployment (Security Gateway and
Security Management server on the same computer) in a Gaia VRRP cluster.

Refer to these solutions:


 sk70380 (Gaia FAQ - Frequently Asked Questions)
 sk69684 (Using VRRP with Check Point 2012 Security Appliances)
 sk92061 (How to configure VRRP on Gaia)
 sk66569 (IPSO-to-Gaia Upgrade Scripts and VRRP Cluster Upgrade Instructions)
 sk86881 (Changing the High Availability configuration from ClusterXL and VRRP (or
from VRRP to ClusterXL) requires reboot)
 sk40278 (VRRP configuration is not updated when the logical interface information
(IP address) is changed)
 sk92880 (It is not possible to configure preempt in Simplified VRRP on IPSO and
Gaia)
 sk89980 (Sub-interfaces / Alias IP address / Secondary IP address on Gaia)

(3-9) Bridge

ClusterXL in Bridge Mode is supported only in R75.40VS / R76 / R77 and above.
Refer to sk101371 (Bridge Mode on Gaia OS and SecurePlatform OS).

(3-10) Sticky Decision Function

Cluster administrators should be familiar with the Sticky Decision Function (SDF), which
enables certain services to operate in a Load Sharing deployment. For example, it is
required for L2TP traffic, or when the cluster is a participant in a Site-to-Site VPN
tunnel with a 3rd party peer.

The Sticky Decision Function has the following limitations:


 Sticky Decision Function is not supported when employing either Performance Pack
or a hardware based accelerator card. Enabling the Sticky Decision Function
disables these acceleration products.
 When the Sticky Decision Function is used in conjunction with VPN, cluster
members are prevented from opening more than one connection to a specific peer.
Opening another connection would cause another SA to be generated, which a third-
party peer, in many cases, would not be able to process.
 Sticky Decision Function does not maintain stickiness in VPN Routing (back-to-back
VPN) gateways where both sides of the connection are encrypted.

Sticky Decision Function supports the following sharing methods (configured in
SmartDashboard - cluster object - go to 'ClusterXL' pane - select 'Load Sharing' - click on
'Advanced...' - check the box 'Use Sticky Decision Function'):

 IPs, Ports, SPIs (default) - provides the best sharing distribution, and is
recommended for use.
It is the least "sticky" sharing configuration.

Clarification:
A connection will stick to a cluster member based on IP addresses and based on
Ports.

Example:
Connection from IP_1:Port_1 to IP_2:Port_2 will stick to Member_A. Connection
from IP_1:Port_2 to IP_2:Port_2 might stick to Member_B.

 IPs, Ports - should be used only if problems arise when distributing IPSec packets
to a few machines although they have the same source and destination IP
addresses.

 IPs - should be used only if problems arise when distributing IPSec packets or
different port packets to a few machines although they have the same source and
destination IP addresses.
It is the most "sticky" sharing configuration.
In other words, it increases the probability that a certain connection will pass through
a single cluster member on both inbound and outbound directions.

Clarification:
A connection will "stick" to a cluster member based only on IP addresses.

Example:
All connections from IP_1 (from any port) to IP_2 (to any port) will stick to the same
Member_A.

Warning:
Since all connections between the given IP addresses will stick to the same
member, the CPU load on that member might increase significantly, which in turn
will negate the whole purpose of Load Sharing cluster mode.
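
The difference between the three sharing methods can be sketched in Python. This is a toy model under stated assumptions and not the Check Point implementation: the CRC32 hash, the sorting of the two endpoints (so that both directions of a connection produce the same key), and the names sticky_member and method are all hypothetical. It only shows that the fewer fields go into the key, the more connections collapse onto the same member.

```python
import zlib


def sticky_member(src_ip, src_port, dst_ip, dst_port, spi=0,
                  method="ips_ports_spis", members=2):
    """Return a 0-based member index for a connection, using a toy hash."""
    if method == "ips":
        # Only the IP addresses take part in the decision (most "sticky").
        fields = sorted([src_ip, dst_ip])
    elif method == "ips_ports":
        # IP addresses and ports take part in the decision.
        fields = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    elif method == "ips_ports_spis":
        # IP addresses, ports and the IPSec SPI take part (least "sticky").
        fields = sorted([(src_ip, src_port), (dst_ip, dst_port)]) + [spi]
    else:
        raise ValueError(f"unknown sharing method: {method}")
    key = repr(fields).encode()
    return zlib.crc32(key) % members


if __name__ == "__main__":
    for method in ("ips_ports_spis", "ips_ports", "ips"):
        a = sticky_member("10.1.1.1", 1025, "10.2.2.2", 80, method=method)
        b = sticky_member("10.1.1.1", 40000, "10.2.2.2", 443, method=method)
        print(method, a, b)
```

With method "ips", the two connections in the usage loop always map to the same member, because the ports are not part of the key. With the other two methods they may map to different members.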

Note:
Sticky Decision Function is enabled automatically, if Mobile Access Software Blade is
enabled on the cluster.
For more details, refer to ClusterXL Administration Guide (R70, R70.1, R71, R75,
R75.20, R75.40, R75.40VS, R76, R77.X, R80.10) - Chapter 'Sticky Connections'.
Refer to sk101539 (ClusterXL Load Sharing mode limitations and important notes).

(3-11) Forwarding

Forwarding is the process of transferring incoming traffic from one cluster member
to another cluster member for processing.

There are two methods of forwarding the incoming traffic:


 Packet forwarding
o the packet is forwarded to the target member
o the packet skips the inbound chain and goes directly into the IP stack of the target
machine
 Chain forwarding
o the chain is forwarded to the target member
o the chain resumes the chain process from the chain module that requested the
chain forwarding (or from the one after it)

ARP Forwarding is referenced below in section (3-11-C).

(3-11-A) Packet forwarding

Example:
A connection was initiated on the 'Standby' member in a High Availability cluster. The
reply packets for this connection are accepted by the 'Active' member, and must be
forwarded to the 'Standby' member.

Description:
The sending cluster member forwards the packet at the end of the Inbound processing.
On the target cluster member, the processing of the forwarded packet will continue from
the chain at which it has stopped on the source cluster member, or the packet will be
entered directly into the TCP/IP stack (if the packet has already passed through all
Inbound chains).

Debugging:
In order to see how a packet is forwarded between cluster members, debug the
'cluster' module with 'forward' flag (in addition, these flags are recommended:
'select', 'if', 'mac'):
[Expert@GW_HostName]# fw ctl debug -m cluster + forward select if mac

Technical details:
Packet Forwarding is performed in the following way (so that the target cluster member
can recognize that the packet is intended for it):

 In High Availability mode, the connection is forwarded over the Synchronization
Network.

Description:
Since the processed packet may be already decrypted, it must be sent over the
secured interfaces.
On the receiving side, the machine will not pass this packet to the FireWall (the
packet will not perform the inspection again), but instead the packet is passed
directly to the IP stack of the operating system.

o Layer 2 Source MAC address of the packet is changed to:

In ClusterXL:
1st  2nd  3rd  4th  5th                      6th
00   00   00   00   fwha_mac_forward_magic   ID_of_Target_Member
Notes:
 fwha_mac_forward_magic - name of the kernel parameter that controls
the value of the 5th byte in forwarded packets (default value in R77.30 and lower
is 0xFD hex / 253 dec)
 Refer to sk25977 (Connecting multiple clusters to the same network segment
(same VLAN, same switch))
 Refer to sk95150 (When the Synchronization interfaces of three and more
ClusterXL members are connected to the same switch, port flapping occurs
on the switch)

In VSX cluster (any VSX cluster works in High Availability mode):

1st  2nd  3rd  4th       5th                      6th
00   00   00   XXXXXXXX  fwha_mac_forward_magic   YYYZZZZZ
Notes:
 XXXXXXXX - 8 most significant bits of VSID
 fwha_mac_forward_magic - name of the kernel parameter that controls
the value of the 5th byte in forwarded packets (default value is 0xF5 hex / 245 dec)
 YYY - 3 least significant (right-most) bits of VSID
 ZZZZZ - ID of target cluster member
 Refer to sk25977 (Connecting multiple clusters to the same network segment
(same VLAN, same switch))

o Layer 2 Destination MAC address of the packet is changed to the MAC address
of the Sync interface on peer member.

o Layer 3 Source IP address is the IP address of the host that sent the original
packet.

o Layer 3 Destination IP address is the physical IP address of the cluster member
on that subnet.

o The packet is dropped on the member that forwarded the packet (log is
generated only if forwarding fails).
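
The two Source MAC layouts described above can be reproduced with a short Python sketch. It assumes that the VSID is an 11-bit value (8 most significant bits in the 4th byte, 3 least significant bits at the top of the 6th byte) and that the member ID fits into 5 bits. The function names clusterxl_forward_mac and vsx_forward_mac are hypothetical, and the default magic values are the ones quoted in the notes above.

```python
def _format_mac(octets):
    """Format six integer octets as a colon-separated MAC address."""
    return ":".join(f"{octet:02X}" for octet in octets)


def clusterxl_forward_mac(target_member_id, forward_magic=0xFD):
    """Source MAC of a forwarded packet in ClusterXL High Availability mode."""
    if not 0 <= target_member_id <= 0xFF:
        raise ValueError("member ID must fit into one byte")
    return _format_mac([0, 0, 0, 0, forward_magic & 0xFF, target_member_id])


def vsx_forward_mac(vsid, target_member_id, forward_magic=0xF5):
    """Source MAC of a forwarded packet in a VSX cluster (assumed 11-bit VSID)."""
    if not 0 <= vsid < 2 ** 11:
        raise ValueError("this sketch assumes an 11-bit VSID")
    if not 0 <= target_member_id < 2 ** 5:
        raise ValueError("member ID must fit into 5 bits")
    fourth = vsid >> 3                               # XXXXXXXX: 8 most significant bits
    sixth = ((vsid & 0b111) << 5) | target_member_id  # YYY + ZZZZZ
    return _format_mac([0, 0, 0, fourth, forward_magic & 0xFF, sixth])


if __name__ == "__main__":
    print(clusterxl_forward_mac(1))   # 00:00:00:00:FD:01
    print(vsx_forward_mac(13, 1))     # 00:00:00:01:F5:A1
```

The first call yields 00:00:00:00:FD:01, which is the forwarding Source MAC address used in the packet flow example of this section.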

Debug:

o In order to see the forwarding process, run the debug of 'cluster' module with
'forward' flag and of 'fw' module with 'drop' flag on Active member:
fwha_forward_msg_wrapper(if_number direction position
Target_Member_ID): forwarding
FW-1: fwha_forward_send_msg: Forwarding packet to id Target_Member_ID
fwha_forw_flush_callback: Forwarded successfully. Dropping chain
fw_log_drop: Packet proto= ... dropped by fwhaforw.c:LINE Reason:
unknown;

o In order to see the arrival of forwarded packet, run the debug of 'cluster'
module with 'select' flag on Standby member:
FW-1: FORWARDED Packet : fwha_select_ip_packet: (IF if_name (if_number)
at N sec) using magic ether header (0xZZZZZZZZ)

Example:

Layer 2 and Layer 3 Addresses:


Machine                 MAC address         IP address
Host                    00:50:56:c0:00:01   192.168.204.1
Active - eth0 (ext)     00:0C:29:DB:26:47   192.168.204.10
Active - eth1 (sync)    00:0C:29:DB:26:51   10.10.10.10
Standby - eth0 (ext)    00:0C:29:72:56:47   192.168.204.12
Standby - eth1 (sync)   00:0C:29:72:56:51   10.10.10.12

Traffic flow: Standby member initiates a TCP connection to the Host.

Packet flow:
1. The TCP SYN packet is sent by Standby over the External (eth0) with:
o Source MAC address 00:0C:29:72:56:47 (Standby ext eth0)
o Destination MAC address 00:50:56:c0:00:01 (Host)
o Source IP address 192.168.204.12 (Standby ext eth0)
o Destination IP address 192.168.204.1 (Host)

2. The SYN ACK packet is sent by Host over the External (eth0) with:
o Source MAC address 00:50:56:c0:00:01 (Host)
o Destination MAC address 00:0C:29:DB:26:47 (Active)
o Source IP address 192.168.204.1 (Host)
o Destination IP address 192.168.204.20 (cluster Virtual IP)

3. The SYN ACK is forwarded over the Sync (eth1) with:


o Source MAC address 00:00:00:00:FD:01 (ClusterXL Forwarding layer)
o Destination MAC address 00:0C:29:72:56:51 (Standby sync eth1)
o Source IP address 192.168.204.1 (Host)
o Destination IP address 192.168.204.12 (Standby ext eth0)

Refer to sk95150 (When the Synchronization interfaces of three and more ClusterXL
members are connected to the same switch, port flapping occurs on the switch).

 In Load Sharing Multicast mode, the connection arrives at all cluster members,
and each member decides whether it should process the packet or not.

When the Sticky Decision Function (SDF) is used, refer to sk95150 (When the
Synchronization interfaces of three and more ClusterXL members are connected to
the same switch, port flapping occurs on the switch).

In order to see the arrival of forwarded packet, run the debug of 'cluster' module
with 'select' flag on receiving member:

o If the local member should process this packet, the following is printed:
FW-1: fwha_select_ip_packet: Packet IN SourceIP_in_Hex->DestIP_in_Hex
FW-1: fwha_local_member_should_procces_mc: local member should process
packet
FW-1: fwha_select_ip_packet: Packet was filtered by member Member_ID
o If the local member should not process this packet, the following is printed:
FW-1: fwha_select_ip_packet: Packet IN SourceIP_in_Hex->DestIP_in_Hex
FW-1: fwha_local_member_should_procces_mc: local member should not process
packet
FW-1: fwha_select_ip_packet: Packet was dropped by member Member_ID

 In Load Sharing Unicast mode, the connection is forwarded over the same
interface, on which it was received - not over Synchronization Network.

o Layer 2 Source MAC address of the packet is inverted and combined in a special
way with values of these kernel parameters: fwha_mac_magic and
fwha_mac_forward_magic.
Notes:
 fwha_mac_magic - controls the value of the 5th byte in the Source MAC address of
CCP packets (default value in R77.30 and lower is 0xFE hex / 254 dec; starting
in Gaia R80.10, the value is assigned automatically)
 fwha_mac_forward_magic - controls the value of the 5th byte in the Source MAC
address of forwarded packets (default value in R77.30 and lower is 0xFD hex /
253 dec; starting in Gaia R80.10, the value is assigned automatically)
 Refer to sk25977 (Connecting multiple clusters to the same network segment
(same VLAN, same switch))

o Layer 2 Destination MAC address of the packet is changed to the MAC address
of the non-Pivot cluster member on the same subnet.

o Layer 3 Source IP address is the IP address of the host that sent the original
packet.

o Layer 3 Destination IP address is the physical IP address of the cluster member
on that subnet.

o The packet is dropped on the member that forwarded the packet (log is
generated only if forwarding fails).

Refer to sk41898 (Connecting multiple clusters running in Load Sharing Unicast
mode results in MAC Address flapping on switches).

Debug:

o In order to see the forwarding process, run the debug of 'cluster' module with
'pivot' and 'select' flags on Pivot member:
FW-1: fwha_pivot_selection_from_packet: packet forwarded ok to machine
Target_Member_ID
fwhamultik_handle_ip_packet: Dropping packet since it is not my packet,
packet was forwarded (LS pivot)

o In order to see the arrival of forwarded packet, run the debug of 'cluster'
module with 'select' flag on non-Pivot member:
fwha_select_ip_packet: The inverted back source MAC address will be
XX-XX-XX-XX-XX-XX

Example:

Layer 2 and Layer 3 Addresses:


Machine                  MAC address         IP address
Host                     00:50:56:c0:00:01   192.168.204.1
Pivot - eth0 (ext)       00:0C:29:DB:26:47   192.168.204.10
non-Pivot - eth0 (ext)   00:0C:29:72:56:47   192.168.204.12

Traffic flow: Pivot cluster member receives a TCP connection from Host and
forwards it to the non-Pivot cluster member.

Packet flow:
1. Pivot cluster member performs bit-wise 'NOT' on the last 4 octets (the 3rd through
6th octets, counting from the left) of the Source MAC address of the packet.

Hence, in our example:


00:50:56:c0:00:01 becomes 00:50:A9:3F:FF:FE.

2. Pivot cluster member performs bit-wise 'AND' between:


o the value of fwha_mac_magic kernel parameter
o the value of fwha_mac_forward_magic kernel parameter

Let us take the default values (in R77.30 and lower):


o fwha_mac_magic=0xFE
o fwha_mac_forward_magic=0xFD

Hence, in our example:


[fwha_mac_magic AND fwha_mac_forward_magic] =
[(0xFE) AND (0xFD)] = 0xFC.

3. Pivot cluster member performs bit-wise 'XOR' between:
o the 3rd octet (from the left) of the 'NOT'-ed Source MAC address from Step 1
o the result of 'AND' operation from Step 2

Hence, in our example:


[(3rd octet of NOT-ed Host MAC) XOR
(fwha_mac_magic AND fwha_mac_forward_magic)] =
[(3rd octet of 00:50:A9:3F:FF:FE) XOR (0xFC)] =
[(0xA9) XOR (0xFC)] = 0x55

4. Pivot cluster member replaces the 3rd octet (from the left) of the 'NOT'-ed
Source MAC address from Step 1 with the result from Step 3.

Hence, in our example:


00:50:A9:3F:FF:FE becomes 00:50:55:3F:FF:FE.

5. Therefore, in our example, the final inverted Source MAC address during the
packet forward will be:
00:50:55:3F:FF:FE.

6. Pivot cluster member forwards the packet through the original interface (eth0)
towards the non-Pivot cluster member with:
o Source MAC address 00:50:55:3F:FF:FE (final inverted MAC of Host)
o Destination MAC address 00:0C:29:72:56:47 (non-Pivot eth0)
o Source IP address 192.168.204.1 (Host)
o Destination IP address 192.168.204.12 (non-Pivot eth0)

7. The non-Pivot cluster member performs the reversed inversion in order to extract
the original Source MAC address of the Host (to use it later as the Destination
MAC address in the packets).
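
The arithmetic of Steps 1 through 5 can be checked with a short Python sketch. The function names invert_source_mac and restore_source_mac are hypothetical. The restore function is an assumption about what the "reversed inversion" in Step 7 means: it applies the same operations in the opposite order. The default magic values are the R77.30 and lower defaults quoted above.

```python
def _parse_mac(mac):
    """Split a colon-separated MAC address into six integer octets."""
    octets = [int(part, 16) for part in mac.split(":")]
    if len(octets) != 6:
        raise ValueError("a MAC address must have six octets")
    return octets


def _format_mac(octets):
    return ":".join(f"{octet:02X}" for octet in octets)


def invert_source_mac(mac, mac_magic=0xFE, forward_magic=0xFD):
    """Apply Steps 1-4: NOT on octets 3-6, then XOR octet 3 with the AND mask."""
    octets = _parse_mac(mac)
    octets[2:] = [octet ^ 0xFF for octet in octets[2:]]  # Step 1: bit-wise NOT
    mask = mac_magic & forward_magic                      # Step 2: 0xFE AND 0xFD = 0xFC
    octets[2] ^= mask                                     # Steps 3 and 4
    return _format_mac(octets)


def restore_source_mac(mac, mac_magic=0xFE, forward_magic=0xFD):
    """Undo the inversion by applying the same operations in reverse order."""
    octets = _parse_mac(mac)
    octets[2] ^= mac_magic & forward_magic
    octets[2:] = [octet ^ 0xFF for octet in octets[2:]]
    return _format_mac(octets)


if __name__ == "__main__":
    inverted = invert_source_mac("00:50:56:c0:00:01")
    print(inverted)                      # 00:50:55:3F:FF:FE
    print(restore_source_mac(inverted))  # 00:50:56:C0:00:01
```

Because both NOT and XOR are their own inverses, the non-Pivot member needs only the same two kernel parameter values to recover the Host MAC address.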

Note: The only information that can be seen in kernel debug is the original Source
MAC address of the packet after the non-Pivot cluster member performs the
reversed inversion.

In order to see the original Source MAC address, debug the 'cluster' module with
'select' flag on the non-Pivot cluster member:
[Expert@GW_HostName]# fw ctl debug -m cluster + select

The following will be printed (based on our example):


;fwha_select_ip_packet: The inverted back source MAC address will
be 00:50:56:c0:00:01;

(3-11-B) Chain forwarding

Example:
A connection was initiated that requires inspection by Check Point Active Streaming
(CPAS) - e.g., SMTP Security Server.

Description:
Chain forwarding enables one cluster member to pass a chain (a packet filtered by a
FireWall module, along with data attached to the packet by the different handling
routines) to another cluster member.
Thus, the second member can resume the handling process at the same point the first
member has ceased.
Starting in NGX R60, chain forwarding is also used for Dynamic Routing.

Debugging:
In order to see how a chain is forwarded between cluster members, debug the 'fw'
module with 'chainfwd' flag (in addition, these flags are recommended: 'chain',
'conn', 'packet'):
[Expert@GW_HostName]# fw ctl debug -m fw + chainfwd chain conn packet

Technical details:
In CPAS case, packet forwarding cannot be used because in order to use packet
forwarding, the chain must finish passing through all the chain modules. But since all
the information that CPAS holds on this connection is located only on the other
member, the chain cannot be processed by CPAS, and therefore should be forwarded
to the member that handled this connection originally.
CPAS information is not synchronized between members, because the amount of
information that would need to be synchronized would cause performance issues.
The Forwarding Layer will receive a packed chain on the source cluster member, and
will transmit it to the target cluster member. Any table updates, which are the result of a
transmitted chain, will be applied to the target member before the chain is delivered for
processing on that machine.

Chain forwarding depends on the state of the target member and of its Sync interface:
 If the target member is down, but its Sync interface is still up, the chain is
forwarded to it and handled by it.
 If the Sync interface is down, the chain is dropped by the source member.

Note:
Why not forward the packet and restart it at the beginning of the chain? Because in
that case, the original packet (as it was before the chain modules changed it) would
have to be kept, and all table changes that were already made would have to be undone,
since they would be made again on the target member. Such an implementation would be
more complicated.

(3-11-C) ARP Forwarding

Refer to sk111956 (ARP Forwarding in Check Point ClusterXL).

4. ClusterXL Configuration
Refer to these solutions:
 sk66527 (Recommended configuration for ClusterXL)
 sk42096 (Cluster member is stuck in 'Ready' state)

(4-1) Clock synchronization

In order to improve cluster stability, the clocks on all cluster members must be
synchronized. Although cluster members can tolerate a clock difference of up to 1 hour
(VPN has a much stricter limit of several minutes), it is strongly recommended to use NTP
on cluster members.

Refer to these solutions:


 sk25894 (Configuring NTP on SecurePlatform OS)
 sk76600 (How to confirm NTP settings on SecurePlatform OS)
 sk83820 (How to configure Advanced NTP features on Gaia OS)
 sk92379 (How to configure NTP authentication on Gaia OS)
 sk38957 (NTP FAQ for IP appliances)
 sk41502 (How to adjust the polling interval in NTP on IP appliances)
 sk62845 (How to enable or disable NTP on IP appliances)
 sk62861 (How to verify that NTP is working on IP Appliances)

(4-2) Preparing cluster members

Refer to ClusterXL Requirements for Hardware and Software section.

Important Note: Configuring cluster for the first time with a single member is not
supported - it will not be possible to install policy for the first time onto cluster object that
contains only a single member.

If an administrator plans to install the cluster of several members, but currently has only
one machine, then there are only these two options:

1. To avoid/minimize traffic outage in the future when adding 2nd member:


A. Prepare the switches
B. Connect the existing machine
C. Connect some temporary machine (in this case, only the operating system and
the number of interfaces matter)
D. Configure the cluster object of 2 members
E. Install policy for the first time
F. Shut down this temporary machine with 'cpstop' and 'shutdown' commands
G. The existing machine will function as a cluster member with VIP, etc. (this
member's state will be 'Active Attention')

H. When the 2nd desired machine is available to be installed as 2nd member:
a) Install the 2nd member on desired machine
b) In SmartDashboard, reset the SIC in the object of temporary member
c) Initialize the SIC with new member
d) Get topology from new member
e) Verify the configuration of cluster interfaces
f) Save the changes
I. Install policy onto cluster object

2. Configure the existing machine as a single gateway and accept a traffic outage in
the future when adding 2nd member:
A. Prepare the switches
B. Connect only the existing machine
C. Configure object of single Security Gateway
D. Initialize the SIC with existing machine
E. Install the policy
F. When the 2nd desired machine is available to be installed as 2nd member:
a) Install the 2nd machine
b) Configure object of single Security Gateway for 2nd machine
c) Initialize the SIC with new machine
G. Create a cluster object
H. Add existing security gateway objects as members
I. Get topology from all members
J. Configure cluster interfaces
K. Install policy onto cluster object

To prepare the cluster member machines:


1. For IPSO clusters, configure VRRP or IP Clustering before installing Check Point
Security Gateway.
Notes:
 Before the initial policy installation, make sure that the 'Enable VPN-1/FW-1
monitoring' is set to 'Disable' in the IPSO Network Voyager.
 After the installation has finished, make sure that the 'Enable VPN-1/FW-1
monitoring' is set to 'Enable' in the IPSO Network Voyager. This assures that
IPSO will monitor changes in the status of the Check Point cluster member (the
state of the firewall is reported to the IPSO cluster for failover purposes since
IPSO 3.8.2).
 Refer to sk39008 (What does 'Monitor Firewall State' actually monitor (VRRP and
IP Clustering) and how does it influence availability).
2. For OPSEC certified clusters, follow the vendor recommendations.
3. Install Check Point Security Gateway on all cluster member machines.
During the first time configuration phase (or later, using the 'cpconfig' command):
 Install a license for Check Point Security Gateway on each cluster member.
No special license is required to allow the OPSEC certified product to work with the
Security Gateway.
 During the configuration phase, enable State Synchronization by selecting
'Enable cluster membership for this gateway' on Unix machines,
or 'This Gateway is part of a cluster' on Windows machines.
4. Define IP addresses on all relevant interfaces on all the cluster members.

Notes:
 Unused interfaces must be configured as 'Disconnected' (refer to Defining
Disconnected Interfaces section).
 Alias IP addresses are not supported by ClusterXL. Refer to sk31821 (Traffic that
is sent to Secondary IP addresses / Alias IP addresses that were defined on
interfaces of ClusterXL members is not processed).
5. Configure an identical number of CoreXL FW instances on all cluster member machines
(using the 'cpconfig' command).
6. Configure SecureXL in an identical way on all cluster member machines (using the
'cpconfig' command and the 'sim affinity -s' command).
7. Connect the cluster member machines via the switches.
For the Synchronization interface(s), due to security reasons, a crossover cable or a
dedicated switch is recommended.
8. Proceed to the next section - configuration in SmartDashboard.

(4-3) Configuring cluster object in SmartDashboard

Refer to ClusterXL Administration Guide (R70, R70.1, R71, R75, R75.20, R75.40,
R75.40VS, R76, R77.X, R80.10) - Chapter 'Configuring ClusterXL' - Configuring Cluster
Objects & Members.

Important Note: Configuring cluster for the first time with a single member is not
supported. Refer to Preparing Cluster Members section.

To define a new Gateway Cluster object:


1. In the Network Objects tree, right-click on Check Point and then select Security
Cluster.
2. If the Security Gateway Cluster Creation window appears, select one of the
following methods to create your new cluster object:
o Simple Mode (Wizard), which guides you step by step through the configuration
process.
o Classic Mode, which allows to configure all relevant settings at once.

Note: In order to make cluster troubleshooting easier, configure the following in
SmartDashboard - cluster object - ClusterXL - under Tracking, set 'Track changes in
the status of cluster members' to 'Log' or 'SNMP Trap Alert' (refer to SNMP
section).

Use the cluster object Topology page to configure the topology for the cluster object
and its members.
Pay attention to the names of the members' interfaces - they must match the names of
the interfaces as assigned by the operating system, subject to the guidelines provided in
sk30154 ($FWDIR/log/fwd.elg shows repeatedly - 'fwarp_initialize_myself: unable to find
mac address of interface IF_NAME').

(4-4) Configuring routing on networks around the cluster

Hosts and networking devices should forward all traffic to cluster Virtual IP address on
their subnet and not to physical IP addresses of cluster members.

Refer to Requirements for switches and routers section and to Configuring Cluster
Addresses on Different Subnets section.

Example:
In the example topology, on the Host, the Default Gateway has to be configured
with the cluster Virtual IP address 192.168.204.20.

(4-5) CCP mode

The ClusterXL Control Protocol (CCP) uses multicast by default, because it is more
efficient than broadcast.
If the connecting switch cannot forward multicast traffic, CCP can be configured to use
broadcast instead, though this is less efficient.
Refer to CCP modes section and to Requirements for switches and routers section.

(4-6) ClusterXL High Availability for IPv6

Starting in R76, ClusterXL supports IPv6. All IPv6 status information is synchronized
and the IPv6 clustering mechanism is activated during failover.

ClusterXL performs both state synchronization and clustering for IPv6 as with IPv4. For
this to work, in SmartDashboard, you must define IPv6 addresses for all cluster interfaces.

In case of IPv4, during cluster failover, the cluster sends Gratuitous ARP Request packets
to update the ARP caches of hosts/routers connected to the cluster interfaces, by
advertising the new MAC address for the cluster Virtual IPv4 addresses.

In case of IPv6, during cluster failover, the cluster uses Neighbor Discovery Protocol
(NDP) and sends Neighbor Advertisement messages to update the neighbor caches of
hosts/routers connected to the cluster interfaces, by advertising the new MAC address for
the cluster Virtual IPv6 addresses. In addition, ClusterXL will reply to any Neighbor
Solicitation with a target address equal to the Cluster Virtual IPv6 address.

Note: ClusterXL failover event detection is based on IPv4 probing (refer to the definition
of 'probing' and of 'pingable host' in Clustering Definitions and Terms section).
During state transition, the IPv4 driver instructs the IPv6 driver to reestablish IPv6
network connectivity to the cluster.

To enable IPv6 functionality for an interface, define an IPv6 address for the applicable
interface on the cluster and on each member. All interfaces configured with an IPv6
address must also have a corresponding IPv4 address. If an interface does not require
IPv6, only the IPv4 address definition is necessary.

Note: You must configure synchronization interfaces with an IPv4 address only. This is
because the synchronization mechanism works using IPv4 only. All IPv6 information
and states are synchronized using this interface.

In an IPv6 environment, the 'cphaprob -a if' command shows both the cluster
Virtual IPv4 addresses and cluster Virtual IPv6 addresses.

Refer to these solutions:


 sk35178 (How to set up IPv6 in ClusterXL)
 sk34552 (How to set up IPv6 on SecurePlatform)
 sk39374 (IPv6 Support FAQ)
 sk78220 ("fw ctl pstat" command shows "Sync: off" on cluster members when IPv6 is
enabled in R75.40 and above)
 sk91905 (Configuring Proxy NDP for IPv6 Manual NAT)
 sk92368 (ATRG: IPv6)

(4-7) Defining 'Disconnected' interfaces

For more information, refer to ClusterXL Administration Guide (R70, R70.1, R71, R75,
R75.20, R75.40, R75.40VS, R76, R77.X, R80.10) - Chapter 'ClusterXL Advanced
Configuration' - Defining Disconnected Interfaces.

Disconnected interfaces are cluster member interfaces that are not monitored by the
ClusterXL mechanism.

Important Notes:
 Unused interfaces must be defined as 'Disconnected' in order to avoid cluster
flapping.
 Never define sync interface as 'Disconnected'.
 This configuration applies only to physical interfaces.
 Starting from Gaia R75.47, R77.20, the $FWDIR/conf/discntd.if file is not needed
anymore. Any interface, which is not part of cluster topology, will be counted as
disconnected.

Procedure:
 Interfaces that have IP addresses configured can be defined as 'Disconnected' via a
special configuration file / registry key (see below), or in SmartDashboard:

In SmartDashboard - open cluster object - go to 'Topology' pane - click on 'Edit...':
o If the unused interfaces are present, then set their Network Objective to 'Non-
Monitored Private' and install policy.
o If the unused interfaces do not appear in the topology yet, then click on 'Get...'
- select 'All Members' Interfaces...', then set their Network Objective to 'Non-
Monitored Private' and install policy.

 Interfaces that do not have IP addresses configured, can be defined as


'Disconnected' only in this way (procedure also applies to interfaces with configured
IP address):

o UNIX OS:
A. Create the $FWDIR/conf/discntd.if file (if it does not exist yet)
B. Add the name of each relevant physical interface on a separate line
C. Save the changes in the file
D. Restart ClusterXL with 'cphastop;cphastart' commands

o Windows OS:
A. Open Windows Registry editor (Start - Run... - regedit)
B. Go to
HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\CPHA\
C. Add a new key:
Value Name: DisconnectedInterfaces
Data Type: REG_MULTI_SZ
D. Check the names of the interfaces as assigned by the system (Start - run... -
cmd):
fw getifs
E. Add the name of each relevant physical interface using the following format:
\device\System_Interface_Name
F. Restart ClusterXL with 'cphastop;cphastart' commands
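
The file and registry formats described in the procedure above can be sketched in Python. This is a minimal illustration only: the function names and the interface names ("eth3", "eth4", "NIC1") are hypothetical, and the sketch only builds the text to be placed in the file or registry value; it does not write to $FWDIR or restart ClusterXL.

```python
def build_discntd_if(interfaces):
    """Return file text with one interface name per line (duplicates dropped)."""
    names = []
    for raw in interfaces:
        name = raw.strip()
        if name and name not in names:
            names.append(name)
    return "".join(name + "\n" for name in names)


def windows_registry_entries(interfaces):
    """Format each name as \\device\\<name> for the REG_MULTI_SZ value."""
    return ["\\device\\" + raw.strip() for raw in interfaces if raw.strip()]


if __name__ == "__main__":
    # "eth3" and "eth4" are hypothetical unused physical interfaces.
    print(build_discntd_if(["eth3", "eth4", "eth3"]), end="")
```

After the file or registry value is updated, ClusterXL still has to be restarted as described in the last step of each procedure.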

Refer to these solutions:


 sk30060 (SmartView Tracker repeatedly shows messages "cluster_info: (ClusterXL)
interface is down / up")
 sk65826 (Cluster member is 'Down' because normal operational interface was
configured as 'Disconnected')
 sk93037 (Output of 'cphaprob state' command on Crossbeam chassis cluster shows
only local member)
 sk52020 (The $FWDIR/conf/discntd.if configuration file on ClusterXL member
contains the 'pimreg' entry, which was not added by the user)

(4-8) SecureXL

Refer to Requirements for software section.

Refer to Performance Pack Administration Guide (R70, R71, R75, R75.20, R75.40,
R75.40VS).
Refer to Performance Tuning Administration Guide (R76, R77.X, R80.10) - Chapter
'Performance Pack'

In addition, refer to R70 Performance Optimization Guide.

Refer to these solutions:


 sk25972 (About SecureXL Performance Pack)
 sk32578 (SecureXL Mechanism)
 sk98348 (Best Practices - Security Gateway Performance)
 sk71200 (SecureXL NAT Templates)
 sk67861 (Accelerated Drop Rules Feature in R75.40 and above)
 sk66402 (SecureXL Drop Templates are not supported in versions lower than R76)

(4-9) CoreXL

Refer to Requirements for software section.

Refer to Firewall Administration Guide (R70, R71, R75, R75.20, R75.40, R75.40VS -
Chapter 'CoreXL Administration'; R76, R77.X - Chapter 'Maximizing Network Performance'
- CoreXL).
Refer to Performance Tuning Administration Guide (R76, R77.X, R80.10) - Chapter
'CoreXL Administration'

Refer to these solutions:


 sk61701 (CoreXL Known Limitations)
 sk98737 (ATRG: CoreXL)
 sk98348 (Best Practices - Security Gateway Performance)
 sk35990 (How Connections Table limit capacity behaves in CoreXL)
 sk36151 (Maximum Concurrent Connections in CoreXL)
 sk62620 (What is the fw_worker_X process?)

(4-10) VPN

For more information, refer to ClusterXL Administration Guide (R70, R70.1, R71, R75,
R75.20, R75.40, R75.40VS, R76, R77.X, R80.10) - Chapter 'ClusterXL Advanced
Configuration' - Working with VPNs and Clusters.

For 3rd party VPN products, refer to vendor's documentation.

Refer to these solutions:
 sk92332 (Customizing the VPN configuration for Check Point Security Gateway -
'vpn_table.def' file)
 sk108600 (VPN Site-to-Site with 3rd party)
 sk35383 (How to configure VPN between Check Point cluster and a VPN-1 UTM
Edge with WAN High Availability (HA))

(4-11) NAT

Network Address Translation (NAT) is a fundamental aspect of the way ClusterXL works.

When a packet leaves a cluster member, the source IP address in the outgoing packet is
the physical IP address of the cluster member interface.
The source IP address is changed using NAT to the Virtual IP address of the
cluster on that subnet.
This address translation is called "Cluster Hide".

A packet sent to the cluster Virtual IP address is accepted by one of the cluster
members. The destination IP address in the incoming packet is changed using NAT to
the physical IP address of the cluster member interface on that subnet.
This address translation is called "Cluster Fold".
For OPSEC certified clustering products, this corresponds to the default setting (in
SmartDashboard) on the 3rd Party Configuration page of the cluster object, where
'Forward Cluster's incoming traffic to Cluster Members' IP addresses' is checked.
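
The two translations can be illustrated with a small Python model. This is a sketch under stated assumptions, not the ClusterXL implementation: the Packet type, the function names, and all addresses (taken from documentation ranges) are hypothetical, and only the source/destination rewrite described above is modelled.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str


def cluster_hide(packet, member_ip, cluster_vip):
    """Outgoing packet: member's physical source IP becomes the cluster VIP."""
    if packet.src == member_ip:
        return replace(packet, src=cluster_vip)
    return packet


def cluster_fold(packet, member_ip, cluster_vip):
    """Incoming packet: cluster VIP destination becomes the member's physical IP."""
    if packet.dst == cluster_vip:
        return replace(packet, dst=member_ip)
    return packet


if __name__ == "__main__":
    # Hypothetical addresses: member 192.0.2.11, cluster VIP 192.0.2.1.
    print(cluster_hide(Packet("192.0.2.11", "198.51.100.7"), "192.0.2.11", "192.0.2.1"))
    print(cluster_fold(Packet("198.51.100.7", "192.0.2.1"), "192.0.2.11", "192.0.2.1"))
```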

For more information, refer to ClusterXL Administration Guide (R70, R70.1, R71, R75,
R75.20, R75.40, R75.40VS, R76, R77.X, R80.10) - Chapter 'ClusterXL Advanced
Configuration' - Working with NAT and Clusters.
For OPSEC certified clustering products, refer to vendor's documentation.

Refer to these solutions:


 sk31832 (How to prevent ClusterXL / VRRP / IPSO IP Clustering from hiding its own
traffic behind Virtual IP address)
 sk32224 (NAT Table 'fwx_alloc')
 sk30197 (Configuring Proxy ARP for Manual NAT)

(4-12) VLAN

When defining VLAN tags on an interface, cluster IP addresses can be defined only on
the VLAN interfaces (the tagged interfaces).
Defining a cluster IP address on a physical interface that has VLANs is not supported.
This physical interface has to be defined with the Network Objective Monitored Private.

Note: Refer to CCP and VLAN interfaces section.

For more information, refer to ClusterXL Administration Guide (R70, R70.1, R71, R75,
R75.20, R75.40, R75.40VS, R76, R77.X, R80.10) - Chapter 'ClusterXL Advanced
Configuration' - Working with VLANs and Clusters.
In addition, refer to the Release Notes of the given version.

Refer to these solutions:


 sk92826 (ClusterXL VLAN monitoring)
 sk61323 (Monitoring of VLAN interfaces in ClusterXL)
 sk92784 (Configuring VLAN Monitoring on ClusterXL for specific VLAN interface)

(4-13) Link Aggregation (Bonding)

Refer to ClusterXL Administration Guide (R65, R70, R70.1, R71, R75, R75.20, R75.40,
R75.40VS, R76, R77.X, R80.10) - Chapter 'ClusterXL Advanced Configuration' - Link
Aggregation and Clusters.

(4-13-A) Overview

Link Aggregation, NIC Teaming, Bonding of interfaces, Bond interface - all refer to the
same redundancy technology of physical Network Interface Cards (NICs) - a virtual
interface, defined on the OS, similar to a physical interface - where the physical bonded
interfaces are set to act as a single interface, using the same MAC address and the same
IP address.

Each physical interface in a Bond is called a slave of that bond. Enslaved interfaces do
not function independently of the bond.

The interface bonding supplies High Availability in case of interface failure and, in Load
Sharing mode, can significantly increase total throughput.

Figure below depicts Bonded interfaces:

In this scenario:
 GW-1 is a single gateway, or a cluster
member
 S-1 and S-2 are switches
 eth0 and eth1 are bonded slave
interfaces
 eth0 is the Active slave interface
 eth1 is the Standby slave interface
 bond0 is the name of the bond

If GW-1 loses connectivity with the currently active switch, it is able to detect the
failure and initiate an internal failover to eth1.

Note: Link Aggregation is supported on SecurePlatform, Gaia, and IPSO OS.

(4-13-B) Introduction

When dealing with mission-critical applications, an enterprise requires its network to be
highly available.

Clustering provides redundancy at the gateway level.

However, without Link Aggregation, redundancy of Network Interface Cards (NICs) or
of the switches on either side of the gateway is possible only in a cluster, and
only by failover of the gateway to another cluster member.

(4-13-C) Configuration

Refer to these User Guides:


 How to Configure ClusterXL for L2 Link Aggregation on SecurePlatform and Gaia
OS
 How to Configure Link Aggregation Groups on IPSO OS
Start with these ClusterXL Administration Guides (because the Link Aggregation
support was added for the first time in these versions):
 R65 ClusterXL Administration Guide - Chapter 'ClusterXL Advanced Configuration' -
Working with Link Aggregation and Clusters - Configuring Interface Bonds
 R70 ClusterXL Administration Guide - Chapter 'ClusterXL Advanced Configuration' -
Working with Link Aggregation and Clusters - Configuring Interface Bonds
 R70.1 ClusterXL Administration Guide - Chapter 'ClusterXL Advanced Configuration'
- Link Aggregation and Clusters

In addition, refer to ClusterXL Administration Guide (R71, R75, R75.20, R75.40,
R75.40VS, R76, R77.X, R80.10) - Chapter 'ClusterXL Advanced Configuration' - Link
Aggregation and Clusters.

Important Note: It is mandatory to define the physical slave interfaces (that will
comprise the bond interface) as 'Disconnected'. Refer to Defining Disconnected Interfaces
section.

Link Aggregation can be configured in one of these two modes:

 High Availability (Active/Backup) mode (supported since R65) - only one interface
at a time is active.
Upon interface failure, the bond fails over to another interface.
Different slave interfaces of the bond can be connected to different switches, to
benefit from high availability of switches in addition to high availability of interfaces
(refer to Fully Meshed Redundancy via Interface Bonding section above).

 Load Sharing (Active/Active) mode (supported since R70.1 / VSX R67) - all
interfaces are active, for different connections.
Connections are balanced between interfaces according to Layer 3 and Layer 4, and
follow either the IEEE 802.3ad standard, or XOR.
Load Sharing mode has the advantage of increasing throughput, but requires
connecting all the interfaces of the bond to one switch (which must support LACP).

For both Link Aggregation High Availability mode and for Link Aggregation Load
Sharing mode:
 The number of bond interfaces that can be defined is limited by the maximal number
of interfaces supported by each platform (refer to Release Notes of each given
version).
 Up to 8 slave NICs can be configured in a single High Availability bond or Load
Sharing bond.

(4-13-D) Link Aggregation - High Availability mode

Link Aggregation provides high availability of NICs. If one fails, the other can function in
its place. This functionality is provided by Link Aggregation in both High Availability mode
and Load Sharing mode.

High Availability mode of Link Aggregation, when deployed together with ClusterXL,
enables a higher level of reliability through granular redundancy in the network topology.
This granular redundancy is achieved with a Fully Meshed Topology, which effectively
provides independent backups for both NICs and switches.

Redundant (High Availability) Topologies:


 Simple Redundant Topology without Interface Bonding
 Fully Meshed Redundancy via Interface Bonding

(4-13-D-i) Simple Redundant Topology without Interface Bonding

In the case of switch or Security Gateway failure, a High Availability cluster solution
provides system redundancy.
Figure below depicts a redundant system without Link Aggregation (two synchronized
Security Gateways - cluster members) deployed in a simple redundant topology:

In this scenario:
 GW-1 and GW-2 are cluster members
 S-1 and S-2 are switches
 C-1 and C-2 are interconnecting
networks

Cluster members GW-1 and GW-2 each have one external NIC connected to an external
switch (S-1 and S-2, respectively).

In the event of a failure of the Active cluster member GW-1, its NIC (on C-1), or
switch S-1, cluster member GW-2 becomes the only Active gateway, connecting to
switch S-2 over C-2.

In any of the 3 cases (gateway failure, NIC failure, or switch failure), the result of the
failover is that no further redundancy exists, and a further failure of any active component
will completely stop network traffic.

Link Aggregation provides high availability of NICs. If one fails, the other can function in
its place. This functionality is in Bond High Availability mode and in Bond Load Sharing
mode.

(4-13-D-ii) Fully Meshed Redundancy via Interface Bonding

The Link Aggregation High Availability mode, when deployed with ClusterXL, enables a
higher level of reliability by providing granular redundancy in the network. This granular
redundancy is achieved by using a fully meshed topology, which provides for independent
backups for both NICs and switches.

A fully meshed topology further enhances the redundancy in the system by providing a
backup to both the interface and the switch, essentially backing up the cable. Each cluster
member has two external interfaces, one connected to each switch.

Figure below depicts this implementation, where both cluster members are connected to
both external switches:

In this scenario:
 GW-1 and GW-2 are Security Gateway cluster members in New High Availability mode
 S-1 and S-2 are switches
 C-1, C-2, C-3 and C-4 are networks

After a switch failure, switch functionality and gateway high availability are maintained.
Similarly, after a NIC failure, switch and gateway high availability are maintained.

(4-13-D-iii) Bond Internal Failover

Note: The bond failover operation requires network interface cards that support the
Media-Independent Interface (MII) standard.

Failover can occur because of a failure in the physical link state, or a failure in the
receiving/sending of CCP packets. Either of these failures triggers a failover, which takes
place either within the bond interface or between cluster members (depending on the
circumstances).

 Link state initiated failover

1. The active slave interface detects a link state of down, and notifies the bond
interface.
2. The bond initiates an internal bond failover to the standby slave interface.
Note: Since this is a failover within the bond, the status of the other cluster
member is not considered.
3. If the standby slave interface (now active) also detects a link failure, and the initial
slave interface is still down, ClusterXL initiates a failover to the other cluster member,
as long as the state of the other cluster member is not Down.

 CCP initiated failover

1. ClusterXL detects a problem in the receiving/sending of CCP packets.
2. ClusterXL initiates an internal bond failover.
3. ClusterXL monitors CCP packet transmission/arrival. If a problem is detected, the
system initiates a failover to the other cluster member, as long as the state of the
other cluster member is not Down.
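
The link state initiated sequence described in this section can be modelled with a short Python sketch. It is a simplified illustration, not ClusterXL code: the bond is represented as a plain dictionary, the function name and return strings are hypothetical, and the behaviour when the other member is Down is an assumption (the text only says that no cluster failover is initiated in that case).

```python
def on_link_down(bond, failed_slave, peer_state):
    """Update the bond after a slave reports link down and return the action taken.

    bond: {"active": <slave name>, "link_up": {<slave name>: bool, ...}}
    peer_state: state of the other cluster member, for example "Standby" or "Down".
    """
    bond["link_up"][failed_slave] = False
    if failed_slave != bond["active"]:
        return "no action (failed slave was not active)"

    healthy = [name for name, up in bond["link_up"].items() if up]
    if healthy:
        # Step 2: internal bond failover; the other member's state is not considered.
        bond["active"] = healthy[0]
        return "internal bond failover to " + healthy[0]

    # Step 3: no slave with link up remains in the bond.
    if peer_state != "Down":
        return "ClusterXL failover to other cluster member"
    return "no failover (other member is Down)"  # assumption, not stated in the text


if __name__ == "__main__":
    bond0 = {"active": "eth0", "link_up": {"eth0": True, "eth1": True}}
    print(on_link_down(bond0, "eth0", peer_state="Standby"))
    print(on_link_down(bond0, "eth1", peer_state="Standby"))
```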

(4-13-D-iv) Configuring Bond High Availability Failover Mode

There are a number of configurable settings regarding bond failover:

 'fwha_manual_bond_failover' kernel parameter

Description:
Sets the failover mode.

Values:
o 0 = (default) automatically perform internal bond failover to the other slave
interface.
o 1 = perform ClusterXL failover to the other cluster member (as long as the state
of the other cluster member is not Down), unless the command 'cphaconf
enable_bond_failover' was run, in which case, the next failover will be
internal bond failover to the other slave interface (refer to command's description
below).

Notes:
o With both values, the next bond failover occurs in 2 minutes.
o The current value of this kernel parameter can be checked with 'fw ctl get
int fwha_manual_bond_failover' command.
o The value of this kernel parameter can be set:
 either on-the-fly with 'fw ctl set int fwha_manual_bond_failover
VALUE' command (this change does not survive reboot)
 or by adding this parameter with desired value into the
$FWDIR/boot/modules/fwkern.conf file (per sk26202)

 'cphaconf enable_bond_failover BondName' command

Description:
Sets what happens during a ClusterXL failover after a bond has already failed
over internally. This command works only if the value of the
'fwha_manual_bond_failover' kernel parameter is currently set to 1 (one).
After a failover occurs within a bond, the next time a failure is detected on a slave
interface, ClusterXL automatically fails over to the other cluster member.
An administrator can prevent this by first correcting the error on the
slave interface that caused the failover, and then resetting the system to fail over
internally.
The 'cphaconf enable_bond_failover BondName' command directs the
system to fail over within the bond the next time a failure is detected on a slave
interface.
Notes:
o When successful, there is no immediate output from this command; however the
words 'can failover' appear in the output of the 'cphaprob -a if'
command (in the corresponding line for this bond interface).
o This command should be run each time the system is reconfigured - after
verifying that all slave interfaces are active.
o Refer to 'cphaconf' command section.
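
The interaction between the 'fwha_manual_bond_failover' kernel parameter and the 'cphaconf enable_bond_failover' command can be summarized as a decision function. The Python sketch below is only a reading aid based on the descriptions above: the function name, argument names and return values are hypothetical, the "none" result for a Down peer is an assumption, and the sketch does not model how long the effect of 'cphaconf enable_bond_failover' lasts.

```python
def next_failover(manual_bond_failover, enable_bond_failover_run, peer_state):
    """Return which failover is expected on the next slave interface failure.

    manual_bond_failover: value of fwha_manual_bond_failover (0 or 1).
    enable_bond_failover_run: True if 'cphaconf enable_bond_failover' was run.
    peer_state: state of the other cluster member, for example "Standby" or "Down".
    """
    if manual_bond_failover == 0:
        return "internal"   # default: internal bond failover to the other slave
    if enable_bond_failover_run:
        return "internal"   # value 1, but the administrator re-armed internal failover
    if peer_state != "Down":
        return "cluster"    # value 1: ClusterXL failover to the other member
    return "none"           # assumption: no member is available to fail over to


if __name__ == "__main__":
    for args in [(0, False, "Standby"), (1, False, "Standby"), (1, True, "Standby")]:
        print(args, "->", next_failover(*args))
```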

Refer to this solution:


 sk43730 (Failover in Bond interface can cause failover in ClusterXL)

(4-13-E) Link Aggregation - Load Sharing mode

Note: Refer to sk22345 (Security Gateway support for EtherChannel technology /
802.3ad Link Aggregation).

Link Aggregation in Load Sharing Mode is supported in:
 SecurePlatform OS version R70.1 and above
 Gaia OS version R75.40 and above
 VSX SecurePlatform OS version R67 and above
 IPSO OS version 3.8.1 and above

In Bond Load Sharing mode, Link Aggregation supplies load sharing, in addition to Bond
High Availability. All slave interfaces are active, and connections are balanced between the
bond's slave interfaces, similar to the way ClusterXL balances connections between cluster
members.

In Bond Load Sharing mode, each connection is assigned to a specific slave interface.
For the individual connection, only one slave interface is active. On failure of that interface,
the bond fails the connection over to one of the other interfaces, which adds the
failed interface's connections to the connections it is already handling.

Important Note: Bond in Load Sharing requires Performance Pack (SecureXL) to be
enabled (SIM Affinity should be configured to run in Static mode via 'sim affinity -s'
command).

Connections are balanced between slave interfaces according to Layer 3 and Layer 4,
and follow one of these standards:
 802.3ad - includes LACP and is the recommended mode, but some switches may
not support this mode.
 XOR.

In Bond Load Sharing mode, all the interfaces of a bond must be connected to the same
switch. The switch itself must support and be configured for Link Aggregation, by the same
standard (802.3ad or XOR) as the bond interface on Check Point Security Gateway.
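
The idea of assigning each connection to one slave interface according to Layer 3 and Layer 4 fields can be illustrated in Python. The hash below is a deliberately simple stand-in chosen for illustration; it is not the formula used by the bonding driver, by 802.3ad, or by Check Point, and the function name, addresses and interface names are hypothetical.

```python
import ipaddress


def pick_slave(src_ip, dst_ip, src_port, dst_port, slaves):
    """Map a connection to one slave using an XOR of its L3/L4 fields."""
    key = (
        int(ipaddress.ip_address(src_ip))
        ^ int(ipaddress.ip_address(dst_ip))
        ^ src_port
        ^ dst_port
    )
    return slaves[key % len(slaves)]


if __name__ == "__main__":
    active = ["eth0", "eth1", "eth2"]
    chosen = pick_slave("10.1.1.5", "192.0.2.9", 40000, 443, active)
    print("connection uses", chosen)

    # If the chosen slave fails, the connection is re-assigned among the rest.
    remaining = [name for name in active if name != chosen]
    print("after failure:", pick_slave("10.1.1.5", "192.0.2.9", 40000, 443, remaining))
```

Because XOR is symmetric, both directions of a connection map to the same slave in this sketch, and every packet of a given connection uses one interface, which matches the statement that only one slave interface is active per connection.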

A bond in Load Sharing mode is considered to be Down when fewer than a critical
minimum number of slave interfaces remain up.
When not explicitly defined, the critical minimum number of interfaces in a bond of n
slave interfaces is n-1.
Note: With this default, failure of a second slave interface causes the entire bond to be
considered Down, even if the bond contains more than two slave interfaces.

If a smaller number of interfaces is able to handle the expected traffic, you can
increase redundancy by explicitly defining the number of critical interfaces.

To explicitly define the number of critical interfaces, create and edit the
cpha_bond_ls_config.conf file:
[Expert@HostName]# cd $FWDIR/conf/
[Expert@HostName]# touch cpha_bond_ls_config.conf
[Expert@HostName]# chown admin:bin cpha_bond_ls_config.conf
[Expert@HostName]# chmod -v u=rwx,g=rwx cpha_bond_ls_config.conf
[Expert@HostName]# vi cpha_bond_ls_config.conf

Location of the cpha_bond_ls_config.conf file:


 On ClusterXL in Gateway Mode:
$FWDIR/conf/cpha_bond_ls_config.conf
 On ClusterXL in VSX Mode (R75.40VS, R76, R77 and above):
$FWDIR/CTX/CTX0000<VSID>/conf/cpha_bond_ls_config.conf

Each line of the file should contain the Bond Name and the number of critical interfaces
(separated by a space or a horizontal tab) that must remain up:
BondName number_of_critical_interfaces_that_must_be_up

Example:
If bond0 has 7 interfaces, and bond1 has 6 interfaces, then file contents could be:
bond0 5
bond1 3

Explanation:
 bond0 would be considered Down when 3 of its 7 interfaces have failed
 bond1 would be considered Down when 4 of its 6 interfaces have failed
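
The rule above can be checked with a short Python sketch that parses text in the cpha_bond_ls_config.conf format and evaluates the bond state. This is an illustration only: the function names are hypothetical, lines that do not have exactly two fields are skipped as an assumption, and the sketch works on a string rather than reading the real file.

```python
def parse_bond_ls_config(text):
    """Parse 'BondName number' lines (space or tab separated) into a dict."""
    critical = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:  # assumption: ignore blank or malformed lines
            critical[fields[0]] = int(fields[1])
    return critical


def bond_state(total_slaves, slaves_up, critical=None):
    """Return 'Up' or 'Down'; the default critical minimum is n-1."""
    if critical is None:
        critical = total_slaves - 1
    return "Up" if slaves_up >= critical else "Down"


if __name__ == "__main__":
    config = parse_bond_ls_config("bond0 5\nbond1\t3\n")
    # bond0 has 7 slaves: with 3 failed, only 4 remain up, below the minimum of 5.
    print("bond0:", bond_state(7, 7 - 3, config["bond0"]))
    # bond1 has 6 slaves: with 4 failed, only 2 remain up, below the minimum of 3.
    print("bond1:", bond_state(6, 6 - 4, config["bond1"]))
```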

Refer to these solutions:
 sk43730 (Failover in Bond interface can cause failover in ClusterXL)
 sk95087 ('cphaconf show_bond -a' command shows incorrect number of Slave
interfaces that does not match configuration in the
$FWDIR/conf/cpha_bond_ls_config.conf file)
 sk97779 (Critical minimal number of interfaces in a Bond in VRRP cluster running on
Gaia OS)
 sk94545 (Configuration in the $FWDIR/conf/cpha_bond_ls_config.conf file on VSX
cluster member does not apply to other Virtual Systems)

(4-14) Monitoring the Interface Link State (MILS)

Notes:
 MILS is enabled by default since R75.47 (fwha_monitor_if_link_state=1).
 MILS is supported only on SecurePlatform / Gaia OS.
 MILS is supported only for physical interfaces.
 MILS feature considers the link state of a Bond interface as up if it has at least one
slave with link up.

Refer to this solution:


 sk31336 (Using Monitor Interface Link State feature to improve ClusterXL interface-
failure-detection ability).

Enabling Interface Link State Monitoring significantly shortens the time it takes
ClusterXL to detect an interface failure (from milliseconds to microseconds).
By monitoring the link state (i.e., the electrical state) of an interface, ClusterXL is
immediately alerted to connectivity issues concerning a certain network interface, such as a
disconnected cable, or an electrical failure (real or simulated) on a switch.
Interface Link State Monitoring requires a NIC driver that supports link state detection.
The device driver reports the link state as either connected or disconnected.
Monitoring the interface link state is particularly useful during cluster probing when none
of the hosts answer on the connected subnet (refer to the definition of 'probing' and of
'pingable host' in Clustering Definitions and Terms section).
When MILS is enabled, ClusterXL immediately detects when an interface goes down.
When MILS is disabled, ClusterXL determines whether an interface is malfunctioning
based on expiration of internal timeouts.

(4-15) Configuring cluster addresses on different subnets

Cluster IPs are Virtual IP addresses given to ClusterXL objects, which differ from the
unique physical IP addresses of the individual cluster members. These addresses enable
the cluster to be seen as a single gateway, thus allowing it to serve as a router in a network
that is unaware of the cluster's internal structure and status.

Note: ISP Redundancy is not supported in ClusterXL where physical interfaces of
cluster members and cluster VIP are defined on different subnets. Refer to sk66521 (ISP
Redundancy in ClusterXL when interfaces of cluster members and cluster VIP are defined
on different subnets per sk32073).

Refer to Configuring ISP Redundancy on a Cluster section.

Cluster IP addresses can reside on subnets other than those of the members. The
advantage of this is that it:
 Enables a multi-machine cluster to replace a single-machine gateway in a pre-
configured network, without the need to allocate new addresses to the cluster
members.
 Makes it possible to use only one routable address for the ClusterXL Gateway
Cluster.
Refer to this solution:
 sk32073 (Configuring Cluster Addresses on Different Subnets).

There are two major steps required in order for ClusterXL to function correctly with
cluster IPs on different subnets:

1. The first step is to create static routes on each cluster member, which determine
the interface connected to the cluster's network (the subnet, to which the cluster IP
belongs). Unless these entries are created, the OS cannot route packets to the
cluster's network. No additional configuration is required for the cluster members. It
is, however, important to note that the unique IP addresses given to the members
must share common subnets on each "side" of the cluster (meaning, each interface
on each machine must have an interface on every other machine using the same
subnet).

Note:
Configuring the static route is not needed in these cases:

 On SecurePlatform OS Security Gateway with enabled Advanced Dynamic
Routing (the GateD daemon will add the route to the cluster VIP network when the
member's interface comes up).
 On Gaia OS Security Gateway in VSX mode (this is done automatically when
configuring routes in SmartDashboard).

2. The second step relates to the configuration of the cluster topology. Here, the
cluster IP addresses are determined, and associated with the interfaces of the
cluster members (each member must have an interface responding to each cluster
IP address). Normally, cluster IP addresses are associated with an interface based
on a common subnet. In this case, these subnets are not the same. It must be
explicitly specified, which member subnet is associated with the cluster IP address.
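
The two conditions described in these steps (members share a common subnet on each side, and the cluster IP may sit on a different subnet that then needs a static route) can be checked with Python's standard ipaddress module. This is a planning aid under stated assumptions, not a Check Point tool: the function names and all addresses are hypothetical.

```python
import ipaddress


def members_share_subnet(member_interfaces):
    """True if all member addresses (e.g. '10.0.0.1/24') are on one subnet."""
    networks = {ipaddress.ip_interface(addr).network for addr in member_interfaces}
    return len(networks) == 1


def needs_static_route(member_interface, cluster_network):
    """True if the cluster IP's network differs from the member's own subnet."""
    member_network = ipaddress.ip_interface(member_interface).network
    return member_network != ipaddress.ip_network(cluster_network)


if __name__ == "__main__":
    # Hypothetical external side: members on 10.0.0.0/24, cluster VIP on 192.0.2.0/24.
    members = ["10.0.0.1/24", "10.0.0.2/24"]
    print("members share a subnet:", members_share_subnet(members))
    print("static route required:", needs_static_route(members[0], "192.0.2.0/24"))
```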


(4-16) Moving from a single gateway to a cluster

Refer to ClusterXL Administration Guide (R70, R70.1, R71, R75, R75.20, R75.40,
R75.40VS, R76, R77.X, R80.10) - Chapter 'ClusterXL Advanced Configuration' - Moving
from a Single Gateway to a ClusterXL Cluster.

(4-17) Adding another member or interface to an existing cluster

Refer to ClusterXL Administration Guide (R70, R70.1, R71, R75, R75.20, R75.40,
R75.40VS, R76, R77.X, R80.10) - Chapter 'ClusterXL Advanced Configuration' - Adding
Another Member to an Existing Cluster.

Refer to these solutions:


 sk57100 (Adding or removing an interface in ClusterXL High Availability topology
might cause fail-over)
 sk69180 (Adding a new Slave interface to existing Bond interface on
SecurePlatform)

(4-18) Proxy ARP

Refer to this solution:


 sk30197 (Configuring Proxy ARP for Manual NAT).

Let us consider the following scenario:

1. Two networks (Network_A and Network_B) are separated by a Security Gateway
(single Security Gateway or ClusterXL).

2. On each network, there is a host (Host_A on Network_A, and Host_B on Network_B).

3. Let us assume that Network_A represents the Internal network,
and Network_B represents the External network.

4. According to the existing standards, when Host_B needs to send data to Host_A,
an ARP Request for the MAC address of Host_A will be sent
by Host_B to Network_B.

Since Host_A is located on another network, and the Security Gateway acts as a
router, this ARP Request (sent to the Broadcast address on Layer 2) will not be
forwarded by the Security Gateway from Network_B to Network_A.

As a result, Host_B will not discover the MAC address of Host_A, and will not be
able to send the data to Host_A.

A standard solution in such cases is to configure the Security Gateway to act
as Proxy ARP.

The Security Gateway will pretend to be the host in question: it will accept the
ARP Requests and send its own MAC Address in the ARP Reply. Then, when the data
is received from the External network, the Security Gateway will forward the data
to the relevant host on the Internal network.

Configuration on the Security Gateway is two-fold:

1. Layer2-to-Layer3 matching - matching IP addresses of the relevant hosts on the
Internal network to the MAC Address of the Security Gateway on the External
network (performed via special configuration file $FWDIR/conf/local.arp).
2. NAT rules
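
The Layer2-to-Layer3 matching can be pictured as a lookup table from internal host IP addresses to the Security Gateway's external MAC address. The Python sketch below models only that lookup; it does not reproduce the syntax of the local.arp file (see sk30197 for that), and the function names, IP addresses and MAC address are hypothetical.

```python
def build_proxy_table(internal_host_ips, gateway_external_mac):
    """Map each proxied host IP to the gateway's MAC on the External network."""
    return {ip: gateway_external_mac for ip in internal_host_ips}


def arp_reply_mac(target_ip, proxy_table):
    """Return the MAC to send in the ARP Reply, or None if the IP is not proxied."""
    return proxy_table.get(target_ip)


if __name__ == "__main__":
    # Hypothetical values: Host_A is reachable as 192.0.2.50 from Network_B.
    table = build_proxy_table(["192.0.2.50"], "00:1c:7f:00:00:01")
    print(arp_reply_mac("192.0.2.50", table))  # gateway answers on behalf of Host_A
    print(arp_reply_mac("192.0.2.99", table))  # not proxied: no reply
```

The NAT rules in the second configuration step are what allow the gateway to forward the received data to the real host; they are outside the scope of this sketch.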

(4-19) ISP Redundancy

If you have a ClusterXL Gateway cluster, connect each cluster member to each ISP
using two physical interfaces.

Refer to ClusterXL Administration Guide (R70, R70.1, R71, R75, R75.20, R75.40,
R75.40VS, R76, R77.X, R80.10) - Chapter 'ClusterXL Advanced Configuration' -
Configuring ISP Redundancy on a Cluster.

Note: ISP Redundancy is not supported in ClusterXL where physical interfaces of
cluster members and cluster VIP are defined on different subnets. Refer to sk66521 (ISP
Redundancy in ClusterXL when interfaces of cluster members and cluster VIP are defined
on different subnets per sk32073).

Refer to FireWall Administration Guide (R70, R71, R75, R75.20, R75.40, R75.40VS) -
Chapter 'ISP Redundancy'.

Refer to Security Gateway Technical Administration Guide (R76, R77.X) - Chapter 3
'ISP Redundancy'.

Refer to these User Guides:


 How To Configure ISP Redundancy
 How To Configure ISP Redundancy in SecurePlatform

Refer to these solutions:


 sk25129 (Supported platforms for ISP Redundancy)
 sk42636 (Controlling connections configured with ISP Redundancy in Load Sharing
mode)
 sk66521 (ISP Redundancy in ClusterXL when interfaces of cluster members and
cluster VIP are defined on different subnets per sk32073)
 sk23630 (Advanced configuration options for ISP Redundancy)
 sk32225 (Configuring ISP Redundancy so that certain traffic uses specific ISP)
 sk40958 (How to verify the status of ISP Redundancy links on command line)

An example of cluster-specific configuration is depicted below:

(4-20) Dynamic Routing

ClusterXL supports Dynamic Routing (Unicast and Multicast) protocols as an integral
part of Check Point operating systems. As the network infrastructure views the clustered
gateways as a single logical entity, failure of a cluster member will be transparent to the
network infrastructure and will not result in a ripple effect.

When configuring the routing protocols on each cluster member, each member is
defined identically, and uses the cluster VIP addresses (not the members' physical IP
addresses). This means that the Router ID should be set to the cluster Virtual IP on
each member.

Note: When configuring OSPF restart, you must define the restart type as signaled or
graceful. For Cisco devices, use type signaled.

Note: If a cluster running on SecurePlatform OS does not participate in Dynamic Routing
protocols, then disable Advanced Dynamic Routing on each cluster member in order to
prevent unexpected cluster failovers due to the FIB pnote.

Refer to these documents:
 Gaia Advanced Routing Administration Guide (R75.40, R75.40VS, R76, R77.X,
R80.10)
 SecurePlatform Advanced Routing Suite CLI Reference Guide (R60, R61, R62, R65,
R70, R71, R75.20, R75.40, R75.40VS, R76, R77.X)
 Release Notes of the given version

Refer to these User Guides:


 SecurePlatform Pro - Advanced Routing Suite - Configuring ClusterXL
 SecurePlatform Pro - Advanced Routing Suite - Configuring OSPF
 SecurePlatform Pro - Advanced Routing Suite - Configuring BGP
 SecurePlatform Pro - Advanced Routing Suite - Configuring BGP and OSPF
 SecurePlatform Pro - Advanced Routing Suite - Configuring PIM
 Guidelines for configuring PIM with VRRP
 PIM-DM and PIM-SM Failover Behaviour in a High Availability Configuration
 How To Configure Policy Based Routing

Refer to these solutions:


 sk39960 (How to allow Dynamic Routing protocols traffic (OSPF, BGP, PIM, RIP,
IGRP) through Check Point Security Gateway)
 sk95968 (OSPF on Gaia)
 sk95967 (BGP on Gaia)
 sk100499 (BGP on Gaia OS - configuring Graceful Restart)
 sk100239 (How to configure PIM on Gaia OS)
 sk100501 (How to configure Routemaps in Gaia Clish)
 sk98936 (How to configure route redistribution and inbound route filters in Gaia
Portal)
 sk32614 (Configuring SecurePlatform Pro for OSPF)
 sk36969 (How to configure OSPF on Security Gateway & UTM-1 Edge VTI
environment)
 sk36646 (OSPF graceful restart types)
 sk42974 (How to manually add route on SecurePlatform to override OSPF route)
 sk32615 (Configuring SecurePlatform Pro for BGP)
 sk86985 (How to configure iBGP to propagate routes)
 sk92836 (PIM HA Mode explained)
 sk32702 (Configuring PIM and IGMP Multicast Protocols)
 sk31934 (ClusterXL IGMP Membership)
 sk33221 (Using ClusterXL with IGMP Snooping-enabled switches)

(4-21) SNMP

Refer to Requirements for software section.

Refer to these solutions:


 sk90860 (How to configure SNMP on Gaia OS)
 sk34511 (How to enable SNMP on SecurePlatform OS)
 sk68560 (How to configure SNMP on SecurePlatform OS)
 sk65923 (How to configure the cluster to send SNMP Trap upon fail-over)
 sk93455 (Send SNMP Trap in the event of a ClusterXL failover to multiple Trap
Servers)
 sk40266 (SNMP, MIBs, and how SNMP traps work)
 sk71980 (Output of a 'snmpwalk' command with 'exec' extension or 'extend'
extension is limited)
 sk65173 (Check Point SNMP sysObjectID .1.3.6.1.2.1.1.2)
 sk40622 (SNMPv3 USM (User-based Security Model) User)
 sk42426 (Hardware Monitoring with SNMP on Power-1 / UTM-1 / Smart-1 / 2012
appliances)

(4-22) Enhanced enforcement of the TCP 3-way handshake

The standard enforcement for a 3-way handshake that initiates a TCP connection
provides adequate security by guaranteeing one-directional stickiness. This means that it
ensures that the SYN-ACK will always arrive after the SYN. However, it does not guarantee
that the ACK will always arrive after the SYN-ACK, or that the first data packet will arrive
after the ACK. If you wish to have a stricter policy that denies all out-of-state packets, you
can configure the synchronization mechanism so that all the TCP connection initiation
packets arrive in the right sequence (SYN, SYN-ACK, ACK, followed by the data). The
price for this extra security is a considerable delay in connection establishment.
To enable enhanced enforcement, use the GuiDBedit Tool to change the value of global
attribute sync_tcp_handshake_mode from minimal_sync (default value) to
complete_sync:
1. Close all SmartConsole windows (SmartDashboard, SmartView Tracker, etc.).
2. Connect to Security Management Server with GuiDBedit Tool.
3. In the left upper pane, go to 'Table' - 'Network Objects' - 'network_objects'.
4. In the right upper pane, select the relevant Cluster object (Class Name -
gateway_cluster).
5. Press CTRL+F (or go to 'Search' menu - 'Find') - paste
sync_tcp_handshake_mode - click on 'Find Next'.
6. In the lower pane, right-click on the sync_tcp_handshake_mode - 'Edit...' -
choose "complete_sync" - click on 'OK'.
7. Save the changes: go to 'File' menu - 'Save All'.
8. Close the GuiDBedit Tool.
9. Connect to Security Management Server with SmartDashboard.
10. Install the policy onto the Cluster object.

Synchronization modes for TCP 3-way handshake:

1. Minimal sync
3-way handshake is not enforced. This mode offers the best connectivity for users
who are willing to compromise on security in this case.

2. Complete sync
All 3-way handshake packets are Synced-and-ACKed, and 3-way handshake is
enforced. This mode slows down connection establishment considerably. It may be
used when there is no way to know where the next packet will go, e.g. in 3rd party
clusters.

3. Smart sync
In most cases, we can assume that if SYN and SYN-ACK were encountered by the
same cluster member, then the connection is “sticky”.
ClusterXL uses one additional flag in Connections Table record that says, “If this
member encounters a 3-way handshake packet, it should sync all other cluster
members”.
When a SYN packet arrives, the member that encountered it records the connection
and turns off its flag. All other members are synced, and by using a
post-sync-handler, their flag is turned on (in their Connections Tables).
If the same member encounters the SYN-ACK packet, the connection is sticky, thus
other cluster members are not informed.
Otherwise, the relevant member will inform all other members (since its flag is turned
on).
The original member (that encountered the SYN) will now turn on its flag, thus all
members will have their flag on.
In this case, the third packet of the 3-way handshake will also be synced.
If for some reason, our previous assumption is not true (i.e., one cluster member
encountered both SYN and SYN-ACK packets, and other members encountered the
third ACK), then the “third” ACK will be dropped by the other cluster members, and
we rely on the periodic sync and TCP retransmission scheme to complete the 3-way
handshake.
This mode is a good solution for Load Sharing users that want to enforce 3-way
handshake verification with the minimal performance cost. It is also recommended
for High Availability New mode.
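
The Smart sync flag logic described above can be sketched as a small simulation. This is only an illustration under stated assumptions, not the kernel implementation: the names Member, receive, sync_flag and last_seen are hypothetical, and the periodic sync and TCP retransmission that eventually complete a dropped handshake are not modeled.

```python
from dataclasses import dataclass, field

HANDSHAKE = ["SYN", "SYN-ACK", "ACK"]


@dataclass
class Member:
    """Hypothetical model of one cluster member's Connections Table."""
    name: str
    sync_flag: dict = field(default_factory=dict)  # connection -> bool
    last_seen: dict = field(default_factory=dict)  # connection -> last known packet


def receive(cluster, receiver, conn, packet):
    """Process one handshake packet on `receiver`.

    Returns 'dropped', 'accepted' or 'accepted+synced'.
    """
    if packet == "SYN":
        # The receiving member records the connection with its flag off.
        # All peers are synced and their flag is turned on.
        for member in cluster:
            member.last_seen[conn] = "SYN"
            member.sync_flag[conn] = member is not receiver
        return "accepted+synced"

    previous = HANDSHAKE[HANDSHAKE.index(packet) - 1]
    if receiver.last_seen.get(conn) != previous:
        # This member never learned about the previous packet: out of state.
        return "dropped"

    if receiver.sync_flag[conn]:
        # Non-sticky connection: inform every member, all flags end up on.
        for member in cluster:
            member.last_seen[conn] = packet
            member.sync_flag[conn] = True
        return "accepted+synced"

    # Sticky connection: the peers are not informed.
    receiver.last_seen[conn] = packet
    return "accepted"


if __name__ == "__main__":
    fw1, fw2 = Member("fw1"), Member("fw2")
    for member, packet in [(fw1, "SYN"), (fw1, "SYN-ACK"), (fw2, "ACK")]:
        print(member.name, packet, receive([fw1, fw2], member, "conn-1", packet))
```

In the sticky case the SYN-ACK is not synced, so an ACK that lands on a different member is dropped in this model, which matches the fallback described in the text.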

5. Cluster state transitions
Refer to ClusterXL definitions and terms section.

The state machine mechanism is triggered by 3 events:


 Starting or stopping of the ClusterXL product / State Synchronization
 Incoming CCP packet with the state information from other cluster members (refer to
FWHAP_MY_STATE Data)
 Timer event - every predefined timeout (refer to CPHA timer)

(5-1) Special notes for state transitions

 When all Critical Devices (Pnotes) report their states as 'ok', the machine will try to
change its state to 'Active', depending on the cluster configuration (HA mode / LS
mode) and states of the peer members.
 Among several properly functioning cluster members working in HA mode, the
member that becomes 'Active' depends on the configuration:
o In 'Active Up' configuration ('Maintain current active Cluster Member') - the first
cluster member (in time) that reaches the 'Ready' state will become
'Active'.
o In 'Primary Up' configuration ('Switch to higher priority Cluster Member') - the
machine with the highest priority will become 'Active'.

 When some Critical Device reports its state as 'problem' on every cluster member,
one of the members will become 'Active' and will enter the derived state 'Active
attention', indicating that it has a failure. The choice of which machine becomes
'Active' is random and does not depend on the machines' priorities or on the
number and type of Critical Devices that report their state as 'problem'.
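
The selection rules in the notes above can be summarized in a short sketch. It is a simplified model, not the product's state machine: the function elect_active and the fields name, ready_time, rank and pnotes_ok are hypothetical, and the convention that rank 1 means the highest priority is an assumption made for this example. It covers only the initial election, not what happens once a member is already 'Active'.

```python
import random


def elect_active(members, mode, rng=random):
    """Pick the member that becomes 'Active' in HA mode.

    members: list of dicts with hypothetical keys
      name       - member name
      ready_time - time (seconds) at which the member reached 'Ready'
      rank       - priority rank, 1 = highest priority (assumption)
      pnotes_ok  - True if all Critical Devices report 'ok'
    mode: 'active_up' or 'primary_up'
    Returns (member name, resulting state).
    """
    healthy = [m for m in members if m["pnotes_ok"]]
    if not healthy:
        # Every member has a failing Critical Device: the choice is random
        # and the winner runs in the derived state 'Active attention'.
        return rng.choice(members)["name"], "Active attention"

    if mode == "active_up":
        winner = min(healthy, key=lambda m: m["ready_time"])
    elif mode == "primary_up":
        winner = min(healthy, key=lambda m: m["rank"])
    else:
        raise ValueError(f"unknown mode: {mode}")
    return winner["name"], "Active"


if __name__ == "__main__":
    cluster = [
        {"name": "fw1", "ready_time": 12.0, "rank": 1, "pnotes_ok": True},
        {"name": "fw2", "ready_time": 8.5, "rank": 2, "pnotes_ok": True},
    ]
    print(elect_active(cluster, "active_up"))
    print(elect_active(cluster, "primary_up"))
```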

(5-2) Policy installation

When the policy is installed onto a cluster member, the fwd daemon calls the
"cphastart" command in order to start the clustering mechanism.
The "cphastart" command is responsible for reading the $FWDIR/conf/objects.C file
in order to get all required information from the cluster object and the cluster members'
objects.
Once done, the "cphastart" command calls the "cphaconf" command with all the
relevant parameters.
The "cphaconf" command performs 2 main actions:
 Moves the configuration parameters to the Check Point kernel (in the kernel, the
parameters are not enforced right away - instead, the new configuration parameters
are buffered, and a process called "policy negotiation" starts)
 Notifies the cphastart daemon about the new loaded policy
The "cphaconf" command sends a signal to the cphamcset daemon to reload the
information from the objects. If the cphamcset daemon is not yet started, it will be started.

The cphamcset daemon is responsible for opening sockets on the NICs in order to
allow them to pass multicast traffic (CCP) to the machine (to check this, run the
'ip maddr show' command).

Check Point kernel has a mechanism, which ensures that all cluster members enforce
the same security policy and the same ClusterXL parameters at any given time.
Since the policy installation does not take place simultaneously on all cluster members
(actually the policy commit is sequential), there may be some time difference between the
installations on all the members.
In order to overcome this problem, the policy negotiation is divided into two phases:
 All members must acknowledge the new policy arrival.
 Then, all members must acknowledge moving into the new policy.

During Phase I, a machine that got new policy sends CCP packets declaring that it got a
new policy with a certain Policy ID.
The following line appears in cluster debug with 'conf' flag:
CPHA: Phase I: Looking for machines in policy update mode...

The other machines also send this CCP packet as soon as they get the new policy. All
the machines wait to receive the confirmation packet from all the other machines, signalling
that the new policy arrived to all cluster members.

Now, Phase II takes place, when the CPHA timer is stopped completely in order to
avoid sending packets with the old parameters, and the new policy parameters are
enforced.
The following line appears in cluster debug with 'conf' flag:
CPHA: Phase II: Looking for machines ready to update policy...
After having done that, each machine sends another packet indicating it completed the
policy change phase.

When all the machines completed the policy change phase, the HA timer is started and
all the machines are updated with the new configuration.

Each one of these steps is backed up with a HA timer, which reverts the process, if not
all the Active machines confirmed the new stage after a certain time (refer to
'fwha_policy_update_timeout_factor' kernel parameter).
In this case, the old parameters are restored.
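
The two-phase negotiation with its revert-on-timeout behaviour can be sketched as follows. This is a minimal model under stated assumptions: the function negotiate_policy and its arguments are hypothetical, each set stands for the members whose CCP confirmation arrived before the timeout expired, and the policy IDs are made-up values. The real mechanism is timer-driven inside the kernel.

```python
def negotiate_policy(members, old_policy_id, new_policy_id,
                     announced_new_policy, confirmed_switch):
    """Model the outcome of one policy negotiation.

    members              - names of the members that must confirm
    announced_new_policy - members that declared the new Policy ID in time (Phase I)
    confirmed_switch     - members that confirmed moving to the new policy
                           in time (Phase II)
    Returns (outcome, policy ID enforced afterwards).
    """
    members = set(members)

    # Phase I: every member must acknowledge that the new policy arrived.
    if not members <= set(announced_new_policy):
        return "reverted in Phase I", old_policy_id

    # Phase II: every member must acknowledge moving into the new policy.
    if not members <= set(confirmed_switch):
        return "reverted in Phase II", old_policy_id

    return "committed", new_policy_id


if __name__ == "__main__":
    print(negotiate_policy(["fw1", "fw2"], 100, 200,
                           announced_new_policy={"fw1", "fw2"},
                           confirmed_switch={"fw1"}))
```

Splitting the exchange into two confirmations means no member starts enforcing the new parameters until all members are known to hold them.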

Debugging:
In order to see the policy installation, debug the 'cluster' module with 'conf' flag (in
addition, these flags are recommended: 'stat', 'pnote', 'if', 'mac'):
[Expert@GW_HostName]# fw ctl debug -m cluster + conf stat pnote if mac

Example from R76 High Availability (Active Up) cluster - from Active member:

; 2Jul2013 13:51:53.832490;[cpu_0];[fw4_0];FW-1: SIM (SecureXL Implementation Module) SecureXL device detected.;


; 2Jul2013 13:51:53.958580;[cpu_2];[fw4_0];FW-1: SecureXL: Connection templates are not possible for the installed policy. Please
refer to the Performance Pack documentation for further details.;
; 2Jul2013 13:51:56.060111;[cpu_2];[fw4_1];FW-1: fwha_set_conf: entered with State=ACTIVE, Blocking State=ACTIVE;
; 2Jul2013 13:51:56.060115;[cpu_2];[fw4_1];FW-1: fwha_set_conf: need_to_set_trusted_ifs=0, need_to_delete_trusted_ifs=0;
; 2Jul2013 13:51:56.060116;[cpu_2];[fw4_1];FW-1: fwha_set_conf: confinfo->op: FWHAC_DEL_TRUSTED_IFS;
; 2Jul2013 13:51:56.060260;[cpu_2];[fw4_1];FW-1: fwha_set_conf: setting HA configuration (op = 0x80):;
; 2Jul2013 13:51:56.060273;[cpu_2];[fw4_1];FW-1: fwha_set_conf: SWITCH SUPPORT ;
; 2Jul2013 13:51:56.060275;[cpu_2];[fw4_1];FW-1: fwha_set_conf: Deleting all Trusted IFs;
; 2Jul2013 13:51:56.060277;[cpu_2];[fw4_1];FW-1: fwha_set_conf: buffering deletion of trusted interfaces;
; 2Jul2013 13:51:56.060279;[cpu_2];[fw4_1];FW-1: fwha_set_conf: setting need_to_delete_trusted_ifs=1 and returning;
; 2Jul2013 13:51:56.175754;[cpu_2];[fw4_1];FW-1: fwha_set_conf: entered with State=ACTIVE, Blocking State=ACTIVE;
; 2Jul2013 13:51:56.175754;[cpu_2];[fw4_1];FW-1: fwha_set_conf: need_to_set_trusted_ifs=0, need_to_delete_trusted_ifs=1;
; 2Jul2013 13:51:56.175754;[cpu_2];[fw4_1];FW-1: fwha_set_conf: confinfo->op: FWHAC_ADD_TRUSTED_IF;
; 2Jul2013 13:51:56.176228;[cpu_2];[fw4_1];FW-1: fwha_set_conf: setting HA configuration (op = 0x100):;
; 2Jul2013 13:51:56.176230;[cpu_2];[fw4_1];FW-1: fwha_set_conf: Trusted IF name = eth1;
; 2Jul2013 13:51:56.176232;[cpu_2];[fw4_1];FW-1: fwha_set_conf: SWITCH SUPPORT ;
; 2Jul2013 13:51:56.176234;[cpu_2];[fw4_1];FW-1: fwha_set_conf: Adding Trusted IF;
; 2Jul2013 13:51:56.176237;[cpu_2];[fw4_1];FW-1: fwha_set_conf: buffering trusted interface info (setting need_to_set_trusted_ifs=1);
; 2Jul2013 13:51:56.176240;[cpu_2];[fw4_1];FW-1: fwha_set_conf: copying confinfo->if_name=eth1 to fwha_trusted_ifs_buffered[0] and
returning;
.........................................
; 2Jul2013 13:51:56.255176;[cpu_3];[fw4_1];CPHA: the list of cluster IPs according to the interface:;
; 2Jul2013 13:51:56.255179;[cpu_3];[fw4_1];Interface: 1) eth0, cluster ip: 172.30.41.79;
; 2Jul2013 13:51:56.255180;[cpu_3];[fw4_1];Interface: 3) eth2, cluster ip: 20.20.20.79;
.........................................
; 2Jul2013 13:52:00.110480;[cpu_2];[fw4_1];CPHA: policy update packet local=NO, random=47647, status=2, policy=1862759333, first=YES,
entry=0;
; 2Jul2013 13:52:00.110489;[cpu_2];[fw4_1];Entry: 0
random_id: 47647
policy_id: 1862759333
update status: 2
time: 2013864;
; 2Jul2013 13:52:01.373818;[cpu_3];[fw4_1];FW-1: fwha_set_conf: entered with State=ACTIVE, Blocking State=ACTIVE;
; 2Jul2013 13:52:01.373820;[cpu_3];[fw4_1];FW-1: fwha_set_conf: need_to_set_trusted_ifs=1, need_to_delete_trusted_ifs=1;
; 2Jul2013 13:52:01.373822;[cpu_3];[fw4_1];FW-1: fwha_set_conf: confinfo->op: FWHAC_START;
; 2Jul2013 13:52:01.373826;[cpu_3];[fw4_1];FW-1: fwha_state_freeze: turning freeze type 0 ON (time=2013876, caller=fwha_set_conf);
; 2Jul2013 13:52:01.373828;[cpu_3];[fw4_1];FW-1: fwha_state_freeze: FREEZING state machine at ACTIVE (time=2013876,
caller=fwha_set_conf, freeze_type=0);
; 2Jul2013 13:52:01.373830;[cpu_3];[fw4_1];FW-1: fwha_set_conf: setting HA configuration (op = 0x30407e):;
; 2Jul2013 13:52:01.373831;[cpu_3];[fw4_1];FW-1: fwha_set_conf: mode = 4 (active up);
; 2Jul2013 13:52:01.373833;[cpu_3];[fw4_1];FW-1: fwha_set_conf: cluster ID = 4916;
; 2Jul2013 13:52:01.373834;[cpu_3];[fw4_1];FW-1: fwha_set_conf: cluster size = 2;

; 2Jul2013 13:52:01.373835;[cpu_3];[fw4_1];FW-1: fwha_set_conf: machine_id = 0;
; 2Jul2013 13:52:01.373837;[cpu_3];[fw4_1];FW-1: fwha_set_conf: Log mode = log(1);
; 2Jul2013 13:52:01.373838;[cpu_3];[fw4_1];FW-1: fwha_set_conf: policy_id = 2147483647;
; 2Jul2013 13:52:01.373839;[cpu_3];[fw4_1];FW-1: fwha_set_conf: SWITCH SUPPORT ;
; 2Jul2013 13:52:01.373841;[cpu_3];[fw4_1];FW-1: fwha_set_conf: Starting HA;
; 2Jul2013 13:52:01.373842;[cpu_3];[fw4_1];FW-1: fwha_set_conf: Stopping HA;
; 2Jul2013 13:52:01.373843;[cpu_3];[fw4_1];FW-1: fwha_set_conf: Vmac Mode: 0;
; 2Jul2013 13:52:01.374296;[cpu_3];[fw4_1];fwha_load_bond_configuration: Succeded getting bond ls required slaves num Data structure
from user space: ret=0;
; 2Jul2013 13:52:01.374336;[cpu_3];[fw4_1];FW-1: fwha_set_conf: bond required slaves num conf have changed since last install policy;
; 2Jul2013 13:52:01.374340;[cpu_3];[fw4_1];FW-1: fwha_set_conf: setting buffered trusted interface info;
; 2Jul2013 13:52:01.374341;[cpu_3];[fw4_1];FW-1: fwha_set_conf: setting need_to_set_trusted_ifs=0, confinfo->op |=
FWHAC_ADD_TRUSTED_IF;
; 2Jul2013 13:52:01.374342;[cpu_3];[fw4_1];FW-1: fwha_set_conf: setting buffered deletion of trusted interfaces;
; 2Jul2013 13:52:01.374344;[cpu_3];[fw4_1];FW-1: fwha_set_conf: setting need_to_delete_trusted_ifs=0, confinfo->op |=
FWHAC_DEL_TRUSTED_IFS;
; 2Jul2013 13:52:01.374346;[cpu_3];[fw4_1];FW-1: fwha_set_conf: calling check_upd_trusted_if();
; 2Jul2013 13:52:01.374348;[cpu_3];[fw4_1];FW-1: check_upd_trusted_if: interface eth1 has been found among buffered ifs;
; 2Jul2013 13:52:01.374350;[cpu_3];[fw4_1];FW-1: check_upd_trusted_if: interface eth1 has been found among trusted ifs;
; 2Jul2013 13:52:01.374352;[cpu_3];[fw4_1];FW-1: policy ID old=3778374695 new=1862759333;
; 2Jul2013 13:52:01.374354;[cpu_3];[fw4_1];FW-1: fwha_set_conf: policy ID will be changed from 3778374695 to 1862759333FW-1: Current
policy update status - 2;
; 2Jul2013 13:52:01.374356;[cpu_3];[fw4_1];CPHA: Sending Policy ID change request. Status: 2;
; 2Jul2013 13:52:01.374377;[cpu_3];[fw4_1];CPHA: policy update packet local=YES, random=56134, status=2, policy=1862759333,
first=YES, entry=1;
; 2Jul2013 13:52:01.374380;[cpu_3];[fw4_1];Entry: 0
random_id: 47647
policy_id: 1862759333
update status: 2
time: 2013864;
; 2Jul2013 13:52:01.374382;[cpu_3];[fw4_1];Entry: 1
random_id: 56134
policy_id: 1862759333
update status: 2
time: 2013876;
; 2Jul2013 13:52:01.374384;[cpu_3];[fw4_1];CPHA: Phase I: Looking for machines in policy update mode...found 2 machines.;
; 2Jul2013 13:52:01.374385;[cpu_3];[fw4_1];CPHA: Sending Policy ID change request. Status: 3;
; 2Jul2013 13:52:01.374391;[cpu_3];[fw4_1];CPHA: Phase II: Looking for machines ready to update policy...found 1 machines.;
; 2Jul2013 13:52:01.374392;[cpu_3];[fw4_1];CPHA: waiting for more machines.;
; 2Jul2013 13:52:01.375847;[cpu_2];[fw4_1];CPHA: policy update packet local=NO, random=47647, status=3, policy=1862759333, first=NO,
entry=0;
; 2Jul2013 13:52:01.375852;[cpu_2];[fw4_1];Entry: 0
random_id: 47647
policy_id: 1862759333
update status: 3
time: 2013876;

; 2Jul2013 13:52:01.375854;[cpu_2];[fw4_1];Entry: 1
random_id: 56134
policy_id: 1862759333
update status: 3
time: 2013876;
; 2Jul2013 13:52:01.375856;[cpu_2];[fw4_1];CPHA: Phase II: Looking for machines ready to update policy...found 2 machines.;
; 2Jul2013 13:52:01.375858;[cpu_2];[fw4_1];CPHA: All machines are ready to change their configuration.;
; 2Jul2013 13:52:01.375891;[cpu_2];[fw4_1];FW-1: Stopping ClusterXL.;
; 2Jul2013 13:52:01.375924;[cpu_2];[fw4_0];FW-1: stopping HA timer;
; 2Jul2013 13:52:01.375929;[cpu_2];[fw4_1];FW-1: stopping HA timer;
; 2Jul2013 13:52:01.375933;[cpu_2];[fw4_2];FW-1: stopping HA timer;
; 2Jul2013 13:52:01.376515;[cpu_2];[fw4_1];FW-1: fwha_bond_set_configuration: entering ...;
; 2Jul2013 13:52:01.376565;[cpu_2];[fw4_1];FW-1: fwha_conf_mode: fwha_installed=1, fwha_mode=4, mode=4, pivot_mode=0;
; 2Jul2013 13:52:01.376571;[cpu_2];[fw4_1];FW-1: Changing the machine ID to 0;
; 2Jul2013 13:52:01.376575;[cpu_2];[fw4_1];FW-1: set_use_sdf: Setting sdf mode to 0;
; 2Jul2013 13:52:01.376587;[cpu_2];[fw4_1];FW-1: fwha_reset_trusted_ifs: resetting required if number;
; 2Jul2013 13:52:01.376591;[cpu_2];[fw4_1];FW-1: add_trusted_if: added interface eth1 in position 0 in list;
; 2Jul2013 13:52:01.376596;[cpu_2];[fw4_1];fwha_set_vmac_state: fwha_vmac_global_param_enabled=0, ha_new_config.cluster_vmac_mode =
0, fwha_pivot_mode = 0, FWHA_USE_BACKUP_MODE() = 1, enable_vmac=0;
; 2Jul2013 13:52:01.376598;[cpu_2];[fw4_1];fwha_set_vmac_state: vmac mode should be disabled;
; 2Jul2013 13:52:01.376600;[cpu_2];[fw4_1];fwha_set_vmac_state: vmac state was not changed=0;
; 2Jul2013 13:52:01.385388;[cpu_2];[fw4_0];fwha_set_sync_tcp_handshake_mode: mode=MINIMAL. Disabling TCP handshake enforcement;
; 2Jul2013 13:52:01.385395;[cpu_2];[fw4_1];fwha_set_sync_tcp_handshake_mode: mode=MINIMAL. Disabling TCP handshake enforcement;
; 2Jul2013 13:52:01.385407;[cpu_2];[fw4_2];fwha_set_sync_tcp_handshake_mode: mode=MINIMAL. Disabling TCP handshake enforcement;
; 2Jul2013 13:52:01.385452;[cpu_2];[fw4_0];FW-1: starting HA timer;
; 2Jul2013 13:52:01.385454;[cpu_2];[fw4_1];FW-1: starting HA timer;
; 2Jul2013 13:52:01.385455;[cpu_2];[fw4_2];FW-1: starting HA timer;
; 2Jul2013 13:52:01.385464;[cpu_2];[fw4_1];fwha_df_set_force_df_ips_only_mode: is_nac_enabled = 0;
; 2Jul2013 13:52:01.385469;[cpu_2];[fw4_1];fwha_df_set_force_df_ips_only_mode: multi_portal_enabled = 0;
; 2Jul2013 13:52:01.385470;[cpu_2];[fw4_1];fwha_df_set_force_df_ips_only_mode: old force df ips only mode: 0, new force df ips only
mode: 0;
; 2Jul2013 13:52:01.385484;[cpu_2];[fw4_1];FW-1: Starting ClusterXL.;
; 2Jul2013 13:52:01.385500;[cpu_2];[fw4_1];FW-1: fwha_state_freeze: turning freeze type 0 OFF (time=2013876, caller=policy change -
finished changes (fwha_start));
; 2Jul2013 13:52:01.385503;[cpu_2];[fw4_1];FW-1: fwha_state_freeze: ENABLING state machine at ACTIVE (time=2013876,caller=policy
change - finished changes (fwha_start));
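
When reading a long debug capture like the one above, it can help to extract only the Phase I / Phase II progress lines. The following is a hedged sketch of such a filter: the helper phase_progress and the regular expression are not part of any Check Point tool, and they assume the message wording shown in this example, which may differ between versions.

```python
import re

# Matches e.g. "CPHA: Phase II: Looking for machines ready to update
# policy...found 1 machines." and captures the phase and the count.
PHASE_RE = re.compile(
    r"CPHA: Phase (II|I): Looking for machines .*?\.\.\.found (\d+) machines"
)


def phase_progress(debug_lines):
    """Return a list of (phase, machines_found) in order of appearance."""
    progress = []
    for line in debug_lines:
        match = PHASE_RE.search(line)
        if match:
            progress.append((match.group(1), int(match.group(2))))
    return progress


if __name__ == "__main__":
    sample = [
        "; 2Jul2013 13:52:01.374384;[cpu_3];[fw4_1];CPHA: Phase I: Looking for "
        "machines in policy update mode...found 2 machines.;",
        "; 2Jul2013 13:52:01.374391;[cpu_3];[fw4_1];CPHA: Phase II: Looking for "
        "machines ready to update policy...found 1 machines.;",
        "; 2Jul2013 13:52:01.374392;[cpu_3];[fw4_1];CPHA: waiting for more machines.;",
    ]
    for phase, found in phase_progress(sample):
        print(f"Phase {phase}: {found} machine(s)")
```

Note that debug output wrapped across lines (as in this printed example) would need to be re-joined before matching.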

(5-3) State transitions of the cluster member

 In ClusterXL:

[Diagram: state transitions of a ClusterXL member. States shown: Initializing, Ready,
Down, Standby, Active. Transition conditions shown in the diagram:
o Initialization - built-in Devices report OK
o LS mode, or no other active machines heard
o HA mode, and other machine is Active
o Periodic check of Devices OK; Interface Active Check reports OK
o All non-problematic machines confirmed the Active state; there are no members that
send lower version of CCP
o No other Active machines in the cluster
o Interface Active Check reports problem; Other Critical Device reports problem]

 In 3rd party cluster:

Note: Only 'Down' and 'Active' states are available, since they reflect the status of the
State Synchronization (which is the only active mechanism).

[Diagram: Down -> Active when Full Sync is successful; Active -> Down on State Sync
failure.]

(5-4) State transitions due to 'FWHAP_MY_STATE' packet

Each time the cluster member receives a CCP packet with OpCode 1
(FWHAP_MY_STATE), the decision mechanism is invoked and is required to re-evaluate the
state of the current machine:

[Diagram: two panels, each showing the states Down and Active:
o High Availability configuration - the previously Active machine sent a "Down" state
packet
o Any cluster configuration - all active machines stopped sending state packets]

(5-5) State transitions due to a Critical Device (Pnote)

If any Critical Device (Pnote) reports its state as 'problem', the state of the current
machine is changed:

[Diagram: Active/Standby -> Down on Critical Device failure; Down -> Active/Standby when
Critical Devices report OK.]

(5-6) State transitions due to the 'Interface Active Check' Critical Device
(Pnote)

State of the interface is changed based on the arrival/transmission of CCP packets:


o The transmit state of an interface (as monitored by this pnote) is refreshed once a
FWHAP_IF_PROBE_RPLY packet is received in acknowledgment of a
FWHAP_IF_PROBE_REQ packet.
o The receive state of an interface (as monitored by this pnote) is refreshed once any
CCP packet (UDP on port 8116) is received.

State of an interface as displayed in the output of 'cphaprob -a if' command:


 UP - CCP packets are received and sent within the predefined timeouts
 DOWN - CCP packets are not received and not sent (beyond the predefined timeouts),
and probing mechanism was able to determine the problematic interface
 Inbound: UP , Outbound: DOWN - CCP packets are not sent (beyond the
predefined timeouts)
 Inbound: DOWN , Outbound: UP - CCP packets are not received (beyond the
predefined timeouts)
 Inbound: DOWN , Outbound: DOWN - CCP packets are not received and not sent
(beyond the predefined timeouts), and probing mechanism was not yet able to
determine the problematic interface
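
The mapping between CCP traffic and the displayed interface state can be expressed as a small decision function. This is an illustrative sketch only: displayed_interface_state and its parameters are hypothetical names, the returned strings are simplified labels rather than the exact command output, and "receiving" / "sending" mean that CCP packets were seen within the predefined timeouts.

```python
def displayed_interface_state(receiving, sending,
                              probing_found_faulty_interface=False):
    """Map CCP receive/send health to the state label listed above.

    receiving - CCP packets are received within the predefined timeouts
    sending   - CCP packets are sent within the predefined timeouts
    probing_found_faulty_interface - probing has already determined
        which interface is the problematic one
    """
    if receiving and sending:
        return "UP"
    if receiving:
        return "Inbound: UP, Outbound: DOWN"
    if sending:
        return "Inbound: DOWN, Outbound: UP"
    # Neither direction works: the label depends on whether probing
    # has already identified the problematic interface.
    if probing_found_faulty_interface:
        return "DOWN"
    return "Inbound: DOWN, Outbound: DOWN"


if __name__ == "__main__":
    print(displayed_interface_state(receiving=False, sending=True))
```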

The state of the interface in Check Point kernel (as seen in the debug of 'cluster'
module with flag 'if') is changed in the following way (assuming default values of
relevant kernel parameters):
 From UP to ASSUMED UP - after total of 0.6 sec and 1.2 sec of not receiving/sending
CCP packets
 From ASSUMED UP to UNKNOWN - after total of 2.2 sec of not receiving/sending CCP
packets

Refer to these solutions:
 sk62863 (Cluster debug shows interface flapping due to the missing CCP packets)
 sk43984 (Interface flapping when cluster interfaces are connected through several
switches)

Note about 3rd party cluster:


The FWHAP_IF_PROBE_REQ and FWHAP_IF_PROBE_RPLY packets are not sent.
The timeout of the interface is based on the FWHAP_IFCONF_RPLY packets that are
sent every 1 second. The total timeout for determining that an interface is not responding
in a certain direction is 3 seconds.

(5-7) Actions performed by a cluster member following a state transition

As a result of any state transition, the state machine performs the corresponding actions,
depending on the old and the new state. All actions can be divided into several groups
according to the ClusterXL components and other FW-1 features integrated with ClusterXL:

1. Re-establishing network connectivity to the cluster - during the transition to the
Active state, the member usually sends Gratuitous ARP request packets to update the
ARP cache of the hosts/routers connected to the cluster interfaces by advertising the
new MAC address for the cluster Virtual IP addresses.

2. Pivot selection refresh - when working in the Load Sharing Unicast mode, the
transition to/from the Active state will invoke the Pivot selection mechanism and, if
needed, the recalculation of the Pivot packet selection table, which will redistribute the
traffic among the cluster members according to the new state.

3. Automatic proxy ARP refresh - when the automatic Proxy ARP feature is enabled, the
new Active cluster member in High Availability New mode and Load Sharing Unicast
mode will issue Gratuitous ARP Requests according to the contents of the FW-1 kernel
"arp_table". For each entry in that table, an ARP Request will be sent containing a
Proxied IP address and the local MAC address.

4. SecureXL selection refresh - when SecureXL is configured, the transitions from/to
the Active state will cause the state machine to issue updates to the SecureXL device
with the new cluster member state.

5. Synchronization buffer flush - when the machine stops processing the FW-1
gateway traffic (which usually happens when it is moving from the Active state to any
other state), the synchronization buffer, which might hold Delta Sync data
regarding the connections processed so far, is flushed. This is done in order to update
the rest of the cluster members, which will handle the connections that belonged to the
machine changing its state.

6. Accounting information flush - in the case of a Synchronization buffer flush, the
machine will flush Accounting information to the Security Management Server.

6. Cluster Control Protocol (CCP)
(6-1) Introduction

The Cluster Control Protocol (CCP) is a proprietary Check Point protocol that runs on
UDP port 8116 (packets are not encrypted).

Refer to Cluster Control Protocol Reference document (this document applies to all
cluster versions since NG FP3).

CCP is located between the Check Point kernel and the network interface (therefore, only
TCPdump/Snoop utilities should be used for capturing this traffic).

Historical fact: The first release of Check Point ClusterXL supported only High Availability
Mode. The CCP at that time was called High Availability Protocol - HAP.
The ordinal numbers of letters in the alphabet are: H(8) + A(1) + P(16) = "8116".

CCP runs only on cluster and sync interfaces (refer to ClusterXL definitions and terms
section).

CCP has the following roles:

 State Synchronization - cluster members exchange Delta Sync packets about the
processed connections to keep the relevant kernel tables synchronized on all cluster
members.

Note: Each Delta Packet contains many pieces of information about different
connections. The payload of these Delta Sync packets is not encrypted, but it is not
human-readable (i.e., sniffing this traffic will not allow anyone to understand the
contents of these packets). The only way to understand what was transferred in these
packets is to run the relevant cluster debug on all cluster members (fw ctl debug
-m fw + sync).

It is up to the cluster administrator to make sure the Sync network is secured and
isolated.

 Health checks - cluster members exchange reports and query each other about their
own states and the states of their cluster interfaces:
o Health-status Reports
o Cluster-member Probing
o State-change Commands
o Querying for Cluster Membership

Notes:
o These CCP packets are not encrypted.
o This applies only to ClusterXL - Check Point cluster running on Gaia OS /
SecurePlatform OS / X-Series COS / Windows OS / Solaris OS (R6X and lower).
o In 3rd Party clusters (e.g., Check Point cluster running on X-Series XOS / IPSO
OS), the 3rd Party software is responsible for health checks.

Explanations:
o Health-status Reports - These reports contain the state of the transmitting cluster
member, as well as the presumed state of the other cluster members.

o Cluster-member Probing - If a cluster member fails to receive status for another
member (does not receive CCP packets from that member) on a given segment, the
cluster member will probe that segment in a best-effort attempt to elicit a response.

The purpose of such probes is to detect (best-effort) the nature of possible
interface failures, and to determine which module has the problem.

The outcome of this probe will determine what action is taken next (change the
state of an interface, or of a cluster member).
Cluster member sends a CCP packet 'FWHAP_IF_PROBE_REQ'.
Cluster member sends a series of ARP Requests in a loop for all IP addresses
on this subnet.
If hosts on this subnet send ARP Replies to cluster member, then cluster
member sends series of ICMP Requests (one such host is enough).
If hosts on this subnet send ICMP Replies to cluster member (one such host is
enough), then the local interface on this member is considered to work correctly,
and the missing CCP packets from peer member are considered as a failure on
peer member.
As a result, the peer member might be declared as failed ('Down'), which in
turn might cause a fail-over in the cluster.

Example:

If cluster member FW1 is not able to send/receive CCP packets to/from the
other member FW2 on the interface eth1, member FW1 needs to
determine where the problem occurs - on the local interface eth1 or on the
other member - and perform a fail-over (if needed).

There are 2 possible reasons why member FW1 is not able to
send/receive CCP packets to/from the other member FW2:
o Cluster mechanism on the other member FW2 does not work anymore -
nobody can send CCP packets to this member FW1 and receive CCP
packets from this member FW1.
o Local interface eth1 on this member FW1 does not work anymore -
there is no traffic at all.

A human administrator can always determine where the problem
is - check cables, send pings, etc.
A cluster member is not that smart and has to rely on some simple tests that
are called "Probing".

When a member starts probing, it starts sending ARP
Requests for the IP addresses in the subnet.

For example, on FW1, the IP address on eth1 is 192.168.196.201 / 23.
It means, that the following ARP Requests will be sent:
Who has IP 192.168.196.203
Who has IP 192.168.196.204
Who has IP 192.168.196.205
Who has IP 192.168.196.206
....................................
Who has IP 192.168.196.255
Who has IP 192.168.197.0
Who has IP 192.168.197.1
....................................
Who has IP 192.168.197.255
Who has IP 192.168.196.1
Who has IP 192.168.196.2
Who has IP 192.168.196.3
....................................

If there are hosts with such IP addresses on the subnet, they will send an
ARP Reply to the cluster member (one such host is enough).
Cluster member starts sending ICMP Requests to the IP addresses that
answered the ARP Requests.
If the hosts send an ICMP Reply to the cluster member (one such host is
enough), then the cluster member FW1 will know that it can send usual traffic
through this interface eth1 and the problem with CCP packets must be
happening on the other member FW2.
If this cluster member FW1 is not able to determine where the problem is,
this interface eth1 will be declared as Failed (and by design, a fail-over will
occur).
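The order of the ARP probes in the example can be sketched in Python. This is only an illustration: the function name `arp_probe_targets` and the `skip` parameter are hypothetical, and the walk order (start right after the member's own address, continue to the end of the subnet, then wrap around past the network address) is an assumption read off the list above, not a confirmed description of the kernel code. The example list starts at .203, so the sketch assumes that .202 belongs to the peer member and passes it in `skip`.

```python
import ipaddress


def arp_probe_targets(interface_cidr, skip=()):
    """Yield candidate ARP probe targets for the subnet of interface_cidr.

    Assumed order (taken from the example list, not from product code):
    start after the member's own address, walk to the end of the subnet,
    then wrap around to the first address after the network address.
    """
    iface = ipaddress.ip_interface(interface_cidr)
    net = iface.network
    # Addresses that are never probed: the member itself plus the caller's list.
    excluded = {iface.ip} | {ipaddress.ip_address(a) for a in skip}
    start = int(iface.ip) - int(net.network_address)
    for step in range(1, net.num_addresses):
        offset = (start + step) % net.num_addresses
        if offset == 0:
            continue  # the network address itself is skipped (assumption)
        target = net.network_address + offset
        if target not in excluded:
            yield target


targets = list(arp_probe_targets("192.168.196.201/23", skip=["192.168.196.202"]))
print(targets[0], "...", targets[-1])
```

With a /23 mask the walk crosses from 192.168.196.255 into 192.168.197.0, which is why both ranges appear in the example.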

o State-change Commands - If a cluster member needs to change its state, the
command to do so takes place on the defined secured (sync) interface.

o Querying for Cluster Membership - When a cluster member comes online, it will
send a series of CCP query/response messages, to gain knowledge of cluster
membership (which members are located on these subnets).

(6-2) CCP and security policy rule base

CCP is located between the Check Point kernel and the network interface. Therefore,
there is no need to add a rule to the Security Policy Rule Base that accepts CCP packets.

It is not possible to drop CCP packets based on Security Policy Rule Base (even if such
rule is created, the CCP packets will still be processed by Check Point kernel).
Refer to sk44177 (SmartView Tracker repeatedly shows drops with "Source and
destination addresses are equal").

It is not possible to perform NAT on CCP packets.

To capture the CCP packets, use only TCPdump/Snoop utilities (do not use Check Point
FW Monitor).

(6-3) CCP internal timers

In order for the ClusterXL to be as robust as possible, it is designed to send CCP packets
and expects to receive CCP packets on time - based on internal CCP timers.

If CCP packets are not sent/received on time (as expected based on internal CCP
timers), the internal ClusterXL algorithms will suspect that there is a problem with the state of
the involved interface(s) and/or with the state of the cluster member(s). Eventually, a
problematic interface, or the whole member might be declared as failed.

For example, if Member_B does not receive CCP packets from Member_A on interface
eth3, then Member_B might declare its interface eth3 as 'Down', or even declare itself as
'Down'. This, in turn, might lead to a fail-over between cluster members.

The operation of CCP is based on the following two separate internal timers:

 Sync timer

Purpose:
Performs sync-related actions every fixed interval. By default, the sync timer interval is
100ms. The base time unit is 100ms (or 1 tick), which is therefore the minimal value. This
time interval is controlled via global kernel parameter.

Global kernel parameter:
fwha_timer_sync_res

Formula in the code:
Sync timer interval =
= 10 x fwha_timer_base_res x fwha_timer_sync_res =
= 10 x 10 ms x fwha_timer_sync_res

Parameter values:
Integers from 1 (default) to 2^32-1

Notes:
o Increasing this value increases the time interval between Delta Sync actions. For
example, if the timer is doubled to 200 ms (fwha_timer_sync_res=2), then the
time interval between Delta Sync actions also doubles to 200 ms.
o Refer to sk41471 (ClusterXL - State Synchronization time interval and
'fwha_timer_sync_res' kernel parameter).

 CPHA timer

Purpose:
Performs cluster-related actions every fixed interval. By default, the CPHA timer interval
is 100ms. The base time unit is 100 ms (or 1 tick), which is also the minimum value. This
time interval is controlled via global kernel parameter.

Global kernel parameter:
fwha_timer_cpha_res

Formula in the code:
CPHA timer interval =
= 10 x fwha_timer_base_res x fwha_timer_cpha_res =
= 10 x 10 ms x fwha_timer_cpha_res

Parameter values:
Integers from 1 (default) to 2^32-1

Notes:
 If the cluster members are geographically separated from each other (e.g., located in
different cities), set the CPHA timer to be around 10 times the round-trip time (RTT) of
the synchronization network.
 Increasing this value increases the time it takes to detect a failure. For example, if
detecting an interface failure takes 0.3 seconds, and the timer is doubled to 200 ms
(fwha_timer_cpha_res=2), then the time needed to detect an interface failure also
doubles - to 0.6 seconds.
 Refer to sk43872 (ClusterXL - CCP packets and fwha_timer_cpha_res parameter).
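The timer formula and the RTT guideline can be worked through in a few lines of Python. This is a minimal sketch: the function names are hypothetical, the 10 ms base resolution is taken from the formula given for both timers, and rounding the "around 10 times the RTT" guideline up to the next whole tick is an assumption.

```python
import math

FWHA_TIMER_BASE_RES_MS = 10   # base resolution from the formula (10 ms)
MAX_RES = 2**32 - 1           # upper bound of the parameter range


def timer_interval_ms(res):
    """Interval in ms for fwha_timer_sync_res or fwha_timer_cpha_res = res."""
    if not 1 <= res <= MAX_RES:
        raise ValueError("value must be an integer from 1 to 2^32-1")
    return 10 * FWHA_TIMER_BASE_RES_MS * res


def suggest_cpha_res(rtt_ms, factor=10):
    """Smallest fwha_timer_cpha_res whose interval covers factor x RTT.

    Rounding up is an assumption; the guide only says "around 10 times".
    """
    wanted_ms = factor * rtt_ms
    return max(1, math.ceil(wanted_ms / timer_interval_ms(1)))


print(timer_interval_ms(1), timer_interval_ms(2), suggest_cpha_res(25))
```

For a sync network with a 25 ms round-trip time, the sketch suggests a value of 3 (a 300 ms interval); the resulting increase in failure detection time should be weighed as described in the notes.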

(6-4) CCP modes

CCP can run in these modes:


 Multicast (default since NG FP3 HF2) - the Layer 2 Destination MAC address of CCP
packets is 01:00:5E:X:X:X
 Broadcast - the Layer 2 Destination MAC address of CCP packets is
FF:FF:FF:FF:FF:FF
 Unicast - the Layer 2 Destination MAC address of CCP packets is the physical MAC
address of specific cluster member(s). This mode is used:
o On VSX cluster in VSLS configuration - when the number of configured Virtual
Systems is less than the number of cluster members
o On 40000/60000 appliance (starting in R75.40VS for 61000) - refer to
'asg_sync_manager' utility (60000 Security System Administration Guide)

In VSX cluster:
 VSX NGX / VSX NGX R65 / VSX NGX R67 / VSX NGX R68:
o The only possible mode of CCP is Broadcast.
 R75.40VS / R76 and above:
o CCP mode over Sync Network is Broadcast for all Virtual Systems.
o CCP mode over non-Sync Networks is Multicast.
 In VSLS configuration, when instances of Virtual Systems are not running on all
cluster members (e.g., only 2 VSs were configured on a VSX cluster that has 4 cluster
members), the Delta Sync packets generated by a Virtual System are sent in Unicast
only to those members that run an instance of the same Virtual System.
Refer to sk36644 (The Mode of Cluster Control Protocol (CCP) in VSX cluster).
Note: The CCP mode is not set on Virtual Switches because they do not send CCP
packets.

It is possible to change the CCP mode on-the-fly. Refer to sk20576 (How to set
ClusterXL Control Protocol (CCP) in Broadcast / Multicast mode in ClusterXL):

Notes:
 This change must be done on all members of the cluster.
 This change is applied immediately.
 This change survives reboot:
o Unix OS: refer to $FW_BOOT_DIR/ha_boot.conf file
o Windows OS: refer to Windows Registry key
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\CPHA\CCP
_mode

Procedure:

 To check the current mode, run:


[Expert@HostName]# cphaprob -a if

Notes:
o The CCP mode will appear at the end of the line.
o In VSX R68 and lower, the mode is not displayed (only Broadcast is supported).

Example from ClusterXL:

Required interfaces: 4
Required secured interfaces: 1

eth0 UP non sync(non secured), multicast
eth1 UP sync(secured), multicast
eth2 UP non sync(non secured), multicast
eth3 UP non sync(non secured), multicast

 To change the CCP mode to broadcast mode, run:


[Expert@HostName]# cphaconf set_ccp broadcast

 To change the CCP mode to multicast mode, run:


[Expert@HostName]# cphaconf set_ccp multicast
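When the mode has to be checked on several members, the output of 'cphaprob -a if' can be post-processed. The following Python sketch assumes the output layout of the ClusterXL example above (interface name first, CCP mode after the last comma); the function name `parse_ccp_modes` is hypothetical, and other versions may print the lines differently.

```python
SAMPLE_OUTPUT = """\
Required interfaces: 4
Required secured interfaces: 1

eth0 UP non sync(non secured), multicast
eth1 UP sync(secured), multicast
eth2 UP non sync(non secured), multicast
eth3 UP non sync(non secured), multicast
"""


def parse_ccp_modes(output):
    """Map interface name -> CCP mode, based on the sample layout only."""
    modes = {}
    for line in output.splitlines():
        # The CCP mode is expected at the end of the line, after a comma.
        head, sep, mode = line.strip().rpartition(",")
        fields = head.split()
        if not sep or len(fields) < 2:
            continue  # summary lines and blank lines carry no mode
        modes[fields[0]] = mode.strip()
    return modes


modes = parse_ccp_modes(SAMPLE_OUTPUT)
print(modes)
```

Comparing the resulting dictionaries from all members is a quick way to confirm that the mode was changed everywhere.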

(6-5) CCP and VLAN interfaces

CCP has the following roles:


 Health checks (state of cluster members and of cluster interfaces)
 State Synchronization (Delta Sync)

In ClusterXL (including VSX), the Synchronization Network (CCP packets that carry Delta
Sync information) is supported only on the lowest VLAN tag of a VLAN interface. For
example, if three VLANs with tags 10, 20 and 30 were configured on interface eth1, then
only interface eth1.10 may be used for State Synchronization.

Default health checks of cluster interfaces (monitoring of interfaces by ClusterXL) are
performed in the following way:

Interface            ClusterXL                              VSX cluster
Physical interfaces  all cluster interfaces are monitored   all cluster interfaces are monitored
VLAN interfaces      only lowest VLAN tag is monitored      HA: only lowest and highest VLAN tags are monitored
                     only lowest and highest VLAN tags      VSLS: all VLAN tags are monitored
                     are monitored (since R75.47/R77)
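The three default schemes can be illustrated with a short Python sketch. The scheme names used here ("lowest", "lowest_and_highest", "all") and the function name are hypothetical labels for this example, not product settings.

```python
def monitored_vlans(tags, scheme):
    """Return the VLAN tags that would be monitored under a given scheme."""
    ordered = sorted(set(tags))
    if not ordered:
        return []
    if scheme == "lowest":
        return ordered[:1]
    if scheme == "lowest_and_highest":
        # A single configured tag is both the lowest and the highest.
        return sorted({ordered[0], ordered[-1]})
    if scheme == "all":
        return ordered
    raise ValueError("unknown scheme: %r" % scheme)


print(monitored_vlans([10, 20, 30], "lowest_and_highest"))
```

For interface eth1 with VLAN tags 10, 20 and 30, the "lowest" scheme monitors only eth1.10, which is also the only VLAN that may carry State Synchronization.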

It is possible to customize the default monitoring of VLAN tags in the following way:

Monitor VLAN tag                    ClusterXL                     VSX cluster
Only lowest VLAN tag                default                       need to disable the default behaviour *
Only lowest and highest VLAN tag    default (since R75.47 / R77)  default (HA)
All VLAN tags                       not supported                 default (VSLS)
Only specific VLAN tag (since R71)  Refer to sk92784 (Configuring VLAN Monitoring on ClusterXL for specific VLAN interface)

* Note: In VSX cluster, in order to disable the default monitoring behaviour, set the value of
the relevant kernel parameter to 0 (zero):
 Pre-R75.40VS versions: fwha_monitor_all_vlans
 R75.40VS / R76 and above: fwha_monitor_all_vlan

Refer to sk35462 (Abnormal behavior of cluster members during failover when 'Monitor all
VLAN' feature is enabled).

Starting in R80.10, output of the cphaprob -a -m if command also prints the VLAN
monitoring scheme.

(6-6) CCP packet header

The protocol is sub-divided into several types. Packets of different types are used to send
machine status reports, query interfaces of other machines, and perform safe update of their
internal state (policy). A special type of message is used to perform State
Synchronization between cluster members, i.e., notify cluster members of connections
handled by each other.

(6-6-A) External Header

Frame layout (16 bytes per row, columns 0 - 15):

Offset 0:   Layer 2 Destination MAC address | Layer 2 Source MAC address | Eth Type (0x0800) | IP Ver (4) | IP Hdr Len
Offset 16:  IP Total Length | IP datagram ID | IP Flags + Fragment Offset | TTL | IP Proto (11) | IP header checksum | Layer 3 Source IP address | Layer 3 Destin. IP addr
Offset 32:  Layer 3 Destin. IP addr (cont.) | Layer 4 Source Port | Layer 4 Destin. Port | Total Length | UDP checksum | CCP Header and Data
Offset 48:  CCP Header and Data (cont.)
Offset 64:  CCP Header and Data (cont.)

Important Note: It is not possible to control the CCP packets by security policy rule base
(neither by security rules, nor by NAT rules) because CCP is located between the Check
Point kernel and the network interface.

Length of external headers:


o Ethernet Header = 14 bytes
o IP Header = 20 bytes
o UDP Header = 8 bytes
o CCP offset = 42 bytes from frame start

Let us review the header fields:

o Layer 2

Refer to sk25977 (Connecting multiple clusters to the same network segment (same
VLAN, same switch)).

o Destination MAC address (Bytes 0 - 5)

In ClusterXL:
Value Notes
01:00:5e:YY:ZZ:WW  when VIP address is configured on these interfaces
 when CCP is set to run in multicast mode
 when sent over non-secured (non-sync) interfaces
 YY:ZZ:WW = concatenation of 3 last octets of VIP

Algorithm for VIP address = "A"."B"."C"."D":

o If 2nd octet "B" ≤ 127, then Final MAC =


01:00:5E:("B"hex):("C"hex):("D"hex)

Example:
If VIP = 192.50.204.20, then Final MAC =
01:00:5E:("50"hex):("204"hex):("20"hex) =
01:00:5E:32:CC:14

o If 2nd octet "B" > 127, then Final MAC =


01:00:5E:("B-128"hex):("C"hex):("D"hex)

Example:
If VIP = 192.168.204.20, then Final MAC =
01:00:5E:("168-128"hex):("204"hex):("20"hex) =
01:00:5E:28:CC:14
01:00:5e:YY:ZZ:WW  when there is no VIP configured on this interface
 when CCP is set to run in multicast mode
 when sent over non-secured (non-sync) interfaces
 YY = the 2nd octet (from the left) of the final
calculated IP address after adding 250 to the
interface's network address
 ZZ = the 3rd octet (from the left) of the final calculated
IP address after adding 250 to the interface's network
address
 WW = the 4th octet (from the left) of the final
calculated IP address after adding 250 to the
interface's network address

Algorithm:
1. Calculate the interface's network address - perform
logical AND between the interface's IP address
and subnet mask
2. Add 250 to the calculated interface's network
address
3. Convert the 2nd (YY), 3rd (ZZ) and 4th (WW)
octets of the final calculated IP address from Dec
to Hex format

Example #1
A. The interface's IP address and subnet mask are:
192.168.40.100 / 24
B. The interface's network address is:
192.168.40.100 AND 255.255.255.0 =
192.168.40.0
C. The final calculated IP address is:
192.168.40.0 + 250 =
11000000.10101000.00101000.00000000 +
00000000.00000000.00000000.11111010 =
11000000.10101000.00101000.11111010 =
192.168.40.250
D. The converted octets are:
"168" dec = "A8" hex
"40" dec = "28" hex
"250" dec = "FA" hex
E. Hence, the Final MAC:
01:00:5E:("168"hex):("40"hex):("250"hex) =
01:00:5E:A8:28:FA

Example #2
A. The interface's IP address and subnet mask are:
192.168.40.100 / 29
B. The interface's network address is:
192.168.40.100 AND 255.255.255.248 =
192.168.40.96
C. The final calculated IP address is:
192.168.40.96 + 250 =
11000000.10101000.00101000.01100000 +
00000000.00000000.00000000.11111010 =
11000000.10101000.00101001.01011010 =
192.168.41.90
D. The converted octets are:
"168" dec = "A8" hex
"41" dec = "29" hex
"90" dec = "5A" hex
E. Hence, the Final MAC:
01:00:5E:("168"hex):("41"hex):("90"hex) =
01:00:5E:A8:29:5A
FF:FF:FF:FF:FF:FF  when CCP is set to run in broadcast mode
 when sent over secured (sync) interfaces
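Both multicast MAC derivations can be checked with a small Python sketch that reproduces the worked examples. The function names are hypothetical. For the VIP case, the second octet is masked with 0x7F, which leaves values below 128 unchanged and subtracts 128 otherwise; for the case without a VIP, the octets are used unmasked, as in Example #1 and Example #2.

```python
import ipaddress


def _mac_from_octets(b, c, d):
    """Format the three low octets behind the fixed 01:00:5E prefix."""
    return "01:00:5E:{:02X}:{:02X}:{:02X}".format(b, c, d)


def ccp_multicast_mac_with_vip(vip):
    """Destination MAC when a VIP address is configured on the interface."""
    _, b, c, d = ipaddress.IPv4Address(vip).packed
    return _mac_from_octets(b & 0x7F, c, d)


def ccp_multicast_mac_without_vip(interface_cidr):
    """Destination MAC when no VIP is configured: network address + 250."""
    network = ipaddress.ip_interface(interface_cidr).network
    shifted = network.network_address + 250
    _, b, c, d = shifted.packed
    return _mac_from_octets(b, c, d)


print(ccp_multicast_mac_with_vip("192.168.204.20"))
print(ccp_multicast_mac_without_vip("192.168.40.100/29"))
```

Note how the /29 case carries into the third octet (96 + 250 = 346), which is why the result is based on 192.168.41.90 rather than on an address inside the interface's own subnet.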

In VSX cluster:
Value Notes
FF:FF:FF:FF:FF:FF Refer to sk36644.
o In VSX NGX / VSX R65 / VSX R67 / VSX R68:
The only possible mode of CCP is Broadcast.
o In R75.40VS / R76 and above in VSX mode:
o CCP mode over Sync Network is Broadcast for all
Virtual Systems
o CCP mode over non-Sync Networks is Multicast
o In VSLS configuration:
When instances of VSs are not running on all cluster
members (e.g., only 2 VSs were configured on a VSX
cluster that has 4 cluster members), the Delta Sync
packets generated by a VS are sent in Unicast only to
those members that run an instance of the same VS.

o Source MAC address (Bytes 6 - 11)

Note: The same Source MAC address is used for all the VSs on the same member.

In ClusterXL (on Gaia R77.30 and above) and in VSX mode (R77.30 and above)
before installing the policy for the first time:
1st  2nd  3rd  4th  5th                                   6th
00   00   00   00   Value derived from Cluster_Global_ID  21
Notes:
 Cluster_Global_ID - controls the value of 5th byte in Source MAC address of
CCP packets.
Default values are:
o 0xFE hex / 254 dec - ClusterXL Gateway mode on Gaia OS R77.30
o 0xFE hex / 254 dec - ClusterXL VSX mode on Gaia OS R77.30
o Starting in Gaia R80.10, the value is assigned automatically

In ClusterXL (R77.20 and lower) and in VSX mode (R75.40VS / R76 / R77 / R77.10 /
R77.20) before installing the policy for the first time:
1st  2nd  3rd  4th  5th             6th
00   00   00   00   fwha_mac_magic  21
Notes:
 fwha_mac_magic - name of the kernel parameter that controls the value of 5th
byte in Source MAC address of CCP packets.
Default values are:
o 0xFE hex / 254 dec - ClusterXL Gateway mode on R77.20 and lower
o 0xFE hex / 254 dec - ClusterXL VSX mode on R75.40VS / R76 and above
o 0xF6 hex / 246 dec - VSX Cluster from VSX NGX up to VSX R68

Refer to these solutions:


 sk25977 (Connecting multiple clusters to the same network segment (same VLAN,
same switch))
 sk62432 (Source MAC Address of Cluster Control Protocol (CCP) frames in
ClusterXL before installing the policy for the first time)

In ClusterXL running Gaia R80.10:
1st  2nd  3rd  4th  5th                                   6th
00   00   00   00   Value is assigned automatically       ID_of_Source_Member

In ClusterXL running Gaia R77.30:
1st  2nd  3rd  4th  5th                                   6th
00   00   00   00   Value derived from Cluster_Global_ID  ID_of_Source_Member

In ClusterXL running Gaia R75.40-R77.20 / SecurePlatform / IPSO:
1st  2nd  3rd  4th  5th                                   6th
00   00   00   00   Value of fwha_mac_magic               ID_of_Source_Member

In VSX cluster running Gaia R80.10:
1st  2nd  3rd  4th       5th                                   6th
00   00   00   XXXXXXXX  Value is assigned automatically       ID_of_Source_Member

In VSX cluster running Gaia R77.30:
1st  2nd  3rd       4th  5th                                   6th
00   00   XXXXXXXX  00   Value derived from Cluster_Global_ID  ID_of_Source_Member

In VSX cluster running Gaia R75.40VS, R76, R77, R77.10, R77.20 /
SecurePlatform VSX NGX, VSX R65, VSX R67, VSX R68 / IPSO VSX R65:
1st  2nd  3rd  4th  5th                                   6th
00   00   00   00   Value of fwha_mac_magic               ID_of_Source_Member

Notes:
 Cluster_Global_ID - controls the value of 5th byte in Source MAC address of
CCP packets.
Default values are:
o 0xFE hex / 254 dec - ClusterXL Gateway mode on Gaia OS R77.30
o 0xFE hex / 254 dec - ClusterXL VSX mode on Gaia OS R77.30
o Starting in Gaia R80.10, the value is assigned automatically
 fwha_mac_magic - controls the value of 5th byte in Source MAC address of CCP
packets.
Default values are:
o 0xFE hex / 254 dec - ClusterXL Gateway mode on R77.20 and lower
o 0xFE hex / 254 dec - ClusterXL VSX mode on R75.40VS / R76 and above
o 0xF6 hex / 246 dec - VSX Cluster from VSX NGX up to VSX R68
o Starting in Gaia R80.10, the value is assigned automatically
 XXXXXXXX - is either 00000000, or 8 least significant (right-most) bits of VSID -
controlled by setting the kernel parameter fwha_add_vsid_to_ccp_mac=1

Refer to this solution:


 sk25977 (Connecting multiple clusters to the same network segment (same VLAN,
same switch))
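As an illustration of the non-VSX layout (first four bytes zero, 5th byte the MAC magic value, 6th byte the ID of the source member), the Source MAC address can be assembled as follows. This is a sketch only: the function name is hypothetical, and using a zero-based member ID in the 6th byte is an assumption based on the internal cluster numbering.

```python
def ccp_source_mac(fifth_byte, member_id):
    """Build a CCP Source MAC for the non-VSX layout 00:00:00:00:<5th>:<6th>.

    fifth_byte: value of fwha_mac_magic (or derived from Cluster_Global_ID)
    member_id:  ID of the source member (assumed zero-based)
    """
    for name, value in (("fifth_byte", fifth_byte), ("member_id", member_id)):
        if not 0 <= value <= 0xFF:
            raise ValueError("%s must fit into one byte" % name)
    return "00:00:00:00:{:02x}:{:02x}".format(fifth_byte, member_id)


print(ccp_source_mac(0xFE, 0))
print(ccp_source_mac(0xFE, 1))
```

Two clusters on the same segment that both use the default 5th byte would produce identical Source MAC addresses for members with the same ID, which is the conflict described in sk25977.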

 Layer 3

Address                 Value                              Notes
Source IP address       0.0.0.0                            The IP address of the CCP packet on the receiver
Destination IP address  broadcast address for this subnet  side is ignored and is not being checked.

Refer to this solution:


 sk104567 (Traffic passing through the VSX cluster is lost during a cluster failure
on Standby member)

 Layer 4 (UDP)

Port              Value  Notes
Source port       8116   It is strongly recommended not to pass any other
Destination port  8116   traffic on UDP port 8116 through ClusterXL

(6-6-B) CCP Header

Frame layout (16 bytes per row, columns 0 - 15):

Offset 0:   Layer 2 Destination MAC address | Layer 2 Source MAC address | Eth Type (0x0800) | IP Ver (4) | IP Hdr Len
Offset 16:  IP Total Length | IP datagram ID | IP Flags + Fragment Offset | TTL | IP Proto (11) | IP header checksum | Layer 3 Source IP address | Layer 3 Destin. IP addr
Offset 32:  Layer 3 Destin. IP addr (cont.) | Layer 4 Source Port | Layer 4 Destin. Port | Total Length | UDP checksum | Magic Number (0x1A90) | CCP Version | Cluster Number
Offset 48:  CCP OpCode | Source IF Number | Random ID | Source Machine ID | Destin. Machine ID | Policy ID | Filler | Total num. of CoreXL FW inst.
Offset 64:  CoreXL instance ID | VSX VSID | CCP Data

CCP offset = 42 bytes from frame start

Magic Number (Bytes 42 - 43) - Identifies the CCP protocol with the constant number 1A90
hex / 6800 dec.

CCP Version (Bytes 44 - 45) - An integer number that is assigned in each Check Point
version. All members of the same cluster must have an identical CCP version (i.e., identical
Check Point software).

When a cluster member receives CCP packets with a CCP Version lower than its own, it goes
into 'Ready' state (by design). Refer to sk42096 (Cluster member is stuck in 'Ready' state).

Notes:
 The CCP version can be obtained by checking the value of kernel parameter
'fwha_version':
[Expert@GW_HostName]# fw ctl get int fwha_version

 The CCP version on 32-bit system in Gateway mode equals the value of kernel
parameter fwha_version:
CCP 32-bit GW = fwha_version

 The CCP version on 64-bit system in Gateway mode is greater by 1 than CCP version
on 32-bit system in Gateway mode:
CCP 64-bit GW = CCP 32-bit GW + 1 = fwha_version + 1

 The CCP version on 32-bit system in VSX mode is greater by 2 than CCP version on
32-bit system in Gateway mode:
CCP 32-bit VSX = CCP 32-bit GW + 2 = fwha_version + 2

 The CCP version on 64-bit system in VSX mode is greater by 3 than CCP version on
32-bit system in Gateway mode:
CCP 64-bit VSX = CCP 32-bit GW + 3 = fwha_version + 3
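The four rules above reduce to adding 1 for a 64-bit system and 2 for VSX mode. A minimal Python sketch (the function name `ccp_version` and its parameters are hypothetical):

```python
def ccp_version(fwha_version, vsx=False, bits=32):
    """CCP version derived from the fwha_version kernel parameter.

    Offsets follow the notes: +1 for a 64-bit system, +2 for VSX mode,
    so 64-bit VSX ends up at fwha_version + 3.
    """
    if bits not in (32, 64):
        raise ValueError("bits must be 32 or 64")
    offset = (2 if vsx else 0) + (1 if bits == 64 else 0)
    return fwha_version + offset


# Assumes 'fw ctl get int fwha_version' returned 3120 on the member.
print([ccp_version(3120, vsx=v, bits=b) for v in (False, True) for b in (32, 64)])
```

The value 3120 is used here because the resulting four numbers can be compared with the R80.10 rows of the CCP version table.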

CCP Version vs. Check Point software:

Version Code (Dec)   Version Code (Hex)   Check Point software version
1 0x0001 4.1
2 0x0002 NG (FP0)
3 0x0003 NG FP1
6 0x0006 NG FP2
530 0x0212 NG FP3
534 0x0216 VSX NG AI R2
537 0x0219 VSX NGX EA
538 0x021A VSX NGX GA
540 0x021C NG AI R54 EA
541 0x021D NG AI R54 GA
550 0x0226 NG AI R55 (up to HFA_16)
551 0x0227 NG AI R55 HFA_17
552 0x0228 NG AI R55W
553 0x0229 NG AI R55 HFA_18
591 0x024F NG AI R55 LSV
593 0x0251 NGX R60 EA
601 0x0259 NGX R60 GA
NGX R60 HFA_01
602 0x025A NGX R60 HFA_02
646 0x0286 NGX R60 Multicast acceleration
650 0x028A NGX R60 with Anti-Virus
665 0x0299 NGX R61 EA2
667 0x029B NGX R61 GA
690 0x02B2 NGX R62 EA
691 0x02B3 NGX R62 GA
700 0x02BC Connectra NGX R61 EA

705 0x02C1 Connectra NGX R61 GA
710 0x02C6 Connectra NGX R66 GA
800 0x0320 NGX R65 EA
801 0x0321 NGX R65 GA
802 0x0322 NGX R65 HFA_01
803 0x0323 NGX R65 HFA_02
804 0x0324 NGX R65 HFA_02
Connectra NGX R66.1
805 0x0325 NGX R65 HFA_03
810 0x032A NGX R65 HFA_03 GA
811 0x032B NGX R65 HFA_40
813 0x032D NGX R65 HFA_50
814 0x032E NGX R65 HFA_50
815 0x032F NGX R65 HFA_60
816 0x0330 NGX R65 HFA_70
850 0x0352 VSX NGX Scalability Pack
900 0x0384 VSX NGX R65 GA
901 0x0385 VSX NGX R65 HFA_10
902 0x0386 VSX NGX R65 HFA_20
1000 0x03E8 NGX R65 with CoreXL LE
1010 0x03F2 VSX NGX R67 EA
1001 0x03E9 VSX NGX R67 GA
1100 0x044C VSX NGX R68 GA
1500 0x05DC R70 EA
1501 0x05DD R70 GA
1502 0x05DE R70.1 EA
R70.1 IPv6Pack HCC
1505 0x05E1 R70.1 GA
1506 0x05E2 R70.1 IPv6Pack
1508 0x05E4 R70.12
1516 0x05EC R70.20
1518 0x05EE R70.30
1520 0x05F0 R70.40
FireWall-1 GX 5.0 HCC
1523 0x05F3 R70.50
1505 0x05E1 R71 GA (also refer to sk50260)
1555 0x0613 R71.10
R71 VE
1557 0x0615 R71.20
1559 0x0617 R71.30
1561 0x0619 R71.40
1562 0x061A R71.45
1563 0x061B R71.50
2000 0x07D0 R75 GA
2005 0x07D5 R75.10
2010 0x07DA R75.20
2020 0x07E4 R75.30
2210 0x08A2 R75.40 32-bit
2211 0x08A3 R75.40 64-bit
2500 0x09C4 R75.40VS 32-bit

2501 0x09C5 R75.40VS 64-bit
2502 0x09C6 R75.40VS in VSX mode
2220 0x08AC R75.45 32-bit
2221 0x08AD R75.45 64-bit
2225 0x08B1 R75.46 32-bit
2226 0x08B2 R75.46 64-bit
2230 0x08B6 R75.47 32-bit
2231 0x08B7 R75.47 64-bit
2235 0x08BB R75.48 32-bit
2236 0x08BC R75.48 64-bit
2700 0x0A8C R76 32-bit
2701 0x0A8D R76 64-bit
2702 0x0A8E R76 in VSX mode
2900 0x0B54 R77 32-bit
2901 0x0B55 R77 64-bit
2902 0x0B56 R77 in VSX mode
2905 0x0B59 R77.10 32-bit
2906 0x0B5A R77.10 64-bit
2907 0x0B5B R77.10 in VSX mode
2910 0x0B5E R77.20 32-bit
2911 0x0B5F R77.20 64-bit
2912 0x0B60 R77.20 in VSX mode
2920 0x0B68 R77.30 32-bit
2921 0x0B69 R77.30 64-bit
2922 0x0B6A R77.30 in VSX mode
3120 0x0C30 R80.10 32-bit
3121 0x0C31 R80.10 64-bit
3122 0x0C32 R80.10 in VSX mode 32-bit
3123 0x0C33 R80.10 in VSX mode 64-bit

On 40000/60000 Security System:


Version Code (Dec)   Version Code (Hex)   Check Point software version
2000 0x07D0 R75.050 / R75.051 / R75.052 for 61000
2500 0x09C4 R75.40VS for 61000 32-bit
2501 0x09C5 R75.40VS for 61000 64-bit
2502 0x09C6 R75.40VS for 61000 in VSX mode
62700 0xF4EC R76SP for 41000/61000 32-bit
62701 0xF4ED R76SP for 41000/61000 64-bit
62702 0xF4EE R76SP for 41000/61000 in VSX mode
62710 0xF4F6 R76SP.10 for 41000/61000 32-bit
62711 0xF4F7 R76SP.10 for 41000/61000 64-bit
62712 0xF4F8 R76SP.10 for 41000/61000 in VSX mode
62700 0xF4EC R76SP.10_VSLS for 41000/61000 32-bit
62701 0xF4ED R76SP.10_VSLS for 41000/61000 64-bit
62702 0xF4EE R76SP.10_VSLS for 41000/61000 in VSX mode
62700 0xF4EC R76SP.20 for 41000/61000 32-bit
62701 0xF4ED R76SP.20 for 41000/61000 64-bit
62702 0xF4EE R76SP.20 for 41000/61000 in VSX mode
62700 0xF4EC R76SP.30 for 41000/61000 32-bit

62701 0xF4ED R76SP.30 for 41000/61000 64-bit
62702 0xF4EE R76SP.30 for 41000/61000 in VSX mode
62700 0xF4EC R76SP.40 for 41000/61000 32-bit
62701 0xF4ED R76SP.40 for 41000/61000 64-bit
62702 0xF4EE R76SP.40 for 41000/61000 in VSX mode
62700 0xF4EC R76SP.50 for 40000/60000 32-bit
62701 0xF4ED R76SP.50 for 40000/60000 64-bit
62702 0xF4EE R76SP.50 for 40000/60000 in VSX mode 32-bit
62703 0xF4EF R76SP.50 for 40000/60000 in VSX mode 64-bit

Cluster Number (Bytes 46 - 47) - This number identifies the cluster, on which this datagram
is communicated. The cluster number is set by Security Management Server.

CCP OpCode (Bytes 48 - 49) - This code identifies the type of CCP packet. Each CCP
OpCode implies a different structure of the packet’s Data section (see below).

Refer to this document (the structure of CCP Data has not changed):
NGX R60 Advanced Technical Reference Guide (ATRG) - Chapter 11 ClusterXL -
Debugging CPHA Issues - General Analysis Matrix for CPHA Packets

OpCode  Type                           Description
1       FWHAP_MY_STATE                 Report source machine's state
2       FWHAP_QUERY_STATE              Query other machine's state
3       FWHAP_IF_PROBE_REQ             Interface active check (probe) request
4       FWHAP_IF_PROBE_RPLY            Interface active check (probe) reply
5       FWHAP_IFCONF_REQ               Interface configuration request
6       FWHAP_IFCONF_RPLY              Interface configuration reply
7       FWHAP_LB_CONF                  Load Balancing (Load Sharing) configuration report
8       FWHAP_LB_CONF_CONFIRM          Load Balancing (Load Sharing) configuration report and a
                                       request for its confirmation (a reply to FWHAP_LB_CONF)
9       FWHAP_POLICY_CHANGE            Policy ID change request/notification
10      FWHAP_SYNC                     Delta Sync packets ("New" version)
11      FWHAP_CLUSTER_ADDITIONAL_INFO  Additional cluster info

On 40000/60000 Security System:

OpCode  Type                        Description
1       FWHAP_MY_STATE              Report source machine's state
2       FWHAP_QUERY_STATE           Query other machine's state
3       FWHAP_IF_PROBE_REQ          Interface active check (probe) request
4       FWHAP_IF_PROBE_RPLY         Interface active check (probe) reply
5       FWHAP_IFCONF_REQ            Interface configuration request
6       FWHAP_IFCONF_RPLY           Interface configuration reply
7       FWHAP_LB_CONF               Load Balancing (Load Sharing) configuration report
8       FWHAP_LB_CONF_CONFIRM       Load Balancing (Load Sharing) configuration report and a
                                    request for its confirmation (a reply to FWHAP_LB_CONF)
9       FWHAP_POLICY_CHANGE         Policy ID change request/notification
10      FWHAP_SYNC                  Delta Sync packets ("New" version)
11      FWHAP_CHASSIS_STATE         Chassis protocol
12      FWHAP_CHASSIS_FREEZE        Chassis freeze mechanism (freeze after failover)
13      FWHAP_SECURITY_GROUP        Security group advertising
14      FWHAP_CHASSIS_SYNC_LOST     Chassis sync lost mechanism (freeze when sync is lost)
15      FWHAP_CHASSIS_LINK_STATE    Chassis link state mechanism (freeze when sync is lost)
16      FWHAP_CHASSIS_GENERAL_INFO  Additional Chassis info
18      FWHAP_ARP                   ARP Forwarding via CCP (sk111392)
19      FWHAP_FULL_SYNC_INFO        Inform which member performs Full Sync and from which member
20      FWHAP_UPGRADE               Used during upgrade
23      FWHAP_CHASSIS_LINK_STATE    Chassis link state mechanism (freeze when sync is lost) -
                                    since R76SP.50

Source IF Number (Bytes 50 - 51) - The ID of the network interface that originated this CCP
packet.
These IDs are assigned by Check Point kernel during attachment to the interfaces.
Refer to the output of the 'fw ctl iflist' command on each cluster member (Note:
these outputs show the local configuration on the cluster member, and therefore do not have
to be identical on all cluster members).

Random ID (Bytes 52 - 53) - Each cluster member is assigned a random ID upon boot. This
field states the random ID of the machine that originated this CCP packet.

Source Machine ID (Bytes 54 - 55) - The ID of the machine that originated the packet based
on the internal cluster numbering (starts from zero). Each cluster member is given a number,
which identifies it within the cluster - refer to the output of 'cphaprob state' command.
These numbers are assigned based on the priority of cluster members as configured in
SmartDashboard - cluster object - 'ClusterXL Members' pane (the higher the member is
located in this list, the higher its priority and the lower its ID).

Destination Machine ID (Bytes 56 - 57) - The ID of the machine, for which this CCP packet
is intended based on the internal cluster numbering (starts from zero). Each cluster member
is given a number, which identifies it within the cluster - refer to the output of 'cphaprob
state' command.
These numbers are assigned based on the priority of cluster members as configured in
SmartDashboard - cluster object - 'ClusterXL Members' pane (the higher the member is
located in this list, the higher its priority and the lower its ID).

Policy ID (Bytes 58 - 59) - Each policy installed on cluster member is identified by a unique
ID. This enables different cluster members to verify they are working under the same policy.
Policy ID can be seen only during cluster debug (fw ctl debug -m cluster + conf).
Note: To handle a situation, where one member has already enforced the new policy ID
and sends Delta Sync packets to a member that has not yet done so, packets that
contain the previous policy ID are regarded as legal for a short period after the end of
the policy negotiations.

Filler (Bytes 60 - 61) - Originally, this field was used to align the CCP header, and it was
always set to 0.

As of NG FP3, this field is also used to indicate the status of the source machine in
Service Mode only. Possible values for this field are 1 for 'Active' and 0 for 'Down'.
Starting in NG FP4, the Filler has 2 fields in Service Mode:
 The first byte (nibble) contains the member status (as in NG FP3):
o If it contains 1, then in 'Sync only' mode, the member is ready to accept a Full
Sync from other cluster members. Otherwise, it can not act as a Full Sync server.
This can happen if Full Sync has failed, or if there is no policy yet.
 The second byte (nibble) contains the pnote status.
o If it contains 0, then all pnotes report their status as 'OK'.
o Otherwise, it will contain 1.

Note: The 'Filler' field is relevant only in cluster running on IPSO OS, in which a member
state is updated also by the statuses of pnotes. In other 3rd party solutions, the pnote
status is passed on the network, but is being disregarded by Check Point code.

Total num. of CoreXL FW inst. (Bytes 62 - 63) - Total number of loaded CoreXL FW
instances. This field exists since R70.
CoreXL instance ID (Bytes 64 - 65) - The ID of the CoreXL FW instance, to which this CCP
packet belongs (sent from/to). This field exists since R70.

VSX VSID (Bytes 66 - 67) - ID of the Virtual System, to which this CCP packet belongs (sent
from/to). In non-VSX, always contains 0.
This field exists in R75.40VS, R76, R77 and above.
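
The fixed-offset fields described above can be read from a captured frame with a few slices. The following is a minimal sketch, not Check Point code. It assumes an untagged Ethernet frame with a 20-byte IP header (so that the byte offsets in this section apply), and it assumes network (big-endian) byte order for the 2-byte fields, which this guide does not state. The offsets of Magic Number (Bytes 42 - 43) and CCP OpCode (Bytes 48 - 49) are read off the packet layout diagrams in this section. The names parse_ccp_header and CcpHeaderFields are hypothetical.

```python
from dataclasses import dataclass

CCP_MAGIC = 0x1A90  # Magic Number value shown in the packet layout diagrams


@dataclass
class CcpHeaderFields:
    opcode: int           # Bytes 48 - 49
    policy_id: int        # Bytes 58 - 59
    filler: int           # Bytes 60 - 61
    corexl_total: int     # Bytes 62 - 63 (exists since R70)
    corexl_instance: int  # Bytes 64 - 65 (exists since R70)
    vsid: int             # Bytes 66 - 67 (0 in non-VSX)


def _u16(frame: bytes, offset: int, byteorder: str) -> int:
    """Read a 2-byte unsigned field at the given frame offset."""
    return int.from_bytes(frame[offset:offset + 2], byteorder)


def parse_ccp_header(frame: bytes, byteorder: str = "big") -> CcpHeaderFields:
    """Extract CCP header fields; byteorder is an assumption, not a documented fact."""
    if len(frame) < 68:
        raise ValueError("frame too short: CCP Data starts at byte 68")
    if _u16(frame, 42, byteorder) != CCP_MAGIC:
        raise ValueError("magic number mismatch: not a CCP packet")
    return CcpHeaderFields(
        opcode=_u16(frame, 48, byteorder),
        policy_id=_u16(frame, 58, byteorder),
        filler=_u16(frame, 60, byteorder),
        corexl_total=_u16(frame, 62, byteorder),
        corexl_instance=_u16(frame, 64, byteorder),
        vsid=_u16(frame, 66, byteorder),
    )
```

If a capture shows implausible values (for example, a very large CoreXL instance count), try byteorder="little" before drawing conclusions.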

CCP Data (Bytes 68 and above) - Each CCP OpCode implies a different structure of the
packet’s Data section.

Let us review several types of CCP packets in detail:


 FWHAP_MY_STATE Data
 FWHAP_QUERY_STATE Data
 FWHAP_IF_PROBE_REQ Data
 FWHAP_IF_PROBE_RPLY Data
 FWHAP_IFCONF_REQ Data
 FWHAP_IFCONF_RPLY Data
 FWHAP_POLICY_CHANGE Data
 FWHAP_SYNC Data

 FWHAP_MY_STATE Data

OpCode  Type             Description
1       FWHAP_MY_STATE   Report source machine's state

The FWHAP_MY_STATE OpCode designates a packet containing a report on the state of
the machines in the cluster, as known to the source machine, as well as a report on the state
of the source machine. Since this packet may induce state changes on the receiving cluster
member, the receiving member will accept state changes only if the packet was received on
a secured interface.

Note: These packets are not sent in 3rd party clusters.

Packet layout (16 bytes per row; fields are listed left to right):

Offset  Fields
0       Layer 2 Destination MAC address | Layer 2 Source MAC address | Eth Type (0x0800) | IP Ver (4) | IP Hdr Len
16      IP Total Length | IP datagram ID | IP Flags + Fragment Offset | TTL | IP Proto (11) | IP header checksum | Layer 3 Source IP address | Layer 3 Destin. IP addr
32      Layer 3 Destin. IP addr (cont.) | Layer 4 Source Port | Layer 4 Destin. Port | Total Length | UDP checksum | Magic Number (0x1A90) | CCP Version | Cluster Number
48      CCP OpCode | Source IF Number | Random ID | Source Machine ID | Destin. Machine ID | Policy ID | Filler | Total num. of CoreXL FW inst.
64      CoreXL instance ID | VSX VSID | Number of Reported IDs | Report Code | HA Mode | Problem | State of ID 0 | State of ID 1 | ...
IFR     In IF | As. In IF | Out IF | As. Out IF | LPT of ID 0 | LPT of ID 1 | ...

The byte IFR (Interface Report) is the first byte after the last 'State of ID x' field. It is
calculated using the following formula:
IFR = 76 + <Number of Reported IDs>

Number of Reported IDs (Bytes 68 - 69) - Specifies the number of machines, for which the
state is reported.

Report Code (Bytes 70 - 71) - Flags indicating whether this packet contains a machine state
report, an interface state report, or both. The flags specified below can be combined together
using bitwise OR to form the field value:

Value  Flag Name                Description
0x1    FWHAP_RP_MACHINE_STATE   Report source machine's state
0x2    FWHAP_RP_IF_STATE        Report the state of source machine's interfaces

HA Mode (Bytes 72 - 73) - Contains the mode of the machine that sent this datagram.

Value  Mode Name              Description
0      FWHA_UNDEF_MODE        HA mode is undefined
1      FWHA_NOT_ACTIVE_MODE   HA is not active
2      FWHA_BALANCE_MODE      More than one machine is active
3      FWHA_PRIMARY_UP_MODE   Backup mode: active machine is the one
                              with the lowest ID alive
4      FWHA_ONE_UP_MODE       Backup mode: active machine remains
                              active until it dies

Problem (Bytes 74 - 75) - This field contains a Boolean value:


 1 - the machine that originated this packet has a problem
 0 - otherwise
Note: On NG FP2, this field held a time stamp, which was set to the number of tenths of
seconds elapsed since the machine’s last boot.

State of ID x (Bytes 76+x) - Reports the state of the machine whose internal ID is “x”.

Possible states are:


Value  State Name           Description
0      FWHA_FW_DEAD         Machine reports itself as dead
1      FWHA_FW_INIT         Machine is up and running, but is not
                            ready to receive packets yet
2      FWHA_FW_STANDBY      Machine is able to process packets, but
                            is currently set as a backup machine
3      FWHA_FW_READY        Machine is ready to process packets,
                            but is currently waiting for other
                            machines to confirm their states
4      FWHA_FW_ACTIVE       Machine is filtering packets
10     FWHA_FW_TOTAL_DEAD   Timeout occurred waiting for this
                            machine to report (more than 1 sec)
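
The fixed part of the FWHAP_MY_STATE data and the per-member state bytes can be decoded as shown in the sketch below. It is an illustration under stated assumptions, not the implementation used by ClusterXL: byte order of the 2-byte fields is assumed to be big-endian (this guide does not state it), the frame is assumed to be untagged with a 20-byte IP header, and the names parse_my_state, ReportCode, HaMode and STATE_NAMES are hypothetical. The values and offsets are the ones listed in the tables above.

```python
from enum import IntEnum, IntFlag


class ReportCode(IntFlag):
    FWHAP_RP_MACHINE_STATE = 0x1
    FWHAP_RP_IF_STATE = 0x2


class HaMode(IntEnum):
    FWHA_UNDEF_MODE = 0
    FWHA_NOT_ACTIVE_MODE = 1
    FWHA_BALANCE_MODE = 2
    FWHA_PRIMARY_UP_MODE = 3
    FWHA_ONE_UP_MODE = 4


STATE_NAMES = {
    0: "FWHA_FW_DEAD",
    1: "FWHA_FW_INIT",
    2: "FWHA_FW_STANDBY",
    3: "FWHA_FW_READY",
    4: "FWHA_FW_ACTIVE",
    10: "FWHA_FW_TOTAL_DEAD",
}


def parse_my_state(frame: bytes, byteorder: str = "big") -> dict:
    """Decode Bytes 68 - 75 and the 'State of ID x' bytes (76+x)."""
    def u16(offset: int) -> int:
        return int.from_bytes(frame[offset:offset + 2], byteorder)

    if len(frame) < 76:
        raise ValueError("frame too short for FWHAP_MY_STATE data")
    count = u16(68)  # Number of Reported IDs
    if len(frame) < 76 + count:
        raise ValueError("frame shorter than the reported number of IDs")
    states = {
        x: STATE_NAMES.get(frame[76 + x], f"unknown({frame[76 + x]})")
        for x in range(count)
    }
    return {
        "report_code": ReportCode(u16(70)),
        "ha_mode": HaMode(u16(72)),   # raises ValueError on an unlisted mode
        "problem": bool(u16(74)),
        "states": states,
    }
```

A combined Report Code of 0x3 decodes to both flags, which matches the "bitwise OR" rule described for that field.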

In IF (Byte IFR) - Number of interfaces currently up, in the inbound direction on the source
machine.

As. In IF (Byte IFR+1) - Number of interfaces currently assumed to be up, in the inbound
direction on the source machine.

Out IF (Byte IFR+2) - Number of interfaces currently up, in the outbound direction on the
source machine.

As. Out IF (Byte IFR+3) - Number of interfaces currently assumed to be up, in the outbound
direction on the source machine.

LPT of ID x (Byte IFR+4+x) - Reports the time, in HA time units (10 HA time units ~ 1
second), elapsed since the last CCP packet was received from machine with ID “x”.
Note: HA time units are mostly used by Check Point RnD.

 FWHAP_QUERY_STATE Data

OpCode  Type                Description
2       FWHAP_QUERY_STATE   Query other machine's state

These packets are used by a cluster member to ask another member for its status. This
is used when the source member has not received CCP packets from another member for
some time (0.2 seconds) and may want to query the other member to see if it is "alive".

This CCP packet does not have any CCP Data.

Note: These packets are not sent in 3rd party clusters.

 FWHAP_IF_PROBE_REQ Data

OpCode  Type                 Description
3       FWHAP_IF_PROBE_REQ   Interface active check request

Interface probing is a mechanism that allows a machine to verify that its interfaces
are up and are able to receive and transmit data.
These packets are used to verify the status of each interface, in order to detect
connectivity problems of the interfaces.
Refer to Clustering Definitions and Terms section.

Note: These packets are not sent in 3rd party clusters.

Packet layout (16 bytes per row; fields are listed left to right):

Offset  Fields
0       Layer 2 Destination MAC address | Layer 2 Source MAC address | Eth Type (0x0800) | IP Ver (4) | IP Hdr Len
16      IP Total Length | IP datagram ID | IP Flags + Fragment Offset | TTL | IP Proto (11) | IP header checksum | Layer 3 Source IP address | Layer 3 Destin. IP addr
32      Layer 3 Destin. IP addr (cont.) | Layer 4 Source Port | Layer 4 Destin. Port | Total Length | UDP checksum | Magic Number (0x1A90) | CCP Version | Cluster Number
48      CCP OpCode | Source IF Number | Random ID | Source Machine ID | Destin. Machine ID | Policy ID | Filler | Total num. of CoreXL FW inst.
64      CoreXL instance ID | VSX VSID | Interface Number

Interface Number (Bytes 68 - 69) - FireWall-1 serial interface number of the queried
interface. Refer to the output of 'fw ctl iflist' command.

 FWHAP_IF_PROBE_RPLY Data

OpCode  Type                  Description
4       FWHAP_IF_PROBE_RPLY   Interface active check reply

Interface probing is a mechanism that allows a machine to verify that its interfaces
are up and are able to receive and transmit data.
This packet is a reply to the FWHAP_IF_PROBE_REQ packet.
These packets are used to verify the status of each interface, in order to detect
connectivity problems of the interfaces.

Note: The transmit state of an interface (as monitored by the 'Interface Active Check' pnote)
is refreshed once a FWHAP_IF_PROBE_RPLY packet is received in acknowledgment of a
FWHAP_IF_PROBE_REQ packet.

Note: The 'FWHAP_IF_PROBE_RPLY' packet is always sent with Layer 2 Destination
MAC address of subnet Broadcast FF:FF:FF:FF:FF:FF. Refer to sk44410 (CCP packets
are sent in Broadcast although CCP mode is set to Multicast).

Note: These packets are not sent in 3rd party clusters.

Packet layout (16 bytes per row; fields are listed left to right):

Offset  Fields
0       Layer 2 Destination MAC address | Layer 2 Source MAC address | Eth Type (0x0800) | IP Ver (4) | IP Hdr Len
16      IP Total Length | IP datagram ID | IP Flags + Fragment Offset | TTL | IP Proto (11) | IP header checksum | Layer 3 Source IP address | Layer 3 Destin. IP addr
32      Layer 3 Destin. IP addr (cont.) | Layer 4 Source Port | Layer 4 Destin. Port | Total Length | UDP checksum | Magic Number (0x1A90) | CCP Version | Cluster Number
48      CCP OpCode | Source IF Number | Random ID | Source Machine ID | Destin. Machine ID | Policy ID | Filler | Total num. of CoreXL FW inst.
64      CoreXL instance ID | VSX VSID | Interface Number

Interface Number (Bytes 68 - 69) - FireWall-1 serial interface number of the queried
interface. Refer to the output of 'fw ctl iflist' command.

 FWHAP_IFCONF_REQ Data

OpCode Type Explanation


5 FWHAP_IFCONF_REQ Interface configuration request

These packets are used in order to learn the following information about peer cluster
members:
o Interfaces
o IP addresses
o MAC addresses

These packets are sent occasionally to verify the IP addresses are still the same.

ClusterXL uses these packets in order to discover cluster misconfiguration as follows:


o whether one machine considers an interface secured, while the other does not
o whether the IP addresses reported by the sending machine belong to a different
interface on the receiving machine (which may indicate a cable connectivity
problem).

This CCP packet does not have any CCP Data.

Note: The 'FWHAP_IFCONF_REQ' packet is always sent with Layer 2 Destination MAC
address of subnet Broadcast FF:FF:FF:FF:FF:FF. Refer to sk44410 (CCP packets are
sent in Broadcast although CCP mode is set to Multicast).

Note: These packets are sent in 3rd party clusters.

 FWHAP_IFCONF_RPLY Data

OpCode  Type                Explanation
6       FWHAP_IFCONF_RPLY   Interface configuration reply

Packets of this type, sent at fixed intervals, or as a reply to FWHAP_IFCONF_REQ
packets, contain a report on the interface originating the message.
These packets are also sent when the machine boots to "tell" all other cluster members
about the IP addresses and MAC addresses of the new machine.
These packets are sent on each interface, and contain an indication whether the sending
machine considers this interface secured.

ClusterXL uses these packets in order to discover cluster misconfiguration as follows:


o whether one machine considers an interface secured, while the other does not
o whether the IP addresses reported by the sending machine belong to a different
interface on the receiving machine (which may indicate a cable connectivity
problem).

Note: These packets are sent in 3rd party clusters.

Packet layout (16 bytes per row; fields are listed left to right):

Offset  Fields
0       Layer 2 Destination MAC address | Layer 2 Source MAC address | Eth Type (0x0800) | IP Ver (4) | IP Hdr Len
16      IP Total Length | IP datagram ID | IP Flags + Fragment Offset | TTL | IP Proto (11) | IP header checksum | Layer 3 Source IP address | Layer 3 Destin. IP addr
32      Layer 3 Destin. IP addr (cont.) | Layer 4 Source Port | Layer 4 Destin. Port | Total Length | UDP checksum | Magic Number (0x1A90) | CCP Version | Cluster Number
48      CCP OpCode | Source IF Number | Random ID | Source Machine ID | Destin. Machine ID | Policy ID | Filler | Total num. of CoreXL FW inst.
64      CoreXL instance ID | VSX VSID | Number of Reported IPs | Ethernet Address | Trusted interface ?
80      IP addr 1 | IP addr 2 | IP addr 3 | ...

Number of Reported IPs (Bytes 68 - 69) - Number of IP addresses associated with this
interface.

Ethernet Address (Bytes 70 - 75) - The real Ethernet address of the interface (as opposed
to the phony address, see “External Header”).

Trusted interface (Bytes 76 - 77) - Boolean value: 1 if this interface is trusted (secured), 0
otherwise.

IP addr X (Bytes 78+4x - 78+4x+3) - IP address number X associated with the reporting
interface (ClusterXL uses only the first configured IP address).
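
The FWHAP_IFCONF_RPLY data fields above can be extracted as in the following sketch. It is an illustration only. The assumptions are: big-endian byte order for the 2-byte fields, IP addresses stored in network byte order, the index x in 'IP addr X' starting at 0 (so that the first address is at Bytes 78 - 81), and an untagged frame with a 20-byte IP header. None of these is stated explicitly in this guide. The name parse_ifconf_reply is hypothetical.

```python
import ipaddress


def parse_ifconf_reply(frame: bytes, byteorder: str = "big") -> dict:
    """Decode the data section of a FWHAP_IFCONF_RPLY packet (Bytes 68 and above)."""
    if len(frame) < 78:
        raise ValueError("frame too short for FWHAP_IFCONF_RPLY data")
    count = int.from_bytes(frame[68:70], byteorder)         # Number of Reported IPs
    mac = ":".join(f"{b:02x}" for b in frame[70:76])         # real Ethernet Address
    trusted = int.from_bytes(frame[76:78], byteorder) == 1   # Trusted interface
    if len(frame) < 78 + 4 * count:
        raise ValueError("frame shorter than the reported number of IPs")
    ips = [
        str(ipaddress.IPv4Address(bytes(frame[78 + 4 * x:78 + 4 * x + 4])))
        for x in range(count)
    ]
    return {"mac": mac, "trusted": trusted, "ips": ips}
```

Since ClusterXL uses only the first configured IP address, the first element of the returned list is usually the one to compare between members.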

 FWHAP_POLICY_CHANGE Data

OpCode  Type                  Explanation
9       FWHAP_POLICY_CHANGE   Policy ID change request/notification

This message type is used to synchronize cluster members when configuration
parameters change. It ensures that all cluster members activate the changes
simultaneously, so as to avoid configuration conflicts, and to verify that the entire cluster
enforces the same policy at any given time.

Note: These packets are not sent in 3rd party clusters.

Packet layout (16 bytes per row; fields are listed left to right):

Offset  Fields
0       Layer 2 Destination MAC address | Layer 2 Source MAC address | Eth Type (0x0800) | IP Ver (4) | IP Hdr Len
16      IP Total Length | IP datagram ID | IP Flags + Fragment Offset | TTL | IP Proto (11) | IP header checksum | Layer 3 Source IP address | Layer 3 Destin. IP addr
32      Layer 3 Destin. IP addr (cont.) | Layer 4 Source Port | Layer 4 Destin. Port | Total Length | UDP checksum | Magic Number (0x1A90) | CCP Version | Cluster Number
48      CCP OpCode | Source IF Number | Random ID | Source Machine ID | Destin. Machine ID | Policy ID | Filler | Total num. of CoreXL FW inst.
64      CoreXL instance ID | VSX VSID | Policy Update State | New Policy ID

Policy Update State (Bytes 68 - 71) - The member that originated this packet notifies the
other members whether or not it needs to change its own configuration due to the new
policy. The message is also used to notify all cluster members that the originator is ready to
apply the changes.

Possible values are:


Value  Name                     Description
1      FWHA_POLICY_UPD_INIT     This member does not need to
                                update its configuration
2      FWHA_POLICY_UPD_NEED     This member needs to update its
                                configuration to conform with the new policy
3      FWHA_POLICY_UPD_READY    This member is ready to apply
                                the configuration changes
4      FWHA_POLICY_UPD_NEW      This member has just joined the cluster, and
                                has already applied the new policy
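
The two data fields of a FWHAP_POLICY_CHANGE packet, Policy Update State (Bytes 68 - 71) and New Policy ID (Bytes 72 - 75 in the packet layout), can be decoded as sketched below. This is a hedged illustration: big-endian byte order and an untagged frame with a 20-byte IP header are assumptions, and the names PolicyUpdateState and parse_policy_change are hypothetical.

```python
from enum import IntEnum


class PolicyUpdateState(IntEnum):
    FWHA_POLICY_UPD_INIT = 1    # no configuration update needed
    FWHA_POLICY_UPD_NEED = 2    # configuration update needed
    FWHA_POLICY_UPD_READY = 3   # ready to apply the changes
    FWHA_POLICY_UPD_NEW = 4     # just joined, new policy already applied


def parse_policy_change(frame: bytes, byteorder: str = "big"):
    """Return (PolicyUpdateState, new_policy_id) from a FWHAP_POLICY_CHANGE frame."""
    if len(frame) < 76:
        raise ValueError("frame too short for FWHAP_POLICY_CHANGE data")
    state = PolicyUpdateState(int.from_bytes(frame[68:72], byteorder))
    new_policy_id = int.from_bytes(frame[72:76], byteorder)
    return state, new_policy_id
```

When comparing captures from several members, the policy update completes only after all members report the same New Policy ID value.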

New Policy ID (Bytes 72 - 75) - Specifies the ID of the new policy, which the source member
is trying to enforce. All cluster members should agree on this value before the policy can be
updated.

This field contains the last two bytes of the MD4 hash of the Policy ID (the Policy ID is
generated by the Security Management Server based on the contents of compiled policy files -
<PolicyName>.ft, <PolicyName>.fc, <PolicyName>.set).

 FWHAP_SYNC Data

OpCode Type Explanation


10 FWHAP_SYNC New Delta Sync packets

This packet type defines a sub-protocol of CCP, used to maintain the State
Synchronization between cluster members. This is done by sending updates about the
FireWall kernel tables wrapped in the CCP packet data.

Refer to State Synchronization in ClusterXL section.

Note: These packets are sent in 3rd party clusters.

Packet layout (16 bytes per row; fields are listed left to right):

Offset  Fields
0       Layer 2 Destination MAC address | Layer 2 Source MAC address | Eth Type (0x0800) | IP Ver (4) | IP Hdr Len
16      IP Total Length | IP datagram ID | IP Flags + Fragment Offset | TTL | IP Proto (11) | IP header checksum | Layer 3 Source IP address | Layer 3 Destin. IP addr
32      Layer 3 Destin. IP addr (cont.) | Layer 4 Source Port | Layer 4 Destin. Port | Total Length | UDP checksum | Magic Number (0x1A90) | CCP Version | Cluster Number
48      CCP OpCode | Source IF Number | Random ID | Source Machine ID | Destin. Machine ID | Policy ID | Filler | Total num. of CoreXL FW inst.
64      CoreXL instance ID | VSX VSID | Sequence Number | Flags + Sync OP | Sync OP Specific Data

Sequence Number (Bytes 68 - 70) - Uniquely identifies the packet, in case a retransmission
is needed.

Sync OP (Byte 71, Lower Nibble) - Defines a Sync protocol OpCode:

OpCode value  OpCode name         Explanation
0             BC_MSG              Holds FireWall table data
                                  (may be fragmented)
1             BC_RETRANS_REQ      Request to retransmit
                                  missing data fragments
2             BC_RETRANS_REQ      Request an ACK message
                                  from peer members
4             BC_RETRANS_REJECT   Rejects a retransmission request

Flags (Byte 71, Upper Nibble) - A bit-wise combination of the following values:

Flag Value  Flag Name            Explanation
0x80        BC_ACK_FLAG          Indicates an acknowledgment is
                                 required for flushed data
0x10        BC_FRAGM_FLAG        Indicates this packet is a single
                                 fragment of a larger message
0x20        BC_LAST_FRAGM_FLAG   Indicates this is the last
                                 fragment in the message
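
Because Sync OP and Flags share Byte 71, decoding a FWHAP_SYNC packet requires splitting that byte into its lower and upper nibbles. The sketch below shows one way to do it. It is not Check Point code: the byte order of the 3-byte Sequence Number is assumed to be big-endian, the frame is assumed to be untagged with a 20-byte IP header, and the names SyncFlag, SYNC_OPS and parse_sync_prefix are hypothetical. Sync OP values are mapped to their explanations from the table above.

```python
from enum import IntFlag


class SyncFlag(IntFlag):
    BC_FRAGM_FLAG = 0x10
    BC_LAST_FRAGM_FLAG = 0x20
    BC_ACK_FLAG = 0x80


# Sync OP value -> explanation, as listed in the Sync OP table
SYNC_OPS = {
    0: "Holds FireWall table data",
    1: "Request to retransmit missing data fragments",
    2: "Request an ACK message from peer members",
    4: "Rejects a retransmission request",
}


def parse_sync_prefix(frame: bytes, byteorder: str = "big"):
    """Return (sequence_number, sync_op_value, sync_op_text, flags) for a FWHAP_SYNC frame."""
    if len(frame) < 72:
        raise ValueError("frame too short for FWHAP_SYNC data")
    sequence = int.from_bytes(frame[68:71], byteorder)  # Bytes 68 - 70
    combined = frame[71]
    sync_op = combined & 0x0F            # lower nibble
    flags = SyncFlag(combined & 0xF0)    # upper nibble, kept in place
    text = SYNC_OPS.get(sync_op, f"unknown({sync_op})")
    return sequence, sync_op, text, flags
```

The flag values in the table (0x10, 0x20, 0x80) are already positioned in the upper nibble, so the sketch masks with 0xF0 instead of shifting.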

7. ClusterXL Monitoring and Troubleshooting
Refer to ClusterXL Debugging section.

Refer to these solutions:


 sk56202 (How to troubleshoot failovers in ClusterXL)
 sk62570 (How to troubleshoot failovers in ClusterXL - Advanced Guide)
 sk81740 (How to configure a cluster to send Mail Alert upon fail-over)
 sk65923 (How to configure the cluster to send SNMP Trap upon fail-over)
 sk67560 (How to export History Report from SmartView Monitor)
 sk98348 (Best Practices - Security Gateway Performance)
 sk52421 (Ports used by Check Point software)

(7-1) SmartView Tracker

The simplest way to start cluster troubleshooting is to check the cluster logs
(the prerequisite for such logs is to set 'Track changes in the status of cluster
members' to 'Log' in SmartDashboard - cluster object - ClusterXL - Tracking).
Refer to Configuring cluster object in SmartDashboard section.

In SmartView Tracker:
1. Open the FireWall log that contains the data from the time of the cluster problem.
2. Go to the 'Date' column
A. Right-click on the 'Date' column header
B. Click on 'Edit Filter...'
C. Select the relevant date
D. Click on OK button
3. Go to the 'Time' column
A. Right-click on the 'Time' column header
B. Click on 'Edit Filter...'
C. Select the relevant time
D. Click on OK button
4. Go to the 'Information' column
A. Right-click on the 'Information' column header
B. Click on 'Edit Filter...'
C. Select 'Specific'
D. In 'Field' - select 'Contains'
E. In 'Text' - type cluster_info
F. Click on OK button
5. Analyze the cluster logs
6. Go to 'File' menu - click on 'Export...'

Refer to ClusterXL Administration Guide (R70, R70.1, R71, R75, R75.20, R75.40,
R75.40VS, R76, R77.X, R80.10) - Chapter 'Monitoring and Troubleshooting Gateway
Clusters' - Monitoring Cluster Status Using SmartConsole Clients - SmartView Tracker.

Refer to SmartView Tracker Administration Guide (R75.40, R75.40VS, R76, R77.X).

(7-2) SmartView Monitor

SmartView Monitor displays a snapshot of all ClusterXL cluster members, enabling real
time monitoring and alerting. For each cluster member, state change and critical device
problem notifications are displayed.

The SmartView Monitor GUI client communicates with the cluster member via the Check
Point Application Monitoring (AMON) Infrastructure.
The AMON client (SmartView Monitor GUI) sends a request for some specific OID
(SNMP Get) to the AMON server on the cluster member. The AMON server queries the
Check Point kernel (in the same way as the "cphaprob" commands) in order to retrieve the
requested information.
The information is then formatted into MIB (SNMP Response) and sent back to the
AMON client for display.

It is also possible to stop and start ClusterXL on the member:


1. On the left, go to Gateways Status view.
2. Select the relevant cluster member of a given cluster.
3. Right-click on the selected member.
4. Go to Cluster Member menu
5. Select the relevant operation - 'Stop Member' or 'Start Member'.

Notes:
 SmartView Monitor uses a separate Check Point infrastructure to control ClusterXL
(special internal command is sent from SmartView Monitor to Security Management
Server that manages this cluster, which sends another internal command to perform
the requested operation on ClusterXL).
 Complicated debug is required in order to see this communication (FWM and CPD
daemons on Management Server, and CPD daemon on cluster member).
 Cluster administrator should use command line on each cluster member to control
ClusterXL (cpstart/cpstop ; cphastart/cphastop).

Refer to ClusterXL Administration Guide (R70, R70.1, R71, R75, R75.20, R75.40,
R75.40VS, R76, R77.X, R80.10) - Chapter 'Monitoring and Troubleshooting Gateway
Clusters' - Monitoring Cluster Status Using SmartConsole Clients - SmartView Monitor.

Refer to SmartView Monitor Administration Guide (R70, R71, R75, R75.20, R75.40,
R75.40VS, R76, R77.X) - Chapter 'Monitoring Gateway Status' - Configuring Gateway Views
- Start/Stop Cluster Member.

Refer to these solutions:


 sk67560 (How to export History Report from SmartView Monitor)
 sk65923 (How to configure the cluster to send SNMP Trap upon fail-over)
 sk31961 (When viewing a ClusterXL Member via SmartView Monitor, VLAN
Interfaces not visible)
 sk88360 ('Error: 'ClusterXL' is not responding. Verify that 'ClusterXL' is installed on
the gateway' message in SmartView Monitor)
 sk53701 (ClusterXL works correctly in HA mode, but in LS mode a member is shown
as 'Disconnected' in SmartView Monitor, and policy installation intermittently fails on
that member with SIC error no. 148)

(7-3) Clock synchronization

Refer to Clock Synchronization section.

If clocks on cluster members are out of sync, then the SIC communication between the
members and the VPN will fail.

Refer to these solutions:


 sk92602 (How to troubleshoot NTP on Gaia OS)
 sk90365 (Enabling NTP causes OSPF adjacencies to disconnect)
 sk92984 (NTP client on Gaia fails to synchronize with Windows 2003)
 sk40322 (Is it recommended to use NTP with VRRP or IP Clustering?)
 sk39783 (NTP process fails after there is a VRRP state change)
 sk67740 (How to stop 'ntpdate[PID]: adjust time server' logs in /var/log/messages)
 sk32647 (Entries in /var/log/messages files have different timestamps when using
NTP Server - some entries are shown with local time, and some entries are shown
with correct UTC/GMT time)

(7-4) CCP mode

Refer to Cluster Control Protocol (CCP) section and to ClusterXL Requirements for
Hardware and Software section.

(7-5) SecureXL

Refer to Requirements for software section and to SecureXL section.

Refer to these solutions:


 sk32578 (SecureXL Mechanism)
 sk98722 (ATRG: SecureXL)
 sk71200 (SecureXL NAT Templates)
 sk67861 (Accelerated Drop Rules Feature in R75.40 and above)
 sk66402 (SecureXL Drop Templates are not supported in versions lower than R76)
 sk79620 (SecureXL 'sim affinity -s' settings do not survive reboot)
 sk61962 (SMP IRQ Affinity on Check Point Security Gateway)
 sk62441 (Problems with VPN and NAT when SecureXL is enabled)
 sk93308 (Security Gateway randomly reboots when IPS or SecureXL is enabled)
 sk82280 (Security Gateway with Route Based VPN configuration crashes when
SecureXL is enabled)
 sk90301 (SecureXL does not start on the Backup member of VRRP cluster after
reboot)
 sk79880 (Traffic is dropped 'by cphwd_offload_conn Reason: VPN and/or NAT traffic
between accelerated and non-accelerated interfaces or between non-accelerated
interfaces is not allowed')
 sk93348 (On R75.40VS in VSX mode, traffic does not pass from Virtual Router to
Virtual System when SecureXL is enabled)

 sk31404 (How to Debug SecureXL)
 sk35175 (Security Gateway does not apply 'keep_DF_flag' parameter when SecureXL
is enabled)
 sk90740 (Latency after upgrade from NGX R65 to R75.x when SecureXL is enabled)
 sk80940 (Multi-Queue hotfix for Security Gateway)

(7-6) CoreXL

Refer to Requirements for software section and to CoreXL section.

Refer to these solutions:


 sk61701 (CoreXL Known Limitations)
 sk98737 (ATRG: CoreXL)
 sk42096 (Cluster member is stuck in 'Ready' state)
 sk44488 (CoreXL is enabled, however not all available CPU cores are used)
 sk36750 ("License violation: The current machine has M CPU cores and the installed
license is valid for up to N CPU cores" error when installing license)
 sk61284 (CoreXL Affinity settings of daemons do not survive reboot)
 sk64301 (CoreXL interface affinity is not enforced, even if SecureXL is disabled)
 sk76800 (IP Pool NAT support in CoreXL)
 sk53060 (URI Resource and CoreXL)
 sk86401 (Connections with Hide NAT are dropped during policy installation due to
NAT port allocation failure when CoreXL is enabled)
 sk65463 ('Peak' number of connections - discrepancy between the output of 'fw tab -t
connections -s' command and the output of 'fw ctl pstat' command when CoreXL is
enabled)
 sk83300 (Packets are dropped on Trusted Interface MPLS when CoreXL is enabled)
 sk43443 (How to debug CoreXL)
 sk80940 (Multi-Queue hotfix for Security Gateway)

(7-7) VPN

Refer to VPN section.

Refer to these solutions:


 sk92332 (Customizing the VPN configuration for Check Point Security Gateway -
'vpn_table.def' file)
 sk108600 (VPN Site-to-Site with 3rd party)
 sk62441 (Problems with VPN and NAT when SecureXL is enabled)
 sk93204 (Troubleshooting "Clear text packet should be encrypted" error in ClusterXL)
 sk61902 (How to start VPND daemon under debug)
 skI4326 (Enabling IKE and VPN debugging)
 sk33327 (How to generate a valid ike debug, vpn debug and fw monitor)

 sk63560 (How to run complete VPN debug on Security Gateway to troubleshoot VPN
issues)
 sk40114 (What files and information are needed to troubleshoot VPN related issues)
 sk92465 (Slow Site-to-Site VPN affected by Virtual Defragmentation)
 sk34467 (Debugging Site to Site VPN)
 sk31114 (How to collect debug on Security Gateway during boot)

(7-8) NAT

Refer to NAT section.

Use the following command to work with the NAT table (fwx_alloc, ID 8187):
[Expert@Member]# fw tab -t fwx_alloc [flags]

For more information on the 'fw tab' command, refer to Command Line Interface
Reference Guide - Chapter 'Security Management Server and Firewall Commands' - fw - fw
tab.

Refer to these solutions:


 sk31832 (How to prevent ClusterXL / VRRP / IPSO IP Clustering from hiding its own
traffic behind Virtual IP address)
 sk34180 (Outgoing connections from cluster members are sent with cluster Virtual IP
address instead of member's Physical IP address)
 sk32224 (NAT Table 'fwx_alloc')
 sk36708 (NAT table reaches its maximum capacity on ClusterXL, which causes traffic
issues)
 sk35733 (NAT Table (fwx_alloc) is larger than Connections Table (connections) on
ClusterXL members with enabled State Synchronization)
 sk60343 (How To Troubleshoot NAT-related Issues)
 sk30197 (Configuring Proxy ARP for Manual NAT)
 sk62441 (Problems with VPN and NAT when SecureXL is enabled)
 sk78340 (NATed connections are dropped during ISP Link failover if CoreXL is
enabled)
 sk86401 (Connections with Hide NAT are dropped during policy installation due to
NAT port allocation failure when CoreXL is enabled)
 sk69480 ('NAT Hide failure - there are currently no available ports for hide operation'
log appears repeatedly in SmartView Tracker)

(7-9) VLAN

Refer to VLAN section and to CCP and VLAN interfaces section.

Refer to these solutions:


 sk92826 (ClusterXL VLAN monitoring)
 sk61323 (Monitoring of VLAN interfaces in ClusterXL)
 sk92784 (Configuring VLAN Monitoring on ClusterXL for specific VLAN interface)
 sk35462 (Abnormal behavior of cluster members during failover when 'Monitor all
VLAN' feature is enabled)
 sk95218 (Disconnected monitored VLAN can cause ClusterXL upgrade failure)

(7-10) Link Aggregation (Bonding)

Refer to Link Aggregation (Bonding) section.

Refer to ClusterXL Administration Guide (R65, R70, R70.1, R71, R75, R75.20, R75.40,
R75.40VS, R76, R77.X, R80.10) - Chapter 'ClusterXL Advanced Configuration' - Link
Aggregation and Clusters - Troubleshooting Bonded Interfaces.

Refer to these solutions:


 sk43730 (Failover in Bond interface can cause failover in ClusterXL)
 sk64009 (Bond interface configured on 10 Gb Intel interfaces - enslaved interfaces do
not have link after reboot)
 sk93341 (Output of 'cphaprob -a if' shows Bond interface as 'Down' in the context of
any Virtual System)
 sk71880 (After several reboots of VSX Gateway, state of Bond interface is 'Down')
 sk97779 (Critical minimal number of interfaces in a Bond in VRRP cluster running on
Gaia OS)

(7-10-A) Checking Bond status

1. Run the command 'cphaconf show_bond -a', and note the state of the bond
interfaces.

Note: On Linux-based OS, this information is taken from
/proc/net/bonding/BondName.

2. Run the command 'cphaconf show_bond BondName', and note which interfaces
are active inside the bond interface.

Note: On Linux-based OS, this information is taken from
/proc/net/bonding/BondName.

Possible statuses of slave interfaces:
 Down - (only Bond in Load Sharing mode) the physical link is down.
 Active - currently is handling traffic.
 Standby - (only Bond in High Availability mode) the interface is ready and can
support internal bond failover.
 Not Available - (only Bond in High Availability mode) the physical link is broken, or
the Cluster member is in status Down. The bond cannot failover internally in this
state.

3. Check the state of slave interfaces directly with the
'cat /proc/net/bonding/BondName' command.

Pay attention to the link status on physical slave interfaces and to the bond
parameters, and compare these to the configuration on the switch(es).
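
The check in step 3 can be scripted. The following minimal sketch parses the text of /proc/net/bonding/BondName as produced by the standard Linux bonding driver ("Bonding Mode:", "Currently Active Slave:", "Slave Interface:", "MII Status:" and "Link Failure Count:" lines). It is an illustration, not a Check Point tool, and the name parse_bonding_status is hypothetical. It assumes the bond-level "MII Status:" line appears before the first "Slave Interface:" line, as in the usual driver output.

```python
def parse_bonding_status(text: str) -> dict:
    """Summarize the contents of a /proc/net/bonding/<bond> file."""
    result = {"mode": None, "active_slave": None, "bond_mii_status": None, "slaves": {}}
    current = None  # name of the slave section being parsed, if any
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without "key: value"
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            result["mode"] = value
        elif key == "Currently Active Slave":
            result["active_slave"] = value
        elif key == "Slave Interface":
            current = value
            result["slaves"][current] = {}
        elif key == "MII Status":
            if current is None:
                result["bond_mii_status"] = value   # bond-level link status
            else:
                result["slaves"][current]["mii_status"] = value
        elif key == "Link Failure Count" and current is not None:
            result["slaves"][current]["link_failures"] = int(value)
    return result
```

A possible use is to read the file with open("/proc/net/bonding/bond0").read() (the bond name is an example) and to flag every slave whose mii_status is not "up" before comparing the bond parameters with the switch configuration.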

(7-10-B) Bond in High Availability mode

To see whether the bond can failover internally:


[Expert@Member]# cphaprob -a if

On the corresponding line for the bond interface, the words 'can failover' must
appear.

To test the internal bond failover:


[Expert@Member]# cphaconf failover_bond BondName

The standby and active interfaces should switch.

(7-10-C) Bond in Load Sharing mode

Important Note: Bond in Load Sharing requires Performance Pack (SecureXL) to be
enabled.

Check the configuration of critical minimum number of slave interfaces:


[Expert@Member]# cat $FWDIR/conf/cpha_bond_ls_config.conf

In order to improve the performance, SIM Affinity should be configured to run in Static
mode via 'sim affinity -s' command.

(7-11) Adding another member or interface to an existing cluster

Any change in the physical or software configuration of an existing cluster might cause a
failover.

Therefore, in order to avoid traffic outage and to have the ability to troubleshoot, all such
changes must be carried out during a maintenance window.

Refer to ClusterXL Administration Guide (R70, R70.1, R71, R75, R75.20, R75.40,
R75.40VS, R76, R77.X, R80.10) - Chapter 'ClusterXL Advanced Configuration' - Adding
Another Member to an Existing Cluster.

Refer to these solutions:


 sk57100 (Adding or removing an interface in ClusterXL High Availability topology
might cause fail-over)
 sk39047 (Output of 'cphaprob -a if' command shows 'Sync will not function since there
aren't any sync(secured) interfaces' after replacing one VRRP/Clustered pair with new
or upgraded hardware)

If some changes must be performed in "real time" (e.g., installation of a hotfix), then
follow these suggestions:

 In High Availability mode cluster:


o Perform the necessary changes on Standby member
o Make sure that the machine is up and running
o Make sure that the ClusterXL is running correctly on Standby member (refer to
'cphaprob' command section)
o Perform a manual failover from currently Active member to Standby member per
sk55081 (Best practice for manual fail-over in ClusterXL)
o Make sure that the new Active member processes the traffic correctly
o Perform the necessary changes on former Active member
 In Load Sharing Multicast mode cluster:
o Select one of the members
o Perform the necessary changes on the selected member
o Make sure that the machine is up and running
o Make sure that the ClusterXL is running correctly on selected member (refer to
'cphaprob' command section)
o Perform a manual failover from the "unchanged" member to the "changed"
member per sk55081 (Best practice for manual fail-over in ClusterXL)
o Make sure that the traffic passes correctly
o Perform the necessary changes on "unchanged" member
 In Load Sharing Unicast mode cluster:
o Perform the necessary changes on Non-Pivot member
o Make sure that the machine is up and running
o Make sure that the ClusterXL is running correctly on Non-Pivot member (refer to
'cphaprob' command section)
o Perform a manual failover from current Pivot member to Non-Pivot member per
sk55081 (Best practice for manual fail-over in ClusterXL)
o Make sure that the new Pivot member processes the traffic correctly
o Perform the necessary changes on former Pivot member

(7-12) ISP Redundancy

Refer to the ISP Redundancy section.

Refer to these solutions:


 sk42636 (Controlling connections configured with ISP Redundancy in Load Sharing
mode)
 sk66521 (ISP Redundancy in ClusterXL when interfaces of cluster members and
cluster VIP are defined on different subnets per sk32073)
 sk25152 (Static (Hide) NAT fails for outgoing connections through gateway with ISP
Redundancy in Load Sharing mode)
 sk60590 (ISP Redundancy is missing from the gateway or cluster object)
 sk61692 (Troubleshooting ISP Redundancy)
 sk65341 (ISP Redundancy probing is not working in ClusterXL)
 sk83900 (ISP Redundancy failover is not working in Gaia OS)
 sk31530 (ISP Redundancy Link Interface cannot be created)
 sk40958 (How to verify the status of ISP Redundancy links on command line)

(7-13) Dynamic Routing

Refer to Dynamic Routing section.

Refer to these solutions:


 sk62570 (How to troubleshoot failovers in ClusterXL - Advanced Guide)
 sk31243 (ClusterXL member is "Down" due to Critical device "FIB")
 sk43281 (FIBMGR packets dropped by fw_cluster_ttl_anti_spoofing Reason: ttl check
drop)
 sk43243 (How to debug FIBMGRD daemon)
 sk41393 (How to Troubleshoot OSPF Problems)
 sk40164 (What Information do I collect for OSPF issues?)
 sk33201 (Regarding ClusterXL and OSPF)
 sk36231 (OSPF equal multipath support in SecurePlatform Pro)
 sk82600 (Graceful restart for OSPF and BGP in Gaia does not work)
 sk32568 (How to increase OSPF adjacency membership on SecurePlatform Pro)
 sk84520 (How to debug OSPF and RouteD daemon on Gaia)
 sk60860 (How to debug OSPF and GateD daemon on SecurePlatform Pro)
 sk60861 (How to debug BGP and GateD daemon on SecurePlatform Pro)
 sk92598 (How to collect traces and debugs information for PIM and Multicast on
Gaia)
 sk85280 (Advanced Routing (OSPF, BGP, etc) configuration is not saved by 'save
configuration <file name>' command in Gaia CLISH shell)

(7-14) SNMP

Refer to SNMP section.

Refer to these solutions:


 sk59023 (Disable verbose SNMP logging - "snmpd[PID]: Received SNMP packet(s)
from UDP:")
 sk66648 (SecurePlatform does not send SNMP Traps)
 sk66581 (SecurePlatform sends SNMP Traps only to one sink server, although
several sink servers were configured; SNMP Traps are always sent with 'public'
community name)
 sk93644 (How to bind SNMPD on SecurePlatform OS to specific interface)
 sk80820 (LinkUp/LinkDown (linkUpLinkDown) Trap is not working on Gaia)
 sk72760 ('snmpwalk' always reports speed of Bond and Bridge interfaces as 10
Mbps)
 sk77260 ('snmpwalk' always reports speed of 10 Gb interfaces as 10 Mbps)
 sk90362 (SNMPD daemon fails to start on Gaia OS)
 sk89300 (SNMPD daemon crashes after interface IP address change on Gaia OS)
 sk61425 (Machine with Check Point software responds with 'No Such Object available
on this agent at this OID' to Check Point SNMP OID, but responds correctly to generic
SNMP OID)
 sk69625 (Gaia does not provide SNMP RAID Trap)
 sk66585 (/var/log/messages shows - snmpd[PID]: /etc/snmp/snmpd.conf: line N:
Warning: Unknown token)
 sk92937 (SNMPv3 with USM 'authentication' configuration does not survive reboot on
Gaia OS)
 sk93204 (Troubleshooting "Clear text packet should be encrypted" error in ClusterXL)
 sk38936 (How to debug dropped SNMP V1 & V2 packets)
 sk56783 (How to debug SNMPD daemon on SecurePlatform and Gaia)
 sk66586 (How to debug SNMPMONITOR on SecurePlatform and Gaia)
 sk66383 (How to debug CPSNMPAGENTX on SecurePlatform and Gaia)
 sk66384 (How to debug CPSNMPD on SecurePlatform and Gaia)

(7-15) Policy Installation

Policy installation on cluster triggers re-configuration of each cluster member. Part of this
re-configuration is negotiation of the state of each member.
The policy installation process is transparent for the traffic. Policy installation, in certain
cases, may cause a cluster member to initiate a failover.
Cluster administrator can control the installation of policy on cluster with the help of
several kernel parameters (each parameter is described below):
 fwha_freeze_state_machine_timeout
 fwha_policy_update_timeout_factor
 fwha_conf_immediate
 fwha_cul_policy_freeze_timeout_millisec
 fwha_cul_policy_freeze_event_timeout_millisec
Refer to sk92723 (Cluster flapping prevention).

How to work with kernel parameters:
 The current value of a kernel parameter can be checked with 'fw ctl get int
fwha_conf_immediate' command.
 The value of a kernel parameter can be set:
o either on-the-fly with 'fw ctl set int fwha_conf_immediate VALUE'
command (this change does not survive reboot)
o or by adding a parameter with desired value into the
$FWDIR/boot/modules/fwkern.conf file (per sk26202)
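
The two approaches above can be sketched as small shell helpers. The 'fw ctl get int' and 'fw ctl set int' commands and the fwkern.conf path come from the text. The helper names are hypothetical, and the 'name=value' line format (one parameter per line) is an assumption that should be checked against sk26202 before use.

```shell
# Hypothetical helpers; intended for a cluster member in Expert mode.

# Print the current value of a kernel parameter.
kparam_get() {
    fw ctl get int "$1"
}

# Set a kernel parameter on-the-fly (this change does not survive reboot).
kparam_set() {
    fw ctl set int "$1" "$2"
}

# Format the line to add to $FWDIR/boot/modules/fwkern.conf
# (assumed format: name=value).
kparam_conf_line() {
    printf '%s=%s\n' "$1" "$2"
}

# Example (commented out, because it changes a live member):
# kparam_get fwha_conf_immediate
# kparam_set fwha_conf_immediate 1
# kparam_conf_line fwha_conf_immediate 1 >> "$FWDIR/boot/modules/fwkern.conf"
```

Keeping the formatting step separate from the file write makes it possible to review the resulting line before the file is changed.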

Let us review each kernel parameter:

 'fwha_freeze_state_machine_timeout' kernel parameter

Explanation:
o This kernel parameter is related to what Check Point calls the "state machine",
which is responsible for determining the state of each machine in the cluster, i.e.,
whether the machine is Active/Standby/Down. When the state of a machine
changes, a failover takes place. In some cases, the state changes during policy
installation, and consequently an unwanted failover may occur.
o This parameter sets the number of seconds, during which the state of each cluster
member will be "frozen" starting from the moment the policy installation starts on
the member, and until the count-down reaches zero.

Values:
o This parameter sets a timeout value, and its units are seconds.
o In versions prior to R75.40VS, the default value is 0 seconds ("freeze" mechanism
is disabled).
o Starting in R75.40VS, the default value is set to 30 seconds.
o Upper limit is 2^32-1.

Notes:
o This kernel parameter is known as "freeze" mechanism.
o When the value of this kernel parameter is set to some value, and Cluster Member
priorities were changed, then during policy installation, the cluster configuration on
members will not be updated correctly even though output of 'cphaprob state'
command shows that the Member IDs and their state have changed. Refer to
sk66064 (Change of Cluster Member priority when the kernel parameter
'fwha_freeze_state_machine_timeout' is enabled may cause network outage).
o On VSX cluster members, the "freeze" mechanism applies only to cluster member
itself (Virtual System 0). It does not apply to any other Virtual Systems.
o In R75.40VS / R76 and above in VSX mode, Virtual System 0 will monitor/perform
state change lock even when other Virtual Systems get the policy.
o When the value of this kernel parameter is set to some value, the following
messages will appear in /var/log/messages file during policy installation:
;FW-1: fwha_state_freeze: FREEZING state machine at CURRENT_STATE
(time=HTU,caller=fwha_set_conf);
;FW-1: fwha_state_freeze: ENABLING state machine at CURRENT_STATE
(time=HTU,caller=policy change - finished changes (fwha_start));
The following messages in /var/log/messages file are normal during the boot of the
machine:
;FW-1: fwha_state_freeze: FREEZING state machine at FAILURE
(time=HTU,caller=fwha_set_conf);
;FW-1: fwha_state_freeze: ENABLING state machine at FAILURE
(time=HTU,caller=policy change - finished changes (fwha_start));
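
To check whether the "freeze" mechanism engaged during a policy installation, the messages shown above can be filtered out of the log. The following is a minimal sketch; the helper name is hypothetical, and it assumes that each message occupies a single line in /var/log/messages (the messages are wrapped in this document only for readability).

```shell
# Hypothetical helper: reads log lines on stdin and prints
# "<ACTION> <STATE>" for every fwha_state_freeze message,
# for example "FREEZING FAILURE" or "ENABLING FAILURE".
freeze_events() {
    sed -n 's/.*fwha_state_freeze: \([A-Z]*\) state machine at \([A-Za-z_]*\).*/\1 \2/p'
}

# Usage on a cluster member:
# freeze_events < /var/log/messages
```

A FREEZING line that is followed by an ENABLING line indicates that the state machine was released again after the change was completed.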

Refer to these solutions:


o sk32488 (When to use 'fwha_freeze_state_machine_timeout' parameter)
o sk25971 (Failover occurs in the cluster during Security Policy installation)
o sk66064 (Change of Cluster Member priority when the kernel parameter
'fwha_freeze_state_machine_timeout' is enabled may cause network outage)
o sk66881 (On OPSec cluster, output of 'cphaprob state' does not show the local
member, only peer members)

 'fwha_policy_update_timeout_factor' kernel parameter

Explanation:
o When policy is installed on a cluster, the cluster members undertake a negotiation
process to make sure all of them have received the same policy before they
actually apply it. This negotiation process has a timeout mechanism, which makes
sure a cluster member does not wait indefinitely for responses from other cluster
members, which is useful in cases when another cluster member goes down when
policy is being installed (for example).
o In configurations where policy installation takes a long time (usually because the
policy has a large number of rules), in clusters with more than two members, and
on slow machines, this timeout may expire prematurely.
o It is possible to tune the timeout by setting the desired value for this kernel
parameter.

Formula:
Policy change timeout for members to synchronize policy installation state before
proceeding:
[20 x (number of members in cluster) x
fwha_policy_update_timeout_factor]

Values:
o This kernel parameter is a multiplier, therefore it has no units.
o The default value is 1.
o Do not set this parameter to a value larger than 3.

Notes:
o The default value of 1 should be sufficient for most configurations.
o For configurations, where the situation described above occurs, setting this
parameter to 2 should be sufficient.
o Do not set this parameter to a value larger than 3.
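
The formula above can be worked through with a short calculation. This is a sketch only: the function name is hypothetical, and the result is in whatever unit the formula uses (the text does not state it).

```shell
# Hypothetical helper: policy change timeout per the formula
# 20 x (number of members in cluster) x fwha_policy_update_timeout_factor.
policy_update_timeout() {
    members=$1
    factor=${2:-1}    # the default value of the parameter is 1
    if [ "$factor" -gt 3 ]; then
        echo "factor must not be larger than 3" >&2
        return 1
    fi
    echo $(( 20 * members * factor ))
}

policy_update_timeout 2      # two members, default factor: prints 40
policy_update_timeout 3 2    # three members, factor set to 2: prints 120
```

The calculation shows that a larger cluster already receives a proportionally longer timeout at the default factor, which is why raising the factor to 2 is usually sufficient.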

Refer to these solutions:


o sk63163 (Failover does not occur in ClusterXL HA Primary Up mode after
changing cluster member priorities and installing the policy)
o sk92723 (Cluster flapping prevention)

 'fwha_conf_immediate' kernel parameter

Policy can be updated without negotiation in the following cases:


o ClusterXL is not running on this machine
o ClusterXL should be stopped on this machine
o The only action to take is to add or remove unused interfaces
o The policy is or was missing (Policy ID = 0)
o Policy is installed on a single cluster member

Explanation:
o When the value is set to 0, the cluster member will not change its state to the next
required state until it negotiates with other cluster members.
o When the value is set to 1, the cluster members skip policy installation negotiation
and install new cluster configuration immediately.

Values:
o This kernel parameter is an on-off switch, therefore it has no units.
o Accepted values are 0 and 1.
o The default value is 0.

Refer to these solutions:


o sk63163 (Failover does not occur in ClusterXL HA Primary Up mode after
changing cluster member priorities and installing the policy)
o sk92723 (Cluster flapping prevention)

Refer to Installation and Upgrade Guide (R60, R61, R62, R65, R70, R71, R75,
R75.20, R75.40).

 'fwha_cul_policy_freeze_timeout_millisec' kernel parameter

Explanation:
This parameter controls the time during which a member waits for an event
(e.g., a pnote problem, or CCP packets from the Active member are not received) to occur
during policy installation (starting from the local policy installation).

Values:
o This parameter sets a timeout value, and its units are milliseconds.
o The default value is 0.
o Recommended value is 30000.
o Upper limit is 2^32-1.

Refer to this solution:


o sk92723 (Cluster flapping prevention)

 'fwha_cul_policy_freeze_event_timeout_millisec' kernel parameter

Explanation:
This parameter controls the time during which a member freezes its state upon an
event (e.g., a pnote problem, or CCP packets from the Active member are not received)
during policy installation.

Values:
o This parameter sets a timeout value, and its units are milliseconds.
o The default value is 0.
o Recommended value is 15000.
o Upper limit is 2^32-1.

Notes:
o If the Active member fails during policy installation, a network outage might occur.
Its maximal duration depends on the value assigned to this kernel parameter.

Refer to this solution:


o sk92723 (Cluster flapping prevention)

(7-16) Full Sync

Refer to State Synchronization in ClusterXL section.

Refer to these solutions:


 sk37029 (Forcing Full Synchronization in ClusterXL)
 sk37030 (Debugging Full Synchronization in ClusterXL)
 sk65103 (After reboot, state of cluster member is 'Down', and state of
'Synchronization' device is 'problem')
 sk101695 (Cluster member is Down after reboot / policy installation / running 'cpstart')

Cluster member may fail to start correctly while the cluster is under severe load.

If a reboot (or 'cpstop' followed by 'cpstart') is performed on a cluster member while
the cluster is under severe load, the member may fail to start correctly.

The starting member will attempt to perform a Full Sync with the existing active
member(s) and may in the process use up all its resources and available memory.

This can lead to unexpected behaviour.

Procedure:
To overcome this problem, define the maximum amount of memory that the member may
use when starting up for synchronizing its connections with the active member. By default,
this amount is not limited. Estimate the amount of memory required as follows:

Memory required (MB) for Full Sync:

Number of open       New connections per second
connections          100        1000       5000       10 000
1000                 1.1 MB     6.9 MB     -          -
10 000               11 MB      69 MB      329 MB     -
20 000               21 MB      138 MB     657 MB     1305 MB
50 000               53 MB      345 MB     1642 MB    3264 MB

Note: These figures were derived for cluster members using the Windows platform,
with Pentium 4 processors running at 2.4 GHz.

Example:
If the cluster holds 10 000 connections, and the connection rate is 1000 connections per
second, then Full Sync requires approximately 69 MB of memory.

Instructions:
Define the maximal limit for memory allocation to Full Sync by setting the value of the
global kernel parameter fw_sync_max_saved_buf_mem to the required number of
megabytes. Refer to sk26202 (Changing the kernel global parameters for Check Point
Security Gateway).
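
The estimate and the resulting setting can be sketched as a lookup of the table above. The function name is hypothetical. The placement of the values in the shorter table rows is inferred from the example (10 000 connections at 1000 connections per second give 69 MB). The script only prints the 'fw ctl set int' command and does not run it.

```shell
# Hypothetical lookup of the table above:
# open connections and new connections per second -> memory in MB.
full_sync_mem_mb() {
    case "$1:$2" in
        1000:100)    echo 1.1  ;;
        1000:1000)   echo 6.9  ;;
        10000:100)   echo 11   ;;
        10000:1000)  echo 69   ;;
        10000:5000)  echo 329  ;;
        20000:100)   echo 21   ;;
        20000:1000)  echo 138  ;;
        20000:5000)  echo 657  ;;
        20000:10000) echo 1305 ;;
        50000:100)   echo 53   ;;
        50000:1000)  echo 345  ;;
        50000:5000)  echo 1642 ;;
        50000:10000) echo 3264 ;;
        *) echo "no table entry for $1 connections at $2 per second" >&2
           return 1 ;;
    esac
}

# Print (not run) the command for the example in the text.
mb=$(full_sync_mem_mb 10000 1000)
echo "fw ctl set int fw_sync_max_saved_buf_mem $mb"
```

Fractional table values (1.1 and 6.9) would have to be rounded up to a whole number of megabytes before they are used as a parameter value.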

Impact:
If memory allocation reaches this limit during Full Sync, then further allocations are
forbidden, and relevant messages are printed into /var/log/messages file:
FW-1: fwlddist_save: WARNING: this member will not be fully synchronized !
FW-1: fwlddist_save: current delta sync memory during full sync has reached the
maximim of N MB
FW-1: fwlddist_save: it is possible to set a different limit by changing
fw_sync_max_saved_buf_mem value

(7-17) Delta Sync

Refer to State Synchronization in ClusterXL section.

Refer to these solutions:


 sk92909 (How to debug ClusterXL to understand why a connection is not
synchronized)
 sk41827 (Synchronization network in the cluster is flooded with Sync Retransmit
packets)

(7-17-A) Processing of Delta Sync packets during Full Sync

While Full Sync is in progress, the Delta Sync updates are not processed; they are saved
until Full Sync is complete.

Cluster member may fail to complete Full Sync while the cluster is under severe load.

It is possible that the rate of Delta Sync updates during the Full Sync process exceeds
the rate of the Full Sync packets. The FWD daemon on the Full Sync client member will not
be able to handle this number of Delta Sync packets because of the starvation of the user
space daemon, and Full Sync will never end.

Meanwhile, the Delta Sync packets are stored and occupy an ever-increasing amount of
memory in the kernel until memory allocation fails.

Procedure:
To overcome this problem, define the maximal limit for memory allocated to save the
Delta Sync packets during Full Sync. By default, this amount is not limited.

Instructions:
Define the maximal limit for memory allocated to save the Delta Sync packets during Full
Sync by setting the value of the global kernel parameter fw_sync_max_saved_buf_mem to
the required percentage of the memory allocated by Check Point kernel (controlled by the
kernel parameter fw_salloc_total_alloc) from the overall allowed memory (controlled by the
kernel parameter fw_salloc_total_alloc_limit).

Impact:
After a certain amount of Delta Sync packets is received, no more Delta Sync packets
are accepted, so additional sync updates received during Full Sync are discarded, and
relevant messages are printed into /var/log/messages file:
FW-1: fwlddist_save: WARNING: this member will not be fully synchronized !
FW-1: fwlddist_save: reached the memory threshold.
FW-1: fwlddist_save: Current = X MB, allowed = Y MB, threshold = N%

A consequence of this is that connections that were not transferred during full sync will
not survive failover.
After Full Sync is complete, the Delta Sync packets stored during the Full Sync phase are
applied by order of arrival.

(7-17-B) Delta Sync Buffer Threshold

Whenever an operation is performed on a kernel table, which is marked as "sync"-ed (in
$FWDIR/conf/table.def file on Security Management Server), the Delta Sync
mechanism duplicates this action into a buffer of its own.

Whenever this Delta Sync buffer becomes full, and also at every Sync timer interval, the
Delta Sync buffer is sent to all cluster members over the Synchronization Network. The
receiving member duplicates those actions into its kernel tables.

State Synchronization mechanism creates Delta Sync packets for incoming connections.
These Delta Sync packets are placed in the Sync Sending Queue.

This queue has a limited size, which might create a bottleneck: a member might not be
able to place new Delta Sync packets in the Sync Sending Queue if there is not enough
space in it, either because the number of incoming connections is too high, or because
the Delta Sync packets are not sent in a timely manner (due to CPU load, or problems on
the Sync interfaces / Sync network).

In order to deal with such a potential bottleneck, ClusterXL monitors the Sync Sending
Queue - if the number of Delta Sync packets in this queue reaches the threshold, then:
1. The 'FW-1: State synchronization is in risk. Please examine your
synchronization network to avoid further problems' warning is printed
into /var/log/messages file
2. Member starts blocking new incoming connections

This threshold is controlled via kernel parameter fw_sync_buffer_threshold, whose
value is the maximal percentage of the buffer that may be filled before new connections are
blocked:
 By default, this value is set to 80, with a buffer size of 512 sync words.
 By default, if more than 410 consecutive packets are sent without getting an
Acknowledgement on any one of them, new connections are blocked.
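
The figure of 410 packets follows from the two default values: 80% of 512 is 409.6. A minimal sketch of this arithmetic is shown below; the function name is hypothetical, and rounding up to a whole packet is an assumption that reproduces the number given in the text.

```shell
# Hypothetical helper: number of unacknowledged sync packets at which
# new connections are blocked, rounded up to a whole packet.
sync_block_threshold() {
    buffer_size=$1    # buffer size in sync words (default is 512)
    percent=$2        # value of fw_sync_buffer_threshold (default is 80)
    echo $(( (buffer_size * percent + 99) / 100 ))
}

sync_block_threshold 512 80    # prints 410
```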

Refer to these solutions:


 sk43896 (Blocking New Connections Under Load in ClusterXL)
 sk82080 (/var/log/messages are filled with 'kernel: FW-1:
fwldbcast_update_block_new_conns: sync in risk: did not receive ack for the last 410
packets')
 sk23695 ('FW-1: State synchronization is in risk. Please examine your
synchronization network to avoid further problems!' appears in /var/log/messages file)

(7-18) Traffic

Refer to these solutions:


 sk43896 (Blocking New Connections Under Load in ClusterXL)
 sk80520 (ClusterXL drops traffic with 'dropped by fwha_forw_run Reason: Failed to
send to another cluster member')
 sk106425 (Connections through cluster to physical IP address of ClusterXL Standby
member / VRRP Backup member are dropped by Anti-Spoofing)
 sk34668 (How to modify the assigned load between the members of ClusterXL in
Load Sharing Unicast mode)
 sk93204 (Troubleshooting "Clear text packet should be encrypted" error in ClusterXL)
 sk31832 (How to prevent ClusterXL / VRRP / IPSO IP Clustering from hiding its own
traffic behind Virtual IP address)
 sk34180 (Outgoing connections from cluster members are sent with cluster Virtual IP
address instead of member's Physical IP address)
 sk42384 (Outgoing connections from cluster members are sent with member's
Physical IP address instead of cluster Virtual IP address)
 sk37411 (Forwarding mechanism does not work properly on a machine with more
than 60 interfaces in a Nokia IP cluster)
 sk31821 (Traffic that is sent to Secondary IP addresses / Alias IP addresses that were
defined on interfaces of ClusterXL members is not processed)
 sk44084 (Kernel debug on ClusterXL Pivot member shows - FW-1:
fwha_pivot_forward_packet: can not forward since fwha_ether_addrs[dst=X][ifn=Y] is
NULL)
 sk30154 ($FWDIR/log/fwd.elg shows repeatedly - 'fwarp_initialize_myself: unable to
find mac address of interface IF_NAME')
 sk98348 (Best Practices - Security Gateway Performance)
 sk35175 (Security Gateway does not apply 'keep_DF_flag' parameter when SecureXL
is enabled)
 sk90740 (Latency after upgrade from NGX R65 to R75.x when SecureXL is enabled)
 sk44177 (SmartView Tracker repeatedly shows drops with "Source and destination
addresses are equal")
 sk25977 (Connecting multiple clusters to the same network segment (same VLAN,
same switch))
 sk26874 (Cannot simultaneously ping Virtual IP address of the cluster and IP
addresses of physical interfaces on cluster members from a remote host)
 sk43321 (Cluster Forwarding is enabled in 3rd party cluster after each policy
installation)

(7-19) Flapping

Refer to ClusterXL definitions and terms section and to Cluster Control Protocol (CCP)
section.

If CCP packets are not received/sent within the expected timeouts, then eventually either
the problematic interface(s), or the whole member will be declared as failed. This in turn (by
design) will lead to the change in state of either the problematic interface(s), or the whole
member to 'Down'.
Depending on the configuration and the nature of the issue, the state might randomly
change between 'Up'/'Active' and 'Down'. Such random change in state is called "flapping"
(of either an interface, or a member).
Flapping, in turn, might interrupt the production traffic that passes
through the cluster.

Cluster Under Load (CUL) mechanism (R75.40VS, R76, R77 and above) involves a
number of kernel parameters that allow cluster members to automatically monitor the CPU
utilization and prevent flapping according to the values of these kernel parameters - as
described in sk92723 (Cluster flapping prevention):
 fwha_cul_mechanism_enable
 fwha_cul_member_cpu_load_limit
 fwha_cul_member_long_timeout
 fwha_cul_cluster_short_timeout
 fwha_cul_cluster_log_delay_millisec
 fwha_cul_policy_freeze_timeout_millisec
 fwha_cul_policy_freeze_event_timeout_millisec
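
The current values of the parameters listed above can be reviewed in one pass. The following is a sketch under assumptions: the helper names are hypothetical, the 'fw ctl get int' command is the one described in the Policy Installation section, and it is assumed that all of these parameters exist on the installed version.

```shell
# Kernel parameters of the CUL mechanism, as listed above.
cul_params() {
    printf '%s\n' \
        fwha_cul_mechanism_enable \
        fwha_cul_member_cpu_load_limit \
        fwha_cul_member_long_timeout \
        fwha_cul_cluster_short_timeout \
        fwha_cul_cluster_log_delay_millisec \
        fwha_cul_policy_freeze_timeout_millisec \
        fwha_cul_policy_freeze_event_timeout_millisec
}

# Print the current value of each parameter (Expert mode).
show_cul_params() {
    for name in $(cul_params); do
        fw ctl get int "$name"
    done
}

# Usage on a cluster member:
# show_cul_params
```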

Refer to these solutions:


 sk43984 (Interface flapping when cluster interfaces are connected through several
switches)
 sk93454 (Increasing ClusterXL dead timeout)
 sk97827 (How to change ClusterXL Interface Monitoring Timeouts)
 sk62570 (How to troubleshoot failovers in ClusterXL - Advanced Guide)
 sk31934 (ClusterXL IGMP Membership)
 sk33221 (Using ClusterXL with IGMP Snooping-enabled switches)

 sk55081 (Best practice for manual fail-over in ClusterXL)
 sk92787 (How to debug ClusterXL failovers caused by RouteD daemon on Gaia OS)
 sk90283 (ClusterXL fails over for no apparent reason when IPS, DLP, Application
Control or Anti-Malware Blade is enabled)
 sk65103 (After reboot, state of cluster member is 'Down', and state of
'Synchronization' device is 'problem')
 sk101695 (Cluster member is Down after reboot / policy installation / running 'cpstart')
 sk25971 (Failover occurs in the cluster during Security Policy installation)
 sk41089 (How to troubleshoot and isolate the cause of VRRP transitions)
 sk65502 (Crossbeam cluster - after each reboot, the member is Down and Sync is
Off)
 sk44101 (HA process does not start when configuring more than 63 disconnected
interfaces)

(7-20) 'fw ctl pstat' command

Description:
Prints internal Security Gateway Statistics.
Output is divided into several sections.
Cluster-related section is called "Sync".
It is always located at the bottom of the output.

Syntax:
[Expert@Member]# fw ctl pstat

Refer to ClusterXL Administration Guide (R70, R70.1, R71, R75, R75.20, R75.40,
R75.40VS, R76, R77.X, R80.10) - Chapter 'Monitoring and Troubleshooting Gateway
Clusters' - Monitoring Synchronization (fw ctl pstat).

Refer to this solution:


 sk34476 (ClusterXL Sync Statistics - output of 'fw ctl pstat' command).

Example:

Sync:
Version: new
Status: Able to Send/Receive sync packets
Sync packets sent:
total : 466729198, retransmitted : 1305, retrans reqs : 89, acks : 809
Sync packets received:
total : 77283541, were queued : 6715, dropped by net : 6079
retrans reqs : 37462, received 175 acks
retrans reqs for illegal seq : 0
dropped updates as a result of sync overload: 0
Delta Sync memory usage: currently using XX KB mem
Callback statistics: handled 138 cb, average delay : 2, max delay : 34
Number of Pending packets currently held: 1
Packets released due to timeout: 18
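
A single counter can be extracted from this output and sampled over time, which is often more useful than its absolute value. The sketch below is an illustration only: the helper name is hypothetical, and it assumes that the label "dropped by net" appears in the output exactly as in the example above.

```shell
# Hypothetical helper: reads 'fw ctl pstat' output on stdin and prints
# the "dropped by net" counter of the Sync section.
sync_dropped_by_net() {
    sed -n 's/.*dropped by net *: *\([0-9][0-9]*\).*/\1/p'
}

# Usage on a cluster member: sample twice and compare.
# before=$(fw ctl pstat | sync_dropped_by_net)
# sleep 60
# after=$(fw ctl pstat | sync_dropped_by_net)
# echo "dropped by net grew by $(( after - before ))"
```

A counter that keeps growing between samples points to packets that are being lost on the Synchronization Network.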

Explanations:

Output section:
    Sync: off
Explanation:
    Delta Sync is disabled: either Full Sync failed, or Delta Sync was disabled by
    cluster administrator.

Output section:
    Sync:
    Live connections update: on
Explanation:
    'Active Mode' tab is opened in SmartView Tracker. Refer to sk30908.

Output section:
    Sync:
    Version: old
Explanation:
    Check Point FW-1 v4.1 and lower.

Output section:
    Sync:
    Version: new
Explanation:
    Check Point FW-1 NG and above.

Output section:
    Sync:
    Version: new
    Status: Able to Send/Receive sync packets
Explanation:
    Delta Sync works correctly.

Output section:
    Sync:
    Version: new
    Status:
    Able to send sync packets
    Unable to receive sync packets

    Sync:
    Version: new
    Status:
    Unable to send sync packets
    Unable to receive sync packets
Explanation:
    The problem is described in the output itself (requires cluster debugging).

Output section:
    Sync:
    Version: new
    Status:
    Able to send sync packets
    Saving incoming sync packets

    Sync:
    Version: new
    Status:
    Unable to send sync packets
    Saving incoming sync packets
Explanation:
    The problem is described in the output itself (requires cluster debugging).

Output section:
    Sync:
    Version: new
    Status:
    Able to send sync packets
    Able to receive sync packets

    Sync:
    Version: new
    Status:
    Unable to send sync packets
    Able to receive sync packets
Explanation:
    The problem is described in the output itself (requires cluster debugging).

Output section:
    Sync packets sent:
    total : 466729198, retransmitted : 1305, retrans reqs : 89, acks : 809
Explanation:
     TOTAL number of sync packets is non-zero and increasing.
     RETRANS REQS may increase under load.

Output section:
    Sync packets received:
    total : 77283541, were queued : 6715, dropped by net : 6079
    retrans reqs : 37462, received 175 acks
    retrans reqs for illegal seq : 0
    dropped updates as a result of sync overload: 0
Explanation:
     TOTAL number of sync packets is non-zero and increasing.
     QUEUED value never decreases - a non-zero value does not indicate a problem.
     DROPPED BY NET number may indicate network congestion - this counter is
      incremented when the cluster member receives a sync packet with a sequence
      number that is higher than the expected sequence number; meaning, packets with
      lower sequence numbers were lost somewhere along the way, and we should
      find out where.
     RETRANS REQS growing very fast may indicate that the load is becoming too high.
     RETRANS REQS FOR ILLEGAL SEQ may indicate a sync problem.
     DROPPED UPDATES AS A RESULT OF SYNC OVERLOAD - in a heavily loaded
      system, the cluster member may drop synchronization updates sent from peer
      cluster members.

Output section:
    Delta Sync memory usage: currently using XX KB mem
Explanation:
    This statistic only appears for a non-zero value. Delta Sync requires memory only
    while Full Sync is occurring. At other times, Delta Sync requires no memory.

Output section:
    Callback statistics: handled 138 cb, average delay : 2, max delay : 34
Explanation:
    This statistic only appears for a non-zero value.
    AVERAGE DELAY should be ~1-5 packets, otherwise indicates an overload of sync
    traffic.

Output section:
    Number of Pending packets currently held: 1
Explanation:
    This statistic only appears for a non-zero value.

Output section:
    Packets released due to timeout: 18
Explanation:
    This statistic only appears for a non-zero value. If the number is large (more than
    100 pending packets), and the "Number of Pending packets currently held" is small
    in the output of 'cphaprob syncstat' command, then you should take action to
    reduce the number of pending packets.
    To tackle this problem, see "Reducing the Number of Pending Packets" in ClusterXL
    Administration Guide.

(7-21) 'cphaprob' command

Refer to ClusterXL definitions and terms section.

Description:
Use the 'cphaprob' command to verify that the cluster and the cluster members are
working properly, and to define critical devices.

Syntax:
[Expert@Member]# cphaprob [flags]

Note: The commands below are listed in order of their importance / relevance.

 cphaprob state

Description:
Prints the summary with the following information:
o Cluster Mode
o Member ID of each known member
o Assigned traffic load for each known member
o State of each known member

Syntax:
[Expert@Member]# cphaprob state

Example:

[Expert@FW2-Member:0]# cphaprob state

Cluster Mode: High Availability (Active Up) with IGMP Membership

Number Unique Address Assigned Load State

1 10.10.10.31 0% Standby
2 (local) 10.10.10.32 100% Active

[Expert@FW2-Member:0]#
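
When the state of the local member has to be checked from a script (for example, before and after a manual failover), the output above can be parsed. This is a minimal sketch: the helper name is hypothetical, and it assumes the column layout shown in the example (number, "(local)" marker, address, assigned load, state).

```shell
# Hypothetical helper: reads 'cphaprob state' output on stdin and prints
# the State column of the local member (all fields after the load column).
local_member_state() {
    awk '$2 == "(local)" {
        state = $5
        for (i = 6; i <= NF; i++) state = state " " $i
        print state
    }'
}

# Usage on a cluster member:
# cphaprob state | local_member_state
```

The state is assembled from all remaining fields, so a state that consists of more than one word is printed in full.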

Output in 3rd party and OPSec clusters:


o The reported state is only the Check Point status, not the actual status of the machine.
o The command only monitors Full Sync success, and if a policy was successfully
installed. For IPSO IP Clustering, the state is accurate and also includes the status
of the IPSO Cluster.
o For IPSO VRRP, the status is accurate for a Firewall, but it does not correctly
reflect the status of the IPSO machine (e.g., it does not detect interface failure).
o For Gaia VRRP, the status is accurate for a Firewall, but it does not reflect the
status of the Gaia machine (e.g., it does not detect interface failure).

Refer to these solutions:
o sk61546 (The IP addresses of ClusterXL members in the output of 'cphaprob
state' command differ from the IP addresses of Sync interfaces)
o sk66881 (On OPSec cluster, output of 'cphaprob state' does not show the local
member, only peer members)
o sk93037 (Output of 'cphaprob state' command on Crossbeam chassis cluster
shows only local member)
o sk36247 (Cluster member detects only itself in 'cphaprob state' output, or detects
other member as "ClusterXL inactive or machine is down")
o sk30154 ($FWDIR/log/fwd.elg shows repeatedly - 'fwarp_initialize_myself: unable
to find mac address of interface IF_NAME')
o sk61331 (ClusterXL Load Sharing in Unicast (Pivot) mode - after second reboot of
Pivot member, output of 'cphaprob stat' on non-Pivot member shows wrong
assigned load as 0% for Pivot and 100% for non-Pivot)
o sk42096 (Cluster member is stuck in 'Ready' state)

• cphaprob [-l][-ia][-e] list

Description:
Prints the summary of Critical Devices (Pnotes) with the following information:
o Device name
o Device timeout (how frequently the periodic reports are expected)
o Device state
o Time since last periodic report

Syntax:
[Expert@Member]# cphaprob [-l][-ia][-e] list

Commands:

cphaprob list

In R77.30 and above:

• When there are no issues on the cluster member:

  There are no pnotes in problem state

  * Issue 'cphaprob -l list' to show full list of pnotes

• When a critical device reports a problem - prints only the critical device that
  reports its state as "problem"

  Example:

  Registered Devices:

  Device Name: routed
  Registration number: 2
  Timeout: none
  Current state: problem
  Time since last report: 2.8 sec

In R77.20 and lower:

Prints the list of some of the "Built-in Devices" and the "Registered Devices"

  Device Name: Interface Active Check
  Device Name: Recovery Delay
  Device Name: Synchronization
  Device Name: Filter
  Device Name: fwd
  Device Name: cphad
  Device Name: cvpnd
  Device Name: routed

cphaprob -l list

In R77.30 and above:

Prints the list of all the "Built-in Devices" and the "Registered Devices" - exactly as
"cphaprob -ia list" does in R77.20 and lower

  Device Name: Problem Notification
  Device Name: Interface Active Check
  Device Name: HA Initialization
  Device Name: Load Balancing Configuration
  Device Name: Recovery Delay
  Device Name: Synchronization
  Device Name: Filter
  Device Name: routed
  Device Name: cphad
  Device Name: fwd
  Device Name: cvpnd

In R77.20 and lower:

Command does not exist in these versions

cphaprob -i list

In R77.30 and above:

• When there are no issues on the cluster member:

  There are no pnotes in problem state

  * Issue 'cphaprob -l list' to show full list of pnotes

• When a critical device reports a problem - prints only the critical device that
  reports its state as "problem"

  Example:

  Registered Devices:

  Device Name: routed
  Registration number: 2
  Timeout: none
  Current state: problem
  Time since last report: 2.8 sec

In R77.20 and lower:

Prints the list of some of the "Built-in Devices" and the "Registered Devices"

  Device Name: Interface Active Check
  Device Name: Recovery Delay
  Device Name: HA Initialization
  Device Name: Synchronization
  Device Name: Filter
  Device Name: fwd
  Device Name: cphad
  Device Name: cvpnd
  Device Name: routed

cphaprob -ia list

In R77.30 and above:

• When there are no issues on the cluster member:

  There are no pnotes in problem state

  * Issue 'cphaprob -l list' to show full list of pnotes

• When a critical device reports a problem - prints the device "Problem
  Notification" and the critical device that reports its state as "problem"

  Example:

  Built-in Devices:

  Device Name: Problem Notification
  Current state: problem

  Registered Devices:

  Device Name: routed
  Registration number: 2
  Timeout: none
  Current state: problem
  Time since last report: 6.7 sec

In R77.20 and lower:

Prints the list of all the "Built-in Devices" and the "Registered Devices"

  Device Name: Problem Notification
  Device Name: Interface Active Check
  Device Name: HA Initialization
  Device Name: Load Balancing Configuration
  Device Name: Recovery Delay
  Device Name: Synchronization
  Device Name: Filter
  Device Name: routed
  Device Name: cphad
  Device Name: fwd
  Device Name: cvpnd

cphaprob -e list

In R77.30 and above:

• When there are no issues on the cluster member:

  There are no pnotes in problem state

  * Issue 'cphaprob -l list' to show full list of pnotes

• When a critical device reports a problem - prints only the critical device that
  reports its state as "problem"

  Example:

  Registered Devices:

  Device Name: routed
  Registration number: 2
  Timeout: none
  Current state: problem
  Time since last report: 2.8 sec

In R77.20 and lower:

Prints the list of "Registered Devices" only

  Device Name: Synchronization
  Device Name: Filter
  Device Name: fwd
  Device Name: cphad
  Device Name: cvpnd
  Device Name: routed

Example from R77.20:

[Expert@FW2-Member:0]# cphaprob -ia list

Built-in Devices:

Device Name: Problem Notification
Current state: OK

Device Name: Interface Active Check
Current state: OK

Device Name: HA Initialization
Current state: OK

Device Name: Load Balancing Configuration
Current state: OK

Device Name: Recovery Delay
Current state: OK

Registered Devices:

Device Name: Synchronization
Registration number: 0
Timeout: none
Current state: OK
Time since last report: 85011.8 sec

Device Name: Filter
Registration number: 1
Timeout: none
Current state: OK
Time since last report: 2302 sec

Device Name: fwd
Registration number: 2
Timeout: none
Current state: OK
Time since last report: 94170 sec

Device Name: cphad
Registration number: 3
Timeout: none
Current state: OK
Time since last report: 2302 sec

Device Name: routed
Registration number: 4
Timeout: none
Current state: OK
Time since last report: 875.5 sec

Device Name: ted
Registration number: 5
Timeout: 600 sec
Current state: OK
Time since last report: 0.7 sec

[Expert@FW2-Member:0]#
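
Because every Critical Device is printed as a "Device Name:" line followed by "key: value" lines, the list output is easy to scan programmatically for devices in the "problem" state. The sketch below is one possible way to do that in Python; parse_pnotes and in_problem are hypothetical helper names, and SAMPLE is assembled from the example outputs in this section rather than captured from a live member.

```python
# Sample text assembled from the examples in this section (not a live capture).
SAMPLE = """\
Built-in Devices:

Device Name: Problem Notification
Current state: problem

Registered Devices:

Device Name: routed
Registration number: 2
Timeout: none
Current state: problem
Time since last report: 6.7 sec

Device Name: ted
Registration number: 5
Timeout: 600 sec
Current state: OK
Time since last report: 0.7 sec
"""


def parse_pnotes(text):
    """Group 'key: value' lines into one dict per 'Device Name:' block."""
    devices = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        if key == "Device Name":
            current = {"Device Name": value}
            devices.append(current)
        elif current is not None and value:
            # Section titles such as "Registered Devices:" have no value
            # and are skipped by the 'value' check.
            current[key] = value
    return devices


def in_problem(devices):
    """Return the names of devices whose current state is 'problem'."""
    return [
        d["Device Name"]
        for d in devices
        if d.get("Current state", "").lower() == "problem"
    ]


if __name__ == "__main__":
    print(in_problem(parse_pnotes(SAMPLE)))
```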

Refer to these solutions:
o sk36372 (Output of 'cphaprob -ia list' on ClusterXL shows a Critical Device called
'HA Initialization')
o sk36373 (Output of 'cphaprob -ia list' on ClusterXL shows a Critical Device called
'Load Balancing Configuration')
o sk92353 (Output of 'cphaprob -ia list' on ClusterXL shows a Critical Device called
'Recovery Delay')

• cphaprob [-a] [-m] if

Description:
Prints the summary of cluster interfaces with the following information:
o Number of required cluster interfaces - including the Sync interfaces (the maximal
number of good cluster interfaces seen since the last reboot)
o Number of required secured (trusted) interfaces (the maximal number of good
sync interfaces seen since the last reboot)
o Names of monitored cluster interfaces (refer to CCP and VLAN interfaces)
o State of cluster interfaces (based on arrival/transmission of CCP packets)
o CCP mode on cluster interfaces
o Number of cluster Virtual IP addresses
o Virtual IP addresses
o Virtual MAC addresses (if VMAC mode is enabled per sk50840)
o VLAN monitoring scheme

Syntax:
[Expert@Member]# cphaprob [-a][-m] if

Flag Description
-a Prints Virtual IP addresses and
their corresponding interfaces.
-m Starting in R80.10, prints the
VLAN monitoring scheme

Example:

[Expert@FW2-Member:0]# cphaprob -a if

Required interfaces: 3
Required secured interfaces: 1

eth0       UP       non sync(non secured), multicast
eth1       UP       sync(secured), multicast
eth2       UP       non sync(non secured), multicast

Virtual cluster interfaces: 2

eth0       192.168.204.33       VMAC address: 00:1C:7F:00:00:1F
eth2       20.20.20.33          VMAC address: 00:1C:7F:00:00:1F

[Expert@FW2-Member:0]#
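
A quick sanity check on this output is to compare the number of interfaces reported as UP with the "Required interfaces" value, and to confirm that a secured (Sync) interface is present. The following Python sketch illustrates the idea under the assumption that the output follows the layout of the example above; parse_if and IFACE are illustrative names, and states other than "UP" are kept as free text because they are not enumerated here.

```python
import re

# Sample text modelled on the example above (column spacing is assumed).
SAMPLE = """\
Required interfaces: 3
Required secured interfaces: 1

eth0       UP       non sync(non secured), multicast
eth1       UP       sync(secured), multicast
eth2       UP       non sync(non secured), multicast

Virtual cluster interfaces: 2

eth0       192.168.204.33       VMAC address: 00:1C:7F:00:00:1F
eth2       20.20.20.33          VMAC address: 00:1C:7F:00:00:1F
"""

# Interface row: name, state, "sync(secured)" or "non sync(non secured)", CCP mode.
IFACE = re.compile(
    r"^(?P<name>\S+)\s+(?P<state>.+?)\s+"
    r"(?P<kind>(?:non )?sync\((?:non )?secured\)),\s*(?P<ccp>\w+)\s*$"
)


def parse_if(text):
    """Return (required_interfaces, interface_rows) from 'cphaprob -a if' text."""
    required = None
    ifaces = []
    for line in text.splitlines():
        if line.startswith("Required interfaces:"):
            required = int(line.split(":", 1)[1])
            continue
        match = IFACE.match(line)
        if match:
            ifaces.append({
                "name": match.group("name"),
                "state": match.group("state"),
                "secured": match.group("kind") == "sync(secured)",
                "ccp": match.group("ccp"),
            })
    return required, ifaces


if __name__ == "__main__":
    required, ifaces = parse_if(SAMPLE)
    up = [i["name"] for i in ifaces if i["state"] == "UP"]
    print(f"required={required} up={up}")
```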

If the "-m" flag is used, then one of the following lines is printed:
• No VLANs are monitored on the member
• Error reading monitoring mode
• Monitoring mode is "Monitor all VLANs": All VLANs are monitored
• Monitoring mode is "Monitor specific VLAN": Only specified VLANs are monitored

Output in 3rd party / OPSec clusters:

o Shows only the relevant information - the interface name, and whether it is a
trusted (secured) interface.
o "Multicast"/"Broadcast" refers to the CCP mode and is relevant only for the trusted
(secured) interface.
o The status of the interface is not printed, because it is not monitored.

• cphaprob mmagic

Description:
Starting in Gaia R80.10, prints the summary information about the MAC magic:
o Configuration mode
o Configuration phase
o Value of MAC magic
o Value of MAC forward magic

Syntax:
[Expert@Member]# cphaprob mmagic

Examples from non-VSX cluster:

[Expert@FW2-Member:0]# cphaprob mmagic

Configuration mode: Automatic (multi-ver)
Configuration phase: Stable

MAC magic: 1
MAC forward magic: 254

Used MAC magic values: None.

[Expert@FW2-Member:0]#

[Expert@FW2-Member:0]# cphaprob mmagic

Configuration mode: Manual
Configuration phase: Stable

MAC magic: 100
MAC forward magic: 254

Used MAC magic values: None.

[Expert@FW2-Member:0]#

• cphaprob [-reset] syncstat

Description:
Prints internal cluster statistics about the operation of the State Synchronization.
Can be used on ClusterXL and 3rd party / OPSec clusters.

Syntax:
[Expert@Member]# cphaprob [-reset] syncstat

Flag Description
-reset Resets the statistics in kernel that were
collected since boot, or since the last reset.

Refer to ClusterXL Administration Guide (R70, R70.1, R71, R75, R75.20, R75.40,
R75.40VS, R76, R77.X, R80.10) - Chapter 'Monitoring and Troubleshooting Gateway
Clusters' - Troubleshooting Synchronization.

Refer to these solutions:


o sk34475 (ClusterXL Sync Statistics - output of 'cphaprob syncstat' command)
o sk82080 (/var/log/messages are filled with 'kernel: FW-1:
fwldbcast_update_block_new_conns: sync in risk: did not receive ack for the last
410 packets')

Example:

Sync Statistics (IDs of F&A Peers - 1 2 3 4 5 6 7 ):

Other Member Updates:


Sent retransmission requests................... 165
Avg missing updates per request................ 1
Old or too-new arriving updates................ 5661
Unsynced missing updates....................... 0
Lost sync connection (num of events)........... 4354
Timed out sync connection ..................... 1

Local Updates:
Total generated updates ....................... 9180670
Recv Retransmission requests................... 1073
Recv Duplicate Retrans request................. 2564

Blocking Events................................ 0
Blocked packets................................ 0
Max length of sending queue.................... 4598
Avg length of sending queue.................... 0
Hold Pkts events............................... 1
Unhold Pkt events.............................. 1
Not held due to no members..................... 16
Max held duration (sync ticks)................. 0
Avg held duration (sync ticks)................. 11

Timers:
Sync tick (ms)................................. 100
CPHA tick (ms)................................. 100

Queues:
Sending queue size............................. 512
Receiving queue size........................... 256
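
The counters in this output can be checked automatically against the limits documented for them in the explanation table in this section. The Python sketch below is a hedged illustration, not an official tool: parse_syncstat and check_limits are hypothetical names, the thresholds are copied from the "Limits" column, and only limits that can be evaluated on the local member alone are checked. The "Sent retransmission requests" limit (which needs "Total generated updates" from the other members) and the sending-queue limits (which depend on whether "Block New Connections" is enabled) are intentionally left out.

```python
import re

# Counters copied from the example output above.
SAMPLE = """\
Sent retransmission requests................... 165
Avg missing updates per request................ 1
Old or too-new arriving updates................ 5661
Unsynced missing updates....................... 0
Lost sync connection (num of events)........... 4354
Timed out sync connection ..................... 1
Total generated updates ....................... 9180670
Recv Retransmission requests................... 1073
Recv Duplicate Retrans request................. 2564
Hold Pkts events............................... 1
Unhold Pkt events.............................. 1
Not held due to no members..................... 16
Max held duration (sync ticks)................. 0
Avg held duration (sync ticks)................. 11
"""

# "<counter name><dots> <integer>"
COUNTER = re.compile(r"^\s*(?P<name>.+?)\s*\.{2,}\s*(?P<value>\d+)\s*$")


def parse_syncstat(text):
    """Map each counter name to its integer value."""
    counters = {}
    for line in text.splitlines():
        match = COUNTER.match(line)
        if match:
            counters[match.group("name")] = int(match.group("value"))
    return counters


def check_limits(c):
    """Return the names of counters that exceed their documented limits."""
    total = c["Total generated updates"]

    def pct(name):
        # Percentage of "Total generated updates" on this member.
        return 100.0 * c[name] / total if total else 0.0

    findings = []
    if c["Avg missing updates per request"] > 20:
        findings.append("Avg missing updates per request")
    if pct("Old or too-new arriving updates") >= 10:
        findings.append("Old or too-new arriving updates")
    if pct("Unsynced missing updates") >= 1:
        findings.append("Unsynced missing updates")
    if c["Timed out sync connection"] > 0:
        findings.append("Timed out sync connection")
    if pct("Recv Retransmission requests") >= 30:
        findings.append("Recv Retransmission requests")
    if pct("Recv Duplicate Retrans request") >= 30:
        findings.append("Recv Duplicate Retrans request")
    if c["Hold Pkts events"] != c["Unhold Pkt events"]:
        findings.append("Hold Pkts events")
    if c["Not held due to no members"] > 0:
        findings.append("Not held due to no members")
    if c["Max held duration (sync ticks)"] >= 50:
        findings.append("Max held duration (sync ticks)")
    return findings


if __name__ == "__main__":
    print(check_limits(parse_syncstat(SAMPLE)))
```

For the example output, the two counters that are documented as "Should be 0" but have positive values are the ones reported.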

Output sections, their explanations, and their limits:

IDs of F&A Peers
Explanation: The F&A (Flush and Ack) peers are the cluster members that this
member recognizes as being part of the cluster. The IDs correspond to IDs and IP
addresses shown by the 'cphaprob state' command.

Other Member Updates:
Explanation: The statistics in this section relate to Delta Sync updates generated
by other cluster members, or to Delta Sync updates that were not received from the
other members. Updates inform about changes in the connections handled by the
cluster member, and are sent from and to members. Updates are identified by
sequence numbers.

Sent retransmission requests
Explanation: The number of retransmission requests, which were sent by this
member. Retransmission requests are sent when certain packets (with a specified
sequence number) are missing, while the sending member already received updates
with advanced sequences.
Limits: Has to be less than 30% of "Total generated updates" ON OTHER MEMBERS.

Avg missing updates per request
Explanation: Each retransmission request can contain up to 32 missing consecutive
sequences. The value of this field is the average number of requested sequences
per retransmission request.
Limits: More than 20 can imply connectivity problems.

Old or too-new arriving updates
Explanation: The number of arriving Delta Sync updates where the sequence number
is too low, which implies it belongs to an old transmission, or too high, to the
extent that it cannot belong to a new transmission.
Limits: Has to be less than 10% of "Total generated updates" ON THIS MEMBER.

Unsynced missing updates
Explanation: The number of missing Delta Sync updates, for which the receiving
member stopped waiting. It stops waiting when the difference in sequence numbers
between the newly arriving updates and the missing updates is larger than the
length of the "Receiving Queue".
Limits: Should be 0 - less than 1% of "Total generated updates" is acceptable.

Lost sync connection (num of events)
Explanation: The number of events, in which synchronization with another member
was lost and regained due to either Security Policy installation on the other
member, or a large difference between the expected and received sequence number.

Timed out sync connection
Explanation: The number of events, in which the member declares another member as
not connected. The member is considered as disconnected because no CCP packets
with ACK were received from that member for a period of time (1 second), even
though there are Flush and Ack packets being held for that member.
Limits: Should be 0 - positive value indicates connectivity problems.

Local Updates:
Explanation: The statistics in this section relate to Delta Sync updates generated
by the local cluster member. Updates inform about changes in the connections
handled by the cluster member, and are sent from and to members. Updates are
identified by sequence numbers.

Total generated updates
Explanation: The number of Delta Sync updates generated by the Sync mechanism
since the statistics were last reset (with 'cphaprob -reset syncstat' command).
Its value is the same as the difference between the sequence number at the time
the 'cphaprob -reset syncstat' command was run, and the current sequence number.
Limits: Can have any value.

Recv Retransmission requests
Explanation: The number of received retransmission requests. A member requests
retransmissions when it is missing specified packets with lower sequence numbers
than the ones already received.
Limits: Should be less than 30% of "Total generated updates" ON THIS MEMBER.

Recv Duplicate Retrans request
Explanation: The number of duplicated retransmission requests received by the
member. Duplicate requests were already handled, and so are dropped.
Limits: Should be less than 30% of "Total generated updates" ON THIS MEMBER.

Blocking Events
Explanation: Under extremely heavy load conditions, the cluster member may block
new connections (refer to sk43896). This counter shows the number of times that
the cluster member started blocking new connections due to Sync overload.
Limits: If "Block New Connections" mechanism is enabled (per sk43896), then
positive value indicates heavy load.

Blocked packets
Explanation: The number of packets that were blocked because the cluster member
was blocking all new connections (see 'Blocking Events' above). The number of
blocked packets is usually one packet per new connection attempt.
Limits: Higher than 5% of "Avg length of sending queue" can imply connectivity
problems.

Max length of sending queue
Explanation: The size of the Sending Queue is fixed and by default, it is 512 sync
words. This size is controlled via kernel parameter fw_sync_sending_queue_size.
Limits: If "Block New Connections" mechanism is enabled (per sk43896), then should
be less than "Sending queue size".

Avg length of sending queue
Explanation: The average value of the 'Max length of sending queue', since last
reboot or since the Sync statistics were reset.
Limits: If "Block New Connections" is enabled (per sk43896), then should be less
than 80% of "Sending Queue size".

Hold Pkts events
Explanation: The number of events, in which the Delta Sync update required Flush
and Ack, and so was kept within the system until an ACK arrived from all the other
functioning members.
Limits: Should be the same as "Unhold Pkt events".

Unhold Pkt events
Explanation: The number of events, when the member received all the required ACKs
from the other functioning members.
Limits: Should be the same as "Hold Pkt events".

Not held due to no members
Explanation: The number of packets, which should have been held within the system,
but were released because there were no other operating members.
Limits: Should be 0 - positive value indicates connectivity problem between the
members.

Max held duration (sync ticks)
Explanation: The maximal time in cluster ticks (1 tick equals 100ms), for which a
held packet was delayed in the system for Flush and Ack purposes.
Limits: Should be less than 50 - positive value indicates connectivity problem
between the members.

Avg held duration (sync ticks)
Explanation: The average duration in cluster ticks (1 tick equals 100ms), for
which the held packets were delayed within the system for Flush and Ack purposes.
Limits: Should be about the Round-Trip Time (RTT) of the Sync network. A larger
value indicates connectivity problem.

Timers:
Explanation: The values in this section relate to internal timers that control
Sync and cluster related actions.

Sync tick (ms)
Explanation: Timer interval for Delta Sync operations.
Limits: The value is controlled via kernel parameter fwha_timer_sync_res per
sk41471. Default value is 100 ms (minimal possible value).

CPHA tick (ms)
Explanation: Timer interval for cluster operations (excluding Delta Sync).
Limits: The value is controlled via kernel parameter fwha_timer_cpha_res per
sk43872. Default value is 100 ms (minimal possible value).

Queues:
Explanation: The values in this section relate to the sizes of Delta Sync Queues.

Sending queue size
Explanation: The Sending Queue on the cluster member stores locally generated
Delta Sync updates. Updates in the Sending Queue are replaced by more recent
updates. In a highly loaded cluster, updates are therefore kept for less time. If
a member is asked to retransmit an update, it can only do so if the update is
still in its Sending Queue. Each member has one sending queue.
Limits: The value is controlled via kernel parameter fw_sync_sending_queue_size
per sk82080. Default value is 512 sync words (minimal possible value).

Receiving queue size
Limits: The value is controlled via kernel parameter fw_sync_recv_queue_size per
sk82080. Default value is 256 sync words (minimal possible value).

• cphaprob -d <device> -t <timeout> -s <ok|init|problem> [-p] [-g] register

Description:
Registers a Critical Device (Pnote) with specified parameters.

Syntax:
[Expert@Member]# cphaprob -d <device> -t <timeout_in_sec> -s
<ok|init|problem> [-p] [-g] register

Flags Description
-d device Specifies the name of the Pnote (refer to
ClusterXL definitions and terms section).
-t timeout_in_sec Specifies how frequently the periodic reports are
expected.
If no periodic reports should be expected, then
enter 0 (zero).
-s <ok|init|problem> Specifies the initial state with which
the Pnote will be registered.
-p (Optional) Specifies that this Pnote
must be registered permanently (this
configuration will be saved in the
$FWDIR/conf/cphaprob.conf file).
-g (Optional) Specifies that this Pnote
must be registered globally (applies to
R75.40VS and above in VSX mode).

Refer to these solutions:


o sk43172 (Cluster performs fail-overs - detected a problem (cphad) / (fwd))
o sk32712 (FWD response time increase (cphaprob -d fwd -t seconds register -p)
does not survive reboot)
o sk92878 (User Space process monitoring mechanism in R76 ClusterXL)

• cphaprob -d <device> [-p] [-g] unregister

Description:
Unregisters the specified Critical Device (Pnote).

Syntax:
[Expert@Member]# cphaprob -d <device> [-p] [-g] unregister

Flags Description
-d device Specifies the name of the Pnote (refer to
ClusterXL definitions and terms section).
-p (Optional) Specifies that this Pnote
must be unregistered permanently (this
configuration will be removed from the
$FWDIR/conf/cphaprob.conf file).
-g (Optional) Specifies that this Pnote
must be unregistered globally (applies to
R75.40VS and above in VSX mode).

Refer to these solutions:
o sk43172 (Cluster performs fail-overs - detected a problem (cphad) / (fwd))
o sk32712 (FWD response time increase (cphaprob -d fwd -t seconds register -p)
does not survive reboot)
o sk92878 (User Space process monitoring mechanism in R76 ClusterXL)

• cphaprob -d <device> -s <ok|init|problem> [-g] report

Description:
Reports a specified state for Critical Device (Pnote).

Syntax:
[Expert@Member]# cphaprob -d <device> -s <ok|init|problem> [-g] report

Flags Description
-d device Specifies the name of the Pnote (refer to
ClusterXL definitions and terms section).
-s <ok|init|problem> Specifies the state, which
will be reported for the Pnote.
-g (Optional) Specifies that this Pnote
state must be reported globally
(applies to R75.40VS and above in VSX mode).
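
The register, unregister and report syntaxes described above are typically combined in a user-defined monitoring script: register a Pnote once, then report its state periodically. The Python sketch below only builds the argument lists according to the documented syntax and validates the state value; it does not run anything. The function names and the device name "my_probe" are hypothetical, chosen for illustration.

```python
VALID_STATES = ("ok", "init", "problem")


def _check_state(state):
    """Reject any state other than the three documented values."""
    if state not in VALID_STATES:
        raise ValueError(f"state must be one of {VALID_STATES}, got {state!r}")


def register_cmd(device, timeout_sec, state, permanent=False, global_=False):
    """cphaprob -d <device> -t <timeout_in_sec> -s <state> [-p] [-g] register"""
    _check_state(state)
    if timeout_sec < 0:
        raise ValueError("timeout must be 0 (no periodic reports) or positive")
    cmd = ["cphaprob", "-d", device, "-t", str(timeout_sec), "-s", state]
    if permanent:
        cmd.append("-p")
    if global_:
        cmd.append("-g")
    return cmd + ["register"]


def report_cmd(device, state, global_=False):
    """cphaprob -d <device> -s <state> [-g] report"""
    _check_state(state)
    cmd = ["cphaprob", "-d", device, "-s", state]
    if global_:
        cmd.append("-g")
    return cmd + ["report"]


def unregister_cmd(device, permanent=False, global_=False):
    """cphaprob -d <device> [-p] [-g] unregister"""
    cmd = ["cphaprob", "-d", device]
    if permanent:
        cmd.append("-p")
    if global_:
        cmd.append("-g")
    return cmd + ["unregister"]


if __name__ == "__main__":
    # "my_probe" is a made-up device name used only for this illustration.
    print(" ".join(register_cmd("my_probe", 0, "ok", permanent=True)))
    print(" ".join(report_cmd("my_probe", "problem")))
    print(" ".join(unregister_cmd("my_probe", permanent=True)))
```

On a cluster member, each list could be passed to a process-launching call in Expert mode. Reporting the "problem" state for a Critical Device can trigger a fail-over, so such a script should be tested in a lab first.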

Refer to these solutions:


o sk92868 (Cannot change the state of the pnote 'cphad' to 'problem' with 'cphaprob
-d cphad -s problem report' command)
o sk92878 (User Space process monitoring mechanism in R76 ClusterXL)
o sk43172 (Cluster performs fail-overs - detected a problem (cphad) / (fwd))

• cphaprob -f <file> [-g] register

Description:
Registers Critical Devices (Pnotes) with specified parameters from a file.

Syntax:
[Expert@Member]# cphaprob -f <file> [-g] register

Flags Description
-f file Specifies the file that contains the list of Pnotes
and their parameters.
For file syntax, refer to the
$FWDIR/conf/cphaprob.conf file.
-g (Optional) Specifies that this Pnote
must be registered globally (applies to
R75.40VS and above in VSX mode).

Refer to these solutions:
o sk43172 (Cluster performs fail-overs - detected a problem (cphad) / (fwd))
o sk32712 (FWD response time increase (cphaprob -d fwd -t seconds register -p)
does not survive reboot)
o sk92878 (User Space process monitoring mechanism in R76 ClusterXL)

• cphaprob -a [-g] unregister

Description:
Unregisters all Critical Devices (Pnotes).

Syntax:
[Expert@Member]# cphaprob -a [-g] unregister

Flags Description
-a Specifies that all Pnotes must be unregistered.
-g (Optional) Specifies that all Pnotes must be
unregistered globally (applies to
R75.40VS and above in VSX mode).

Refer to these solutions:


o sk43172 (Cluster performs fail-overs - detected a problem (cphad) / (fwd))
o sk32712 (FWD response time increase (cphaprob -d fwd -t seconds register -p)
does not survive reboot)
o sk92878 (User Space process monitoring mechanism in R76 ClusterXL)

• cphaprob igmp

Description:
Prints IGMP membership status.

Syntax:
[Expert@Member]# cphaprob igmp

Example:

[Expert@FW2-Member:0]# cphaprob igmp

IGMP Membership: Enabled
Supported Version: 2
Report Interval [sec]: 60

IGMP queries are replied only by Operating System

Interface   Host Group       Multicast Address   Last ver.   Last Query[sec]
---------------------------------------------------------------------------
eth0        224.168.204.33   01:00:5e:28:cc:21   N/A         N/A
eth1        224.10.10.250    01:00:5e:0a:0a:fa   N/A         N/A
eth2        224.20.20.33     01:00:5e:14:14:21   N/A         N/A

[Expert@FW2-Member:0]#
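
The "Multicast Address" column in the example above is consistent with the standard mapping of an IPv4 multicast group to an Ethernet MAC address: the fixed prefix 01:00:5e followed by the low 23 bits of the group address. The short Python sketch below reproduces that mapping so that the values in the output can be cross-checked; multicast_mac is an illustrative helper name, not a Check Point command.

```python
import ipaddress


def multicast_mac(group):
    """Map an IPv4 multicast group to its Ethernet multicast MAC address.

    The MAC is 01:00:5e followed by the low 23 bits of the IPv4 address,
    so the top bit of the second octet of the group address is dropped.
    """
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    low23 = int(addr) & 0x7FFFFF
    octets = (
        0x01, 0x00, 0x5E,
        (low23 >> 16) & 0xFF,
        (low23 >> 8) & 0xFF,
        low23 & 0xFF,
    )
    return ":".join(f"{octet:02x}" for octet in octets)


if __name__ == "__main__":
    for group in ("224.168.204.33", "224.10.10.250", "224.20.20.33"):
        print(group, multicast_mac(group))
```

Because only 23 bits are carried over, 168 (0xa8) in the second octet of 224.168.204.33 becomes 0x28 in the MAC address, which matches the eth0 row of the example.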

Refer to these solutions:
o sk31934 (ClusterXL IGMP Membership)
o sk33221 (Using ClusterXL with IGMP Snooping-enabled switches)
o sk22495 (Interface flapping (down/up) in a ClusterXL environment)

• cphaprob [-reset] ldstat

Description:
Prints the serialization statistics about the operations performed in kernel tables
based on Delta Sync - creating a new connection, updating an existing connection,
deleting an existing connection, etc.
Can be used on ClusterXL and 3rd party / OPSec clusters.

Syntax:
[Expert@Member]# cphaprob [-reset] ldstat

Flag Description
-reset Resets the statistics in kernel that were
collected since boot, or since the last reset.

Example:

[Expert@FW2-Member:0]# cphaprob ldstat

Operand                 Calls      Bytes   Average   Ratio %
-------------------------------------------------------
ERROR                       0          0         0         0
SET                      5287    1359896       257        27
RENAME                      0          0         0         0
REFRESH                 41105    2137460        52        42
DELETE                   5276     189792        35         3
SLINK                   10496     671744        64        13
UNLINK                      0          0         0         0
MODIFYFIELDS             8032     610432        76        12
RECORD DATA CONN            0          0         0         0
COMPLETE DATA CONN          0      52026         0         1

Total bytes sent: 4893244 (4 MB) in 52026 packets. Average 94

[Expert@FW2-Member:0]#
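
To see how the columns of this table relate to each other, the rows can be parsed and the derived columns recomputed. In the sample above, "Average" matches Bytes divided by Calls (rounded down), and "Ratio %" matches each row's Bytes as a rounded-down percentage of the sum of the Bytes column. This is an observation about this one sample, not documented behavior, and the helper names in the Python sketch below are illustrative.

```python
import re

# Rows copied from the example output above (column spacing is assumed).
SAMPLE = """\
Operand                 Calls      Bytes   Average   Ratio %
-------------------------------------------------------
ERROR                       0          0         0         0
SET                      5287    1359896       257        27
RENAME                      0          0         0         0
REFRESH                 41105    2137460        52        42
DELETE                   5276     189792        35         3
SLINK                   10496     671744        64        13
UNLINK                      0          0         0         0
MODIFYFIELDS             8032     610432        76        12
RECORD DATA CONN            0          0         0         0
COMPLETE DATA CONN          0      52026         0         1
"""

# Operand name (upper-case words), then four integer columns.
ROW = re.compile(
    r"^(?P<op>[A-Z][A-Z ]*?)\s+(?P<calls>\d+)\s+(?P<bytes>\d+)"
    r"\s+(?P<avg>\d+)\s+(?P<ratio>\d+)\s*$"
)


def parse_ldstat(text):
    """Return one dict per operand row."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            rows.append({
                "op": match.group("op"),
                "calls": int(match.group("calls")),
                "bytes": int(match.group("bytes")),
                "avg": int(match.group("avg")),
                "ratio": int(match.group("ratio")),
            })
    return rows


def recompute(rows):
    """Recompute (average, ratio) per operand using integer division."""
    total_bytes = sum(row["bytes"] for row in rows)
    result = {}
    for row in rows:
        avg = row["bytes"] // row["calls"] if row["calls"] else 0
        ratio = 100 * row["bytes"] // total_bytes if total_bytes else 0
        result[row["op"]] = (avg, ratio)
    return result


if __name__ == "__main__":
    rows = parse_ldstat(SAMPLE)
    calc = recompute(rows)
    for row in rows:
        print(row["op"], calc[row["op"]], (row["avg"], row["ratio"]))
```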

Refer to this solution:


o sk92909 (How to debug ClusterXL to understand why a connection is not
synchronized)

• cphaprob fcustat

Description:
Prints the Full Connectivity Upgrade (FCU) statistics on the member that is being
upgraded in Full Connectivity mode.
Note: FCU is not supported since R75 (refer to sk107042).

Syntax:
[Expert@Member]# cphaprob fcustat

Example:

[Expert@FW2-Member:0]# cphaprob fcustat

During FCU....................... yes
Number of connection modules..... 23
Connection module map (remote -->local)
0 --> 0 (Accounting)
1 --> 1 (Authentication)
2 --> 3 (NAT)
3 --> 4 (SeqVerifier)
4 --> 5 (SynDefender)
5 --> 6 (Tcpstreaming)
6 --> 7 (VPN)
Table id map (remote->local)..... (none or a specific list,
depending on configuration)
Table handlers ..................
78 --> 0xF98EFFD0 (sip_state)
8158 --> 0xF9872070 (connections)
Global handlers ................. none

[Expert@FW2-Member:0]#

Output sections and their explanations:

During FCU
Explanation: This should be "yes" only after running the 'fw fcu' command and
before running 'cphastop' on the final old member. In all other cases it should
be "no".

Number of connection modules
Explanation: Safe to ignore.

Connection module map
Explanation: The output reveals a translation map from the old member to the new
member. For additional information, refer to 'Full Connectivity Upgrade
Limitations' in the Installation and Upgrade Guide.

Table id map
Explanation: This shows the mapping between the gateway's kernel table indices on
the old member and on the new member. Having a translation is not mandatory.

Table handlers
Explanation: This should include a sip_state and connection table handlers. In a
Security Gateway configuration (in VSX, applies to R75.40VS and above), a VPN
handler should also be included.

Global handlers
Explanation: Reserved for future use.

• cphaprob tablestat

Description:
Prints the Cluster tables.

Syntax:
[Expert@Member]# cphaprob tablestat

Example:
[Expert@FW2-Member:0]# cphaprob tablestat

---- Unique IP's Table ----

Member   Interface   IP-Address
------------------------------------------
0        1           192.168.204.31
0        2           10.10.10.31
0        3           20.20.20.31

(Local)
1        1           192.168.204.32
1        2           10.10.10.32
1        3           20.20.20.32

------------------------------------------

(7-22) 'cphastart' and 'cphastop' commands

Refer to ClusterXL Administration Guide (R70, R70.1, R71, R75, R75.20, R75.40,
R75.40VS, R76, R77.X, R80.10) - Chapter 'Monitoring and Troubleshooting Gateway
Clusters' - ClusterXL Configuration Commands - The cphastart and cphastop Commands.

• cphastart

o Running cphastart on a cluster member activates ClusterXL on the member.


o It does not initiate Full Sync.
o The cpstart command is the recommended way to start a cluster member.
o To collect more information, run this command under debug (by default, output is
printed on the screen; output can be redirected into a file):
[Expert@Member]# cphastart -d > /var/log/cphastart.txt
Refer to the following lines in the output file:
prepare_command_args: -D ... start
/opt/CPsuite-RXX/fw1/bin/cphaconf clear-secured
/opt/CPsuite-RXX/fw1/bin/cphaconf -D ... start
Refer to cphaconf <relevant flags> start section.

Note: Starting in R77.20, refer to $FWDIR/log/cphastart.elg

• cphastop

o Running cphastop on a cluster member stops the cluster member from passing
traffic.
o State Synchronization also stops.
o It is still possible to open connections directly to the cluster member.
o In High Availability Legacy mode, running cphastop may cause the entire cluster
to stop functioning.

(7-23) 'cphaconf' command

Important Note: This command should NOT normally be used, since configuration is
controlled by the Management Server. Use it only if specifically instructed to by Check Point
Support. Exception: when working with Bond interfaces.

Refer to ClusterXL definitions and terms section.

Refer to ClusterXL Administration Guide (R70, R70.1, R71, R75, R75.20, R75.40,
R75.40VS, R76, R77.X, R80.10) - Chapter 'Monitoring and Troubleshooting Gateway
Clusters' - ClusterXL Configuration Commands - The cphaconf command.

Note: Starting in R77.20, refer to $FWDIR/log/cphaconf.elg

Note: The commands below are listed in order of their importance / relevance.

• cphaconf <relevant flags> start

Important Note:
Use the following table only to analyze the output of 'cphastart -d' command.

Refer to the following lines in the output of 'cphastart -d' command:

prepare_command_args: -D ... start


/opt/CPsuite-RXX/fw1/bin/cphaconf clear-secured
/opt/CPsuite-RXX/fw1/bin/cphaconf -D ... start

Description:
Loads cluster configuration with relevant options into kernel.

Flags                     Description
-D                        Prints debug information about the execution of
                          'cphaconf' command
-c <size>                 Sets cluster size (number of members in the cluster)
-i <ID>                   Sets member ID of the local machine
                          (count starts from 1)
-n <ID>                   Sets cluster ID
-p <policy_id>            Sets Policy ID explicitly
-m <1|service>            Sets cluster mode (use either the ordinal number,
-m <2|balance>            or the explicit name):
-m <3|primary-up>          service = Service Mode (HA is not active -
-m <4|active-up>            though sync, for example, may be on)
                           balance = Load Sharing (refer to '-M' flag)
                           primary-up = High Availability Primary Up
                           active-up = High Availability Active Up
-R a                      Sets the number of required interfaces:
-R <required_IF_num>       a = the number should be detected automatically
                           required_IF_num = sets the number explicitly
                            (the number of required trusted (secured, a.k.a.
                            Sync) interfaces will be set to zero, because in
                            manual mode we do not check trusted interfaces)
-t <secured_IF_1>         Sets/adds the trusted (secured, a.k.a. Sync)
   <secured_IF_2>         interfaces explicitly
   ...
-d <disconnected_IF_1>    Sets/adds the disconnected interfaces explicitly
   <disconnected_IF_2>
   ...
-A                        Enables auto mode in order to add all unused
                          interfaces (automatically)
-M <0|multicast>          Sets Load Sharing mode (use either the ordinal
-M <1|pivot>              number, or the explicit name)
                          Note: <0|multicast> is the default - it can be
                          omitted from the command
-l 0                      Sets tracking mode for changes in the status of
-l 1                      cluster members (failover):
-l 2                       0 = None
-l 3                       1 = Log
-l 4                       2 = Popup Alert
-l 5                       3 = Mail Alert
-l 6                       4 = SNMP Trap Alert
-l 7                       5 = User Defined Alert no. 1
                           6 = User Defined Alert no. 2
                           7 = User Defined Alert no. 3
-S 0                      Enables/disables Sticky Decision Function:
-S 1                       0 = do not use Sticky Decision Function
                           1 = use Sticky Decision Function
                            (must be set with '-f' flag)
-f 0                      Sets Sticky Decision Function method:
-f 1                       0 = IPs, Ports, and SPIs (default)
-f 2                       1 = IPs and Ports
                           2 = IPs only
-o                        Sets High Availability in Legacy mode
-x                        Sets High Availability in New mode (multicast)
-z 0                      0 = Disables VMAC mode
-z 1                      1 = Enables VMAC mode (per sk50840)
-v                        Enables VSLS mode
                          (refer to VSX Administration Guide)
-V                        Prior to R75.40VS:
                          MD5 checksum for VSLS parameters
-T 0                      Prior to R75.40VS: 3rd party mode
-T 1                       0 = 3rd party cluster mode is undefined
-T 2                       1 = more than 1 machine is active in 3rd party
                           2 = 1 machine is active in 3rd party
-r                        Read registry (only on Windows OS)
-s                        Disables Switch Support - the sending of
                          Ethernet broadcast to overcome switches.
                          Related to probing of local network via ARP.
                          Notes:
                           If it is guaranteed that the cluster will not be
                            connected to Layer 2 switches, then Switch
                            Support can be disabled (no impact).
                           The following lines appear in cluster debug
                            with 'conf' flag:
                            FW-1: fwha_set_conf: SWITCH SUPPORT
                            FW-1: fwha_set_conf: NO SWITCH SUPPORT
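
When reading the 'cphaconf -D ... start' line from 'cphastart -d', it can help to translate the numeric flag values back into the names from the table above. The following is a minimal Python sketch of such a decoder, not a Check Point tool. The function name, the labels, and the sample argument string are hypothetical; only the value mappings come from the table.

```python
import shlex

# Value tables transcribed from the flags table above.
CLUSTER_MODE = {"1": "service", "2": "balance", "3": "primary-up", "4": "active-up"}
LS_MODE = {"0": "multicast", "1": "pivot"}
TRACKING = {
    "0": "None", "1": "Log", "2": "Popup Alert", "3": "Mail Alert",
    "4": "SNMP Trap Alert", "5": "User Defined Alert no. 1",
    "6": "User Defined Alert no. 2", "7": "User Defined Alert no. 3",
}
SDF_METHOD = {"0": "IPs, Ports, and SPIs", "1": "IPs and Ports", "2": "IPs only"}

# flag -> (hypothetical label, function that makes the raw value readable)
DECODERS = {
    "-c": ("cluster size", int),
    "-i": ("member ID", int),
    "-n": ("cluster ID", str),
    "-m": ("cluster mode", lambda v: CLUSTER_MODE.get(v, v)),
    "-M": ("Load Sharing mode", lambda v: LS_MODE.get(v, v)),
    "-l": ("tracking mode", lambda v: TRACKING.get(v, v)),
    "-S": ("Sticky Decision Function", lambda v: v == "1"),
    "-f": ("SDF method", lambda v: SDF_METHOD.get(v, v)),
    "-z": ("VMAC mode", lambda v: v == "1"),
}


def decode_cphaconf_args(args_line):
    """Decode the single-value flags found in a cphaconf argument string."""
    tokens = shlex.split(args_line)
    decoded = {}
    # Walk over adjacent token pairs; only known flags are decoded,
    # multi-value flags such as -t and -d are skipped in this sketch.
    for flag, value in zip(tokens, tokens[1:]):
        if flag in DECODERS:
            label, convert = DECODERS[flag]
            decoded[label] = convert(value)
    return decoded


# Hypothetical argument string, not taken from a real 'cphastart -d' run.
sample = "-D -c 2 -i 1 -n 254 -m 4 -l 1 -z 1 -t eth1 start"
print(decode_cphaconf_args(sample))
```

Names passed explicitly (for example '-m balance') are returned unchanged, because the lookup falls back to the raw value.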

 cphaconf stop

Description:
Removes the cluster configuration from kernel.

Background:
The 'cphastop' command is actually a shell script wrapper that runs this command.

 cphaconf set_ccp <broadcast/multicast>

Description:
Sets the CCP mode - broadcast / multicast (default mode).

Notes:
o Refer to CCP modes section.
o Explicit configuration will be added into:
 Unix OS: $FWDIR/boot/ha_boot.conf file
 Windows OS: Windows Registry -
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\CPHA\
CCP_mode

Refer to these solutions:


o sk20576 (How to set ClusterXL Control Protocol (CCP) in Broadcast / Multicast
mode in ClusterXL)
o sk36644 (The Mode of Cluster Control Protocol (CCP) in VSX cluster).

 cphaconf show_bond <-a | bond_name>

Description:
Shows the current bond configuration
o -a = displays a summary table for all bond interfaces
o bond_name = displays a summary table for specific bond and its slaves

 cphaconf failover_bond <bond_name>

Description:
Starts internal failover between slave interfaces of given bond interface (Bond in High
Availability mode only).

 cphaconf enable_bond_failover <bond_name>

Refer to Configuring Bond Failover Mode section.

Description:
Sets what happens during a failover after a bond has already failed over internally.

Note:
It works only if the value of kernel parameter 'fwha_manual_bond_failover' is
currently set to 1 (one).

The fwha_manual_bond_failover kernel parameter sets the failover mode - either
within the bond, or to the next cluster member:
o 0 (default) - sets the system to fail over to another bonded slave interface when a
failure is detected on a slave interface.
o 1 - sets the system to fail over to another cluster member when a failure is detected
on a slave interface.
o In both modes, the next bond failover occurs in three minutes.

 cphaconf debug_data

Description:
Prints the current cluster configuration as loaded in the kernel on this machine.

Note:
Works only during the following cluster debug:

In 1st shell:
[Expert@Member_HostName]# fw ctl debug 0
[Expert@Member_HostName]# fw ctl debug -buf 32000
[Expert@Member_HostName]# fw ctl debug -m cluster + conf
[Expert@Member_HostName]# fw ctl kdebug -T -f > /var/log/debug.txt

In 2nd shell:
[Expert@Member_HostName]# cphaconf debug_data

In 1st shell:
[Expert@Member_HostName]# fw ctl debug 0

Review /var/log/debug.txt

Example:

 Configuration:
[Expert@FW1-Member:0]# cphaprob state

Cluster Mode: High Availability (Active Up) with IGMP Membership

Number     Unique Address  Assigned Load   State

1 (local)  10.10.10.31     100%            Active
2          10.10.10.32     0%              Standby

[Expert@FW1-Member:0]#

[Expert@FW1-Member:0]# cphaprob -a if

Required interfaces: 3
Required secured interfaces: 1

eth0       UP                    non sync(non secured), multicast
eth1       UP                    sync(secured), multicast
eth2       UP                    non sync(non secured), multicast

Virtual cluster interfaces: 2

eth0       192.168.204.33       VMAC address: 00:1C:7F:00:00:1F
eth2       20.20.20.33          VMAC address: 00:1C:7F:00:00:1F

[Expert@FW1-Member:0]#

 Debug output:

;[cpu_1];[fw4_0];================================================;
;[cpu_1];[fw4_0];===== ClusterXL debug information ===;
;[cpu_1];[fw4_0];================================================
;
;[cpu_1];[fw4_0];---- Sync ----
;
;[cpu_1];[fw4_0];fwlddist_state is (1a): Receiving, Not Saving, Sending;
;[cpu_1];[fw4_0];fwlddist_dobcast is: 1;
;[cpu_1];[fw4_0];fw_has_nondefault_filter is: 1;
;[cpu_1];[fw4_0];fw_syncn_is_configured is: 1;
;[cpu_1];[fw4_0];fwlddist_policy_in_ready_state is: 1;
;[cpu_1];[fw4_0];---- VMAC mode: ----
;
;[cpu_1];[fw4_0];VMAC: vmac mode is enabled;
;[cpu_1];[fw4_0];VMAC: the vmac of each interface:;
;[cpu_1];[fw4_0];Interface: 1) eth0, vmac: 00:1C:7F:00:00:FE;
;[cpu_1];[fw4_0];Interface: 3) eth2, vmac: 00:1C:7F:00:00:FE;
;[cpu_1];[fw4_0];VMAC: priomisc mode interfaces (by the VMAC mechanism) are:;
;[cpu_1];[fw4_0];Interface: 1) eth0, vmac_index=0x0;
;[cpu_1];[fw4_0];Interface: 3) eth2, vmac_index=0x0;
;[cpu_1];[fw4_0];------------------------

;
;[cpu_1];[fw4_0];---- Interfaces info: ----
;
;[cpu_1];[fw4_0];0) if: lo, flags: 0x800;
;[cpu_1];[fw4_0];1) if: eth0, flags: 0x10000800;
;[cpu_1];[fw4_0];2) if: eth1, flags: 0x10000808;
;[cpu_1];[fw4_0];3) if: eth2, flags: 0x10000800;
;[cpu_1];[fw4_0];------------------------

;
;[cpu_1];[fw4_0];================================================;
;[cpu_1];[fw4_0];===== ClusterXL debug end ===;
;[cpu_1];[fw4_0];================================================
;
;[cpu_1];[fw4_1];================================================;
;[cpu_1];[fw4_1];===== ClusterXL debug information ===;
;[cpu_1];[fw4_1];================================================
;
;[cpu_1];[fw4_1];------------------------
;[cpu_1];[fw4_1];===== Cluster instance information ===;
;[cpu_1];[fw4_1];------------------------

;
;[cpu_1];[fw4_1];---- Selection table ----
;
;[cpu_1];[fw4_1];Effective selection table size: 2
;
;[cpu_1];[fw4_1];0: 0;
;[cpu_1];[fw4_1];1: 0;
;[cpu_1];[fw4_1];------------------------

;
;[cpu_1];[fw4_1];---- Multicast table ----
;
;[cpu_1];[fw4_1];lo: Address: 1.0.0.127;
;[cpu_1];[fw4_1];Cluster/default multicast IP: 0.0.0.0, MAC address:
00:00:00:00:00:00;
;[cpu_1];[fw4_1];eth0: Address: 31.204.168.192;
;[cpu_1];[fw4_1];Cluster/default multicast IP: 33.204.168.192, MAC address:
01:00:5E:28:CC:21;
;[cpu_1];[fw4_1];eth1: Address: 31.10.10.10;
;[cpu_1];[fw4_1];Cluster/default multicast IP: 250.10.10.10, MAC address:
01:00:5E:0A:0A:FA;
;[cpu_1];[fw4_1];eth2: Address: 31.20.20.20;
;[cpu_1];[fw4_1];Cluster/default multicast IP: 33.20.20.20, MAC address:
01:00:5E:14:14:21;
;[cpu_1];[fw4_1];------------------------

;
;[cpu_1];[fw4_1];---- Status subscribers ----
;
;[cpu_1];[fw4_1];Subscriber: 0 pid 23079 sig 12 desc pepd;
;[cpu_1];[fw4_1];Subscriber: 1 pid 23078 sig 12 desc pdpd;
;[cpu_1];[fw4_1];Subscriber: 2 pid 25236 sig 3 desc routed instance 0;
;[cpu_1];[fw4_1];Subscriber: 3 pid 25270 sig 12 desc ted;
;[cpu_1];[fw4_1];Subscriber: 4 pid 4533 sig 12 desc cvpnd;
;[cpu_1];[fw4_1];------------------------

;
;[cpu_1];[fw4_1];===== Cluster instance information end ===;
;[cpu_1];[fw4_1];------------------------

;[cpu_1];[fw4_1];---- Sync ----
;
;[cpu_1];[fw4_1];fwlddist_state is (1a): Receiving, Not Saving, Sending;
;[cpu_1];[fw4_1];fwlddist_dobcast is: 1;
;[cpu_1];[fw4_1];fw_has_nondefault_filter is: 1;
;[cpu_1];[fw4_1];fw_syncn_is_configured is: 1;
;[cpu_1];[fw4_1];fwlddist_policy_in_ready_state is: 1;
;[cpu_1];[fw4_1];---- VMAC mode: ----
;
;[cpu_1];[fw4_1];VMAC: vmac mode is enabled;
;[cpu_1];[fw4_1];VMAC: the vmac of each interface:;
;[cpu_1];[fw4_1];Interface: 1) eth0, vmac: 00:1C:7F:00:00:FE;
;[cpu_1];[fw4_1];Interface: 3) eth2, vmac: 00:1C:7F:00:00:FE;
;[cpu_1];[fw4_1];VMAC: priomisc mode interfaces (by the VMAC mechanism) are:;
;[cpu_1];[fw4_1];Interface: 1) eth0, vmac_index=0x0;
;[cpu_1];[fw4_1];Interface: 3) eth2, vmac_index=0x0;
;[cpu_1];[fw4_1];------------------------

;
;[cpu_1];[fw4_1];---- Interfaces info: ----
;
;[cpu_1];[fw4_1];0) if: lo, flags: 0x800;
;[cpu_1];[fw4_1];1) if: eth0, flags: 0x10000800;
;[cpu_1];[fw4_1];2) if: eth1, flags: 0x10000808;
;[cpu_1];[fw4_1];3) if: eth2, flags: 0x10000800;
;[cpu_1];[fw4_1];------------------------

;
;[cpu_1];[fw4_1];================================================;
;[cpu_1];[fw4_1];===== ClusterXL debug end ===;
;[cpu_1];[fw4_1];================================================
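
Comparing the 'Multicast table' in this debug output with the 'cphaprob -a if' output above suggests that the addresses are printed with their octets in reverse order (for example, 33.204.168.192 corresponds to 192.168.204.33). The printed MAC addresses also match the standard IPv4 multicast mapping (01:00:5E followed by the low 23 bits of the IP address). Both observations are inferences from this one sample, not documented behaviour. A minimal Python sketch, with hypothetical helper names, that reproduces the sample values:

```python
def reverse_octets(addr):
    """Reverse the dotted octets, e.g. '33.204.168.192' -> '192.168.204.33'."""
    return ".".join(reversed(addr.split(".")))


def multicast_mac(ip):
    """Standard IPv4-multicast-style mapping: 01:00:5E + low 23 bits of the IP.

    Assumption: ClusterXL derives the MAC this way; it matches the sample only.
    """
    octets = [int(part) for part in ip.split(".")]
    return "01:00:5E:%02X:%02X:%02X" % (octets[1] & 0x7F, octets[2], octets[3])


# Values as printed in the 'Multicast table' of the debug output above.
for printed in ("33.204.168.192", "250.10.10.10", "33.20.20.20"):
    ip = reverse_octets(printed)
    print(printed, "->", ip, multicast_mac(ip))
```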

 cphaconf -t <secured_IF_1> <secured_IF_2> ... add

Description:
Adds the specified trusted (secured) interfaces explicitly into the current cluster
configuration in kernel.

 cphaconf sync

Description:
Sets sync configuration in kernel (in HA New mode).

 cphaconf stop_all_vs

Description:
Stops clustering on each Virtual System (relevant only for VSX systems).

 cphaconf forward <on|off>

Description:
Enables (on; default setting) / Disables (off) the Forwarding Layer (controls the
forwarding of traffic between cluster members).

Refer to Forwarding section.

 cphaconf clear-secured

Description:
Clears the list of secured (trusted) interfaces in kernel.

 cphaconf clear-disconnected

Description:
Clears the list of disconnected interfaces in kernel.

Refer to Defining 'Disconnected' interfaces section.

 cphaconf clear_subs

Description:
Clears the list of subscribers.

Note:
List of such subscribers can be obtained by running the cphaconf debug_data
command.

 cphaconf mc_reload

Description:
Updates the multicast configuration by reloading the 'cphamcset' daemon (if this is
HA New mode and CCP is set to run in Multicast mode). The current configuration is
kept.

 cphaconf uninstall_macs

Description:
Calls the $FWDIR/bin/cpha_restore_macs script to remove the cluster MAC
address configuration (and restore a previous MAC configuration if it was saved on
Linux-based OS to the ifcfg-ethX file).

 cphaconf macs

Description:
Only on IPSO OS: Sets Multicast MAC addresses on relevant interfaces.

 cphaconf init

Description:
Initializes cluster configuration.

 cphaconf fini

Description:
Finalizes cluster configuration.

 cphaconf debug <on|off>

Description:
Enables (on) / Disables (off) ClusterXL kernel debug module (fw ctl debug -m
cluster).

(7-24) 'cpstat' command

Description:
Produces relevant information for the installed products.

Syntax:
[Expert@HostName]# cpstat [-d] [-s SIC_Name] [-p port] [-o
polling_interval [-c count] [-e period]] [-f flavour]
application_flag

Flags:
'cpstat' flags            Description
-d                        Prints some debug information about the execution
                          of 'cpstat' command
-s <SIC_Name>             Sets the SIC name of the AMON server
-p <port>                 Sets the port number of the AMON server
                          (default port is 18192)
-o <polling_interval>     Sets polling interval (in seconds) - how frequently
                          to produce the output (default is 0, i.e., the
                          results are shown only once)
-c <count>                Sets how many times in total to produce the output
                          (default is 0, i.e., the results are shown
                          repeatedly)
-e <period>               Sets the interval, over which "statistical" OIDs
                          are computed (ignored for regular OIDs)
-f <flavour>              Specifies "flavour" of the output - which
                          information to display for desired product
                          Note: to see the supported flavours, just type
                          'cpstat' and press Enter
application_flag          Specifies the desired product
                          Note: to see the supported products, just type
                          'cpstat' and press Enter

In our case, we are interested in the information only about the ClusterXL product:
[Expert@HostName]# cpstat -f default ha
[Expert@HostName]# cpstat -f all ha

Refer to sk93201 (Output of 'cpstat -f all ha' command on Gaia OS does not populate the
'Cluster IPs table' and the 'Sync table').

The 'cpstat -f all ha' command on Gaia OS and on 3rd party / OPSec clusters
works in the following way:
1. The 'cpstat -f all ha' command calls the
$FWDIR/bin/cxl_create_partner_topology_file shell script.
2. The $FWDIR/bin/cxl_create_partner_topology_file shell script collects the
relevant information and saves it in the
$FWDIR/tmp/cxl_partner_topology_config.txt file.
3. 'cpstat -f all ha' uses the information in
$FWDIR/tmp/cxl_partner_topology_config.txt file and populates the
'Cluster IPs table' and the 'Sync table'.
Examples:

[Expert@Member]# cpstat -f default ha

Product name: High Availability
Version: N/A
Status: OK
HA installed: 1
Working mode: High Availability (Active Up)
HA started: yes

[Expert@Member]# cpstat -f all ha

Product name: High Availability
Major version: 6
Minor version: 0
Service pack: 3
Version string: N/A
Status code: 0
Status short: OK
Status long: Refer to the Notification and Interfaces tables for
information about the problem
HA installed: 1
Working mode: High Availability (Active Up)
HA protocol version: 2
HA started: yes
HA state: standby
HA identifier: 1

Interface table
----------------------------------------------------------------
|Name|IP |Status |Verified|Trusted|Shared|Netmask|
----------------------------------------------------------------
|eth0|172.30.41.78|Up | 0| 0| 2|0.0.0.0|
|eth1| 10.10.10.78|Up | 300| 1| 2|0.0.0.0|
|eth2| 20.20.20.78|Up | 300| 0| 2|0.0.0.0|
|eth3| 30.30.30.78|Disconnected|21318100| 0| 2|0.0.0.0|
|eth4| 40.40.40.78|Disconnected|21318100| 0| 2|0.0.0.0|
----------------------------------------------------------------

Problem Notification table
------------------------------------------------
|Name |Status|Priority|Verified|Descr|
------------------------------------------------
|Synchronization|OK | 0| 168880| |
|Filter |OK | 0| 21318| |
|cphad |OK | 0| 21318| |
|fwd |OK | 0| 168949| |
|routed |OK | 0| 21307| |
|cvpnd |OK | 0| 1| |
|ted |OK | 0| 1| |
------------------------------------------------

Cluster IPs table
---------------------------------------------------------------
|Name|IP |Netmask |Member Network|Member Netmask|
---------------------------------------------------------------
|eth0|172.30.41.79| 255.255.0.0| 172.30.0.0| 255.255.0.0|
|eth2| 20.20.20.79|255.255.255.0| 20.20.20.0| 255.255.255.0|
---------------------------------------------------------------

Sync table
--------------------------------
|Name|IP |Netmask |
--------------------------------
|eth1|10.10.10.78|255.255.255.0|
--------------------------------
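
The tables in the 'cpstat -f all ha' output use '|' as the column separator, which makes them easy to post-process. The following minimal Python sketch (the function name is hypothetical, and it assumes the layout shown in the example above) turns one such table into a list of dictionaries, for example to list the trusted (Sync) or disconnected interfaces:

```python
def parse_cpstat_table(text):
    """Turn the '|'-delimited rows of one cpstat table into dictionaries."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Separator lines made of dashes are skipped.
        if line.startswith("|") and line.endswith("|"):
            rows.append([cell.strip() for cell in line[1:-1].split("|")])
    header, body = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in body]


# Rows copied from the 'Interface table' in the example above.
interface_table = """\
----------------------------------------------------------------
|Name|IP |Status |Verified|Trusted|Shared|Netmask|
----------------------------------------------------------------
|eth0|172.30.41.78|Up | 0| 0| 2|0.0.0.0|
|eth1| 10.10.10.78|Up | 300| 1| 2|0.0.0.0|
|eth3| 30.30.30.78|Disconnected|21318100| 0| 2|0.0.0.0|
----------------------------------------------------------------
"""

interfaces = parse_cpstat_table(interface_table)
trusted = [row["Name"] for row in interfaces if row["Trusted"] == "1"]
disconnected = [row["Name"] for row in interfaces if row["Status"] == "Disconnected"]
print(trusted, disconnected)
```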

(7-25) $FWDIR/bin/clusterXL_admin script

This shell script registers a pnote (called 'admin_down') and gracefully changes the state
of the given cluster member to 'Down' (by reporting the state of that pnote as 'problem'), or
gracefully reverts the state of the given cluster member to 'Up' (by reporting the state of that
pnote as 'ok').

Refer to sk55081 (Best practice for manual fail-over in ClusterXL).

(7-26) $FWDIR/bin/clusterXL_monitor_ips script

This shell script pings a list of predefined IP addresses and changes the state of the
given cluster member to 'Down' or 'Up' based on the replies to these pings.
Note: The cluster member goes down if even a single ping is not answered.

Refer to sk35780 (How to configure $FWDIR/bin/clusterXL_monitor_ips script to run
automatically on Gaia / SecurePlatform OS).
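
The decision rule described above (one unanswered ping is enough to bring the member down) can be sketched in Python as follows. This is only an illustration of the rule, not the script's actual code: the function name is hypothetical, the addresses are documentation examples, and no pings are sent.

```python
def member_state(ping_ok_by_ip):
    """Return 'Up' only if every monitored IP address answered, else 'Down'."""
    return "Up" if all(ping_ok_by_ip.values()) else "Down"


# Hypothetical ping results: True = reply received, False = no reply.
print(member_state({"192.0.2.1": True, "192.0.2.2": True}))
print(member_state({"192.0.2.1": True, "192.0.2.2": False}))
```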

(7-27) $FWDIR/bin/clusterXL_monitor_process script

This shell script monitors a list of predefined processes and changes the state of the
given cluster member to 'Down' or 'Up' based on whether these processes are running or not.

Refer to sk92904 (How to configure $FWDIR/bin/clusterXL_monitor_process script to run
automatically on Gaia / SecurePlatform OS).

8. ClusterXL Debugging
(8-1) Debugging Check Point Security Gateway

In order to see how the Security Gateway processes the traffic, and how the internal
components are working, a debug of Check Point kernel should be run on this Security
Gateway (depending on the issue, it might also be required to run a debug of the relevant
user space daemon - e.g., in case of VPN - vpnd, in case of Full Sync - fwd).

Some debugs print so much information, that the load on CPU might increase to 100%
and render the Security Gateway unresponsive.

Note: It is always recommended to run the kernel debug during a scheduled maintenance
window in order to minimize the impact on production traffic and on users.

(8-1-A) Syntax

[Expert@GW_HostName]# fw ctl debug -h


fw ctl debug [-d <strings>] [-s "<string>"] [-v ("<VSIDs>"|all)] [-k] [-x] [-m
<module>] [-e expr |-i <filter-file|-> | -u] [+|-] <options | all | 0>
Or: fw ctl debug [-t (NONE|ERR|WRN|NOTICE|INFO)] [-f (RARE|COMMON)]
Or: fw ctl debug -buf [buffer size][-v ("<VSIDs>"|all)][-k]
-h - for help
-e - Set debug filter to expr (inspect script)
-i - Set debug filter from filter-file (- is the standard input)
-u - Unset debug filtering

To display all kernel debugging modules and all their flags that this machine supports:
[Expert@GW_HostName]# fw ctl debug -m

To display all kernel debugging modules and their flags that were turned on:
[Expert@GW_HostName]# fw ctl debug

To display all debugging flags that were turned on for this kernel debugging module:
[Expert@GW_HostName]# fw ctl debug -m MODULE

To set default kernel debug options:


[Expert@GW_HostName]# fw ctl debug 0

Notes:
 Some debug flags are enabled by default (error, warning) in various kernel
debugging modules, so that some generic messages are printed into Operating
System log (Linux-based OS: /var/log/messages; Windows OS: Event
Viewer).
 This command should be issued before starting any kernel debug.
 This command must be issued to stop the kernel debug.

To unset all kernel debug options:
[Expert@GW_HostName]# fw ctl debug -x

Note:
 This unsets all debug flags, including the default ones, which means that none of the
relevant messages will be printed. The default debug flags should stay enabled, so
restore them afterwards with 'fw ctl debug 0'.

To set kernel debugging buffer:


[Expert@GW_HostName]# fw ctl debug -buf 32000

Notes:
 Default size of the debugging buffer is 50 KB
 Maximal size of the debugging buffer is 32768 KB
 Unless the size of the debugging buffer is increased from default 50 KB, the
debug will not be redirected to a file (debug messages will be printed into
Operating System log)
 Debug messages are collected in this buffer, and a user space process
($FWDIR/bin/fw) collects them and prints into the output file.
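
The buffer size limits in the notes above can be checked before the command is issued. The following minimal Python sketch only builds the command string and does not run it. The function name is hypothetical; the 50 KB default and the 32768 KB maximum are taken from the notes.

```python
DEFAULT_BUF_KB = 50    # default size of the debugging buffer
MAX_BUF_KB = 32768     # maximal size of the debugging buffer


def buffer_command(size_kb):
    """Return the 'fw ctl debug -buf' command and whether the debug
    can be redirected to a file with that buffer size."""
    if not 0 < size_kb <= MAX_BUF_KB:
        raise ValueError("buffer size must be between 1 and %d KB" % MAX_BUF_KB)
    # Per the notes, the buffer has to be larger than the default
    # for the debug to be redirected to a file.
    redirect_to_file = size_kb > DEFAULT_BUF_KB
    return "fw ctl debug -buf %d" % size_kb, redirect_to_file


print(buffer_command(32000))
```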

To print debug messages into the output file (start the kernel debug):
[Expert@GW_HostName]# fw ctl kdebug -T -f > /var/log/debug.txt

Note:
 If you need to use this command in shell scripts, then add an ampersand at the end to
run the command in the background (fw ctl kdebug -T -f > /var/log/debug.txt
&).

To stop the kernel debug:


Press CTRL+C and set the default kernel debug options
[Expert@GW_HostName]# fw ctl debug 0

Note:
 If you started the kernel debug via shell script, then you should just set the default
kernel debug options.

(8-1-B) Important Notes about 'cpstop' and 'cpstart':


 When running the 'cpstop' command, all Check Point services are stopped - and the
kernel debug will stop printing debug messages.
 When running the 'cpstart' command (after the 'cpstop'), the kernel debug will
continue printing debug messages.

(8-1-C) Important Notes about Security Gateway in VSX mode:


 In VSX NGX / VSX R6x, the kernel debug commands can be run from context of any
Virtual Device.
 In VSX R6x, if you wish to filter the debug for messages only from specific Virtual
Devices, then specify the relevant VSIDs in the syntax when setting flags:
[Expert@VSX_HostName:0]# fw ctl debug -v VSID1, VSID2 -m MODULE + flags
Note: Refer to VSX NGX R65 Administration Guide - 'Per Virtual System Debugging'.
 In R75.40VS and above in VSX mode, you have to switch to the context of the
specific Virtual Device, and then run the usual debugging commands:

[Expert@VSX_HostName:0]# vsenv VSID
[Expert@VSX_HostName:VSID]# fw ctl debug ...

(8-1-D) Debug action plan

1. Prepare the kernel debug options:

A. Set default kernel debug options:


[Expert@GW_HostName]# fw ctl debug 0

Should get this message:


Defaulting all kernel debugging options

B. Set kernel debug buffer:


[Expert@GW_HostName]# fw ctl debug -buf 32000

Should get this message:


Initialized kernel debugging buffer to size 32000K

Note: Any other message means that there was a problem allocating the buffer,
and you should not continue until that issue is resolved (e.g., "Failed to allocate
kernel debugging buffer").

C. Set relevant kernel debug flags in relevant kernel debugging modules:


[Expert@GW_HostName]# fw ctl debug -m MODULE + FLAG1 FLAG2 ... FLAGn
or
[Expert@GW_HostName]# fw ctl debug -m MODULE all

Should get this message:


Updated kernel's debug variable for module MODULE

Note: Pay close attention to the name of the kernel debug module.

2. Verify the kernel debug options:


[Expert@GW_HostName]# fw ctl debug -m MODULE

Should get this output:


Kernel debugging buffer size: 32000KB
Module: MODULE
Enabled Kernel debugging options: LIST OF FLAGS

Notes:
 Pay close attention to the size of the kernel debugging buffer.
 Pay close attention to the name of the kernel debugging module.
 The order of the flags in this output does not matter - what matters is that every
flag you set appears in it.

3. Start the kernel debug:


[Expert@GW_HostName]# fw ctl kdebug -T -f > /var/log/debug.txt

Should see the blinking cursor - the debug has started.


You can open a new shell and verify that the information is written into the output file:
[Expert@GW_HostName]# tail -f /var/log/debug.txt

4. If needed, start capturing the relevant traffic:
A. Start Check Point FW Monitor (refer to sk30583)
B. Start TCPdump on relevant interfaces

Note: It is strongly recommended to filter only the relevant traffic.

5. Replicate the issue:


A. Initiate the problematic traffic (write down exact times, IP addresses, ports, etc)
B. Repeat the steps that lead to unwanted behaviour
C. Make sure the issue was replicated

6. Stop the kernel debug and set default kernel debug options:
Press CTRL+C
[Expert@GW_HostName]# fw ctl debug 0

7. Stop the traffic captures:


Press CTRL+C

8. Collect the debug output files (from kernel debug and traffic captures) and all other
related files (OS logs, CPinfo files, daemons' logs, SmartView Tracker logs, etc).
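
Steps 1 through 3 of the action plan can be sketched in Python as an ordered command list, together with a check of the step 2 verification output. The sketch only builds and compares strings and does not run anything on a Security Gateway. The function names and the sample output are hypothetical, and the separator between flags in the real 'Enabled Kernel debugging options' line is an assumption (spaces and commas are both accepted).

```python
def debug_plan(module, flags, out_file="/var/log/debug.txt", buf_kb=32000):
    """Commands for steps 1 to 3, in the order given in the action plan."""
    return [
        "fw ctl debug 0",                                       # 1-A
        "fw ctl debug -buf %d" % buf_kb,                        # 1-B
        "fw ctl debug -m %s + %s" % (module, " ".join(flags)),  # 1-C
        "fw ctl debug -m %s" % module,                          # 2
        "fw ctl kdebug -T -f > %s" % out_file,                  # 3
    ]


def flags_verified(verify_output, module, wanted_flags, buf_kb=32000):
    """Check buffer size, module name and flags in the step 2 output."""
    fields = {}
    for line in verify_output.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    enabled = set(
        fields.get("Enabled Kernel debugging options", "").replace(",", " ").split()
    )
    return (
        fields.get("Kernel debugging buffer size") == "%dKB" % buf_kb
        and fields.get("Module") == module
        and set(wanted_flags) <= enabled  # order of flags does not matter
    )


# Hypothetical step 2 output that follows the template shown in the plan.
sample_output = """Kernel debugging buffer size: 32000KB
Module: cluster
Enabled Kernel debugging options: error warning conf stat pnote
"""
for command in debug_plan("cluster", ["conf", "pnote", "stat"]):
    print(command)
print(flags_verified(sample_output, "cluster", ["pnote", "conf", "stat"]))
```

To stop the debug (step 6), the plan still requires pressing CTRL+C and running 'fw ctl debug 0'.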

(8-2) Debugging modules and flags

To debug Check Point ClusterXL software, the following kernel debugging settings are
used:

(8-2-A) Global Kernel parameters

Before starting the kernel debug itself, pay attention to the following global kernel
parameters relevant to cluster issues (after the debug, restore the default values):

 Set this kernel parameter to zero to remove the limit on the debug messages time window
(default - 60 ; zero - disables the limit):
[Expert@Member_HostName]# fw ctl set int fw_kdprintf_limit_time 0

 Set this kernel parameter to zero to remove the limit on the amount of debug messages
(default - 30 ; zero - disables the limit) that are printed within the specified time
(fw_kdprintf_limit_time):
[Expert@Member_HostName]# fw ctl set int fw_kdprintf_limit 0

 Set this kernel parameter to print additional IO information and the contents of the
packets in HEX format when 'select' flag is enabled in 'cluster' module:
[Expert@Member_HostName]# fw ctl set int fwha_dprint_io 1

 Set this kernel parameter to print additional information about cluster interfaces when
'if' flag is enabled in 'cluster' module (very helpful for Check Point RnD):
[Expert@Member_HostName]# fw ctl set int fwha_dprint_all_net_check 1

 Set this kernel parameter to print the dump of each packet when 'packet' flag is
enabled in 'fw' module (very helpful for Check Point RnD):

[Expert@Member_HostName]# fw ctl set int fw_debug_dump_packet 1
Notes:
o This parameter is available in R75.40VS, R76 and above
o Enabling the debug with flag 'packet' creates high load on CPU
o Enabling the parameter 'fw_debug_dump_packet' creates high load on CPU
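
Because the default values have to be restored after the debug, it can be convenient to generate the 'set' and 'restore' commands together. The following minimal Python sketch covers only the two parameters whose defaults are stated above (60 and 30). The function names are hypothetical, and the commands are printed rather than executed.

```python
# Defaults as documented in this section.
DOCUMENTED_DEFAULTS = {
    "fw_kdprintf_limit_time": 60,
    "fw_kdprintf_limit": 30,
}


def set_command(name, value):
    """Build the on-the-fly 'fw ctl set int' command for one parameter."""
    return "fw ctl set int %s %d" % (name, value)


def disable_limits():
    """Commands to run before the debug (zero disables each limit)."""
    return [set_command(name, 0) for name in DOCUMENTED_DEFAULTS]


def restore_defaults():
    """Commands to run after the debug."""
    return [set_command(name, value) for name, value in DOCUMENTED_DEFAULTS.items()]


for command in disable_limits() + restore_defaults():
    print(command)
```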

(8-2-B) Kernel debugging modules and debug flags

Refer to Kernel Debug flags (R77, R77.10, R77.20, R77.30).


Kernel debugging modules and debug flags relevant to cluster issues are listed below.

Firewall module: fw ctl debug -m fw + flag1 flag2 ... flagN

Flag: chainfwd
Explanation: chain forwarding - processing of packet by various layers
When should be used:
* complicated traffic issues and forwarding between members
* recommended: 'df', 'forward' flags from 'cluster' module

Flag: highavail
Explanation: cluster configuration
When should be used:
* changes in the configuration
* information about interfaces during traffic processing

Flag: ioctl
Explanation: IOCTL control messages - sending configuration from user space to kernel
When should be used:
* policy installation (in specific cases only)
* manual changes in the configuration

Flag: mrtsync
Explanation: synchronization (in kernel) between cluster members of Multicast Routes
that are added when working with Dynamic Routing Multicast protocols
When should be used:
* PIM Routing is enabled on cluster
* Multicast traffic passes through
* Refer to sk95156

Flag: nat
Explanation: NAT issues - basic information (Hiding and Folding behind VIP)
When should be used:
* traffic is not hidden/folded, or incorrectly hidden/folded behind cluster Virtual IP
address
* NAT is not working as expected

Flag: sync
Explanation: synchronization operations in ClusterXL
When should be used:
* see how and which connections are synchronized

Flag: xlate
Explanation: NAT issues - basic information (Hiding and Folding behind VIP)
When should be used:
* traffic is not hidden/folded, or incorrectly hidden/folded behind cluster Virtual IP
address
* NAT is not working as expected

Flag: xltrc
Explanation: NAT issues - additional information - going through NAT rulebase
When should be used:
* traffic is not hidden/folded, or incorrectly hidden/folded behind cluster Virtual IP
address
* NAT is not working as expected

Cluster module: fw ctl debug -m cluster + flag1 flag2 ... flagN

Flag: accel
Explanation: related to status and support of SecureXL
When should be used:
* see how cluster works with SecureXL
* use with: conf
* recommended: when debugging SecureXL

Flag: arp
Explanation: ARP Forwarding in ClusterXL R80.10 and above (sk111956)
When should be used:
* traffic issues
* recommended: if, mac

Flag: ccp
Explanation: arrival/transmission of Cluster Control Protocol (CCP) packets
When should be used:
* issues related to CCP (e.g., member is stuck in 'Ready' state)
* recommended: during policy installation (use with: conf)
* use with: pnote, stat
* recommended: if, mac

Flag: conf
Explanation: configuration and policy installation
When should be used:
* anything related to cluster configuration and policy installation
* use with: pnote, stat, subs
* recommended: if, mac
* recommended: 'ioctl' flag from 'fw' module

Flag: cu
Explanation: Connectivity Upgrade (only since R77.20)
When should be used:
* translation of kernel tables
* connection synchronization
* connection rematch
* etc.
* use with: conf, pnote, stat, subs

Flag: df
Explanation: Decision Function - decides, which member will handle each packet in a
Load Sharing mode
When should be used:
* traffic issues in LS cluster
* recommended: select

Flag: drop
Explanation: connections dropped by the CXL Decision Function (DF) module (only in R60
and above) - excluding CCP packets
When should be used:
* traffic issues
* recommended: df, select, forward

Flag: forward
Explanation: Forwarding Layer messages - sending and receiving a forwarded packet
When should be used:
* traffic issues
* use with: df, if, mac
* recommended: select

Flag: if
Explanation: interface tracking and validation - all the operations and checks on
interfaces
When should be used:
* changes in member's state
* configuration and policy installation
* use with: conf, pnote, stat, mac, subs
* recommended: 'highavail' flag from 'fw' module

Flag: log
Explanation: creating and sending of logs by cluster (should be used in parallel with
'log' flag in 'fw' module)
When should be used:
* logging issues
* use with 'log' flag from 'fw' module

Flag: mac
Explanation: related to current configuration of and detection of cluster interfaces
(should be used in parallel with 'conf' flag and 'if' flag)
When should be used:
* changes in member's state
* configuration and policy installation
* use with: conf, pnote, stat, if, subs
* recommended: 'highavail' flag from 'fw' module

Flag: mmagic
Explanation: operations on MAC magic (getting, setting, updating, initializing,
dropping, etc.)
When should be used:
* anything related to cluster configuration and policy installation
* use with: conf, pnote, stat
* recommended: if, mac

Flag: nokia
Explanation: related to cluster running on IPSO OS
When should be used:
* function calls between Check Point and IPSO OS and reports about state of cluster
* use only on IPSO OS

Flag: pivot
Explanation: related to ClusterXL Load Sharing Unicast (Pivot) mode
When should be used:
* all decisions made in LS Unicast (Pivot mode)

Flag: pnote
Explanation: related to registering and monitoring of critical devices (pnotes)
When should be used:
* state of critical devices (pnotes)
* use with: conf, stat, subs
* recommended: if, mac

Flag: select
Explanation: packet selection - including Decision Function (DF)
When should be used:
* traffic issues
* use with: df, forward
* recommended: if, mac, arp

Flag: stat
Explanation: related to state of cluster members (state machine)
When should be used:
* changes in member's state
* use with: pnote, subs, mac

Flag: subs
Explanation: Subscriber module - set of APIs, which enable user space processes (by
using a DLL) to be aware of the current state of the ClusterXL state machine and other
clustering configuration parameters.
When should be used:
* cluster configuration and policy installation
* changes in member's state
* use with: conf, pnote, stat, mac

Flag: timer
Explanation: reports of cluster internal timers
When should be used:
* issues with cluster operations that are based on internal cluster timers
* use with: ccp
* recommended: pnote, stat, if, mac

Flag: trap
Explanation: sending trap messages from cluster kernel to RouteD daemon about Master
change
When should be used:
* cluster configuration and policy installation
* changes in member's state
* use with: conf, pnote, stat, mac
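
The 'use with' hints in the table above can be turned into a ready-made debug command. The following minimal Python sketch covers only a few of the flags and only the 'use with' companions, not the 'recommended' ones. The dictionary and function names are hypothetical, and the command is printed rather than executed.

```python
# 'use with' companions transcribed from the cluster module table above
# (subset only; extend as needed).
USE_WITH = {
    "ccp": ["pnote", "stat"],
    "conf": ["pnote", "stat", "subs"],
    "forward": ["df", "if", "mac"],
    "pnote": ["conf", "stat", "subs"],
    "select": ["df", "forward"],
    "stat": ["pnote", "subs", "mac"],
    "timer": ["ccp"],
}


def cluster_debug_command(flag):
    """Build 'fw ctl debug -m cluster + ...' for a flag and its companions."""
    flags = [flag] + USE_WITH.get(flag, [])
    return "fw ctl debug -m cluster + " + " ".join(flags)


print(cluster_debug_command("ccp"))
print(cluster_debug_command("forward"))
```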

Additional flags in 'cluster' module only on 40000/60000 appliance:

Flag: correction
Explanation: Correction Layer
When should be used:
* traffic issues

Flag: bstat
Explanation: Blade State
When should be used:
* anything related to policy installation and configuration

Flag: ch_ccp
Explanation: Chassis CCP
When should be used:
* issues related to CCP

Flag: ch_conf
Explanation: Chassis configuration
When should be used:
* anything related to configuration and interfaces

Flag: ch_stat
Explanation: Chassis State
When should be used:
* anything related to state

Flag: iterator
Explanation: Iterator
When should be used:
* traffic issues
* anything related to policy installation and configuration

Flag: osp
Explanation: Open Security Platform
When should be used:
* traffic balancing issues

Flag: smo
Explanation: Single Management Object
When should be used:
* anything related to policy installation and configuration
* anything related to state

Flag: unisync
Explanation: Unicast Sync
When should be used:
* anything related to Unicast Sync

Flag: vpn
Explanation: VPN traffic
When should be used:
* VPN traffic issues

Refer to these solutions:

• sk83220 (How to collect ClusterXL debug during boot)
• sk31114 (How to collect debug on Security Gateway during boot)
• sk92987 (How to debug Check Point 'cpstart' and 'cpstop' commands)

9. Working with kernel parameters
Kernel parameters allow the administrator to control various aspects of Check Point
software operation.
The desired behaviour is achieved by setting the relevant values for the corresponding
kernel parameters.

• It is strongly recommended to consult Check Point Support before changing any
values of kernel parameters.

• Some kernel parameters can be set on-the-fly with the 'fw ctl set int PARAMETER
VALUE' command (e.g., fwha_mac_magic).
Note: This change does not survive reboot.

• Some kernel parameters can be set only during boot of the machine (any parameter
that controls memory allocation or sizes of memory buffers).

• To check the current value of any integer parameter:

[Expert@Member_HostName]# fw ctl get int PARAMETER

• To set the desired value on-the-fly (if the parameter supports it):

[Expert@Member_HostName]# fw ctl set int PARAMETER VALUE
Note: This change does not survive reboot.

• To set the desired value permanently (all parameters support it):

Refer to sk26202 (Changing the kernel global parameters for Check Point Security
Gateway).
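
The check, set, and persist steps above can be wrapped in small shell helpers. This is a minimal sketch under stated assumptions, not a supported tool: the function names are hypothetical, 'fw ctl get int' is assumed to print a line of the form 'PARAMETER = VALUE', and the boot-time file and its PARAMETER=VALUE syntax are assumed to be the ones described in sk26202 (typically $FWDIR/boot/modules/fwkern.conf on Gaia; confirm this in sk26202 before relying on it).

```shell
#!/bin/bash
# Sketch only. Function names are hypothetical; 'fw ctl get int' and
# 'fw ctl set int' are the commands shown in this section.

# Print only the value of an integer kernel parameter.
# Assumption: 'fw ctl get int NAME' prints a line of the form "NAME = VALUE".
get_kernel_int() {
    fw ctl get int "$1" | awk -F' = ' 'NF == 2 { print $2 }'
}

# Set a parameter on the fly, then read it back to confirm the new value.
set_kernel_int_now() {
    local name="$1" want="$2" got
    fw ctl set int "$name" "$want" || return 1
    got=$(get_kernel_int "$name")
    if [ "$got" != "$want" ]; then
        echo "$name is '$got', expected '$want'" >&2
        return 1
    fi
    echo "$name = $got (not persistent across reboot)"
}

# Add or replace a NAME=VALUE line in a boot-time configuration file.
# Assumption: the file path and NAME=VALUE syntax are taken from sk26202.
# The path is passed explicitly; back the file up before changing it.
persist_kernel_int() {
    local file="$1" name="$2" value="$3" tmp
    tmp=$(mktemp) || return 1
    # Keep every existing line except a previous setting of the same name.
    [ -f "$file" ] && grep -v "^${name}=" "$file" > "$tmp"
    echo "${name}=${value}" >> "$tmp"
    cat "$tmp" > "$file" && rm -f "$tmp"
}
```

A typical call would be 'set_kernel_int_now fwha_mac_magic VALUE', followed by 'persist_kernel_int FILE fwha_mac_magic VALUE' once the value has been agreed with Check Point Support. Reading the value back after setting it catches parameters that cannot be changed on-the-fly.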

Refer to the solutions that contain the most relevant cluster-related kernel parameters:
• sk92723 (Cluster flapping prevention)
• sk25977 (Connecting multiple clusters to the same network segment (same VLAN,
same switch))
• sk23695 ('FW-1: State synchronization is in risk. Please examine your
synchronization network to avoid further problems!' appears in /var/log/messages file)
• sk43984 (Interface flapping when cluster interfaces are connected through several
switches)
• sk31655 (State of Standby cluster member in High Availability cluster is constantly
changing between 'Standby' and 'Down')
• sk31336 (Using Monitor Interface Link State feature to improve ClusterXL interface-
failure-detection ability)
• sk62863 (ClusterXL - cluster debug shows interface flapping due to the missing CCP
packets)
• sk63163 (Failover does not occur in ClusterXL HA Primary Up mode after changing
cluster member priorities and installing the policy)
• sk41827 (Synchronization network in the cluster is flooded with Sync Retransmit
packets)
• sk43896 (Blocking New Connections Under Load in ClusterXL)
• sk82080 (/var/log/messages are filled with 'kernel: FW-1:
fwldbcast_update_block_new_conns: sync in risk: did not receive ack for the last 410
packets')

• sk43872 (ClusterXL - CCP packets and fwha_timer_cpha_res parameter)
• sk41471 (ClusterXL - State Synchronization time interval and 'fwha_timer_sync_res'
kernel parameter)
• sk31934 (ClusterXL IGMP Membership)
• sk95156 (How to control the synchronization of multicast routes in Check Point
cluster)
• sk104567 (Traffic passing through the VSX cluster is lost during a cluster failure on
Standby member)

10. ClusterXL Error Messages


Refer to the ClusterXL Administration Guide (R70, R70.1, R71, R75, R75.20, R75.40,
R75.40VS, R76, R77.X, R80.10), chapter 'Monitoring and Troubleshooting Gateway
Clusters':
• Monitoring Cluster Status Using SmartConsole Clients - SmartView Tracker -
ClusterXL Log Messages
• ClusterXL Error Messages

11. Additional related solutions


Refer to these solutions:
• sk57120 (Various warnings in SmartDashboard about synchronization networks in
cluster)
• sk98348 (Best Practices - Security Gateway Performance)
• sk98722 (ATRG: SecureXL)
• sk98737 (ATRG: CoreXL)
