
EIS Installation Checklist for the

ORACLE® Big Data Appliance (BDA) X7-2

Customer:
Task Number:
Technician:
Version EIS-DVD:
Date:

• It is recommended that the EIS web pages are checked for the latest version of this checklist prior
to commencing the installation.
• The idea behind this checklist is to help the installer achieve a "good" installation.
• It is assumed that the installer has attended the appropriate training classes.
• Use of a laptop (preferably with Solaris or Linux available) is recommended during the
installation.
• It is not intended that this checklist be handed over to the customer.
• Feedback on issues with EIS content or product quality is welcome – refer to the last page of this
checklist.

The final configuration steps to configure the BDA system to use ASR are expected to be
performed by an Oracle Advanced Customer Services (ACS) software engineer during actions that
are subsequent to the EIS activities and hence are not part of this installation checklist. An exception
to this is the configuration of the InfiniBand switches for ASR (page 49), since the standard ASR
scripts do not include the InfiniBand switches.

The following additional EIS Installation Checklist will be required:


• Sun Data Center InfiniBand Switch 36 as a Spine Switch in an Engineered System Rack.

Some or all of the following additional EIS Installation Checklists may be required:
• Oracle Advanced Support Gateway (OASG) Server
• Multi-Rack Cabling of Engineered Systems (via InfiniBand).

System Type: BDA X7-2
Rack Master Serial Number:

Oracle Internal and Approved Partners Only Page 1 of 51 Vn 1.1b Created: 15 Feb 2018
ENGINEERED SYSTEMS ENTERPRISE SUPPORT TEAM (EEST)
If a Field Support Engineer (FSE) requires assistance while installing an Engineered System, a
streamlined method of engaging an Engineered Systems Enterprise Support Team (EEST)
engineer when the SR owner is not available is described below.
The SR referred to is the installation SR.
An FSE would use this option if they are onsite and require immediate assistance from EEST.
This process is specifically for FSE Callbacks for hardware, installation and Field Change Order
(FCO) support and should NOT be used by ACS or partners.
The complete process is described in MOS Document ID 1803744.1 EEST: Silent Menu Option :
GCSEXA (internal-only). Ensure that you understand the contents before going onsite.

Task Check

INSTALLATION OVERVIEW – EIS DELIVERABLES

For an overview of the activities carried out during the installation of this product via the
EIS Methodology, refer to the EIS-Deliverables page within the EIS website for the Oracle
Big Data Appliance (BDA).

MEMORY DIMM FAULT BEST PRACTICES


TSC are seeing many DIMMs being replaced in the field because of Memory DIMM faults at
initial boot-up, caused either by Memory DIMM movement during shipping or by improper
seating during memory upgrades.
If Memory DIMM faults are encountered during the installation or upgrade of any engineered
systems, please follow the "BEST PRACTICE" as supplied by the hardware team.
Best Practice for resolving DIMM Faults during Installation or upgrade:
1. We (TSC) are recommending the re-seating of Memory DIMMs when seeing faults during
installation or memory upgrades, as a first step.
2. If we see these failures outside of upgrades or rack installs, we don't want to take the chance
and risk of a second visit if they're truly failed Memory DIMMs. So we recommend, in these
instances, to simply replace the Memory DIMM. If the replaced Memory DIMM experiences
Training errors, then it might be prudent to re-seat the Memory DIMM.

INFORMATION: EIS Checklist Steps and Estimated Timing
Item                                           First Page   Est. Duration
ASR Preparation                                4            1 hour
BDA Rack Preparation                           5            1 hour
Unpacking                                      9            1 hour
Initial Power-On Actions                       11           30 mins
Configuring the CISCO Ethernet Switch          13           45 mins
Verifying / Configuring Rack PDUs              20           30 mins
Configuring the InfiniBand NM2-GW Switches     22           15 mins / switch
InfiniBand Spine Switch Installation           28           15 mins
IB Switches Instance Check                     32           5 mins / switch
BDA Server Nodes                               33           15 mins / server
Verification of the InfiniBand Network         37           5 to 30 mins
Customer Network Preparation                   39           1 hour
Connecting to Customer Network                 44           1 hour
Configure the InfiniBand switches to use ASR   49           1 hour
BDA Mammoth Utility                            51
Handover                                       51           30 mins

Task Comment Check

AUTOMATIC SERVICE REQUEST (ASR) PREPARATION


If the ASR Manager has not been previously installed AND the Customer requests that
Oracle install it, go to the documentation page for Oracle ASR and select the URL for the
ASR Manager installation and operations guide, quick install guide, security white paper.
There you will find the current Installation and Operations Guide. Follow the instructions
to install the ASRM based on the system you have been given to install it in.
Register the ASR manager with Oracle – follow the installation process through the point
where the ASRM is registered to the backend.
These activities must be completed prior to running the ASR activation step within the
BDAMammoth utility.
The BDA rack does not support the Oracle Advanced Support Gateway (OASG / Platinum
Gateway).
CHECKING A CUSTOMER'S EXISTING ASR MANAGER SERVER
If the Customer has a configured and operational ASR Manager system available, the
components within the BDA rack will be configured to use ASR. Ensure that the IP
address & root password of the ASR Manager host are available.
Request that the customer determines the ASR Manager version – 5.5 is the minimum
required version for BDA X7 racks. If the response is that the package cannot be found,
refer to the next task (on the following page).
On a Linux system:
# rpm -qa | grep SUNWswasr
SUNWswasr-2.7-1 <<<<<<< Version 2.7 – requires update to ≥5.5!
On a Solaris System:
# pkginfo -l SUNWswasr
PKGINST: SUNWswasr
NAME: SASM ASR Plugin
CATEGORY: application
ARCH: all
VERSION: 2.6 <<<<<<< Version 2.6 – requires update to ≥5.5!
BASEDIR: /
VENDOR: Sun Microsystems, Inc.
<SNIP>

When the ASR Manager version went from 4.x to 5.x, the package name and consequently
the path names changed. Thus, if the search for package SUNWswasr failed, try again for
asrmanager:
On a Linux system:
# rpm -qa | grep asrmanager
asrmanager-5.5.1 <<<<<<< Version 5.5 is OK!
On a Solaris System:
# pkginfo -l asrmanager
PKGINST: asrmanager
NAME: ASR Manager
CATEGORY: application
ARCH: all
VERSION: 5.3.0 <<<<<<< Version 5.3.0 requires an update!
BASEDIR: /
VENDOR: Oracle Corporation
<SNIP>

If the installed ASR Manager host does not meet the minimum requirements, request the
customer to update prior to installation of the BDA rack(s).
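
The version check above can be sketched in Python for cases where the customer sends back the raw command output. This is a minimal sketch, not an Oracle tool: the function names are hypothetical, and it assumes the output looks like the rpm and pkginfo examples shown in this checklist.

```python
import re

# Minimum ASR Manager version for BDA X7 racks, as stated in this checklist.
MIN_ASR_VERSION = (5, 5)


def parse_asr_version(text):
    """Return the ASR version found in rpm/pkginfo output as a tuple, or None."""
    # rpm style, e.g. "asrmanager-5.5.1" or "SUNWswasr-2.7-1"
    match = re.search(r"(?:asrmanager|SUNWswasr)-(\d+(?:\.\d+)*)", text)
    if match is None:
        # pkginfo style, e.g. "VERSION: 5.3.0"
        match = re.search(r"^\s*VERSION:\s*(\d+(?:\.\d+)*)", text, re.MULTILINE)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def asr_needs_update(text):
    """True if the reported version is below the 5.5 minimum."""
    version = parse_asr_version(text)
    if version is None:
        raise ValueError("no ASR Manager package found in the supplied output")
    # Compare major.minor only; 5.5.1 satisfies the 5.5 minimum.
    return version[:2] < MIN_ASR_VERSION


if __name__ == "__main__":
    for sample in ("SUNWswasr-2.7-1", "asrmanager-5.5.1"):
        print(sample, "needs update:", asr_needs_update(sample))
```

A `None` result from `parse_asr_version` corresponds to the "package cannot be found" case, in which the search should be repeated with the other package name.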

Task Comment Check

BDA RACK PREPARATION


Ensure you have the Configuration files as completed by the customer, signed off by ACS
and provided by the Install Coordinator. This will likely be a .zip file containing the html
installation preview (bda-install-preview.html, for example) and a customer specific
network.json file (bda1-network.json, for example) in a separate directory created by
unzipping the .zip file.
Comment: The Configuration Worksheets are available under:
http://docs.oracle.com/cd/E40622_01/
FAB/EIS-ALERT info reviewed?
Prepare for delivery.
Comment: The delivery company will contact the customer contact with delivery date and time
details. The Oracle FSE should contact the customer to obtain that information when available.
Customer action: If name service (DNS, NIS) in use, ensure that new hostnames, IP
addresses etc. have been correctly entered on the name server.
Laptop available during installation?
Comment: Preferably with Solaris or Linux available.
OTHER DOWNLOADS / ACTIONS


If the rack to be installed was delivered with a Base Image version of 4.10.1 or 4.10.0-x
where x is less than 3, then you must download the most recent BDA v4.10.0 rpm package,
which is available as patch 27441706 from MOS (URL is for the internal MOS site). The
patch is 231.5 MB in size.
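
The download rule above can be written as a small check. This is a sketch only: the function name is hypothetical, and it assumes the Base Image version is available as a plain string in exactly the forms quoted above ("4.10.1" or "4.10.0-x").

```python
import re


def needs_bda_rpm_patch(base_image_version):
    """True if the rule above calls for downloading patch 27441706.

    The rule: Base Image 4.10.1, or 4.10.0-x where x is less than 3.
    """
    version = base_image_version.strip()
    if version == "4.10.1":
        return True
    match = re.fullmatch(r"4\.10\.0-(\d+)", version)
    return bool(match) and int(match.group(1)) < 3


if __name__ == "__main__":
    for candidate in ("4.10.0-2", "4.10.0-3", "4.10.1"):
        print(candidate, needs_bda_rpm_patch(candidate))
```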

ASSUMPTIONS
In an EIS installation the following assumptions are made:
• Systems come pre-configured from the factory with a default name and IP address. These are
used in this checklist for actions before connecting to the customer network. The root
password on all systems (OS and ILOM) has been set to welcome1. The default name and IP
address scheme used is on page 7.
• We understand that v4.10.1 shipped on all BDA X7 racks starting from RR in October 2017.
• EIS recommends that connections from the rack components to the Customer network should
be made AFTER the initial configuration (as described in this checklist). Connecting before
power on could cause undesirable interactions due to possible presence of a duplicate IP
address in the Customer's environment.
• For BDA X7-2 racks, it is required to use a laptop (preferably with Solaris or Linux) plugged
into the Cisco switch using SSH to the default IP addresses defined on page 7. A cable is
provided plugged into Cisco port 48 with settings appropriate for the customer network uplink
to another switch. Do NOT use port 48 on the Cisco for a laptop – use any other free port or
temporarily one of the PDU ports.
• Component numbering starts with 1 at the rack bottom working upwards based on the server
type. BDA Server Node 1 is the lower-most server node in the rack location U2, BDA Server
Node 9 is the upper-most server node in the bottom half rack location U18; BDA Server Node
10 starts the top half progressing up to 18.
• BDA Full Rack configuration has 18 server nodes. A BDA Starter Rack has 6 server nodes
and BDA Elastic Upgrades add additional BDA Server Nodes one at a time up to 12, for a total
of 18 server nodes in the rack.

Table Showing Default Settings for Full, Expansion & Starter Racks

Hostname    net0 IP         ILOM IP          IB bonded IP      RU (BDA Starter / In-Rack Exp. / Full)
                            (hostname-c)     (hostname-priv)
bda18       192.168.1.18    192.168.1.118    192.168.10.18     39
bda17       192.168.1.17    192.168.1.117    192.168.10.17     37
bda16       192.168.1.16    192.168.1.116    192.168.10.16     35
bda15       192.168.1.15    192.168.1.115    192.168.10.15     33
bda14       192.168.1.14    192.168.1.114    192.168.10.14     31
bda13       192.168.1.13    192.168.1.113    192.168.10.13     29
bda12       192.168.1.12    192.168.1.112    192.168.10.12     27 27
bda11       192.168.1.11    192.168.1.111    192.168.10.11     25 25
bda10       192.168.1.10    192.168.1.110    192.168.10.10     23 23
bdasw-ib3   192.168.1.203                                      22 22
Cisco       192.168.1.200                                      21 21
bdasw-ib2   192.168.1.202                                      20 20
bda09       192.168.1.9     192.168.1.109    192.168.10.9      18 18
bda08       192.168.1.8     192.168.1.108    192.168.10.8      16 16
bda07       192.168.1.7     192.168.1.107    192.168.10.7      14 14
bda06       192.168.1.6     192.168.1.106    192.168.10.6      12 12
bda05       192.168.1.5     192.168.1.105    192.168.10.5      10 10
bda04       192.168.1.4     192.168.1.104    192.168.10.4      8 8
bda03       192.168.1.3     192.168.1.103    192.168.10.3      6 6
bda02       192.168.1.2     192.168.1.102    192.168.10.2      4 4
bda01       192.168.1.1     192.168.1.101    192.168.10.1      2 2
bdasw-ib1   192.168.1.201                                      1 1
PDU-A       192.168.1.210
PDU-B       192.168.1.211
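
The server rows of the table follow a regular pattern, which the sketch below reproduces for a given node number. The function name and dictionary keys are hypothetical, and the formulas are read off the table rows rather than taken from any Oracle tool, so the table remains the reference.

```python
def default_node_settings(node):
    """Return the factory default settings for BDA server node 1..18.

    Pattern read from the default settings table:
      net0 IP      = 192.168.1.<node>
      ILOM IP      = 192.168.1.<100 + node>
      IB bonded IP = 192.168.10.<node>
      rack unit    = 2 * node for nodes 1-9, 2 * node + 3 for nodes 10-18
                     (the switches occupy U20 to U22 in between)
    """
    if not 1 <= node <= 18:
        raise ValueError("BDA server nodes are numbered 1 to 18")
    return {
        "hostname": f"bda{node:02d}",
        "net0_ip": f"192.168.1.{node}",
        "ilom_ip": f"192.168.1.{100 + node}",
        "ib_bonded_ip": f"192.168.10.{node}",
        "rack_unit": 2 * node if node <= 9 else 2 * node + 3,
    }


if __name__ == "__main__":
    for n in (1, 9, 10, 18):
        print(default_node_settings(n))
```

A list like this can be useful for building an SSH target list on the laptop before the rack is connected to the customer network.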

INFORMATION: CABLE LABELS WITHIN RACK
The cables between the various units within the rack are labelled by manufacturing. The cables
are also colour-coded as follows:
• Black – InfiniBand Data
• Black – InfiniBand Switch & BDA Server Node Ethernet management cables
• Red – ILOM Ethernet management cables
• Black – AC power jumper cables
Some examples are given here:
At an InfiniBand switch (connection to second switch) (black cable):
R1 U20 P8A (local): Rack Unit 20 Port 8A on the switch.
R1 U24 P8A (remote): Rack Unit 24 Port 8A on the switch.
At an InfiniBand switch to a PCI card on a server (black cable):
R1 U20 P15A (local): Rack Unit 20 Port 15A on the switch.
R1 U12 PCIE3-1 (remote): Rack Unit 12 PCIE card in slot 3, port #1.
At a server's ILOM to Ethernet switch (red cable):
R1 U8 ILOM (local): Rack Unit 8 ILOM NET MGT port.
R1 U23 P38 (remote): Rack Unit 23 Port 38 on the Ethernet switch.
At a server's power cable / PDU (Power Distribution Unit) (black cable):
U19 PS0 (local): Rack Unit 19 Power Supply 0.
PDU A G2-3 (remote): Group 2 Output 3 on PDU A (left side, viewed from rear).
For data cables the label at the opposite end of the cable is labelled with local/remote exchanged;
for power cables the labels at each end are identical.
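
The data cable label format used in the examples (rack, rack unit, port) can be decoded mechanically. The following is a minimal sketch with hypothetical names; it assumes labels follow the "R<rack> U<unit> <port>" pattern shown above and does not cover the power cable labels.

```python
import re
from typing import NamedTuple


class CableLabel(NamedTuple):
    rack: int
    unit: int
    port: str


# Matches labels such as "R1 U20 P8A", "R1 U12 PCIE3-1" or "R1 U8 ILOM".
_LABEL_RE = re.compile(r"R(\d+)\s+U(\d+)\s+(\S+)")


def parse_data_cable_label(label):
    """Split a data cable label into rack number, rack unit and port name."""
    match = _LABEL_RE.fullmatch(label.strip())
    if match is None:
        raise ValueError(f"not a data cable label: {label!r}")
    return CableLabel(int(match.group(1)), int(match.group(2)), match.group(3))


def far_end_labels(local, remote):
    """Labels expected at the other end of a data cable: local/remote exchanged."""
    return remote, local


if __name__ == "__main__":
    print(parse_data_cable_label("R1 U20 P15A"))
    print(far_end_labels("R1 U8 ILOM", "R1 U23 P38"))
```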

Task Comment Check

UNPACKING
For reference refer to the Oracle® Big Data Appliance Site Checklists. Use the latest
release available here: https://docs.oracle.com/en/bigdata/
Oversee the delivery of the rack(s).
Comment: It is the responsibility of the delivery company to unpack and roll the rack into
place into the data center. The Oracle FSE should be present for oversight to ensure that the
delivery company follows proper procedure (e.g., doesn't roll it over rough floor).
Determine the Rack Master Serial Number and contact your regional Installation
Coordinator either by phone or email and provide this serial number. This is so that
your Installation Coordinator can begin the process to verify Install Base information is
correct for future entitlement purposes.
Comment: The Rack's Master Serial Number is located on the top left side wall (viewed from
rear) inside the rack on the rear of the chassis.
Delivery complete?
Collect the white Customer Information Sheets (CIS).
Any documentation delivered with the system should be stored with the spares.
Comment: The installer should inform the customer about location of spares (& documentation)
kit and request the customer to safely store the kit outside of the data center room.
Allow the system to acclimatise (power off) at the customer site if required.
Comment: Refer to EIS standard “Acclimatisation of Oracle Hardware Products”.
Collect packing material together for disposal.
Unpack outside data center to ensure no contamination/dust is released inside
customer's controlled environment.
Verify all packing material has been removed, i.e. nothing is blocked.
Comment: Fans & air vents must be free to operate.
Metal brace plates screwed on the rear of the rack where the PDU power cords are tied are
brackets for shipping only. They should be removed if they were not already removed by the
unpacking engineers, so as to not block airflow out of the rack.
Ensure that the Rack levelling feet have been lowered to stabilize the rack to the floor.
Comment: This will prevent it rolling forward or back while working on the rack.
Refer to the Oracle® Rack Cabinet 1242 User’s Guide Section Stabilize the Rack (Leveling Feet)

There are a number of spare parts that should be handed to the Customer for safe-keeping
(the Customer must be able to locate them when needed!):
• Located inside the rack in a bundle:
  • 1 QSFP InfiniBand copper cable (3m) for FRU replacement
  • 1 Black Cat5e Ethernet cable (10')
  • 1 Red Cat5e Ethernet cable (10')
  • 1 Blue Cat5e Ethernet cable (7')
• Located in the ride-along boxes:
  • 1 10TB Disk Drive
  • Cisco Switch Accessory Kit
  • InfiniBand cables for multi-racking (6x 3m & 10x 5m)
  • X7-2L Documentation Kit
  • Oracle Rack Cabinet 1242 (Foxconn) Accessory Kit
There is also a set of 2 keys to open the rack doors and side panels.

Task Comment Check

INITIAL POWER-ON ACTIONS


In an EIS installation do NOT connect the data cables or the Cisco switch to the Customer's
network at this stage. These will be connected later after the factory network settings are
changed to the Customer's network settings.
Connecting Rack PDUs to Power & Confirming Redundant Distribution
Verify that all the breaker switches (1 per group; 6 per PDU) are off before
connecting the power cables.
Connect power to PDU B only.
Comment: PDU B is on the right-side of rack when viewed from rear.
When breaker is in the On (|) position, the breakers are flush with the side of the PDU.
When in the OFF (0) position, the circuit breakers extend beyond the side of the PDU.
Then switch them all ON on PDU B only one at a time.
Go through all units within the rack and verify that the expected power LEDs (and only these)
are ON.
If one of the expected power LEDs is not ON then verify that the power cables for PDU B
are properly pressed into the PDU.
If some other LED is on then something has been wrongly-cabled and must be fixed NOW.
Comment:
• Oracle Server X7-2L PS1 – the right LED.
• CISCO switch: LED on left (viewed from front) will turn GREEN and middle LEDs for
the other PDU will be RED.
• IB switch: LED on right (viewed from front) labelled PS1.
Additionally connect power to PDU A.
Ensure that for the single phase systems (the ones with 6 power cords) that:
• PDU_A Input_2 and PDU_B Input_0 must be on the same phase.
• PDU_A Input_1 and PDU_B Input_1 must be on the same phase.
• PDU_A Input_0 and PDU_B Input_2 must be on the same phase.
These are marked where they come out of the PDU. Connecting the cables in this manner
ensures that in the case of a failover the phases are balanced on both the A and B sides.
Comment: PDU A is on the left-side of rack when viewed from rear.
When breaker is in the On (|) position, the breakers are flush with the side of the PDU.
When in the OFF (0) position, the circuit breakers extend beyond the side of the PDU.
Then switch them all ON on PDU A one at a time.
Go through all units within the rack and verify that ALL expected power LEDs are ON. If
one of the expected power LEDs is not ON then verify that the power cables for PDU A are
properly pressed into the PDU.
Perform visual check of all cable connections within the rack.
Comment: Do NOT press every connector “just in case”.
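
The single phase pairing rule above (PDU_A Input_2 with PDU_B Input_0, Input_1 with Input_1, Input_0 with Input_2) amounts to pairing input k on PDU A with input 2 - k on PDU B. The sketch below checks a set of site notes against that rule; the function names and the phase labels "L1" to "L3" are assumptions for illustration only.

```python
def same_phase_partner(pdu_a_input):
    """Return the PDU_B input that must share a phase with this PDU_A input."""
    if pdu_a_input not in (0, 1, 2):
        raise ValueError("single phase PDUs have inputs 0, 1 and 2")
    return 2 - pdu_a_input


def phase_mismatches(phases_a, phases_b):
    """List the PDU_A inputs whose phase differs from the paired PDU_B input.

    phases_a / phases_b map input number (0-2) to the phase noted on site.
    """
    return [
        a_input
        for a_input in (0, 1, 2)
        if phases_a[a_input] != phases_b[same_phase_partner(a_input)]
    ]


if __name__ == "__main__":
    pdu_a = {0: "L1", 1: "L2", 2: "L3"}
    pdu_b = {0: "L3", 1: "L2", 2: "L1"}
    print(phase_mismatches(pdu_a, pdu_b))
```

An empty list means the recorded cabling satisfies the pairing rule.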

Verify that for all systems the OK LED is blinking “standby”. This means that the
ILOM is up and that the host is off.
If the system does not go into Standby, connect to that unit's SP SER MGT port with
baud settings 9600,8,N,1. If it is at the pre-boot> menu, then check the locate button on
front and rear is not stuck depressed, then type boot.
Comment: The OK LED blinks on for 0.1 seconds once every 3 seconds when in “standby”.
The system OK LED does NOT flash while ILOM is booting as it did on past systems. The LED
will stay dark until it goes into Standby blink mode after 2 to 3 minutes.
Verify that the server’s LEDs have been correctly manufactured (Bug 27416683).
On all X7-2 and X7-2L servers within the rack, press the chassis locate LED for 10 seconds
until all the LEDs come on. Verify the LED colours on the front left indicator module are
as follows:
Left:
• Locate LED is White
• Service LED is Amber
• System OK LED is Green
• Do NOT Service LED is White
Right:
• Top Fan LED is Amber
• Rear PS LED is Amber
• Temperature LED is Amber
• SP OK LED is Green.
In particular verify that the Service LED is not Green and the SP OK LED is not Amber. If
the system does not have the correct colours, replace the front left indicator module (part
7322171) and open a CPAS citing Bug 27416683.

Task Comment Check

CONFIGURING THE CISCO 93108-1G ETHERNET SWITCH


Starting with the X7 Engineered Systems racks the Cisco 93108-1G model replaces the 4948E-F
as the Ethernet switch for the management network.
We configure the Cisco switch into one big VLAN. More complex switch configuration,
including multiple VLANs, is outside the scope of this service.
We configure the hostname, IP setup and DNS & NTP configurations. We also set all ports to
portfast, ports 1-47 as host ports, and port 48 as a switch port.
• It is NOT recommended to connect the Cisco switch to the customer's network until AFTER
ORACLE Advanced Customer Services (ACS) has re-configured IP addresses on all the
systems within, to prevent any duplicate IP conflicts possible with the default IP addresses the
systems have shipped with.
• The switch comes with Cisco Enterprise Support if under an Oracle Premier Support contract.
Support during installation can be provided by calling Oracle Support on behalf of the
customer, opening a new SR and requesting that the EEST owner open a collaboration SR to
TSC Networks team who have the ability to gain further assistance from Cisco if required.
• Some customers may wish to configure the Cisco Ethernet switch themselves. This may
include having a particular version of the NX-OS software or particular configuration settings
necessary to communicate properly with the rest of the customer's network infrastructure.
This is supported but is outside the scope of this installation service.
• Customers may choose to provide an alternate switch for the engineered system that
conforms to their internal data center network standards. The customer will be responsible for
the supply, installation, configuration and support of a compatible switch. The alternate
switch should be an equivalent 1U 48-port Gigabit Ethernet switch. The alternate switch
should only be installed following the successful installation and testing of the engineered
system with the factory fitted switch.
Oracle may be able to offer alternative switch installation and configuration with an additional
time and materials (T&M) based service.
Connect a serial cable between the CISCO console and a laptop or similar device. The
RJ45 console port is now on the front of the switch (i.e. you access it from the front of the
Engineered System – on the 4948 it was in the rear).
Ensure the following terminal session is logged on the laptop/client device by scripting
the output. The data can then be used as a reference to ensure the switch has been
correctly configured.
Comment: An Oracle supplied rollover cable is pre-installed on the Cisco serial console port.
Obtain the appropriate adapter (Cisco P/N 74-0495-01) and connect it at the end of the
rollover cable. The BDA rack's ride-along accessory kit may or may not (depending on Cisco)
include this Cisco adapter.
An Oracle P/N 530-3100 RJ45-DB9 adapter as used on ILOM ports will also work, connected at
the end of the network cable.
Conversion from DB9 to USB on a laptop will require a USB-DB9 serial adapter similar to that
used during the IB switch configuration.
The default serial port speed is 9600 baud, 8 data bits, 1 stop bit & no parity.

GETTING STARTED
We understand that manufacturing will wipe the switch configuration before delivery. Thus
when the rack's PDUs are powered up the switch will boot and enter the Basic System
Configuration Dialog. Since the connection to the Cisco serial port has probably been
made after the power was applied the initial output from the switch will probably have
been missed.
Enter “?” then press the RETURN key – you should find one of the following:
1. If you see a prompt similar to <switch_name> login: the switch has been
previously configured – go to Erasing the Configuration below.
2. If you see the prompt:
Abort Auto Provisioning and continue with normal setup ?(yes/no)[n]:
Enter "yes" & go to System Admin Account Setup & Basic Configuration on page 15.
If an additional return was entered and the default response of ‘n’ was selected, power
down the switch and turn it back on.
ERASING THE CONFIGURATION
As mentioned above this step should only be needed if (somehow) the switch was shipped
pre-configured... Firstly log in (hoping that the default password has been used):
orcltsw-adm01 login: admin
Password: welcome1

Erase the configuration:


orcltsw-adm0# write erase

Warning: This command will erase the startup-configuration.

Do you wish to proceed anyway? (y/n) [n] y

orcltsw-adm0#

Then reload the system. This will cause the switch to reboot & there will be a long
dialogue which has been clipped here:
orcltsw-adm0# reload
This command will reboot the system. (y/n)? [n] y
2017 Aug 31 01:09:00 exadatax7-adm0 %$ VDC-1 %$ %PLATFORM-2-PFM_SYSTEM_RESET: Manual
system restart from Command Line Interface
CISCO SWITCH Ver7.59
Device detected on 0:1:2 after 0 msecs
Device detected on 0:1:1 after 0 msecs
Device detected on 0:1:0 after 0 msecs
MCFrequency 1333Mhz
Relocated to memory
Time: 8/31/2017 1:9:22
<SNIP>
INIT: version 2.88 booting
<SNIP>
INIT: Entering runlevel: 3
Running S93thirdparty-script...
Populating conf files for hybrid sysmgr ...
Starting hybrid sysmgr ...
inserting /isan/lib/modules/klm_cisco_nb.o ... done

The following is the first prompt after the reboot of the switch:
Abort Auto Provisioning and continue with normal setup ?(yes/no)[n]: yes

---- System Admin Account Setup ----

Wait 30 seconds for the next prompt. Do not press Enter, otherwise the default of 'yes' will be
accepted for the next prompt. If you enter yes by accident, you will have to go through the whole
configuration before you can erase it and start again.
SYSTEM ADMIN ACCOUNT SETUP & BASIC CONFIGURATION
We now set the password for user admin – recommended is welcome1:
Do you want to enforce secure password standard (yes/no) [y]: no

Enter the password for "admin": welcome1


Confirm the password for "admin": welcome1

The Basic System Configuration Dialogue will start. Ignore the request to register the
Cisco Nexus9000 device:
---- Basic System Configuration Dialog VDC: 1 ----

<SNIP>

Would you like to enter the basic configuration dialog (yes/no): yes

Answer “no” to the following questions except when entering the name for the switch:
Create another login account (yes/no) [n]: no

Configure read-only SNMP community string (yes/no) [n]: no

Configure read-write SNMP community string (yes/no) [n]: no

Enter the switch name : orcltsw-adm01 <<< Example name!

Continue with Out-of-band (mgmt0) management configuration? (yes/no) [y]: no

Do NOT configure mgmt 0 (out of band mgmt) – we use In-band-mgmt:


Configure advanced IP options? (yes/no) [n]: yes

Configure static route? (yes/no) [n]: no

Configure the DNS IPv4 address? (yes/no) [n]: no

Configure the default domain name? (yes/no) [n]: no

Enable the telnet service? (yes/no) [n]: no

Enable the SSH service:


Enable the ssh service? (yes/no) [y]: yes

Type of ssh key you would like to generate (dsa/rsa) [rsa]: rsa

Number of rsa key bits <1024-2048> [1024]: 1024

Configure the NTP server:


Configure the ntp server? (yes/no) [n]: yes

NTP server IPv4 address : 10.100.100.2 <<< Example address!

Final configuration:
Configure default interface layer (L3/L2) [L2]: L2

Configure default switchport interface state (shut/noshut) [noshut]: noshut

Configure CoPP system profile (strict/moderate/lenient/dense) [strict]: lenient

Review of configuration & saving:


The following configuration will be applied:
no password strength-check
switchname orcltsw-adm01
no feature telnet
ssh key rsa 1024 force
feature ssh
ntp server 10.100.100.2
system default switchport
no system default switchport shutdown
copp profile lenient

Would you like to edit the configuration? (yes/no) [n]: no

Use this configuration and save it? (yes/no) [y]: yes


[########################################] 100%
Copy complete.

User Access Verification


orcltsw-adm01 login:

FURTHER CONFIGURATION ACTIONS


Log in and enter config mode:
orcltsw-adm01 login: admin
password: welcome1
orcltsw-adm01# configure terminal
orcltsw-adm01(config)#

Adding the VLAN 1 IP:


orcltsw-adm01(config)# feature interface-vlan
orcltsw-adm01(config)# interface vlan 1
orcltsw-adm01(config-if)# ip address 10.100.100.110/24
orcltsw-adm01(config-if)# no shutdown
orcltsw-adm01(config-if)# exit

Set the default route:


orcltsw-adm01(config)# ip route 0.0.0.0/0 10.100.100.1

where 0.0.0.0/0 is entered exactly as shown and 10.100.100.1 is to be replaced by the IP
address of the Customer’s default gateway.
Ports 1-47 are expected to be connected to systems and never to switches.
Configure the Spanning Tree Protocol (STP) port type "edge" (formerly called "portfast") on
ports 1-47, so that these ports skip the STP listening and learning states and start forwarding
as soon as the link comes up:
orcltsw-adm01(config-if)# interface Ethernet 1/1-47
orcltsw-adm01(config-if-range)# spanning-tree port type edge
Edge port type (portfast) should only be enabled on ports connected to a single
host. Connecting hubs, concentrators, switches, bridges, etc... to this
interface when edge port type (portfast) is enabled, can cause temporary bridging loops.
Use with CAUTION
orcltsw-adm01(config-if-range)# exit

Configure STP to be "network" for port 48, which is designated as the uplink port, and
expected to be connected as uplink to another switch:
orcltsw-adm01(config)# interface Ethernet 1/48
orcltsw-adm01(config-if)# spanning-tree port type network
orcltsw-adm01(config-if)# exit
If you will be connecting the switch to 2 different uplink switches for redundancy on the
same VLAN – OR – using an uplink port other than 48, then change this and the previous
command accordingly. Use "network" for uplink ports so the switch is able to detect and
protect against bridging loops and "edge" for non-uplink ports.
Before plugging the uplink port(s) into the customers network consult with their network
administrator to see if the default spanning-tree configuration is acceptable.
Set low STP priority to avoid this switch becoming the "root bridge":
orcltsw-adm01(config)# spanning-tree vlan <1> priority 61440

The value for vlan may be variable depending on Customer uplink settings.
Normally Customers have already configured their core switches to have a better priority
so this problem would not happen at a real Customer.
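
Why 61440 keeps this switch from becoming the root bridge can be illustrated with a simplified model of the STP root election: the bridge with the numerically lowest bridge ID wins, comparing priority first and MAC address second, and 61440 is the largest configurable priority (valid values are multiples of 4096). The sketch below is an illustration only; the switch name customer-core and all MAC addresses are made up.

```python
def is_valid_stp_priority(priority):
    """STP bridge priorities are multiples of 4096 in the range 0..61440."""
    return 0 <= priority <= 61440 and priority % 4096 == 0


def elect_root(bridges):
    """Return the name of the bridge with the lowest (priority, MAC) bridge ID.

    bridges maps a bridge name to a (priority, mac_address) tuple.
    """
    def bridge_id(item):
        _name, (priority, mac) = item
        return (priority, int(mac.replace(":", ""), 16))

    return min(bridges.items(), key=bridge_id)[0]


if __name__ == "__main__":
    # Hypothetical network: both switches left at the common default of 32768.
    defaults = {
        "customer-core": (32768, "00:11:22:33:44:55"),
        "orcltsw-adm01": (32768, "00:00:5e:00:53:01"),
    }
    print(elect_root(defaults))  # the lower MAC address wins the tie

    # With priority 61440 on the rack switch, the customer switch becomes root.
    configured = dict(defaults, **{"orcltsw-adm01": (61440, "00:00:5e:00:53:01")})
    print(elect_root(configured))
```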
By default the Cisco treats all ports as Layer3 interfaces. Use "switchport" by itself to
make all ports act as Layer2 interfaces:
orcltsw-adm01(config)# interface Ethernet 1/1-48
orcltsw-adm01(config-if-range)# switchport
orcltsw-adm01(config-if-range)# exit

Configure the DNS client (this does not work using the install script):
orcltsw-adm01(config)# ip domain-name x7toi.com <<< Example domain name!
orcltsw-adm01(config)# ip name-server 10.100.100.2 <<< Example address!

Setting clock (example to use Pacific Standard Time):


orcltsw-adm01(config)# show clock
20:44:52.986 UTC Thu Aug 31 2017
Time source is NTP
orcltsw-adm01(config)# clock timezone PST -8 0
orcltsw-adm01(config)# show clock
12:46:22.692 PST Thu Aug 31 2017
Time source is NTP

Use the help feature to display possible values for timezone:


orcltsw-adm01(config)# clock timezone ?
WORD Name of time zone, such as PST, MST, CST, EST (Max Size 8)

Exit from config mode:


orcltsw-adm01(config)# exit

VERIFICATION
Verify that all changed settings are correct using the "show running-config" command.
Below is an edited example output. More than likely there will be additional default
settings displayed that we did not set which may be different to the settings required by the
customer's network. If a setting is incorrect and needs to be changed refer to the next task
(on page 18 below).
orcltsw-adm01# show running-config
!Command: show running-config
!Time: Thu Aug 31 13:06:40 2017

version 7.0(3)I5(2)
power redundancy-mode combined force

switchname orcltsw-adm01
vdc orcltsw-adm01 id 1
limit-resource vlan minimum 16 maximum 4094
limit-resource vrf minimum 2 maximum 4096
limit-resource port-channel minimum 0 maximum 511
limit-resource u4route-mem minimum 248 maximum 248
limit-resource u6route-mem minimum 96 maximum 96
limit-resource m4route-mem minimum 58 maximum 58
limit-resource m6route-mem minimum 8 maximum 8

feature interface-vlan
clock timezone PST -8 0

no password strength-check
username admin password 5 $5$CrlKuGiG$A2skAGr3jmZvBn1fgDYGFoHA9xWrdez9MpuSTHSwo96 role network-admin
ip domain-lookup
ip domain-name x7toi.com
ip name-server 10.100.100.2
system default switchport
copp profile lenient
snmp-server user admin network-admin auth md5 0x962ed71b70064edc73d53a89eb14ea8a priv 0x962ed71b70064edc73d53a89eb14ea8a localizedkey
rmon event 1 description FATAL(1) owner PMON@FATAL
rmon event 2 description CRITICAL(2) owner PMON@CRITICAL
rmon event 3 description ERROR(3) owner PMON@ERROR
rmon event 4 description WARNING(4) owner PMON@WARNING
rmon event 5 description INFORMATION(5) owner PMON@INFO
ntp server 10.100.100.2 use-vrf default

vlan 1

vrf context management

interface Vlan1
no shutdown
ip address 10.100.100.110/24

interface Ethernet1/1
spanning-tree port type edge
<SNIP>

interface Ethernet1/47
spanning-tree port type edge

interface Ethernet1/48
spanning-tree port type network

interface Ethernet1/49
<SNIP>

interface Ethernet1/54

interface mgmt0
vrf member management
line console
line vty
boot nxos bootflash:/nxos.7.0.3.I5.2.bin
ip route 0.0.0.0/0 10.100.100.1
no system default switchport shutdown
orcltsw-adm01#

If anything in the above list is incorrect, go back and repeat the appropriate section. To
erase a setting whilst in config mode, insert "no" in front of the same command. Any other
settings that the customer requires should be checked and corrected by the customer.

The installer should send the above running-config output (minus any password lines) to the
customer's network administrator for verification, so that they can suggest any changes
necessary for attaching to the customer network.
Make the current configuration permanent:
orcltsw-adm01# copy running-config startup-config
[########################################] 100%
Copy complete.
orcltsw-adm01# exit

FINALLY
Disconnect the cable from the Cisco console.
The Cisco switch must NOT be connected to the customer's management network at this
stage. This will be done later, after Oracle has configured the systems with the customer's
IP addresses and the customer has verified the running-config on the switch and worked
with the FSE to make any additional changes necessary for attaching to the customer
network.
If you wish to check the Cisco switch, attach your laptop to a port on the Cisco and ping the
IP addresses of the BDA internal management network. Do NOT use port 48 on the Cisco
for a laptop – use any other free port or temporarily one of the PDU ports.

VERIFYING / CONFIGURING RACK PDUS


BDA X7 racks are delivered with the “Enhanced” PDU metering units connected via the Cisco
switch using static IP addresses. The task here is to configure the PDU metering units to the
Customer's management network using static IP addresses. This method is based on the Oracle
Rack Cabinet 1242 Power Distribution Units User’s Guide E87281, page 55 onwards (with
modifications). It is assumed that all BDA X7 racks are delivered with “Enhanced” PDUs with
firmware version 2.07 or later.
The various configuration items that will be entered here into the PDU metering units are to be
taken from the installation template.
THE FOLLOWING STEPS ARE REQUIRED FOR BOTH PDUS A B

Connect an RS-232 cable between the SER MGT port and the host (e.g. Laptop or other
suitable system).
Configure the host’s terminal or terminal emulator (it is assumed that the installer
knows how to do this – otherwise see page 58 of the PDU User’s Guide).
Terminal Configuration Settings:
• 9600 baud
• 8 bit
• 1 stop bit
• no parity bit
• no flow control
At the terminal device, log in to the PDU metering unit:
User = admin, pwd = adm1n
Password may be: welcome1
After successful login, enter the Customer’s network configuration:
pducli->set net_ipv4_dhcp=Off
pducli->set net_ipv4_ipaddr=xxx.xxx.xxx.xxx
pducli->set net_ipv4_subnet=xxx.xxx.xxx.xxx
pducli->set net_ipv4_gateway=xxx.xxx.xxx.xxx

Successful commands will be acknowledged with:


pducli->set OK => PDU-Reset required to apply changes ('reset=yes')!!!

It is sufficient to perform the reset just once (as shown in the task below).
The PDU can be optionally configured for DNS with:
pducli->set net_ipv4_dns1=xxx.xxx.xxx.xxx
pducli->set net_ipv4_dns2=xxx.xxx.xxx.xxx

Reset the PDU metering unit, which will cause it to reboot:
pducli->reset=yes
PDU reset! New login required!

Remove the RS-232 cable from the SER MGT port.
Connect to the PDU using a browser on a laptop attached to the Cisco switch with an IP
address on the same subnet as the Customer's network. Alternatively connect via a
crossover cable (or standard Ethernet cable to a laptop running a 1G connection).
https://<NEW IP Address of PDU>
You will probably need to accept the security note in the web browser when attaching to
the PDU via https.
If the network configuration was successful the web browser will display the Metering
Overview page.

Click on the Net Configuration link found in the upper left side of the page to view the IP
settings.
You will need to login:
User = admin, pwd = adm1n
Password may be: welcome1

SETTING PDU SYSTEM TIME, NTP & INFORMATION A B

Select the System Time tab.


Configure the Manual Settings: Enter the current date and time and click Submit.

Configure the NTP-Server Settings:


• Check the Enable box
• Enter the NTP Server (only one allowed) from the Installation Template
• Select Time Zone from pull-down
• Click Submit
If NTP is not enabled, rebooting the PDU resets the date to 1970 January 1.
Select the PDU Information tab – refer to the Oracle Rack Cabinet 1242 Power
Distribution Units User’s Guide, Section Set the PDU Information (Enhanced PDU).
• Enter the hostname of PDU from Installation Template into the Name field (format
similar to EDX72sw-pdua0)
• Enter BDA X7-2 into the Product Identifier field (case-sensitive).
• Enter the Rack Serial Number (format similar to AK12345678)
• Optionally set the Location field. This should refer to where the PDU is located (e.g.
Lab2002-Rack04).
• Click Submit
Log out from the PDU: Click on Logout.

Repeat the above steps for PDU B.

Task Comment Switch
1 2

CONFIGURING THE SUN DATACENTER 36-PORT MANAGED QDR
INFINIBAND GATEWAY SWITCHES (NM2-GW)
The Spine switch is configured later (page 28) separately from the gateway switches configured
here.
Connect a serial cable between the IB switch's USB serial adapter and a laptop or similar
device.
A USB to DB9 serial adapter is pre-wired to each IB switch's USB port that provides the
serial console access. A DB9-DB9 null-modem cable is included in the ship kit.
Since most laptops do not have a DB9 serial port you will need a second serial to USB
converter cable. Either supply your own USB-to-DB9 adapter, or re-use one of the IB switch
adapters (OS drivers may be required) or order F350-1519 FRU which includes one with an
OS drivers CD.
The default serial port speed is 115200 baud, 8 bits, no parity, 1 stop bit, no handshake &
flow control none.
The terminal settings may need to be changed depending on the terminal type on the laptop
end of the serial cable e.g.
TERM=vt100; export TERM
Login as user ilom-admin:
localhost: ilom-admin
password: welcome1
The switch OS is Linux-based but has an ILOM interface that will be used to make the
necessary configuration changes.
ILOM supports up and down arrows for command-line history, and left and right for
command-line editing which should be used to make these steps easier to complete. Tab can
also be used for command-line completion where possible.
CONFIGURING THE SWITCHES
Set the switch hostname without using the domain name – the Installation Template
contains the hostnames to be used (example):
-> set /SP hostname=bda1sw-ib2 (or -ib3)
-> show /SP hostname
/SP
Properties:
hostname = bda1sw-ib2

Note: the Gateway switches are referred to as Leaf 1 and Leaf 2; however, the default
hostname scheme usually refers to them as ib2 and ib3, where ib1 is the non-gateway spine
switch.
Set the DNS server and domain name:
-> set /SP/clients/dns auto_dns=enabled
-> set /SP/clients/dns nameserver=<IP address>
-> set /SP/clients/dns searchpath=<domain name>

where <IP address> is up to three comma-separated name server IP addresses in preferred
search order, e.g. “nameserver=10.196.23.245,138.2.202.15”, and <domain name> is the
customer's full DNS domain name (everything after the host name and first dot),
e.g. “searchpath=us.oracle.com”.

Verify the DNS settings:
-> show /SP/clients/dns
/SP/clients/dns
Targets:

Properties:
auto_dns = enabled
nameserver = 10.196.23.245, 138.2.202.15
retries = 1
searchpath = us.oracle.com
timeout = 5

<SNIP>

Configure the Switch management network settings:


-> cd /SP/network
-> set pendingipaddress=10.196.16.152
-> set pendingipgateway=10.196.23.254
-> set pendingipnetmask=255.255.248.0
-> set pendingipdiscovery=static
-> set commitpending=true
-> show
/SP/network
Targets:
test
Properties:
commitpending = (Cannot show property)
dhcp_server_ip = none
ipaddress = 10.196.16.152
ipdiscovery = static
ipgateway = 10.196.23.254
ipnetmask = 255.255.248.0
macaddress = 00:E0:4B:38:77:7E
pendingipaddress = 10.196.16.152
pendingipdiscovery = static
pendingipgateway = 10.196.23.254
pendingipnetmask = 255.255.248.0
state = enabled
<SNIP>
->

If any of the “ip<parameter>” values are wrong, correct them by repeating the above
“pendingip<parameter>” settings followed by commitpending=true.
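
One quick consistency check on the values entered above is that the gateway lies inside the subnet defined by the IP address and netmask. The following is a minimal sketch using the example addresses from this task; the function name is illustrative only, and the check does not replace verification by the customer's network administrator.

```python
import ipaddress


def gateway_on_subnet(ip: str, netmask: str, gateway: str) -> bool:
    """Return True if the gateway address is inside the interface's subnet."""
    iface = ipaddress.ip_interface(f"{ip}/{netmask}")
    return ipaddress.ip_address(gateway) in iface.network


# Example values taken from the pendingip* settings shown in this task.
print(gateway_on_subnet("10.196.16.152", "255.255.248.0", "10.196.23.254"))
```

With the netmask 255.255.248.0 the subnet is 10.196.16.0/21, which covers 10.196.16.0 through 10.196.23.255, so the example gateway is accepted.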

Set the Timezone. Setting the timezone first ensures the offset from UTC is maintained
correctly as normally the hardware clock is kept in UTC. If the date is not close to current
time, then set it after setting the timezone:
-> show /SP/clock (Verify the current setting)
-> set /SP/clock timezone=<zone identifier>
-> show /SP/clock (Verify the new setting displays correctly)
where <zone identifier> is the identifier of the proper timezone file, such as “US/Eastern”
or “America/New_York”. This should be provided by the Customer on the configuration
worksheet.
Time zone data provided with the Oracle Big Data Appliance and Oracle Enterprise Linux
comes from the zoneinfo database. For a reference list of latest time zone values, refer to
the zoneinfo database available in file /usr/share/zoneinfo on one of the Linux-based server
nodes, or in the public domain available via http://www.iana.org/time-zones
The timezone files supplied on the IB switch may not be as current as those on the above
site.
If the SP clock is not already close to the current time, set it manually:
-> show /SP/clock (Verify the current setting)
-> set /SP/clock datetime=MMddHHmmCCyy
-> show /SP/clock (Verify the new setting displays correctly)

using the format MMddHHmmCCyy (Month, Day, Hour, Minute, Century, Year).
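
As an illustration of the MMddHHmmCCyy layout, the sketch below formats a timestamp into that string. It assumes that CCyy is simply the four-digit year; the function name is illustrative and is not an ILOM command.

```python
from datetime import datetime


def ilom_datetime(ts: datetime) -> str:
    """Format a timestamp as MMddHHmmCCyy (month, day, hour, minute, 4-digit year)."""
    return ts.strftime("%m%d%H%M%Y")


# 30 January 2012, 11:53 becomes 013011532012
print(ilom_datetime(datetime(2012, 1, 30, 11, 53)))
```

The resulting string is what would be supplied as the datetime= value in the set /SP/clock command above.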
Configure the NTP settings. NTP is critical to the operation of the BDA applications:
-> set /SP/clients/ntp/server/number address=IP_address
Where number can be 1 or 2 depending on how many and which NTP server you are
configuring and IP_address is the address of that server. Use “1” for the primary NTP
server and repeat the command using “2” for the secondary.
-> set /SP/clock usentpserver=enabled
If the customer does not use NTP on their network, then the first two BDA server nodes
should be configured as NTP servers before the deployment scripts are run. If the clocks
are not properly synchronized, the deployment scripts for Hadoop will fail. For
specific instructions on how to configure an NTP server on Linux, refer to MOS Document ID
1554253.1.
VERIFY THE SETTINGS
-> show /SP/clients/ntp/server/1
/SP/clients/ntp/server/1
Targets:

Properties:
address = 10.204.74.2

Commands:
cd
set
show

-> show /SP/clients/ntp/server/2
/SP/clients/ntp/server/2
Targets:

Properties:
address = 10.196.16.1

Commands:
cd
set
show
-> show /SP/clock
/SP/clock
Targets:

Properties:
datetime = Mon Jan 30 11:53:19 2012
timezone = EST (America/New_York)
usentpserver = enabled

Commands:
cd
set
show

Verify that the Rack Master Serial Number is populated on the InfiniBand Gateway Switch
ILOM:
-> show /SP system_identifier

/SP
Properties:
system_identifier = Oracle Big Data Appliance X7-2 AK012345678

The output should display the rack master serial number.
Correct this if it is not set properly. If you exceed the 64 character limit, you will receive a
"set: invalid property value" error.
-> set /SP system_identifier="Oracle Big Data Appliance X7-2 <Rack Master S/N>"
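
The 64 character limit can be checked before the value is typed into ILOM. The following is a small sketch, assuming the limit applies to the whole identifier string including the product name; the function name and the default product string are taken from the example output above and are otherwise illustrative.

```python
MAX_LEN = 64  # limit stated in this checklist


def build_system_identifier(serial: str,
                            product: str = "Oracle Big Data Appliance X7-2") -> str:
    """Build the identifier string and reject it if it exceeds the ILOM limit."""
    ident = f"{product} {serial}"
    if len(ident) > MAX_LEN:
        raise ValueError(f"identifier is {len(ident)} characters, limit is {MAX_LEN}")
    return ident


print(build_system_identifier("AK012345678"))
```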

DETERMINE SWITCH HEALTH STATUS


Examine the Firmware version:
-> version
SP firmware 2.2.7-1
SP firmware build number: 118629
SP firmware date: Fri Jun 9 14:23:48 CEST 2017
SP filesystem version: 0.0.3
The version shown here is 2.2.7-1. The original X7 systems were released with firmware
version 2.2.7-1 and as of November 2017 were still being shipped with that version.
The InfiniBand switches within the BDA X7 racks should be installed with firmware 2.2.7-1
by manufacturing. If you receive a BDA X7 rack that does NOT have firmware 2.2.7-1
you should contact the Solution Center for advice.

Enter the Fabric Management shell:
-> start /SYS/Fabric_Mgmt
Are you sure you want to start /SYS/Fabric_Mgmt (y/n)? y

NOTE: start /SYS/Fabric_Mgmt will launch a restricted Linux shell.


User can execute switch diagnosis, SM Configuration and IB
monitoring commands in the shell. To view the list of commands,
use "help" at rsh prompt.

Use exit command at rsh prompt to revert back to


ILOM shell.

FabMan@bda1sw-ib2->

Check the overall health of the switch:
FabMan@bda1sw-ib2-> showunhealthy
OK - No unhealthy sensors
If there are issues discovered at this point, they must be corrected.

General environment test:
Ensure that all tests return “OK”.
Fans 1, 2 & 3 are expected; Fans 0 & 4 not present is also expected. This switch shares
the same chassis as the 72-port switch which requires those extra fans, the 36-port does
not. All OK and PASSED results should be positive indication that everything is normal.
The example output shown here is from a switch with FW 2.2.7-1. Output from other
versions may vary.
FabMan@bda1sw-ib2->env_test
Environment test started:
Starting Environment Daemon test:
Environment daemon running
Environment Daemon test returned OK
Starting Voltage test:
Voltage ECB OK
Measured 3.3V Main = 3.27 V
Measured 3.3V Standby = 3.37 V
Measured 12V = 11.97 V
Measured 5V = 5.04 V
Measured VBAT = 3.10 V
Measured 2.5V = 2.50 V
Measured 1.8V = 1.78 V
Measured I4 1.2V = 1.22 V
Voltage test returned OK
Starting PSU test:
PSU 0 present OK
PSU 1 present OK
PSU test returned OK
Starting Temperature test:
Back temperature 32
Front temperature 32
SP temperature 51
Switch temperature 42, maxtemperature 43
Temperature test returned OK
Starting FAN test:
Fan 0 not present
Fan 1 running at rpm 12317
Fan 2 running at rpm 12317
Fan 3 running at rpm 12426
Fan 4 not present
FAN test returned OK
Starting Connector test:
Connector test returned OK
Starting Onboard ibdevice test:
Switch OK
All Internal ibdevices OK
Onboard ibdevice test returned OK
Starting SSD test:
SSD test returned OK
Starting Auto-link-disable test:
Auto-link-disable test returned OK
Environment test PASSED
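
If the env_test output is captured to a file from the terminal session, the "returned OK" and "PASSED" lines can be checked mechanically. The following is a minimal sketch based only on the line formats visible in the example above; the wording of a failing line ("returned FAILED" in the comments) is an assumption, and the function name is illustrative.

```python
import re

# Shortened sample in the same format as the example output above.
SAMPLE = """Starting Voltage test:
Voltage ECB OK
Voltage test returned OK
Starting FAN test:
Fan 0 not present
FAN test returned OK
Environment test PASSED"""


def env_test_summary(output: str):
    """Return (all_ok, failed_tests) for captured env_test output."""
    # Collect every "<name> test returned <status>" line.
    results = dict(
        re.findall(r"^[ \t]*(.+?) test returned (\w+)$", output, flags=re.MULTILINE)
    )
    failed = [name for name, status in results.items() if status != "OK"]
    passed = "Environment test PASSED" in output
    return passed and not failed, failed


print(env_test_summary(SAMPLE))
```

Any test name reported in the second element of the result should be investigated before continuing with the installation.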

Use the setsmpriority list command to determine the current priority setting:
FabMan@bda1sw-ib2->setsmpriority list
Current SM settings:
smpriority 5
controlled_handover TRUE
subnet_prefix 0xfe80000000000000
M_Key None
Routing engine FatTree
FabMan@bda1sw-ib2->

Leaf1 (ib2) and Leaf2 (ib3) switches should be set to 5.


If it is not correct, then disable subnet manager and use the setsmpriority command to set it
correctly:
FabMan@bda1sw-ib2->disablesm
Stopping partitiond daemon. [ OK ]
Stopping IB Subnet Manager.. [ OK ]
FabMan@bda1sw-ib2->

Set the priority to 5 on the leaf switches (ib2 and ib3):
FabMan@bda1sw-ib2->setsmpriority 5
Current SM settings:
smpriority 5
controlled_handover TRUE
subnet_prefix 0xfe80000000000000
FabMan@bda1sw-ib2->

Restart the IB Subnet manager:


FabMan@bda1sw-ib2->enablesm
Starting IB Subnet Manager. [ OK ]
Starting partitiond daemon. [ OK ]
FabMan@bda1sw-ib2->

Logout from the IB switch. Exit the Fabric Manager shell:


FabMan@bda1sw-ib2-> exit

Reboot the switch to ensure all the changes take effect:
-> reset /SP
Are you sure you want to reset /SP (y/n)? y
Performing reset on /SP
Broadcast message from root (Wed Sep 7 09:27:29 2016):
The system is going down for reboot NOW!
-> Connection to switch_name closed by remote host.
Connection to switch_name closed.

The switch LEDs will flash while rebooting. It will take about 5 minutes, with no serial
output, until the switch is completely booted and the login prompt appears again.
Disconnect the laptop's serial cable from the IB switch's USB to DB9 adapter port, leaving
the USB to DB9 adapter wired in the rack.

SPINE SWITCH INSTALLATION


The spine switch is always pre-installed by manufacturing within each BDA rack.
Follow the instructions in the EIS checklist Sun Data Center InfiniBand Switch 36 as a
Spine Switch in an Engineered System Rack, starting at Configuring the InfiniBand Spine
Switch (page #6).
CONFIGURING SUBNET MANAGERS
If the BDA rack is intended to be multi-racked to other racks via InfiniBand you should
follow the advice in the EIS Checklist for Multi-Rack Cabling of Engineered Systems
(via InfiniBand).
Refer to the following documents:
• Oracle Exadata Database Machine Maintenance Guide, Section Understanding the
Network Subnet Manager Master.
• Oracle Exadata Database Machine Extending and Multi-Rack Cabling Guide, Section
Preparing to Cable Racks Together.
• Big Data Appliance Owner's Guide, Section Understanding the Network Subnet Manager
Master.
• In mixed cablings with Exalogic or BDA racks, refer to MOS Document ID 1682501.1.
If the BDA rack will be standalone and not connected to other racks with InfiniBand, then
set up the IB switches so that the Spine switch is the subnet manager master as follows.
Connect to the InfiniBand Spine switch (RU1)
using a serial cable or Ethernet cable to the
Cisco Ethernet switch (preferred).
Login as user ilom-admin:
localhost: ilom-admin
password: welcome1
The above password will (should) only be true for newly-installed racks (for production
racks the customer will have changed the password).
The switch OS is Linux-based but has an ILOM interface that will be used to make the
necessary configuration changes.
ILOM supports up and down arrows for command-line history, and left and right for
command-line editing which should be used to make these steps easier to complete. Tab can
also be used for command-line completion where possible.
Enter the Fabric Management shell:
-> start /SYS/Fabric_Mgmt
Are you sure you want to start /SYS/Fabric_Mgmt (y/n)? y

NOTE: start /SYS/Fabric_Mgmt will launch a restricted Linux shell.


User can execute switch diagnosis, SM Configuration and IB
monitoring commands in the shell. To view the list of commands,
use "help" at rsh prompt.

Use exit command at rsh prompt to revert back to


ILOM shell.

FabMan@bda1sw-ib1->

Use the setsmpriority list command to determine the current settings for priority and
controlled handover. For the Spine switch they should be as follows:
FabMan@bda1sw-ib1-> setsmpriority list
Current SM settings:
smpriority 8
controlled_handover TRUE
subnet_prefix 0xfe80000000000000
M_Key None

If one of the settings needs to be changed, perform the procedure in the following steps:
• disablesm
• setsmpriority and/or setcontrolledhandover
• enablesm
The detailed procedure is shown in the following tasks / rows.
Use the disablesm command to stop the Subnet Manager:
FabMan@bda1sw-ib1-> disablesm
Stopping IB Subnet Manager.. [ OK ]

Use the setsmpriority command to set the priority to 8:


FabMan@bda1sw-ib1-> setsmpriority 8
Current SM settings:
smpriority 8
controlled_handover TRUE
subnet_prefix 0xfe80000000000000
M_Key None
FabMan@bda1sw-ib1->

Use the setcontrolledhandover command to set the controlled_handover to TRUE:


FabMan@bda1sw-ib1-> setcontrolledhandover TRUE

Use the enablesm command to restart the Subnet Manager:


FabMan@bda1sw-ib1-> enablesm
Starting IB Subnet Manager. [ OK ]
Starting partitiond daemon. [ OK ]

The smnodes list needs to contain the IP addresses of all switches which have Subnet
Manager enabled so that partition configuration can be synchronized across all these
switches.
Check if the smnodes list exists, and if it does not have the IP's of all the switches listed,
then add or delete them as needed:
FabMan@bda1sw-ib1-> smnodes list
FabMan@bda1sw-ib1-> smnodes add IP_address IP_address ...

or
FabMan@bda1sw-ib1-> smnodes delete IP_address IP_address ...
FabMan@bda1sw-ib1-> smnodes list
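
Deciding which addresses to pass to smnodes add and smnodes delete is a simple set comparison between the current list and the intended list. The following sketch shows the idea; the IP addresses are hypothetical placeholders and the function name is illustrative, not a switch command.

```python
def smnodes_changes(current, desired):
    """Return (to_add, to_delete) so that the smnodes list matches desired."""
    current_set, desired_set = set(current), set(desired)
    to_add = sorted(desired_set - current_set)
    to_delete = sorted(current_set - desired_set)
    return to_add, to_delete


# Hypothetical example: one stale entry, one missing switch.
current = ["192.0.2.11", "192.0.2.99"]
desired = ["192.0.2.11", "192.0.2.12", "192.0.2.13"]
print(smnodes_changes(current, desired))
```

The first list would go to smnodes add and the second to smnodes delete, followed by smnodes list to confirm the result.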
Logout from the InfiniBand switch ILOM shell. Exit the Fabric Manager shell:
FabMan@bda1sw-ib1->exit

Exit the ILOM shell:
-> exit

Login to the InfiniBand switch Linux as root and reboot the switch to ensure all the changes
take effect:
localhost: root
password: welcome1
[root@localhost ~]# reboot

Disconnect the laptop’s serial cable from the
InfiniBand switch’s USB-to-DB9 adapter port
or the laptop’s Ethernet cable from the Cisco
Ethernet switch.
REPEATING ABOVE ACTIONS ON GW LEAF SWITCH WITH PRIORITY 5 2 3

Log into each Gateway Leaf switch in turn.


Enter the Fabric Management shell as above: -> start /SYS/Fabric_Mgmt

Use the disablesm command to stop the Subnet Manager:
FabMan@bda1sw-ib2-> disablesm
Stopping IB Subnet Manager.. [ OK ]

Use the setsmpriority command to set the priority to 5:
FabMan@bda1sw-ib2-> setsmpriority 5
Current SM settings:
smpriority 5
controlled_handover TRUE
subnet_prefix 0xfe80000000000000
M_Key None
FabMan@bda1sw-ib2->

Use the setcontrolledhandover command to set the controlled_handover to TRUE:


FabMan@bda1sw-ib2-> setcontrolledhandover TRUE

Use the enablesm command to restart the Subnet Manager:


FabMan@bda1sw-ib2-> enablesm
Starting IB Subnet Manager. [ OK ]
Starting partitiond daemon. [ OK ]

Check if the smnodes list exists, and if it does not have the correct IP's of all the switches
listed, then add or delete them as needed:
FabMan@bda1sw-ib2-> smnodes list
FabMan@bda1sw-ib2-> smnodes add IP_address IP_address ...

or
FabMan@bda1sw-ib2-> smnodes delete IP_address IP_address ...
FabMan@bda1sw-ib2-> smnodes list
Logout from the InfiniBand switch ILOM shell. Exit the Fabric Manager shell:
FabMan@bda1sw-ib2->exit

Exit the ILOM shell:
-> exit

Login to the InfiniBand switch Linux as root and reboot the switch to ensure all the changes
take effect:
localhost: root
password: welcome1
[root@localhost ~]# reboot
Disconnect the laptop’s serial cable from the
InfiniBand switch’s USB-to-DB9 adapter port
or the laptop’s Ethernet cable from the Cisco
Ethernet switch.
LOCATING THE MASTER SUBNET MANAGER
We must ensure that the Master Subnet Manager is running on the Spine switch. First we
check on which switch the Master Subnet Manager is currently running. If needed, we then
relocate the Master Subnet Manager to another switch.

Connect to any InfiniBand switch using a
serial cable or Ethernet cable to the Cisco
Ethernet switch (preferred).
Login as ilom-admin:
localhost: ilom-admin
password: welcome1
The switch OS is Linux-based but has an ILOM interface that will be used to make the
necessary configuration changes.
ILOM supports up and down arrows for command-line history and left and right for
command-line editing which should be used to make these steps easier to complete. Tab can
also be used for command-line completion where possible.
Enter the Fabric Management shell as above: -> start /SYS/Fabric_Mgmt

Use the getmaster command to check the location of the Master Subnet Manager. The
following example shows that the Master Subnet Manager is currently running on the first
leaf switch:
FabMan@bda1sw-ib1-> getmaster
Local SM enabled and running, state STAND BY
20110207 11:34:04 OpenSM Master on Switch : 0x0021286cccb6a0a0 ports 36 Sun DCS
36 QDR switch bda1sw-ib2 enhanced port 0 lid 1 lmc 0

If the Master Subnet Manager is running on the Spine switch, no further action is required;
go to the next Section of this checklist.
If the Master Subnet Manager is not running on the Spine switch, you need to relocate the
Master Subnet Manager as described in the following steps.
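
When the getmaster output has been captured from the session, the hostname of the current master can be extracted and compared with the Spine switch name. The following is a minimal sketch that assumes the output always follows the layout of the single example shown above (possibly wrapped across lines); the function name and the regular expression are illustrative only.

```python
import re


def master_switch_name(getmaster_output: str):
    """Return the hostname reported after 'switch', or None if not found."""
    # Collapse line wraps so a wrapped description is matched as one line.
    flat = " ".join(getmaster_output.split())
    match = re.search(
        r"OpenSM Master on Switch\s*:\s*0x[0-9a-fA-F]+.*?switch (\S+)", flat
    )
    return match.group(1) if match else None


sample = """Local SM enabled and running, state STAND BY
20110207 11:34:04 OpenSM Master on Switch : 0x0021286cccb6a0a0 ports 36 Sun DCS
36 QDR switch bda1sw-ib2 enhanced port 0 lid 1 lmc 0"""

print(master_switch_name(sample))
```

In this sample the master is reported on bda1sw-ib2, a leaf switch, so the relocation steps that follow would apply.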
Log in as user ilom-admin on the switch that
is the current Master Subnet Manager and
enter the fabric manager shell.
Use the disablesm command to stop the Subnet Manager. The Master Subnet Manager
will then failover to another switch.
Wait ten seconds (may be longer for larger multi-rack cablings) for the InfiniBand network
to update and then use the getmaster command to identify the current location of the
Master Subnet Manager. Based on the previous configuration of the priority setting the
Master Subnet Manager should now be running on the Spine switch.
The following example shows that the Master Subnet Manager has been relocated from the
first leaf switch:
FabMan@bda1sw-ib2-> disablesm
Stopping IB Subnet Manager.. [ OK ]
FabMan@bda1sw-ib2-> getmaster
20110207 11:34:04 OpenSM Master on Switch : 0x0021286cccb6a0a0 ports 36 Sun DCS 36
QDR switch bda1sw-ib1 enhanced port 0 lid 1 lmc 0

Use the enablesm command to re-enable the Subnet Manager on any switches where the
Subnet Manager has been disabled during this procedure:
FabMan@bda1sw-ib2-> enablesm
Starting IB Subnet Manager. [ OK ]
Starting partitiond daemon. [ OK ]

Task Comment Switch
1 2

IB SWITCHES INSTANCE CHECK


When installing a BDA system, the installer must check the instance numbers of all the gateway
switches (NM2-GW) in the fabric to make sure the automatic setup assigned a different instance
number to each switch. The instance numbers must be unique within each rack.
This also applies across all racks within a multi-rack environment – for details refer to the EIS
Installation Checklist for InfiniBand Multi-Rack Cabling of Engineered Systems (a separate
service).
The switches must not use consecutive numbers. The recommendation is to use even numbers.
From the factory the rack will use the numbers 10 & 20. Use this scheme if you have to make
changes. Examine the IB Gateway Switch FW 2.0 Product Notes:
http://docs.oracle.com/cd/E26699_01/pdf/E26705.pdf for current issues and workarounds.
Login as user ilom-admin on GW1:
localhost: ilom-admin
password: welcome1
The switch OS is Linux-based but has an ILOM interface that will be used to make the
necessary configuration changes. If logged in as root, then start ILOM interface with spsh.
ILOM supports up and down arrows for command-line history and left and right for
command-line editing which should be used to make these steps easier to complete. Tab can
also be used for command-line completion where possible.
Login over SSH may take some time due to looking for DNS that is not yet accessible.
Enter the Fabric Management shell:
-> start /SYS/Fabric_Mgmt
Are you sure you want to start /SYS/Fabric_Mgmt (y/n)? y

NOTE: start /SYS/Fabric_Mgmt will launch a restricted Linux shell.


User can execute switch diagnosis, SM Configuration and IB
monitoring commands in the shell. To view the list of commands,
use "help" at rsh prompt.

Use exit command at rsh prompt to revert back to


ILOM shell.

FabMan@bda1sw-ib2->

On GW1 display which number is in use. Example:


FabMan@bda1sw-ib2->setgwinstance --list
No BXM system name set, using 6 last bits of the ip-address, value: 24

If it reports No BXM system name set... (as shown above) or the system name is set to
0, then it MUST be set manually to a value between 0 and 63. Even numbered values are
preferred.
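
The fallback value reported by setgwinstance --list can be reproduced offline, and a planned pair of instance numbers can be checked against the rules in this section. The sketch below assumes that "6 last bits of the ip-address" means the low six bits of the IPv4 address; the function names are illustrative and not part of the switch firmware.

```python
import ipaddress


def default_gw_instance(ip: str) -> int:
    """Low six bits of the IPv4 address (assumed meaning of '6 last bits')."""
    return int(ipaddress.ip_address(ip)) & 0x3F


def instances_acceptable(first: int, second: int) -> bool:
    """Both values in 0..63, different, and not consecutive."""
    in_range = all(0 <= v <= 63 for v in (first, second))
    return in_range and abs(first - second) > 1


# 152 is 0b10011000, so the low six bits are 0b011000 = 24.
print(default_gw_instance("10.196.16.152"))
print(instances_acceptable(10, 20))
```

Two switches whose management addresses differ by a multiple of 64 would fall back to the same value, which is one reason the instance numbers are set explicitly to 10 and 20.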
If GW1 is not 10, then set it to 10 following the factory scheme:
FabMan@bda1sw-ib2->setgwinstance 10
Stopping Bridge Manager..-. [ OK ]
Starting Bridge Manager. [ OK ]
FabMan@bda1sw-ib2->setgwinstance --list
BXM system name set to 10

Now repeat the above for GW2 using 20 as the value (in place of 10).


BDA SERVER NODES


Server Node numbering starts with 1 at the bottom of the rack working upwards.
If for any reason you need to connect to the ILOM Serial Management port for debug, the baud
rate setting on BDA systems is 9600,8,N,1.
Power on all compute nodes: For each node: press power button on front panel.

Each server will boot itself up through BIOS and boot the OS with the default factory IP
configuration.
The servers may take 5 – 10 minutes to boot through the normal BIOS POST tests.
From a laptop connected to the Cisco switch on 192.168.1.x, SSH to the BDA ILOM at
192.168.1.101. Then connect to the console & login to the host.
It is recommended NOT to use port 48 on the Cisco for a laptop – use any other free port or
temporarily one of the PDU ports.
Example:
-> start /SP/console
<press RETURN key>
bda01 login:
User: root
Password: welcome1
You may have to press Enter to wake up the system first.
The system checks are done using the dcli command to run across all the nodes in the rack
at the same time, using SSH authorization keys. The dcli command defaults to using the
known BDA configuration JSON files hence there is no need to specify the -g option to
dcli.
For Elastic configurations you may need to use the appropriate number between 7 and 18
with the '-j' option for each dcli command as follows:
[root@bda01 bda]# dcli -j "eth0_ips[1:9]" "hostname ; date"

where n in eth0_ips[1:n] is the total number of nodes in the rack. If the -j option is
omitted, the command will be delayed while SSH to the non-existent hosts times out.
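
The -j argument follows directly from the number of nodes installed in the rack. The following sketch builds the argument and the full command line as text only; it does not run dcli, the function names are illustrative, and the upper bound of 18 nodes is taken from the full rack size used in this checklist.

```python
def dcli_node_range(node_count: int) -> str:
    """Return the -j value for a rack with node_count nodes (1..18)."""
    if not 1 <= node_count <= 18:
        raise ValueError("node_count must be between 1 and 18")
    return f"eth0_ips[1:{node_count}]"


def dcli_command(node_count: int, remote_cmd: str) -> str:
    """Compose the dcli command line as a string for copy and paste."""
    return f'dcli -j "{dcli_node_range(node_count)}" "{remote_cmd}"'


print(dcli_command(9, "hostname ; date"))
```

For a nine node Elastic configuration this prints the same command line as the example above.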
SSH keys should have already been distributed across the rack during the factory
configuration. Verify SSH keys:
[root@bda01 ~]# dcli "hostname ; date"

If this asks you for a password then enter Control-C (several times) and continue,
otherwise go to the next step.
To generate the root SSH keys and push them across the rack with the default 'welcome1'
password, use the provided script:
[root@bda01 ~]# setup-root-ssh -p welcome1

Add the following option to the above setup-root-ssh command according to the
configuration:
Starter rack: -j "eth0_ips[1:6]"
Full rack: -j "eth0_ips[1:18]"
Elastic config: -j "eth0_ips[1:x]" where x is between 7 and 18.
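The -j values above follow directly from the node count, so the argument can be generated rather than typed. The following is a minimal sketch only: bda_j_range is a hypothetical helper name (it is not part of the BDA tooling), and the bounds of 6 and 18 are assumed from the Starter and Full rack sizes quoted above.

```shell
# Hypothetical helper (not shipped with the BDA image): print the value for
# the -j option of dcli / setup-root-ssh for a rack with the given node count.
bda_j_range() {
    n="$1"
    # Reject anything that is not a plain decimal number.
    case "$n" in
        ''|*[!0-9]*) echo "usage: bda_j_range <node-count>" >&2; return 1 ;;
    esac
    # Assumption: 6 (Starter) is the smallest rack and 18 (Full) the largest.
    if [ "$n" -lt 6 ] || [ "$n" -gt 18 ]; then
        echo "node count must be between 6 and 18" >&2
        return 1
    fi
    printf 'eth0_ips[1:%s]\n' "$n"
}
```

Possible usage for a 9-node elastic rack: dcli -j "$(bda_j_range 9)" "hostname ; date"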
Re-verify by re-running:
[root@bda01 ~]# dcli "hostname ; date"

Verify the System Serial Numbers (check against front of systems S/N sticker) are correct
for each node assignment, where 1 is the lowest system, and 18 is the highest system.
[root@bda01 ~]# dcli "dmidecode -s chassis-serial-number"
192.168.10.1: # SMBIOS implementations newer than version 2.8 are not
192.168.10.1: # fully supported by this version of dmidecode.
192.168.10.1: 1733XC2033
...
192.168.10.6: # SMBIOS implementations newer than version 2.8 are not
192.168.10.6: # fully supported by this version of dmidecode.
192.168.10.6: 1733XC2034
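The dmidecode warning lines make the serial numbers harder to compare against the stickers. A minimal sketch that reduces the output to one "address serial" pair per node is shown here; bda_serials is a hypothetical helper name, and it assumes the output format is exactly as in the example above.

```shell
# Hypothetical helper: read the output of
#   dcli "dmidecode -s chassis-serial-number"
# on stdin and print one "address serial" pair per node.
bda_serials() {
    # Drop the "# SMBIOS implementations newer than ..." comment lines, then
    # split each remaining "address: serial" line on the ": " separator.
    grep -v ': #' | awk -F': ' 'NF == 2 { print $1, $2 }'
}
```

Possible usage: dcli "dmidecode -s chassis-serial-number" | bda_serials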

Verify the Model is set correctly:


[root@bda01 ~]# dcli "ipmitool sunoem cli 'show /System'" | grep model
192.168.1.1: model = BDA X7-2 starter
192.168.1.1: component_model = ORACLE SERVER X7-2L
...
192.168.1.6: model = BDA X7-2 starter
192.168.1.6: component_model = ORACLE SERVER X7-2L

Verify the Rack Master Serial Number is set correctly, check against the rack front S/N
sticker:
[root@bda01 ~]# dcli "ipmitool sunoem cli 'show /SP system_identifier'" | grep =
192.168.10.1: system_identifier = Oracle Big Data Appliance AK00024695
...
192.168.10.18: system_identifier = Oracle Big Data Appliance AK00024695

If the Rack Master Serial Number is incorrect, insert it into the ILOM on every system
(refer to the IP addresses on page 7):
Enter the following command (on one line, no break):
[root@bda01 ~]# dcli -l root \
"ipmitool sunoem cli 'set /SP system_identifier=\
"\"Oracle Big Data Appliance AK00024695\""'" \
> /tmp/set-rack-csn.out

Replace AK00024695 in the above command with the actual Rack Master Serial Number (R-MSN).


Note that the system_identifier has a 40 character limit so if too much text is entered
between Appliance and the end of the serial number for the R-MSN value, you will
receive a "set: invalid property value" error.
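The 40 character limit can be checked before the value is pushed to the ILOMs. A minimal sketch, assuming the limit applies to the complete string (the "Oracle Big Data Appliance " prefix plus the serial number); check_system_identifier is a hypothetical helper name, not a BDA tool.

```shell
# Hypothetical helper: build the system_identifier string for a given Rack
# Master Serial Number and refuse it if it exceeds 40 characters.
check_system_identifier() {
    id="Oracle Big Data Appliance $1"
    len=${#id}
    # Assumption: the 40 character limit covers the whole string.
    if [ "$len" -gt 40 ]; then
        echo "too long ($len characters, limit 40): $id" >&2
        return 1
    fi
    echo "$id"
}
```

With the example serial number AK00024695 the string is 36 characters long, so it fits within the limit.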
Verify that the 2 IB ports per server are both LinkUp:
[root@bda01 ~]# dcli ibstatus | grep phys
192.168.10.1: phys state: 5: LinkUp
192.168.10.1: phys state: 5: LinkUp
..
192.168.10.18: phys state: 5: LinkUp
192.168.10.18: phys state: 5: LinkUp
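On a full rack the list above is 36 lines long, so a per-node summary can be easier to read. A minimal sketch, assuming the output format shown above; ib_ports_not_up is a hypothetical helper name. Note that a node that returns no output at all will not be listed, so the node count still needs to be checked separately.

```shell
# Hypothetical helper: read the output of "dcli ibstatus | grep phys" on stdin
# and print every node that does not report exactly 2 ports in LinkUp state.
ib_ports_not_up() {
    awk -F': ' '
        /phys state/ {
            seen[$1]++                      # every port line counts as seen
            if ($0 ~ /LinkUp/) up[$1]++     # only LinkUp ports count as up
        }
        END {
            for (h in seen)
                if (up[h] != 2)
                    printf "%s: %d of 2 ports LinkUp\n", h, up[h]
        }
    ' | sort
}
```

Possible usage: dcli ibstatus | grep phys | ib_ports_not_up (no output means every reporting node has both ports LinkUp).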

Verify that the IB ports are running QDR:


[root@bda01 ~]# dcli ibstatus | grep rate
192.168.10.1: rate: 40 Gb/sec (4X QDR)
192.168.10.1: rate: 40 Gb/sec (4X QDR)
..
192.168.10.18: rate: 40 Gb/sec (4X QDR)
192.168.10.18: rate: 40 Gb/sec (4X QDR)

2 per server – both should be at QDR 40 Gb/sec.


Verify that there are no faults seen by the ILOMs:
[root@bda01 ~]# dcli 'ipmitool sunoem cli "show faulty"'

There should be none.

After powering up each server you should see the following two files in the /root
directory:
BDA_IMAGING_SUCCEEDED
BDA_REBOOT_SUCCEEDED
[root@bda01 ~]# dcli ls -l /root/BDA*
The first one should already be there from the factory, but the second one is generated after
you power up the system at the customer site.
If instead you find the file BDA_IMAGING_FAILED or BDA_REBOOT_FAILED, one or more of our
hardware or software checks failed. The files /root/bda_imaging_status and
/root/bda_reboot_status will give you more detailed information on what checks succeeded or
failed.
The following checks should also help identify any issues that should be rectified before
continuing to the ACS portion.
Gather the hardware profile output from each system into a file for review:
[root@bda01 ~]# dcli bdacheckhw > ~/all-bdahwcheck.out

Verify there are no failures in the hardware profile output:


[root@bda01 ~]# grep -vi success ~/all-bdahwcheck.out

If there are no issues the above command should return nothing. If there is an issue, it will
return this as the output. Optionally, use “less” or “more” to page through the output file
which verifies the hardware configuration is supported, in the correct slots. Any failures
and warnings need to be investigated and rectified before continuing.
If there are any INFO lines, these can be ignored. Action any failed checks.
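Since INFO lines can be ignored, the grep above can be extended so that only lines needing action remain. A minimal sketch, assuming that informational lines contain the literal text INFO; bda_check_failures is a hypothetical helper name.

```shell
# Hypothetical helper: print the lines of a bdacheckhw / bdachecksw output file
# that are neither SUCCESS nor INFO. Returns 1 if any such line exists.
bda_check_failures() {
    # Keep only lines that are not SUCCESS, not INFO and not blank.
    if grep -vi 'success' "$1" | grep -v 'INFO' | grep -v '^[[:space:]]*$'; then
        return 1    # something was printed: at least one check needs attention
    fi
    return 0
}
```

Possible usage: bda_check_failures ~/all-bdahwcheck.out && echo "no failures found"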
Specific lines that are worthy of grep'ing out for individual component verification are:
[root@bda01 ~]# grep cores ~/all-bdahwcheck.out should be 96.
[root@bda01 ~]# grep memory ~/all-bdahwcheck.out should be ~252 (256GB).
[root@bda01 ~]# grep fans ~/all-bdahwcheck.out should be 4.
[root@bda01 ~]# grep supply ~/all-bdahwcheck.out should be all OK.

An easy way to verify that the power supplies are all present is to use wc -l to count the
number of output lines, there should be 12 for a Starter, an appropriate multiple of 2 for
elastic configurations and 36 for Full Rack.
Verify that the disks and volumes are all present:
[root@bda01 ~]# grep disk ~/all-bdahwcheck.out | grep "model\|status" | more
should be LSI with disks 0 to 11 the same model and all “Online, Spun Up No alert”.
If using “wc -l” to count lines, replace “| more” with “| grep Online | wc -l”, which will
count the disks online – should be 72 for a Starter, an appropriate multiple of 12 for elastic
configurations and 216 for a Full Rack.
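The expected line counts for the power supply and disk checks are simple multiples of the node count (2 power supplies and 12 disks per node). A minimal sketch for elastic configurations; expected_bda_counts is a hypothetical helper name.

```shell
# Hypothetical helper: print the expected "wc -l" results for a rack with the
# given number of nodes (2 power supplies and 12 disks per node).
expected_bda_counts() {
    nodes="$1"
    printf 'power_supplies=%s disks=%s\n' "$((nodes * 2))" "$((nodes * 12))"
}
```

For example, a 9-node elastic rack should show 18 power supply lines and 108 online disks.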
Verify that the IB HCA is being seen properly:
[root@bda01 ~]# grep Host ~/all-bdahwcheck.out | grep model
should be:
SUCCESS: Correct Host Channel Adapter model: Mellanox Technologies MT27500 [ConnectX-3]

If Memory DIMM faults are encountered during the installation or upgrade of any
engineered systems, please follow the "BEST PRACTICE" as supplied by the hardware
team – refer to page 2 of this checklist.

Verify that the RAID volumes are all present and in optimal state:
[root@bda01 ~]# dcli MegaCli64 -ldinfo -lall -a0 | grep "Virtual Drive\|State" > ~/all-ldstate.out
[root@bda01 ~]# less ~/all-ldstate.out

There should be 12 virtual drives, numbered 0 to 11, for each server node, all in Optimal
state. If using “wc -l” to count lines, replace “less” with “grep Optimal” and pipe the result
into “wc -l” - there should be 72 for a Starter, an appropriate multiple of 12 for elastic
configurations and 216 for a Full Rack.
Gather the software profile output from each system into a file for review:
[root@bda01 ~]# dcli bdachecksw > ~/all-bdaswcheck.out

The software profile output checks the partition setup and software versions. Check there
are no failures in the software profile output:
[root@bda01 ~]# grep -vi success ~/all-bdaswcheck.out

If there are no issues the above command should return nothing. If there is an issue, it will
return this as the output. Optionally, use less or more to page through the output file
which verifies the software configuration is supported. If there are any INFO lines, these
can be ignored. If there are any failures then the system OS may need to be re-imaged for
re-partitioning.
Verify the boot order is correct on all of the nodes:
[root@bda01 ~]# dcli efibootmgr
192.168.10.1: BootCurrent: 0000
192.168.10.1: Timeout: 1 seconds
192.168.10.1: BootOrder: 0000,0004,0005,0006,0007
192.168.10.1: Boot0000* Oracle Big Data Appliance 4
192.168.10.1: Boot0004* NET0:PXE IP4 Intel(R) I210 Gigabit Network Connection
192.168.10.1: Boot0005* Oracle Linux
192.168.10.1: Boot0006* Oracle Linux
192.168.10.1: Boot0007* Oracle Linux
[...]
192.168.10.4: BootCurrent: 0000
192.168.10.4: Timeout: 1 seconds
192.168.10.4: BootOrder: 0000,0001,0002,0003,0004
192.168.10.4: Boot0000* Oracle Big Data Appliance 4
192.168.10.4: Boot0001* NET0:PXE IP4 Intel(R) I210 Gigabit Network Connection
192.168.10.4: Boot0002* Oracle Linux
192.168.10.4: Boot0003* Oracle Linux
192.168.10.4: Boot0004* Oracle Linux

Should be BootOrder: 0002,0000,0001


• 1st is Boot0002* Oracle Linux
• 2nd is Boot0000* Oracle Big Data Appliance 4
• 3rd and last is Boot0001* NET0:PXE IP4 Intel(R) I210 Gigabit Network Connection.
If any are in the incorrect order, log into the affected node and use the efibootmgr --bootorder
command:
[root@bda01 ~]# efibootmgr --bootorder 0002,0000,0001
BootCurrent: 0002
Timeout: 1 seconds
BootOrder: 0002,0000,0001
Boot0000* Oracle Big Data Appliance 4
Boot0001* NET0:PXE IP4 Intel(R) I210 Gigabit Network Connection
Boot0002* Oracle Linux
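To find the nodes that need correcting, the BootOrder lines from the dcli output can be compared against the expected value. A minimal sketch, assuming the output format shown above; boot_order_mismatches is a hypothetical helper name, and the expected order is passed in by the installer rather than assumed by the helper.

```shell
# Hypothetical helper: read "dcli efibootmgr" output on stdin and print the
# address and BootOrder of every node whose BootOrder differs from the
# expected value given as the first argument.
boot_order_mismatches() {
    expected="$1"
    # Fields: $1 = "address:", $2 = "BootOrder:", $3 = comma-separated order.
    awk -v want="$expected" '$2 == "BootOrder:" && $3 != want { print $1, $3 }'
}
```

Possible usage: dcli efibootmgr | boot_order_mismatches 0002,0000,0001 (no output means every node matches).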

NOTE: If upgrading to a new BDA server base image is required for any reason, then it is
recommended that this be done now, prior to any additional setup of customer network or
connections, although re-imaging should not matter if it is network configured or not.
ACS, EEST, or TSC x86 can provide instructions and the image if necessary.


VERIFICATION OF THE INFINIBAND NETWORK


Perform visual check of all IB cable connections within the rack.
Visually check the IB cabling – that the lights on the ports are ON. Look through the cable
management arms to ensure that the expected LEDs are ON.
Do NOT press every connector “just in case”.
Verify InfiniBand topology (example of fully-operational system) before running the
configuration steps, as the IB connections are used extensively during the install for inter-
node communication. The bdacheckib utility is in the default root user's path, normally
stored in /opt/oracle/bda/bin in the base OS image.
For single-racks, run the following command for the shipped configuration:
[root@bda01 ~]# bdacheckib -s
LINK bdasw-ib1.0B ... bdasw-ib3.8B UP
LINK bdasw-ib1.1B ... bdasw-ib2.8B UP
LINK bdasw-ib3.15A ... bda02.HCA-1.2 UP
LINK bdasw-ib3.15B ... bda01.HCA-1.2 UP
LINK bdasw-ib3.14A ... bda04.HCA-1.2 UP
LINK bdasw-ib3.14B ... bda03.HCA-1.2 UP
LINK bdasw-ib3.13A ... bda06.HCA-1.2 UP
LINK bdasw-ib3.13B ... bda05.HCA-1.2 UP
LINK bdasw-ib3.12B ... bda07.HCA-1.2 UP
LINK bdasw-ib3.11A ... bdasw-ib2.11B UP
LINK bdasw-ib3.9B ... bdasw-ib2.9A UP
LINK bdasw-ib3.9A ... bdasw-ib2.9B UP
LINK bdasw-ib3.10B ... bdasw-ib2.10A UP
LINK bdasw-ib3.10A ... bdasw-ib2.10B UP
LINK bdasw-ib3.11B ... bdasw-ib2.11A UP
LINK bdasw-ib3.12A ... bda08.HCA-1.2 UP
LINK bdasw-ib3.0B ... bda17.HCA-1.1 UP
LINK bdasw-ib3.0A ... bda18.HCA-1.1 UP
LINK bdasw-ib3.1B ... bda15.HCA-1.1 UP
LINK bdasw-ib3.1A ... bda16.HCA-1.1 UP
LINK bdasw-ib3.2B ... bda13.HCA-1.1 UP
LINK bdasw-ib3.2A ... bda14.HCA-1.1 UP
LINK bdasw-ib3.3B ... bda11.HCA-1.1 UP
LINK bdasw-ib3.3A ... bda12.HCA-1.1 UP
LINK bdasw-ib3.4B ... bda09.HCA-1.2 UP
LINK bdasw-ib3.4A ... bda10.HCA-1.1 UP
LINK bdasw-ib3.8A ... bdasw-ib2.8A UP
LINK bdasw-ib3.8B ... bdasw-ib1.0B UP
LINK bdasw-ib2.15A ... bda02.HCA-1.1 UP
LINK bdasw-ib2.15B ... bda01.HCA-1.1 UP
LINK bdasw-ib2.14A ... bda04.HCA-1.1 UP
LINK bdasw-ib2.14B ... bda03.HCA-1.1 UP
LINK bdasw-ib2.13A ... bda06.HCA-1.1 UP
LINK bdasw-ib2.13B ... bda05.HCA-1.1 UP
LINK bdasw-ib2.12B ... bda07.HCA-1.1 UP
LINK bdasw-ib2.11A ... bdasw-ib3.11B UP
LINK bdasw-ib2.9B ... bdasw-ib3.9A UP
LINK bdasw-ib2.9A ... bdasw-ib3.9B UP
LINK bdasw-ib2.10B ... bdasw-ib3.10A UP
LINK bdasw-ib2.10A ... bdasw-ib3.10B UP
LINK bdasw-ib2.11B ... bdasw-ib3.11A UP
LINK bdasw-ib2.12A ... bda08.HCA-1.1 UP
LINK bdasw-ib2.0B ... bda17.HCA-1.2 UP
LINK bdasw-ib2.0A ... bda18.HCA-1.2 UP
LINK bdasw-ib2.1B ... bda15.HCA-1.2 UP
LINK bdasw-ib2.1A ... bda16.HCA-1.2 UP
LINK bdasw-ib2.2B ... bda13.HCA-1.2 UP
LINK bdasw-ib2.2A ... bda14.HCA-1.2 UP
LINK bdasw-ib2.3B ... bda11.HCA-1.2 UP
LINK bdasw-ib2.3A ... bda12.HCA-1.2 UP
LINK bdasw-ib2.4B ... bda09.HCA-1.1 UP

LINK bdasw-ib2.4A ... bda10.HCA-1.2 UP
LINK bdasw-ib2.8A ... bdasw-ib3.8A UP
LINK bdasw-ib2.8B ... bdasw-ib1.1B UP
[root@bda1 ~]#

For more extensive checking of the fabric, review the output from "iblinkinfo".
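The bdacheckib listing is long, so it can help to print only the links that are not UP. A minimal sketch, assuming every link line starts with LINK and ends with its state as in the example above; ib_links_not_up is a hypothetical helper name, and the exact text shown for a failed link (for example DOWN) is an assumption, not taken from the tool.

```shell
# Hypothetical helper: read "bdacheckib -s" output on stdin and print every
# LINK line whose last word is anything other than UP.
ib_links_not_up() {
    awk '$1 == "LINK" && $NF != "UP"'
}
```

Possible usage: bdacheckib -s | ib_links_not_up (no output means all listed links are UP).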


CUSTOMER NETWORK PREPARATION


ACTIONS IF BASE IMAGE PREVIOUS TO V4.10.0-3
Special instructions for configuring the network on already-shipped BDA racks with a
Base Image version of 4.10.1 or 4.10.0-x where x is less than 3.
The instructions here are based on the contents of Network Configuration Instructions for
Shipped BDA Racks with a BDA Base Image Less Than V4.5.0 (MOS Document ID
2135358.1).
Use imageinfo to determine which version of the Base Image is installed.
Example showing v4.10.0:
[root@bda01 ~]# imageinfo
Big Data Appliance Image Info

IMAGE_VERSION : 4.10.0
IMAGE_CREATION_DATE : Fri Sep 29 04:58:55 UTC 2017
IMAGE_LABEL : BDA_MAIN_LINUX.X64_170927
LINUX_VERSION : Oracle Linux Server release 6.9
KERNEL_VERSION : 4.1.12-94.5.9.el6uek.x86_64
BDA_RPM_VERSION : bda-4.10.0-1.el6.x86_64
JDK_VERSION : jdk1.8.0_141-1.8.0_141-fcs.x86_64

Example showing v4.10.1:


[root@bda01 ~]# imageinfo
Big Data Appliance Image Info

IMAGE_VERSION : 4.10.1
IMAGE_CREATION_DATE : Tue Oct 24 06:34:17 UTC 2017
IMAGE_LABEL : BDA_4.10.0_LINUX.X64_RELEASE
LINUX_VERSION : Oracle Linux Server release 6.9
KERNEL_VERSION : 4.1.12-94.5.9.el6uek.x86_64
BDA_RPM_VERSION : bda-4.10.1-1.el6.x86_64
JDK_VERSION : jdk1.8.0_141-1.8.0_141-fcs.x86_64

If 4.10.0-3 is shown, go straight to Section Initial Configuration of the Network (below).
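The version rule above (4.10.1, or 4.10.0-x with x less than 3) can be applied to the BDA_RPM_VERSION value reported by imageinfo. A minimal sketch, assuming the rpm name has the form bda-<version>-<release>.el6.x86_64 as in the examples; needs_bda_rpm_update is a hypothetical helper name.

```shell
# Hypothetical helper: return 0 if the given BDA rpm name needs the 4.10.0-3
# update, 1 if it does not, and 2 if the name cannot be interpreted.
needs_bda_rpm_update() {
    # Extract "<version> <release>", e.g. "4.10.0 1" from bda-4.10.0-1.el6.x86_64
    parsed=$(printf '%s\n' "$1" |
        sed -n 's/^bda-\([0-9.]*\)-\([0-9][0-9]*\)\..*$/\1 \2/p')
    version=${parsed% *}
    release=${parsed#* }
    case "$version" in
        4.10.1) return 0 ;;                  # 4.10.1 always needs the update
        4.10.0) [ "$release" -lt 3 ] ;;      # 4.10.0-x needs it when x < 3
        *)      return 2 ;;                  # unknown or unparsable version
    esac
}
```

Possible usage: needs_bda_rpm_update "$(rpm -q bda)" && echo "update required"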


Copy the BDA v4.10.0-3 rpm (downloaded during the Preparation activities on page 5 of
this checklist) to the first node in the rack. For example, use scp or equivalent from your
laptop to transfer into /tmp.
Copy this new BDA rpm to all nodes of the rack (using the factory default admin IP
addresses):
[root@bda01 ~]# dcli -d /tmp -f /tmp/bda-4.10.0-3.el6.x86_64.rpm

Update the BDA rpm on all nodes & remove the package afterwards:
[root@bda01 ~]# dcli "rpm -Uvh /tmp/bda-4.10.0-3.el6.x86_64.rpm"
[root@bda01 ~]# dcli "rm -f /tmp/bda-4.10.0-3.el6.x86_64.rpm"

After applying the patch, imageinfo will NOT be changed, but the rpm version will
be updated. To see the rpm version update run the following command.
$ rpm -q bda
bda-4.10.0-3.el6.x86_64

INITIAL CONFIGURATION OF THE NETWORK
The HW installation engineer will now perform the initial configuration of the network
from the customer's configuration, using the network setup JSON files that the ACS
engineer provides via the installation tracking website. Two scripts are used to set up the
network: one is run before connecting the Cisco and Client 10GbE to the customer's
networks and one is run after.
Next the customer-specific json files obtained from ACS will be copied to the BDA Server
1 via scp or USB thumb drive. Detailed instructions for using the USB drive follow; if
using scp go to the Section COPY CUSTOMER_SPECIFIC NETWORK.JSON FILE on the
next page.
COPYING FILE TO BDA SERVER 1 FROM USB MEDIA
Insert the USB drive (see footnote 3) into the first server (bda01) and locate the drive:
# for x in `blkid | cut -d: -f1 | grep -i sd` ; do udevadm info -q property -n $x \
| grep -iq "id_bus=usb" ; if [ $? -eq 0 ] ; then echo $x ; fi ; done
/dev/sdb1
#

From the above output, one can see that the USB primary partition to mount is /dev/sdb1.
Alternative method for locating the USB partition to mount:
1. Locate the drive from the kernel messages log (some lines below have wrapped):
[root@node8 oracle.SupportTools]# tail -20 /var/log/messages
Apr 15 16:15:48 node8 lsidiagd-monitor: lsidiagd is alive
Apr 15 20:26:06 node8 kernel: usb 2-1.4: new high speed USB device number 3 using ehci_hcd
Apr 15 20:26:06 node8 kernel: usb 2-1.4: New USB device found, idVendor=0781, idProduct=5530
Apr 15 20:26:06 node8 kernel: usb 2-1.4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
Apr 15 20:26:06 node8 kernel: usb 2-1.4: Product: Cruzer
Apr 15 20:26:06 node8 kernel: usb 2-1.4: SerialNumber: 20060775110297303815
Apr 15 20:26:06 node8 kernel: scsi7 : usb-storage 2-1.4:1.0
Apr 15 20:26:07 node8 kernel: scsi 7:0:0:0: Direct-Access SanDisk Cruzer 1.01 PQ:
0 ANSI: 2
Apr 15 20:26:07 node8 kernel: sd 7:0:0:0: Attached scsi generic sg1 type 0
Apr 15 20:26:07 node8 kernel: sd 7:0:0:0: [sdb] 15695871 512-byte logical blocks: (8.03 GB/7.48
GiB)
Apr 15 20:26:07 node8 kernel: sd 7:0:0:0: [sdb] Write Protect is off
Apr 15 20:26:07 node8 kernel: sd 7:0:0:0: [sdb] No Caching mode page present
Apr 15 20:26:07 node8 kernel: sd 7:0:0:0: [sdb] Assuming drive cache: write through
Apr 15 20:26:07 node8 kernel: sd 7:0:0:0: [sdb] No Caching mode page present
Apr 15 20:26:07 node8 kernel: sd 7:0:0:0: [sdb] Assuming drive cache: write through
Apr 15 20:26:07 node8 kernel: sdb: sdb1
Apr 15 20:26:07 node8 kernel: sd 7:0:0:0: [sdb] No Caching mode page present
Apr 15 20:26:07 node8 kernel: sd 7:0:0:0: [sdb] Assuming drive cache: write through
Apr 15 20:26:07 node8 kernel: sd 7:0:0:0: [sdb] Attached SCSI removable disk

From the above one can see that the new device is sdb.
2. Output from fdisk for a FAT32-formatted USB stick:
[root@node8 oracle.SupportTools]# fdisk -l /dev/sdb
Disk /dev/sdb: 8036 MB, 8036285952 bytes
255 heads, 63 sectors/track, 977 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Device      Boot   Start   End   Blocks    Id   System
/dev/sdb1          1       977   7847721   b    W95 FAT32

From the above output one can see that the USB primary partition to mount is sdb1.

Footnote 3: Confirm that the file system type on the USB stick is FAT32 in order to prevent Linux
mounting problems during the installation.

Create a directory for mounting the USB drive on the first server using the following
command:
# mkdir /mnt/usb
Mount the device. Use the device name given in the first part of this subsection.
Example:
# mount -t vfat /dev/sdb1 /mnt/usb

COPY CUSTOMER_SPECIFIC NETWORK.JSON FILE


Copy and rename (remove the prefix) the customer-specific json files obtained from ACS
to the “/opt/oracle/bda” directory on BDA Server 1 via scp or USB thumb drive:
[root@bda01 ~]# cp xxxx-rack-network.json /opt/oracle/bda/rack-network.json
[root@bda01 ~]# cp xxxx-cluster-network.json /opt/oracle/bda/cluster-network.json

If using USB drive: Unmount the drive and remove it from the system:
# umount /mnt/usb
CONNECTING TO CUSTOMER'S MANAGEMENT NETWORK
If logged into the host via SSH, then logout and reconnect to the host’s ILOM to connect to
its console:
• Login via SSH to the server's default IP for ILOM on page 7.
• Login to the host “root” account with default password “welcome1”
This can also be done by a serial connection directly to the ILOM using the “SER MGT” port.
Example:
# ssh root@192.168.1.101
Password: welcome1
-> start -f /HOST/console
bda01 login:
User: root
Password: welcome1
You may have to press Enter to wake up the system first.
Since the network setup script restarts the host network services, if SSH is connected directly
to the host, then an error and option will be given to pass to the script to ignore this. In this
case the system will need to be reconnected to after the script is complete and not all output
may be seen.
The first script rack-networksetup pushes the configuration in the rack-network.json
file (as copied from the configuration output <rack-name>-rack-network.json file
above) with the customer specific network settings for the servers and ILOMs.
[root@bda01 ~]# ./rack-networksetup
rack-networksetup: do basic sanity checks on /opt/oracle/bda/rack-network.json
and /opt/oracle/bda/cluster-network.json
rack-networksetup: passed
rack-networksetup: Found 6 node(s) accessible
rack-networksetup: If this number is not the total number of nodes,
rack-networksetup: please make sure that all nodes are ON and connected correctly
rack-networksetup: Type 'r' if you want to retry (If you are waiting for nodes to
boot and become accessible)
rack-networksetup: Type 'n' to abort network setup
rack-networksetup: Type 'y' to continue configuring 6 node(s)
Continue using 6 nodes? [r/n/y]:
y
rack-networksetup: checking for rack-expansion.json
rack-networksetup: ping servers on ship admin network by ip
rack-networksetup: passed
...

If the above fails due to ssh keys, then follow the instruction given to run
“/opt/oracle/bda/bin/remove-root-ssh” first, and if necessary also “rm
/root/.ssh/authorized_keys ” then re-run “rack-networksetup”.
Since the network setup script also changes the ILOM network address, at the end of the
script the connection may appear to be hung. In fact it has simply changed networks and is
no longer accessible.
After completing the above step, the systems will have IP addresses suitable for the
Customer's network. The network on the node connected to will also be restarted. The
installer must now take steps so that their laptop is on the customer’s network range:
• Disconnect PDU A’s Ethernet cable.
• Change the installer laptop to PDU A’s IP and netmask (NO GATEWAY)
Now the “Reconnect” step in the next task should work.
Reconnect to the host’s ILOM to connect to its console:
• Login via SSH to the server's new IP for ILOM.
• Login to the host “root” account with default password “welcome1”
The "rack-networksetup" script should be complete, and the host console sitting at the root
prompt.
If you are on ILOM “SER MGT” port, reconnecting may not be necessary. If it is, press “Esc (“ to
return to the ILOM prompt before reconnecting to the console.
Example:
# ssh root@10.100.50.101
Password: welcome1
-> start -f /HOST/console
bda01 login:
User: root
Password: welcome1
You may have to press Enter to wake up the system first.
The script output will verify the other nodes are changed in its output. Verify the network
changes were done on the first node as well. This should show the customer’s IP address:
[root@bda1c01 network]# pwd
/opt/oracle/bda/network
[root@bda1c01 network]# ifconfig eth0

Connect the Cisco switch port 48 to the Customer's management network – the Customer's
network administrator may wish to perform this step. There is already a blue network
cable attached to this port, coiled and tied off within the cabinet. It may have been used for
earlier configuration work. It may be used for this connection, or replaced with a customer
supplied cable if it is not of sufficient length or is the wrong colour.
The Cisco switch should not be connected until the running configuration has been verified
and any necessary changes have been made by the customer's network administrator.
After connection ensure that you can get to the management addresses from outside the
switch.
If the customer wishes to use the SFP+ ports for a fibre uplink in port 48, then the interface
setting for port 48 needs to be changed as follows:
bda1sw-ip#configure terminal
Enter configuration commands, one per line. End with CNTL/Z.
bda1sw-ip(config)#interface gigabitEthernet 1/48
bda1sw-ip(config-if)#media-type sfp
bda1sw-ip(config-if)#end
bda1sw-ip#
*Sep 15 14:12:06.309: %SYS-5-CONFIG_I: Configured from console by console
bda1sw-ip#write memory
bda1sw-ip#copy running-config startup-config

Run the bdachecknet-rack command:


[root@bdalc01 bin]# bdachecknet-rack | tee -a /tmp/bdachecknet_rack.out
bdachecknet-rack: do basic sanity checks on /opt/oracle/bda/rack-network.json and
/opt/oracle/bda/cluster-network.json
bdachecknet-rack: passed
bdachecknet-rack: checking for rack-expansion.json
bdachecknet-rack: ping test private infiniband ips (bondib0 40gbs)
bdachecknet-rack: passed
bdachecknet-rack: ping test admin ips (eth0 1gbs)
bdachecknet-rack: passed
bdachecknet-rack: test client network (eoib) resolve and reverse resolve
bdachecknet-rack: passed
bdachecknet-rack: test client name array matches ip array
bdachecknet-rack: passed
bdachecknet-rack: ping servers on client network by ip
bdachecknet-rack: passed
bdachecknet-rack: test ntp servers
bdachecknet-rack: passed
bdachecknet-rack: ping client gateway
bdachecknet-rack: passed
bdachecknet-rack: test arp -a
bdachecknet-rack: passed
bdachecknet-rack: all checks succeeded
[root@bdax72bur09node01 bin]#

CONNECTING TO CUSTOMER'S CLIENT DATA NETWORK
Oracle recommends the following guidelines for connecting 10GbE connections to the IB
Gateway switches on the BDA:
• The same number of 10GbE connections should be made to both IB Gateway switches.
• If making 1 to 4 10GbE connections to each switch, use a single QSFP
splitter cable to the 0A-ETH port of both switches.
• If making 5 to 8 10GbE connections to each switch, use 2 QSFP splitter
cables to both 0A-ETH and 1A-ETH ports of both switches. In this case divide
connections as evenly as possible between the 2 splitter cables.
• When connecting multiple 10GbE connections to a single QSFP splitter cable, make
connections starting with the lowest numbered port and counting upwards e.g. for 2
connections use 0A-ETH1 and 0A-ETH2.
• 10GbE connections should be made to exactly the same ports on both IB Gateway
switches. If connections are made to 0A-ETH1 and 0A-ETH2 on one switch,
connections should only be made to 0A-ETH1 and 0A-ETH2 on the other switch.
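The guidelines above can be turned into a port list for a given number of 10GbE connections per switch. A minimal sketch only: gw_ports_for is a hypothetical helper name, and when the number of connections is odd (5 or 7) the choice to place the extra connection on the 0A-ETH splitter is an assumption, since the guidelines only ask for an even split.

```shell
# Hypothetical helper: print the gateway switch ports to use for the given
# number of 10GbE connections per switch (1 to 8), lowest ports first.
gw_ports_for() {
    n="$1"
    case "$n" in
        ''|*[!0-9]*) echo "usage: gw_ports_for <connections>" >&2; return 1 ;;
    esac
    if [ "$n" -lt 1 ] || [ "$n" -gt 8 ]; then
        echo "expected 1 to 8 connections per switch" >&2
        return 1
    fi
    if [ "$n" -le 4 ]; then
        first=$n; second=0               # single splitter cable on 0A-ETH
    else
        first=$(( (n + 1) / 2 ))         # assumption: extra one goes to 0A-ETH
        second=$(( n / 2 ))
    fi
    i=1
    while [ "$i" -le "$first" ]; do echo "0A-ETH$i"; i=$((i + 1)); done
    i=1
    while [ "$i" -le "$second" ]; do echo "1A-ETH$i"; i=$((i + 1)); done
}
```

The same list applies to both IB Gateway switches, since connections should be made to exactly the same ports on each.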
Connect 10Gb Ethernet cables between the BDA InfiniBand gateway switch ports and
the customer's network.
Once the cables are routed, the Customer's network administrator may need to perform
some network switch end configuration to bring the links up.
In simplest configuration there will be 1 cable each between the Customer's network and each
InfiniBand gateway switch. If there are additional network connections used the number of
cables will multiply accordingly.
Verify the 10GbE gateway switch links are up on both IB Gateway leaf switches. The
minimum supported configuration is 1 10GbE link on each IB Gateway leaf switch.
Connect to the IB gateway switch via SSH, and login as ilom-admin user. Then enter the
Fabric Management shell:
-> start /SYS/Fabric_Mgmt
Are you sure you want to start /SYS/Fabric_Mgmt (y/n)? y

NOTE: start /SYS/Fabric_Mgmt will launch a restricted Linux shell.
User can execute switch diagnosis, SM Configuration and IB
monitoring commands in the shell. To view the list of commands,
use "help" at rsh prompt.

Use exit command at rsh prompt to revert back to
ILOM shell.

FabMan@bda1sw-ib2->

Following Task to be Performed on both IB Gateway Switches 1 2
(Leaf 1 & 2 with hostnames ib2 & ib3 respectively)
Run the following, and check the Bridge entries have active links up for all ports that are
connected from the IB Gateway leaf switches to the customer's network switch. In this
example, 4 ports are connected on this switch:
FabMan@bda1sw-ib2-> listlinkup

Connector 0A-ETH Present
Bridge-0 Port 0A-ETH-1 (Bridge-0-2) up (Enabled)
Bridge-0 Port 0A-ETH-2 (Bridge-0-2) up (Enabled)
Bridge-0 Port 0A-ETH-3 (Bridge-0-1) up (Enabled)
Bridge-0 Port 0A-ETH-4 (Bridge-0-1) down (Enabled)
Connector 1A-ETH Present
Bridge-1 Port 1A-ETH-1 (Bridge-1-2) up (Enabled)
Bridge-1 Port 1A-ETH-2 (Bridge-1-2) up (Enabled)
Bridge-1 Port 1A-ETH-3 (Bridge-1-1) up (Enabled)
Bridge-1 Port 1A-ETH-4 (Bridge-1-1) down (Enabled)
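If the listlinkup output has been copied to the laptop, the ports that are not up can be picked out for comparison against the cabling plan. A minimal sketch, assuming the line format shown above; gw_ports_down is a hypothetical helper name. A port that is intentionally not cabled will also be listed.

```shell
# Hypothetical helper: read saved "listlinkup" output on stdin and print the
# name of every bridge port whose link state is anything other than "up".
gw_ports_down() {
    # Fields: Bridge-N  Port  <port name>  (<bridge>)  <state>  (<admin state>)
    awk '$2 == "Port" && $5 != "up" { print $3 }'
}
```

With the example output above this prints 0A-ETH-4 and 1A-ETH-4.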

Note: If the customer is using an Oracle Sun Network 10GbE Switch 72p, then the links
may not come up until the appropriate ports are enabled. Ensure the EIS checklist for that
switch has been completed as well.
Now verify that the DNS and NTP servers can be pinged from an IBGW switch. Log into
IBGW1 (ib2) as root and ping each of the customer specified 1G router, DNS and NTP
servers. If DNS can’t be contacted, cluster-networksetup (below) will likely fail.
If the router or DNS/NTP servers cannot be pinged, this should be resolved before moving
forward to cluster-networksetup.
Use the 2nd script cluster-networksetup to push out and create the VNICs and
10GbE bond interfaces on the servers:
[root@bda1c01 network]# ./cluster-networksetup

Make sure you capture the output to a file. The output file should be provided to your
Install Co-ordinator to upload to the install tracker so the ACS engineer can review this
prior to coming on-site to do the SW install.
Now reboot all the nodes and the InfiniBand Gateway switches.
Verify the VNICs are created properly on each switch as follows. Connect to the IB
gateway switch via ssh and login as ilom-admin user.
Then enter the Fabric Management shell:
-> start /SYS/Fabric_Mgmt
Are you sure you want to start /SYS/Fabric_Mgmt (y/n)? y

NOTE: start /SYS/Fabric_Mgmt will launch a restricted Linux shell.
User can execute switch diagnosis, SM Configuration and IB
monitoring commands in the shell. To view the list of commands,
use "help" at rsh prompt.

Use exit command at rsh prompt to revert back to
ILOM shell.

FabMan@bda1sw-ib2->

Each active port should be assigned the default VLAN (vlan id=0): 1 2
FabMan@hostname-> showvlan
Connector/LAG VLN PKEY
------------- --- ----
0A-ETH-1 0 ffff
0A-ETH-3 0 ffff
1A-ETH-1 0 ffff
1A-ETH-3 0 ffff

If the interfaces are configured with Link Aggregation (LAG), you will see this instead of
the connector port/link name:
FabMan@bda1sw-ib2->showvlan
Connector/LAG VLN PKEY
------------- --- ------
LAG-01 0 0xffff

Finally you should see VNICs created round-robin on each server and 10GbE interface
(output cut for brevity):
FabMan@hostname-> showvnics
ID  STATE    FLG IOA_GUID                NODE                             IID  MAC               VLN PKEY GW
--- -------- --- ----------------------- -------------------------------- ---- ----------------- --- ---- --------
561 UP       N   0021280001CF4C23        bda1node12 BDA 192.168.41.31     0000 CE:4C:23:85:2B:0A NO  ffff 0A-ETH-1
...<SNIP>

Also from the BDA nodes, make sure you can ping out of the 10Gb interfaces and verify that
the customer can ping the BDA nodes' 10Gb interfaces from the customer's data network.
NTP VERIFICATION ON THE INFINIBAND SWITCHES
Login as root to the IB switches and verify that the NTP service is running properly.
Note: The first example is of the internal Cisco NTP server. The second example is a
failure because NTP was not enabled on the NTP server. Note the transmit and receive. If
it is not responsive you will only see transmits with no receives.
First example:
[root@edx1sw-iba ~]# ntpdate -ud 129.148.9.196
6 Dec 17:49:40 ntpdate[30753]: ntpdate 4.2.6p2@1.2194-o Fri Jun 9 11:44:57 UTC 2017 (1)
Looking for host 129.148.9.196 and service ntp
host found : 129.148.9.196
transmit(129.148.9.196)
receive(129.148.9.196)
transmit(129.148.9.196)
receive(129.148.9.196)
transmit(129.148.9.196)
receive(129.148.9.196)
transmit(129.148.9.196)
receive(129.148.9.196)
transmit(129.148.9.196)
server 129.148.9.196, port 123
stratum 8, precision -19, leap 00, trust 000
refid [129.148.9.196], delay 0.02647, dispersion 0.00000
transmitted 4, in filter 4
reference time: ddd2ac39.95a4fe78 Wed, Dec 6 2017 17:49:45.584
originate timestamp: ddd2ac44.444d1403 Wed, Dec 6 2017 17:49:56.266
transmit timestamp: ddd2ac44.442bb16f Wed, Dec 6 2017 17:49:56.266
filter delay: 0.02657 0.02655 0.02647 0.02654
0.00000 0.00000 0.00000 0.00000
filter offset: 0.000021 0.000036 0.000021 0.000026
0.000000 0.000000 0.000000 0.000000
delay 0.02647, dispersion 0.00000
offset 0.000021

6 Dec 17:49:58 ntpdate[30753]: adjust time server 129.148.9.196 offset 0.000021 sec

Second Example:
[root@hyw1r002sw-iba01 ~]# ntpdate -ud 129.148.9.196
6 Dec 17:54:45 ntpdate[32489]: ntpdate 4.2.6p2@1.2194-o Fri Jun 9 11:44:57 UTC 2017 (1)
Looking for host 129.148.9.196 and service ntp
host found : 129.148.9.196transmit(129.148.9.196)
transmit(129.148.9.196)
transmit(129.148.9.196)
transmit(129.148.9.196)
transmit(129.148.9.196)129.148.9.196: Server dropped: no data
server 129.148.9.196, port 123
stratum 0, precision 0, leap 00, trust 000
refid [129.148.9.196], delay 0.00000, dispersion 64.00000
transmitted 4, in filter 4
reference time: 00000000.00000000 Thu, Feb 7 2036 6:28:16.000
originate timestamp: 00000000.00000000 Thu, Feb 7 2036 6:28:16.000
transmit timestamp: ddd2ad75.b4f468c3 Wed, Dec 6 2017 17:55:01.706
filter delay: 0.00000 0.00000 0.00000 0.00000
0.00000 0.00000 0.00000 0.00000
filter offset: 0.000000 0.000000 0.000000 0.000000
0.000000 0.000000 0.000000 0.000000
delay 0.00000, dispersion 64.00000
offset 0.000000

6 Dec 17:55:03 ntpdate[32489]: no server suitable for synchronization found
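
The transmit/receive comparison described in the note above can be automated on saved output. The following is a minimal sketch only, not part of the EIS tooling: the function name ntp_check is hypothetical, and it assumes the ntpdate debug output looks like the two examples above. It counts occurrences rather than lines, because transmit() can share a line with other text.

```shell
# Hypothetical helper: classify "ntpdate -ud <server>" debug output read from stdin.
# Success here means only that at least one transmit and one receive were seen.
ntp_check() {
    out=$(cat)
    # Count occurrences of the literal strings "transmit(" and "receive(".
    tx=$(printf '%s\n' "$out" | grep -oF 'transmit(' | wc -l | tr -d ' ')
    rx=$(printf '%s\n' "$out" | grep -oF 'receive(' | wc -l | tr -d ' ')
    if [ "$tx" -gt 0 ] && [ "$rx" -gt 0 ]; then
        echo "OK: $tx transmits, $rx receives"
    else
        echo "FAIL: $tx transmits, $rx receives"
        return 1
    fi
}

# Example use on a switch (not run here):
#   ntpdate -ud 129.148.9.196 2>&1 | ntp_check
```

A FAIL result corresponds to the second example, where the server never answers.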

VERIFYING NETWORK SERVICES ARE ACCESSIBLE


SSH keys may have changed with the change of network settings. Verify that the SSH keys still
work and that dcli is still available:
[root@bda1c01 ~]# cd /opt/oracle/bda
[root@bda1c01 bda]# dcli "hostname ; date"

If dcli asks you for a password, press Control-C (several times if needed) and continue with the
key setup below; otherwise go to the next step.
To generate the root SSH keys and push them across the rack with the default 'welcome1'
password, use the provided script:
[root@bda1c01 bda]# setup-root-ssh -p welcome1

Re-verify by re-running:
[root@bda1c01 bda]# dcli "hostname ; date"

Note that there is some uncertainty as to whether the above is entirely correct. Feedback
via the EIS Support alias would be appreciated.
Using the date output from above, check all systems and switches are clock synchronized and also
seeing the NTP server correctly.
If systems are not within a difference of a few seconds, then the installation scripts will fail.
Manually correct any times that are too large for NTP to correct automatically to something close
that NTP can correct.
If not, reboot each box or switch and monitor the console boot messages, or check the routing
through the Cisco switch to the NTP server.
Also ntpq -p 127.0.0.1 can be used to verify the NTP client is running and configuration is
correct.
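
The clock comparison described above can be reduced to a single number. The sketch below is illustrative only: the function name clock_spread is hypothetical, and it assumes that each input line ends with a time in epoch seconds, for example from dcli "date +%s" with each line prefixed by the node name (that prefix format is an assumption). Because the nodes are not sampled at exactly the same instant, a small spread is expected even on a synchronized rack.

```shell
# Hypothetical helper: read lines whose last field is an epoch timestamp
# and print the difference in seconds between the earliest and the latest.
clock_spread() {
    awk '$NF ~ /^[0-9]+$/ {
             t = $NF + 0
             if (n == 0 || t < min) min = t
             if (n == 0 || t > max) max = t
             n++
         }
         END { if (n > 0) print max - min }'
}

# Example use on the first node (not run here):
#   dcli "date +%s" | clock_spread
```

A result of more than a few seconds suggests that at least one system needs its time corrected before the installation scripts are run.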

Use the bdachecknet-cluster script to verify all network connectivity and expected
network services are working properly.
Pipe the output to tee in order to log the output to a file. The output file should be
provided to your Install Co-ordinator to upload to the install tracker so the ACS engineer
can review this prior to coming on-site to do the SW install.
[root@bda1c01 network]# bdachecknet-cluster | tee -a /tmp/bdachecknet_cluster.out
bdachecknet-cluster: do basic sanity checks on /opt/oracle/bda/rack-network.json
and /opt/oracle/bda/cluster-network.json
bdachecknet-cluster: passed
bdachecknet-cluster: checking for rack-expansion.json
bdachecknet-cluster: ping test private infiniband ips (bondib0 40gbs)
bdachecknet-cluster: passed
bdachecknet-cluster: ping test admin ips (eth0 1gbs)
bdachecknet-cluster: passed
bdachecknet-cluster: test client network (eoib) resolve and reverse resolve
bdachecknet-cluster: passed
bdachecknet-cluster: test client name array matches ip array
bdachecknet-cluster: passed
bdachecknet-cluster: ping servers on client network by ip
bdachecknet-cluster: passed
bdachecknet-cluster: test ntp servers
bdachecknet-cluster: passed
bdachecknet-cluster: ping client gateway
bdachecknet-cluster: passed
bdachecknet-cluster: test arp -a
bdachecknet-cluster: passed
bdachecknet-cluster: test vnics for this node
host if status actv primary switch gw
port ping gw vlan
======================= === ====== ==== ===== ==================== ========= ======= ====
bdax72bur09node01 eth8 up no no bdax72bur09sw-ib2 0A-ETH-1 no N/A
bdax72bur09node01 eth9 up yes yes bdax72bur09sw-ib3 0A-ETH-1 yes N/A
1Ping gtw error on host bdax72bur09node01, interface eth8, switch bdax72bur09sw-ib2, port
0A-ETH-1
bdachecknet-cluster: network checks failed
[root@bdacl01 bin]#
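
The sample run above ends with "network checks failed", so it is worth scanning the saved log before handing it over. The following is a rough sketch under stated assumptions: the function name checknet_scan is hypothetical, and it only looks for the words "failed" and "error" as they appear in the sample output, so an incomplete log would not be detected.

```shell
# Hypothetical helper: scan a saved bdachecknet-cluster log for problem lines.
# Prints the matching lines and returns 1 if any are found.
checknet_scan() {
    log=$1
    if grep -i -E 'failed|error' "$log"; then
        return 1
    fi
    echo "no failed or error lines in $log"
}

# Example use (not run here):
#   checknet_scan /tmp/bdachecknet_cluster.out
```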

If systems are not able to resolve and use DNS names, then the installation scripts will fail.
Manually correct any /etc/resolv.conf files and verify the reverse lookup as well.
Verify the PDU metering units are accessible from a laptop or system connected to the Customer's
management network.
Use a web browser to log on to the PDU metering unit.
Enter the metering unit’s static IP address or hostname into the browser’s address line. If the
network configuration was successful, the browser displays the Current Measurement page.
If the PDUs are not accessible then the installation scripts will fail.

Task Check

AUTOMATIC SERVICE REQUEST (ASR) MANAGER SETUP


When a server is being installed according to the EIS Methodology and ASR is to be
configured, it is assumed that the customer already has a configured and operational ASR
Manager system available. ASR setup on the BDA rack itself is done by ACS during the
steps of the BDAMammoth utility (with the exception of the InfiniBand switches, which are
configured by the FSE below). Those steps will fail if the ASR Manager is not already
installed, set up and accessible from the BDA.
If the customer's ASR Manager host is not available now and will not be available by the
time ACS is on-site, then alert the ACS engineer so they can change the
configuration setup files to skip the ASR steps of the BDAMammoth utility.

Task Comment Check

CONFIGURE THE INFINIBAND SWITCHES TO USE ASR


If the rack is not to be configured for ASR, go to the BDA MAMMOTH UTILITY section (page 51) in
this checklist.
The ASR activation step within the BDAMammoth utility does NOT include the InfiniBand
switches – hence this task needs to be performed here. Further details are available in MOS
Document ID 1902710.1.
For the InfiniBand switches there are three steps:
1. Enabling the ILOM telemetry.
2. Activating the ASR Asset on the ASR Manager system.
3. Completion of activation in MOS (will be done by ACS engineer).
In this checklist the Destination Port (the SNMP listener port, Default value = 0) should be
changed to 162, unless the customer needs to use a different port, in which case the port needs to
be changed on the ASR Manager system as well (see Section 4.12.2 of the ASR Installation and
Operations Guide).
ON EACH SWITCH: ENABLING ILOM TELEMETRY 1 2 3

Connect a serial cable between the IB switch's USB serial adapter and a laptop or similar device.
Login as ilom-admin:
localhost: ilom-admin
password: welcome1

Select an Alert ID that is not used – there is no way to display all Alert IDs so you have to
step through until an unused ID is found. The following examples assume ID=1.
-> cd /SP/alertmgmt/rules
-> show 1
......
-> show 15
An unused ID is indicated by the line:
level = disable

Change directory to the selected unused Alert ID (1 – 15).
-> cd /SP/alertmgmt/rules/<ID>
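
If the serial session is logged to a file on the laptop, the search for an unused Alert ID can be done on the captured text. This is a sketch only: the function name first_unused_rule is hypothetical, and it assumes that each show output begins with a line such as /SP/alertmgmt/rules/1 and contains a level = ... line, as in the show output for a rule in this checklist.

```shell
# Hypothetical helper: read a captured ILOM session from stdin and print the
# first rule ID whose properties include "level = disable".
first_unused_rule() {
    # Serial captures often contain carriage returns, so strip them first.
    tr -d '\r' | awk '
        /^ *\/SP\/alertmgmt\/rules\/[0-9]+ *$/ { n = split($1, p, "/"); id = p[n] }
        $1 == "level" && $2 == "=" && $3 == "disable" { print id; exit }'
}

# Example use on the laptop (the capture file name is an assumption):
#   first_unused_rule < ilom_session.log
```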

Set various values:
-> set destination=<ASR_mgr_IP> <<< IP address of ASR Manager.
-> set destination_port=162
-> set level=minor
-> set snmp_version=2c
-> set community_or_username=public
-> set type=snmptrap

The Destination Port should be set to 162 (see note above).


Verify the settings for the selected Alert ID:
-> show <ID>

/SP/alertmgmt/rules/1
Targets:

Properties:
community_or_username = public
destination = <ASR_mgr_IP> <<< IP address of ASR Manager.
destination_port = 162
email_custom_sender = (none)
email_message_prefix = (none)
event_class_filter = (none)
event_type_filter = (none)
level = minor
snmp_version = 2c
testrule = (Cannot show property)
type = snmptrap
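
The comparison of the show output with the values set earlier can also be scripted against captured text. The following is a minimal sketch, assuming the session is logged on the laptop: the function name check_alert_rule is hypothetical, and the expected values are the ones given in this checklist (change destination_port if the customer uses a different port). The destination address is not checked because it is site specific.

```shell
# Hypothetical helper: read "show <ID>" output from stdin and compare the
# fixed properties with the values this checklist sets.
check_alert_rule() {
    rule_out=$(tr -d '\r')
    rc=0
    for want in "destination_port=162" "level=minor" "snmp_version=2c" \
                "type=snmptrap" "community_or_username=public"; do
        key=${want%%=*}
        val=${want#*=}
        # Lines have the form "<key> = <value>".
        got=$(printf '%s\n' "$rule_out" |
              awk -v k="$key" '$1 == k && $2 == "=" { print $3; exit }')
        if [ "$got" != "$val" ]; then
            echo "MISMATCH: $key is '$got', expected '$val'"
            rc=1
        fi
    done
    if [ "$rc" -eq 0 ]; then echo "rule settings match"; fi
    return "$rc"
}
```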

Log out from the switch’s SP.
-> exit

Disconnect the laptop's serial cable from the IB switch's USB to DB9 adapter port, leaving
the USB to DB9 adapter wired in the rack, and move to the next IB switch.
ACTIVATE THE ASR ASSET
Log into the ASR Manager system as user root (via ssh or rlogin).
Activate the ASR Asset for the switch’s SP.
# asr activate_asset -i <IP-of-SW>
During the ASR registration the technical contact for the system(s) will receive an email
with report results.
Confirm that the switch has been activated via asr list_asset. If you omit the -i option you
will see all assets – this may be a very long list.
# asr list_asset -i <IP-of-SW>
IP_ADDRESS HOST_NAME SERIAL_NUMBER ASR PROTOCOL SOURCE PRODUCT_NAME
------------- --------- ------------- ------- -------- ------ -------------
10.172.144.76 MySwitch 1013AK208D Enabled SNMP ILOM Sun Network QDR InfiniBand GW
Switch
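
The confirmation step above can be checked mechanically. The sketch below is illustrative only: the function name asset_enabled is hypothetical, and it assumes the asr list_asset output has the column layout shown above, with the IP address in the first column, the ASR state in the fourth, and no spaces in the host name or serial number.

```shell
# Hypothetical helper: read "asr list_asset" output from stdin and succeed
# only if the row for the given IP address shows ASR as Enabled.
asset_enabled() {
    awk -v ip="$1" '
        $1 == ip { found = 1; if ($4 == "Enabled") ok = 1 }
        END { exit !(found && ok) }'
}

# Example use on the ASR Manager (not run here):
#   asr list_asset -i <IP-of-SW> | asset_enabled <IP-of-SW> && echo "activated"
```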

Disconnect from the ASR manager.
# exit

Task Comment Check

BDA MAMMOTH UTILITY


After completing the above steps, the systems will be on the Customer's network and ready
for the BDA Customer software bundle, which ACS will install and configure using the
“BDAMammoth” utility.

Task Comment Check

HANDOVER
The installer should ensure that their laptop’s Ethernet cable is disconnected and that the
PDU A Ethernet cable is plugged back in.
Hand over to ORACLE ACS Software Engineer (ASE).
• Ensure the file outputs /tmp/bdachecknet_*.out from the bdachecknet-rack,
bdachecknet-cluster and cluster-networksetup scripts are given to the Install
Co-ordinator to upload to the install tracker for use by ACS.
• If you are installing a Multi-rack at the same time as a BDA rack, continue now with
the EIS installation checklist for each additional rack.
• Please note that multi-rack cabling can only be performed after the network
configuration of all racks has been completed. For this and other prerequisites, refer to the
Multi-Rack Cabling EIS Checklist.

Copies of the checklists are available on the EIS web pages or on the EIS-DVD. We recommend that you always check
the web pages for the latest version.

Comments & RFEs welcome. Oracle staff should mail to EIS-SUPPORT_WW_GRP@oracle.com .


Partners should mail to: SUPPORT-PARTNER-QUESTIONS_WW_GRP@oracle.com .
