Administration
ES-400
Student Guide
Contents
About This Course
Course Map
Module-by-Module Overview
Appendices
Course Objectives
Topics Not Covered
Introductions
How to Use the Course Materials
How to Use the Icons
Typographical Conventions and Symbols
Ultra Enterprise 10000 Capabilities and Features
Course Map
Relevance
Objectives
Ultra Enterprise 10000
Ultra Enterprise 10000 Features
Ultra Enterprise 10000 System Cabinet
Installing AP
Limitations for AP 2.0 (Solaris 2.5.1)
Dynamic Reconfiguration
Operating System Support
Solaris Binary Compatibility
SSP Operating System Levels
Operating System Enhancements
The SSP
SSP Logical Connectivity
The SSP User Environment
System Accounts
SSP Window
Network Console Window
Hostview
Copyright 1998 Sun Microsystems, Inc. All Rights Reserved. SunService June 1998
Hardware Configuration Control
Blacklist
Figure of Merit
Diagnostic and Monitoring Tools
Bringup
Power On Self Test
OpenBoot PROM
Status Monitoring and Display
SSP
SunVTS
redx
Resiliency Features
DC Power
System Boards
Processors
Memory
I/O Interface Subsystem
Redundant Components
Concurrent Serviceability
Error Logging
Check Your Progress
Think Beyond
Architecture Overview
Course Map
Relevance
Objectives
Enterprise 10000 Packaging
Enterprise 10000 Component List
Data Interconnects
Data Paths
Address Paths
Centerplane Configurability
The System Board
Logical View
Physical View (SBus I/O)
Mezzanine (Daughter) Board Packaging
SBus Mezzanine Packaging
PCI Mezzanine Packaging
Memory Subsystem
I/O Subsystem
Ultra Port Architecture
Logical View
JTAG
Support Boards
Control Board
Objectives
Security Considerations
Introduction
General Comments on Security
Physical Security
System Security
Network Security
SSP and Control Board Software Block Diagram
Instances of Client Programs and Daemons
Platform Clients
Domain Clients
SSP Platform Client Reference
SSP Domain Client Reference
SSP Daemon Summary
SSP Daemons
Control Board Server (cbs)
The cb_reset Command
The cb_prom Command
Event Detector Daemon (edd)
Event Detector Daemon (edd) Event Handling
Event Detector Daemon (edd) Control
File Access Daemon (fad)
Network Time Protocol Daemon (xntpd)
The SNMP Daemon (snmpd)
SNMP Trap Sink Server (straps)
machine_server
Domain Support Executables
System Operation
The hostinfo Command
hostview
hostview Performance Considerations
hostview Main Window
Main Window Processor Symbols
Selecting Items in the Main Window
Help Window
Main Window Buttons
The Failure Window
SSP Log Files
Viewing a Messages File With hostview
Administering Power
The power Command
Examples
Automatic Recovery From a Power Outage
Power Control From Hostview
Monitoring Power Levels in Hostview
Monitoring Temperature in Hostview
Blacklisting Boards and Buses With hostview
Blacklisting Processors With hostview
Clearing the Blacklist File
Processor Sets
Lab
Check Your Progress
Think Beyond
Installing Solaris in a Host Domain
Course Map
Relevance
Objectives
The Enterprise 10000 Solaris Environment
Configuring the SSP as a Boot Server
Preparing the Domain
Installing Solaris
Booting the Domain for the First Time
Installing Packages From the 2.6 SMCC Server Supplement CD-ROM
Installing Packages From the 2.5.1 SMCC Hardware Updates CD-ROM
Finishing the Installation - Solaris 2.6
Finishing the Installation - Solaris 2.5.1
Preinstalled Domain Software
Lab
Check Your Progress
Think Beyond
System Boot Process
Relevance
Objectives
The SSP Boot Process
Prepare the SSP
SSP Boot Process
Daemon Start Up
The ssp_startup Script
Restartable Daemons
Domain Bringup Flow
The bringup Command
Syntax
Execution
The hpost Command
Syntax
Functions
hpost Control Files
.postrc
Basic Alternate Pathing Concepts
Physical Paths
Meta-Disk
Disk Pathgroup
Meta-Network
Network Pathgroup
Sample AP Configurations
AP With Mirroring
Device Paths
The AP State Database
AP Database Configuration Considerations
Creating the AP Database
The apdb Command
Refreshing the Databases
AP Databases on Alternate Pathed Disks
Viewing AP Database Status
The apconfig Command
Deleting a Copy of the AP Database
Viewing Pathgroup Information
Viewing Network Entries
Uncommitted Network Entries
Committed Network Entries
Planning Network Pathgroups and Meta-devices
Meta-Network Interfaces
FDDI Devices
Creating a Network Pathgroup
Activating the Meta-Device
FDDI Setup Considerations
Contacting the IEEE
Switching a Network Pathgroup
Deleting a Network Pathgroup
Reversing an Uncommitted Delete
Alternately Pathing the Primary Network Interface
Boot Time Interface Failure
Viewing Disk Entries
Uncommitted Disk Entries
Committed Disk Entries
Disk Path Components
Planning a Disk Pathgroup and Meta-disks
Meta-disk Configuration Example
Creating a Disk Pathgroup and Meta-disks
Using the Meta-devices
Disk Managers and AP
Using Volume Manager With AP
Disabling DMP
Using SDS With AP
complete_detach
reconfig
Finishing the Complete Detach Operation
Configuring for DR Detach
Enabling DR Detach
I/O Devices
Detaching Network Devices
FDDI
Causes for DR Failure
Detaching Non-Network Devices
DR Detach-Safe Devices
Declaring a Driver Detach-Safe
Unloading a Loaded Detach-Unsafe Driver
Using modunload
Correctable Errors
Detaching a Board With dr
Aborting the Detach Operation
Detaching a Board With hostview
Beginning the Detach
hostview Detach Buttons
Pageable and Permanent Memory
Operation: Permanent Memory on the Target Board
Operating System Quiesce
Operating System Quiesce Failures
Suspend-Safe and Suspend-Unsafe Devices
Tape Devices
Adding New Suspend-Safe Drivers
Adding New Suspend-Bypass Drivers
Quiesce Operation
DR and AP Interaction
DR Attach
DR Detach
Lab
Part 1: Using hostview
Part 2: Using the Command Line
Check Your Progress
Think Beyond
Diagnostic Information
Course Map
Relevance
Objectives
Standard Domain Message Logs
Bus Configurations and the Figure of Merit
Sample FOM Calculation
Redlist and Blacklist Files
About This Course
Course Goal
This course is designed to introduce students to the Ultra™ Enterprise™
10000 system. It explains the capabilities and configuration of the
system, shows how to load the software, discusses the operation and
management of the system and the configuration and use of its special
capabilities, and describes how to troubleshoot failures.
Course Map
Each module begins with a course map that enables you to see what
you have accomplished and where you are going in reference to the
course goal. A complete map of this course is shown below.
This module teaches you how to plan for and install the System
Service Processor software, and perform basic SSP administration
and setup tasks.
● Module 5 – Domains
This module describes how the SSP and the Enterprise 10000 and
its domains boot. It discusses all the Enterprise 10000-specific
commands, daemons, and configuration files used in the boot
process.
Appendices
● Appendix A – Configuring NTP
● Control the Enterprise 10000 system from the command line and
from Hostview.
This course does not cover the topics shown on the above overhead.
Many of these topics are covered in other courses offered by Sun
Educational Services (SES). Refer to the Sun Educational Services
catalog for specific information and registration.
Now that you have been introduced to the course, introduce yourselves
to each other and to the instructor, addressing the items shown on the
above overhead.
Typeface or Symbol    Meaning    Example
Course Map
This module discusses the capabilities and features of the Sun Ultra
Enterprise 10000 system. It discusses the system hardware and
software components and describes the packaging of the system,
system configuration, and some of the Enterprise 10000 system’s
special features such as Alternate Pathing and Dynamic
Reconfiguration.
Relevance
Objectives
References
[Figure: Enterprise 10000 system cabinet. Callouts: system service processor (SSP), access panel, styling panel.]
The Enterprise 10000 cabinet is 1.8 m (70") high, 1 m (39") wide, and
1.3 m (50") deep. Fully configured, it weighs 638 kg (1400 pounds) and
draws 13.6 kVA of power.
● System boards
● Centerplane
● Control boards
● Cooling subsystem
The system boards house the processors, I/O interface modules, SBus
cards, and system memory.
System Domains
Alternate Pathing
Alternate Pathing (AP) provides a domain with the ability to have two
paths to the same I/O device, providing redundancy at the level of the
I/O controller and cabling.
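The idea of two paths to one device can be pictured with a small model. The following is only an illustrative sketch: the `PathGroup` class, its fields, and the controller names are hypothetical and are not part of the AP software or its commands.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class PathGroup:
    """Hypothetical model of two physical paths to one I/O device."""
    device: str
    paths: List[str]                 # illustrative controller names
    active: int = 0                  # index of the path currently in use
    failed: Set[str] = field(default_factory=set)

    def active_path(self) -> str:
        """Return the name of the path that currently carries I/O."""
        return self.paths[self.active]

    def switch(self) -> str:
        """Make another healthy path active; raise if none is left."""
        for i, path in enumerate(self.paths):
            if i != self.active and path not in self.failed:
                self.active = i
                return path
        raise RuntimeError(f"no healthy alternate path to {self.device}")


group = PathGroup(device="disk0", paths=["controller_a", "controller_b"])
group.failed.add("controller_a")   # simulate a controller or cable failure
new_path = group.switch()          # I/O continues through the second controller
```

The model captures only the redundancy idea: the device stays reachable as long as one controller and its cabling remain healthy.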
Installing AP
In Solaris 2.6, AP Version 2.1 is installed from the SMCC Supplements
CD-ROM. Its documentation can be found in the Hardware
AnswerBook (SUNWabhdw) on the same CD-ROM.
For Solaris 2.5.1, AP 2.0 is installed from its own CD-ROM and comes
with its own AnswerBook.
AP software components are installed on both the SSP and the host
domain.
You will also need to manually switch both disk and network active
alternate paths during DR operations.
Dynamic Reconfiguration
For Solaris 2.6, DR is installed with the operating system. For Solaris
2.5.1, DR is installed from the Solaris SMCC Updates CD-ROM, and
comes with its own AnswerBook.
Only some of the architecture specific binaries are different, and all
appropriate standard Sun Solaris patches will install.
The SSP software will not run with Solaris 2.6 on the SSP, but will still
support Solaris 2.6 running in a domain.
● Resource management
● Parallel processing
● Memory management
These include:
● Dynamic Reconfiguration
● Alternate Pathing
The SSP
The System Service Processor (SSP) enables you to control and monitor
the Enterprise 10000 system. The SSP is built from a SPARCstation™ 5
with 64 Mbytes of random access memory (RAM) and a 1-Gbyte disk.
A CD-ROM is included for loading software onto the Enterprise 10000
system.
The SSP runs a number of daemons that control and monitor the
Enterprise 10000 and its domains.
● Hostview
● Boot domains.
● Create domains.
● Keeps logs of the interactions between the SSP and the domains.
The following diagram shows the logical external connectivity for the
Enterprise 10000 host and SSP. This diagram does not show the actual
physical network connections, which are discussed in Module 3.
[Figure: SSP logical connectivity. Labels: customer Ethernet; SBus Ethernet; to control board; to optional control board; to SSP; to optional SSP; remote support; transceiver (optional); telephone cable; modem; DTE; optional second SSP.]
System Accounts
There are two accounts on the SSP system, root and ssp. These are
used to manage the SSP itself and the Enterprise 10000, respectively.
SSP Window
An SSP window is a normal OpenWindows window into the Solaris
and SSP environments of the SSP itself.
To bring up an SSP window, you must log in as user ssp. You are then
prompted for the name of a domain that you want to manage. You can
switch the domain that you are managing at any time.
You can run the display software remotely by setting the DISPLAY
environment variable and granting display access with xhost.
Remember that the Enterprise 10000 does not have a directly attached
console; it has no keyboard or serial ports. You can communicate with
it only over Ethernet.
Hostview
The Hostview program provides a graphical user interface (GUI)
that offers the same functionality as many of the SSP commands.
Hostview enables you to perform the following actions:
● Create domains.
Blacklist
The blacklist file lists system components, such as central
processing units (CPUs), address buses, I/O slots, or lower-level
subcomponents that are not to be included in the domain the next time
that the domain is configured. It can be modified by the user as
necessary.
Functional parts can also be placed in the blacklist file for
diagnostic, benchmarking, testing, and similar purposes.
The system will never automatically modify the blacklist file under
any circumstances.
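The effect of the blacklist on configuration can be sketched as a simple filter. This is a minimal illustration under stated assumptions: the function name and the component names are made up, and the lists do not follow the real blacklist file syntax.

```python
def configurable_components(installed, blacklist):
    """Return the installed components that are not blacklisted, in order."""
    excluded = set(blacklist)
    return [name for name in installed if name not in excluded]


# Component names below are invented for illustration only; they do not
# reflect the actual blacklist file format.
installed = ["cpu0", "cpu1", "cpu2", "io_slot0", "address_bus1"]
blacklist = ["cpu1", "address_bus1"]
usable = configurable_components(installed, blacklist)
```

Because only the user edits the list, removing an entry from it makes the component eligible again the next time the domain is configured.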
Figure of Merit
During domain configuration, the SSP determines the best possible
domain hardware configuration by assigning a figure of merit (FOM)
to each possible hardware configuration for the domain (a total of 45).
It then chooses the configuration with the highest FOM.
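The selection step can be sketched as an enumerate-and-score loop. Everything specific below is an assumption made for illustration: the breakdown into four address buses and two data-path halves (with at least one of each in use) is one enumeration that happens to give 45 combinations, and the scoring rule is hypothetical rather than the actual FOM formula.

```python
from itertools import combinations


def nonempty_subsets(items):
    """Yield every non-empty subset of items as a tuple."""
    for size in range(1, len(items) + 1):
        yield from combinations(items, size)


# ASSUMPTION: four address buses and two data-path halves, with at least
# one of each in use. This is not stated in the text above.
ADDRESS_BUSES = ("a0", "a1", "a2", "a3")
DATA_HALVES = ("d0", "d1")

configs = [(a, d)
           for a in nonempty_subsets(ADDRESS_BUSES)
           for d in nonempty_subsets(DATA_HALVES)]


def figure_of_merit(config, broken=frozenset()):
    """Hypothetical score: 0 if a broken part is used, else the part count."""
    address, data = config
    if broken & set(address + data):
        return 0
    return len(address) + len(data)


# Pick the highest-scoring configuration when address bus a2 is unusable.
best = max(configs, key=lambda c: figure_of_merit(c, broken=frozenset({"a2"})))
```

With one address bus marked as broken, the highest-scoring choice is the configuration that uses every remaining bus and both data-path halves.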
Bringup
Bring-up diagnostics provide static, repeatable testing that catches most
hard errors. Bring-up diagnostics log all failures to the system or
domain log file on the SSP. They can be run at varying levels of depth,
depending on the situation.
OpenBoot PROM
The primary task of the OpenBoot firmware is to record the domain
configuration and boot the operating system from either a disk or a
network interface. The firmware also provides extensive features for
testing the hardware and software interactively. It is very similar to the
OBP software found in Sun’s other processors.
SSP
The SSP is the primary provider of services related to monitoring and
reporting on the status of the machine.
SunVTS
SunVTS, the on-line validation test suite, tests and validates hardware
functionality by running multiple diagnostic hardware tests on
configured controllers and devices.
redx
redx is an internal-use-only interactive hardware debugger for the
Enterprise 10000, like a hardware version of adb.
Resiliency Features
DC Power
The Enterprise 10000 system logic direct current (DC) power system is
managed at the system board itself. The 48-volt DC power is supplied
through a circuit protector to each system board. The 48-volt power is
converted through several small DC-to-DC converters on the board to
the specific lower voltages needed directly on the system board.
Failure of a DC-to-DC converter will affect only that particular system
board.
System Boards
System boards can be removed from and inserted into a powered on
and operating Enterprise 10000 system (hot swap) for servicing the on-
board components.
Processors
A failed UltraSPARC™ processor can be isolated from the remainder
of the system by the POST process. As long as there is at least one
functioning processor in the configuration, the domain may be used.
Memory
There is one memory controller on each system board. If it fails, only
the memory on that system board is unavailable. As long as there is
sufficient memory left for the domain to run, it can be used.
Alternate Pathing support for network interfaces and disk arrays can
provide the ability to transparently recover from most of these kinds of
failure.
Redundant Components
● Control boards
● Disk storage
Note that input power itself cannot be made redundant by the system.
Concurrent Serviceability
Failing components are identified in the SSP failure logs in such a way
that the field-replaceable unit is clearly identified. Repair can be made
quickly and with only minor disruption, if any, usually at a convenient
time.
Error Logging
The SSP detects these errors by polling the control boards on a regular
basis. If the error is fatal, the affected domain is stopped, error log
information is collected by the SSP, and the domain is automatically
rebooted.
Before continuing on to the next module, check that you are able to
accomplish or answer the following:
Think Beyond
How does the Enterprise 10000 load the operating system if the
console communicates over the Ethernet?
Architecture Overview 2
Course Map
This module describes the architecture, construction, layout, and basic
hardware operation of the Enterprise 10000 system.
Relevance
1. How is the Enterprise 10000 different from the other Sun servers?
Objectives
References
● Sixteen fan trays, containing two muffin fans per tray (32 total
fans). An entire tray is hot swappable. Fan speed is controlled by
the SSP.
Component                    Function                                                 Quantity per system
Centerplane                  Contains address and data interconnect to all system     1 (2 logical halves)
                             boards
Centerplane support board    Provides the centerplane JTAG, clock, and control        2
                             functions
System board                 Contains processors, memory, I/O subsystem, SBus and PCI Up to 16
                             boards, and power converters
Processor modules            Mezzanine boards that contain the UltraSPARC processor   Up to 64
                             and support chips
Memory                       Removable DIMMs (dual in-line memory modules)            Up to 16
I/O                          Removable SBus boards                                    Up to 64
Control board                Controls the system JTAG, clock, fan, power, and         Up to 2
                             Ethernet interface functions
48-volt power system
AC input module              Receives 220-volt AC, monitors it, and passes it to the  Up to 4
                             power supplies
48-volt power supply         Converts AC power to 48-volt DC                          5 or 8
Circuit breaker panel        Interrupts power to various components within the system 1
AC power sequencer           Receives 220-volt AC, monitors it, and passes it to the  1 or more
                             peripherals
Peripheral power supply      Converts AC power to DC for peripherals                  (In peripheral cabinet)
Remote power control module  Connects the remote control line between two control     1
                             boards and passes it to a master AC sequencer
Fan centerplane              Provides power to the pluggable fan trays                2
Fan trays                    Each fan tray contains two fans for system cooling       5 to 8 pairs
Component Locations
● CB – Control board
● PS – AC power supply
● SB – System board
Figure: component locations on one side of the cabinet, showing AC2 and AC3, PS4 through PS7, FT8 through FT15, CSB1, CB1, and SB8 through SB15.
Figure: component locations on the other side of the cabinet, showing AC0 and AC1, PS0 through PS3, PDU, RPC0 through RPC4, FT0 through FT7, CSB0, CB0, and SB0 through SB7.
Data Interconnects
Data Paths
The data bus consists of two pairs of unidirectional, two-level, 16 x 16
crossbar switches that transfer data packets between the 16 system
boards. This means that each system board is connected directly to
every other system board through the centerplane.
System data paths are separate 144-bit-wide data paths to and from
each system board slot. If all system boards request different
destinations, the system could do 16 simultaneous 64-byte transfers.
However, if two boards request the same destination, one must wait.
The data bus has a theoretical bandwidth of 21.3 Gbytes per second,
but in normal operation, 10.7 Gbytes per second is the limit based on a
combination of the data and address path bandwidths.
Address Paths
The Enterprise 10000 system provides four hardware address buses.
Each bus can be used to make data transfer requests to another
location in the domain, either on the same or on a different system
board. These buses are 48 bits wide including error correcting code
bits. Each bus is independent, meaning that there can be four distinct
address transfers occurring simultaneously.
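The bandwidth figures above can be reproduced with a little arithmetic. The sketch below is a back-of-the-envelope check, not a specification: the 83.3 MHz system clock, the split of the 144-bit path into 128 data bits plus 16 error-correction bits, and the rate of one address transfer per bus every two clocks are all assumptions that this section does not state. They are chosen because they yield the quoted 21.3 and 10.7 Gbytes per second, and the four bus cycles per 64-byte block noted in the data interconnect figure.

```python
CLOCK_HZ = 83.3e6            # assumed system clock, not given in this section
BOARDS = 16
DATA_PATH_BITS = 144         # per board slot, from the text
ECC_BITS = 16                # assumed share of the 144 bits used for ECC
DATA_BITS = DATA_PATH_BITS - ECC_BITS
BLOCK_BYTES = 64
ADDRESS_BUSES = 4
CLOCKS_PER_ADDRESS = 2       # assumed: one address transfer per bus every 2 clocks


def data_bandwidth(boards=BOARDS, clock_hz=CLOCK_HZ):
    """Peak bytes per second if every board transfers on every clock."""
    return boards * (DATA_BITS // 8) * clock_hz


def address_limited_bandwidth(buses=ADDRESS_BUSES, clock_hz=CLOCK_HZ):
    """Bytes per second when each address transfer moves one 64-byte block."""
    transfers_per_second = buses * clock_hz / CLOCKS_PER_ADDRESS
    return transfers_per_second * BLOCK_BYTES


def cycles_per_block():
    """Bus cycles needed to move one 64-byte block over one data path."""
    return BLOCK_BYTES // (DATA_BITS // 8)


print(round(data_bandwidth() / 1e9, 1))
print(round(address_limited_bandwidth() / 1e9, 1))
print(cycles_per_block())
```

Under these assumptions the address buses, not the data paths, set the practical limit, which matches the statement that normal operation tops out at about half the theoretical data bandwidth.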
Figure: data interconnect, showing system boards 0 through 15 connected through the global data router (on centerplane). Labels: 144-bit wide, 16 x 16 data bus (full centerplane); 4 bus cycles to transfer 64-byte block; memory; processor module; Ecache.
Centerplane Configurability
● The system will operate with one, two, or three address buses.
Performance degradation when operating with fewer than four
address buses or two data buses will be application dependent. At
configuration time the system will determine the optimum
combination and use it.
● The system can operate with one 72-bit data bus. Note that the
data bus bandwidth is two times the available address bus
bandwidth in a fully operational system. Therefore, with only one
72-bit data bus, the system is balanced for address and data
bandwidth.
● The system will operate with one or two address buses and one
72-bit data bus with a half-centerplane failure.
● Two I/O buses per system board, each with either 2 SBus slots or
1 PCI slot, giving a total of 32 SBuses with 64 SBus slots per
system or 32 PCI slots per system. PCI and SBus slots may not be
mixed on a system board, but may be mixed in a system and
domain.
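The slot totals in the last bullet follow directly from the per-board figures. A minimal sketch of that arithmetic, with constant and function names that are illustrative only:

```python
SYSTEM_BOARDS = 16
IO_BUSES_PER_BOARD = 2
SLOTS_PER_BUS = {"sbus": 2, "pci": 1}


def max_io_buses(boards=SYSTEM_BOARDS):
    """Total I/O buses in a system with the given number of boards."""
    return boards * IO_BUSES_PER_BOARD


def max_slots(card_type, boards=SYSTEM_BOARDS):
    """Total slots if every board carries the given card type."""
    return max_io_buses(boards) * SLOTS_PER_BUS[card_type]


print(max_io_buses())
print(max_slots("sbus"))
print(max_slots("pci"))
```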
Logical View
Figure: system board logical view. Labels: four global address arbiters (GAARB) and four global address routers (GAMUX); UltraSPARC processors; memory (four banks of eight DIMMs each); I/O bridges; SBus cards; pack/unpack units; local data arbiter (LDARB); UPA data buses; memory module. Dimension callouts: ~21.1 in. and ~16.0 in.
The processor modules connect directly to the system board. They can
easily be replaced if necessary, as they are individually mounted on
the system board.
Memory DIMMs and SBus and PCI cards can also be easily replaced.
Figure: system board packaging, showing the memory mezzanine, I/O mezzanine, SBus cards, processors, front, and personality plate.
PCI Mezzanine Packaging
Memory Subsystem
A system board can have zero to four banks of eight DIMMS each
installed. A bank must be fully populated. While all DIMMs on a
given system board must be the same size, DIMM sizes can be
different on the different system boards.
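The population rules above are easy to check mechanically. The following is a small sketch, not an SSP tool: the function name and the idea of describing a board as a list of DIMM sizes are illustrative assumptions.

```python
DIMMS_PER_BANK = 8
MAX_BANKS = 4


def check_board_memory(dimm_sizes_mb):
    """Return a list of rule violations for one system board.

    dimm_sizes_mb holds one entry (size in Mbytes) per installed DIMM.
    An empty result means the board follows all three rules.
    """
    problems = []
    count = len(dimm_sizes_mb)
    if count % DIMMS_PER_BANK != 0:
        problems.append("banks must be fully populated (8 DIMMs each)")
    if count > MAX_BANKS * DIMMS_PER_BANK:
        problems.append("at most 4 banks per system board")
    if len(set(dimm_sizes_mb)) > 1:
        problems.append("all DIMMs on a board must be the same size")
    return problems


print(check_board_memory([32] * 16))          # two full banks, one size
print(check_board_memory([32] * 7 + [128]))   # mixed sizes in one bank
```

Because the size rule applies per board, each system board would be checked separately; different boards may pass with different DIMM sizes.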
I/O Subsystem
A UPA module logically plugs into a UPA port. The UPA module can
contain a processor, an I/O controller with interfaces to I/O buses, and
so forth. A UPA port uses separate packet-switched address and data
buses, and the address and data paths operate independently,
providing significantly better performance.
Logical View
Figure: I/O module logical view, showing the port controller (PC), the Enterprise 10000 data buffer, and two SYSIO chips, each serving two SBus cards.
PCI attaches through the SYSIO chip as well, using a PCI version
instead of an SBus version. With 2 PCI slots per system board, you can
have 32 PCI slots in a fully configured system.
JTAG
cbs sends the JTAG commands over TCP/IP to the Control Board
Executive (cbe) running on the control board. cbe monitors and
controls the Enterprise 10000 hardware under the direction of various
SSP applications.
Support Boards
The control board (CB) generates clock signals, JTAG scan signals and
control, and provides an Ethernet interface to SSP from the system.
Only one Control Board is required. A second Control Board may be
installed for redundancy, although only one may be active at a time.
Switching Control Boards requires a reset of the entire platform.
Figure: support boards, showing a system board, a centerplane support board, and a control board.
The other Centerplane Support and Control boards are directly behind
those shown (on the other side of the centerplane).
Control Board
The Ethernet controller provides the link between the SSP and the
control board. The JTAG controller scans and controls the power to all
of the Enterprise 10000 components. The reset and control logic
performs various functions, such as monitoring the inlet airstream of
the ambient air and maintaining the system heartbeat mechanism.
Before continuing on to the next module, check that you are able to
accomplish or answer the following:
Think Beyond
Why was so much attention paid to reliability and on-line repair?
Course Map
This module describes how to install and configure the software
required on the SSP. It covers both Solaris and the SSP software for the
Enterprise 10000. It also describes how to boot and operate the SSP.
Relevance
4. How does the installation of a main SSP differ from the spare SSP?
Objectives
References
Before you can install your system, you will need to know what your
network should look like. You have a large number of Enterprise 10000
system components that require network addresses:
● Up to eight domains
Some of these will require multiple addresses. The SSP can have as
many as four subnets, and the domains at least two.
There are several different configurations that will work, but all have a
common factor: the control boards must be isolated from other
network traffic. The control boards are very sensitive to delay, and too
much interference could cause the Enterprise 10000 system to fail.
The basic choices that you will have to make for the network
configuration are:
Once these issues are decided, you can begin to assign hostnames and
Internet Protocol (IP) addresses and configure the network for the
system.
When making a decision, use the general principle that the more
isolation there is, the better.
The control boards can be configured two different ways. They can
share a subnet, or each can have its own subnet.
On a shared subnet, one host interface from the SSP is required. Two
are required for dual subnets.
Each control board and SSP network interface must have a unique host
name and IP address.
In the public network configuration, one host interface from the SSP
and each domain is required. In the private network configuration, two are
required from each. Each domain and SSP network interface must
have a unique host name and IP address. This means that the SSP may
have a different hostname from the domains than it does from the
external network.
SSP Privacy
You may or may not want to have the SSPs directly accessible from an
external network. If not, you must use the private network
configuration and leave the interfaces shown with the dashed lines
unconnected.
Figure: network planning worksheet. The worksheet provides blanks for the hostname, IP address, subnet (CB0_subnet, CB1_subnet, or dom_subnet), and netmask of each SSP interface (le0, hme0, hme1, and QFE), each control board (CB0 on Hub 0 and CB1 on Hub 1), and the Ethernet port of each of Domain 1 through Domain 8. It also has blanks for the customer NIS/NIS+ domain name and DNS domain. The worksheet carries these notes:
● Netmasks must be the same within a subnet.
● Each hostname must be unique.
● Each IP address must be unique but within the respective subnet.
● Each control board must be on a separate subnet.
● To avoid confusion, for each domain, the domain name and hostname should be the same.
This worksheet, taken from the Sun Enterprise 10000 System Hardware
Installation and De-Installation Guide, will help you plan your Enterprise
10000 control networks.
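Before committing a filled-in worksheet to the name service, the uniqueness and subnet rules can be checked with a short script. This is a sketch only: the function name, the tuple layout, and every IP address below are hypothetical, and each subnet is written as a single CIDR string so that one netmask applies to the whole subnet.

```python
import ipaddress


def check_plan(interfaces, subnets):
    """Check worksheet entries against the uniqueness and subnet rules.

    interfaces: list of (hostname, ip, subnet_name) tuples.
    subnets: dict mapping subnet_name to a CIDR string.
    Returns a list of problems; an empty list means the plan passes.
    """
    problems = []
    seen_hosts, seen_ips = set(), set()
    for hostname, ip, subnet_name in interfaces:
        if hostname in seen_hosts:
            problems.append(f"duplicate hostname: {hostname}")
        seen_hosts.add(hostname)
        if ip in seen_ips:
            problems.append(f"duplicate IP address: {ip}")
        seen_ips.add(ip)
        network = ipaddress.ip_network(subnets[subnet_name])
        if ipaddress.ip_address(ip) not in network:
            problems.append(f"{ip} is outside {subnet_name}")
    return problems


# Hypothetical addresses for illustration only.
subnets = {"CB0_subnet": "192.168.10.0/24", "CB1_subnet": "192.168.11.0/24"}
interfaces = [
    ("jefferson", "192.168.10.2", "CB0_subnet"),
    ("madison", "192.168.11.2", "CB1_subnet"),
    ("ssp-cb0", "192.168.10.1", "CB0_subnet"),
]
print(check_plan(interfaces, subnets))
```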
The SSP software normally comes preinstalled from the factory on the
SSP workstation. All that is required to prepare it for use with the
Enterprise 10000 system it accompanies is to reply to the configuration
questions the first time that it boots up.
However, should you lose the SSP system, you may need to reinstall
the system.
Remember that the sole function of the SSP system is to monitor and
control the Enterprise 10000 host system. It should be used for no
other function. The SSP is constantly monitoring the Enterprise 10000
host through the control boards and information from the active
domains. It must be available at any time to handle conditions that
could arise on the host. Never run any other applications on the SSP.
SSP Accounts
There are two accounts on the SSP, root and ssp.
The root account is used to manage the SSP itself and is created when
Solaris is installed.
The ssp account is created when the SSP software is installed and is
used to control the Enterprise 10000 host and its domains. The default
password created for the ssp account by the SSP install process is ssp.
The install process also installs .cshrc and .login files. The account
assumes that it is running the C shell; do not modify this default.
To prepare the SSP, you must install all 12 of the following packages
on the SSP. These packages must be installed in a specific order. They
are given here in alphabetical order only for reference.
The SSP packages are provided on the System Service Processor (SSP)
3.1 for the Ultra Enterprise 10000 CD-ROM. Make sure that you apply
the appropriate current patches before attempting to communicate
with the Enterprise 10000 platform.
A number of environment variables are set for the ssp account on the
SSP. These specify locations for the SSP files.
● $SUNW_HOSTNAME
● $SSPETC
● $SSPVAR
● Log files
● $SSPLOGGER
● $SSPOPT
● SSP executables
The SSP contains files that are difficult to rebuild if they are damaged
or lost. Back up these files on a regular basis. Remember that much of
the Enterprise 10000’s configuration information is loaded from the
SSP.
4. If $SSPVAR/etc/platform_name/domain_name/.postrc
exists, save it as postrc.domain_name.
● Archive Libraries
● Basic Networking
● Graphics Headers
● Point-to-Point Protocol
9. When the Disks dialog is displayed, choose the disk on which the
software is to be installed, click on Add, then click on Continue.
12. When the File System and Disk Layout dialog is displayed, choose
Customize.
13. In the Customize Disk screen, set up the disk partitions for the
root disk, and click on OK when you are done.
14. When the File System and Disk Layout dialog is displayed again,
choose Continue if the layout is correct; otherwise, choose
Customize and go back to Step 13.
15. When the Mount Remote File Systems? dialog is displayed, choose
Continue.
16. When the Profile dialog is displayed, confirm your selections and
choose Begin Installation.
The public domain NTP software has been adapted to work on Solaris
to allow synchronization of the clocks between the SSP and the
domains, which is necessary for DR.
The version of NTP shipped with Solaris 2.5.1 will only work in the
Enterprise 10000 environment. The 2.6 version will work in a general
configuration. You can interconnect the 2.5.1 and 2.6 versions.
2. Configure the SSP to act as a time server for the host domains.
Note – These are not the host names and IP addresses of your
domains.
Caution – Use the order of the SUNWssp packages shown above. You
will cause problems with the SSP configuration process if you do not
follow this order.
Caution – Whenever you reboot the SSP, wait 3 minutes before you
perform any SSP commands. In the current release, this delay is
needed to enable the SSP software initialization process to complete.
If the SSP software was installed at the factory, the steps in this and
the following sections are all that you must perform to prepare the SSP.
When the SSP system boots for the first time with the SSP software
installed, the boot process asks you for configuration information
about your Enterprise 10000 host:
Please enter the name of the platform this ssp will service: presidents
Do you have a control board 0? (y/n): y
Please enter the host name of the control board 0 [presidentscb0]: jefferson
Do you have a control board 1? (y/n): y
Please enter the host name of the control board 1 [presidentscb1]: madison
● For the main SSP, reply y to this prompt and you will see:
MAIN SSP configuration completed.
● For a spare SSP, reply n to this prompt, and you will see:
SPARE SSP configuration completed.
If you are configuring a spare SSP, you have finished its configuration
at this point.
1. Log in as user ssp. The SSP account has been created by the SSP
software install process. The default ssp account password is ssp.
3. Ensure that the Enterprise 10000 system is powered on. Use the
power command.
ssp% power -on -all
ssp_config is also used to initially configure the SSP, and asks you for
the same information that it did just after you installed the SSP
packages. Reboot the system after running it.
ssp_config does not, however, make any changes to the SSP’s Solaris
system identity as sys-unconfig does. You can run sys-unconfig to
change the SSP’s identity without needing to rerun ssp_config.
You can also use ssp_config if you need to change the characteristics
of the control boards. This is discussed in Module 5, "Domains."
Switching to Spare
To switch the main SSP to spare status, enter
# ssp_config
with no operands. This will remove the SSP daemon inittab line and
kill any active SSP daemons. You should then immediately configure a
spare SSP to become the new main SSP.
Warning – Do not start two SSPs with both active as main. This may
confuse the control board, requiring you to reset it and thus resetting
any active domains.
Switching to Main
To change an SSP from a spare to the main SSP, ensure the main SSP is
shut down, then on the spare SSP enter
# ssp_config spare
For more information on switching SSP systems, see the Sun Enterprise
10000 System Hardware Installation and De-Installation Guide.
/etc/inittab
The spare SSP contains a file named /etc/inittab.main that contains
the line to activate the SSP daemons. If you change the default
/etc/inittab, remember to change this file as well.
The line added to the end of /etc/inittab to start the SSP daemons
is:
sp:234:respawn:su - ssp -fc /etc/opt/SUNWssp/ssp_startup.sh 15 \
>/dev/null 2>&1 </dev/null # SUNWsspr
where:
For example:
presidents:Ultra-Enterprise-10000:jefferson:P:madison:
This example shows that there are two control boards installed in the
presidents platform. They have host names jefferson (which is the
primary) and madison.
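A line in this format can be split into its parts with a few lines of code. The sketch below infers the field layout from the single example above (platform name, platform type, then control board host names, with a `P` field following the primary); that layout, the function name, and the dictionary keys are assumptions, since the field definitions are not reproduced here.

```python
def parse_platform_line(line):
    """Split a colon-separated platform line into named parts."""
    fields = line.strip().split(":")
    platform, platform_type = fields[0], fields[1]
    control_boards = []
    primary = None
    for field in fields[2:]:
        if field == "P" and control_boards:
            # Assumed: the marker follows the primary board's host name.
            primary = control_boards[-1]
        elif field:
            control_boards.append(field)
    return {
        "platform": platform,
        "type": platform_type,
        "control_boards": control_boards,
        "primary": primary,
    }


info = parse_platform_line("presidents:Ultra-Enterprise-10000:jefferson:P:madison:")
print(info["control_boards"], info["primary"])
```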
If you have dual control boards, you can switch the primary control
board. Doing so requires you to either delete all your domains or power
off the entire platform, reconfigure the SSP, and then re-create the
domains or restore system power.
2. Update your name service for the new control board addresses if
necessary. This may be a new MAC address if you have replaced a
control board.
5. Using hostview, make sure that the J and C symbols show in the
hostview display control board squares, which signifies that the
control boards are active.
1. You can use Hostview. The active control board is the one
containing the J and C characters.
2. You can use snoop to watch the network traffic on the control
board subnet(s). The active control board is the only one that is
sending regular messages.
The SSP is the boot server for the control board. Two files are
downloaded by the control board boot PROM using tftp during boot
time: the image of cbe and the port number specification file. These
files are located in /tftpboot on the SSP. Their naming conventions
are:
/tftpboot/XXXXXXXX – For the cbe image.
/tftpboot/XXXXXXXX.cb_port – For the port number.
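This excerpt does not say what XXXXXXXX stands for. A common network boot convention is to name the file after the client's IP address written as eight uppercase hexadecimal digits, and the sketch below assumes that convention applies to the control board. Treat both the convention and the example address as assumptions to verify against the actual contents of /tftpboot.

```python
import ipaddress


def tftpboot_names(cb_ip):
    """Return the assumed cbe image and port file names for a control board.

    Assumes XXXXXXXX is the control board's IPv4 address in uppercase hex.
    """
    hex_name = f"{int(ipaddress.IPv4Address(cb_ip)):08X}"
    return f"/tftpboot/{hex_name}", f"/tftpboot/{hex_name}.cb_port"


# Hypothetical control board address.
print(tftpboot_names("192.168.10.2"))
```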
Warning – For these changes to take effect, you must reset both
control boards. Remember that this will reset all active domains.
You can add a new control board or change the host name and IP
address of existing control boards. This is done with the ssp_config
cb command. The appropriate files in /tftpboot will be updated.
a. Update your name service(s) for the new control board names
and addresses
3. Run cb_reset to reload the control boards from the primary SSP
Lab
1. Using the sample hosts file shown earlier in the module, diagram
that Enterprise 10000 network and fill out the network planning
worksheet using the provided information.
2. On your workstation:
c. Configure the SSP for the lab host environment. Update the
hosts file from the handout information or a provided file. See
page 3-18 and following for more information.
3. Use telnet and log into the lab’s main SSP as the ssp account. On
that SSP:
Before continuing on to the next module, check that you are able to
accomplish or answer the following:
Think Beyond
Why is the SSP Solaris software profile edited the way it is?
What would happen if you used the SSP for other than monitoring the
Enterprise 10000?
Course Map
This module describes the commands and procedures used to operate
an Enterprise 10000 system. It discusses the interaction between the
SSP, the Enterprise 10000, and the domains, and the control of the
domains and the hardware. It also touches on error reporting, the
location of the various system logs, and security issues.
Relevance
2. How does the SSP interact with the Enterprise 10000 platform?
Objectives
● Domain control
● Power control
● Fan control
● Log inspection
References
Security Considerations
Introduction
From a security perspective the Ultra Enterprise 10000 provides a
variety of unique and interesting challenges.
With its unique and flexible architecture, some new security concerns
come to light. These concerns are the focus of this section and are not
intended to completely address the issue of securing corporate
environments in general.
For further information on security issues, check the CERT web page at
www.cert.org and attend the SC-380 course.
Physical Security
Physical security of the Enterprise 10000 is extremely important. Since
this system has the capabilities of Dynamic Reconfiguration and
Alternate Pathing, providing the ability to remove components while
the system remains operational, it is possible that unauthorized
removal of components may occur.
Basic considerations:
● Secure the SSP within the same environment, and ensure that all
networking connections are inside the environment as well.
System Security
The system security of the Enterprise 10000 is complicated by its
domain architecture. Essentially, each configured domain of the
Enterprise 10000 is another "host" that must be secured.
Also, since the SSP platform controls the entire Enterprise 10000
system, its security is also extremely important.
Make sure that you change the default ssp account password
immediately, and then continue to change it on a regular basis.
Make sure that you strictly limit the number of people who have
access to the SSP root and ssp accounts. Anyone with access to these
accounts can control all of the Enterprise 10000 domains.
Also, limit access to any accounts on the SSP to those who need to
control the Enterprise 10000. Using the SSP for other purposes can
significantly slow platform and domain support operations.
File Systems
● Back up your file systems regularly, both on the SSP and for the
domains.
Network Security
The Enterprise 10000 has extensive general networking capabilities by
virtue of its multiple add-on SBus slots. Each domain could act as a
host or router independent of each other domain.
Figure: SSP daemons and the files they use. Recoverable labels:
● netcon server: relays messages between netcon sessions and cvcd or OBP.
● cbs: controls all JTAG operations, passes client requests to cbe, and monitors cbe.
● JTAG scan database: $SSPVAR/data/Ultra-Enterprise-10000
● Configuration files: cb_config, cb_port, domain_config, ssp_resource
● straps: listens for SNMP traps and forwards messages to all connected SNMP clients.
● snmpd: SNMP proxy agent. Manages the Enterprise 10000 database for SNMP clients and allows SNMP clients to monitor and control the database. MIB configuration and data: $SSPETC/snmp/
● edd: uploads monitor scripts, monitors Enterprise 10000 platform events, and executes response action scripts. Files: edd.emc, edd.erc, per-domain edd.erc, ssp_resource
● fad: file locking services. File: fad_files
● Clients: hostview and other clients
● Protocols shown: RPC, CBMP, SNMP
Platform Clients
For certain clients and daemons, exactly one instance is created on the
SSP, without regard to the platform or the number of domains that
exist on the platform. For these clients and daemons, the setting of the
environment variable SUNW_HOSTNAME is irrelevant. One, and only
one, instance will ever be created.
For other clients and daemons, one instance is started for the entire
platform. Currently, because the SSP can control only one platform,
this looks the same as the previous category.
Domain Clients
For the other clients and daemons, one instance is created on the SSP
for each active domain on the platform. Before you execute a domain
client application, you must set SUNW_HOSTNAME to the proper
domain name. (hpost and bringup are examples of this type of client.)
The commands and daemons listed in the above overhead run on the
SSP and are responsible for platform-wide operations on the
Enterprise 10000.
The commands listed in the above overhead run on the SSP and are
responsible for domain-specific operations on the Enterprise 10000.
Some of these commands are discussed later in this course.
Note – Remember that these commands do not take the domain name
as a command-line argument. Instead they use the setting of
SUNW_HOSTNAME. Make sure that it is correct before running these
commands.
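One way to avoid running a domain client against the wrong domain is to set SUNW_HOSTNAME explicitly for each invocation instead of relying on the login environment. The sketch below shows the idea with Python's standard subprocess module; the helper names and the domain name are hypothetical, and a small child process stands in for a real domain client, which is only available on an SSP.

```python
import os
import subprocess
import sys


def domain_env(domain_name, base_env=None):
    """Return a copy of the environment with SUNW_HOSTNAME set."""
    env = dict(os.environ if base_env is None else base_env)
    env["SUNW_HOSTNAME"] = domain_name
    return env


def run_for_domain(domain_name, command):
    """Run a command with SUNW_HOSTNAME pointing at the given domain."""
    return subprocess.run(
        command,
        env=domain_env(domain_name),
        capture_output=True,
        text=True,
        check=True,
    )


# Stand-in for a domain client: a child process that echoes the variable.
probe = [sys.executable, "-c", "import os; print(os.environ['SUNW_HOSTNAME'])"]
print(run_for_domain("domain1", probe).stdout.strip())
```

The calling shell's own SUNW_HOSTNAME is left unchanged, so a wrapper like this cannot leave the session pointing at an unexpected domain.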
The SSP daemons play a central role in the UE10000’s operation. Each
daemon will be discussed more fully later, and is described in its
corresponding man page. The SSP daemons are:
SSP Daemons
$SSPVAR/etc/platform_name/domain_name/edd.erc provides
configuration information for a particular domain.
edd_cmd
You can use the edd_cmd command to turn on and off edd processing.
edd_cmd -x stop will stop edd processing, and edd_cmd -x start
will restart it.
Warning – Be careful when turning off edd processing. The SSP will not be
able to respond to most requests for service from the Enterprise 10000,
such as power or over-temperature events, which could cause physical
damage to system components.
snmpd sends its traps to the SNMP trap sink server daemon (straps)
on the SSP, and possibly to other hosts and applications listening for
Enterprise 10000 SNMP events.
● Configuration file
$SSPETC/snmp/agt/Ultra-Enterprise-10000.snmpd.cnf
● hostview
● edd
machine_server
The machine_server daemon performs several network support
functions for the Enterprise 10000 SSP daemons:
System Operation
hostview
● Bring up domains
● Access the SSP log messages file for each platform or domain
You only need to run one instance of hostview for a given platform,
although you can run more than one instance at a time to work with
the same platform. You can run hostview from any SSP window
where you have logged in as user ssp, regardless of the setting of
SUNW_HOSTNAME.
Note – If you overload the SSP, you may prevent it from processing
requests from the control board.
Figure: hostview main window, with callouts for the Power, Temperature, Fans, and Failure buttons, the support boards, control boards, and system boards, a selected board, the buses, and Domain 1 and Domain 2 (each shown with a colored border).
The system boards at the top of the display are in the order they
appear on the front of the physical platform. The system boards at the
bottom of the display are arranged in the order they appear on the
back of the physical platform.
◆ Operating system
● hpost
■ download_helper
▲ OBP
? Unknown
Green Running
Maroon Exiting
Blue Unknown
Black Blacklisted
Red Redlisted
You can select one or more system boards in the main window. You
can also select one entire domain in the main window. You must select
a domain or a set of boards prior to performing some operations, such
as creating a domain.
Note – If you click the right mouse button on a system board, you will
receive a small selection menu allowing you to choose power or
temperature displays for that system board.
Help Window
When you select a topic from the Help menu, the Help Window is
displayed:
You can select the desired topic in the upper pane. The corresponding
man page or help information is displayed in the lower pane.
● The Fan button displays the Fan Detail window, which enables
you to view the status of the fans within the platform.
● When certain error conditions occur, the Failure button turns red.
All of the domain messages, both normal and error messages for the
domain, are logged in the file:
$SSPLOGGER/domain_name/messages
where domain_name is the name of the domain for which the message
was issued. This is a copy of the domain’s /var/adm/messages file. It
is constantly updated.
Error messages for the Enterprise 10000 platform that are not specific
to a domain are logged in:
$SSPLOGGER/messages
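The two log locations above can be sketched as a small helper. This is only an illustration: the function name ssp_log_path and the directory used in the usage note are hypothetical, and the sketch assumes that SSPLOGGER is set in the environment of the ssp user.

```python
import os
import posixpath


def ssp_log_path(domain_name=None, ssplogger=None):
    """Return the messages file for one domain, or the platform-wide
    messages file when no domain name is given."""
    # Fall back to the SSPLOGGER environment variable if no base is passed.
    base = ssplogger if ssplogger is not None else os.environ["SSPLOGGER"]
    if domain_name:
        return posixpath.join(base, domain_name, "messages")
    return posixpath.join(base, "messages")
```

For example, with a hypothetical base directory of /logs, ssp_log_path("jackson", "/logs") gives /logs/jackson/messages, and ssp_log_path(None, "/logs") gives /logs/messages.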
Administering Power
The SSP gives you the ability to control the power status of individual
components of the Enterprise 10000 system.
Using either the command line or hostview, you can control power to:
Examples
● The power command with no options displays the status of each
power supply and external I/O power connections.
ssp% power
If both the SSP and the Enterprise 10000 platform suffer a power
outage, after the SSP has returned to its proper state, it checks whether
the following conditions are true:
If all of these statements are true, the SSP assumes that the power
outage affected both the SSP and the Enterprise 10000 platform, and
attempts to automatically power on the Enterprise platform. It will not
automatically try to bring up the domains.
In this window, the bulk power supplies are named PS0 through PS7.
The individual system board power supplies are numbered 0 through
15. The individual support board power supplies are named CSB0 and
CSB1, and the individual control board power supplies are named CB0
and CB1.
The Power Detail window shows the voltage for all of the power
supplies or measurement points on the component. The power levels
are given in volts.
Power levels can also be monitored from the command line with the
power command and the hostinfo -p command.
To see the thermal detail for a component, click on it with the left
mouse button. The thermal detail window for a system board is shown.
The left panel of the system board detail shows the temperatures for
the five ASIC chips, named A0 through A4. The middle panel shows
the temperatures for the three power supplies, and the right panel
shows the temperatures for the four processors, named P0 through P3.
The detail windows for control boards, support boards, and the
centerplane are similar.
You can get temperature information from the command line with the
hostinfo -t command.
The fan command is used to control the speed and activity of the 16
fan trays in the Enterprise 10000. Normally the fans are controlled by
the SSP, but you can override this control if necessary.
Syntax
fan [-l {front | rear}] [-t tray_list] [-p {on | off}]
Usage
● Display power and speed status of the fans at the front or rear of
the system.
-l {front | rear}
● To set the speed of all fans, use the -s option. All fans always
run at the same speed. nominal is the default.
fan -s {nominal | fast}
You can control fan power and speed from within hostview.
2. Add the desired set of options to the fan command and click
execute (or type Return).
You can also enter the fan command from the command line. For
more information, see the fan man page.
You can use hostview to monitor fan speeds and fan failures for the
32 fans located throughout the Enterprise 10000 platform.
The fan trays are named FT0 through FT7 on the front, and FT8
through FT15 on the back. Each fan tray contains two fans. The color
of the fan tray symbol is green if both fans in the tray are functioning
at normal speed, amber if both fans are functioning at high speed, and
red if either fan within the fan tray has failed.
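The tray naming and color rules above can be expressed as a short sketch. It assumes 16 trays numbered 0 through 15 (eight at the front, eight at the rear) and uses the hypothetical state names "normal", "high", and "failed" for the two fans in a tray. It is not part of hostview.

```python
def tray_location(tray):
    """Return 'front' for trays 0-7 and 'rear' for trays 8-15."""
    if not 0 <= tray <= 15:
        raise ValueError("fan tray number must be 0 through 15")
    return "front" if tray <= 7 else "rear"


def tray_color(fan_states):
    """Map the states of the two fans in a tray to the symbol color.

    Returns None for combinations the text does not describe
    (for example, one fan at normal speed and one at high speed).
    """
    if "failed" in fan_states:
        return "red"
    if all(state == "high" for state in fan_states):
        return "amber"
    if all(state == "normal" for state in fan_states):
        return "green"
    return None
```

For example, tray_color(("normal", "failed")) returns "red", because a single failed fan is enough to mark the whole tray.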
The same information can be obtained from the command line using
hostinfo -F for fan settings, and hostinfo -p for fan power status.
The top circle indicates the inner (back) fan when you open the fan
tray, and the lower circle indicates the outer (front) fan. The color
surrounding each circle in the fan detail indicates the status of that fan.
The same information can be obtained from the command line using
hostinfo -F for fan state and speed, and hostinfo -p for fan power
status. Fan status can also be monitored with the fan command.
Lab
● help
3. Experiment with controlling the fans (on, off and speed) using the
fan command.
4. Examine the SSP environment variables and locate the log files.
● /etc/inittab
Before continuing on to the next module, check that you are able to
accomplish or answer the following:
● Domain control
● Power control
● Fan control
● Log inspection
Think Beyond
What does the extensive use of SNMP by the SSP daemons imply for
network monitoring?
Course Map
This module discusses the concept of a domain in detail and describes
how to create, delete, and manage Enterprise 10000 domains, both
through the command line and using hostview.
5-1
Copyright 1998 Sun Microsystems, Inc. All Rights Reserved. SunService June 1998
5
Relevance
Objectives
● Describe a domain.
References
Introduction
Like all of the other Sun SPARC systems, the Enterprise 10000 runs
Solaris. It also has the unique ability to divide itself into as many
as eight separate systems.
Domain Configurations
The SSP enables you to logically group system boards into Dynamic
System Domains, or simply domains, which are able to run their own
operating system and handle their own workload. They appear as
separate, standalone processors to each other and to the network.
You can use domains for many purposes. For example, you can test a
new operating system version or set up a development and testing
environment in a domain. In this way, if problems occur, the rest of
your system is not affected.
You may create as many domains as you want, but only eight may be
active at one time.
Inter-Domain Networking
You may have noticed references to Inter-Domain Networking (IDN)
in the documentation or in system messages. Please ignore these
references.
You can create a domain out of any arbitrary group of system boards.
You can have from one to eight domains, with 1 to 16 entire system
boards per domain. A system board can be in only one domain at a
time. You may not split system board components across domains.
● The name given the new domain is unique and matches the host
name of the domain to be booted (this is a netcon requirement).
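The board-assignment rules above (whole boards only, each board in at most one domain, 1 to 16 boards per domain, at most eight active domains) can be checked with a short sketch like the following. The function name and data layout are hypothetical. Board numbers are assumed to run from 0 through 15.

```python
def check_domains(domains, active):
    """Return a list of rule violations.

    domains: dict mapping domain name -> set of system board numbers
    active:  iterable of names of domains that are currently active
    """
    problems = []
    owner = {}  # board number -> first domain that claimed it
    for name, boards in domains.items():
        if not 1 <= len(boards) <= 16:
            problems.append(f"{name}: a domain needs 1 to 16 boards")
        for board in sorted(boards):
            if not 0 <= board <= 15:
                problems.append(f"{name}: board {board} is out of range")
            elif board in owner:
                problems.append(
                    f"board {board} is in both {owner[board]} and {name}"
                )
            else:
                owner[board] = name
    if len(set(active)) > 8:
        problems.append("more than eight domains are active")
    return problems
```

An empty list means that the layout satisfies these rules. The check says nothing about the other domain requirements, such as the eeprom.image file or the boot disk.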
Each domain should have its own disk interface and local disk from
which it can be booted. If a domain does not have its own disk, you
must always boot it from the network.
Domain Planning
Enterprise 10000 systems ship with eeprom.image files (on disk) for
the number of domains requested on the sales order, and a serial
number and key for each domain on paper. These image files are
located in:
$SSPVAR/.ssp_private/eeprom_save/eeprom.image.0
$SSPVAR/.ssp_private/eeprom_save/eeprom.image.1
$SSPVAR/.ssp_private/eeprom_save/eeprom.image.2
...
If you must re-create your eeprom.image files, you must have the
serial number and the EEPROM (Electrically Erasable Programmable
Read-Only Memory) key that was used to create your first domain
files. This information was shipped with your system.
If you are creating an eeprom.image file for a new domain, you must
obtain a new EEPROM key and hostid from your service provider for
that domain.
If you cannot find this information and do not have backups of the
eeprom.image files, you must contact your service provider for this
information.
4. Create the files. The key and either a serial number or hostid must
be entered for each domain to be created.
● The first domain uses the serial number. Use the following
form of the sys_id command:
ssp% sys_id -f eeprom.image.domain_name -k key -s serial_number
● Other domains use the hostid. Use the following form of the
sys_id command:
ssp% sys_id -f eeprom.image.domain_name -k key -h hostid
The key and serial number are related; you cannot mix them
indiscriminately. An incorrect key will not allow you to create the
eeprom.image file.
Caution – You must use the -f flag to prevent the default (template)
eeprom.image file from being overwritten.
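The choice between the two forms of the sys_id command can be sketched as a helper that assembles the argument list. It uses only the options shown above (-f, -k, -s, -h). The function name and the placeholder key and hostid values in the usage note are hypothetical, and the sketch builds the arguments without running anything.

```python
def sys_id_args(domain_name, key, serial_number=None, hostid=None):
    """Build the sys_id argument list for one domain.

    Pass serial_number for the first domain, or hostid for any
    other domain. Exactly one of the two must be given.
    """
    if (serial_number is None) == (hostid is None):
        raise ValueError("give exactly one of serial_number or hostid")
    # -f is always included so the template eeprom.image is not overwritten.
    args = ["sys_id", "-f", f"eeprom.image.{domain_name}", "-k", key]
    if serial_number is not None:
        args += ["-s", serial_number]
    else:
        args += ["-h", hostid]
    return args
```

For example, sys_id_args("new26", "KEY", hostid="HOSTID") returns the list for the second form of the command, where KEY and HOSTID stand in for the values supplied by your service provider.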
IDPROM in eeprom.image.jackson
Format = 0x01
Machine Type = 0x80
Ethernet Address = 0:0:be:a6:6e:5
Manufacturing Date = Wed Dec 31 16:00:00 1969
Serial number (machine ID) = 0xa66e05
Checksum = 0x3f
Back up the SSP eeprom.image files to tape or disk where they can be
accessed in the event of an SSP boot disk failure. These files are located
in the $SSPVAR/.ssp_private/eeprom_save directory.
● You must delete or rename any existing eeprom.image file for the
domain for which you are making the new image file. If the file
exists and you try to recreate it, you will be given an ‘invalid key’
message.
● You will receive a checksum error message the first time a domain
starts with a new eeprom.image file. This is normal and may be
ignored.
● The creation date will always read as your time zone offset from
the GMT time of January 1, 1970.
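The last point explains the Manufacturing Date in the sample output. The sketch below converts midnight GMT, January 1, 1970 into a local time for a given offset from GMT. The function name is hypothetical, and the offset of -8 hours is an assumption about the time zone in which the sample was produced.

```python
from datetime import datetime, timedelta, timezone


def idprom_date_for_offset(utc_offset_hours):
    """Format midnight GMT, January 1, 1970 as seen at the given offset."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    local = epoch.astimezone(timezone(timedelta(hours=utc_offset_hours)))
    return local.strftime("%a %b %d %H:%M:%S %Y")


print(idprom_date_for_offset(-8))  # Wed Dec 31 16:00:00 1969
```

An SSP in a zone eight hours behind GMT therefore reports December 31, 1969, which matches the sample display.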
hostid Information
The table in the above overhead shows how the domain’s hostid, serial
number, and Ethernet MAC address are determined.
domain_status
The domain_status command displays the contents of the
domain_config file. It shows which domains may be activated (but
not which ones are active) and which system boards compose them.
domain_history
The domain_history command displays the contents of the
domain_history file. It shows which domains have been removed but
may be re-created, and which system boards compose them.
The file format and the fields are the same as for the domain_status
command.
1. In the main hostview window, select a board from the domain for
which you want to obtain status information.
If the boards from the desired domain are not displayed, use the
View menu to display the desired domain (or all domains).
Switching Domains
You will have to "switch domains" if you have more than one domain
on your Enterprise 10000. Many of the SSP commands are domain-
specific, and take the identity of the domain they execute against from
the current setting of the SUNW_HOSTNAME environment variable.
domain_switch
The domain_switch command is a C shell alias installed by the SSP
packages that changes the value of the SUNW_HOSTNAME environment
variable and the prompt.
franklin:presidents% domain_switch new26
Switch to domain new26
franklin:new26%
If you leave out the domain name, you will get the following error:
franklin:presidents% domain_switch
Bad ! arg selector
franklin:presidents%
If the boards that you want to include already belong to a domain, you
must remove the boards from the owning domain using DR or remove
the domain before you can use them.
Note – In either case, the proper eeprom.image file must exist in the
$SSPVAR/.ssp_private/eeprom_save directory.
Examples
1. In the main hostview window, select the board(s) that the domain
will contain.
4. If all other fields are acceptable, click on Execute. You will see the
results of the command displayed in the window, just as if you
had run domain_create from the command line.
Note that the System Boards field indicates the boards that you
selected in the main Hostview window. The default OS version
and the default platform type are also shown. Note that the
platform name defaults as well.
Example
1. In the main hostview window, select any board from the domain
that you want to remove. You only need to select one board to
identify the domain.
Note – Remember that the domain name and Solaris host name must
match. You may need to do a sys-unconfig in the domain before
shutting it down.
Example
franklin:presidents% domain_status
DOMAIN TYPE PLATFORM OS SYSBDS
jackson Ultra-Enterprise-10000 presidents 2.5.1 2 3
new26 Ultra-Enterprise-10000 presidents 2.6 0 1
bozo Ultra-Enterprise-10000 presidents 2.5.1 5
franklin:presidents% domain_rename -d bozo -n hayes
Domain : bozo is renamed to hayes !,
NOTE: The domain boot disk name may also need to be changed
franklin:presidents% domain_status
DOMAIN TYPE PLATFORM OS SYSBDS
jackson Ultra-Enterprise-10000 presidents 2.5.1 2 3
new26 Ultra-Enterprise-10000 presidents 2.6 0 1
hayes Ultra-Enterprise-10000 presidents 2.5.1 5
franklin:presidents%
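If you capture domain_status output in a file, the table can be turned into a lookup structure with a short sketch like this one. It assumes the five-column layout shown in the example (DOMAIN, TYPE, PLATFORM, OS, SYSBDS) with whitespace-separated fields. The function name is hypothetical.

```python
def parse_domain_status(text):
    """Parse domain_status table rows into a dict keyed by domain name."""
    domains = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the header, prompts, and anything too short to be a row.
        if len(fields) < 5 or fields[0] == "DOMAIN":
            continue
        name, dtype, platform, os_version, *boards = fields
        # Skip lines whose trailing fields are not board numbers.
        if not all(board.isdigit() for board in boards):
            continue
        domains[name] = {
            "type": dtype,
            "platform": platform,
            "os": os_version,
            "boards": [int(board) for board in boards],
        }
    return domains
```

Applied to the final table above, the entry for hayes would list board 5 and OS version 2.5.1.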
● Domain 1 – white
● Domain 2 – orange
● Domain 3 – yellow
● Domain 4 – pink
● Domain 5 – brown
● Domain 6 – red
● Domain 7 – green
● Domain 8 – violet
5. To see all the messages issued for the domain, use the following
command:
Some host domain messages are sent to the SSP console window
(but not all). This command enables you to see every message
from the domain in the domain’s netcon window.
Before bringing up a domain, you must ensure that all of its system
boards are powered up.
4. Create the new console window for the domain and start netcon.
bringup checks to see if the domain is already running and will not
execute if it is (unless you use the -f option).
1. Select the domain you want to bring up. Use the mouse to select
any system board belonging to the domain you want to bring up.
Overview of netcon
Typically, you log in to the SSP machine as user ssp and enter the
netcon command. This changes the window into a netcon window
for the domain specified by the SUNW_HOSTNAME environment
variable set in the SSP window.
If you have write permission, you can enter Solaris commands. You
can also enter special tip commands prefixed by tilde (~) to perform
the functions offered by the netcontool window.
To reconnect to netcon after exiting with ~., you must reenter the
netcon command.
Using netcontool
● From the hostview window, select a board from the domain for
which you want to bring up a netcon and then select Terminal ➤
netcontool.
There are four types of netcon sessions that you can request, either
when starting netcon or later. The ~ command given is how you
change to this mode from within a netcon session. Use the buttons on
the netcontool tool bar to change netcontool window states.
Select the type of session in the left panel, the type of window in the
right panel, and then choose Done.
netcontool Buttons
The Lock Write, Unlock Write, and Excl. Write buttons request the
corresponding mode for the console window.
The Rel. Write button in the netcontool window releases any write
access and places the console window in read only mode.
The Status button displays information about all open netcon sessions
that are connected to the same domain as the current session. This can
be useful in determining which system currently has write permission
to the domain.
Blacklisting Components
You can use the SSP blacklist feature to configure out of use any of the
following Enterprise 10000 components:
● System boards
● Processors
● Address buses
● Data buses
● I/O controllers
From the command line, you can edit the file with a text editor and
delete its contents, or just delete the file.
From hostview:
Processor Sets
Processor sets are a feature provided by Solaris 2.6 that adds the ability
to "fence" or isolate groups of CPUs for use by specially designated
processes. This allows these processes to have guaranteed access to
CPUs that other processes, including the system itself, cannot use. A
processor may belong to only one processor set at a time.
System CPUs can be grouped into one or more processor sets by the
psrset -c command. These processors will remain idle until processes
(technically, LWPs) are bound to them by the psrset -b command.
Processors can be added and removed from a processor set at any time
with the psrset -a and psrset -r commands, respectively. Processor
set definitions can be viewed with the psrset -p command.
Only the root user may create, manage and assign processes to these
processor sets.
Lab
1. Use telnet to connect to the lab’s main SSP as the ssp account.
On that SSP:
5. Start netcon and inspect the domain from the OBP. Look at the:
b. devalias list
Optional:
8. Add the domain back again and run bringup -A off again.
Before continuing on to the next module, check that you are able to
accomplish or answer the following:
❑ Describe a domain.
Think Beyond
Course Map
This module describes how to install and configure Solaris for an
Enterprise 10000 domain. These instructions assume that you will have
open both an SSP window and a netcon window. They also assume that
there is not a local CD-ROM drive attached to the domain being
installed.
Relevance
Note – This module covers both the Solaris 2.5.1 and 2.6 releases.
Where no distinction is made, the material applies to both releases.
Objectives
References
● Solaris release notes for the level of Solaris you are installing
● The domain name and host name for a domain must be the same,
and cannot duplicate any other host name in the name service
domain.
1. Log in as root.
Note – Remember that the SSP packages configure the SSP to have the
system automatically share every CD-ROM.
If the host name for the new domain is contained in an existing host
name in the /etc/inet/hosts file (such as starfire and starfire-ssp),
you must ensure that the new domain entry precedes all other host
and SSP entries.
#
# Internet host table
#
140.55.22.87 starfire
127.0.0.1 localhost
140.55.22.88 starfire-ssp
140.55.22.89 otherhost
Caution – If this type of new domain name entry follows the other
host or SSP entry, the following step will not work correctly.
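The ordering requirement can be verified before you continue. The sketch below takes the rule literally: the entry for the new domain must be the first host entry in the file. It is an illustration only. The function name is hypothetical, and it reads hosts-file text that you pass in rather than the live file.

```python
def domain_entry_is_first(hosts_text, domain_name):
    """Return True if the first host entry in the text names the domain."""
    for line in hosts_text.splitlines():
        # Drop comments, then split into address and host names.
        entry = line.split("#", 1)[0].split()
        if len(entry) < 2:
            continue  # blank line or comment-only line
        return domain_name in entry[1:]
    return False
```

With the sample host table above, domain_entry_is_first(text, "starfire") is True. If the starfire-ssp line were moved to the top, the result would be False.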
6. Verify that the CD-ROM is shared with the NFS share options ro
and anon=0. If it is not, correct the share options; otherwise,
the boot will hang.
1. After the SSP is set up to be a boot server, log in to the SSP as user
ssp.
2. When prompted, specify the name of the domain that you want to
install.
4. Start a netcon session for your domain. After a few minutes, the
OBP ok prompt will be displayed.
ssp% netcon
During the period while the OBP is initializing, you will see no
activity in netcon. The delay could take anywhere from 30
seconds to several minutes. This is normal. At the end of OBP
initialization, you will see the OBP banner and ok prompt in your
netcon window.
The extra alias will not disappear from the devalias listing until
the next OBP reset is performed, but the alias has been deleted.
6. Create an alias with nvalias for the network interface that you
will boot from. Use show-nets to help determine the proper alias.
ok nvalias net ...
7. From the netcon window, boot the domain from the SSP by
typing:
ok boot net
Note – If the domain hangs after you see the Solaris release and
copyright boot messages, this means that you forgot to share the CD-
ROM with NFS option anon=0.
Installing Solaris
4. When the Disks dialog is displayed, choose the disk on which the
software is to be installed, then press F2 to continue.
Note – If you choose a drive other than the one designated in the OBP
boot-device parameter, a warning message will appear later in the
installation process. Make sure that you update the OBP boot-device
parameter before booting the domain.
6. When the File System and Disk Layout dialog is displayed, press
F4 to customize the layout.
7. In the Customize Disk screen, set up the disk partitions for the
root disk. Two disks are necessary if you are installing on disks
smaller than 2 Gbytes. If two disks are used, at least / and /usr
must be on the device specified in the OBP boot-device alias.
WARNING: The boot disk is not selected or does not have a “/” mount
point (c0t3d0)
Ignore this warning, but remember that you will have to update
the OBP boot-device parameter. Press F2 to continue.
where bootdisk is the OBP devalias name for your boot disk.
1. To remove the Solaris CD-ROM from the SSP CD-ROM drive, log
in to the SSP as root and unshare and eject it.
2. As user ssp, in an SSP window, bring up the new domain with the
bringup command.
ssp% bringup -A on
4. Start a netcon session after the bringup completes. The next steps
are completed from the host domain console.
● Time zone
● Time
7. Enter the host name of the SSP and its IP address. Make sure that
you specify the SSP host name that corresponds to the domain
connection subnet. The name will be saved in /etc/ssphostname.
Press Enter if the system has properly located the SSP; otherwise,
enter the SSP’s host name.
Please enter hostname of SSP for Enterprise 10000_host [name-ssp]: sspname
The only times that you will be prompted for this information are
the first time a domain boots after installation or after the
ssp_unconfig command has been run in the domain.
The domain now finishes the boot sequence and provides the root
login prompt.
There are several packages that must be installed from the SMCC
Server Supplement CD-ROM to finish the domain software
installation.
Package Description
# cd /cdrom/SMCC
# pkgadd -d . SUNWehea SUNWabhdw SUNWeman
9. To remove the Solaris CD-ROM from the SSP CD-ROM drive, log
in to the SSP as root and unshare and eject it.
There are several packages that must be installed from the SMCC
Hardware Updates CD-ROM to finish the domain software
installation.
1. At the SSP, insert the SMCC Hardware Updates CD-ROM. Wait for
the Volume Manager to mount it.
Package Description
# cd /cdrom/SMCC
# pkgadd -d . SUNWabdr SUNWehea SUNWprtnu SUNWxntp SUNWabhdw SUNWeman
9. To remove the Solaris CD-ROM from the SSP CD-ROM drive, log
in to the SSP as root and unshare and eject it.
1. After the domain has rebooted, configure NTP for your local
network. To use the default configuration, create a file called
ntp.conf in /etc/inet. It should contain the following:
server ssp_domain_hostname prefer
server 127.127.1.0
fudge 127.127.1.0 stratum 9
#
driftfile /etc/inet/ntp.drift
#
disable auth
controlkey 1
requestkey 1
authdelay 0.000793
#
precision -18
3. Reboot.
1. Adjust the xntp configuration for your local network. To use the
default configuration, update /etc/opt/SUNWxntp/ntp.conf
by changing:
server 127.127.1.7
to
server 127.127.1.9
3. Reboot.
Solaris may be preinstalled in one of your domains for you when your
Enterprise 10000 is shipped from the factory. If this is the case, you can
boot the domain immediately, without going through the Solaris
installation process. The SMCC Supplement or Updates packages will
have been installed.
You will still need to respond to the normal suninstall prompts and
then the ssp_config prompts. After configuring NTP, the domain will
be ready for use.
Make sure that your domain’s name service is properly updated for
the Enterprise 10000 control boards, domains and SSP.
Once again, make sure that you have installed any appropriate
patches.
Lab
● Create any necessary OBP boot disk and net devaliases (See
Appendix B for help)
Before continuing on to the next module, check that you are able to
accomplish or answer the following:
Think Beyond
Why would you not want to have a CD-ROM drive attached to your
domain?
Should you use the SSP as your boot server if you load the OS often?
Are there any special issues involved with installing patches on the
Solaris domain? On the SSP?
Course Map
This module provides an explanation of the boot process. It discusses
the environment variables, executables, and order involved in the boot
processes of both the SSP and an Enterprise 10000 domain.
Relevance
Objectives
References
On the main SSP, inittab includes a new line at the end to start the
ssp_startup script as a respawn process. The line is:
sp:234:respawn:su - ssp -fc /etc/opt/SUNWssp/ssp_startup.sh 15 \
>/dev/null 2>&1 </dev/null # SUNWsspr
Daemon Start Up
The ssp_startup script starts up the two platform daemons: edd and
snmpd. It then starts up the non-domain (platform) daemons in the
proper order (although the order of startup is not specified here): cbs,
machine_server, fad, and straps.
edd uploads the event detection scripts to the Enterprise 10000 control
board(s), waits for an event to be generated by the scripts running on
the active control board, and then responds to the event by executing
the proper response action script on the SSP.
Each time the main SSP boots, init runs the SSP startup script
$SSPETC/ssp_startup.sh. This startup script checks the SSP
environment for the availability of certain files and the state of the
Enterprise 10000 system itself, sets various environment variables,
and then starts the various SSP daemons.
The SSP daemons to be started at SSP boot time are specified in the
ssp_startup.main file. Each of these daemons is discussed in more
detail elsewhere. This list is provided here for reference.
● machine_server
● fad
● cb_reset
● cbs
● straps
● snmpd
● edd
● obp_helper
● netcon_server
Restartable Daemons
Many of the SSP daemons are monitored and restarted if they die,
because they are essential to the operation of the Enterprise 10000
system. These are specified in the ssp_startup.restart_main file
and are checked every 30 seconds. These daemons are:
● machine_server
● fad
● cbs
● straps
● snmpd
● edd
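The monitoring rule above amounts to comparing the list of restartable daemons with the daemons that are currently running. The sketch below shows only that comparison. It is not the SSP implementation. The function name is hypothetical, and how the running list is obtained is left out.

```python
# Daemons listed above as restartable, and the check interval in seconds.
RESTARTABLE = ("machine_server", "fad", "cbs", "straps", "snmpd", "edd")
CHECK_INTERVAL_SECONDS = 30


def daemons_to_restart(running):
    """Return the restartable daemons missing from the running list,
    in the order in which they are listed above."""
    running = set(running)
    return [daemon for daemon in RESTARTABLE if daemon not in running]
```

Daemons that are not in the restartable list, such as obp_helper and netcon_server, are ignored by this check.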
SSP Domain
bringup
POST tests
*.elf
Start obp_helper
download_helper
OBP
eeprom.image
Start netcon_server TOD value
netcon
ok prompt
Communication path
The bringup command is run from the SSP to configure and boot the
current domain as defined by the SUNW_HOSTNAME environment
variable. It starts the same process that the reset command from the
ok prompt does for other Sun SPARC systems.
Syntax
bringup [-f] [-F] [-p proc] [-Q boot_proc] [-gvCL] [-A {on | off}]
[-l level] [-X blacklist_file_pathname] [boot_args]
Options -g, -l, -p, -C and -X are passed directly to hpost and are not
used by bringup.
Examples:
ssp:domain% bringup -A off
ssp:domain% bringup -A on net -s
Execution
When bringup is run, it:
● $SSPVAR
● $SUNW_HOSTNAME
● $SSPETC
8. Runs hpost.
hpost stands for host POST; it coordinates the POST tests that run in
the host (domain).
Syntax
hpost [-?] [-?postrc | -?blacklist | -?level | -?verbose]
The hpost command runs on the SSP and directs the activities of the
domain configuration and initialization process through
communication with the Enterprise 10000 control board.
The POST program identifies and tests the physical components of the
uninitialized Enterprise 10000 hardware assigned to the domain,
configures what is operational and not blacklisted into an initialized
system, and prepares for the OBP.
Functions
When hpost is run, it:
.postrc
There is no default .postrc file, and you are not required to have one.
hpost looks for .postrc in these places (and in this order). It uses the
first one it finds.
2. $SSPVAR/etc/platform_name/domain_name
There are many other directives, but they should not be used without
the guidance of Sun support or engineering personnel.
The distinction between the blacklist and the redlist files is fairly
simple. The blacklist says don’t use the component, while the
redlist says don’t even see it.
blacklist
The blacklist file tells hpost which hardware components in the
domain are not to be used. They will not be tested and are unavailable
to the domain for the life of the boot. The blacklist was discussed in
Module 5, "Domains."
redlist
The redlist file is for internal and development use only. It tells
hpost which system components are to be considered as not installed,
even if they are physically present. While redlisted components are
effectively blacklisted, redlisting components carries a price in
capability and performance. If any component on a board is redlisted,
POST cannot reset that board. Because some failures require a board
reset to clear them, this forces the entire board to become unusable
and, in some cases, the entire system can become unusable.
Warning – You can make your domain or the entire UE10000 platform
unusable by using the redlist incorrectly. Do not use the redlist
without specific directions from Sun support personnel.
Syntax
obp_helper [-eivqr] [-o filename] [-d filename] [-m boot_proc]
[-A {on|off}] [-D {on|off}] [boot-arguments]
Function
● Loads download_helper into the domain.
● Loads the eeprom.image file and the Time of Day (TOD) into the
domain.
Restarting obp_helper
If the obp_helper daemon for a domain terminates, you can restart it
by running obp_helper -r with the proper domain SUNW_HOSTNAME
value set. Other than this case, do not try to run obp_helper from the
command line.
Function
● Responsible for preparing the domain for OBP execution.
Remember that the OBP is just a program, and needs memory and
CPU resources properly set up for it to run.
netcon_server
netcon_server is an SSP daemon started by bringup. There is one
netcon_server per active domain. It manages communications
between a domain’s various netcon sessions and the domain specified
by the SUNW_HOSTNAME environment variable.
It also has the responsibility for updating the Enterprise 10000 SNMP
Management Information Base (MIB) with the final domain
configuration information.
Solaris
Modifications have been made to support the Enterprise 10000,
Dynamic Reconfiguration, and Alternate Pathing. All the normal
features of Solaris are provided.
obp
Just as on a regular server, obp builds the domain device tree, and
interprets and executes the FCode resident in the SBus cards.
Note – The OBP is a critical file for domain operation. You should
always have a backup copy of it.
eeprom.image
The eeprom.image file is a binary SSP file that takes the place of the
normal SPARC hardware ID PROM. It is loaded into the domain
during initialization along with the OBP and essentially "customizes"
the OBP for this domain. It contains the:
Each domain must have its own unique eeprom.image file because
each domain has a unique host ID and may have different OBP
environment variable settings.
If you want to see the changes that have been made to the ID PROM
default settings, and additional devalias entries, you can run the
strings command on the eeprom.image file for the domain.
Remember that you can use sys_id to display the ID PROM area of
the eeprom.image for a domain.
ssp:domain% sys_id -d -f eeprom.image.domain_name
IDPROM in eeprom.image.domain_name
Format = 0x01
Machine Type = 0x80
Ethernet Address = 0:0:be:a6:6e:5
Manufacturing Date = Wed Dec 31 16:00:00 1969
Serial number (machine ID) = 0xa66e05
Checksum = 0x3f
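The checksum in this listing can be cross-checked by hand. The sketch below assumes the conventional SPARC ID PROM layout (format, machine type, 6-byte Ethernet address, 4-byte date, 3-byte serial number, with the checksum being the XOR of those 15 bytes) and the usual derivation of the host ID from the machine type and serial number. Neither rule is stated in this module, so treat both as assumptions. It also assumes the date field is all zeros, which is what the 1969 date printed above suggests.

```python
from functools import reduce

def idprom_checksum(fmt, machine_type, ether, date, serial):
    """XOR the first 15 ID PROM bytes (assumed layout, see the text above)."""
    body = bytes([fmt, machine_type]) + ether + date + serial
    assert len(body) == 15
    return reduce(lambda a, b: a ^ b, body, 0)

def host_id(machine_type, serial):
    """Assumed convention: machine type byte followed by the 3-byte serial."""
    return (machine_type << 24) | int.from_bytes(serial, "big")

# Values copied from the sys_id listing.
ether = bytes([0x00, 0x00, 0xBE, 0xA6, 0x6E, 0x05])
date = bytes(4)  # assumed zero (epoch)
serial = bytes([0xA6, 0x6E, 0x05])

checksum = idprom_checksum(0x01, 0x80, ether, date, serial)
print(hex(checksum))               # 0x3f
print(hex(host_id(0x80, serial)))  # 0x80a66e05
```

Under these assumptions the computed checksum agrees with the value that sys_id reports.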
dr-max-mem
● For Solaris 2.6 and above, any nonzero value enables DR.
Reset Handling
More detail on these conditions is in Module 10, "Diagnostic
Information."
sir-sync?
● If set to TRUE, the OBP will try to perform an OBP sync operation
when a SIR (system initiated reset) occurs, caused by a request
from the OS.
xir-sync?
● If set to TRUE, the OBP will try to perform an OBP sync operation
when a XIR (externally initiated reset) occurs, caused by the
hostreset commands.
redmode-sync?
redmode-reboot?
● If set to TRUE, OBP will try to reboot from the default boot disk
when a REDMODE condition occurs.
watchdog-sync?
watchdog-reboot?
● If set to TRUE, OBP will try to reboot from the default boot disk
when a watchdog reset condition occurs.
These parameters are used in support of IDN, and are not discussed
further here. Always leave them set to zero.
For example:
/sbus@41,0/qec@1,20000/qe@3,0
SYSIO 0 is the upper pair of SBus slots in the system board. Each
system board SBus slot is labelled with its SBus and slot number.
Board SYSIO 0 SYSIO 1
0 /sbus@40 /sbus@41
1 /sbus@44 /sbus@45
2 /sbus@48 /sbus@49
3 /sbus@4c /sbus@4d
4 /sbus@50 /sbus@51
5 /sbus@54 /sbus@55
6 /sbus@58 /sbus@59
7 /sbus@5c /sbus@5d
8 /sbus@60 /sbus@61
9 /sbus@64 /sbus@65
10 /sbus@68 /sbus@69
11 /sbus@6c /sbus@6d
12 /sbus@70 /sbus@71
13 /sbus@74 /sbus@75
14 /sbus@78 /sbus@79
15 /sbus@7c /sbus@7d
For example, this means that /pci@68 would decode to board 10, slot
or position 0 (upper slot).
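The table follows a regular pattern: the hexadecimal address starts at 40 for board 0 and rises by 4 per board, with the low bit selecting the SYSIO. The following sketch decodes a device path on that basis. The function name and the constant are illustrative, and the arithmetic is inferred from the table rather than quoted from Sun documentation.

```python
import re

SBUS_BASE = 0x40  # address of board 0, SYSIO 0 in the table

def decode_sysio(path):
    """Return (board, sysio) for a device path such as '/sbus@41,0/...'."""
    match = re.match(r"/(?:sbus|pci)@([0-9a-f]+)", path)
    if match is None:
        raise ValueError("not an sbus/pci device path: " + path)
    offset = int(match.group(1), 16) - SBUS_BASE
    board, sysio = divmod(offset, 4)
    if not 0 <= board <= 15 or sysio not in (0, 1):
        raise ValueError("address outside the table: " + path)
    return board, sysio

print(decode_sysio("/sbus@41,0/qec@1,20000/qe@3,0"))  # (0, 1)
print(decode_sysio("/pci@68"))                        # (10, 0)
```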
/SUNW,UltraSPARC@1f,0
System
Processor 0 Processor 1 Processor 2 Processor 3
Board
Lab
1. Log in to the SSP as user ssp. Enter your domain name, when
prompted, for SUNW_HOSTNAME.
2. Using hostview, verify that the I/O power distribution unit(s) and
your domain system board(s) are on. (Note: I/O components will
not be on until enabled by the SSP power command.)
● From the banner, note the domain memory size, serial number,
ethernet address and host ID.
● Use printenv and devalias to verify the boot disk.
Lab
domain# uname -a
SunOS domain 5.5.1 Generic sun4u sparc SUNW,Ultra-Enterprise-10000
domain# shutdown -y -i0 -g0
Lab
15. Using the Enterprise 10000 CD-ROM in the SSP’s CD-ROM drive,
boot the domain from the CD-ROM to single-user mode. This can
be used for maintenance on the system disk (forgotten root
password, and so on). Remember, the SSP must be correctly set up
as a boot server for this to work.
ssp:domain% bringup -A on net -sw
16. Use ps and df to look at what processes are running and what file
systems are mounted.
domain# ps -ef
domain# df -k
Reboot to the disk.
domain# reboot
Before continuing on to the next module, check that you are able to
accomplish or answer the following:
Think Beyond
Why are the SSP daemons only started on the main SSP?
Why is the boot PROM loaded from the SSP? How else could it be
done?
Course Map
This module describes the concepts, configuration, restrictions,
operation, and control of Alternate Pathing (AP). AP gives you the
ability to have multiple paths to the same device from one domain,
providing an extra degree of fault resilience.
Copyright 1998 Sun Microsystems, Inc. All Rights Reserved. SunService June 1998
Relevance
Objectives
References
AP Concepts
Alternate Pathing (AP) enables you to have two physical paths to the
same A5000 or SSA storage array or network interface, transparent to
the operating system.
Only one path may be active at a time. If a path fails, the alternate path
can be configured active in place of the failed path. Path switching
does not always occur automatically; it may need to be performed
manually.
The system uses the meta-device, a name representing the end object
(such as the disk partition or network interface), but does not use the
physical path names to access the device.
Figure: Alternate Pathing. Disk side: two alternates through the disk
driver (e.g., ssd for SSA), the nexus driver (e.g., pln/SOC for SSA),
and the device (e.g., SSA disk array). Network side: the AP network
meta driver (mxx) with read and write processing, the stream end
(xx driver), and driver routines once per physical network interface.
AP Implementation
The active path can be manually switched to the alternate, at any time,
with no interruption to active traffic using the metadevice. Note that
there is no automatic switch-over to the alternate path if the active
path fails. In the case of Dynamic Reconfiguration, however, disk and
network paths will be automatically switched.
AP Requirements
● For disk alternate paths, both ports of the SSA to be used must be
connected to the domain.
● For network alternate paths, you must have two interfaces of the
same device type (such as qfe and qfe) on the same subnet.
Supported Devices
Disk Devices
AP supports the StorEdge A5000 (Solaris 2.6 only) and SPARCstorage
Arrays.
SCSI devices are not supported. The StorEdge A3000 is not supported,
but has its own internal AP capability.
After you set up Alternate Pathing for disks, you can use Solstice
DiskSuite Version 3.0 and Sun Volume Manager Versions 2.3, 2.4, and
2.5 normally. (However, on installation DMP will automatically disable
itself in Volume Manager 2.5 if AP is already installed.)
Caution – You must make sure that any AP devices used by these
products are used by their meta-device names only.
Supported Devices
You can place your boot disk and primary network interface under AP
control. This makes it possible for the system to boot unattended, even
if the primary network or boot disk controller is not accessible, as long
as a usable alternate path for these devices is defined and available.
Network Devices
The network devices supported by AP are:
● LE Ethernet (le)
● QE Ethernet (qe)
Installing AP
Solaris 2.6
The AP 2.1 packages for Solaris 2.6 are provided on the SMCC Server
Supplements CD-ROM shipped with the Solaris 2.6 server media kit.
Solaris 2.5.1
The AP 2.0 packages for Solaris 2.5.1 are provided on the Alternate
Pathing 2.0 for the Ultra Enterprise 10000 CD-ROM shipped with each
system.
On the SSP:
Documentation:
● SUNWabap – AP AnswerBook
To install AP in the domain, share the AP CD-ROM from the SSP and
mount it to the domain using NFS. Apply any appropriate patches.
Physical Paths
For the purposes of AP, an I/O device is either a disk or network device.
The only types of disk device currently supported by AP are the
StorEdge A5000 (Solaris 2.6 only) and the SPARCstorage Array (SSA).
In this module, the term disk always refers to one of these devices.
The term physical path refers to the electrical path from the host to a
disk or network.
Meta-Disk
A meta-disk is a logical name that enables you to access a disk device
generically—you do not need to specify the particular path to the
device. You reference a meta-disk just as if it were a real device, using
an AP-specific device node such as /dev/ap/dsk/mc0t1d1s0. The
AP software determines which path is active and uses that path to
access the device.
Disk Pathgroup
Make sure that you understand the use of the term alternate. It means
either possible path, not just the spare path. The path in use is the
active alternate.
Only one alternate path at a time is allowed to handle disk I/O. The
alternate path that is currently handling I/O is called the active
alternate.
One of the alternate paths is designated the primary path. The primary
path is initially made the active alternate. Although you can change
which path is the active alternate, the primary path is always the same.
Disk Pathgroup
Some considerations:
● If you are using hubs in your configuration, use a separate hub for
each interface
Meta-Network
Network Pathgroup
● Consider using a separate hub for each path for even more
redundancy
Sample AP Configurations
The above diagram shows how you can use AP to provide fault
tolerance for an Ethernet network and an A5000 storage array.
AP With Mirroring
AP is similar to, but not the same as, disk mirroring. Disk mirroring
replicates data to separate devices and thus achieves data redundancy.
AP, on the other hand, achieves pathing redundancy. Disk mirroring
and AP are complementary; you can use them together to achieve both
data redundancy and pathing redundancy.
In the above example, the mirroring occurs on top of AP, which enables
switching of the underlying adapters used to implement the SSA
mirror from one board to another without disruption of the disk
mirroring or any active I/O.
AP With Mirroring
Device Paths
The above diagram shows the path of an I/O operation in a Volume
Manager or Solstice DiskSuite mirrored environment using AP.
You must dedicate an entire raw disk slice, of at least 300 Kbytes, to
each AP database copy. You can use larger slices, but doing so wastes
disk space since AP won’t need it. It doesn’t matter which slice you
use.
● The copies can be on any slice of any type of disk device. They do
not need to be on devices that AP supports, and do not need to
have alternate paths.
Before you can begin configuring AP, you must create at least one AP
database. The AP database is created with the apdb command. You can
use apdb to create the original database or a copy.
The -c (create) option is followed by the raw disk slice that will
contain the new AP database copy. Each copy requires its own
dedicated slice, which must be at least 300 Kbytes in size.
If you have installed the AP software but have not created at least one
database copy, you will see the following messages on the console
early in the boot process:
WARNING: ap: no database locations
/sbin/apconfig: apd_pathgroup_reset: ioctl() failed.
/sbin/apconfig: ... errno 48
/sbin/apconfig: Error 48
Note – In Solaris 2.6, the last three message lines are not seen.
You must create this database copy twice, specifying each of the
physical paths to the AP meta-disk. For example, if c1 and c9 are
connected to the same AP pathgroup, to create a copy of the AP
database residing on target 3, slice 4, use the following two
commands:
# apdb -c /dev/rdsk/c1t3d0s4 -f
# apdb -c /dev/rdsk/c9t3d0s4
The whole process works outside of AP. AP is not aware that these are
two separate copies of the database.
path: /dev/rdsk/c0t1d0s4
major: 32
minor: 12
timestamp: Thu Jul 27 16:24:27 1995
checksum: 687681819
corrupt: No
inaccessible: No
#
In this example, only one AP database had been created. If there had
been more than one copy, information about each copy would have
been listed.
● The major and minor number of the device that it resides on.
● A contents checksum.
You also use the apdb command to delete a copy of the AP database.
# apdb -d /dev/rdsk/c0t1d0s4 -f
The -d (delete) option specifies the raw disk slice containing the copy
of the AP database that you want to delete.
The -f (force) option is required only when you are deleting either the
last or the next-to-last copy of the AP database.
metanetwork: mle0 U
physical devices:
le2
le0 P A
metanetwork: mle3
physical devices:
le4
le3 P A
Meta-Network Interfaces
A meta-network interface name is derived from the name of the
primary alternate for that meta-network. A meta-network interface
name has the form mxxx where xxx is the primary interface name such
as le0.
For example, assume that the network adapters le0 and le1 connect
to the same Ethernet network. Meta-network device mle0 could
include these two adapters (if the primary adapter is specified as le0).
Similarly, QE Ethernet meta-network names have the form mqen. Note
that you cannot mix le and qe devices in the same pathgroup.
FDDI Devices
FDDI 5.0 meta-network names have the form mnfn. The nf networks
can be either SAS (Single-Attached Station) or DAS (Dual-Attached
Station). AP 2.0 (only) also supports FDDI 3.0 SAS bf devices. You
cannot mix bf and nf devices in a pathgroup.
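The naming rule and the "no mixing" restriction can be expressed as a short check. This is a minimal sketch with hypothetical function names; it assumes that interface names are a lowercase driver name followed by an instance number, as in le0, qe4, or nf1.

```python
import re

def driver_of(interface):
    """Return the driver part of an interface name, e.g. 'le' for 'le0'."""
    match = re.fullmatch(r"([a-z]+)(\d+)", interface)
    if match is None:
        raise ValueError("unexpected interface name: " + interface)
    return match.group(1)

def meta_network_name(primary, alternates):
    """Build the meta-network name for a pathgroup, e.g. 'le0' -> 'mle0'."""
    if primary not in alternates:
        raise ValueError("primary must be one of the alternates")
    if len({driver_of(name) for name in alternates}) != 1:
        raise ValueError("cannot mix device types in one pathgroup")
    return "m" + primary

print(meta_network_name("le0", ["le0", "le1"]))  # mle0
```

Passing a mixed list such as le0 and qe0, or bf0 and nf0, raises an error, which mirrors the restriction described above.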
metanetwork: mle0 U
physical devices:
le2
le0 P A
metanetwork: mle0
physical devices:
le2
le0 P A
For example:
# ifconfig mle0 plumb
# ifconfig mle0 inet 192.9.201.150 netmask + broadcast + up
Setting netmask of mle0 to 255.255.255.224
# ifconfig -a
lo0: flags=849<UP,LOOPBACK,RUNNING,MULTICAST> mtu 8232
inet 127.0.0.1 netmask ff000000
mle0: flags=863<UP,BROADCAST,NOTRAILERS,RUNNING,MULTICAST> mtu 1500
inet 192.9.201.150 netmask ffffffe0 broadcast 192.9.201.159
ether 0:0:be:a6:51:84
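The netmask and broadcast values in this listing can be checked with a few lines of Python. The sketch uses only the standard ipaddress module and the address and netmask printed above.

```python
import ipaddress

# Address and netmask as reported for mle0.
iface = ipaddress.ip_interface("192.9.201.150/255.255.255.224")
network = iface.network

print(network.netmask)                      # 255.255.255.224
print(format(int(network.netmask), "08x"))  # ffffffe0
print(network.broadcast_address)            # 192.9.201.159
```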
An FDDI meta-device must have a unique MACID as well, and one that
does not duplicate the MACID of any other FDDI adapter on the
network. You will need to find an unused MACID for each FDDI
meta-device.
To switch the active interface, use the apconfig command. The change
will occur immediately. There is no commit process for pathgroup
switching.
# apconfig -P mle0 -a le2
You can see that the switch has occurred by using the apconfig -N
command.
# apconfig -N
metanetwork: mle0
physical devices:
le2 A
le0 P
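If you want to check the result of a switch from a script, the listing is simple enough to parse. The following is a rough sketch that assumes the output format shown in this module (a metanetwork line, a physical devices line, then one device per line with optional P and A flags). The function name is hypothetical and the parser has not been validated against every AP release.

```python
def parse_apconfig_n(text):
    """Parse 'apconfig -N' style output into {metanetwork: details}."""
    groups = {}
    current = None
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[:2] == ["physical", "devices:"]:
            continue
        if fields[0] == "metanetwork:":
            # Anything after the name (such as U or D) is kept as a flag list.
            current = {"flags": fields[2:], "primary": None, "active": None}
            groups[fields[1]] = current
        elif current is not None:
            device, flags = fields[0], fields[1:]
            if "P" in flags:
                current["primary"] = device
            if "A" in flags:
                current["active"] = device
    return groups

sample = """\
metanetwork: mle0
physical devices:
le2 A
le0 P
"""
groups = parse_apconfig_n(sample)
print(groups["mle0"]["primary"], groups["mle0"]["active"])  # le0 le2
```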
Warning – When you switch interfaces, AP does not check that the
interface you are going to is the correct path. AP does not know if the
new interface is connected to the wrong subnet, disconnected, or
inoperative.
metanetwork: mle0 D
physical devices:
le2 A
le0 P
Once you have committed the deletion, you must re-create the
pathgroup to restore it.
The primary network interface between your Sun server and the other
machines on the network is the Ethernet interface on the same subnet
as the SSP. You can alternately path this interface. The primary
network interface is the only interface that can be auto-switched to its
alternate at boot time.
During the boot process, if the active interface for the primary network
fails, the OS attempts to find an alternate interface. Note that the AP
database in your domain is used to do this. While a subset of the host’s
AP database resides on the SSP, this is only used to switch the boot
drive if necessary. When the host is ready to configure the primary
network interface, the domain’s AP database is available.
2. Create the new network pathgroup, commit the change, and verify
the result.
# apnet -c -p qe0 -a qe4
# apdb -C
# apconfig -N
metanetwork: mqe0
physical devices:
qe4
qe0 P A
Bring down the physical network interfaces and bring up the meta-
network interface in any of the following ways:
You can also execute these commands all on one line, separated
with semi-colons. Ensure that you do not have any syntax errors.
Just like the network pathgroups, use the apconfig command to view
disk pathgroup entries, but with the -S option.
c1 pln0 P A
c3 pln1
metadiskname(s):
mc1t5d0 U
mc1t4d0 U
mc1t3d0 U
mc1t2d0 U
mc1t1d0 U
mc1t0d0 U
c1 pln0 P A
c3 pln1
metadiskname(s):
mc1t5d0 R
mc1t4d0
mc1t3d0
mc1t2d0
mc1t1d0
mc1t0d0
The P next to pln0 indicates that pln0 is the primary path, and the A
indicates that pln0 is currently the active path. The R next to mc1t5d0
indicates that this is the root (boot) device.
In the case of the A5000, AP uses the sf port, or the Fibre Channel
connection to the host. The socal driver represents the SOC+
card, the sf represents the GBIC on the SOC+ card, and the ssd
driver is the physical disk driver. The ses driver is not seen in the
I/O path. It represents a monitor connection to an interface board.
The naming conventions are the same for the SOC and SOC+
adapters built-in to the Enterprise server I/O boards.
Note – AP 2.1 for Solaris 2.6 supports both A5000 and SSA devices. AP
2.0 for Solaris 2.5.1 only supports SSAs. Unless otherwise mentioned, all
commands apply to both releases and both disk arrays. Examples will
use SSAs to be able to apply to both releases; A5000 output is very
similar.
pln0
/dev/dsk/c1t0d0
/dev/dsk/c1t1d0
/dev/dsk/c1t2d0
/dev/dsk/c1t3d0
/dev/dsk/c1t4d0
/dev/dsk/c1t5d0
pln1
/dev/dsk/c3t0d0
/dev/dsk/c3t1d0
/dev/dsk/c3t2d0
/dev/dsk/c3t3d0
/dev/dsk/c3t4d0
/dev/dsk/c3t5d0
#
Note – This apinst output has been edited to remove the non-Fibre
Channel devices such as SCSI controllers and peripherals. These
usually appear as ispx.
pln Controller
0 1
1 3
2 5
3 7
Using the ssaadm disp command, you can get the World Wide Name
(WWN) for each controller. The WWN is a unique identifier that
identifies every Fibre Channel device, much like an Ethernet MAC
address.
# ssaadm disp c1
CONTROLLER STATUS
Vendor: SUN
Product ID: SSA110
Product Rev: 1.0
Firmware Rev: 3.12
Serial Num: 00000083BE1D
Accumulate Performance Statistics: Enabled
pln Controller Serial Num
0 1 00000083BE1D
1 3 00000083BE1D
2 5 00000083BC49
3 7 00000083BC49
You now can confirm that the same SSA is accessible through c1 (pln0)
and c3 (pln1), and the other through c5 (pln2) and c7 (pln3).
The two pathgroups therefore must consist of pln0 and pln1, and
pln2 and pln3.
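The matching step amounts to grouping ports by the serial number that each one reports. A minimal sketch, using the four rows from the listing above; the function name and the tuple layout are illustrative only.

```python
from collections import defaultdict

# (pln instance, controller number, SSA serial number) from the listing.
ports = [
    (0, 1, "00000083BE1D"),
    (1, 3, "00000083BE1D"),
    (2, 5, "00000083BC49"),
    (3, 7, "00000083BC49"),
]

def candidate_pathgroups(ports):
    """Group pln ports that report the same array serial number."""
    by_serial = defaultdict(list)
    for pln, controller, serial in ports:
        by_serial[serial].append(("pln%d" % pln, "c%d" % controller))
    # Only arrays reachable through more than one port can form a pathgroup.
    return {s: paths for s, paths in by_serial.items() if len(paths) > 1}

for serial, paths in candidate_pathgroups(ports).items():
    print(serial, paths)
```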
c1 pln0 P A
c3 pln1
metadiskname(s):
mc1t5d0 U
mc1t4d0 U
mc1t3d0 U
mc1t2d0 U
mc1t1d0 U
mc1t0d0 U
c1 pln0 P A
c3 pln1
metadiskname(s):
mc1t5d0
mc1t4d0
mc1t3d0
mc1t2d0
mc1t1d0
mc1t0d0
6. Use the ls command to confirm that the device nodes have been
created.
# ls /devices/pseudo/ap_dmd*
/devices/pseudo/ap_dmd@0:128,blk
/devices/pseudo/ap_dmd@0:128,raw
/devices/pseudo/ap_dmd@0:129,blk
/devices/pseudo/ap_dmd@0:129,raw
/devices/pseudo/ap_dmd@0:130,blk
/devices/pseudo/ap_dmd@0:130,raw
...
8. Use the ls command to confirm that the /dev links to the device
nodes have been created.
# ls -l /dev/ap/dsk
total 8
lrwxrwxrwx 1 root 40 Jul 27 16:47 mc1t0d0s0 ->
../../../devices/pseudo/ap_dmd@0:128,blk
lrwxrwxrwx 1 root 40 Jul 27 16:47 mc1t0d0s1 ->
../../../devices/pseudo/ap_dmd@0:129,blk
lrwxrwxrwx 1 root 40 Jul 27 16:47 mc1t0d0s2 ->
../../../devices/pseudo/ap_dmd@0:130,blk
Warning – Remember that you can still access the device through
both physical paths when the meta-device is active if you specify the
physical path name. This may not be safe, as the OS environment as a
whole is not aware that the physical paths are related and can cause
data loss or corruption. To be safe, never access a meta-device through
the physical path unless you are very sure of what you are doing.
Note that if you are placing the boot disk under AP control, you will
also need to modify the vfstab file by using the apboot command.
See Appendix B for further information.
AP 2.1 supports mirrored boot drives. This means that you could have
four physical paths to your boot volume, two to each copy.
Disabling DMP
SEVM 2.5 DMP is incompatible with both AP 2.0 and 2.1 (and with
Sun Cluster 2.0 and 2.1). It must be disabled to allow these products to
function correctly.
Note – You can perform a switch at any time, even while I/O is
occurring on the device. You might want to experiment with the
switching process to verify that you understand it and that your
system is set up properly, rather than wait until a critical situation
occurs.
Warning – When you switch paths, AP does not check that the path is
correct. That is, it does not check to see if the same device is accessed
by each path. It does determine whether or not that path is detached
or off line. You may want to verify the status of the path before
switching to it by using a command such as prtvtoc. AP does not
produce any error or warning messages if you switch to a path that is
not functioning properly. If you switch to a non-functioning path for
your boot disk, your system may crash if the path is not switched back
immediately.
c1 pln0 P A
c3 pln1
metadiskname(s):
mc1t5d0
mc1t4d0
mc1t3d0
mc1t2d0
mc1t1d0
The syntax is confusing because the primary path and the pathgroup
name are the same. Be careful.
3. Verify the results with the apconfig -S command. You can see
that the active alternate has been switched to pln1.
# apconfig -S
c1 pln0 P
c3 pln1 A
metadiskname(s):
mc1t5d0
mc1t4d0
mc1t3d0
mc1t2d0
mc1t1d0
Use the apconfig command, specifying that the primary path is now
to be the active path.
# apconfig -P pln0 -a pln0
# apconfig -S
c1 pln0 P A
c3 pln1
metadiskname(s):
mc1t5d0
mc1t4d0
mc1t3d0
mc1t2d0
mc1t1d0
Do not confuse the pathgroup name with the physical path name. It is
easy to do.
c1 pln0 P A
c3 pln1 T
metadiskname(s):
mc1t5d0
mc1t4d0
mc1t3d0
mc1t2d0
mc1t1d0
● Resetting the flag manually with apdisk -w. Specify the tried path,
not the pathgroup name.
# apdisk -w pln1
#
Note – Resetting the flag manually should only be done after the cause
of the failure has been repaired.
You can still manually switch to a path marked tried with the
apconfig -P command.
c1 pln0 P A
c3 pln1
metadiskname(s):
mc1t5d0 D
mc1t4d0 D
mc1t3d0 D
mc1t2d0 D
mc1t1d0 D
mc1t0d0 D
To allow for unattended system boot even if the I/O adapter for the
boot disk fails, you can place your domain boot disk under AP control.
Because AP only works with A5000s (Solaris 2.6 only) and SSAs, the
boot device must reside on a drive of one of these types.
If you have encapsulated and mirrored your boot disk with Volume
Manager, AP will attempt to recover from boot device problems before
Volume Manager attempts to use a mirror drive. AP 2.1 will retry
using the mirror device if it has alternate paths.
This discussion applies to both AP 2.0 (for Solaris 2.5.1) and AP 2.1
(for Solaris 2.6) unless otherwise noted.
# apboot mc2t0d0
5. At this point, just reboot the system to begin using the AP boot
device.
Caution – If you place the boot disk under AP control, you must
manually edit /etc/vfstab to also place other file systems that are
mounted during the boot process under AP control.
In the /etc/vfstab file, you must change the device to mount and
device to fsck paths for all of the other mount points that you want
to place under AP control.
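The path change itself is mechanical, as the sketch below illustrates. It assumes that the entry being converted names the primary path of the pathgroup (the meta-disk name is derived from the primary controller), and that raw meta-disk nodes live under /dev/ap/rdsk in the same way that block nodes live under /dev/ap/dsk. The raw directory is not shown in this module, so verify it on your system. The function name is hypothetical.

```python
import re

def to_meta_path(device):
    """Map a physical disk path to its meta-disk path, or return it unchanged."""
    match = re.fullmatch(r"/dev/(r?dsk)/(c\d+t\d+d\d+s\d+)", device)
    if match is None:
        return device  # not a disk slice path, leave as is
    return "/dev/ap/%s/m%s" % match.groups()

print(to_meta_path("/dev/dsk/c1t0d0s0"))   # /dev/ap/dsk/mc1t0d0s0
print(to_meta_path("/dev/rdsk/c1t0d0s0"))  # /dev/ap/rdsk/mc1t0d0s0
```

Both the device to mount and the device to fsck fields of each affected entry would be converted this way.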
To remove AP support from your boot disk, use the apboot command
to specify a physical device node. For example, to use a non-AP device,
use
# apboot c2t0d0
apboot will also edit the /etc/system file to remove the force
loading of the AP kernel driver modules, because they are no longer
immediately needed when the boot disk is not an AP device.
Warning – If you place the boot disk under AP control and later
decide to remove the AP packages, you must first use apboot to
remove the boot disk from AP control. If you do not, the system on
that disk becomes unbootable.
The apboot command allows you to access a single boot drive through
a physical alternate path. If you want to mirror your boot drive, that is,
if you have two separate drives containing copies of the boot drive, you
can still use AP with them. The mirror must be built using the meta-
disk names for the devices.
Make sure that you have the AP 2.1 support patch on the SSP.
3. Tell the system about the mirrored drive using apboot -m.
# apboot -m mc5t3d0
4. Create the mirror using the meta-disks if you have not already
done so with your disk management software.
# apboot -u mc5t3d0
2. OBP sends the specified boot-device path to the boot disk on the
SSP.
Caution – One or two minutes may pass before visible action is taken,
so do not immediately intervene if you notice that the boot process has
failed.
5. The AP SSP daemon looks up the alternate path for the boot disk
in the AP SSP database, then retries the boot process with the
other alternate path.
● An active alternate for a disk, other than the boot disk, turns out to
be inaccessible and that disk is required during the boot process.
Only the boot drive will automatically have an alternate path tried
if the active alternate fails. All other drives required at boot time
(for example, those listed in /etc/vfstab) must be switched
manually if their active alternate is not available.
These situations will occur only with disks, not network interfaces. In
either case, however, you may be able to use the AP commands in
/sbin to resolve the problem.
Lab
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
3. Verify the requirements to set up alternate paths for disks and the
network interface. Make sure that your domain is configured
properly to create the network and disk pathgroups.
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
10. Switch the active alternate and verify access to the drive.
____________________________________________________________
____________________________________________________________
11. If you are using AP 2.1, disconnect the active device path, if
possible. Watch what happens. Restore the path and reset the T
flag.
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
Before continuing on to the next module, check that you are able to
accomplish or answer the following:
Think Beyond
Course Map
This module covers the operation, configuration, and management of
Dynamic Reconfiguration. It discusses the system requirements and
procedures for both DR Attach and DR Detach, interaction with AP,
and the restrictions and problems you may encounter during the DR
process.
Relevance
Objectives
● Discuss the restrictions and problems that can occur with DR.
References
Once you have removed the board from the domain, you may power it
off and physically remove it from the system. When you add a new
board, after installing it and powering it on, you can add it to a
domain.
You must add or remove one entire board to or from the domain at a
time. There is no DR mechanism for removing board components from
a domain, nor for DR operations involving multiple boards.
Dynamic Reconfiguration Capabilities
When to Use DR
You might reconfigure the system for several different reasons.
● New Domain Creation - You are splitting system boards from one
or more existing domains or deleting domains entirely, to add the
system boards to a new or existing domain. Remember that it may
be easier to halt the old domain, then delete and re-create it to free
up system boards, rather than do multiple DR operations.
Dynamic Reconfiguration
All DR operations are performed from the SSP, which interacts with
the control board and the DR daemon in the host domain.
You can set dr-max-mem from the ok prompt with setenv or from the
OS with the eeprom command. Both methods require a bringup of the
domain before they take effect. You can also set dr-max-mem from the
ok prompt with setenv-dr, which only requires a reset (reboot) to
take effect since it makes the change in memory as well as on the SSP.
You can use DR attach to dynamically add a board and its physical
memory after the domain is booted. However, the extra memory
cannot be used unless enough memory data structures were allocated
at boot time to support it. In Solaris 2.5.1, they cannot be extended
dynamically after boot time.
If you add system boards to the domain totaling more memory than
dr-max-mem specifies, only the boards’ processors and I/O devices can
be attached (without a reboot). Memory on these system boards sits
idle.
Considerations
Do not set dr-max-mem too high. At least 8 Mbytes of system memory
is consumed for each 1 Gbyte of memory that the domain is to
support. If you never attach the memory, the reserved memory used
for the page structures is wasted.
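To get a feel for the cost, the stated lower bound can be turned into a one-line calculation. This sketch assumes that dr-max-mem is expressed in Mbytes (the boot messages report sizes in MBytes) and uses only the "8 Mbytes per 1 Gbyte" figure given above. The actual overhead may be higher.

```python
OVERHEAD_MB_PER_GB = 8  # "at least 8 Mbytes ... for each 1 Gbyte"

def min_overhead_mb(dr_max_mem_mb):
    """Lower bound on memory reserved for page structures, in Mbytes."""
    return OVERHEAD_MB_PER_GB * dr_max_mem_mb / 1024

print(min_overhead_mb(4096))   # 32.0
print(min_overhead_mb(16384))  # 128.0
```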
If the value of dr-max-mem is set smaller than the amount of physical
memory present when the domain is booted, the operating system
instead uses a value equal to the current memory size. This means that
you cannot attach more memory to the domain, but you can detach
and then reattach up to the current amount of memory. The maximum
amount of memory you can reattach is the amount that was present
when the domain was booted. Additional memory attached is ignored.
Actual dr-max-mem Setting    Effective dr-max-mem Setting    Effect on DR
0                            0                               No DR
Memory size < 512 Mbytes     0                               No DR
< memory size                Memory size                     Cannot increase
memory size                  Memory size                     Cannot increase
> memory size                Value set                       Increase to limit set
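The rows of the dr-max-mem table can be read as a small decision function. The sketch below is one interpretation, not Sun's implementation: it assumes all values are in Mbytes, and it reads the second row as "DR is disabled whenever the domain has less than 512 Mbytes of memory". The function name is hypothetical.

```python
def effective_dr_max_mem(setting_mb, memory_mb):
    """Return (effective setting in Mbytes, effect on DR) per the table."""
    if setting_mb == 0 or memory_mb < 512:
        return 0, "No DR"
    if setting_mb <= memory_mb:
        # The OS substitutes the current memory size for a smaller setting.
        return memory_mb, "Cannot increase"
    return setting_mb, "Increase to limit set"

print(effective_dr_max_mem(1024, 2048))  # (2048, 'Cannot increase')
print(effective_dr_max_mem(4096, 2048))  # (4096, 'Increase to limit set')
```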
To set the dr-max-mem, from the OBP prompt for the domain type:
ok setenv-dr dr-max-mem NNNNN
At boot time, if the dr-max-mem is nonzero, you will see the following
messages:
DR: current memory size is XXXXX MBytes
DR: capacity to allow an additional YYYYY MBytes of memory
DR Attach
Requirements
To be able to attach a system board to a domain:
If all of these requirements are not met, you will not be able to do the
DR attach.
DR will block the attach if its requirements are not met. You can ask
DR if the requirements are met before starting the attach.
The domain dr_daemon tracks the state of the attach operation. For
example, once the Init Attach operation is completed successfully, the
daemon remembers this state. You can return to an unfinished DR
operation later and complete or abort the attach at that time.
DR Attach
Operation
DR attach is a two- and sometimes a three-step process. The first two
steps are always required. The process is the same from hostview or
from the dr shell.
● OBP probes the board devices and builds the device tree.
2. complete_attach – Gives the board to the OS.
Note that the entire DR attach operation is run from the SSP, even
though it performs work in the domain.
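Because the DR daemon remembers how far an attach has progressed, the operation behaves like a small state machine. The following sketch models that idea only. The state names (Present, Init Attach, In Use) are the board states used elsewhere in this module; the operation names, the class, and the result of an abort are illustrative assumptions, not the dr_daemon interface.

```python
TRANSITIONS = {
    ("Present", "init_attach"): "Init Attach",
    ("Init Attach", "complete_attach"): "In Use",
    ("Init Attach", "abort_attach"): "Present",  # assumed outcome of an abort
}

class BoardAttach:
    """Tracks one board's attach state across separate operations."""

    def __init__(self):
        self.state = "Present"

    def step(self, operation):
        key = (self.state, operation)
        if key not in TRANSITIONS:
            raise ValueError("%s is not valid in state %s" % (operation, self.state))
        self.state = TRANSITIONS[key]
        return self.state

board = BoardAttach()
print(board.step("init_attach"))      # Init Attach
print(board.step("complete_attach"))  # In Use
```

The point of the model is that the second step can be issued later, in a separate session, because the state is held between operations.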
Checking environment...
Establishing Control Board Server connection...
Initializing SSP SNMP MIB...
Establishing communication with DR daemon...
System Failures
If the domain fails during the DR operation, it is frozen in its current
state. You may need to run bringup to clear the operation.
With hostview, you perform the same steps to attach a system board
that you do with dr. hostview will fill in some defaults and command
fields for you, and give you the ability to track the progress of the
operation graphically.
● dismiss – Aborts the currently active step, and leaves the board in
its current state (Present, Init Attach, or In Use).
3. Click on the top Select button, and the Board and Source Domain
fields will be filled in for you.
The source domain is the domain that the board currently belongs
to. If the board is not a member of any domain, the source domain
name will always be no_domain. This is filled in for you.
4. Fill in the target domain name or, in the main hostview window,
select the domain to which you want to attach the board.
5. Click Execute.
Clicking on init attach begins the first phase of the board attach
process. When this phase is complete, the caption on the button
changes to complete.
8. At this point, the system board is ready to be used by the OS. You
release it to the OS by clicking on complete.
The system board resources (processors, memory, and I/O devices) are
now available to the operating system.
9. Click on dismiss.
Disk Devices
Syntax
Where
When you are using hostview to do a DR operation, you can view the
system information by using the buttons shown in the image.
If you click on All, all of the currently available items are displayed.
For each processor on the selected board, the window shows the
numeric ID, processor status (Online or Offline), and any bound
thread information.
● The highest and lowest physical pages that reside in this board’s
memory.
The memory drain display will show one of the following states:
As an aid in tracking drain progress, the drain operation start time and
current time are also displayed.
The controllers or devices installed in each slot are listed. Devices are
listed by the instance number (for example, sd31). You can use
/etc/path_to_inst to decode these if necessary.
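The lookup described above can be sketched in Python. This is an illustration only: it assumes the usual /etc/path_to_inst line layout ("physical-path" instance "driver"), and the sample entries and function name are hypothetical, not taken from a real domain.

```python
import re

# Assumed layout of each non-comment line in /etc/path_to_inst:
#   "physical-path" instance-number "driver-name"
PATH_TO_INST_LINE = re.compile(
    r'^"(?P<path>[^"]+)"\s+(?P<inst>\d+)\s+"(?P<drv>[^"]+)"\s*$'
)

# Hypothetical sample content, for illustration only.
SAMPLE_PATH_TO_INST = '''#
# Caution! This file contains critical kernel state
#
"/sbus@4c,0/QLGC,isp@1,10000/sd@3,0" 31 "sd"
"/sbus@4c,0/QLGC,isp@1,10000/sd@4,0" 32 "sd"
'''


def find_physical_path(path_to_inst_text, instance_name):
    """Return the physical path for an instance name such as 'sd31', or None."""
    for line in path_to_inst_text.splitlines():
        match = PATH_TO_INST_LINE.match(line.strip())
        # The instance name is the driver name followed by the instance number.
        if match and match.group("drv") + match.group("inst") == instance_name:
            return match.group("path")
    return None


print(find_physical_path(SAMPLE_PATH_TO_INST, "sd31"))
```

In practice you would read the file contents from the domain rather than use the sample string.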
This display shows the devices that are logically present to the OS; it
does not always show all the devices that are physically present on the
board. For example, controllers whose drivers are unattached will not
appear in the list.
The physical device display that is available using the obp button
shows all of the cards on the board that were configured in the system.
The window includes an open count (if available) and the name by
which the device is known to the OS. This might be a disk partition,
meta-device, or an interface or instance name. Additional information
may be provided including the partition mount points, network
interface configuration, swap space usage, and meta-device usage.
There are some forms of device usage which may not be reported.
Examples are the raw disk partitions used for Solstice DiskSuite,
Alternate Pathing databases, and Sun Volume Manager.
For example, in the Init Attach state, only the I/O adapters are
known, not the devices attached to them or the memory interleave
configuration. The OBP window is usually most useful when a board
is in the Init Attach state.
This display shows the suspend-unsafe devices that are currently open
(in use). This information is useful for determining the cause of
operating system quiesce errors from unsafe devices. In this example,
no unsafe devices are open.
DR Detach
Requirements
To be able to detach a system board from a domain:
Warning – DR does not check to see if you will have enough swap
space before starting the detach. You can use DR to determine how
much memory needs to be drained from the board, then use swap -l to
determine if the current amount of system swap space is sufficient. The
system will hang if there is insufficient space.
DR will disallow the detach request (at some point) if its requirements
are not met.
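The manual swap check described in the warning can be sketched in Python. The sketch assumes the usual five-column swap -l listing (swapfile, dev, swaplo, blocks, free) with sizes in 512-byte blocks. The sample listing and the function names are hypothetical.

```python
# Hypothetical swap -l listing, for illustration only.
SAMPLE_SWAP_L = """swapfile             dev  swaplo blocks   free
/dev/dsk/c0t0d0s1   32,1      16 1048576 1048576
/dev/dsk/c1t0d0s1   32,9      16 1048576 1048576
"""


def free_swap_mbytes(swap_l_output):
    """Sum the 'free' column of swap -l output and convert it to Mbytes."""
    free_blocks = 0
    for line in swap_l_output.splitlines():
        fields = line.split()
        # Skip the header line and anything that is not a five-column entry.
        if len(fields) != 5 or fields[0] == "swapfile":
            continue
        free_blocks += int(fields[4])
    return free_blocks * 512 / (1024 * 1024)


def swap_is_sufficient(swap_l_output, mbytes_to_drain):
    """Return True if free swap covers the memory that must be drained."""
    return free_swap_mbytes(swap_l_output) >= mbytes_to_drain


print(free_swap_mbytes(SAMPLE_SWAP_L))
print(swap_is_sufficient(SAMPLE_SWAP_L, 2048))
```

This only compares totals; it does not account for swap that the production workload will consume while the drain is running.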
drain
The primary function of the drain operation is to empty all of the
physical memory on the board being detached. If sufficient remaining
memory or swap space is not available when the drain operation is
requested, the request fails.
hostview and dr are available to monitor the drain operation. You can
view the current status of the drain operation including the number of
memory pages remaining to be emptied. The drain operation is
complete when all memory on the detaching board is free.
If you decide not to proceed with the detach operation, you can abort
the operation, and the board's memory is returned to regular usage.
complete_detach
Before the detach operation can be completed, you must terminate all
use of board resources (processors, memory, and I/O devices). DR
terminates the use of memory, processors, and network devices
automatically, while you must manually terminate the use of all non-
network I/O devices.
reconfig
The reconfig step runs the drvconfig command and then the disks,
tapes, ports, and devlinks commands, deleting the removed
devices from the system’s configuration directories. Do not run
reconfig if you will later be returning the devices to the domain.
When all board memory usage is terminated, you can try to complete
the detach process with the complete_detach operation.
● Abort – You can abort the detach operation at any time before it
completes.
If you abort the detach, the board's memory is returned to the OS and
all detached board devices are reattached.
The board is now available for any desired use. You may:
Enabling DR Detach
DR detach requires that the OBP parameter dr-max-mem be set to a
nonzero value. This setting is required at the time the domain is
booted. If the value is zero, you will not be able to perform any DR
operations on the domain.
I/O Devices
To be able to remove system boards containing critical system
resources, the system must be properly configured.
You will need to plan ahead for this. If you cannot end usage of even
a single device attached through the board, you cannot remove the
board.
The same applies to network controllers. The board that hosts the
interface that connects the SSP to the domain cannot be detached
unless an alternate path exists on another board.
A board hosting non-vital or replaceable system resources can be
detached whether or not there are alternate paths to the resources.
There are still a series of requirements that must be met:
● All of the board's devices still must be closed before the board can
be detached.
● You must have the system discontinue using all of its raw
partitions on the drives being removed.
These actions are not done by the system; they must be done manually
before the board can be detached. You may have to kill processes that
have open files or devices using the board’s resources.
FDDI
If FDDI interfaces are detached, DR kills the FDDI network monitoring
daemon before performing the detach operation, and then restarts it
after the detach is complete.
● On the same subnet as the SSP host for the system. Because DR
operations are initiated on the SSP, control of the detach process
would be lost.
● For Solaris 2.5.1, the active alternate for an Alternate Pathing (AP)
meta-network device when the AP meta-device is plumbed.
(Manually switch the active path to one that is not on the board
being detached.)
In the hostview device display and in the drshow I/O listing, there is
an open count field that indicates how many processes are using a
particular device. To see which processes have these devices open, use
the fuser command.
● Either kill any process that directly opens a device or raw partition
from the board, or direct it to close the open device.
DR Detach-Safe Devices
Not all device drivers support DR. This means that devices managed
by those drivers will require special attention during DR detach.
You can detach a system board that hosts a device only if the driver for
that device supports the DDI_DETACH interface or if the device driver
is not currently loaded into memory. A driver that supports
DDI_DETACH is called detach-safe; a driver that does not support
DDI_DETACH is called detach-unsafe.
Caution – If you are not sure whether a device can be safely detached,
ask your service provider or vendor. Do not add it to the list first.
1. Stop all usage of the controller for the detach-unsafe device, and
stop all other controllers of the same type on all boards in the
domain.
4. You can now resume use of the remaining devices. The driver will
be reloaded by Solaris.
If you cannot execute the above steps, you can reboot your domain
with the board or interface card blacklisted, or you can remove the
board from the domain while the domain is down.
Using modunload
Once you have identified the driver that you need to remove from
memory, you must run the modunload command to delete it from the
kernel. First you must use the modinfo command to get the device
driver’s ID number. It is not always the same, and is not the driver
major number from /etc/name_to_major.
1. Run the modinfo command to get the driver ID. The driver ID is
the first number in the modinfo output, in this case 107.
# modinfo | grep tape
107 f66a0000 dfe9 33 1 st (SCSI tape Driver 1.173)
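The step above can be sketched in Python: extract the module ID from a modinfo listing and build the matching modunload -i command. The column layout (ID, load address, size, info, revision, module name) is assumed from the sample line above, and the function names are hypothetical.

```python
# The sample line is the one shown in the modinfo output above.
SAMPLE_MODINFO = "107 f66a0000 dfe9 33 1 st (SCSI tape Driver 1.173)\n"


def driver_module_id(modinfo_output, driver_name):
    """Return the module ID for driver_name, or None if it is not loaded."""
    for line in modinfo_output.splitlines():
        fields = line.split()
        # Assumed columns: Id, Loadaddr, Size, Info, Rev, module name, description
        if len(fields) >= 6 and fields[0].isdigit() and fields[5] == driver_name:
            return int(fields[0])
    return None


def modunload_command(module_id):
    """Build the argument list for unloading a module by its ID."""
    return ["modunload", "-i", str(module_id)]


module_id = driver_module_id(SAMPLE_MODINFO, "st")
print(modunload_command(module_id))
```

Because the ID can differ between boots, look it up each time rather than reusing an earlier value.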
Swap Space
The amount of additional swap space that you will need is equal to the
amount of mainstore on two domain system boards. To be able to
handle every case, you need to plan using the largest memory amount
on any domain system board. The full amount will double if you are
using eight-way interleave, when it is supported by DR, because you
would need to empty twice as many boards.
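The planning rule above can be expressed as a short calculation. This is a sketch under the stated rule only (twice the largest board memory, doubled again for eight-way interleave); the function name and the sample board sizes are hypothetical.

```python
def additional_swap_mbytes(board_memory_mbytes, eight_way_interleave=False):
    """Estimate the extra swap space needed to drain boards during DR detach.

    board_memory_mbytes: memory size, in Mbytes, of each system board
    in the domain.
    """
    # Plan with the largest memory amount on any board, times two boards.
    needed = 2 * max(board_memory_mbytes)
    # Eight-way interleave would require emptying twice as many boards.
    if eight_way_interleave:
        needed *= 2
    return needed


print(additional_swap_mbytes([1024, 2048, 512]))
print(additional_swap_mbytes([1024, 2048, 512], eight_way_interleave=True))
```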
Depending on how short of swap space you are, the DR operation may
fill all available swap space and take down the system, as mentioned
earlier.
Also, remember that you need enough available swap space to run
your production workload with adequate performance.
Caution – Make sure that you have enough space in the new primary
swap partition (and in /var) to contain a full domain panic dump,
approximately 500 to 800 Mbytes.
Memory Interleaving
By default, the system does not set up system boards with interleaved
memory. To allow interleaving, the following line must be in .postrc:
mem_board_interleave_ok
Memory Usage
Correctable Errors
Correctable memory error reporting can interfere with DR.
recordstop dumps are taken by the SSP when one occurs. Multiple
dumps can prevent DR from completing its drain processing.
To prevent this, you may need to temporarily disable edd using the
edd_cmd command. Remember to restart edd processing after the DR
operation completes.
Note that the entire DR detach operation is run from the SSP. The only
host operations required are those to prepare the I/O devices for the
detach.
Checking environment...
Establishing Control Board Server connection...
Initializing SSP SNMP MIB...
Establishing communication with DR daemon...
2. Drain the board with the drain command. You can drain only one
board at a time.
The drain command will return immediately, but the drain may
not be finished. If you want the drain command to complete only
when the board has been completely drained, use the wait option
(drain 6 wait).
dr> drain 6
Removing board 6 from domain_config file.
Start draining board 6
Board drain started. Retrieving System Info...
dr>
3. You can monitor the progress of the drain operation with drshow.
dr> drshow board_number drain
With hostview, you perform the same steps to detach a system board
that you do with dr. hostview will fill in some defaults and command
fields for you and give you the ability to track the progress of the
operation graphically.
3. Click on Select. The Board and Source domain fields will be filled
in for you.
4. Click on Execute.
If the target domain is not active, the attach operation simply changes
the domain configuration file on the SSP and completes.
Continue with the next steps without waiting; they do not depend on
completion of the drain operation.
8. You can configure the update time interval for the Hostview DR
windows by clicking on the properties button.
● dismiss – Aborts the currently active step and leaves the board in
its current state (Present, Init Detach, or In Use).
● All of the on-line processors in the domain are on the board being
detached.
● Not all usage of the detaching I/O devices has been stopped.
● Quiesce failed.
When the failure is resolved, you can select either complete or force to
complete the detach.
To do this, DR must quiesce the system. Quiesce implies that all system
operations will be suspended, including I/O operations, for the period
of time that it takes to move the data from the detaching system board
to a remaining board. The quiesce process can take a minute or more.
It is always better to configure your system so that any board that you
will be detaching is not the lowest numbered board in the domain.
1. drain
2. complete_detach
● The OS is resumed.
Warning – Be very careful when using the force option. You could
do serious damage to critical system data by panicking the domain.
If you cannot make a device suspend its activity, you should not force
the operating system to quiesce. Doing so could cause the domain to
crash or hang. Instead, delay the DR operation until the suspend-
unsafe device is no longer open.
A suspend-safe device is one that does not use the domain centerplane
while the operating system is quiesced. This means that the device
must not transfer any data, reference memory, or generate any
interrupts during the quiesce operation.
Tape Devices
The sequential nature of tape devices prevents them from being
reliably suspended in the middle of an operation and then resumed.
You cannot stop a read and restart it; the tape is moving. Therefore all
tape drivers are considered suspend-unsafe and cannot be quiesced.
Before executing a DR operation that requires a quiesce, make sure all
tape devices are closed or inactive and the driver unloaded.
To add new devices that support quiesce to the /etc/system file, use
the following format, where drivern represents the device driver
module name:
set hswp:suspend_safe_list1="driver1 driver2 ... drivern"
set hswp:suspend_safe_list2="driver1 driver2 ... drivern"
set hswp:suspend_safe_list3="driver1 driver2 ... drivern"
set hswp:suspend_safe_list4="driver1 driver2 ... drivern"
set hswp:suspend_safe_list5="driver1 driver2 ... drivern"
Solaris has a preset list of devices that it ignores during the quiesce
process, making no attempt to quiesce them. These devices, which
include the OS pseudo devices, do not perform any actual I/O
operations and so do not need to be suspended during the quiesce.
Quiesce Operation
The following transcript from a Solaris 2.6 system shows the domain
message traffic for a DR detach of the lowest-numbered system board,
Board 0, one that contains permanent memory. Comments are made in
italics.
Other than the apconfig commands, nothing was entered from the
domain console.
The drain operation has completed.
DR op: DRAIN BOARD (board 0)...
# apconfig -N
metanetwork: mqe0
physical devices:
qe4
qe0 P A DR
# apconfig -S
c1 pln3
c0 pln0 P A DR
metadiskname(s):
mc0t5d0 R
mc0t4d0
mc0t3d0
mc0t2d0
mc0t1d0
mc0t0d0
Note that the interfaces on board zero are
marked DR for drain. The active interfaces are
still on board 0.
The SSP now signals the domain to perform the
complete_detach.
# DR op: MOVE CPU0 (move CPU0 from 0 to 4)
CPU 4 has been chosen to run the quiesce since 0
is being detached.
DR op: DETACH BOARD (board 0)...
metanetwork: mqe0
physical devices:
qe4 A
qe0 P DE
# apconfig -S
c1 pln3 A
c0 pln0 P DE
metadiskname(s):
mc0t5d0 R
mc0t4d0
mc0t3d0
mc0t2d0
mc0t1d0
mc0t0d0
The AP interfaces on board 0 are now marked DE
for detached. Notice that the AP switch occurred
automatically (and quietly).
DR and AP Interaction
DR also asks AP about the pathgroups and alternates that are in the
AP database and what their status is (active or inactive).
DR Attach
● You must run apconfig -F to clear the detached flag on disk
pathgroup alternate paths that have been reattached.
DR Detach
● If the board has a path to an AP database copy, the copy will be
disconnected and marked inaccessible in the other databases.
● The AP database state flags are not always correctly updated. Run
apconfig -F to refresh the state of all AP pathgroups.
Lab
c. Click the top select button to fill in the board for the attach
operation.
f. Click on execute.
i. Click on complete.
Open Hostview and DR detach the board that you just added.
a. Turn off power to this board. Select power and edit the
command line: power -off -sb X.
b. Select execute.
6. Boot the new domain from the network as if you were going to
install the software from the CD-ROM in the SSP.
<#8> ok boot new-net-alias
7. While the domain is booting, examine the SSP directories for the
new domain:
$SSPVAR/adm/new_domain
$SSPVAR/etc/platform/new_domain
$SSPVAR/adm/domain2
$SSPVAR/etc/platform/domain2
7. Configure and boot the new domain to the OS from its disk. Use
only the command line.
8. Use hostint to force a panic in the new domain. You might want
to enter sync in the netcon window first.
ssp% domain_switch new_domain
ssp% hostint
Before continuing on to the next module, check that you are able to
accomplish or answer the following:
❑ Discuss the restrictions and problems that can occur with DR.
Think Beyond
Why does the OS need to perform the quiesce operation when there is
permanent memory on the board being detached?
What would be the best way to combine two 8-board domains? Why?
Course Map
This module discusses the various failures that might occur in an
Enterprise 10000 system and how to obtain and save diagnostic
information about them.
10-1
Copyright 1998 Sun Microsystems, Inc. All Rights Reserved. SunService June 1998
10
Relevance
Objectives
References
The Enterprise 10000 system has four address buses and two data
buses. It can continue to run, in many cases, when one or more of
these buses has failed. There may be multiple configurations that will
work in these cases, but usually there is only one optimum
configuration.
Similarly, the 144-bit data path is actually two 72-bit paths that
normally operate together. But remember that the Enterprise 10000 can
operate with either one or two data buses. One 2-bus and two 1-bus
data bus configurations are possible.
The bus configurations are denoted by the bit mask of the buses they
use. The address bus configurations are numbered 1–F, where F (or 15
decimal) is the normal four-address bus mode. The data bus
configurations are numbered 1–3, where 3 is the normal two-data bus
mode. In some cases a compound bus configuration is used in POST
displays as a two-digit hex value with the databus configuration first,
so the normal all-buses active configuration is 3F.
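The two-digit notation can be decoded with a few lines of Python. This sketch assumes that bit n of each mask corresponds to bus n, which the text implies but does not state outright. The function name is hypothetical.

```python
def decode_bus_config(config):
    """Decode a two-digit hex bus configuration such as '3F'.

    Returns (data_buses, address_buses) as lists of bus numbers.
    Assumption: bit n of each mask stands for bus n.
    """
    if len(config) != 2:
        raise ValueError("expected two hex digits, data bus digit first")
    data_mask = int(config[0], 16)
    addr_mask = int(config[1], 16)
    if not 1 <= data_mask <= 3:
        raise ValueError("data bus configuration must be 1-3")
    if not 1 <= addr_mask <= 0xF:
        raise ValueError("address bus configuration must be 1-F")
    data_buses = [bus for bus in range(2) if data_mask & (1 << bus)]
    address_buses = [bus for bus in range(4) if addr_mask & (1 << bus)]
    return data_buses, address_buses


print(decode_bus_config("3F"))  # normal mode: both data buses, all four address buses
print(decode_bus_config("35"))  # both data buses, two of the four address buses
```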
● Boards 6, 8, and 14
● Boards 6, 12 and 14
Usually the system chooses configuration 3F, meaning that all buses
are available.
The following is the output from a bringup run in a domain with six
system boards. The interface between a system board and an address
bus was disabled on two different boards. As you can see, the system
decided to configure with only two address buses, since it could best
access all of the system boards and their components that way.
...
phase final_config: Final configuration...
34 43776.00 24 3072 19 2 1
35 87552.00 24 3072 19 2 2
36 51200.00 20 2560 16 2 2
37 76800.00 20 2560 16 2 3
38 25600.00 20 2560 16 2 1
39 51200.00 20 2560 16 2 2
3A 26624.00 16 2048 13 2 2
3B 39936.00 16 2048 13 2 3
3C 51200.00 20 2560 16 2 2
3D 76800.00 20 2560 16 2 3
3E 39936.00 16 2048 13 2 3
3F 53248.00 16 2048 13 2 4
When bringup is run, hpost will read two syntactically identical text
files called the redlist and blacklist files.
Changes specified in these files take effect the next time a domain runs
bringup. They will not be seen by an affected domain until bringup is
run.
You do not need to run autoconfig if all of the system boards are at
the same revision level. If you are not sure, run it against the new
board(s).
● Chip ID files
See the autoconfig man page for the command syntax. It takes
several minutes to run.
Diagnostic Tools
hpost
Although normally run by the bringup command, hpost can be run
from the SSP command line for diagnostic purposes.
Warning – hpost will crash a running domain. It does not check the
status of the domain before starting.
hpost can be run on one domain while other domains are running.
Warning – Make sure you have the right domain name specified in the
SUNW_HOSTNAME variable.
Warning – If you do not understand what the option does, do not use
it. You could cause significant damage to the system.
● -llevel, where level can be from 7 to 127. The default is 16. Levels
above 64 take considerable time and may not add a lot of value for
field-level troubleshooting.
● -vlevel, where level can be from 0 to 255. The default is 20. Running
at level 255 is useful (once) to understand how hpost works;
however overly verbose output can mask error reports. The
default verbose level will report all failures.
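The level ranges above can be captured in a small validation helper. This is a sketch only: it checks the ranges and defaults stated in the text and formats the two options in the -llevel and -vlevel style shown. It does not run hpost, and the function name is hypothetical.

```python
def hpost_level_args(test_level=16, verbose_level=20):
    """Validate hpost test and verbose levels and format them as options."""
    # Ranges and defaults are the ones given in the text above.
    if not 7 <= test_level <= 127:
        raise ValueError("test level must be from 7 to 127")
    if not 0 <= verbose_level <= 255:
        raise ValueError("verbose level must be from 0 to 255")
    return [f"-l{test_level}", f"-v{verbose_level}"]


print(hpost_level_args())                    # defaults
print(hpost_level_args(verbose_level=255))   # default test level, full verbosity
```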
SunVTS
SunVTS, the on-line validation test suite, is a system exerciser that
tests and validates hardware functionality by running multiple
diagnostic hardware tests on most configured controllers and devices.
SunVTS can also be used to stress test hardware, either in or out of
the Solaris operating environment. By running multiple and
multithreaded diagnostic hardware tests, SunVTS verifies the system
configuration and the functionality of most hardware controllers and
devices.
prtdiag
prtdiag is a Solaris command that is run in the domain, and provides
a detailed view of the domain’s hardware configuration. To see the
entire platform’s hardware configuration, either all of the system
boards must be in one domain, or separate prtdiag output must be
combined from all of the domains.
This will display information similar to the following:
# /usr/platform/sun4u1/sbin/prtdiag -v
System Configuration: Sun Microsystems sun4u SUNW,Ultra-Enterprise-10000
System clock frequency: 83 MHz
Memory size: 1024 Megabytes
========================= IO Cards =========================
     Bus   Freq
Brd  Type  MHz   Slot  Name                              Model
---  ----  ----  ----  --------------------------------  ----------------------
0    SBus  25    0     qec/qe (network)                  SUNW,595-3198
0    SBus  25    0     SUNW,soc/SUNW,pln                 501-2069
0    SBus  25    1     QLGC,isp/sd (block)               QLGC,ISP1000
1    SBus  25    0     qec/qe (network)                  SUNW,595-3198
1    SBus  25    0     SUNW,soc                          501-2069
1    SBus  25    1     QLGC,isp/sd (block)               QLGC,ISP1000
Solaris will display the system board number, memory bank number,
and the location of the DIMM on the board. The error message from
the OS would look like:
Softerror: Intermittent ECC Memory Error SIMM
Board# 3 Bank# 0 P# P13 MM 0_3
ECC Data Bit 63 was corrected
where P13 and MM 0_3 are the DIMM silk screen labelled locations on
the system board for the memory getting the correctable memory
errors.
Enabling Reporting
To enable OS reporting of correctable memory errors, place the
following commands in the domain’s /etc/system file:
set report_ce_log=1
set report_ce_console=1
System Failures
There are several conditions that can occur on the Enterprise 10000
that require specific handling or provide system state information that
should be saved. These conditions are:
● Reboot request
● Panic
● Watchdog/Redmode/XIR
● Heartbeat failure
● arbstop
Reboot Request
The reboot request can come from the reboot, shutdown, or init
commands, for example.
Panic
A panic can also be forced from the SSP by using the hostint or
sigbcmd commands.
● edd detects the panic (notice was sent by the control board)
The panic dumps can be analyzed with crash and kadb. You might
also want to use the SunSolve™ script ISCDA.
Considerations
● savecore must be enabled in /etc/init.d/sysetup to save the
crash dump. By default, panic dumps are not saved. Make sure
that you enable savecore. You don’t want to explain why you
don’t know what failed.
● The primary swap partition must be large enough to hold the raw
dump, which could be 500 Mbytes to 800 Mbytes in size. A partial
dump is useless, and only the primary swap partition will be used.
● edd logs the event in the SSP and locates the proper rule.
resetinfo files are ASCII and do not require redx to view them. They
contain the domain processors’ registers at the time of the failure.
Hostview will notify you that one of these has occurred in the Failure
window, and the resetinfo dump may be viewed from that window.
● edd detects and logs the hang (a notice is sent from the control
board)
Arbstop
● edd logs the arbstop event and locates the proper rule.
arbstop dumps are binary files that require redx to interpret them.
You can think of them as hardware equivalents of Solaris panic
dumps.
When the system fails, you usually just want to reboot and get on with
your work, but there may be useful information about bad hardware
that needs to be saved. The -D option of hpost will create a binary
dump of all the internal hardware state information that might be of
interest. This file can later be read by redx.
If the arbstop occurs while the domain is running, the dump file will
be:
$SSPLOGGER/domain_name/Edd_Arbstop_Dump-mm.dd.hh.mm:ss
If the arbstop occurs while hpost is running, the dump file will be:
$SSPLOGGER/domain_name/xfstate.mmdd.hhmm.ss
Note that these are both in the same directory that the SSP’s copy of
the domain /var/adm/messages file resides in.
In all cases, the fully qualified name of the file created and the mask of
boards in the created file are printed by hpost. Each system board or
half-centerplane included in the dump requires 4–5 Kbytes, making a
dump from a fully configured system approximately 90 Kbytes.
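The size estimate above can be checked with simple arithmetic. The sketch assumes that a fully configured system has 16 system boards and 2 half-centerplanes; these counts are an assumption used for illustration, and the function name is hypothetical.

```python
def dump_size_kbytes(system_boards, half_centerplanes, kbytes_per_unit):
    """Estimate the size of an hpost -D dump in Kbytes."""
    # Each system board or half-centerplane contributes one unit to the dump.
    return (system_boards + half_centerplanes) * kbytes_per_unit


# Assumed full configuration: 16 system boards and 2 half-centerplanes.
print(dump_size_kbytes(16, 2, 4))  # lower bound at 4 Kbytes per unit
print(dump_size_kbytes(16, 2, 5))  # upper bound at 5 Kbytes per unit
```

At 5 Kbytes per unit, the assumed full configuration comes to 90 Kbytes, which matches the figure quoted above.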
If hpost -D is run from the command line, it will ask for a 60 character
comment. Add something helpful, or just press Enter. You do not have
to add the date, time, domain name, platform name, SSP name, or the
mask of boards in the dump; these are all placed in the dump file
automatically.
redx
Generally, you should never have a need to use redx. It is intended for
use by trained Sun support personnel only.
● On-line – redx can read and write directly to the Enterprise 10000
hardware.
Starting redx
Start redx from the SSP. Make sure that the SUNW_HOSTNAME variable is
set for the proper domain.
ssp:domain% redx -l
Output window is open to child PID 13683 through fifo /tmp/redx_pipe13681
Environment DISPLAY = 129.153.40.26:0
redxl>
● prtdiag output
● sysdef output
● /var/adm/messages
● /var/opt/SUNWssp/adm/domain_name/messages
● /var/opt/SUNWssp/adm/messages
● /var/opt/SUNWssp/adm/domain_name/post/POST
● For panics, review and have accessible the core files from the
Enterprise 10000 host domain in /var/crash/domain_name.
Lab
1. Create a .postrc file for your domain, if you do not already have
one, and add:
display_fom_calc on
2. Run prtdiag from Solaris in your domain and look at the output.
5. Run bringup with hpost at the default test level (16) and
verbosity level 255. Observe the output.
6. Run bringup with hpost at test level 32 and the default verbose
level (20).
Think Beyond
Why must you exercise caution when using the hardware support
commands?
For more detail, please refer to the Network Time Protocol User’s Guide
and the appropriate man pages.
A-1
A
The Ultra Enterprise 10000 domains and the SSP keep time
independently but are kept in sync by the Network Time Protocol
(NTP) daemon. During the domain boot process, through NTP the
domain’s kernel asks the SSP for the time and then sets its time to
match.
If the date on the SSP is changed, the daemon changes the time in the
domain.
If you use the date command to change the time in a domain and it
differs from that of the SSP, the daemon immediately begins to
gradually adjust the domain’s time toward that of the SSP. This can
prove confusing to users and programs.
The only time you should use the date command to set the time in a
domain is if a problem prevents the domain from getting the proper
time from the SSP. Note that the domain’s clock device has no battery
backup. If this error occurs and is detected, the following message
appears during the domain boot process:
“WARNING: TOD clock not initialized -- CHECK AND RESET THE DATE!”
If you see this message, as the domain superuser use the date
command to set the time as closely as possible to that shown on the
SSP, and the domain time should quickly sync up with it.
NTP Files
For Solaris 2.5.1, the NTP executables are installed by SUNWxntp in the
/opt/SUNWxntp/bin directory along with the key file (ntp.keys),
while the configuration file (ntp.conf) and drift file (ntp.drift) are
installed in the /etc/opt/SUNWxntp directory.
In both cases, the NTP daemon is started during run level 2 processing
at boot time. The daemon’s name is xntpd for 2.5.1, and ntpd for 2.6.
Normally, when all servers are in agreement, NTP chooses the best,
where “best” is defined in terms of lowest stratum (closest to
stratum-1), closest in terms of network delay and claimed precision.
While a goal should be to provide each client with three or more
sources of lower stratum time, several of these will only be providing
backup service and may be of lesser quality in terms of network delay
and stratum. That is still acceptable; a same-stratum peer that receives
time from other lower-stratum sources not accessed directly by the
local server can provide good backup service.
Synchronization Sources
Other ways to explore the nearby subnet include use of the ntptrace
and ntpq programs, provided with the NTP packages. See their man
pages for more detail.
Unless restricted using facilities described later, this host can provide
synchronization to dependent clients, which do not have to be listed in
the configuration file. Associations maintained for these clients are
transitory and result in no persistent state kept by the host. These
clients are normally not visible when using the ntpq program included
in the distribution; however, the daemon includes a monitoring
feature that caches a minimal amount of client information that is
useful for debugging and administrative purposes.
One of the things the NTP daemon does when it is first started is to
compute the error in the intrinsic frequency of the clock on the
computer that it is running on. It usually takes about a day or so after
the daemon is started to compute a good estimate of this (and it needs
a good estimate to synchronize closely to its server). Once the initial
value is computed, it will change only by relatively small amounts
during the course of continued operation. The “driftfile” declaration
indicates to the daemon the name of a file where it may store the
current value of the frequency error so that, if the daemon is stopped
and restarted, it can reinitialize itself to the previous estimate and
avoid the day's worth of time it will take to recompute the frequency
estimate. Since this is a desirable feature, a “driftfile” declaration
should always be included in the configuration file.
If the daemon stops for some reason, the local platform time will
diverge from UTC (Coordinated Universal Time) by an amount that
depends on the intrinsic error of the clock oscillator and the time since
it last synchronized. In view of the length of time necessary to refine
the frequency estimate, every effort should be made to operate the
daemon on a continuous basis and limit the time when it is not
running.
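To get a feel for how quickly an unsynchronized clock diverges, the relationship above can be written as a one-line calculation. This is a simplified sketch that treats the frequency error as constant and expresses it in parts per million (ppm). The function name and the 50 ppm figure are hypothetical examples, not measured values.

```python
def drift_seconds(frequency_error_ppm, seconds_unsynchronized):
    """Approximate clock divergence for a constant frequency error."""
    # One ppm of frequency error accumulates one microsecond per second.
    return frequency_error_ppm * 1e-6 * seconds_unsynchronized


one_day = 24 * 60 * 60
# A hypothetical 50 ppm oscillator error over one day without the daemon.
print(round(drift_seconds(50, one_day), 2))
```

With these example numbers, the clock would be off by roughly four seconds after one day.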
Configuration Guidelines
Three utility query programs are included with the NTP facility: ntpq,
ntptrace, and xntpdc. For more information on these, see their man
pages in /opt/SUNWxntp/man (2.5.1) or /usr/man (2.6).
After starting the NTP daemon, run the ntpq program with the -n
switch, which will avoid possible distractions due to name resolutions.
Use the peer command to display a table showing the status of the
configured peers and possibly other clients using the daemon.
● remote – Lists hosts that should agree with the entries in the
configuration file, plus any peers not mentioned in the file at the
same or lower level than your stratum that happen to be
configured to peer with you.
● when – The time since the peer was last heard, in seconds.
● The remaining entries show the latest delay, offset and dispersion
computed for that peer, in milliseconds.
B-1
B
The Enterprise 10000 system does not provide any default OBP device
aliases like the other Sun SPARC systems do. Because of the E10000’s
domain and DR capability, it is very difficult to determine which
interfaces or devices should have aliases, so none are created.
The E10000 does have several device aliases defined in the OBP, but
they are intended as examples only and should not be used for real
devices.
Device aliases can be created with the devalias and the nvalias
commands. Aliases created with devalias only last until the system is
reset. Aliases created with nvalias last until they are deleted with
nvunalias. It is usually best to use nvalias, since these device aliases
will remain available.
Note that, when you delete a device alias, it will not disappear from
the devalias listing until the OBP reset (or boot) command has
been run. As long as the alias has been deleted, it is safe to recreate it,
even if it is still visible using devalias.
By using the SBus and PCI card mapping table from Module 7, you
can determine the physical location of the SCSI host adapter card that
you wish to use.
Once you know this, you can create a disk device alias using the
show-disks command.
1. Delete any old alias with the same name using nvunalias.
<#15> ok nvunalias bootdrive
6. The system will replace the ^Y with the device path chosen from
show-disks, leaving the full path on the command line with the
cursor at the end of the line. Do not press Enter yet.
<#15> ok nvalias bootdrive /sbus@4c,0/QLGC,isp@1,10000/sd
7. You will need to manually add the SCSI device txdx numbers to
the alias. This takes the format of @target_number,lun_number:slice,
where slice is a letter from a to h, corresponding to disk slices 0
through 7, respectively. For t3d0, slice 0, you would add @3,0:a.
<#15> ok nvalias bootdrive /sbus@4c,0/QLGC,isp@1,10000/sd@3,0:a
9. You can check the new alias with devalias if you wish.
<#15> ok devalias
bootdrive /sbus@4c,0/QLGC,isp@1,10000/sd@3,0:a
...
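The suffix described in step 7 can be derived mechanically from the txdx name and the slice number. The following is a minimal sketch; the function name is hypothetical, and the target and LUN numbers are copied exactly as written, following the examples in this appendix.

```python
import re


def unit_suffix(txdx: str, slice_number: int) -> str:
    """Build the @target,lun:slice suffix for a disk device alias.

    txdx is a name such as 't3d0'; slice_number is 0 through 7 and is
    mapped to the letters a through h.
    """
    match = re.fullmatch(r"t(\d+)d(\d+)", txdx)
    if match is None:
        raise ValueError(f"not a txdx name: {txdx!r}")
    if not 0 <= slice_number <= 7:
        raise ValueError("slice number must be between 0 and 7")
    target, lun = match.groups()
    slice_letter = "abcdefgh"[slice_number]
    return f"@{target},{lun}:{slice_letter}"


print(unit_suffix("t3d0", 0))  # @3,0:a
```

The result is appended to the path inserted by ^Y before you press Enter.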
By using the SBus and PCI card mapping table from Module 7, you
can determine the physical location of the SOC or SOC+ interface card
that you wish to use.
Once you know this, you can create a device alias using the
show-disks command.
For example, suppose you want to use the SSA with WWN 8a1085,
connected to port b of its SOC card. To specify drive t4d2 on this SSA,
attached to the SOC card in board 3, SBus 1, slot 0:
1. Delete any old alias with the same name using nvunalias.
<#15> ok nvunalias bootdisk
4. Item b is the proper SBus card and has the correct WWN, so
choose b.
Enter Selection, q to quit: b
/sbus@4d,0/SUNW,soc@0,0/SUNW,pln@b0000000,8a1085/SUNW,ssd has been
selected.
Type ^Y ( Control-Y ) to insert it in the command line.
e.g. ok nvalias mydev ^Y
for creating devalias mydev for
/sbus@4d,0/SUNW,soc@0,0/SUNW,pln@b0000000,8a1085/SUNW,ssd
6. The system will replace the ^Y with the device path chosen from
show-disks, leaving the full path on the command line with the
cursor at the end of the line. Do not press Enter yet.
<#15> ok nvalias bootdisk
/sbus@4d,0/SUNW,soc@0,0/SUNW,pln@b0000000,8a1085/SUNW,ssd
7. You will need to manually add the SCSI device txdx numbers to
the alias. This takes the format of @target_number,lun_number:slice,
where slice is a letter from a to h, corresponding to disk slices 0
through 7, respectively. For t4d2, slice 0, you would add @4,2:a.
<#15> ok nvalias bootdisk
/sbus@4d,0/SUNW,soc@0,0/SUNW,pln@b0000000,8a1085/SUNW,ssd@4,2:a
9. You can check the new alias with devalias if you wish.
<#15> ok devalias
bootdisk
/sbus@4d,0/SUNW,soc@0,0/SUNW,pln@b0000000,8a1085/SUNW,ssd@4,2:a
...
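When checking a long alias such as this one, it can help to break the path into its nodes and confirm that the last node carries the target, LUN, and slice you intended. The sketch below assumes the usual OpenBoot path form of name@unit-address:arguments for each node; the function name is hypothetical.

```python
def split_device_path(path: str) -> list:
    """Split an OBP device path into a list of node dictionaries.

    Each node has the form name@unit-address:arguments, where the
    unit address and the arguments are optional.
    """
    nodes = []
    for component in path.strip("/").split("/"):
        name_and_unit, _, arguments = component.partition(":")
        name, _, unit_address = name_and_unit.partition("@")
        nodes.append({
            "name": name,
            "unit_address": unit_address,
            "arguments": arguments,
        })
    return nodes


alias_path = "/sbus@4d,0/SUNW,soc@0,0/SUNW,pln@b0000000,8a1085/SUNW,ssd@4,2:a"
for node in split_device_path(alias_path):
    print(node)
```

For the path above, the last node has the unit address 4,2 and the argument a, which corresponds to t4d2, slice 0.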
By using the SBus and PCI card mapping table from Module 7, you
can determine the physical location of the network interface card that
you want to use.
Once you know this, you can create a network device alias using the
show-nets command. The only other thing you may need to know, for
a quad Ethernet card, is which interface you want to use. The
show-nets command will show all of these interfaces, in reverse order
(3 through 0).
1. Delete any old alias with the same name using nvunalias.
<#15> ok nvunalias bootnet
6. The system will replace the ^Y with the device path chosen from
show-nets, leaving the full path on the command line. Press Enter
to create the alias.
<#15> ok nvalias bootnet /sbus@4c,0/qec@0,20000/qe@0,0
7. You can check the new alias with devalias if you want.
<#15> ok devalias
bootnet /sbus@4c,0/qec@0,20000/qe@0,0
...
.postrc
A text file that controls options in hpost(1M). Some of the
functions can also be controlled from the command line.
Arguments on the command line take precedence over lines in
the .postrc file, which takes precedence over built-in defaults.
hpost -?postrc gives a terse reminder of the .postrc
options and syntax. See postrc(4).
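The precedence order can be pictured as a layered lookup. The sketch below only illustrates that ordering; the option names and values are hypothetical placeholders and are not real hpost options.

```python
from collections import ChainMap


def effective_options(command_line: dict, postrc: dict, defaults: dict) -> dict:
    """Merge three option sources; earlier sources win over later ones."""
    return dict(ChainMap(command_line, postrc, defaults))


# Hypothetical option names, used only to show which source wins.
defaults = {"example_level": 16, "example_logfile": "none"}
postrc = {"example_level": 24, "example_logfile": "post.log"}
command_line = {"example_level": 32}

print(effective_options(command_line, postrc, defaults))
```

Here the command-line value of example_level overrides the .postrc value, while example_logfile falls through to the .postrc file because the command line does not set it.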
ARBSTOP
A condition that occurs when one of the Ultra Enterprise 10000
ASICs detects a parity error or equivalent fatal system error.
Bus arbitration is frozen, so all bus activity stops. The system is
dead until the SSP detects the condition by polling the status
registers of the Address Arbiter ASICs via JTAG, and clears the
error condition.
ASIC
Application-specific integrated circuit. Used in the Enterprise
10000 system context to mean any of the large main chips in the
design, including the UltraSPARC processor and data buffer
chips.
BBSRAM
See bootbus SRAM.
blacklist
A text file that hpost(1M) reads when it starts up that tells it
about Enterprise 10000 system components that are not to be
used or configured into the system. The default path name for
this file can be overridden in the .postrc file (see postrc(4))
and on the command line.
Board descriptor array
The description of the single configuration that hpost chooses.
It is part of the structure handed off to OBP.
Bootbus
A slow-speed byte-wide bus controlled by the processor port
controller ASICs, used for running diagnostics and boot code.
UltraSPARC starts executing code from bootbus when it exits
reset. In the Enterprise 10000 system, the only component on the
bootbus is the BBSRAM.
bootbus SRAM
A 256-Kbyte static RAM attached to each processor PC ASIC.
Through the PC, it can be accessed for read/write from JTAG or
the processor. It is downloaded at various times with
hpost(1M) and OBP start-up code, and provides shared data
between the downloaded code and the SSP.
Caching UPA master
A UPA module with master capability that also has a coherent
cache. The caching UPA master module participates in the
cache coherence protocol.
centerplane
A double-sided backplane where eight system boards, one
centerplane support board, and one control board plug
perpendicularly into each side.
centerplane support board
Board that plugs into the centerplane and supplies clocks,
JTAG, and control functions for one-half of the centerplane.
Normally, two centerplane support boards are used, each
plugging into an opposite side of the centerplane.
CIC
Coherency Interface Controller. Handles coherency transactions
for the three port controllers on a board. Connects to one of four
global address buses. Snoops for one quarter of the address
space.
control board
Board that plugs into the centerplane and provides the system’s
JTAG, clock, fan, power, serial interface, and Ethernet interface
functions.
ECC
Error correction code
Enterprise 10000
The successor to the CS6400 system. Has up to 64 UltraSPARC
processors and 64 GBytes of main storage, interconnected by a
UPA crossbar via the Gigaplane XB system bus.
Externally initiated reset (XIR)
Refer to Xir.
Fatal error
A class of unrecoverable errors which necessitate that the
machine be rebooted; may be hardware or software initiated.
This type of unrecoverable error will result in an arbiter stop
condition which requires SSP interaction.
FCS
First customer ship. The date a product will be shipped to
customers.
GAARB
Global address arbiter. Arbitrates for a global address bus.
Implemented by an Enterprise 10000 arbiter chip.
GAB
Global address buses. Four 16:16, 48-bit-wide multiplexers that
connect together a coherent interface controller from each
system board. The multiplexers broadcast one of the inputs to
all the outputs. Implemented by 16 XMUX ASICs. Functions
like a snoopy bus for coherency purposes, but is really a
point-to-point address router.
GDARB
Global data arbiter. Arbitrates for the global data router’s 16x16
crossbar.
GDR
Global data router. Sixteen 16:1, 144-bit-wide multiplexers that
connect together the local data routers on each system board.
Implemented by 12 XMUX ASICs.
Gbits/sec
Gigabits per second.
Gbytes/sec
Gigabytes per second.
JTAG+
An extension of JTAG, developed by Sun, which adds a control
line to signal that board and ring addresses are being shifted on
the serial data line. Often referred to simply as JTAG.
Kilo
1,024
LAARB
Local address arbiter. Arbitrates for the local address router.
LAR
Local address router. Four bidirectional 3:1 multiplexers that
connect the three local address buses to four coherent interface
controllers. Implemented inside the four coherent interface
controllers on each board.
LDARB
Local data arbiter. Arbitrates for the local data router.
LDMUX
Local data mux. One mode of the XMUX.
LDR
Local data router. Two unidirectional 144-bit-wide 4:1
multiplexers that connect the four UPA databuses on a system
board with the global data router. Implemented by four XMUX
ASICs per system board.
Mbits/sec
Megabits per second.
MBus
A 64-bit wide, circuit switched bus, used by sun4m architecture
desktop systems from Sun Microsystems™.
Mbytes/sec
Megabytes per second.
MC
Memory Controller chip. Accepts memory addresses from the
four coherent interface controllers and data from the Starfire
data buffer (XDB), and performs reading and writing of 64-byte
blocks of data into one to four banks of memory.
Mega
1,024 x 1,024 = 1,048,576.
RO
Read only.
RPC
Remote procedure call.
SBus
A Sun designed I/O bus, now an open standard.
SC2000
SPARCcenter™ 2000. Has up to 20 SuperSPARC processors
interconnected by two XDbuses.
SIMM
Single in-line memory module. Single refers to the fact that the
corresponding pins on each side of the edge connector are tied
together, so that there is only a single row of pins.
SIR
Software initiated reset. A software initiated reset is initiated by
an SIR instruction within any processor. This per-processor reset
has a trap type 4 at physical address offset 0x80
(PA = 0x1ff f000 0080).
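The 0x80 offset follows from the trap type if each trap table entry occupies 32 bytes (eight 4-byte instructions), which is the standard SPARC V9 layout and is assumed here. A minimal sketch of the arithmetic, with a hypothetical function name:

```python
TRAP_TABLE_ENTRY_BYTES = 32  # assumed: eight 4-byte instructions per entry
SIR_TRAP_TYPE = 4            # trap type given in the SIR entry


def reset_vector_offset(trap_type: int) -> int:
    """Return the byte offset of a trap type within the trap table."""
    return trap_type * TRAP_TABLE_ENTRY_BYTES


print(hex(reset_vector_offset(SIR_TRAP_TYPE)))  # 0x80
```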
SMP
Symmetric multiprocessor. Mainstream parallel systems.
Memory space is shared, and equally accessible to all the
processors. Caches are kept coherent by hardware mechanisms.
SOC
Serial optical channel. Connects two Fibre Channels to an SBus.
SOC+
Second generation fibre channel incorporating FC-AL, Fibre
Channel Arbitrated Loop.
SRAM
Static RAM. These are memory chips that retain their contents
as long as power is maintained.
SS1000
SPARCserver™ 1000. Has up to eight SuperSPARC processors
interconnected by one XDBus.
SSP
System service processor. A networked SPARCstation™ from
which the system is booted and diagnosed.
UltraSPARC-II processor module
A small circuit board containing an UltraSPARC-II processor,
two UltraSPARC-II data buffer chips, one external-cache tag
SRAM, and four external-cache data SRAMs.
Uncorrectable Error
Same as unrecoverable error.
Unrecoverable error
An error which cannot be corrected through hardware or
software action. It is detected by the hardware and indicates that
data has been lost. This type of error is fatal and will result in an
arbiter stop condition.
UPA
UltraSPARC port architecture. Defines the processor and DMA
interface to shared memory through a cache-coherent
interconnect for a family of uni- and multiprocessors designed
around the V.9 UltraSPARC processor.
UPA_Addressbus
The UPA Addressbus can be a bus, or a point-to-point
interconnection between the SCs and UPAs. For descriptive
purposes, the address path is sometimes also referred to as the
UPA_Addressbus.
UPA_Databus
The UPA interconnect data path can be a bus, a switch, or a
combination of the two. For descriptive purposes, the data path
is sometimes also referred to as the UPA_Databus.
UPA master port
A UPA port which can initiate data transfer actions on the
interconnect.
UPA slave port
A UPA port which can only be the recipient of a transaction. A
slave port does not generate transactions. A slave port has an
address space associated with it for programmed I/O, and
implements the port-ID registers. A slave port also handles
copyback requests for cache blocks in UPA ports which support
a coherent cache, and handles interrupt transactions in a UPA
port which contains a processor.