
PDMS – 2 Hour Tutorial 1

•  Multicore computing revolution
   –  The need for change…

•  Proposed Open Unified Technical Framework (OpenUTF) architecture standards
   –  OpenMSA, OSAMS, OpenCAF as future standards

•  Introduction to parallel computing
   –  Programming models
   –  High Speed Communications (HSC) through shared memory

•  Synchronization and Parallel Discrete Event Simulation (PDES)
   –  Event Management
   –  Time Management

•  Open discussion



Future of computing is…

MULTICORE
“I skate to where the puck is going to be, not where it has been!” – Wayne Gretzky



•  Performance wall
   –  Clock speed and power consumption
   –  Memory access bottlenecks
   –  Limits of single-core instruction-level parallelism

•  Multiple processors (cores) on a single chip is the future
   –  No foreseeable limit to the number of cores per chip
   –  Requires software to be written differently

•  Supercomputing community consensus: low-level parallel programming is too hard
   –  Threads, shared memory, locks/semaphores, race conditions, repeatability, etc., are too hard and expensive to develop and debug (fine-grained HPC is not for the average programmer)
   –  Message passing is much easier but can be less efficient
   –  High-level approaches, tools, and frameworks are needed (OpenUTF, new compilers, languages, math libraries, memory management, etc.)

[Figure: hardware hierarchy — computer/blade/cluster, boards, chips, and nodes — feeding net-centric environments (cloud computing, GIG, systems of systems)]

The world of computing is rapidly changing and will soon demand new parallel and distributed service-oriented programming methodologies and technical frameworks.

Experts say that parallel and distributed programming is too hard for normal development teams. The Open Unified Technical Framework abstracts low-level programming details.


•  Microsoft
   –  Sponsor of the by-invitation-only 2007 Manycore Computing Workshop that brought together the who’s who of supercomputing
   –  Unanimous consensus on the need for multicore computing software tools and frameworks for developers (e.g., OpenUTF)

•  Apple
   –  Snow Leopard will have no new features (focus on multicore computing)
   –  The next version of Apple's OS X operating system will include breakthroughs in programming parallel processors, Apple CEO Steve Jobs told The New York Times in an interview after this week's Worldwide Developers Conference. "The way the processor industry is going is to add more and more cores, but nobody knows how to program those things," Jobs said. "I mean, two, yeah; four, not really; eight, forget it."

http://bits.blogs.nytimes.com/2008/06/10/apple-in-parallel-turning-the-pc-world-upside-down/



Next generation chips

Intel has disclosed details on a chip that will compete directly with Nvidia and ATI and may take it into uncharted technological and market-segment waters. Larrabee will be a stand-alone chip, meaning it will be very different from the low-end–but widely used–integrated graphics that Intel now offers as part of the silicon that accompanies its processors. And Larrabee will be based on the universal Intel x86 architecture.

…The number of cores in each Larrabee chip may vary, according to market segment. Intel showed a slide with core counts ranging from 8 to 48, claiming performance scales almost linearly as more cores are added: that is, 16 cores will offer twice the performance of eight cores.

http://i4you.wordpress.com/2008/08/05/intel-details-future-larrabee-
graphics-chip



Next generation chips

Intel touts 8-core Xeon monster Nehalem-EX

Intel gave a demo yesterday of its eight-core, 2.3 billion-transistor Nehalem-EX, which is set to launch later this year… Nehalem-EX has up to 8 cores, which gives a total of 16 threads per socket.

By Jon Stokes | Last updated May 28, 2009 8:25 AM CT


http://arstechnica.com/hardware/news/2009/05/intel-touts-8-core-xeon-monster.ars



[Figure: service components and model components plugging into the OpenUTF Kernel]

Open Unified Technical Framework (OpenUTF)…

COMPOSABLE SYSTEMS
•  Simulation is not as cost effective as it should be – we need to do things differently… Revolutionary, not evolutionary change!

•  Multicore computing revolution demands change in software development methodology – need standardized framework

•  New architecture standards – we should be building models, not simulations

•  Model and Service components developed in a common framework – automates integration for Test and Evaluation

•  Verification and Validation – need a common test framework with standard processes

•  Open source – overcomes the technology/cost barrier and supports widespread community involvement

[Figure: communication latency scale — 10 ms, 1 ms, 100 µs, 10 µs, 1 µs, 100 ns, 10 ns, 1 ns]


Requires assessment of the current state
Existing tools, technologies, methodologies, data models, existing interfaces, policies, requirements, business models, contract language, lessons learned, impediments to progress, etc.

Requires the right vision for the future
Lowered costs, better quality, faster end-to-end execution, easier to use and maintain, feasible technology, optimal use of workforce skill sets, multiuse concepts, composability, modern computational architectures, multiplatform, net-centric, etc.

Requires an executable transition strategy
Incremental evolution, risk reduction, phased capability, accurately assessed transition costs, available funding, prioritization, community buy-in and participation, formation of new standards


1.  Engine and Model Separation
2.  Optimized Communications
3.  Abstract Time
4.  Scheduling Constructs
5.  Time Management
6.  Encapsulated Components
7.  Hierarchical Composition
8.  Distributable Composites
9.  Abstract Interfaces
10.  Interaction Constructs
11.  Publish/Subscribe
12.  Data Translation Services
13.  Multiple Applications
14.  Platform Independence
15.  Scalability
16.  LVC Interoperability Standards
17.  Web Services
18.  Cognitive Behavior
19.  Stochastic Modeling
20.  Geospatial Representations
21.  Software Utilities
22.  External Modeling Framework
23.  Output Data Formats
24.  Test Framework
25.  Community-wide Participation


•  OpenMSA – Layered Technology
   –  Focuses on parallel and distributed computing technologies
   –  Modularizes technologies through a layered architecture
   –  Contains OSAMS and OpenCAF
   –  Proven technologies based on experience with large programs
   –  Cost-effective strategy for developing scalable computing technology
   –  Provides interoperability without sacrificing performance
   –  Facilitates sequential, parallel, and distributed computing paradigms

•  OSAMS – Model/Service Composability
   –  Focuses on interfaces and software development methodology to support highly interoperable plug-and-play model/service components
   –  Provided by OpenMSA but could be supported by other architectures

•  OpenCAF – Cognitive Intelligent Behavior
   –  Thoughts and stimulus, goal-oriented behaviors, decision branch exploration, five-dimensional excursions
   –  Provided as an extension to OSAMS


Focus areas:
•  OpenUTF: architecture, standards, net-centricity, data models, synchronization, VV&A
•  OpenMSA: open source, technology, HPC/multicore, performance, programming constructs
•  OSAMS: modularity, composability, interoperability, flexibility
•  OpenCAF: behaviors, cognitive thought processes, 5D simulation, goal-oriented optimization

[Figure: OpenMSA layers (HPC, network services, scheduling, modeling framework, composites, pub/sub services, LVC interoperability, web-based SOA) paired with OpenCAF elements (behavior representation, cognitive rule triggering, Bayesian branching models, goals and state machines, decision support)]


[Figure: OpenMSA layered architecture — operating system services and threads at the bottom; general software utilities, ORB network services, internal high speed communications and external distributed communications; rollback framework and rollback utilities, persistence, and standard template library (OSAMS); event management services and time management; standard modeling framework (OSAMS, OpenCAF); distributed simulation management services (OSAMS pub/sub data distribution); SOM/FOM data translation services; model & service component and entity composite repositories; External Modeling Framework (EMF), CASE tools, and the HPC-RTI & bridge at the top — connecting to abstract federates, HLA federates, LVC federations and enterprises, gateways (HLA, DIS, TENA, web-based SOA), a distributed blackboard, and external visualization/analysis systems]


[Figure: OpenCAF cognitive architecture — federation objects and/or interactions are received and processed (behaviors, tasks, notifications, abstract methods, uncertainty) into stimulus/perception (short-term memory); the reasoning engine triggers thoughts 1…N; task management (5D branching) turns prioritized goals into state, actions, and tasks]


Based on OpenUTF Kernel Sensitivity List
•  Sensitive variables (stimulus) are registered with sensitive methods (thoughts)
•  Thoughts are automatically triggered whenever registered stimulus is modified
•  Thoughts can modify other stimulus to trigger additional thoughts
•  Terminates when the solution converges or when reaching max thoughts

Rule Based Reasoning – left brain reasoning
•  Inputs are ints, doubles, or Booleans
•  Inputs are prioritized when they are associated with RBRs
•  Inputs can be fed into multiple reasoning nodes
•  Outputs can be inputs to other reasoning nodes
•  Feedback loops are permitted

[Figure: reasoning network with inputs W, X, Y, Z feeding outputs A, B, C]


Training Based Reasoning – learned reasoning
•  Inputs are ints, doubles, or Booleans
•  TBR is trained and then utilized to produce outputs (can be continually trained during execution)
•  Inputs can be fed into multiple reasoning nodes
•  Outputs can be inputs to other reasoning nodes
•  Feedback loops are permitted

[Figure: trained reasoning network with inputs W, X, Y, Z feeding outputs A, B, C]


Emotion Based Reasoning – right brain reasoning
•  Inputs are normalized, weighted, and summed, with weights normalized so that 1 = ωW + ωX + ωY + ωZ
•  The sum is multiplied by the product of thresholds to produce the output: A = [ωW·Ŵ + ωX·X̂ + ωY·Ŷ + ωZ·Ẑ] × TW2·TX1·TY1·TZ3
•  Output is normalized
•  Inputs can be fed into multiple reasoning nodes
•  Outputs can be inputs to other reasoning nodes
•  Feedback loops are permitted

[Figure: inputs W, X, Y, Z with weights ω and threshold bands T (between 0 and 1) producing output A]


•  Arbitrary graphs can be constructed from Rules, Neural Nets, and Emotions
•  Outputs of graphs can trigger changes to behaviors by reprioritizing goals
•  Behaviors are only triggered once reasoning is completed

[Figure: a combined graph of Emotion Based Reasoning, Training Based Reasoning, and Rule Based Reasoning nodes]


Monolithic:
•  Applications – collection of hardwired services
•  Simulations – collection of hardwired models

Composable plug and play:
•  OpenUTF Kernel, Service Components, Model Components, Abstract Interfaces, V&V Test Framework

Net Centric Enterprise Framework:
•  Composable systems – LVC, Web, GCCS, Data, Visualization


•  Reusable Software Components
•  Plug and Play Composability
•  Conceptual Model Interoperability
•  Pub/Sub Data Distribution & Abstract Interfaces
•  V&V Test Framework
•  Performance Benchmarks

•  Parallel and Distributed Operation
•  Scalable Run-time Performance
•  Platform/OS Independence
•  OpenMSA: Technology
•  OSAMS: Modeling Constructs
•  OpenCAF: Behavior Representation

•  Composable Systems
•  LVC (HLA, DIS, TENA)
•  Web Services (SOA)
•  Data Model
•  C4I/GCCS
•  Visualization and Analysis



Standalone operation: laptops, desktops, clusters, HPC; pub/sub data distribution
Net-centric operation: enterprise frameworks, command and control, standard data models
Legacy interoperability: distributed federation; training, analysis, test; FOM/SOM

[Figure: a composable system built from plug-and-play model/service components running on the OpenUTF Kernel]


Net-centric SOA/LVC on networks of single-processor and multicore computers — composites are distributed across processors to achieve parallel performance, and services communicate through pub/sub data exchanges and abstract interfaces.

•  Transparently hosts hierarchical services using the same interfaces as model components
•  SOAP interface connects services to external applications
•  Collections of related services are dynamically configured and distributed across processors on multicore systems
•  Services internally communicate through pub/sub services and decoupled abstract interfaces
•  Seamlessly supports LVC integration

[Figure: a dynamically configured composite with an LVC interface, showing subscribed data received, published data provided, and abstract services provided/invoked]


•  General concept…
–  Government maintained software configuration management
–  Automatic platform-independent installation & make system
–  Test framework (verification, validation, and benchmarks)
–  Will seamlessly support mainstream interoperability standards
–  Designed for secure community-wide software distribution

OpenUTF repository structure:
•  Global installation & make system
•  Component repository and OpenUTF Kernel (approximately 320,000 lines of code), each with its own installation & make system
•  Source, Include, Library, and Tests directories
•  Models (DAS, ETS, T&D, Weather, CCSI, ATP-45), Services (XML interfaces, Web services), and Interfaces (polymorphic methods, interactions, federation objects)
•  Verification, validation, and benchmark tests for each component


[Figure: the OpenUTF Kernel at the center of an ecosystem — V&V test framework, data & services interfaces, models, development tools, composability tools, web standards, LVC interoperability standards, visualization tools, and analysis tools]


[Figure: interconnect topologies — 2D mesh ((m+n) worst-case hops), 3D mesh ((l+m+n) worst-case hops), and a 16-node hypercube (log2(N) worst-case hops)]

Introduction to…

PARALLEL COMPUTING
[Figure: typical parallel program — after startup, each node (0 through N−1) initializes, repeats a process cycle of compute and communicate steps, and finally stores its results to file]


•  Parallel computing vs. distributed computing
   –  Parallel computing maps computations, data, and/or object instances within an application to multiple processors to obtain scalable speedup
      •  Normally occurs on a single multicore computer, but can operate across multiple machines
      •  The entire application crashes if one node or thread crashes
   –  Distributed computing interconnects loosely coupled applications within a network environment to support interoperable execution
      •  Normally occurs on multiple networked machines, but can operate on a single multicore computer
      •  Dynamic connectivity supports fault tolerance but loses scalability

•  Speedup(N) = T1 / TN

•  Efficiency(N) = Speedup / N

•  RelativeEfficiency(M,N) = (M / N) [Speedup(N) / Speedup(M)]



•  Time driven (or time stepping) is the simplest approach

   for (double time = 0.0; time < END_TIME; time += STEP) {
       UpdateSystem(time);
       Communicate();
   }

•  The discrete event approach (or event stepping) manages activities within the system more efficiently
   –  Events occur at a point in time and have no duration
   –  Events do not have to correspond to physical activities (pseudo-events)
   –  Events occur for individual object instances, not for the entire system
   –  Events, when processed, can modify state variables and/or schedule new events

•  Parallel discrete event simulation offers unique synchronization challenges…



•  Distributed net-centric computing
   –  Programs communicate through a network interface
      •  TCP/IP, HTTPS, SOA and Web Services, Client/Server, CORBA, Federations, Enterprises, Grid Computing, NCES, etc.

•  Parallel multicore computing
   –  Processors directly communicate through high speed mechanisms
      •  Threads, shared memory, message passing

[Figure: programming model quadrants — sequential program, multithreaded, shared memory, message passing]


[Figure: several shared-memory servers, each hosting a parallel application, interconnected through a cluster server]


•  Startup and Terminate
   –  Forks processes
   –  Cleans up shared memory
•  Miscellaneous services
   –  Node info, shared memory tuning parameters, etc.
•  Synchronization
   –  Hard and fuzzy barriers
•  Global reductions
   –  Min, Max, Sum, Product, etc.
   –  Performance statistics
   –  Can support user-defined operations
•  Synchronized data distribution
   –  Broadcast, Scatter, Gather, Form Matrix
•  Asynchronous Message Passing
   –  Unicast, destination-based multicast, broadcast
   –  Automatic or user-defined memory allocation
   –  Up to 256 message types
•  Coordinated Message Passing
   –  Patterned after the Crystal Router
   –  Synchronized operation guarantees all messages received by all nodes
   –  Unicast, destination-based multicast, broadcast
•  ORB Services
   –  Remote asynchronous method invocation with user-specified interfaces


Example of a global synchronization on five processing nodes

[Figure: nodes 0–4 combine partial results pairwise over stages 0–3 until node 0 holds the final result; all nodes wait until the synchronization is completed]


Shared memory layout: one shared memory block per node; slots (a circular buffer) manage incoming messages for each node, and a separate circular buffer manages outgoing messages.

Steps in sending a message:
1.  Write the header and message at the head of the sender's output message buffer.
2.  Write the index of the message header into the receiving node's shared memory slot for the sender's node.

Steps in receiving a message:
1.  Iterate over the slot managers to find messages.
2.  Read the message using the index in the slot.
3.  Mark the header as read.

Potential technical issues: cache coherency, instruction synchronization.
[Figure: circular buffer operation — the tail chases the head as messages are consumed, and the head chases (wraps around toward) the tail as the buffer fills]


[Figure: headers 1…n preceding their messages in the buffer]

Header format:
   int NumBytes
   int Index
   unsigned short Packet
   unsigned short NumPackets
   char DummyChar0
   char DummyChar1
   char DummyChar2
   char ReadFlag


Parallel Discrete Event Simulation (PDES)…

SYNCHRONIZATION
•  Standardized processing cycle interfaces to support any time management algorithm
   –  Uses virtual functions on the scheduler to specialize processing steps
   –  Supports reentrant applications (e.g., HPC-RTI, graphical interfaces, etc.)

•  Highly optimized internal algorithms for managing events
   –  Optimized and flexible event queue infrastructure
   –  Native support for sequential, conservative, and optimistic processing
   –  Internal usage of free lists to reduce memory allocation overheads
   –  Optimized memory management with high speed communications

•  Statistics gathering and debug support
   –  Rollback and rollforward application testing
   –  Automatic statistics gathering (live critical path analysis, message statistics, event processing and rollbacks, memory usage, etc.)
   –  Merged trace file generation for debugging parallel simulations that can be tailored to include rollback information, performance data, and user output
•  Time Management Modes are generically implemented through
class inheritance from the WpScheduler
–  OpenMSA provides a generic framework to support basic parallel and
distributed event processing operations, which makes it easy to
implement new time management algorithms
–  OpenMSA creates the object implementing the requested time
management algorithm at run time
–  The base class WpScheduler provides generic event management
services for sequential, conservative, and optimistic processing
–  WpWarpSpeed, WpSonicSpeed, WpLightSpeed, and
WpHyperWarpSpeed time management objects inherit from
WpScheduler to implement their specific event processing and
synchronization algorithms



main {
    Plug in User SimObjs
    Plug in User Components
    Plug in User Events
    Execute
}

Initialize {
    Launch processes
    Establish Communications
    Construct/Initialize SimObjs
    Schedule Initial Events
}

Execute {
    Initialize
    Process Up To (End Time)
    Terminate
}

Process Up To (Time) {
    while (GVT < Time) {
        Process GVT Cycle
    }
}

Process GVT Cycle {
    Process Events & User Functions
    Update GVT
    Commit Events
    Print GVT Statistics
}

Terminate {
    Terminate All SimObjs
    Print Final Statistics
    Shut Down Communications
}






Scheduler: a priority queue of Logical Processes (i.e., Simulation Objects) ordered by next event time

[Figure: for each logical process, processed events sit in a doubly linked rollback queue and future pending events in a priority queue, ordered along simulation time; incoming event messages feed the pending queue]


•  Priority queue uses a new self-correcting tree data structure that employs a heuristic to keep the tree roughly balanced
–  Tree data structure efficiently supports three critical operations
•  Element insertion in O(log2(n)) time
•  Element retraction in O(log2(n)) time
•  Element removal in O(1) time
–  Does not require storage of additional information in tree nodes to keep
the tree balanced
•  Tracks depth on insert and find operations to adjust tree
organization through specially combined multi-rotation operations
•  Goal is to minimize long left/left and/or right/right chains of elements
in the tree
–  Competes with STL Red-Black Tree
•  Beats STL when compiled unoptimized
•  Slightly worse than STL when compiled optimized



The rotation heuristic decreases depth to keep the tree roughly balanced:

   OptimalDepth = log2(NumElements)
   NumRotations = ActualDepth − OptimalDepth


•  Rollback Manager
–  Manages list of rollbackable items that were created as rollbackable
operations are performed
–  Each event provides a rollback manager
•  Global pointer is set before the event is processed
•  Rollbacks are performed in reverse order to undo operations

•  Rollback Items
–  Each rollbackable operation generates a Rollback Item that is managed
by the Rollback Manager
•  Rollback utilities include (1) native data types, (2) memory
operations, (3) container classes, (4) strings, and (5) various misc.
operations
–  Rollback Items inherit from the base class to provide four virtual
functions
•  Rollback, Rollforward, Commit, Uncommit



•  Distributed Synchronization

•  Conservative vs. Optimistic Algorithms

•  Rollbacks in the Time Warp Algorithm

•  The Event Horizon

•  Breathing Time Buckets

•  Breathing Time Warp

•  WarpSpeed

•  Four Flow Control Techniques



•  Conservative algorithms impose one or more constraints
   –  Object interactions limited to just “neighbors” (e.g., Chandy-Misra)
   –  Object interactions have non-zero time scales (e.g., lookahead)
   –  Object interactions follow FIFO constraint

•  Optimistic algorithms impose no constraints but require a more sophisticated engine
   –  Support for rollbacks (and advanced features for rollforward)
   –  Require flow control to provide stability
   –  Optimistic approaches can sometimes support real-time applications better...

•  The most important thing is for applications to develop their models to maximize parallelism
   –  Simulations will generally not execute in parallel faster than their critical path



[Figure: directed graph of logical processes A through G exchanging scheduled events]


[Figure: conservative logical process D — self-scheduled events and scheduled input events (with time stamps) from C and E arrive on FIFO input queues; D produces scheduled output events and times to F and B]


•  GVT is defined as the minimum time-tag of:
   –  Unprocessed events
   –  Unsent messages
   –  Messages or antimessages in transit

•  Theoretically, GVT changes as events are processed
   –  In practice, GVT is updated periodically by a GVT update algorithm

•  To correctly provide time management services to the outside world, GVT must be updated synchronously between internal nodes
[Figure: CPU time vs. simulation time for proximity detection on 32 nodes (259 ground sensors, 1099 aircraft), comparing Time Warp and Breathing Time Buckets]


[Figure: events and rollbacks vs. simulation time for the same proximity detection run (32 nodes, 259 ground sensors, 1099 aircraft) — curves for Time Warp processed events and rollbacks, and for Breathing Time Buckets rollbacks]




•  Opposite problems when comparing Breathing Time Buckets and Time Warp

•  Imagine mapping events into a global event queue

•  Events processed by runaway nodes have a good chance of being rolled back

•  Should hold back messages from runaway nodes


•  Example with four nodes
   –  Time Warp: Messages released as events are processed
   –  Breathing Time Buckets: Messages held back
   –  GVT: Flushes messages out of network while processing events
   –  Commit: Releases event horizon messages and commits events

[Figure: four-node timeline over wall time illustrating these phases]


•  Abstract representation of logical time uses 5 tie-breaking fields to guarantee unique time tags
   –  double Time – simulated physical time of the event
   –  int Priority1 – first user-settable priority field
   –  int Priority2 – second user-settable priority field
   –  int Counter – event counter of the scheduling SimObj
   –  int UniqueId – globally unique Id of the scheduling SimObj

•  Guaranteed logical times
   –  The OpenUTF automatically increments the SimObj event Counter to guarantee that each SimObj schedules its events with unique time tags
      •  Note: the Counter may “jump” to ensure that events have increasing time tags
      •  SimObj Counter = max(SimObj Counter, Event Counter) + 1
   –  The OpenUTF automatically stores the UniqueId of the SimObj in event time tags to guarantee that events scheduled by different SimObjs are unique
•  Four algorithms, selectable at run-time, are currently supported
in the OpenUTF reference implementation
–  LightSpeed for fast sequential processing
•  Optimistic processing overheads are removed
•  Parallel processing overheads are removed
–  SonicSpeed for ultra-fast sequential and conservative parallel event processing
•  Highly optimized event management (no bells and whistles)
–  WarpSpeed for optimistic parallel event processing with four new flow
control techniques to ensure stability
•  Cascading antimessages can be eliminated
•  Individual event lookahead evaluation for message-sending risk
•  Message sending risk based on uncommitted event CPU time
•  Run-time adaptable flow control for risk and optimistic processing
–  HyperWarpSpeed for supporting five-dimensional simulation
•  Branch excursions, event splitting/merging, parallel universes



[Figure: message-sending risk relative to GVT — Case 1: hold back messages; Case 2: OK to send messages]


[Figure: per-event lookahead — messages within the lookahead window are sent; messages in the risk region beyond it are held back]


[Figure: uncommitted event CPU times (Tcpu0 through Tcpu6) accumulate over time; once the processing threshold is exceeded, messages are held back]


[Figure: run-time adaptive flow control — when the number of rollbacks is unstable, decrease Nopt; when stable, slightly increase Nopt; when the number of antimessages is unstable, decrease Nrisk; when stable, slightly increase Nrisk]


Final thoughts…

OPEN DISCUSSION
•  Participate in the PDMS Standing Study Group (PDMS-SSG)
   –  Simulation Users
   –  Model Developers
   –  Technologists
   –  Sponsors
   –  Program Managers
   –  Policy Makers

•  Receive OpenUTF hands-on training for the open source reference implementation
   –  One-week hands-on training events can be arranged for groups if there is enough participation

•  Begin considering OpenUTF architecture standards
   –  OpenMSA… layered technology
   –  OSAMS… plug-and-play components
   –  OpenCAF… representation of intelligent behavior
