Cyberinfrastructure and Networks

:
The Advanced Networks and Services Underpinning the Large-Scale Science of DOE’s Office of Science
William E. Johnston ESnet Manager and Senior Scientist Lawrence Berkeley National Laboratory
1

ESnet Provides Global High-Speed Internet Connectivity for DOE Facilities and Collaborators (ca. Summer, 2005)
ESnet Science Data Network (SDN) core
Japan (SINet) Australia (AARNet) Canada (CA*net4 Taiwan (TANet2) Singaren CA*net4 France GLORIAD (Russia, China) Korea (Kreonet2 MREN Netherlands StarTap Taiwan (TANet2, ASCC) SINet (Japan) Russia (BINP) CERN (USLHCnet CERN+DOE funded) GEANT - France, Germany, Italy, UK, etc

LIGO

PNNL

ESnet IP core
MIT BNL JGI
TWC

LBNL NERSC SLAC

LLNL SNLL
PAIX-PA Equinix, etc.

QWEST ATM

AMES

FNAL ANL

Lab DC Offices

PPPL MAE-E

OSC GTN NNSA KCP OSTI ARM NOAA SRS

Equinix JLAB

ORNL

YUCCA MT

LANL SNLA
Allied Signal

ORAU

GA 42 end user sites Office Of Science Sponsored (22) NNSA Sponsored (12) Joint Sponsored (3) Other Sponsored (NSF LIGO, NOAA) Laboratory Sponsored (6) commercial and R&E peering points ESnet core hubs

ESnet IP core: Packet over SONET Optical Ring and Hubs

high-speed peering points with Internet2/Abilene

International (high speed) 10 Gb/s SDN core 10G/s IP core 2.5 Gb/s IP core MAN rings (≥ 10 G/s) OC12 ATM (622 Mb/s) OC12 / GigEthernet OC3 (155 Mb/s) 45 Mb/s and less

 DOE Office of Science Drivers for Networking

• •

The role of ESnet is to provide networking for the Office of Science Labs and their collaborators The large-scale science that is the mission of the Office of Science is dependent on networks for
o o o o o

Sharing of massive amounts of data Supporting thousands of collaborators world-wide Distributed data processing Distributed simulation, visualization, and computational steering Distributed data management

These issues were explored in two Office of Science workshops that formulated networking requirements to meet the needs of the science programs (see refs.)

3

CERN / LHC High Energy Physics Data Provides One of Science’s Most Challenging Data Management Problems
(CMS is one of several experiments at LHC)
Online System

human

~PByte/sec Tier 0 +1
CERN LHC CMS detector

~100 MBytes/sec
event reconstruction

event simulation

15m X 15m X 22m, 12,500 tons, $700M.
French Regional Center German Regional Center

2.5-40 Gbits/sec

Tier 1
Italian Center FermiLab, USA Regional Center

analysis

~0.6-2.5 Gbps Tier 2

Tier2 Center Tier2 Center Tier2 Center Tier2 Center Tier2 Center

Tier 3
Institute Institute Institute Institute ~0.25TIPS

~0.6-2.5 Gbps

• 2000 physicists in 31 countries are

Physics data cache
Courtesy Harvey Newman, CalTech

100 - 1000 Mbits/sec

involved in this 20-year experiment in which DOE is a major player. and Europe coordinates the data analysis

Tier 4
Workstations

• Grid infrastructure spread over the US

LHC Networking

• This picture represents the MONARCH model – a
hierarchical, bulk data transfer model

• Still accurate for Tier 0 (CERN) to Tier 1 (experiment
data centers) data movement

• Probably not accurate for the Tier 2 (analysis) sites

5

Example: Complicated Workflow – Many Sites

6

Distributed Workflow

• Distributed / Grid based workflow systems involve
many interacting computing and storage elements that rely on “smooth” inter-element communication for effective operation

• The new LHC Grid based data analysis model will
involve networks connecting dozens of sites and thousands of systems for each analysis “center”

7

Example: Multidisciplinary Simulation
A “complete” Chemistry Climate CO , CH , N O Temperature, Precipitation, approach to ozone, aerosols Radiation, Humidity, Wind climate Heat CO CH Moisture N O VOCs modeling Momentum Dust involves many Biogeophysics Biogeochemistry Carbon Assimilation interacting Decomposition models and data Mineralization Microclimate that are provided Canopy Physiology by different Phenology Hydrology groups at Bud Break Leaf Senescence different locationsEvaporation Gross Primary Species Composition Transpiration Production Ecosystem Structure (Tim Killeen, Snow Melt Plant Respiration Nutrient Availability Infiltration Microbial Respiration Water NCAR) Runoff
2 4 2 2 4 2

Minutes-To-Hours

Aerodynamics

Energy

Water

Days-To-Weeks

Intercepted Water

Soil Water

Snow

Years-To-Centuries

Nutrient Availability

Disturbance Fires Hurricanes Vegetation Ice Storms Dynamics Windthrows (Courtesy Gordon Bonan, NCAR: Ecological Climatology: Concepts and Applications. Cambridge University Press, Cambridge, 2002.)

Watersheds Surface Water Subsurface Water Geomorphology Hydrologic Cycle

Ecosystems Species Composition Ecosystem Structure

8

Distributed Multidisciplinary Simulation

• Distributed multidisciplinary simulation involves
integrating computing elements at several remote locations
o

Requires co-scheduling of computing, data storage, and network elements Also Quality of Service (e.g. bandwidth guarantees) There is not a lot of experience with this scenario yet, but it is coming (e.g. the new Office of Science supercomputing facility at Oak Ridge National Lab has a distributed computing elements model)

o o

9

Projected Science Requirements for Networking
Science Areas considered in the Workshop [1]
(not including Nuclear Physics and Supercomputing)

Today End2End Throughput

5 years End2End Documented Throughput Requirements 100 Gb/s

5-10 Years End2End Estimated Throughput Requirements 1000 Gb/s

Remarks

High Energy Physics

0.5 Gb/s

high bulk throughput with deadlines (Grid based analysis systems require QoS) high bulk throughput remote control and time critical throughput (QoS) time critical throughput (QoS) computational steering and collaborations high throughput and steering
10

Climate (Data & Computation) SNS NanoScience

0.5 Gb/s Not yet started 0.066 Gb/s (500 MB/s burst) 0.013 Gb/s (1 TBy/week) 0.091 Gb/s (1 TBy/day)

160-200 Gb/s 1 Gb/s

N x 1000 Gb/s 1000 Gb/s

Fusion Energy

0.198 Gb/s (500MB/ 20 sec. burst) N*N multicast

N x 1000 Gb/s

Astrophysics

1000 Gb/s

Genomics Data & Computation

100s of users

1000 Gb/s

TBytes/Month

T B y te /M o n th 400 500 600

ESnet is Currently Transporting About 530 Terabytes/mo. and this volume is increasing exponentially

100
ESnet Monthly Accepted Traffic Feb., 1990 – May, 2005

200

300

0
Feb,90 A ug,90 Feb,91 A ug,91 Feb,92 A ug,92 Feb,93 A ug,93 Feb,94 A ug,94 Feb,95 A ug,95 Feb,96 A ug,96 Feb,97 A ug,97 Feb,98 A ug,98 Feb,99 A ug,99 Feb,00 A ug,00 Feb,01 A u g ,0 1 Feb,02 A ug,02 Feb,03 A ug,03 Feb,04 A ug,04 Feb,05
11

Observed Drivers for the Evolution of ESnet

Who Generates ESnet Traffic?
ESnet Inter-Sector Traffic Summary, Jan 03 / Feb 04/ Nov 04

72/68/62%
DOE is a net supplier of data because DOE facilities are used by universities and commercial entities, as well as by DOE researchers

21/14/10% ESnet 14/12/9% 17/10/14% 10/13/16% 9/26/25% 4/6/13%

Commercial

DOE sites

~25/19/13%

R&E (mostly universities)

53/49/50%
DOE collaborator traffic, inc. data

Peering Points International (almost entirely R&E sites)

Note • more than 90% of the ESnet traffic is OSC traffic • less that 20% of the traffic is inter-Lab

Traffic coming into ESnet = Green Traffic leaving ESnet = Blue Traffic between ESnet sites % = of total ingress or egress traffic
12

A Small Number of Science Users Account for a Significant Fraction of all ESnet Traffic
ESnet Top 100 Host-to-Host Flows, Feb., 2005
Class 1: DOE LabInternational R&E Total ESnet traffic Feb., 2005 = 323 TBy in approx. 6,000,000,000 flows
All other flows (< 0.28 TBy/month each)

TBytes/Month

12 10 8 6 4 2 0

Class 2: Lab-U.S. (domestic) R&E

Class 3: Lab-Lab (domestic)

Class 4: Lab-Comm. (domestic)

Top 100 flows = 84 TBy

Notes: 1) This data does not include intra-Lab (LAN) traffic (ESnet ends at the Lab border routers, so science traffic on the Lab LANs is invisible to ESnet 2) Some Labs have private links that are not part of ESnet - that traffic is not represented here.

Terabytes/Month
2 4 6 8

10

12

10

12

0

Source and Destination of the Top 30 Flows, Feb. 2005

0
Lab-Lab (domestic) DOE Lab-International R&E Lab-U.S. R&E (domestic) Lab-Comm. (domestic)
14

SLAC (US)  RAL (UK) Fermilab (US)  WestGrid (CA) SLAC (US)  IN2P3 (FR) LIGO (US)  Caltech (US) SLAC (US)  Karlsruhe (DE) LLNL (US)  NCAR (US) SLAC (US)  INFN CNAF (IT) Fermilab (US)  MIT (US) Fermilab (US)  SDSC (US) Fermilab (US)  Johns Hopkins Fermilab (US)  Karlsruhe (DE) IN2P3 (FR)  Fermilab (US) LBNL (US)  U. Wisc. (US) Fermilab (US)  U. Texas, Austin (US) BNL (US)  LLNL (US) BNL (US)  LLNL (US) Fermilab (US)  UC Davis (US) Qwest (US)  ESnet (US) Fermilab (US)  U. Toronto (CA) BNL (US)  LLNL (US) BNL (US)  LLNL (US) CERN (CH)  BNL (US) NERSC (US)  LBNL (US) DOE/GTN (US)  JLab (US) U. Toronto (CA)  Fermilab (US) NERSC (US)  LBNL (US) NERSC (US)  LBNL (US) NERSC (US)  LBNL (US) NERSC (US)  LBNL (US) CERN (CH)  Fermilab (US)

2

4

6

8

 Observed Drivers for ESnet Evolution

The observed combination of
exponential growth in ESnet traffic, and o large science data flows becoming a significant fraction of all ESnet traffic
o

show that the projections of the science community are reasonable and are being realized

The current predominance of international traffic is due to high-energy physics
However, all of the LHC US tier-2 data analysis centers are at US universities o As the tier-2 centers come on-line, the DOE Lab to US university traffic will increase substantially
o

High energy physics is several years ahead of the other science disciplines in data generation
o

Several other disciplines and facilities (e.g. climate modeling and the supercomputer centers) will contribute comparable amounts of additional traffic in the next few years

15

DOE Science Requirements for Networking 1) Network bandwidth must increase substantially, not just in the backbone but all the way to the sites and the attached computing and storage systems 2) A highly reliable network is critical for science – when large-scale experiments depend on the network for success, the network must not fail 3) There must be network services that can guarantee various forms of quality-of-service (e.g., bandwidth guarantees) and provide traffic isolation 4) A production, extremely reliable, IP network with Internet services must support Lab operations and the process of small and medium scale science
16

 ESnet’s Place in U. S. and International Science

• ESnet and Abilene together provide most of the
nation’s transit networking for basic science
o

o

Abilene provides national transit networking for most of the US universities by interconnecting the regional networks (mostly via the GigaPoPs) ESnet provides national transit networking for the DOE Labs Abilene interconnects regional R&E networks – it does not connect sites or provide commercial peering ESnet serves the role of a tier 1 ISP for the DOE Labs
- Provides site connectivity - Provides full commercial peering so that the Labs have full Internet access
17

• ESnet differs from Internet2/Abliene in that
o o

ESnet and GEANT

• GEANT plays a role in Europe similar to Abilene and
ESnet in the US – it interconnects the European National Research and Education Networks, to which the European R&E sites connect

• GEANT currently carries essentially all ESnet traffic
to Europe (LHC use of LHCnet to CERN is still ramping up)

18

Ensuring High Bandwidth, Cross Domain Flows

• ESnet and Abilene have recently established highspeed interconnects and cross-network routing

• Goal is that DOE Lab ↔ Univ. connectivity should
be as good as Lab ↔ Lab and Univ. ↔ Univ.
 

Constant monitoring is the key US LHC Tier 2 sites need to be incorporated

• The Abilene-ESnet-GEANT joint monitoring
infrastructure is expected to become operational over the next several months (by mid-fall, 2005)

19

Monitoring DOE Lab ↔ University Connectivity
AsiaPac SEA

• Current monitor infrastructure (red&green) and target infrastructure • Uniform distribution around ESnet and around Abilene • All US LHC tier-2 sites will be added as monitors CERN
CERN Europe Europe

LBNL Abilene FNAL ESnet
SNV DEN Japan CHI

OSU

Japan

NYC DC KC IND

BNL

Japan

LA

NCS*
SDG ALB ELP ATL

SDSC

ESnet Abilene
*intermittent

HOU DOE Labs w/ monitors Universities w/ monitors Initial site monitors network hubs high-speed cross connects: ESnet ↔ Internet2/Abilene (

scheduled for FY05)

20

One Way Packet Delays Provide a Fair Bit of Information
Normal: Fixed delay from one site to another that is primary a function of geographic separation The result of a congested tail circuit to FNAL

The result of a problems with the monitoring system at CERN, not the network

21

Strategy For The Evolution of ESnet
A three part strategy for the evolution of ESnet
1) Metropolitan Area Network (MAN) rings to provide
dual site connectivity for reliability much higher site-to-core bandwidth support for both production IP and circuit-based traffic provisioned, guaranteed bandwidth circuits to support large, high-speed science data flows very high total bandwidth multiply connecting MAN rings for protection against hub failure alternate path for production IP traffic general science requirements Lab operational requirements Backup for the SDN core vehicle for science services
22

2) A Science Data Network (SDN) core for

3) A High-reliability IP core (e.g. the current ESnet core) to address

Strategy For The Evolution of ESnet: Two Core Networks and Metro. Area Rings
Asia-Pacific CERN GEANT (Europe) Australia

Science Data Network Core (SDN) (NLR circuits)

New York Sunnyvale

IP Core
LA

Washington, DC

Aus.

San Diego
Metropolitan Area Rings

Albuquerque

IP core hubs SDN/NLR hubs

Primary DOE Labs New hubs

Production IP core (10-20 Gbps) Science Data Network core (30-50 Gbps) Metropolitan Area Networks (20+ Gbps) Lab supplied (10+ Gbps) International connections (10-40 Gbps)

ESnet MAN Architecture (e.g. Chicago)
core router
T320

R&E peerings International peerings core router Qwest

ESnet production IP core

ESnet SDN core

switches managing multiple lambdas Starlight
ESnet managed λ / circuit services ESnet management and monitoring

2-4 x 10 Gbps channels

ESnet production IP service

ESnet managed λ / circuit services tunneled through the IP backbone

FNAL monitor monitor

ANL

site equip.

Site gateway router Site LAN

Site gateway router Site LAN

site equip.
24

First Two Steps in the Evolution of ESnet 1) The SF Bay Area MAN will provide to the five OSC Bay Area sites
o o

Very high speed site access – 20 Gb/s Fully redundant site access

2) The first two segments of the second national 10 Gb/s core – the Science Data Network – are San Diego to Sunnyvale to Seattle

25

ESnet SF Bay Area MAN Ring (Sept., 2005)
• 2 λs (2 X 10 Gb/s channels) in a ring configuration, and delivered as 10 GigEther circuits - 10-50X current site bandwidth • Dual site connection (independent “east” and “west” connections) to each site • Will be used as a 10 Gb/s production IP ring and 2 X 10 Gb/s paths (for circuit services) to each site • Qwest contract signed for two lambdas 2/2005 with options on two more • Project completion date is 9/2005

SDN to Seattle (NLR)

IP core to Chicago (Qwest)

Joint Genome Institute LBNL NERSC

10 Gb/s optical channels
λ1 production IP λ2 SDN/circuits λ3 future λ4 future

SF Bay Area

LLNL SNLL

SLAC
DOE Ultra Science Net (research net) Level 3 hub Qwest / ESnet hub

ESnet MAN ring (Qwest circuits) ESnet hubs and sites

SDN to San Diego

NASA Ames

IP core to El Paso

SF Bay Area MAN – Typical Site Configuration
max. of 2x10G connections on any line card to avoid switch limitations 0-10 Gb/s drop-off IP traffic Site LAN site Site nx1GE or 10GE IP

West λ1 and λ2

ESnet

6509

SF BA MAN

1 or 2 x 10 GE (provisioned circuits via VLANS) 0-20 Gb/s VLAN traffic 0-10 Gb/s pass-through VLAN traffic 0-10 Gb/s pass-through IP traffic = 24 x 1 GE line cards = 4 x 10 GE line cards (using 2 ports max. per card) 27

East λ1 and λ2

Evolution of ESnet – Step One: SF Bay Area MAN and West Coast SDN
Asia-Pacific CERN GEANT (Europe) Australia

Science Data Network Core (SDN) (NLR circuits)

New York Sunnyvale

IP Core (Qwest)
LA

Aus.

Washington, DC

San Diego
Metropolitan Area Rings

Albuquerque El Paso Production IP core Science Data Network core Metropolitan Area Networks Lab supplied International connections

IP core hubs SDN/NLR hubs

Primary DOE Labs New hubs

In service by Sept., 2005 planned

 ESnet Goal – 2009/2010
AsiaPac SEA

• 10 Gbps enterprise IP traffic • 40-60 Gbps circuit based transport
ESnet Science Data Network (2nd Core – 30-50 Gbps, National Lambda Rail)

CERN Europe

Europe

CERN

Aus.

Japan CHI

Japan Europe

SNV DEN
Metropolitan Area Rings

NYC DC

Aus. ALB SDG

ESnet IP Core (≥10 Gbps)

ATL

ESnet hubs New ESnet hubs ELP Metropolitan Area Rings Major DOE Office of Science Sites High-speed cross connects with Internet2/Abilene Production IP ESnet core Science Data Network core Lab supplied Major international 10Gb/s 10Gb/s 30Gb/s 40Gb/s 29

 Near-Term Needs for LHC Networking

• The data movement requirements of the several
experiments at the CERN/LHC are considerable

• Original MONARC model (CY2000 - Models of
Networked Analysis at Regional Centres for LHC Experiments – Harvey Newman’s slide, above) predicted o Initial need for 10 Gb/s dedicated bandwidth for LHC startup (2007) to each of the US Tier 1 Data Centers - By 2010 the number is expected to 20-40 Gb/s per Center o Initial need for 1 Gb/s from the Tier 1 Centers to each of the associated Tier 2 centers
30

Near-Term Needs for LHC Networking

• However, with the LHC commitment to Grid based
data analysis systems, the expected bandwidth and network service requirements for the Tier 2 centers are much greater than the MONARCH bulk data movement model
o

MONARCH still probably holds for the Tier0 (CERN) – Tier 1 transfers For widely distributed Grid workflow systems QoS is considered essential
Without a smooth flow of data between workflow nodes the overall system would likely be very inefficient due to stalling the computing and storage elements

o

• Both high bandwidth and QoS network services
must be addressed for LHC data analysis
31

Proposed LHC high-level architecture
This image cannot currently be display ed.

T0/T1/T2 Interconnectivity
T2s and T1s are inter-connected by the general purpose research networks
T2 T2 T2

GridKa T2 IN2P3 T2

T2

Any Tier-2 may access data at any Tier-1

Dedicated 10 Gbit links
Brookhaven

TRIUMF

ASCC T2 Nordic Fermilab CNAF RAL T2 SARA PIC

T2

T2

LHC Network Operations Working Group, LHC Computing Grid Project

32

Near-Term Needs for North American LHC Networking

• Primary data paths from LHC Tier 0 to Tier 1
Centers will be dedicated 10Gb/s circuits

• Backup paths must be provided
o o

About day’s worth of data can be buffered at CERN However, unless both the network and the analysis systems are over-provisioned it may not be possible to catch up even when the network is restored Primary: Dedicated 10G circuits provided by CERN and DOE Secondary: Preemptable10G circuits (e.g. ESnet’s SDN, NSF’s IRNC links, GLIF, CA*net4) Tertiary: Assignable QoS bandwidth on the production networks (ESnet, Abilene, GEANT, CA*net4)
33

• Three level backup strategy
o o o

Proposed LHC high-level architecture

Tier2s

Tier1s

L3 Backbones

Tier0
Main connection Backup connection

34

LHC Networking and ESnet, Abilene, and GEANT

USLHCnet (CERN+DOE funded) supports US participation in the LHC experiments
o

Dedicated high bandwidth circuits from CERN to the U.S. transfer LHC data to the US Tier 1 data centers (FNAL and BNL)

ESnet is responsible for getting the data from the transAtlantic connection points for the European circuits (Chicago and NYC) to the Tier 1 sites
o

ESnet is also responsible for providing backup paths from the transAtlantic connection points to the Tier 1 sites

• •

Abilene is responsible for getting data from ESnet to the Tier 2 sites The new ESnet architecture (Science Data Network) is intended to accommodate the anticipated 20-40 Gb/s from LHC to US (both US tier 1 centers are on ESnet)
35

ESnet Lambda Infrastructure and LHC T0-T1 Networking
CERN-1 CERN-2 CERN-3 TRIUMF Vancouver Seattle Toronto Boise BNL Chicago Sunnyvale Denver KC Pitts FNAL Wash DC Raleigh GEANT-2 LA San Diego Dallas Phoenix Albuq. Tulsa Atlanta Clev

CANARIE

New York GEANT-1

NLR PoPs ESnet IP core hubs

El Paso Las Cruces San Ant. Houston

Jacksonville Pensacola Baton Rouge

ESnet SDN/NLR hubs Tier 1 Centers Cross connects with Internet2/Abilene New hubs

ESnet Production IP core (10-20 Gbps) ESnet Science Data Network core (10G/link) (incremental upgrades, 2007-2010) Other NLR links CERN/DOE supplied (10G/link) International IP connections (10G/link)

36

Abilene* and LHC Tier 2, Near-Term Networking
CERN-1 CERN-2 CERN-3 TRIUMF Vancouver Seattle Toronto Boise Chicago Sunnyvale Denver KC Pitts FNAL Wash DC Raleigh GEANT-2 LA San Diego Dallas Phoenix Albuq. Tulsa Atlanta Clev

CANARIE

BNL New York

Jacksonville El Paso Atlas Tier 2 Centers Pensacola Las Cruces NLR PoPs • University of Texas at Arlington Baton Rouge Houston • University CMS Tier 2 Centers San Ant. ESnet of IP Oklahoma core hubsNorman • University of New Mexico Albuquerque • MIT ESnet Production IP core (10-20 Gbps) • Langston • University of Florida at Gainesville ESnetUniversity SDN/NLR hubs < 10G connections to Abilene ESnet Science Data Network core (10G/link) • University of Chicago • University of Nebraska at Lincoln (incremental upgrades, 2007-2010) Tier University 1 CentersBloomington 10G connections to USLHC or • Indiana • University of Wisconsin at Madison Other NLR links ESnet • Boston University • Caltech Cross connects with Internet2/Abilene CERN/DOE supplied (10G/link) • Harvard University • Purdue University USLHCnet nodes New hubs • University of Michigan • University of California SanInternational Diego 37 IP connections (10G/link)

GEANT-1

 QoS - New Network Service

• New network services are critical for ESnet to meet
the needs of large-scale science like the LHC

• Most important new network service is dynamically
provisioned virtual circuits that provide
o

Traffic isolation
- will enable the use of high-performance, non-standard transport mechanisms that cannot co-exist with commodity TCP based transport (see, e.g., Tom Dunigan’s compendium http://www.csm.ornl.gov/~dunigan/netperf/netlinks.html )

o

Guaranteed bandwidth
- the only way that we have currently to address deadline scheduling – e.g. where fixed amounts of data have to reach sites on a fixed schedule in order that the processing does not fall behind far enough so that it could never catch up – very important for experiment data analysis
38

OSCARS: Guaranteed Bandwidth Service

• Must accommodate networks that are shared
resources
o o o

Multiple QoS paths Guaranteed minimum level of service for best effort traffic Allocation management
- There will be hundreds of contenders with different science priorities

39

OSCARS: Guaranteed Bandwidth Service

• Virtual circuits must be set up end-to-end across
ESnet, Abilene, and GEANT, as well as the campuses
o o

There are many issues that are poorly understood To ensure compatibility the work is a collaboration with the other major science R&E networks
- code is being jointly developed with Internet2's Bandwidth Reservation for User Work (BRUW) project – part of the Abilene HOPI (Hybrid Optical-Packet Infrastructure) project - Close cooperation with the GEANT virtual circuit project (“lightpaths – Joint Research Activity 3 project)

40

OSCARS: Guaranteed Bandwidth Service
authorization

user system1 site A

shaper

policer

resource manager

bandwidth broker

allocation manager

• To address all of the issues is complex -There are many potential restriction points -There are many users that would like priority service, which must be rationed

resource manager

policer

resource manager

user system2 site B

41

Between ESnet, Abilene, GEANT, and the connected regional R&E networks, there will be dozens of lambdas in production networks that are shared between thousands of users who want to use virtual circuits.
similar situation in Europe

US R&E environment

Federated Trust Services

• Remote, multi-institutional, identity authentication is
critical for distributed, collaborative science in order to permit sharing computing and data resources, and other Grid services

• Managing cross site trust agreements among many
organizations is crucial for authentication in collaborative environments
o

ESnet assists in negotiating and managing the cross-site, cross-organization, and international trust relationships to provide policies that are tailored to collaborative science

• The form of the ESnet trust services are driven
entirely by the requirements of the science community and direct input from the science community
43

ESnet Public Key Infrastructure

• ESnet provides Public Key Infrastructure and X.509
identity certificates that are the basis of secure, cross-site authentication of people and Grid systems

• These services (www.doegrids.org) provide
o

Several Certification Authorities (CA) with different uses and policies that issue certificates after validating request against policy This service was the basis of the first routine sharing of HEP computing resources between US and Europe

44

ESnet Public Key Infrastructure

• ESnet provides Public Key Infrastructure and X.509
identity certificates that are the basis of secure, cross-site authentication of people and Grid systems

• The characteristics and policy of the several PKI
certificate issuing authorities are driven by the science community and policy oversight (the Policy Management Authority – PMA) is provided by the science community + ESnet staff

• These services (www.doegrids.org) provide
o

Several Certification Authorities (CA) with different uses and policies that issue certificates after validating certificate requests against policy This service was the basis of the first routine sharing of HEP computing resources between US and Europe
45

ESnet Public Key Infrastructure
• Root CA is kept off-line in a vault • Subordinate CAs are kept in locked,
alarmed racks in an access controlled machine room and have dedicated firewalls
ESnet root CA

• CAs with different policies as required by
the science community o DOEGrids CA has a policy tailored to accommodate international science collaboration o NERSC CA policy integrates CA and certificate issuance with NIM (NERSC user accounts management services) o FusionGrid CA supports the FusionGrid roaming authentication and authorization services, providing complete key lifecycle management
46

DOEGrids CA NERSC CA FusionGrid CA

…… CA

DOEGrids CA (one of several CAs) Usage Statistics
7500 7250 7000 6750 6500 6250 6000 5750 5500 5250 5000 4750 4500 4250 4000 3750 3500 3250 3000 2750 2500 2250 2000 1750 1500 1250 1000 750 500 250 0
Ja n Fe -0 b 3 M -03 ar A -03 p M r-0 ay 3 Ju 03 nJu 03 A l-0 ug 3 S -03 ep O -03 ct N -03 ov D -0 ec 3 Ja -03 n Fe -0 b- 4 M 04 ar A 04 pr M -0 ay 4 Ju -04 nJu 04 A l-0 ug 4 S -04 ep O -04 ct N -04 ov D -0 ec 4 Ja -04 n Fe -0 b- 5 M 05 ar A 05 pr M -0 ay 5 Ju 05 nJu 05 l-0 5

User Certificates Service Certificates Expired(+revoked) Certificates Total Certificates Issued Total Cert Requests

No.of certificates or requests

Production se rvice be ga n in June 2003

User Certificates Host & Service Certificates ESnet SSL Server CA Certificates DOEGrids CA 2 CA Certificates (NERSC) FusionGRID CA certificates

1999 Total No. of Certificates 3461 Total No. of Requests

5479 7006 38 15 76
* Report as of Jun 15, 2005 47

DOEGrids CA Usage - Virtual Organization Breakdown
DOEGrids CA Statistics (5479) ANL 3.5% DOESG 0.3% ESG 0.8% ESnet 0.4% FusionGRID 4.8% iVDGL 18.8%

*Others 41.2%

“Other” is mostly auto renewal certs (via the Replacement Certificate interface) that does not provide VO information
NCC-EPA 0.1% LCG 0.6%

LBNL 1.2% NERSC 3.2% ORNL 0.6% PNNL 0.4%

FNAL 8.9%

PPDG 15.2%

*DOE-NSF

collab.
48

North American Policy Management Authority
• • • • • • • •
The Americas Grid, Policy Management Authority An important step toward regularizing the management of trust in the international science community Driven by European requirements for a single Grid Certificate Authority policy representing scientific/research communities in the Americas Investigate Cross-signing and CA Hierarchies support for the science community Investigate alternative authentication services Peer with the other Grid Regional Policy Management Authorities (PMA).
European Grid PMA [www.eugridpma.org ] o Asian Pacific Grid PMA [www.apgridpma.org]
o

Started in Fall 2004 [www.TAGPMA.org] Founding members
o o o o o

DOEGrids (ESnet) Fermi National Accelerator Laboratory SLAC TeraGrid (NSF) CANARIE (Canadian national R&E network)
49

References – DOE Network Related Planning Workshops
 
1) High Performance Network Planning Workshop, August 2002
http://www.doecollaboratory.org/meetings/hpnpw

2) DOE Science Networking Roadmap Meeting, June 2003
http://www.es.net/hypertext/welcome/pr/Roadmap/index.html

3) DOE Workshop on Ultra High-Speed Transport Protocols and Network Provisioning for Large-Scale Science Applications, April 2003
http://www.csm.ornl.gov/ghpn/wk2003

4) Science Case for Large Scale Simulation, June 2003
http://www.pnl.gov/scales/

5) Workshop on the Road Map for the Revitalization of High End Computing, June 2003
http://www.cra.org/Activities/workshops/nitrd http://www.sc.doe.gov/ascr/20040510_hecrtf.pdf (public report)

6) ASCR Strategic Planning Workshop, July 2003
http://www.fp-mcs.anl.gov/ascr-july03spw

7) Planning Workshops-Office of Science Data-Management Strategy, March & May 2004
o

http://www-conf.slac.stanford.edu/dmw2004
50

Sign up to vote on this title
UsefulNot useful