
CSCE569 Parallel Computing

Lecture 1
TTH 03:30PM-04:45PM
Dr. Jianjun Hu
http://mleg.cse.sc.edu/edu/csce569/

University of South Carolina


Department of Computer Science and Engineering
CSCE569 Course Information
Meeting time: TTH 03:30PM-04:45PM, Swearingen 2A21

4 Homeworks
- Submit via the CSE turn-in system (https://dropbox.cse.sc.edu)
- Deadline policy applies
1 Midterm Exam (conceptual understanding)
1 Final Project (a deliverable to show your future employer!)
- Teamwork
- Implementation project or research project
TA: none.
CSCE569 Course Information
Textbook and references
- Parallel Programming: for Multicore and Cluster Systems,
  by Thomas Rauber and Gudula Rünger.
  Springer, 1st edition (March 10, 2010)
- Good reference: Parallel Programming in C with MPI and OpenMP,
  by Michael J. Quinn
- Most important information source: the slides
Grading policy
- 4 homeworks, 1 midterm, 1 final project, in-class participation
About Your Instructor
Dr. Jianjun Hu (jianjunh@cse.sc.edu)
Office hours: TTH 2:30-3:20PM, or drop by any time
Office: 3A66 Swearingen; phone: 803-777-7304
Background:
- Mechanical engineering/CAD
- Machine learning/computational intelligence/genetic algorithms/genetic programming (PhD)
- Bioinformatics and genomics (postdoc)
Multi-disciplinary, just like parallel computing applications.
Outline
Motivation
Modern scientific method
Evolution of supercomputing
Modern parallel computers
Seeking concurrency
Data clustering case study
Programming parallel computers
Why Are You Here?
- Solve BIG problems
- Use supercomputers
- Write parallel programs
Why Faster Computers?
Solve compute-intensive problems faster
Make infeasible problems feasible
Reduce design time
Solve larger problems in the same amount of time
Improve the precision of answers
Gain competitive advantage
Why Parallel Computing?
The massively parallel architecture of GPUs, coming
from their graphics heritage, is now delivering
transformative results for scientists and researchers all
over the world. For some of the world’s most
challenging problems in medical research, drug
discovery, weather modeling, and seismic
exploration – computation is the ultimate tool.
Without it, research would still be confined to trial and
error-based physical experiments and observation.
What problems need Parallel Computing?
Parallel Computing in the Real World
- Engineering
- Science
- Business
- Games
- Cloud computing
Definitions
Parallel computing
- Using a parallel computer to solve a single problem faster
Parallel computer
- A multiprocessor/multicore system that supports parallel programming
Parallel programming
- Programming in a language that supports concurrency explicitly
Classical Science
[Diagram: Nature → Observation → Theory ↔ Physical Experimentation]
Modern Scientific Method
[Diagram: Nature → Observation → Theory ↔ Numerical Simulation and Physical Experimentation]
Evolution of Supercomputing
World War II
Hand-computed artillery tables
Need to speed up computations
ENIAC
Cold War
Nuclear weapon design
Intelligence gathering
Code-breaking
Supercomputer
General-purpose computer
Solves individual problems at high speeds, compared
with contemporary systems
Typically costs $10 million or more
Traditionally found in government labs
Commercial Supercomputing
Started in capital-intensive industries
Petroleum exploration
Automobile manufacturing
Other companies followed suit
Pharmaceutical design
Consumer products
CPUs 1 Million Times Faster
Faster clock speeds
Greater system concurrency
Multiple functional units
Concurrent instruction execution
Speculative instruction execution
Systems 1 Billion Times Faster
Processors are 1 million times faster
Combine thousands of processors (10^6 per-processor speedup × 10^3 processors ≈ 10^9)
Parallel computer
Multiple processors
Supports parallel programming
Parallel computing = Using a parallel computer to
execute a program faster
Beowulf Concept
NASA (Sterling and Becker)
Commodity processors
Commodity interconnect
Linux operating system
Message Passing Interface (MPI) library
High performance per dollar for certain applications
Computing speed of supercomputers
Projected Computing speed of supercomputers
Top 10 Supercomputers (November 2010)
[Figure: the top-10 list, with GPU-based systems highlighted]
What you can use
Hardware
Multicore chips (in 2011 mostly 2 or 4 cores, but counts are doubling; cores = processors)
Servers (often 2 or 4 multicore chips sharing memory)
Clusters (several to tens of servers, sometimes many more, not sharing memory)
Supercomputers at USC CEC
- 64 nodes, dual CPU
- 76 compute nodes with dual 3.4 GHz processors
Supercomputers at USC CEC
SGI Altix 4700 shared-memory system
Hardware
- 128 Itanium cores @ 1.6 GHz, 8 MB cache
- 256 GB RAM
- 8 TB storage
- NUMAlink interconnect fabric
Software
- SUSE 10 with SGI ProPack
- Intel C/C++ and Fortran compilers
- VASP
- PBS Pro scheduling software
- Message Passing Toolkit
- Intel Math Kernel Library
- GNU Scientific Library
- Boost library
Some historical machines
Earth Simulator was #1
Some interesting hardware
- NVIDIA GPUs
- The Cell processor
- SiCortex: "Teraflops from Milliwatts"
  http://www.sicortex.com/products/sc648
  http://www.gizmag.com/mit-cycling-human-powered-computation/8503/
GPU-based supercomputing + CUDA
Topic 1: Hardware architecture of parallel computing systems
Topic 2: Programming/Software
Common parallel computing methods
- PBS: job scheduling system
- MPI: the Message Passing Interface
  A low-level "lowest common denominator" interface that the world has stuck with for nearly 20 years
  Can deliver performance, but can be a hindrance as well
- Pthreads for multicore shared-memory parallel programming (see the sketch after this list)
- CUDA for GPU programming
- MapReduce: Google-style high-performance computing
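As a preview of the shared-memory style mentioned above, here is a minimal Pthreads sketch (illustrative, not course-provided code): four threads each fill a disjoint slice of one shared array, and pthread_join waits for them all to finish.

/* Minimal Pthreads sketch (illustrative, not course code).
   Compile with: gcc -pthread slices.c -o slices */
#include <stdio.h>
#include <pthread.h>

#define NTHREADS 4
#define N 16

static int data[N];                      /* shared memory: visible to all threads */

static void *worker(void *arg)
{
    long id = (long)arg;
    int chunk = N / NTHREADS;
    for (int i = id * chunk; i < (id + 1) * chunk; i++)
        data[i] = i * i;                 /* each thread writes a disjoint slice */
    return NULL;
}

int main(void)
{
    pthread_t threads[NTHREADS];
    for (long id = 0; id < NTHREADS; id++)
        pthread_create(&threads[id], NULL, worker, (void *)id);
    for (int id = 0; id < NTHREADS; id++)
        pthread_join(threads[id], NULL); /* wait for every thread to finish */
    for (int i = 0; i < N; i++)
        printf("%d ", data[i]);
    printf("\n");
    return 0;
}

Because the slices are disjoint, no locking is needed; sharing the same index range across threads would require synchronization.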
Why MPI?
MPI = “Message Passing Interface”
Standard specification for message-passing libraries
Libraries available on virtually all parallel computers
Free libraries also available for networks of
workstations or commodity clusters
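To make the message-passing model concrete, here is a minimal MPI "hello" sketch (illustrative, not course-provided code): every process runs the same program and learns its own rank.

/* Minimal MPI sketch (illustrative, not course code).
   Compile: mpicc hello.c -o hello    Run: mpirun -np 4 ./hello */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size;
    MPI_Init(&argc, &argv);               /* start the MPI runtime */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* this process's id (0..size-1) */
    MPI_Comm_size(MPI_COMM_WORLD, &size); /* total number of processes */
    printf("Hello from process %d of %d\n", rank, size);
    MPI_Finalize();                       /* shut down cleanly */
    return 0;
}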
Why OpenMP?
OpenMP is an application programming interface (API) for shared-memory systems
Supports high-performance parallel programming of symmetric multiprocessors
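For contrast with MPI, a minimal OpenMP sketch (illustrative, not course-provided code): a single pragma parallelizes a loop across the cores of a shared-memory machine.

/* Minimal OpenMP sketch (illustrative, not course code).
   Compile: gcc -fopenmp sum.c -o sum */
#include <stdio.h>
#include <omp.h>

int main(void)
{
    const int n = 1000000;
    double sum = 0.0;

    /* The pragma splits iterations among threads; reduction(+:sum)
       gives each thread a private partial sum and combines them at the end. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 1; i <= n; i++)
        sum += 1.0 / i;                  /* partial sums of the harmonic series */

    printf("sum = %f (max threads: %d)\n", sum, omp_get_max_threads());
    return 0;
}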
Topic 3: Performance
- Single-processor speeds are no longer growing.
- Moore's law still allows more real estate per core (transistor counts double roughly every two years)
  http://www.intel.com/technology/mooreslaw/index.htm
- People want performance, but it is hard to get
- Slowdowns are often seen before speedups
- Flops (floating-point operations per second):
  Gigaflops (10^9), Teraflops (10^12), Petaflops (10^15) (see the timing sketch below)
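As a concrete illustration of the flops unit, here is a small sketch (not from the slides) that times a loop of multiply-adds and reports achieved Gflops; counting 2 operations per iteration and using clock() are simplifying assumptions.

/* Illustrative sketch (not course code): estimate achieved flops by
   timing a known number of floating-point operations.
   Compile: gcc flops.c -o flops */
#include <stdio.h>
#include <time.h>

int main(void)
{
    const long n = 100000000;            /* 1e8 iterations, 2 flops each */
    double x = 1.0000001, y = 0.0;

    clock_t start = clock();
    for (long i = 0; i < n; i++)
        y += x * x;                      /* one multiply + one add */
    double secs = (double)(clock() - start) / CLOCKS_PER_SEC;

    /* Printing y keeps the compiler from optimizing the loop away. */
    printf("y = %f, %.2f Gflops\n", y, 2.0 * n / secs / 1e9);
    return 0;
}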
Summary (1/2)
High performance computing
U.S. government
Capital-intensive industries
Many companies and research labs
Parallel computers
Commercial systems
Commodity-based systems
Summary (2/2)
Power of CPUs keeps growing exponentially
Parallel programming environments changing very
slowly
Two standards have emerged
MPI library, for processes that do not share
memory
OpenMP directives, for processes that do share
memory
Places to Look
Best current news:
http://www.hpcwire.com/
Huge Conference:
http://sc09.supercomputing.org/
http://www.interactivesupercomputing.com
Top500.org