
2006-1733: DESIGNING AND IMPLEMENTING A PARALLEL COMPUTING CURRICULUM BASED ON BEOWULF CLUSTERING

Fitra Khan, University of Texas-Brownsville


Mahmoud Quweider, University of Texas-Brownsville
Juan Iglesias, University of Texas-Brownsville
Amjad Zaim, University of Texas-Brownsville

Page 11.418.1

© American Society for Engineering Education, 2006


Designing and Implementing a Parallel Computing Curriculum
Based on Beowulf Clustering1

Introduction

The Computer Science/Computer Information Systems (CS/CIS) Department at The University
of Texas at Brownsville (UTB) has improved its curriculum by including parallel computing
topics based on a computing and networking laboratory (CNL)1. CNL is built around a 24-node
distributed Beowulf2,3 supercomputer, and its main goal is to enhance the understanding of
parallel computing principles in key courses of the Bachelor of Science in Computer Science
(BS-CS) degree, the two-year Associate in Applied Science in Computer Information Systems
(AAS-CIS), and the four-year Bachelor of Applied Technology in Computer Information
Systems Technology (BAT-CIST).

The strategy has been to use this supercomputer as the main instrument to infuse concepts and
principles into targeted courses by creating a set of laboratory modules and capstone projects.
Such a project framework in CS education is strongly emphasized in the ACM/IEEE-CS curricula
model4. CNL has helped motivate students by engaging them in integrating distributed
computing and networking concepts into their course work through laboratory modules and
capstone projects.

There are benefits in joining the practice and theory of different computer science areas via an
integrated laboratory environment such as the one provided by CNL. First, it is easier to develop
laboratory modules that help students to put different theoretical concepts together5,6. Second, an
integrated laboratory is a low-cost solution compared to developing separate physical
laboratories to serve different areas of computer science.

The laboratory has proved to be a dynamic educational tool for providing in-depth understanding
of essential concepts by incorporating state-of-the-art technologies into the curricula. This has
allowed educators to keep developing new laboratory modules to enrich their courses. In
addition to the modules already implemented in areas such as networking, databases, and operating
systems, new modules are planned in areas such as encryption, autonomous intelligent systems,
and web design and programming.

After being supported originally by NSF, the CNL project has reached maturity and is now
institutionalized. This paper details the rationale, scope, and achievements of the project. The
methodology used is also discussed, with emphasis on considerations and feasibility for
implementing a similar computing and networking environment at peer institutions.

1 This material is based upon work supported by the National Science Foundation under Grant
No. 0101648.

Laboratory Design

The CNL project is built around the concept of a laboratory that offers laboratory projects in key
courses of computer science. The equipment consists mainly of 24 computers, three Alpha
workstations, and network hardware used to build a 24-node rack-mounted Beowulf cluster. The
cluster currently runs the open-source Linux operating system. The
clustering software is based on a Message Passing Interface (MPI) package called MPICH7,
which is available free of charge. MPI-based software packages and toolkits are used to familiarize
students with real-world tools for developing and implementing algorithms with a short development
cycle.

The Beowulf cluster is complemented with devices to develop real-world laboratory projects in
order to enhance student understanding of important concepts of computer science. For example,
network devices that simulate the leased lines of a Public Switched Network (PSN)8 are installed to
provide a realistic network environment. As another example, image-capturing
devices were acquired to capture an image of an object for recognition by a neural-network-based
pattern recognition algorithm.

The hardware is interconnected using auxiliary network devices to provide a locally simulated
PSN that models the real-world connectivity environment. The network hardware includes a
400 Mbps network switching matrix with a 100 Mbps Fast Ethernet uplink to the building's
Gigabit backbone. The backplane is made up of 10 Mbps switches (VN900EE and VN900EA)
providing a total of one ATM port and 36 switched 10 Mbps ports. The building's LAN is
connected to the Internet via a GigaMAN circuit leased from the local phone provider.

One of the three Alpha workstations is used for distributing tasks to the Beowulf and also for
accepting tasks from connected users. This Alpha workstation is the management station for the
Beowulf. The second Alpha workstation is used to compile and analyze results produced by the
Beowulf. It is a dedicated user workstation because of the graphics capability required to analyze data. The third
Alpha workstation is placed on the far side of the simulated Public Switched Network (PSN).
Among its many tasks, it is used to simulate congestion on the PSN by transferring large
amounts of data back and forth across the PSN.

Figure 1 shows the general schematic of CNL. The laboratory houses the 24 computers that
constitute the 24-node rack-mounted Beowulf as a central component of B-CEIL. Network
devices are required to simulate a real-world PSN. These consist of a pair of T1-to-V.35 devices
to simulate a leased line8, a pair of DACs to aggregate or cross-connect different T1
channels, a pair of routers to provide WAN-to-LAN connectivity at each end of the leased line, and
VoIP units on each end to simulate real-world voice grade channels. The Beowulf nodes and
other LAN equipment are connected by a hub backplane. A LAN switch provides connectivity to
the LAN devices on the other side of the simulated PSN.

Figure 1: Overall CNL configuration.

Hardware is also available to introduce image processing algorithms9. Transducers are used to
convert video signals to a binary frame format suitable for image processing10. A high-resolution
charge-coupled device (CCD) camera and a comparable image capture card are used to
capture high-resolution images so that real-world images can be processed. A pair of video codecs is
used for benchmarking student algorithms.

Implementation

The authors participated in the implementation of the project; each one was scheduled to
teach two different courses per semester, for which the corresponding laboratory modules (LM)
were developed. A total of eight courses were selected for utilizing B-CEIL in the first year of
this project: COSC 3330 Networking and Database Management Systems, COSC 3310 Systems
Programming and Concurrent Processes, COSC 3325 Digital Logic and Computer Organization,
COSC 4310 Operating Systems, COSC 3355 Principles of Programming Languages, COSC
4342 Database Management Systems, COSC 4360 Numerical Methods, and COSC 4380 Image
Processing.

Two levels of student laboratory projects were developed for curriculum enrichment. Appendix
A presents a finer breakdown of the LMs, including the subject areas in which they were used to
enhance understanding of essential concepts.

The first level of student laboratory projects was related directly to the Beowulf cluster itself,
specifically its hardware architecture, connectivity, and existence as a logical cluster. This
entails the development of laboratory projects on topics such as computer interfacing, Local Area
Networking (LAN), clustering, task scheduling and optimization, and benchmarking.

The second level of student laboratory projects focused on setting up a PSN and associated
data/voice channels to model real-world connectivity, and on building applications for Beowulf
in the simulated PSN environment. This includes the development of laboratory projects in the
area of Wide Area Networking (WAN), in order to enhance understanding of real-world PSN
based connectivity, and computationally intensive fields such as artificial neural networks, image
compression, image analysis, numerical analysis, and distributed databases where parallel
processing concepts may be utilized to speed up computations.

LMs were developed so that students could complete them in one to two weeks during
regular semester instruction. In addition to LMs, course projects (CP) were also proposed to students.
These CPs were long-term in the sense that they were intended to be developed over
the entire semester and carried on across different offerings of the same course from one
semester to another.

Topics of CPs were not restricted to the scope of a single course. Instead, CPs were
developed with a cross-disciplinary emphasis that could integrate different areas of
computer science. Appendix B gives a more detailed description of the CPs.

As Appendix B shows, the topics of the CPs range widely, from
an “Integrated Monitoring System” for public networks to the “Parallel Simulation of
Electromagnetic Wave Propagation” and “Optimization Based on Genetic Algorithms”. This
variety reflects the versatility and generality of the CNL.

Results

During the three years of its implementation, the project has proven to be successful. The
laboratory was opened in fall 2002 and it has remained operational ever since. During this time
students and faculty have received the benefit of this lab.

LMs created so far cover topics such as Beowulf cluster design and implementation,
benchmarking computational machines, public switched networks, voice over IP, monitoring
network traffic bandwidth, image processing, Taylor series, number representation, LAM and
MPI for Beowulf clusters, concurrent and parallel processing, database paging management, and
several others.

Besides the application of the modules in the classroom, it is interesting to note that the
laboratory has been used as a recruitment and retention tool. Additionally, students and faculty
have used the laboratory in projects that have motivated students to advance in research and
continue their education by pursuing graduate studies. In fact, students have already
presented results from their scholarly work11,12.

Some of the research projects motivated by the laboratory include topics like hybrid
software/hardware approaches for teaching digital logic, implementation of multithreaded web
servers using Java, implementation of integrated monitoring systems, studying the effects of
congestion control on multimedia applications, and software/hardware simulation of
multi-functioned calculators, among others.

Each of the laboratory modules and course projects can be freely accessed at
http://blue.utb.edu/bceil. Further information about the research projects is located on this web
site as well.

Conclusions

In this paper the authors have presented the development, current state and future work of a
parallel computing and networking laboratory built around a 24-node Beowulf cluster that is
intended to raise the educational levels at The University of Texas at Brownsville. The project
has proven to be a successful tool for stimulating students enrolled in participating
courses.

The program is currently being institutionalized at UTB. Faculty and students now benefit from
having this tool for research purposes and the development of laboratory modules.

Since our programs in computer science and computer information systems at UTB did not
previously have a hardware lab, CNL has had a great impact on our ability to provide opportunities
for our students to understand the contents of a wide variety of computer courses. Furthermore,
CNL has proved to be a powerful tool in terms of enrollment and retention.

Acknowledgments

The authors would like to acknowledge all the students who have made CNL a successful project.
We especially thank Francisco Arteaga, Mario Guajardo, Ariel Martinez, Brian W. Matthews,
David Ortiz, Julie Pedraza, and Jose D. Zamora.

Bibliographic Information
1. Khan, F. and Quweider, M., “Beowulf based Curriculum Enrichment Integrated Laboratory,” National Science
Foundation ATE Grant 2001.
2. Sterling, T. et al., “How to Build a Beowulf: A Guide to the Implementation and Application of PC Clusters,”
The MIT Press, 1999
3. Spector, D., “Building Linux Clusters: Scaling Linux for Scientific and Enterprise Applications,” O’Reilly &
Associates, Inc., 2000
4. The Joint Task Force on Computing Curricula IEEE-CS ACM, “Computing Curricula 2001 Computer
Science”.
5. Khan, F., “Lessons Learned from an NSF Pilot Project on Minority Student Retention,” Proc. of the Frontiers In
Education (FIE) Conference '97, Pittsburgh, Pennsylvania, Nov. 5-8, 1997
6. Khan, F. and Siddique, B., “An NSF Pilot Project on Minority Student Retention,” Proc. of the Frontiers In
Education Conference '96, Salt Lake City, Utah, November 6-9, 1996
7. Gropp, W., Lusk, E., and Skjellum, A., “Using MPI: Portable Parallel Programming with the Message-Passing
Interface,” The MIT Press, 1999
8. Bates, R. and Gregory, D., “Voice and Data Communications Handbook,” McGraw Hill, 1998
9. Efford, N., “Digital image processing: a practical introduction using Java,” Addison-Wesley publishing, 2000
10. Sonka, M., Hlavac, V. and Boyle, R., “Image processing, analysis and machine vision,” PWS publishing, 1999
11. Guajardo, Mario, “Integrated Monitoring System”, 30th Annual National Conference. Society for Advancement
of Chicanos and Native Americans in Science, Albuquerque, NM, October 2003
12. Pedraza, Julie “Beowulf-based Curriculum Enrichment Laboratory”, Annual UT-System AMP-NSF Research
Conference. University of Texas at El Paso, El Paso, Texas, October 2002.

Appendix A. Laboratory Modules

A.1 First level of projects

LM1 — Physical aspects of constructing a cluster: The students will become familiarized with
different components of a Beowulf node, its architecture, and the required switching matrix.

LM2 — Benchmarking computational machines: The students will learn to benchmark clusters
and standalone machines in order to measure the speedup factor provided by a parallel machine
architecture.
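
As an illustration of the measurement LM2 asks for, the following is a minimal Python sketch of how a speedup factor and a parallel efficiency could be computed from two wall-clock timings. The function names and the sample timings are assumptions made for this example; they are not taken from the module itself.

```python
import time


def time_call(func, *args):
    """Return (result, elapsed_seconds) for one call, using a wall-clock timer."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def speedup(t_serial, t_parallel):
    """Speedup factor: serial run time divided by parallel run time."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("timings must be positive")
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, nodes):
    """Fraction of the ideal (linear) speedup achieved on the given node count."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return speedup(t_serial, t_parallel) / nodes


if __name__ == "__main__":
    # Hypothetical timings: 120 s on one machine, 8 s on a 24-node cluster.
    print(speedup(120.0, 8.0))         # 15.0
    print(efficiency(120.0, 8.0, 24))  # 0.625
```

An efficiency well below 1.0, as in the hypothetical numbers above, would point students toward communication overhead or load imbalance as topics for discussion.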

LM3 — Scheduling & optimization: Proper scheduling algorithms will be programmed and
used by students to optimize the utilization of a parallel processing machine.
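
One simple policy that a scheduling exercise of this kind could start from is longest-processing-time-first (LPT) list scheduling: sort the tasks by decreasing duration and always give the next task to the least-loaded node. The sketch below is a well-known greedy heuristic shown for illustration only; it is not necessarily the algorithm used in LM3, and it assumes that task durations are known in advance.

```python
import heapq


def lpt_schedule(task_times, nodes):
    """Greedy LPT scheduling.

    Returns (assignment, makespan), where assignment maps each node index
    to the list of task indices it runs, and makespan is the largest
    total load placed on any node.
    """
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    # Min-heap of (current_load, node_index): the least-loaded node is on top.
    heap = [(0.0, n) for n in range(nodes)]
    heapq.heapify(heap)
    assignment = {n: [] for n in range(nodes)}
    # Visit tasks from longest to shortest.
    ordered = sorted(enumerate(task_times), key=lambda pair: pair[1], reverse=True)
    for task_id, duration in ordered:
        load, node = heapq.heappop(heap)
        assignment[node].append(task_id)
        heapq.heappush(heap, (load + duration, node))
    makespan = max(load for load, _ in heap)
    return assignment, makespan


if __name__ == "__main__":
    # Hypothetical task durations (seconds) spread over three nodes.
    plan, finish = lpt_schedule([7, 5, 4, 3, 3, 2], nodes=3)
    print(plan, finish)
```

LPT is a heuristic and does not always find the optimal schedule, which makes it a convenient baseline against which students can compare their own algorithms.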

A.2 Second level of projects

LM4 — Public Switched Network (PSN): The students will learn the realities of real-world
network traffic over simulated leased lines, for example, T1's. By setting up a local PSN and
associated data/voice channels over simulated leased lines, they will learn how the real-world
PSN and its components invariably affect network speed and integrity between two distant
locations, regardless of whether a Beowulf cluster is connected to it. Students will program digital access and
cross-connect devices (DACs) to aggregate leased lines (a trunking or bonding process) to provide
different connection bandwidths. They will learn to program routers for end-to-end
connectivity of LANs over a PSN. They will understand how a supercomputer's processing load
needs to be balanced given the slow connection between two distant locations through a PSN.
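
The effect of a slow leased line on load balancing can be estimated with simple arithmetic. The Python sketch below computes the nominal payload rate of one or more bonded T1 lines and the time needed to move a data set across them. The T1 figures (24 channels of 64 kbit/s plus 8 kbit/s of framing) are the standard nominal values; the helper names and the 25,000,000-byte example payload (a 5000 × 5000 image at 8 bits per pixel) are assumptions made for illustration, and protocol overhead is ignored.

```python
DS0_BPS = 64_000                           # one voice-grade channel
T1_CHANNELS = 24                           # DS0 channels carried by one T1
T1_PAYLOAD_BPS = T1_CHANNELS * DS0_BPS     # 1,536,000 bit/s of payload
T1_LINE_RATE_BPS = T1_PAYLOAD_BPS + 8_000  # 1,544,000 bit/s including framing


def bonded_payload_bps(lines):
    """Nominal payload rate of several T1 lines bonded together."""
    if lines < 1:
        raise ValueError("at least one line is required")
    return lines * T1_PAYLOAD_BPS


def transfer_seconds(num_bytes, lines):
    """Ideal time to move num_bytes across the bonded lines (no overhead)."""
    return num_bytes * 8 / bonded_payload_bps(lines)


if __name__ == "__main__":
    image_bytes = 5000 * 5000  # assumed 8-bit image, one byte per pixel
    print(round(transfer_seconds(image_bytes, 1), 1))  # 130.2
    print(round(transfer_seconds(image_bytes, 4), 1))  # 32.6
```

At the raw 100 Mbps rate of a Fast Ethernet link, the same payload would take about 2 seconds, so work that must cross the simulated PSN is dominated by transfer time unless the data sent is kept small.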

LM5 — Voice over IP (VoIP): Merging of telephone traffic into normal data network traffic
poses new challenges. VoIP channels will be created by students to provide voice grade channels
across the simulated PSN. These voice grade channels will be used in Beowulf applications in
order to realize that VoIP channels share network traffic and are susceptible to network
congestion. Students will also learn bonding of these VoIP channels to provide greater
bandwidth.

LM6 — Pattern recognition: The students will parallelize pattern recognition algorithms of
neural networks to speed up the pattern recognition process.

LM7 — Coding: Compression is concerned with reducing the prohibitively large amounts of data
generated by image acquisition and capture devices. Such reductions are essential in
environments of limited bandwidth or limited storage. Parallelizing such techniques
contributes to reducing the transmission time and the delay associated with it;
videoconferencing and telebrowsing are examples of applications that benefit. Audio and image compression
techniques will be used to demonstrate the application of parallel processing in coding images in
order to compress them for delivery over a slower medium such as a voice channel. To simulate
a real-world voice grade channel, Voice over IP (VoIP) equipment will be used to provide a
voice channel between the Beowulf cluster end and the other side across the simulated PSN.

LM8 — Image Processing Operations: Image processing applications range from noise
reduction, image sharpening, feature extraction, and pattern deduction to image analysis. Because of
their discrete multidimensional nature, they are good candidates for parallel processing. The need
for and benefit of parallelizing algorithms will be demonstrated with real-world image processing
techniques. Given an average image size (say 5000 × 5000 pixels), a simple smoothing operation
that removes noise contamination from the image requires over 250 million multiplications
and additions. This would take several minutes to complete on a common desktop workstation,
whereas it is expected to be much faster on a Beowulf cluster.
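
The operation count quoted in LM8 can be checked, and the row-wise decomposition sketched, as follows. The 3 × 3 kernel size is an assumption (the module description does not state one): with it, a 5000 × 5000 image needs 225 million multiplications and 200 million additions, which is consistent with the "over 250 million" figure. In this Python sketch, threads stand in for cluster nodes, so it shows how the image is split into row strips rather than a real speedup; on the cluster each strip would go to a separate node.

```python
from concurrent.futures import ThreadPoolExecutor


def kernel_op_count(height, width, k=3):
    """(multiplications, additions) for a general k x k weighted kernel."""
    pixels = height * width
    return pixels * k * k, pixels * (k * k - 1)


def smooth_rows(image, row_start, row_end):
    """3 x 3 mean filter for rows [row_start, row_end); borders are replicated."""
    h, w = len(image), len(image[0])
    out = []
    for r in range(row_start, row_end):
        row = []
        for c in range(w):
            total = 0
            for dr in (-1, 0, 1):
                rr = min(max(r + dr, 0), h - 1)      # clamp to the image
                for dc in (-1, 0, 1):
                    cc = min(max(c + dc, 0), w - 1)
                    total += image[rr][cc]
            row.append(total / 9)
        out.append(row)
    return out


def smooth_parallel(image, workers):
    """Split the image into row strips and smooth each strip in its own worker."""
    h = len(image)
    bounds = [(i * h // workers, (i + 1) * h // workers) for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        strips = pool.map(lambda b: smooth_rows(image, b[0], b[1]), bounds)
        return [row for strip in strips for row in strip]


if __name__ == "__main__":
    print(kernel_op_count(5000, 5000))  # (225000000, 200000000)
```

Each worker reads one row above and one row below its strip, which corresponds to the boundary rows that neighbouring nodes would have to exchange in a message-passing version.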

LM9 — Simulation: A model of a real-world system will be created and simulated on the
available cluster. Then random events will be generated in order to study the effect of different
kinds of events on the behavior of the system.

LM10 — Matrix operations: The cluster will be used to teach students how to program matrix
operations in order to speed up the computations based on parallel processing.
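
A matrix product is a natural first exercise because each row of the result can be computed independently. The following Python sketch, offered as one possible decomposition rather than the one used in LM10, gives each worker a contiguous block of rows of the left-hand matrix. Threads stand in for cluster nodes here, and the function names are chosen for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def matmul_rows(a, b, rows):
    """Compute the given rows of the product a x b."""
    cols_b = list(zip(*b))  # columns of b, so each entry is a dot product
    return [[sum(x * y for x, y in zip(a[i], col)) for col in cols_b] for i in rows]


def matmul_parallel(a, b, workers):
    """Row-block decomposition: each worker computes one block of result rows."""
    if any(len(row) != len(b) for row in a):
        raise ValueError("inner dimensions do not match")
    n = len(a)
    blocks = [range(i * n // workers, (i + 1) * n // workers) for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(lambda rows: matmul_rows(a, b, rows), blocks))
    return [row for part in parts for row in part]


if __name__ == "__main__":
    a = [[1, 2], [3, 4], [5, 6]]
    b = [[7, 8], [9, 10]]
    print(matmul_parallel(a, b, workers=2))  # [[25, 28], [57, 64], [89, 100]]
```

In a message-passing version, every node would need a full copy of the right-hand matrix but only its own block of rows from the left-hand one, which makes the communication cost easy for students to reason about.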

LM11 — Concurrent & Parallel Processing: The students will be taught how to parallelize a
computational problem. The difference between concurrent and parallel processing will be made
clear by executing the developed parallel algorithms on one machine and then on a cluster.
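
The decomposition step that LM11 teaches can be illustrated with a numerical integration whose intervals are dealt out to workers in round-robin fashion and whose partial sums are then combined. The Python sketch below is an assumption-laden stand-in: the worker index plays the role of a process rank, threads replace cluster nodes, and the final `sum` replaces a reduction across nodes. Run on one machine it demonstrates concurrency; the same decomposition run on separate nodes would be parallel.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_pi(rank, size, intervals):
    """One worker's share of the midpoint-rule integral of 4 / (1 + x^2) on [0, 1]."""
    h = 1.0 / intervals
    total = 0.0
    # Cyclic distribution: worker `rank` takes intervals rank, rank + size, ...
    for i in range(rank, intervals, size):
        x = h * (i + 0.5)
        total += 4.0 / (1.0 + x * x)
    return h * total


def estimate_pi(intervals, workers):
    """Combine the partial sums of all workers (the reduction step)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda r: partial_pi(r, workers, intervals), range(workers))
        return sum(parts)


if __name__ == "__main__":
    print(estimate_pi(10_000, workers=1))
    print(estimate_pi(10_000, workers=4))
```

Both calls should agree to within floating-point rounding, which gives students a simple correctness check before they compare run times on one machine and on the cluster.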

LM14 — Distributed Databases: Parallelizing search algorithms on distributed databases will be
demonstrated. This will include record-locking and read-sharing of a distributed database.

Appendix B. Course Projects

CP1 — Implementing a Multithreaded Web Server Using Java: This project is aimed at
understanding the basic nature of the Internet's five protocol layers (the Application Layer, the
Transport Layer, the Network Layer, the Link Layer, and the Physical Layer) and at allowing the
student to create a complete Web server for the application layer using Java as the basic software
development tool. The CNL facility is being used to provide an integrated computing and
network testing environment. The important concepts of multithreading and distributed
computing are reinforced through the creation of a realistic web server that is able to handle
multiple requests at a time.


CP2 — Studying the Effects of Congestion Control on Multimedia Application: This project is
aimed at integrating knowledge from Image Processing, Computer Networking, and Software
Engineering to study the effects of congestion control created by limited bandwidth and the use
of unreliable transport protocols such as UDP (User Datagram Protocol). The student will be
responsible for setting up the video conferencing equipment purchased by CNL and conducting a
systematic, quantitative study of congestion control, image quality, delay, and jitter. The effect of
image compression techniques on reducing congestion will also be
investigated.
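
For the quantitative part of such a study, delay and jitter have to be computed from packet timestamps. The Python sketch below shows one common way to do so: mean one-way delay, and the running interarrival jitter estimate defined for RTP in RFC 3550. This is offered as an assumption about how the measurement could be made, not as the method used in the project; the function names and the sample timestamps are hypothetical, and synchronized sender and receiver clocks are assumed for the delay figure.

```python
def mean_delay(send_times, recv_times):
    """Average one-way delay; assumes sender and receiver clocks are synchronized."""
    if len(send_times) != len(recv_times) or not send_times:
        raise ValueError("need two non-empty lists of equal length")
    return sum(r - s for s, r in zip(send_times, recv_times)) / len(send_times)


def interarrival_jitter(send_times, recv_times):
    """Running jitter estimate in the style of RFC 3550.

    For consecutive packets, d is the change in transit time; the estimate
    moves one sixteenth of the way toward |d| for every packet received.
    """
    if len(send_times) != len(recv_times):
        raise ValueError("timestamp lists must have equal length")
    jitter = 0.0
    for i in range(1, len(send_times)):
        d = (recv_times[i] - recv_times[i - 1]) - (send_times[i] - send_times[i - 1])
        jitter += (abs(d) - jitter) / 16.0
    return jitter


if __name__ == "__main__":
    # Hypothetical timestamps in milliseconds: packets sent every 20 ms.
    sent = [0, 20, 40, 60]
    received = [5, 25, 52, 66]
    print(mean_delay(sent, received), interarrival_jitter(sent, received))
```

Because the jitter estimate depends only on differences between transit times, it remains meaningful even when the two clocks are offset by a constant amount.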

CP3 — Software/Hardware Simulation of a Multi-functioned BCD-based Calculator: This
project is aimed at integrating knowledge from Digital Logic Design and software tools by
creating, both in software and hardware, a multi-functioned BCD-based calculator. The student
will specify the functionality of the calculator and verify this functionality in software, after which a
hardware implementation using a digital logic kit will be attempted.

CP4 — Creating an Interactive Digital Image Viewer: The goal of this project will be to create
an interactive image viewer which will allow the student to capture an image, choose an image
processing operation and then apply that operation. The 2048x2048 8-bit Cohu camera and the Integral
high-resolution video capture card will be used for this project.

CP5 — Creating an HTTP based Digital Image Processing Server: The goal of this project is to
create an image processing server. This dedicated image processing server will have the ability
to accept an image from a client side and perform the operation specified by the client and send
the result wrapped in a web page.

CP6 — Integrated Monitoring System. Critical service equipment, such as network equipment,
needs to be monitored for continued successful operation. However, because it is spread over
several sites, the best way is to monitor it remotely. This project recommends devices and their
mode of connectivity for sending signals back and forth between the network center and the communication
closets in order to monitor and control different events, such as entry into a communication closet,
high temperature or humidity, and power failure. This information is comprehensively displayed at
the designated console in the network center. The operator in the network center has to decide whether
the problem can be taken care of remotely or whether a visit to the problematic communication closet is
necessary.

CP7 — Parallel Simulation of Electromagnetic Wave Propagation: The current project intends
to implement, based on a consolidated frequency domain technique, a novel electromagnetic
structure simulator with a highly interactive user interface. The software developed is based on
the current Beowulf technology provided by CNL. Our goal is to develop a tool that can be used
for the simulation and analysis of virtually any electromagnetic transmission system.

CP8 — Optimization Based on Genetic Algorithms: The goal of this research project is to
develop a general library for connective methods that can be used as the underlying platform to
test diverse approaches for optimization using genetic algorithms. The main role of students in
the project will be to code optimization routines, under specification, that follow traditional
techniques, while researching on the parallel Beowulf architecture new techniques that may
be useful as alternatives to previously proposed work.
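
To make the starting point of CP8 concrete, the following Python sketch shows a basic generational genetic algorithm over bit strings, with tournament selection, one-point crossover, bit-flip mutation, and elitism. It is a textbook-style illustration under assumed parameter values, not the library developed in the project; all names and defaults are chosen for this example.

```python
import random


def evolve(fitness, genome_len, pop_size=30, generations=40,
           mutation_rate=0.02, seed=0):
    """Maximize `fitness` over bit lists; returns (best_genome, best_fitness_history)."""
    if genome_len < 2:
        raise ValueError("genome_len must be at least 2 for one-point crossover")
    rng = random.Random(seed)  # seeded so that runs are repeatable
    pop = [[rng.randint(0, 1) for _ in range(genome_len)] for _ in range(pop_size)]

    def pick():
        # Tournament selection between two distinct population slots.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    best = max(pop, key=fitness)
    history = [fitness(best)]
    for _ in range(generations):
        children = [best[:]]  # elitism: the best genome always survives
        while len(children) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.randrange(1, genome_len)          # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]                   # bit-flip mutation
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
        history.append(fitness(best))
    return best, history


if __name__ == "__main__":
    # Toy objective: maximize the number of ones in a 20-bit string.
    genome, trace = evolve(sum, genome_len=20)
    print(sum(genome), trace[0], trace[-1])
```

Fitness evaluations within one generation are independent of each other, so they are the obvious part to distribute across cluster nodes when the objective function is expensive.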
