INTERNET & DISTRIBUTED SYSTEMS
Jocelyn O. Padallan
Arcler Press
www.arclerpress.com
Internet & Distributed Systems
Jocelyn O. Padallan
Arcler Press
2010 Winston Park Drive,
2nd Floor
Oakville, ON L6H 5R7
Canada
www.arclerpress.com
Tel: 001-289-291-7705
001-905-616-2116
Fax: 001-289-291-7601
Email: orders@arclereducation.com
This book contains information obtained from highly regarded resources. Reprinted material sources are indicated, and copyright remains with the original owners. Copyright for images and other graphics remains with the original owners as indicated. A wide variety of references are listed. Reasonable efforts have been made to publish reliable data. The authors, editors, and publisher are not responsible for the accuracy of the information in the published chapters or the consequences of their use. The publisher assumes no responsibility for any damage or grievance to persons or property arising out of the use of any materials, instructions, methods, or thoughts in the book. The authors, editors, and publisher have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission has not been obtained. If any copyright holder has not been acknowledged, please write to us so we may rectify the omission.
Notice: Registered trademarks of products or corporate names are used only for explanation and identification, without intent of infringement.
Arcler Press publishes a wide variety of books and eBooks. For more information about Arcler Press and its products, visit our website at www.arclerpress.com
List of Figures.................................................................................................xi
List of Tables.................................................................................................xiii
Preface..........................................................................................................xv
Chapter 4 Coordination............................................................................................ 75
4.1 Distributed Mutual Exclusion.............................................................. 76
4.2 Multicast Communication................................................................... 85
4.3 Group Communication....................................................................... 86
4.4 Failure................................................................................................ 87
4.5 Elections............................................................................................. 91
References................................................................................................ 94
Chapter 7 Security and Privacy............................................................................... 137
7.1 Concepts of Security......................................................................... 138
7.2 Kerberos........................................................................................... 144
7.3 SSL Security...................................................................................... 145
7.4 Firewall............................................................................................ 145
7.5 Authorization.................................................................................... 146
7.6 Auditing............................................................................................ 148
7.7 Digital Signatures.............................................................................. 148
7.8 Cryptography.................................................................................... 152
References.............................................................................................. 161
10.5 Middleware.................................................................................... 223
10.6 Challenges and Constraints............................................................. 224
10.7 Applications of Wireless Sensor Networks....................................... 227
References.............................................................................................. 229
Bibliography........................................................................................... 231
Index...................................................................................................... 235
introduction and is kept at a judicious length. Relevant images are inserted with sources for future reference, and an intensive effort has been made to make the whole content comprehensive, self-explanatory, and engaging for its readers.
CHAPTER 1
INTRODUCTION TO INTERNET
“The Internet came out of our frustration that there were only a limited
number of large, powerful research computers in the country, and that many
research investigators who should have access to them were geographically
separated from them.”
-Leiner et al. 2003
CONTENTS
1.1 History of The Internet............................................................................................... 2
1.2 Local Area Network................................................................................................. 12
1.3 Wide Area Network................................................................................................. 13
1.4 Wireless Local Area Network................................................................................... 14
1.5 Metropolitan Area Network..................................................................................... 15
1.6 Connecting to The Internet....................................................................................... 15
1.7 Web Services........................................................................................................... 17
References..................................................................................................................... 24
The public Internet is a global computer network, that is, a network that interconnects other networks throughout the world via devices. Most of these devices are conventional desktop PCs, Unix-based workstations, and servers that store and transmit data such as Web (WWW) pages and e-mail messages. Increasingly, non-traditional computing devices such as Web TVs, mobile computers, pagers, and even toasters are being connected to the Internet. The Internet provides a capability so powerful and universal that it can be used for almost any purpose that depends on information, and it is accessible to every individual who connects to one of its constituent networks. This chapter discusses the Internet at large, modern means of communication such as e-mail, and the diverse protocols that drive the Internet, and it also highlights the ways by which one can connect to the Internet.
By this time, the ARPANET had about 200 routers. In 1984, the National Science Foundation (NSF) funded a TCP/IP-based backbone network. This backbone grew into the NSFNET, which became the successor of the ARPANET. Researchers and academics in other fields began to make use of the network, and the NSF, which had created a similar and parallel network, took over much of the TCP/IP technology from ARPANET and established a distributed network of networks capable of handling far greater traffic. In 1985, NSF initiated a program to establish Internet access across the United States, creating the NSFNET backbone and giving access to all educational facilities, academic researchers, government agencies, and international research organizations. In 1995, NSF stopped funding NSFNET, and the Internet went completely commercial. By the 1990s, the Internet had experienced explosive growth; the number of computers connected to it was estimated to be doubling every year. Business owners realized that they could use the Internet to upgrade and enhance services to their customers, so they started spending huge amounts of money to develop it. This spawned fierce competition among the communications carriers and the hardware and software suppliers to meet the demand. The result is that bandwidth (the information-carrying capacity of communications lines) on the Internet has increased immensely while costs have dropped. It is widely believed that the Internet has played a momentous role in this economic success.
world and links easily to other pieces of information. This enables the user
to find vital information swiftly. It is an Internet resource where one can get
information about diverse topics and content.
1.1.2 E-mail
E-mail has become an essential form of communication. Modern e-mail
operates across the Internet, but the use of e-mail actually precedes
ARPANET. Some of the earlier e-mail systems date back to the mid-1960s,
including the Automatic Digital Network of the U.S. Department of Defense
and the Compatible Time-Sharing System by MIT. As ARPANET grew
significantly, regulations were developed for e-mail communication. The
first ARPANET e-mail was sent in 1971, using the @ symbol to separate
the name of the user from the user’s machine. The most commonly used
e-mail standard, called Simple Mail Transfer Protocol (SMTP) was
developed in 1982. Standardization led to a significant increase in the
popularity of e-mail. In fact, e-mail was the first killer app of ARPANET
and the Internet. A killer application is a computer program that is so useful
and popular that it proves the core value of some larger technology. In this
case, e-mail really established the expediency of the Internet. These early
systems worked on a restricted number of terminals across relatively small
networks. The systems were also not compatible with each other. E-mail
allows the subscriber to stay in touch with people. It actually takes just less
than a minute to send an e-mail. You may get your e-mail account through
the same people who provide your internet service, or you may sign up to
one of the committed e-mail providers, such as Gmail, Outlook, or Yahoo!
One difference between using e-mail software and using webmail is that
you usually have to log in to a webmail account each time you use it. You
will need your e-mail address and a password to do this. Webmail is free
but is paid for by advertising, so you may also have to get used to small
adverts on screen when you use it. E-mail lets you type and send a message
to anyone at any location in the world provided that the receiver has an
Internet connection and an e-mail account. The message is delivered within
seconds once it is sent. The way e-mail works on the Internet is similar to
the way the postal system works although the e-mail is faster. The address
is the core part of both standard letters and e-mail messages. If the address
is incorrect, the supposed receiver will never see the message. Clicking the
send button on your computer is similar to placing your letter in a mailbox.
Once you have sent your message on its way, you cannot withdraw it. The
postal service collects mail and takes it to the local post office for processing
and routing. When you send e-mail, your mail server, the electronic counterpart of the local post office, reads the e-mail address and either delivers the message to another local mail account or sends it out to the Internet. As your letter travels from one post office to another, the postal service routes it based on the address, moving it progressively closer to
its destination. Your e-mail message passes through a number of computers,
known as routers, on its way to the recipient. A router is a device that acts
as a gateway between networks. It basically routes data by matching the
destination address in the data to network topology information and routing
tables. Often, a router connects a local area network (LAN) to the Internet.
Postal mail goes from the local post office to the recipient’s mailbox. The
e-mail recipient’s local post office system receives the message. Both postal
letters and e-mail messages remain in the mailbox until the recipient checks
for new mail. A major difference between electronic and postal mail is that
e-mail is significantly faster.
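As a concrete sketch of this process, the snippet below composes a simple message with Python's standard library; the addresses and mail server shown are hypothetical placeholders, not real accounts.

```python
from email.message import EmailMessage

# A minimal sketch of composing an e-mail message. The addresses below
# are made-up examples; the @ symbol separates the user's name from the
# user's machine, just as in the first 1971 ARPANET e-mail.
msg = EmailMessage()
msg["From"] = "alice@example.org"
msg["To"] = "bob@example.net"
msg["Subject"] = "Hello"
msg.set_content("Greetings from the Internet.")

# Delivery would then use SMTP, the 1982 standard mentioned above, e.g.:
#   import smtplib
#   with smtplib.SMTP("mail.example.org") as server:
#       server.send_message(msg)
```

Dropping the message into the SMTP server plays the role of placing a letter in the mailbox: once sent, it cannot be withdrawn.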
FTP is a specific Internet service that either places files on, or retrieves files from, a remote computer. It is possible to log on to a computer without establishing an account and to retrieve information from its public file area; this was one of the reasons for allowing unidentified log-ons to such sites, and the term anonymous FTP was used to describe them. FTP uses both a TCP control connection and a TCP data connection. The control connection is used for opening and closing an FTP session and for sending commands from client to server. The data connection is used for transferring individual files between client and server; every file transfer uses a separate data connection. Any two computers on the same network (and that can mean the Internet) can transfer files using FTP. An application, commonly also called "ftp," implements the protocol (FTP) to transfer files. The steps involved are as follows:
i. A person runs an FTP client application on one computer.
ii. Another computer runs an FTP server program, such as an "ftpd" (FTP daemon) on a UNIX/Linux computer.
iii. It could also be an FTP service on other platforms.
The function of the FTP protocol is to allow independence between client and server machine file systems, enable the exchange of files between remote computers, and enable the effective transfer of data. Essentially, it defines the method by which data is transferred over a TCP/IP network. You can depend on FTP to deliver files efficiently. Most of the time, you do not have to worry whether your operating system (Windows, macOS, Linux) is the same as that of the FTP server; to the computers, the TCP/IP protocol is the only thing that matters. Sometimes, however, you might have to download FTP server software to make it work. Finally, FTP transfers are not necessarily secure, so be sure to explore the security features of an FTP service before jumping right in. FTP follows a client–server model, that is, one machine sends requests (the client) and the other awaits requests to carry out actions (the server). During an FTP connection, two transmission channels are open: one channel is open for data, while the other, the control channel, is open for commands. FTP commands make it possible to specify the port used, the data structure, the method of data transfer, and the nature of the action to be conducted. There are three different types of FTP commands, namely access control commands, transfer parameter commands, and FTP service commands. Traditionally, the two channels use port 21 for the command port
and port 20 for the data port. The confusion begins, however, when we find that, depending on the mode, the data port is not always port 20. There are two modes of FTP:
i. Active FTP: In active mode FTP, the client connects from a random unprivileged port (N > 1024) to the FTP server's command port, port 21. Then, the client starts listening on port N+1 and sends the FTP command PORT N+1 to the FTP server. The server will then connect back to the client's specified data port from its local data port, which is port 20. From the server-side firewall's standpoint, to support active mode FTP the following communication channels need to be opened:
  - FTP server's port 21 from anywhere (client initiates connection);
  - FTP server's port 21 to ports > 1024 (server responds to client's control port);
  - FTP server's port 20 to ports > 1024 (server initiates data connection to client's data port);
  - FTP server's port 20 from ports > 1024 (client sends ACKs to server's data port).
The main problem with active mode FTP actually falls on the client side. The FTP client does not create the actual connection to the data port of the server; it simply tells the server what port it is listening on, and the server connects back to the specified port on the client. From the client-side firewall's perspective, this appears to be an outside system initiating a connection to an internal client, which is usually blocked.
testbox1: {/home/p-t/slacker/public_html} % ftp -d testbox2
Connected to testbox2.slacksite.com.
220 testbox2.slacksite.com FTP server ready.
Name (testbox2:slacker): slacker
---> USER slacker
331 Password required for slacker.
Password: TmpPass
---> PASS XXXX
230 User slacker logged in.
---> SYST
215 UNIX Type: L8
Remote system type is UNIX.
Using binary mode to transfer files.
ftp> ls
ftp: setsockopt (ignored): Permission denied.
ii. Passive FTP: To resolve the problem of the server initiating the connection to the client, a different method for FTP connections was developed. This came to be known as passive mode, or PASV, after the command used by the client to tell the server it is in passive mode. In passive mode FTP, the client initiates both connections to the server, solving the problem of firewalls filtering the inbound data port connection from the server to the client. When opening an FTP connection, the client opens two random ports locally (N > 1024 and N+1). The first port contacts the server on port 21, but instead of then issuing a PORT command and allowing the server to connect back to its data port, the client issues the PASV command. The result is that the server then opens a random unprivileged port (P > 1024) and sends the PORT P command back to the client. The client then initiates the connection from port N+1 to port P on the server to transfer data. From the server-side firewall's standpoint, to support passive mode FTP the following communication channels need to be opened:
  - FTP server's port 21 from anywhere (client initiates connection);
  - FTP server's port 21 to ports > 1024 (server responds to client's control port);
  - FTP server's ports > 1024 from anywhere (client initiates data connection to random port specified by server);
  - FTP server's ports > 1024 to remote ports > 1024 (server sends ACKs, and data, to client's data port).
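A passive-mode transfer can be sketched with Python's standard ftplib, which uses PASV by default. The host name below is a placeholder, not a real server.

```python
from ftplib import FTP

# A sketch only: "ftp.example.org" is a hypothetical anonymous FTP
# server. ftplib uses passive mode by default, so the client initiates
# both the control connection and the data connection, as described
# above.
def list_remote_directory(host="ftp.example.org"):
    ftp = FTP()
    ftp.connect(host, 21)    # control connection to the command port
    ftp.login()              # anonymous login
    ftp.set_pasv(True)       # explicit, though passive is the default
    names = ftp.nlst()       # PASV + a fresh data connection carry this
    ftp.quit()
    return names
```

Because the client opens the data connection outward, this works through typical client-side firewalls that would block active mode.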
1.1.5 Telnet
Telnet (TELecommunication NETwork) is a network protocol used on the Internet or on LAN connections to provide a bidirectional, interactive communications facility. Telnet can connect to a remote machine that is on a network and listening on a port.
of years for use with text-based conferencing. The IRC protocol has been implemented on systems using the TCP/IP network protocol, although there is no requirement that this remain the only environment in which it operates. IRC itself is a teleconferencing system which, through the use of the client–server model, is well suited to running on many machines in a distributed manner. It has several benefits: it does not need a multimedia PC, and unlike e-mail, the communication is instantaneous. Each party sees what he or she has written on the PC screen as well as what the other person replies, and several people can join one session. To establish a dialogue through this system, one must take note of which channel, or chat room, it will take place in. A channel is actually a space on an Internet computer called a server. On the network, there are thousands of chat channels in which people participate in conversations on topics that vary from technology to cinema, music, religion, and many others. A typical setup involves a distinct process (the server) forming a central point for clients (or other servers) to connect to, performing the required message delivery and multiplexing, and other functions. The protocol described here is used for both kinds of connection, although there are more restrictions on client connections (which are expected to be unreliable) than on server connections. No definite character set is stipulated. The protocol is based on a set of codes of eight (8) bits each, making up an octet. Every message comprises any number of these octets; however, some octet values are used for control codes which function as message delimiters. Despite being an 8-bit protocol, the delimiters and keywords are chosen so that the protocol is mostly usable from a US-ASCII terminal over a telnet connection.
Mbps to 100 Mbps, have low delay (microseconds or nanoseconds), and make very few errors. Examples of LANs are Ethernet and Wi-Fi.
use for Internet access. Configuring the tethering software and the computer
to use your smartphone for Internet service differs depending on which
tethering software you use, as well as the type of phone you have. Check
online for specific instructions for the tethering software and smartphone
you are using. However, not all smartphones are capable of tethering, and
not all cell phone carriers allow smartphones on their network to be used for
tethering. Cell phone carriers that allow tethering may not provide support
for it. These carriers may also charge an additional fee for tethering a smartphone.
system, Web services do not provide the user with a GUI. Web services
instead share business logic, data, and processes through a programmatic
interface across a network. The applications interface, not the users.
Developers can then add the Web service to a GUI (such as a Web page
or an executable program) to offer specific functionality to users. Web
services allow different applications from different sources to communicate
with each other without time-consuming custom coding, and because all
communication is in XML, web services are not tied to any one operating
system or programming language. For example, Java can talk with Perl,
Windows applications can talk with UNIX applications. There are two basic types of web services, namely SOAP and Representational State Transfer (REST).
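Since the text notes that web-service communication is carried in XML, a small illustration helps: the snippet below parses a SOAP-style response with the standard library. The envelope shown is a made-up example; real services define their own element names and schemas.

```python
import xml.etree.ElementTree as ET

# A hypothetical SOAP-style envelope, purely for illustration.
SOAP_RESPONSE = """\
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <GetPriceResponse>
      <Price>34.5</Price>
    </GetPriceResponse>
  </soap:Body>
</soap:Envelope>"""

root = ET.fromstring(SOAP_RESPONSE)
# The payload travels as plain XML text, which is why any language
# with an XML parser can consume it.
price = root.find(".//Price").text
```

Because both sides exchange only XML text over HTTP, a Java producer and a Perl consumer (or any other pairing) can interoperate without custom glue code.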
1.7.3 Ports
Computer ports are vital features of all computing devices. They provide the input and output interfaces the device needs to communicate with peripherals and computer networks. The most important ports on computers are used for computer networking. The transport layer's task is to provide a network-independent message transport service between pairs of network ports. Ports are software-defined destination points at a host computer. They are attached to processes, enabling data transmission to be addressed to a specific process at a destination node. A computer port is a type of electronic, software-, or programming-related docking point through which information flows from a program on your computer, or to your computer from the Internet or another computer in a network. (A network, by the way, is a series of computers that are physically or electronically linked.) In computer terms, a computer or a program connects to somewhere or something else on the Internet through a port. Port numbers and the user's IP address combine into the "who does what" information kept by every ISP. Ports are numbered for consistency and programming. The most commonly used and best-known ports are those numbered 0 to 1023, dedicated for Internet use, but port numbers extend far higher for specialized purposes. Each port set or range is assigned specialized jobs or functions, and that is generally all they do. Usually, all identical system services or functions use the same port numbers on the receiving servers. For example, all computers accessing or requesting Quote of the Day will always use port 17, because that port is officially reserved for that purpose, and only requests for that service
[Table omitted: port labels and descriptions. Source: https://dayaramb.files.wordpress.com/2011/03/computer-network-notes-pu.pdf]
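The idea that ports are software-defined endpoints, with numbers above 1023 available for general use, can be demonstrated directly: asking the operating system to bind a socket to port 0 makes it hand back any free ephemeral port.

```python
import socket

# Binding to port 0 asks the OS for any free ephemeral port,
# illustrating that a port is a software-defined endpoint attached
# to a process, not a physical connector.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.bind(("127.0.0.1", 0))
port = sock.getsockname()[1]   # the OS-assigned port number
sock.close()
```

The number returned falls outside the well-known 0-1023 range, which stays reserved for registered services such as Quote of the Day on port 17.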
an FTP server, and telnet for a session to access remote computers.
Parts of a URL, using http://host.company.com:80/a/b/c.html?user=Alice&year=2008#p2 as an example:
  - Scheme (http:): identifies the protocol used to fetch the content.
  - Host name (//host.company.com): the name of a machine to connect to.
  - Server's port number (80): allows multiple servers to run on the same machine.
  - Hierarchical portion (/a/b/c.html): used by the server to find the content.
  - Query parameters (?user=Alice&year=2008): provide additional parameters.
While all website URLs begin with "http," several other prefixes exist. Below is a list of various URL prefixes:
  - http: a webpage, website directory, or other file available over HTTP;
  - ftp: a file or directory of files available to download from an FTP server;
  - news: a discussion located within a specific newsgroup;
  - telnet: a Unix-based computer system that supports remote client connections;
  - gopher: a document or menu located on a gopher server;
  - wais: a document or search results from a WAIS database;
  - mailto: an e-mail address (often used to redirect browsers to an e-mail client);
  - file: a file located on a local storage device (though not technically a URL because it does not refer to an Internet-based location).
You can manually enter a URL by typing it in the address bar of your web browser. For example, you might enter a website URL printed on a business card to visit the company's website. Most URLs, however, appear automatically when you click on a link or open a bookmark. If the server name in the URL is not valid, your browser may display a "Server not found" error. If the path in the URL is incorrect, the server may respond with a 404 error.
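The decomposition of a URL described above can be reproduced with Python's standard urllib.parse, using the same example URL:

```python
from urllib.parse import urlparse, parse_qs

url = "http://host.company.com:80/a/b/c.html?user=Alice&year=2008#p2"
parts = urlparse(url)
# parts.scheme   -> "http"              (the protocol)
# parts.hostname -> "host.company.com"  (the machine to connect to)
# parts.port     -> 80                  (the server's port number)
# parts.path     -> "/a/b/c.html"       (the hierarchical portion)
# parts.fragment -> "p2"                (a location within the page)
query = parse_qs(parts.query)  # {"user": ["Alice"], "year": ["2008"]}
```

Each field maps one-to-one onto the parts of the URL listed in the text, so the parser is a convenient way to check how a browser will interpret an address.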
REFERENCES
1. Berners-Lee, T. (1989). Information management: A proposal. CERN. [The first document describing the World Wide Web]. Retrieved January 10, 2018.
2. Birman, K. (2005). Reliable distributed systems: Technologies, web services and applications. Springer-Verlag. Retrieved January 15, 2018, from http://net.pku.edu.cn/~course/cs501/2011/resource/2005-Book Reliable%20Distributed%20Systems.pdf
3. Coulouris, G., Dollimore, J., & Kindberg, T. (2005). Distributed systems: Concepts and design (4th ed.). Addison Wesley. Retrieved January 10, 2018, from http://www.gecg.in/papers/ds5thedn.pdf
4. Kozierok, C. (2005). The TCP/IP guide: A comprehensive, illustrated internet protocols reference. No Starch Press. Retrieved January 30, 2018, from http://index-of.es/Magazines/hakin9/books/No.Starch.TCP.IP.Guide.Oct.2005.pdf
5. Verissimo, P., & Rodrigues, L. (2012). Distributed systems for system architects (Vol. 1). New York: Springer Science & Business Media.
CHAPTER 2
FUNDAMENTAL CONCEPTS OF
DISTRIBUTED SYSTEMS
“The result of these technologies is that it is now not only feasible, but easy,
to put together computing systems composed of large numbers of computers
connected by a high-speed network.”
-Andrew Tanenbaum
CONTENTS
2.1 Distributed Systems................................................................................................. 26
2.2 Architecture Models................................................................................................ 28
2.3 Internet Architecture................................................................................................ 32
2.4 Ethernet Design....................................................................................................... 36
2.5 Open Systems Interconnection (ISO/OSI Reference Model) ..................................... 39
2.6 Middleware............................................................................................................. 43
2.7 The TCP/IP Reference Model.................................................................................... 45
2.8 Interprocess Communication................................................................................... 48
References..................................................................................................................... 51
i. Thin-client model
In a thin-client model, all of the application processing and data management is carried out on the server. The client is simply responsible for running the presentation software. A thin client is software or hardware that supports a window-based user interface local to the user while executing application programs on a remote computer. It is normally used when legacy systems are migrated to client–server architectures: the legacy system acts as a server in its own right, with a graphical interface implemented on the client. A major disadvantage is that it places a heavy processing load on both the server and the network. Thin clients are devices or programs that have very limited functionality and depend heavily on their server counterparts. A thin client can be a user workstation that contains a minimal operating system and little or no data storage; examples include Sun Rays, VNC clients, and Microsoft Remote Desktop. Essentially, it can also be a program: the client mainly provides a user interface, while the bulk of the processing occurs on the server, as in the OnLive gaming client (about 10 MB in size).
Source: https://www.cnet.com/products/dell-xps-one-27-all-in-one-desktop/review/
so the clocks on different computers do not necessarily give the same time as one another. Message communication can be affected by delays, can suffer from a variety of failures, and is vulnerable to security attacks. All system models have some common fundamental properties. There are three fundamental models, namely interaction models, failure models, and security models.
i. Interaction: Processes communicate with messages and coordinate through synchronization and ordering of activities. Message delays are often of considerable duration, and coordination between processes is limited by the lack of a global clock. The interaction model deals with performance and with the difficulty of setting time limits.
ii. Failure: Correct operation is threatened whenever a fault occurs in any of the computers or in the network. We should define the types of faults to be tolerated for the system to continue to run correctly. The failure model attempts to give a precise specification of the faults that can be exhibited by processes and communication channels.
iii. Security: The modular nature of distributed systems and their openness expose them to attack by both external and internal agents. The security model defines and classifies the forms an attack may take, providing a basis for the analysis of threats to a system and for the design of systems that are able to resist them. It discusses the possible threats to processes and communication channels, and it introduces the concept of a secure channel, which is secure against those threats.
Figure 2.3: Encapsulation of data in the TCP/IP protocol stack. Source: https://www.technologyuk.net/telecommunications/internet/tcp-ip-stack.shtml
or box and define how big it is, but a router cannot. Therefore, additional
information is required at the IP layer, in addition to the source and destination
IP addresses. The figure below is a logical representation of the
information that is used at the IP layer to enable the delivery of electronic
data. This information is called a header and is analogous to the addressing
information on an envelope. A header contains the information required to
route data on the Internet and has the same format regardless of the type of
data being sent. This is the same for an envelope where the address format
is the same irrespective of the type of letter being sent.
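The fixed-format IP header described above can be sketched in code. The builder below packs the 20-byte IPv4 header layout; the addresses are illustrative, and the checksum is left at zero as a simplification (a real stack computes it).

```python
import struct
import socket

# A sketch of the fixed 20-byte IPv4 header; addresses are examples
# and the checksum field is left at zero for simplicity.
def build_ipv4_header(src, dst, payload_len, proto=6, ttl=64):
    ver_ihl = (4 << 4) | 5            # version 4, header length 5 * 4 bytes
    total_len = 20 + payload_len      # header plus payload
    return struct.pack(
        "!BBHHHBBH4s4s",
        ver_ihl,                      # version / IHL
        0,                            # type of service
        total_len,                    # total datagram length
        0,                            # identification
        0,                            # flags / fragment offset
        ttl,                          # time to live
        proto,                        # 6 = TCP
        0,                            # header checksum (omitted here)
        socket.inet_aton(src),        # source address
        socket.inet_aton(dst),        # destination address
    )
```

Whatever the payload is, the header has this same shape, just as an envelope's address format does not depend on the letter inside.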
2.3.5 Addressing
As with any other network-layer protocol, the IP addressing scheme is
integral to the process of routing IP datagrams through an internetwork. Each
IP address has specific components and follows a basic format. These IP
addresses can be subdivided and used to create addresses for subnetworks, as
discussed in more detail later in this chapter. Each host on a TCP/IP network
is assigned a unique 32-bit logical address that is divided into two main parts:
the network number and the host number. The network number identifies a
network and must be assigned by the Internet Network Information Center
(InterNIC) if the network is to be part of the Internet. An ISP can obtain blocks
of network addresses from the InterNIC and can itself assign address space
as necessary. The host number identifies a host on a network and is assigned
by the local network administrator. Internet network layer addresses have a
fixed size. Depending on the protocol version (IPv4 or IPv6), these addresses
are either 4 or 16 bytes long. (i) Four-byte IPv4 addresses are typically
written as four decimal numbers separated by dots, where every decimal
number represents one byte (dotted quad notation). A typical example is
the IPv4 address 212.201.48.1. (ii) Sixteen-byte IPv6 addresses are typically
written as a sequence of hexadecimal numbers separated by colons (:),
where every hexadecimal number represents two bytes. Leading zeros within
a group can be omitted, and a single run of consecutive zero groups can be
replaced by two consecutive colons (::). For example, the IPv6 address
1080:0:0:0:8:800:200C:417A can be written somewhat shorter as
1080::8:800:200C:417A. IPv6 addresses that contain IPv4 addresses can be
written by using the dotted quad notation for the IPv4 address portion. For
example, the IPv6 address 0:0:0:0:0:0:0D01:4403 can be written as
::0D01:4403 as well as ::13.1.68.3.
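Both notations can be checked with the standard java.net.InetAddress class, which parses a textual address into its 4- or 16-byte binary form. A small sketch, using the example addresses from the text (no DNS lookup is performed for numeric literals):

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class AddressNotationDemo {
    public static void main(String[] args) throws UnknownHostException {
        // IPv4 dotted quad notation: four decimal numbers, one per byte
        InetAddress v4 = InetAddress.getByName("212.201.48.1");
        System.out.println(v4.getHostAddress() + " uses "
                + v4.getAddress().length + " bytes");

        // IPv6 compressed notation: "::" stands for the run of zero groups
        InetAddress v6 = InetAddress.getByName("1080::8:800:200C:417A");
        System.out.println(v6.getHostAddress() + " uses "
                + v6.getAddress().length + " bytes");
    }
}
```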
This contrasts with static routing, where routers are established by the
network administrator and do not change until the network administrator
changes them. An IP routing table, which consists of destination address/
next hop pairs, is used to facilitate dynamic routing. An entry in this table,
for example, would be interpreted as follows: to get to network 172.31.0.0,
send the packet out Ethernet interface 0 (E0). IP routing specifies that IP
datagrams travel through internetworks one hop at a time. The entire route
is not known at the onset of the journey, however. Instead, at each stop, the
next destination is calculated by matching the destination address within
the datagram with an entry in the current node’s routing table. Each node’s
involvement in the routing process is limited to forwarding packets based
on internal information. The nodes do not check whether the packets get to
their ultimate destination, nor does IP provide for error reporting back to the
source when routing anomalies occur.
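The destination address/next hop lookup described above can be sketched as follows. The entries and the string-prefix matching are deliberate simplifications of real CIDR longest-prefix matching, and all names are illustrative:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RoutingTableDemo {
    // destination prefix -> next hop / outgoing interface (illustrative entries)
    static final Map<String, String> TABLE = new LinkedHashMap<>();
    static {
        TABLE.put("172.31.", "E0");     // to reach 172.31.0.0, send out Ethernet 0
        TABLE.put("10.", "E1");
        TABLE.put("", "192.168.0.1");   // empty prefix acts as the default route
    }

    // one hop of forwarding: pick the longest prefix matching the destination
    static String nextHop(String destination) {
        String bestPrefix = null, hop = null;
        for (Map.Entry<String, String> e : TABLE.entrySet()) {
            String p = e.getKey();
            if (destination.startsWith(p)
                    && (bestPrefix == null || p.length() > bestPrefix.length())) {
                bestPrefix = p;
                hop = e.getValue();
            }
        }
        return hop;
    }

    public static void main(String[] args) {
        System.out.println(nextHop("172.31.0.5"));  // matches the 172.31 entry
        System.out.println(nextHop("8.8.8.8"));     // falls back to the default
    }
}
```

Each node applies exactly this kind of local lookup and forwards the packet; no node holds the full end-to-end route.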
and interconnection. The Ethernet system is divided into two parts: the data link
layer and the physical layer. These layers are intended to correspond closely
to the lowest layers of the ISO model for Open Systems Interconnection.
This architectural organization of the specification has two main advantages:
(i) Clarity: a clean overall separation of the design along architectural lines
makes the specification clearer. (ii) Flexibility: segregation of medium-
dependent aspects in the physical layer allows the Data Link Layer to apply
to transmission media other than the stated coaxial cable. The architectural
model is based on a set of interfaces different from those emphasized in
the implementations. One essential aspect of the design, however, must be
addressed largely in terms of the implementation interfaces: compatibility.
Two important compatibility interfaces are defined within what is
architecturally the Physical Layer: the coaxial cable interface and the
transceiver cable interface.
Figure 2.6: How LAN protocols map to the OSI reference model. Source:
http://www.ciscopress.com/articles/article.asp? p=2092245&seqNum=2
2.4.1 Layering
The major division in the Ethernet architecture is between the physical layer
and the data link layer, corresponding to the lowest two levels in the ISO
38 Internet & Distributed Systems
model. The higher levels of the overall network architecture, which use the
data link layer, will be collectively referred to as the “client layer” as, strictly
speaking, the identity and function of higher level facilities are outside the
scope of this specification. The purpose is that the Ethernet physical and
data link layers support the higher layers of the ISO model (Network Layer,
Transport Layer, etc.).
The framing process provides important delimiters that are used to identify
a group of bits that make up a frame. This process provides synchronization
between the transmitting and receiving nodes. (ii) Addressing: The
encapsulation process also provides for data link layer addressing. Each
Ethernet header added to the frame contains the physical address (MAC
address) that enables a frame to be delivered to a destination node. (iii) Error
detection: Each Ethernet frame contains a trailer with a cyclic redundancy
check (CRC) of the frame contents. After reception of a frame, the receiving
node creates a CRC to compare to the one in the frame. If these two CRC
calculations match, the frame can be trusted to have been received without
error.
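The CRC check can be mimicked with java.util.zip.CRC32, which uses the same CRC-32 polynomial as Ethernet (real hardware differs in details such as bit ordering; the frame bytes here are illustrative):

```java
import java.util.zip.CRC32;

public class FrameCheckDemo {
    // the checksum a sender would place in the frame trailer
    static long crcOf(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload);
        return crc.getValue();
    }

    public static void main(String[] args) {
        byte[] frame = "hello ethernet".getBytes();
        long sent = crcOf(frame);              // computed before transmission
        System.out.println(crcOf(frame) == sent ? "accept" : "discard"); // accept
        frame[0] ^= 1;                         // simulate a single-bit error
        System.out.println(crcOf(frame) == sent ? "accept" : "discard"); // discard
    }
}
```

A CRC-32 detects all single-bit errors, which is why the corrupted frame is guaranteed to be discarded here.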
is also handled by this layer. This layer establishes a logical link between
two nodes and also manages frame traffic control over the network. It
signals the transmitting node to stop when the frame buffers are full.
2.6 MIDDLEWARE
Middleware constitutes a communication structure that is independent of
operating systems and of the nature of the transmission systems. In the OSI
model, which defines the different communication levels between information
systems, middleware is located right at the top: it defines the communication
protocols between applications. This inter-application communication
structure rests on communication structures at lower levels, such as network
protocols (TCP/IP, DECnet, SNA, or OSI) and/or mechanisms offered by
operating systems (Daniel S., 2002).
until either a reply is received, or it times out. When the request arrives at
the server, it calls a dispatch routine that performs the requested service,
and sends the reply to the client. After the RPC call is completed, the client
process continues. Threads are common in RPC-based distributed systems.
Each incoming request to a server typically spawns a new thread. A thread in
the client typically issues an RPC and then blocks (waits). When the reply is
received, the client thread resumes execution. A programmer writing RPC-
based code does three things: i. stipulates the protocol for client-server
communication; ii. develops the client program; iii. develops the server
program. The communication protocol is created by stubs generated by a
protocol compiler. A stub is a routine that does not really do much other
than declare itself and the parameters it accepts. The stub contains just
enough code to allow it to be compiled and linked. The client and server
programs must communicate via the procedures and data types specified in
the protocol. The server side registers the procedures that may be called by
the client and receives and returns data required for processing. The client
side calls the remote procedure, passes any required data and receives the
returned data. Thus, an RPC application uses classes generated by the stub
generator to execute an RPC and wait for it to finish. The programmer needs
to supply classes on the server side that provide the logic for handling an
RPC request. RPC introduces a set of error cases that are not present in local
procedure programming. For example, a “binding error” can occur when a
server is not running when the client is started. “Version mismatches” occur
if a client was compiled against one version of a server, but the server has
now been updated to a newer version. A timeout can result from a server
crash, network problem, or a problem on a client computer. Some RPC
applications view these types of errors as unrecoverable. Fault-tolerant
systems, however, have alternate sources for critical services and fail-over
from a primary server to a backup server.
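The stub mechanism can be sketched with Java's dynamic proxies. Everything here is illustrative: the "wire" is skipped entirely and the stub dispatches directly to a local server object, at the point where a real stub would marshal the method name and arguments into a request message and block for the reply:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

public class RpcStubSketch {
    // the protocol: procedures and types both sides agree on
    interface TimeService { String currentTime(); }

    // server side: supplies the logic behind the registered procedure
    static class TimeServer implements TimeService {
        public String currentTime() { return "12:00"; }
    }

    // client-side stub: looks like a local object, forwards every call
    static TimeService clientStub(TimeService server) {
        InvocationHandler handler = (proxy, method, args) ->
                // a real stub would send a request here and wait for the reply
                method.invoke(server, args);
        return (TimeService) Proxy.newProxyInstance(
                TimeService.class.getClassLoader(),
                new Class<?>[]{TimeService.class}, handler);
    }

    public static void main(String[] args) {
        TimeService stub = clientStub(new TimeServer());
        System.out.println(stub.currentTime()); // reads like a local call
    }
}
```

The interface plays the role of the protocol definition: client and server can only communicate through the procedures and types it declares.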
from practical application, dominate the Internet. The figure below shows a
representation of the layered components of the TCP/IP model.
destination and another process at the destination receives the message. This
activity involves the communication of data from the sending process to the
receiving process and may involve the synchronization of the two processes.
2.8.2 Sockets
Sockets are a dominant abstraction that programmers use for writing
synchronous and asynchronous communication. Both forms of
communication (UDP and TCP) use the socket abstraction, which provides
an endpoint for communication between processes. Sockets originate from
BSD UNIX but are also present in most other versions of UNIX, including
Linux as well as Windows and the Macintosh OS. IPC consists of transmitting
a message between a socket in one process and a socket in another process.
Coulouris et al. (2005) show this as follows:
import java.net.*;
import java.io.*;
public class UDPClient {
    public static void main(String args[]) {
        // args give message contents and server hostname
        DatagramSocket aSocket = null;
        try {
            aSocket = new DatagramSocket();
            byte[] m = args[0].getBytes();
            InetAddress aHost = InetAddress.getByName(args[1]);
            int serverPort = 6789;
            DatagramPacket request =
                new DatagramPacket(m, m.length, aHost, serverPort);
            aSocket.send(request);
            byte[] buffer = new byte[1000];
            DatagramPacket reply = new DatagramPacket(buffer, buffer.length);
            aSocket.receive(reply);
            System.out.println("Reply: " + new String(reply.getData()));
        } catch (SocketException e) {
            System.out.println("Socket: " + e.getMessage());
        } catch (IOException e) {
            System.out.println("IO: " + e.getMessage());
        } finally {
            if (aSocket != null) aSocket.close();
        }
    }
}
REFERENCES
1. Coulouris, G., Dollimore, J., & Kindberg, T. (2005). Distributed
systems: concepts and design (4th ed.). Addison Wesley. Retrieved
January 10, 2018, from http://www.gecg.in/papers/ds5thedn.pdf
2. Hinden, R., & Deering, S. (2003, April). Internet Protocol Version 6
(IPv6) Addressing Architecture. RFC 3513, Nokia, Cisco Systems.
3. Tanenbaum, A., & Steen, M. (2002). Distributed systems: principles
and paradigms. Prentice Hall. Retrieved January 15, 2018, from http://
barbie.uta.edu/~jli/Resources/MapReduce&Hadoop/Distributed%20
Systems%20Principles%20and%20Paradigms.pdf
CHAPTER 3
CONTENTS
3.1 Clocks and Concepts of Time................................................................................... 54
3.2 Timing Problem....................................................................................................... 55
3.3 Physical Clocks........................................................................................................ 55
3.4 Logical Clocks......................................................................................................... 61
3.5 Events Ordering Synchronization............................................................................. 65
3.6 Global States........................................................................................................... 68
3.7 Distributed Debugging............................................................................................ 69
References..................................................................................................................... 73
Figure 3.1: Computation of mean solar day by Physical clock. Source: http://
www.myshared.ru/slide/1234782/
your watch loses a minute every two months and making a mental note to
adjust the clock by that amount every two months (except the system does
it recurrently).
errors are relative to the server stratum number. For this reason, the NTP
topology should minimize the synchronization distance between servers
and between servers and clients, although loops should be avoided. NTP
servers are arranged into strata. The first stratum contains the primary
servers, which are computers that are connected directly to an accurate time
source (the time source hardware itself is considered to be at stratum 0).
The second stratum contains the secondary servers. These are machines that
synchronize from the first-stratum machines. The third stratum contains
tertiary servers that synchronize from the second-stratum machines,
and so on. Together, all these servers form the synchronization subnet. A
computer will often try to synchronize with several servers, using the best
of all the results to set its time. The best result is a function of a number
of qualities, including: round-trip delay, consistency of the delay, round-
trip error, server’s stratum, the accuracy of the server’s clock, the last time
the server’s clock was synchronized, and the estimated drift on the server.
Simple NTP (SNTP) is a subset of NTP that allows operation in
a stateless remote procedure call mode or multicast mode. It is not a new
protocol, but just a subset of NTP, intended for environments where a
full NTP implementation is not needed or not justified. The intention is
that SNTP be used at the ends of the synchronization subnet (high strata)
rather than for synchronizing time servers. SNTP can operate in unicast,
multicast, or anycast mode.
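In the stateless request-reply mode, an SNTP client boils down to one 48-byte UDP exchange. The sketch below assumes a reachable server (pool.ntp.org is used only as an example); the epoch conversion is the part worth noting, since NTP counts seconds from 1900 while Unix counts from 1970:

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;

public class SntpSketch {
    // seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01)
    static final long NTP_UNIX_OFFSET = 2208988800L;

    static long ntpSecondsToUnix(long ntpSeconds) {
        return ntpSeconds - NTP_UNIX_OFFSET;
    }

    public static void main(String[] args) throws Exception {
        byte[] msg = new byte[48];
        msg[0] = 0x1B;  // LI = 0, version = 3, mode = 3 (client request)
        InetAddress server = InetAddress.getByName("pool.ntp.org");
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.setSoTimeout(3000);
            socket.send(new DatagramPacket(msg, msg.length, server, 123));
            socket.receive(new DatagramPacket(msg, msg.length));
        }
        // transmit timestamp: seconds field in bytes 40..43, big-endian
        long ntpSeconds = 0;
        for (int i = 40; i < 44; i++) ntpSeconds = (ntpSeconds << 8) | (msg[i] & 0xFF);
        System.out.println("server Unix time: " + ntpSecondsToUnix(ntpSeconds));
    }
}
```

A full NTP client would additionally use the round-trip delay and offset calculations over several exchanges; this sketch deliberately omits them.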
3.7.1 Heterogeneity
A distributed system’s nodes may include mobile phones, laptops, server-
class machines, and more. This hardware and software diversity in node
resources and network connectivity can make a distributed system more
robust, but this heterogeneity forces developers to manage compatibility
during both development and debugging.
3.7.2 Concurrency
Simultaneous operation by multiple nodes leads to concurrency, which
can make a distributed system outperform a centralized system. However,
concurrency may introduce race conditions and deadlocks, which are
notoriously difficult to diagnose and debug. Additionally, networks
introduce packet delay and loss, exacerbating the issues of understanding
and debugging concurrency.
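A race condition of the kind described above can be reproduced in a few lines: two threads increment a shared counter without synchronization, and because counter++ is a read-modify-write sequence rather than an atomic step, updates are lost nondeterministically:

```java
public class RaceDemo {
    static int counter = 0;  // shared, deliberately unsynchronized

    public static void main(String[] args) throws InterruptedException {
        Runnable work = () -> {
            for (int i = 0; i < 100_000; i++) counter++;  // not atomic
        };
        Thread a = new Thread(work), b = new Thread(work);
        a.start(); b.start();
        a.join(); b.join();
        // the result is at most 200000, usually less, and varies between runs
        System.out.println(counter);
    }
}
```

Replacing the int with java.util.concurrent.AtomicInteger, or synchronizing the increment, makes the total exactly 200000; the nondeterminism is precisely what makes such bugs hard to reproduce under a debugger.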
3.7.4 Testing
A test suite exercises a specific set of executions to ensure that they behave
properly. Most testing of distributed systems is done using manually written
tests, typically introduced in response to failures and then minimized.
Testing is an effective way to detect errors. However, as testing exercises
a limited number of executions, it can never guarantee to reveal all errors.
3.7.6 Tracing
Tracing tracks the flow of data through a system, even across applications and
protocols such as a database, web server, domain name server, load balancer,
or virtual private network. For example, Pivot Tracing dynamically
instruments Java-based systems to collect user-defined metrics at different
points in the system and collates the final data to provide an inter-component
view of the metrics over multiple executions. Dapper is a lower-level tracing
system used at Google to trace infrastructure services. Tracing is more
efficient than record and replay because it focuses on a specific subset of
the data, but it requires instrumenting applications and protocols to properly
forward the tracing metadata.
console logs, debug logs, and other log sources are used to understand the
system. For example, Xu et al. applied machine learning to logs to detect
anomalies in Google infrastructure services (2010). Detailed logs from
realistic systems contain a great deal of valuable detail, but they tend to be
so large that they are overwhelming to programmers, who as a result cannot
directly benefit from them.
3.7.8 Visualization
Threads and processes can be seen as a way to do more things at the same
time. In effect, they allow us to build (pieces of) programs that appear to be
executed simultaneously. On a single-processor computer, this simultaneous
execution is, of course, an illusion. As there is only a single CPU, only an
instruction from a single thread or process will be executed at a time. By
rapidly switching between threads and processes, the illusion of parallelism
is created. This separation between having a single CPU and being able to
pretend there are more can be extended to other resources as well, leading
to what is known as resource virtualization. Virtualization has been applied
for many decades, but it has received renewed interest as (distributed)
computer systems have become more commonplace and complex, to the point
where application software almost always outlives its underlying systems
software and hardware. In this section, we pay some attention to the role
of virtualization and discuss how it can be realized. An important reason
for introducing virtualization in the 1970s was to allow legacy software to
run on expensive mainframe hardware. That software included not only various
applications, but also the operating systems they were developed for. This
approach to supporting legacy software was successfully applied to the IBM
370 mainframes (and their successors), which offered virtual machines to
which different operating systems had been ported. As hardware became
cheaper, personal computers became more powerful, and the number of
different operating systems shrank, virtualization became less of an issue.
Matters have changed again since the late 1990s, however, for several
reasons. While low-level hardware and systems software change rapidly,
software at higher levels of abstraction (such as middleware and
applications) is much more stable. In other words, we face situations where
legacy software cannot be maintained at the same pace as the platforms it
relies on. Virtualization can help here by porting legacy interfaces to new
platforms and thus opening it
REFERENCES
1. Distributed systems: principles and paradigms, Physical and logical
clocks (2010).
2. Gusella, R., & Zatti, S. (1989). The accuracy of the clock synchronization
achieved by TEMPO in Berkeley UNIX 4.3BSD. IEEE Transactions on
Software Engineering, 15(7), 847-853.
3. Lamport, L. (1978, July). Time, clocks, and the ordering of events in a
distributed system. Communications of the ACM, 21(7), 558-565.
4. Sinha, P. (2009). Distributed operating systems: concepts and design
(pp. 282-336). PHI Learning Private Limited.
5. Xu, W., Huang, L., Fox, A., Patterson, D., & Jordan, M. (2010).
Experience mining Google’s production console logs. Proceedings of
the Workshop on Managing Systems via Log Analysis and Machine
Learning Techniques. Retrieved January 15, 2018, from http://iiis.
tsinghua.edu.cn/~weixu/files/slaml10.pdf
CHAPTER 4
COORDINATION
“The theories of computation are evolving from notions like value, evalua-
tion and function to those of link, interaction, and process.”
-Robin Milner
CONTENTS
4.1 Distributed Mutual Exclusion................................................................................... 76
4.2 Multicast Communication....................................................................................... 85
4.3 Group Communication............................................................................................ 86
4.4 Failure..................................................................................................................... 87
4.5 Elections.................................................................................................................. 91
References..................................................................................................................... 94
received replies are stored in a Boolean array, which turns to false whenever
it receives a reply from the respective node. On exiting the critical section,
the node broadcasts a release message and all the processes remove the
request from their queue. In the second version, there is an optimization.
It goes like this: when a process receives a request, if it has already sent a
message with clock value greater than the request message, then it does not
send a reply message. Similarly, when a process sends a request message it
also checks if it has received any message with a timestamp higher than its
request, if yes then that message is considered as a reply and its Boolean
value is set to false. The rest of the operations are performed as before.
When a process enters the critical section, it creates a log file and records
its entry into the critical section, and likewise its exit on leaving.
array, RDi, the size of which is the same as the number of processes in the
system. Initially, ∀i ∀j: RDi [j]=0. Whenever pi defers the request sent by pj,
it sets RDi [j]=1, and after it has sent a REPLY message to pj, it sets RDi [j]=0.
Requesting the critical section 1. When a site Si wants to enter the CS,
it sends a timestamped REQUEST message to all the sites in its request set.
2. When site Sj receives a REQUEST message from site Si, it sends a REPLY
message to site Si if site Sj is neither requesting nor executing the CS or if
site Sj is requesting and Si’s request’s timestamp is smaller than Sj’s own
request’s timestamp. The request is deferred otherwise.
Executing the critical section Site Si enters the CS after it has received
REPLY messages from all the sites in its request set.
Releasing the critical section i. When site Si exits the CS, it sends
REPLY messages to all the deferred requests. A site’s REPLY messages are
blocked only by sites that are requesting the CS with higher priority (i.e.,
a smaller timestamp). Thus, when a site sends out REPLY messages to all
the deferred requests, the site with the next highest priority request receives
the last needed REPLY message and enters the CS. The execution of CS
requests in this algorithm is always in the order of their timestamps.
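The reply-or-defer decision in the steps above reduces to one comparison of (timestamp, site identifier) pairs, with the identifier breaking timestamp ties as is usual for Lamport clocks. A minimal sketch, with the receiver's state held in illustrative fields:

```java
public class ReplyDeferRule {
    // state of the receiving site Sj (illustrative values)
    static boolean requestingCS = true;
    static boolean executingCS = false;
    static long ownTimestamp = 10;
    static int ownId = 2;

    // should Sj send REPLY immediately to REQUEST(ts, id) from Si?
    static boolean replyImmediately(long ts, int id) {
        if (executingCS) return false;   // defer while inside the CS
        if (!requestingCS) return true;  // idle sites always reply
        // both are requesting: the smaller (timestamp, id) pair has priority
        return ts < ownTimestamp || (ts == ownTimestamp && id < ownId);
    }

    public static void main(String[] args) {
        System.out.println(replyImmediately(5, 1));   // earlier request: reply
        System.out.println(replyImmediately(12, 1));  // later request: defer
    }
}
```

Deferred requests are answered on exit from the critical section, which is what yields execution in timestamp order.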
exploit the changing conditions in the system. Note that an algorithm can
exploit dynamic conditions of the system to optimize the performance. For
example, if few sites are invoking mutual exclusion very frequently and
other sites invoke mutual exclusion much less frequently, then a frequently
invoking site need not ask for the permission of less frequently invoking
site every time it requests access to the CS. It only needs to take permission
from all other frequently invoking sites. Singhal developed an adaptive
mutual exclusion algorithm based on this observation. The information-
structure of the algorithm evolves with time as sites learn about the state
of the system through messages. Dynamic information structure mutual
exclusion algorithms are attractive because they can adapt to fluctuating
system conditions to optimize the performance.
Requesting the CS
1. If the requesting site Si does not have the token, then it takes the following
actions: - a. It sets SVi [i]:=R. b. It increments SNi [i]:=SNi [i]+1. c. It
sends a REQUEST (i, sn) message to all sites Sj for which SVi [j]=R (sn is the
updated value of SNi [i]). 2. When a site Sj receives the REQUEST (i, sn)
message, it discards the message if SNj [i]≥sn, because the message is
outdated. Otherwise, it sets SNj [i] to sn and takes the following actions
based on its own state:
SVj [j]=N: Set SVj [i]:=R. SVj [j]=R: If SVj [i]≠R, then set SVj [i]:=R
and send a REQUEST (j, SNj [j]) message to Si (else do nothing).
SVj [j]=E: set SVj [i]:=R. SVj [j]=H: Set SVj [i]:=R, TSV [i]:=R,
TSN [i]:=sn, SVj [j]:=N, and send the token to site Si.
Executing the CS 3. Si executes the CS after it has received the token. Just
before entering the CS, Si sets SVi [i] to E.
Releasing the CS 4. Having finished the execution of the CS, site Si sets SVi
[i]:=N and TSV [i]:=N, updates its local and token vectors in the following
way: For all Sj, j=1 to N do
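The sequence-number test of step 2 — discard a REQUEST (i, sn) when SN[i] ≥ sn — can be sketched in isolation (the array contents are illustrative):

```java
public class RequestFreshnessCheck {
    // highest request sequence number seen so far from each site
    static int[] SN = {0, 3, 1};

    // process REQUEST(i, sn): true if accepted, false if outdated
    static boolean accept(int i, int sn) {
        if (sn <= SN[i]) return false;  // outdated: already seen sn or newer
        SN[i] = sn;                     // record the newer sequence number
        return true;
    }

    public static void main(String[] args) {
        System.out.println(accept(1, 2));  // false: request 3 was already seen
        System.out.println(accept(1, 4));  // true: fresh request, SN[1] becomes 4
    }
}
```

This is how delayed or duplicated REQUEST messages are prevented from triggering stale token transfers.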
4.4 FAILURE
Four types of failures that may occur in a distributed system are crash,
timing, omission, and Byzantine failures. The failures usually fall into two
different categories, hardware and software. The following is an outline of
the four types of failures and a method to isolate and fix two of the types of
failures in a distributed system.
have said “Let’s retreat” to both of them, but Jakes lied when he said,
“Anthony says let’s attack.” In this scenario, as defined by Leslie Lamport,
Robert Shostak, and Marshall Pease in their 1982 paper, a general must
communicate his order to attack or retreat to his lieutenants, but any number
of participants, including the general, could be a traitor. This beautiful insight
is due to Lamport, Shostak, and Pease, who proved that the problem has a
solution if and only if n ≥ 3t + 1, where n is the total number of generals and
t the number of traitors. Of course, if the generals are able to sign
their messages, then no general dare say different things to two different
colleagues. This illustrates the power of digital signatures in particular and
of end to end security mechanisms in general. Relying on third parties to
introduce principals to each other or to process transactions between them
can give great savings, but if the third parties ever become untrustworthy
then it can impose significant costs. There are two main concerns indicating
that the Byzantine failure model should be used with care in the context
of secure systems. The first concern is that Byzantine-tolerant solutions
(for example to Interactive Consistency) are not concerned with secrecy.
For example, if a database is fully replicated to maintain availability in
the presence of Byzantine failures, an attacker compromising at least one
database server will know the entire contents of the database. If file contents
are required to be kept secret, then standard algorithms for Interactive
Consistency from the fault-tolerance area can only be applied if specific
firewall architectures are used. If secrecy of replicated data is required,
mechanisms from cryptography have to be applied, usually some instance
of secret sharing. In this approach, data is stored in a distributed way so that
it needs a certain fraction of nodes to collaborate in the reconstruction of the
data. If this fraction is higher than the number of nodes which may be faulty,
then the Byzantine nodes cannot by themselves reconstruct arbitrary data.
However, if standard agreement protocols are used, nothing can prevent a
Byzantine node from receiving information about what has been proposed
as the database update by individual processes. Interestingly, a similar
problem to Interactive Consistency where individual process inputs can be
kept secret has been studied for years in the area of cryptography under
the heading of secure multi-party computation. In a secure multi-party
computation, a set of processes wants to compute a deterministic function
of all their inputs without trusting each other. A common example is the
millionaires’ problem: two millionaires want to know which of them has more
money but they do not want to tell each other how much they really have.
A multi-party computation can compute any function even though a certain
fraction of nodes exhibits Byzantine behavior. It does this in a way so that a
Coordination 89
Byzantine process learns nothing about the inputs of other processes to the
computation (apart from what is derivable from the result of the function).
It is obvious that Interactive Consistency can be formulated as an instance
of secure multi-party computation, and researchers in fault tolerance are
often not aware of this fact, leading to multiple reinventions of the wheel. The
second concern suggesting care in using Byzantine failures in security is
related to measures of reliability. The Interactive Consistency problem (as
well as secure multi-party computation) is provably impossible if at least one
third of the processes can behave in a Byzantine fashion. If we denote by t the
maximum number of faulty processes and by n the total number of processes,
this means that we need n ≥ 3t + 1 for the problem to be solvable. In practice,
we cannot predict with certainty that n ≥ 3t + 1 will hold; we can merely give
a probability, which is called the assumption coverage. The relevant input measure is
the reliability of a component which is defined as the probability that the
component is not faulty until a certain time. Experience from large data
collections of component failures shows that the reliability of components
can be modeled as a random variable satisfying a certain distribution. The
overall reliability of a complex system can then be calculated from the
reliability of its components.
setting are not random events: if a new security vulnerability is published, the
probability of a server being compromised rises. But perhaps most importantly,
it is very difficult to argue that the occurrences of security-related Byzantine
failures are statistically independent (especially if all servers run the same
operating system). So, although Byzantine failures still result in arbitrary
behavior, they are not random events in security.
4.5 ELECTIONS
In a distributed environment, most applications often require a single entity
to act temporarily as a central controller to coordinate the execution of a
particular task by the entities. In some cases, the need for a single coordinator
arises from the desire to simplify the design of the solution protocol for a
rather complex problem; in other cases, the presence of a single coordinator
is required by the nature of the problem itself. The problem of choosing such
a coordinator from a population of autonomous symmetric entities is known
as Leader Election (Elect). Formally, the task consists in moving the system
from an initial configuration where all entities are in the same state (usually
called available), into a final configuration where all entities are in the
same state (traditionally called follower), except one which is in a different
state (traditionally called leader). There is no restriction on the number of
entities that can start the computation, nor on which entity should become
leader. We can think of the Election problem as the problem of enforcing
restriction Unique Initiator in a system where actually no such restriction
exists: the multiple initiators would first start the execution of an Election
protocol; the sole leader will then be the unique initiator for the subsequent
computation. As election provides a mechanism for breaking the symmetry
among the entities in a distributed environment, it is at the basis of most
control and coordination processes (e.g., mutual exclusion, synchronization,
Minimum. In terms of signals and value messages, with the Elect Root strategy
we have just 2 value messages; with the Elect Minimum strategy we have n
value messages. Remember: measuring the number of bits always gives us
a representation of efficiency at a more refined level of granularity.
Fortunately, it is not always necessary to go to such a level.
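As a toy illustration of the Elect Minimum idea on a ring (a sequential simulation, not a distributed implementation): each entity forwards the smallest identifier it has seen so far, so after one full circulation the surviving identifier is the leader's and n value messages have been spent:

```java
public class RingElectionSketch {
    // simulate one circulation of the minimum id around the ring
    static int electMinimum(int[] ring) {
        int candidate = ring[0];
        for (int id : ring) {
            if (id < candidate) candidate = id;  // smaller id survives
        }
        return candidate;  // after n value messages, this is the leader's id
    }

    public static void main(String[] args) {
        int[] ids = {7, 3, 9, 1, 5};
        System.out.println("leader: " + electMinimum(ids));
    }
}
```

All other entities end in the follower state; only the entity holding the surviving identifier becomes leader.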
REFERENCES
1. Kopetz, H. (1997). Real-time systems: design principles for distributed
embedded applications. New York: Kluwer International. Retrieved
January 15, 2018, from http://utdallas.edu/~bhd130030/Lamport’s%20
Mutual%20Exclusion%20Algorithm/PROJECT_REPORT.pdf
2. Nicola, S. (2005). Design and analysis of distributed algorithms (pp.
99, 102-104). Hoboken, New Jersey: John Wiley & Sons, Inc.
3. Tanenbaum, A., & Renesse, R. (1985). Distributed operating systems.
ACM Computing Surveys, 17(4), 419-470.
4. Tanenbaum, A. S. (2003). Computer networks (4th ed.). Upper
Saddle River, NJ: Prentice-Hall International.
CHAPTER 5
DISTRIBUTION AND
OPERATING SYSTEMS
CONTENTS
5.1 Introduction to Operating Systems........................................................................... 96
5.2 Batch Operating System.......................................................................................... 99
5.3 Time-Sharing Operating Systems.............................................................................. 99
5.4 Distributed Operating System................................................................................ 100
5.5 Real-Time Operating System.................................................................................. 101
5.6 Network Operating System.................................................................................... 102
5.7 Unix Operating System.......................................................................................... 105
5.8 Windows Operating System................................................................................... 107
5.9 Distributed and Network File System..................................................................... 108
References................................................................................................................... 114
software and monitor activity with built in auditing tools. Users count on
operating systems to provide easy-to-use graphical user interfaces (GUIs),
operate multiple applications at one time, and store and access data and
information needed for daily operations (UMUC, 2011). Businesses count
on operating systems to address and provide for the four basic security
concerns of confidentiality, integrity, availability and authenticity (Stallings,
2011). Although many operating systems include built in controls to address
these security concerns, additional measures should be taken to ensure the
required level of security is achieved.
i. A program needs to read a file or write a file. ii. The operating system
grants the program permission to operate on the file. iii. Permissions vary
from read-only to read-write, denied, and so on. iv. The operating system
provides an interface for the user to create/delete files. v. It provides an
interface for the user to create/delete directories. vi. It provides an
interface to create a backup of the file system.
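The file-management activities listed above can be sketched through Python's standard `os` interface, which wraps the underlying system calls; the file and directory names below are illustrative only.

```python
# Illustrative use of the file-system interface an operating system exposes:
# creating, writing, reading, restricting, and deleting files and directories.
import os
import tempfile

workdir = tempfile.mkdtemp()                 # ask the OS for a scratch directory
path = os.path.join(workdir, "example.txt")

with open(path, "w") as f:                   # OS grants write permission
    f.write("hello")
with open(path) as f:                        # OS grants read permission
    data = f.read()

os.chmod(path, 0o400)                        # restrict the file to read-only
os.chmod(path, 0o600)                        # restore permissions before cleanup
os.remove(path)                              # delete the file
os.rmdir(workdir)                            # delete the directory
print(data)
```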
5.1.4 Communication
In case of distributed systems which are a collection of processors that
do not share memory, peripheral devices, or a clock, the operating system
manages communications between all the processes. Multiple processes
communicate with one another through communication lines in the network.
The OS handles routing and connection strategies, and the problems of
contention and security. Following are the major activities of an operating
system with respect to communication: two processes often require data to
be transferred between them. i. Both processes can be on one computer
or on different computers connected through a computer network.
ii. Communication may be implemented by two methods: shared memory
or message passing.
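The two IPC methods just mentioned can be sketched with Python's `multiprocessing` module; the function names are illustrative, and real operating systems implement these mechanisms at the kernel level.

```python
# Sketch of the two IPC methods: message passing via a Queue (a copy of
# the data travels between processes) and shared memory via a Value
# (both processes operate on the same underlying integer).
from multiprocessing import Process, Queue, Value

def producer(q):
    q.put("ping")            # message passing: the message is copied across

def incrementer(counter):
    with counter.get_lock(): # shared memory: both processes see one int
        counter.value += 1

if __name__ == "__main__":  # guard required on spawn-based platforms
    q = Queue()
    counter = Value("i", 0)
    p1 = Process(target=producer, args=(q,))
    p2 = Process(target=incrementer, args=(counter,))
    p1.start(); p2.start()
    print(q.get())           # message received from the other process
    p1.join(); p2.join()
    print(counter.value)     # shared counter incremented by the other process
```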
5.1.7 Protection
Considering a computer system having multiple users and concurrent
execution of multiple processes, the various processes must be protected
from each other’s activities. Protection refers to a mechanism or a way to
control the access of programs, processes, or users to the resources defined
by a computer system. Following are the major activities of an operating
system with respect to protection: i. The OS ensures that all access to system
resources is controlled. ii. The OS ensures that external I/O devices are
protected from invalid access attempts. iii. The OS provides authentication
features for each user by means of passwords.
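The protection mechanism described above, controlling which programs, processes, or users may access which resources, can be illustrated with a toy access-control table; the users, paths, and rights below are hypothetical, and real systems use ACLs, capabilities, and hardware protection.

```python
# A toy sketch of OS protection: every access request is checked against
# an access-control table mapping (user, resource) to permitted operations.
ACCESS_TABLE = {
    ("alice", "/etc/passwd"): {"read"},
    ("root",  "/etc/passwd"): {"read", "write"},
}

def check_access(user, resource, operation):
    """Return True only if the table grants `operation` to `user`."""
    return operation in ACCESS_TABLE.get((user, resource), set())

print(check_access("alice", "/etc/passwd", "read"))   # True
print(check_access("alice", "/etc/passwd", "write"))  # False
```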
of tasks and maximizing the number of high priority tasks meeting their
deadlines. Also, hard real-time systems are used when it is imperative that
an event be reacted to within a strict deadline. Such strong guarantees are
required of systems for which not reacting in a certain interval of time would
cause great loss in some manner, especially damaging the surroundings
physically or threatening human lives (although the strict definition is
simply that missing the deadline constitutes failure of the system). For
example, a car engine control system is a hard real-time system because
a delayed signal may cause engine failure or damage. Other examples of
hard real-time embedded systems include medical systems such as heart
pacemakers and industrial process controllers. Hard real-time systems
are typically found interacting at a low level with physical hardware, in
embedded systems. Early video game systems such as the Atari 2600 and
Cinematronics vector graphics had hard real-time requirements because of
the nature of the graphics and timing hardware.
the means of organizing the layout of the data storage in complex ways. The
shell is the command interpreter. Although the shell is just a utility program,
and is not properly a part of the system, it is the part that the user sees. The
shell listens to your terminal and translates your requests into actions on
the part of the kernel and the many utility programs.
UNIX/Linux
UNIX is one of the oldest operating systems in the computer world, and
is still widely used today. However, it is not a very conspicuous operating
system. Somewhat arcane in its operation and interface, it is ideally suited
for the needs of large enterprise computing systems. It is also the most
common operating system run by servers and other computers that form the
bulk of the Internet. While you may never use UNIX on your local PC, you
are using it indirectly in one form or another every time you log on to the
net. While few people run UNIX on their own systems, there are in fact a
number of different versions of UNIX available for the PC, and millions of
PC users have chosen to install “UNIXy” operating systems on their own
desktop machines. There are dozens of variants of the basic UNIX interface;
the most common one for the PC platform is Linux, which is itself available
in many flavors. While UNIX operating systems can be difficult to set up
and require some knowledge to operate, they are very stable and robust, are
efficient with system resources and are generally free or very inexpensive
to obtain. Unix operating systems are widely used in both servers and
workstations. The Unix environment and the client-server program model
were essential elements in the development of the Internet and the reshaping
of computing as centered in networks rather than in individual computers.
Both Unix and the C programming language were developed by AT&T
and distributed to government and academic institutions, causing both to
be ported to a wider variety of machine families than any other operating
system. As a result, Unix became synonymous with “open systems.” Unix
was designed to be portable, multi-tasking, and multi-user in a time-sharing
pattern. Unix systems are characterized by various concepts: the use of plain
text for storing data; a hierarchical file system; treating devices and certain
types of inter-process communication (IPC) as files; and the use of a large
number of small programs that can be strung together through a command
line interpreter using pipes, as opposed to using a single monolithic program
that includes all of the same functionality. These concepts are known as the
Unix philosophy. Under Unix, the “operating system” consists of many of
these utilities along with the master control program, the kernel. The kernel
provides services to start and stop programs, handle the file system and
other common “low level” tasks that most programs share, and, perhaps
entries in the root directory of the file system and had restrictions on the
maximum size of FAT-formatted disks or partitions. Specifically, FAT12 and
FAT16 had a limit of 8 characters for the file name, and 3 characters for the
extension (such as .exe). This is commonly referred to as the 8.3 filename
limit. FAT32 also addressed many of the limits in FAT12 and FAT16, but
remains limited compared to NTFS. NTFS, introduced with the Windows
NT operating system, allowed ACL-based permission control. Hard links,
multiple file streams, attribute indexing, quota tracking, sparse files,
encryption, compression, reparse points (directories working as mount-
points for other file systems, symlinks, junctions, remote storage links) are
also supported, though not all these features are well-documented. Unlike
many other operating systems, Windows uses a drive letter abstraction at the
user level to differentiate one disk or partition from another. For example,
the path C:\WINDOWS represents a directory WINDOWS on the partition
represented by the letter C. The C drive is most commonly used for the
primary hard disk partition, on which Windows is usually installed and from
which it boots. This “culture” has become so deep-rooted that bugs
appeared in older versions of Windows which assumed that the drive
the operating system was installed on was C. The practice of using
“C” for the drive letter can be traced to MS-DOS, where the letters A and B
were reserved for up to two floppy disk drives. Network drives may also be
mapped to drive letters.
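The 8.3 filename limit mentioned above (at most eight characters for the name, three for the extension, as in FAT12/FAT16) can be checked with a short helper; this is a simplification, since real FAT also restricts which characters may appear.

```python
# Simplified validity check for the FAT12/FAT16 "8.3" filename limit:
# a name of up to 8 characters, optionally followed by an extension of
# up to 3 characters, with at most one dot.
def is_valid_8_3(filename):
    name, _, ext = filename.partition(".")
    if not name or len(name) > 8:
        return False
    return len(ext) <= 3 and "." not in ext

print(is_valid_8_3("AUTOEXEC.BAT"))     # True: 8-character name, 3-character extension
print(is_valid_8_3("longfilename.txt")) # False: name exceeds 8 characters
```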
second process sees the results of the first process “write.” In a distributed
system using remote access, the results are the same as on a single processor,
but in a system in which files are cached locally, the following situation can
occur. Client 1 asks for and gets a copy of some file. Then client 2 asks for
and gets a copy of the same file. Now process 1 modifies the file. When process
2 reads its copy, it does not see the alterations that process 1 made. Worse
yet, if both processes modify the file and send it back to the server, the first
copy sent back will eventually be overwritten by the second one. There are
two possible solutions to this problem. In the first one, once a process has
asked for and been granted a file, no other copies will be sent out until the
first one has been returned. Effectively, the file is then locked. The second
solution allows multiple copies, but redefines the semantics to say that once
a file has been cached by a client, the effect of a simultaneous operation
by another client is undefined. The effect of one or more operations only
becomes definitive when the file is closed and written back to the server.
While this approach changes the meaning of concurrent operation, it is
much more efficient. A Network File System (NFS) allows remote hosts
to mount file systems over a network and interact with those file systems
as though they are mounted locally. This enables system administrators
to combine resources onto centralized servers on the network. Currently,
there are three versions of NFS. NFS version 2 (NFSv2) is older and is
widely supported. NFS version 3 (NFSv3) has more features, including
64bit file handles, Safe Async writes and more robust error handling. NFS
version 4 (NFSv4) works through firewalls and on the Internet, no longer
requires portmapper, supports ACLs, and utilizes stateful operations. Red
Hat Enterprise Linux supports NFSv2, NFSv3, and NFSv4 clients, and
when mounting a file system via NFS, Red Hat Enterprise Linux uses
NFSv3 by default, if the server supports it. All versions of NFS can use TCP
running over an IP network, with NFSv4 requiring it. NFSv2 and NFSv3
can use UDP running over an IP network to provide a stateless network
connection between the client and server. When using NFSv2 or NFSv3
with UDP, the stateless UDP connection under normal conditions has less
protocol overhead than TCP, which can translate into better performance on
very clean, non-congested networks. The NFS server sends the client a file
handle after the client is authorized to access the shared volume. This file
handle is an opaque object stored on the server’s side and is passed along
with RPC requests from the client. The NFS server can be restarted without
affecting the clients and the cookie remains intact. However, because UDP
is stateless, if the server goes down unexpectedly, UDP clients continue
to saturate the network with requests for the server. For this reason, TCP
is the preferred protocol when connecting to an NFS server. NFSv4 has
no interaction with portmapper, rpc.mountd, rpc.lockd, and rpc.statd,
as protocol support has been incorporated into the v4 protocol. NFSv4
listens on the well-known TCP port (2049), which eliminates the need for
the portmapper interaction. The mounting and locking protocols have been
incorporated into the v4 protocol, which eliminates the need for interaction
with rpc.mountd and rpc.lockd.
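The session-semantics solution described earlier in this section, where changes to a cached copy become definitive only when the file is closed and written back to the server, can be sketched as follows; the class and method names are hypothetical illustrations, not part of any real file-system API.

```python
# Sketch of "session semantics" for a distributed file system: each client
# works on a private cached copy, and its writes become visible to others
# only when the file is closed and written back. With two concurrent
# writers, the last close wins.
class FileServer:
    def __init__(self):
        self.files = {"doc": "v0"}

    def open(self, name):
        return self.files[name]          # client receives a whole cached copy

    def close(self, name, contents):
        self.files[name] = contents      # write-back on close; last close wins

server = FileServer()
copy1 = server.open("doc")               # client 1 caches "v0"
copy2 = server.open("doc")               # client 2 caches "v0"
copy1 += " +edit1"
copy2 += " +edit2"
server.close("doc", copy1)
print(server.files["doc"])                # "v0 +edit1": edit2 not yet visible
server.close("doc", copy2)
print(server.files["doc"])                # "v0 +edit2": second close overwrites the first
```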
REFERENCES
1. Coulouris, G., Dollimore, J., & Kindberg, T. (2001). Distributed systems:
concepts and design. Pearson Education Ltd. Retrieved January 2018,
from http://www.gecg.in/papers/ds5thedn.pdf
2. Hailperin, M. (2005). Operating systems and middleware: supporting
controlled interaction. Thomson Course Technology, a division of
Thomson Learning Inc. Retrieved January 2018, from https://gustavus.
edu/mcs/max/os-book/osm-rev1.1.2.pdf
3. Serain, D. (2002). Middleware and enterprise application integration.
London: Springer-Verlag London Limited.
4. Sinha, P. (2002). Distributed operating systems: concepts and design.
Prentice Hall of India Pvt. Ltd.
5. Tanenbaum, A., & Steen, M. (2002). Distributed systems: principles and
paradigms. Pearson Education. Retrieved January 15, 2018, from http://
barbie.uta.edu/~jli/Resources/MapReduce&Hadoop/Distributed%20
Systems%20Principles%20and%20Paradigms.pdf
CHAPTER 6
“Peer-to-peer applications have been used to provide file sharing, web cach-
ing, information distribution and other services, exploiting the resources of
tens of thousands of machines across the Internet.”
-George Coulouris
CONTENTS
6.1 Introduction to Peer to Peer System....................................................................... 116
6.2 Routing in P2P....................................................................................................... 117
6.3 Classification of P2P.............................................................................................. 121
6.4 First Generation of P2P.......................................................................................... 124
6.5 Second Generation P2P......................................................................................... 126
6.6 Third Generation P2P............................................................................................ 130
6.7 Current Generation P2P......................................................................................... 133
References................................................................................................................... 135
and there is not yet a paradigm under which research into p2p semantic
routing is conducted. The advent of semantic networking is making this an
important topic, and one that is sure to grow and flourish in the future.
exchange their query routing tables with their neighbors periodically. The
query routing tables contain metadata about hosted content, i.e., keywords and
the corresponding IP address of the servent from which the metadata was
received. Any incoming query is then analyzed for its search keywords, and
then compared to the local query-routing table. If one of the search keywords
matches one or more entries in the routing table, the query is forwarded
in the direction given by the routing table instead of being flooded to all
neighbors of the servent. If no match with the routing table can be found, the
query is forwarded to all neighbors of the servent, as long as the TTL-value
of the query message has not expired. To minimize the amount of bandwidth
necessary to propagate the routing tables, a variant of Bloom filters is
used (Bloom, 1970). This means that each keyword is hashed, and then all
keywords of the content of one servent are compressed into a bitmap. Thus,
instead of a whole set of keywords and IP addresses, only a comparatively
small bitmap has to be exchanged periodically. Furthermore, incremental
updates could be used if only small changes have taken place since the last routing
table has been propagated to its neighbors. However, the major problem
with the implementation of query-routing tables is, again, how to keep them
up to date if the network is very dynamic. The problem is that routing
information for a certain file A hosted by servent X may still
propagate through the network although servent X is no longer a
member of the network. Thus queries may be directed in a wrong direction,
which leads to useless traffic, and to unsatisfied users, as the content they
search for cannot be found anymore. A solution to this problem could be to
set a timer for every routing table entry. After expiration, the routing
table entry is deleted to prevent any misleading routing. Furthermore,
the propagation reach of each routing table must be limited to keep the
routing table from propagating through the whole network. This can be
done with a hop counter, which stops routing tables from being spread any
further as soon as a certain value for the hop count has been reached.
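The bitmap technique cited above (Bloom, 1970) can be sketched in a few lines. The class below is a minimal, illustrative Bloom filter, with the size and number of hash functions chosen arbitrarily; real query-routing implementations tune these parameters and exchange the bitmaps between servents.

```python
# A minimal Bloom filter of the kind used to compress query-routing
# tables: keywords are hashed to positions in a small bitmap, so only
# the bitmap has to be exchanged periodically. Membership tests can
# give false positives but never false negatives.
import hashlib

class BloomFilter:
    def __init__(self, size=256, hashes=3):
        self.size, self.hashes, self.bits = size, hashes, 0

    def _positions(self, keyword):
        # Derive several positions by salting the keyword before hashing.
        for i in range(self.hashes):
            digest = hashlib.sha1(f"{i}:{keyword}".encode()).digest()
            yield int.from_bytes(digest[:4], "big") % self.size

    def add(self, keyword):
        for pos in self._positions(keyword):
            self.bits |= 1 << pos

    def might_contain(self, keyword):
        return all(self.bits & (1 << pos) for pos in self._positions(keyword))

bf = BloomFilter()
for kw in ["music", "video", "linux"]:
    bf.add(kw)
print(bf.might_contain("music"))    # True: forward the query toward this servent
print(bf.might_contain("windows"))  # almost certainly False: flood instead
```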
and hence the systems benefit from the search techniques of centralized P2P
systems. Hybrid P2P systems draw advantages from the other types of P2P
architectures, i.e., centralized and fully distributed ones, while distinguishing
themselves from the other two types by their elegant auxiliary mechanisms
that facilitate resource location. In some P2P systems of hybrid architecture,
there are some peers possessing much more powerful capabilities and
having more responsibilities than other peers, which are usually referred to
as “super” peers (or supernodes). These supernodes form an “upper level”
of a hybrid system, which provides similar services for the ordinary peers
as the central server does in a centralized P2P system. The common peers,
on the other hand, can enjoy much more services from the supernodes in
the “higher” layer, especially in the process of resource location. Though
supernodes share some similar features to the central server in centralized
P2P systems, it is easy to distinguish one from the other based on the
following metrics: (i) A supernode is not as powerful as the central server
of a centralized P2P system, and it is only in charge of a subset of peers in
the network. (ii) A server as in Napster just helps peers to locate desired
files without sharing any file by itself; however, a supernode has to not only
coordinate the operations among the peers under its supervision, but also
perform the same operations by itself and contribute its own resources as
the common peers do.
this model works really well in most aspects, and its main weakness
concerns copyright issues. The figure below shows an example of centralized
directory model.
6.4.1 Napster
Napster, also known as a music exchange system, and other similar
systems have a constantly updated object directory maintained at central
Napster server(s). Nodes log in to this server and send the list of files they
can offer, then issue queries to the server to find which other nodes hold
their desired files, and finally download the desired objects directly from the
object home. It has a Server-Client structure where there is a central server
system which directs traffic between individual registered users. The central
servers maintain directories of the shared files stored on the respective PCs
of registered users of the network. These directories are updated every time a
user logs on or off the Napster server network. Clients connect automatically
to an internally designated “metaserver” that acts as common connection
arbiter. This metaserver assigns at random an available, lightly loaded server
from one of the clusters. Servers appeared to be clustered about five to a
geographical site and Internet feed, and able to handle up to 15,000 users
each. The client then registers with the assigned server, providing identity
and shared file information for the server’s local database. In turn, the client
receives information about connected users and available files from the
server. Although formally organized around a user directory design, the
Napster implementation is very data centric. Although Napster’s centralized
database naturally avoids query routing and other problems of other P2P
systems, it is clear that such centralized approach has single point of failure
and very poor scalability. Another feature of this architecture is that it can
support partial-match queries (e.g., searching for all objects whose titles
contain two or more specific words). Napster was shut down as a result of
legal proceedings instituted against the operators of the Napster service by
the owners of the copyright in some of the material (i.e., digitally encoded
music) that was made available on it. The architecture of Napster is shown
in the figure below.
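The centralized directory mechanism described above can be sketched as a toy server: peers register their shared file lists on login, queries go to the server, and downloads would then proceed directly between peers. The class, peer addresses, and filenames are all hypothetical.

```python
# Toy Napster-style central directory: the server only indexes which
# peers share which files; actual transfers happen peer-to-peer.
class DirectoryServer:
    def __init__(self):
        self.index = {}                       # filename -> set of peer addresses

    def login(self, peer, files):
        for f in files:
            self.index.setdefault(f, set()).add(peer)

    def logout(self, peer):
        for peers in self.index.values():
            peers.discard(peer)               # directory updated on logoff

    def search(self, words):
        """Partial-match query: every word must appear in the title."""
        return {f: p for f, p in self.index.items()
                if p and all(w in f for w in words)}

server = DirectoryServer()
server.login("10.0.0.1:6699", ["beatles - help.mp3", "lecture notes.pdf"])
server.login("10.0.0.2:6699", ["beatles - help.mp3"])
print(server.search(["beatles", "help"]))   # both peers offer the song
server.logout("10.0.0.1:6699")
print(server.search(["beatles", "help"]))   # only the remaining peer
```

Note how the single index makes partial-match search trivial, and equally how it is a single point of failure: if this object disappears, no peer can locate anything.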
6.5.1 Gnutella
Gnutella is a decentralized file-sharing system whose participants form a
virtual network and communicate peer-to-peer via the Gnutella protocol for
distributed file search. There is neither a centralized directory nor any precise
control over the network topology or object placement in such an architecture.
Participants self-organize into a virtual mesh network running in a P2P
fashion. To participate, a peer first connects to a known Gnutella host; in
other words, a node must first connect to a known Gnutella node to get a
list of existing Gnutella nodes for start-up.
Gnutella’s software architecture is markedly different from that of Napster.
Upon receiving a message, the servent decrements the time-to-live (TTL)
field of the message. If the TTL is greater than 0 and the servent has never
seen the identifier of the message (loop detection), it resends the message
to all known peers. The servent also checks whether it should respond to
the message. If it receives a Query, for example, it checks its local file store
and responds with a QueryHit if it can satisfy the request. Responses are
routed along the same path as the originating message. As discussed above,
Napster uses a centralized server architecture, where a user logs on to a
server or group of Napster owned servers, connecting that user to others.
In contrast, Gnutella directly connects users and eliminates the need for an
intermediary server. A user connects directly with another user, who then
connects them with other users, and so on, creating a virtually limitless web
of users spreading throughout the Internet. If a user has linked to you, they
have also linked to virtually everyone to whom you have linked, and vice
versa. Gnutella has the following properties: i. Scalability. The broadcasting
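The TTL-based flooding described above, where each servent decrements the TTL, remembers message identifiers it has already seen for loop detection, and answers from its local file store, can be sketched as a small simulation. The topology, stores, and function name below are hypothetical.

```python
# Sketch of Gnutella-style query flooding over an unstructured overlay:
# the query spreads to all neighbors, the TTL is decremented at each hop,
# and already-seen servents are skipped (loop detection). Servents whose
# local store matches would answer with a QueryHit routed back along the
# same path; here we just collect them.
def flood(topology, stores, start, keyword, ttl):
    """Return the set of servents holding `keyword`, reached within `ttl` hops."""
    seen = {start}                       # loop detection: servents already visited
    hits = set()
    frontier = [(start, ttl)]
    while frontier:
        node, ttl = frontier.pop()
        if keyword in stores.get(node, set()):
            hits.add(node)               # local store satisfies the Query
        if ttl == 0:
            continue                     # TTL expired: do not forward further
        for neighbor in topology[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, ttl - 1))
    return hits

topology = {"A": ["B"], "B": ["A", "C", "D"], "C": ["B"],
            "D": ["B", "E"], "E": ["D"]}
stores = {"C": {"song.mp3"}, "E": {"song.mp3"}}
print(flood(topology, stores, "A", "song.mp3", ttl=2))  # {'C'}; E is 3 hops away
```

Raising the TTL widens the search horizon at the cost of more traffic, which is exactly the scalability tension the flooding approach faces.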
6.5.2 BitTorrent
BitTorrent [www.bittorrent.com] is a popular peer-to-peer file-sharing
application designed particularly for downloading large files (including
video files). It is not intended for the real-time streaming of content but
rather for the initial downloading of files to be played back later. The
principal design feature in BitTorrent is the splitting of files into fixed-sized
chunks and the subsequent availability of chunks at various sites across
the peer-to-peer network. Clients can then download a number of chunks
in parallel from different sites, reducing the burden on any one particular
site to service the download (remembering that BitTorrent relies on the
capabilities of ordinary user machines and also that there may be many
simultaneous requests for popular files). This compares favorably to more
centralized strategies where a client would download a file from a server
using, for example, HTTP. In more detail, the BitTorrent protocol operates
as follows. When a file is made available in BitTorrent, a .torrent file is
created that holds metadata associated with that file including: i. the name
and length of the file; ii. the location of a tracker (specified as a URL), which
is a centralized server that manages downloads of that particular file; iii. a
checksum associated with each chunk, generated using the SHA-1 hashing
algorithm, that enables content to be verified following download. The use
of trackers is a compromise against pure peer-to-peer principles, but this
allows the system to easily maintain the above information in a centralized
manner. Trackers are responsible for keeping track of the download status
associated with a particular file. To understand the information held by the
tracker, it is necessary to stand back and consider the lifecycle of a given
file. Any peer with a complete version of a file (in terms of all its chunks)
is known as a seeder in BitTorrent terminology. For example, the peer that
initially creates the file will provide the initial seed for the file distribution.
Peers that want to download a file are known as leechers, and a given leecher
at any given time will contain a number of chunks associated with that
file. Once a leecher downloads all the chunks associated with a file, it can
become a seeder for subsequent downloads. In this way, files spread virally
through the network, with the spread stimulated by demand. Based on this,
the tracker maintains information about the current state of downloads of a
given file in terms of the associated seeders and leechers.
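The chunking and per-chunk SHA-1 checksums described above can be sketched directly. The metadata layout below is a simplification of a real .torrent file (which is bencoded), and the chunk size and tracker URL are illustrative only.

```python
# Sketch of BitTorrent-style metadata: split a file into fixed-size chunks,
# record a SHA-1 checksum per chunk, and verify chunks after download so
# content fetched from untrusted peers can be checked independently.
import hashlib

CHUNK_SIZE = 4   # tiny for illustration; real torrents use e.g. 256 KiB pieces

def make_metadata(name, data, tracker_url):
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    return {
        "name": name,
        "length": len(data),
        "tracker": tracker_url,   # hypothetical tracker URL
        "checksums": [hashlib.sha1(c).hexdigest() for c in chunks],
    }

def verify_chunk(metadata, index, chunk):
    """Check a downloaded chunk against the stored SHA-1 checksum."""
    return hashlib.sha1(chunk).hexdigest() == metadata["checksums"][index]

data = b"hello bittorrent"
meta = make_metadata("demo.bin", data, "http://tracker.example/announce")
print(verify_chunk(meta, 0, b"hell"))     # True: chunk intact
print(verify_chunk(meta, 0, b"hEll"))     # False: corrupted chunk rejected
```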
6.5.3 Freenet
Like Gnutella, Freenet is a peer-to-peer, decentralized system. Each user’s
computer is labeled a “node,” which stores and retrieves encrypted files that
can be opened by text strings called “keys.” The nodes pass the keys back
and forth in an attempt to find the encrypted file that the key will unlock.
Each node knows only its immediate neighbors, and it is extremely difficult
6.6.1 Chord
Chord is a protocol for looking up nodes by a key in a decentralized P2P
system. “The Chord protocol supports just one operation: given a key,
it maps the key onto a node. Depending on the application using Chord,
that node might be responsible for storing a value associated with the key.
Chord uses a variant of consistent hashing to assign keys to Chord nodes.
Consistent hashing tends to balance load, as each node receives roughly the
same number of keys, and involves relatively little movement of keys when
nodes join and leave the system” (Stoica et al., 2001). Each node can then
have some arbitrary value, but that is outside of the scope of the protocol. It
provides a single key operation, lookup(k), which provides the location of a
node responsible for that key k. It can thus be a key component of a distributed
hash table. It does this without using any special nodes, unlike previous peer-
to-peer systems such as Napster or DNS. DNS, for example, comes with a
set of special root servers pre-programmed. This makes the Chord protocol
more decentralized and resilient to failure. Chord uses consistent hashing,
which means the values of keys are evenly distributed despite nodes joining
and leaving the system. One key innovation is that the consistent hashing
in a distributed hash table doesn’t require a node to know about all other
nodes. Instead, each node only knows about a select set of other nodes. This
way of implementing routing means that it only takes on the order of O(log
n) operations to find the value you are looking for, despite nodes joining
and leaving. The network topology can be thought of as a ring of nodes. If
we look up a key at a node and it doesn’t have the value, the node can point
us in the right direction. It does this using a finger table which includes the
address of other nodes. By using some nifty modular arithmetic this ensures
we will always get closer to the actual value.
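The consistent-hashing idea above can be sketched concisely: node and key identifiers are hashed onto one ring, and a key belongs to its successor, the first node clockwise from it. The finger tables that make Chord's lookup O(log n) are omitted here for brevity; the ring size and node names are illustrative.

```python
# Sketch of Chord-style consistent hashing on a tiny 2^8 identifier ring:
# hash nodes and keys onto the same ring, then assign each key to its
# successor (the first node at or clockwise after the key's identifier).
import hashlib

RING_BITS = 8                       # tiny ring for illustration

def ident(name):
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** RING_BITS)

def successor(node_ids, key_id):
    """First node at or clockwise after key_id on the ring."""
    candidates = sorted(node_ids)
    for n in candidates:
        if n >= key_id:
            return n
    return candidates[0]            # wrap around the ring

nodes = {ident(f"node-{i}") for i in range(4)}
key = ident("some-file.txt")
print(sorted(nodes), key, successor(nodes, key))
```

Because identifiers are spread by the hash, adding or removing a node only moves the keys between that node and its neighbor, which is the load-balancing property the text attributes to consistent hashing.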
6.6.2 Tapestry
Tapestry is an extensible infrastructure that provides decentralized object
location and routing focusing on efficiency and minimizing message
latency. This is accomplished as Tapestry creates locally optimal routing
tables from initialization and preserves them to lessen routing stretch.
Furthermore, Tapestry allows object distribution determination according to
the needs of a given application. Similarly, Tapestry allows applications to
implement multicasting in the overlay network. Tapestry is an infrastructure
that provides decentralized object location and routing (DOLR). Similar to
Chord, Tapestry routes messages to nodes based on the keys assigned to
the messages. What distinguishes Tapestry from Chord is the introduction
of location mapping, which decreases the search time. Location mapping
is described in the following paragraphs. Tapestry has been implemented
6.6.3 Pastry
Pastry is an overlay and routing network for the implementation of a
distributed hash table (DHT) similar to Chord. The key-value pairs are
stored in a redundant peer-to-peer network of connected Internet hosts.
The protocol is bootstrapped by supplying it with the IP address of a peer
already in the network and from then on via the routing table which is
dynamically built and repaired. It is claimed that because of its redundant
and decentralized nature there is no single point of failure and any single
node can leave the network at any time without warning and with little or no
chance of data loss. The protocol is also capable of using a routing metric
supplied by an outside program, such as ping or traceroute, to determine the
best routes to store in its routing table. This P2P system was analyzed in a
publication written by Antony Rowstron and Peter Druschel that received the
ten-year best paper award from Middleware in 2011.
A Pastry system is a self-organizing overlay network of nodes, where
each node routes client requests and is capable of storing application-specific
objects. Any computer that is connected to the Internet and runs the Pastry
6.7.1 Oneswarm
OneSwarm is a new BitTorrent based application for distributing data
over the Internet. As already known, the BitTorrent protocol has some
weaknesses regarding security concerns, and OneSwarm was developed as
an effort to fix these problems. OneSwarm is what is called a F2F (friend-to-
friend) protocol which means that users only can connect to users with whom
134 Internet & Distributed Systems
they are friends with. This way a third party can’t connect to a network in
which it has no friends, and thus cannot see the traffic inside the network.
This is however not the only measure taken to ensure better privacy of
the transfers. All data is encrypted with the RSA encryption algorithm, so
that sniffing a connection won’t give away the content of the traffic either.
OneSwarm is a free application that helps you share your data or files in a
secure manner and with full control. When you install the application, you
will notice that it automatically opens your default browser on localhost
http://127.0.0.1:29615/. First, you invite your friend to the server by
clicking the invite button; OneSwarm will launch your local e-mail client
to send an invitation to the other user with a unique code. This code acts
as a one-time password that will be used to connect the node. OneSwarm is
illustrated in the figure below. Bob downloads public data using OneSwarm’s
backwards compatibility with existing BitTorrent implementations, and makes
the downloaded file available to other OneSwarm users. Alice downloads the
file from Bob without attribution using OneSwarm’s privacy-preserving
overlay, but she is then free to advertise the data to friends. Advertisements
include a cryptographic capability, which allows only permitted friends to
observe the file.
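The friend-to-friend restriction described above can be illustrated with a toy reachability check: data only moves along explicit friend links, so a peer with no friends in the swarm cannot reach or observe it. The graph and peer names are hypothetical, and this sketch ignores the encryption and capability mechanisms OneSwarm layers on top.

```python
# Toy illustration of F2F connectivity: a peer can only exchange data
# with peers reachable over friend links, so an outsider with no friend
# edges into the swarm sees nothing.
def reachable(friends, start):
    """All peers `start` can exchange data with over friend links."""
    seen, stack = set(), [start]
    while stack:
        peer = stack.pop()
        if peer in seen:
            continue
        seen.add(peer)
        stack.extend(friends.get(peer, []))
    return seen - {start}

friends = {
    "alice": ["bob"], "bob": ["alice", "carol"], "carol": ["bob"],
    "mallory": [],                     # no friend links into the swarm
}
print(reachable(friends, "alice"))     # {'bob', 'carol'} (order may vary)
print(reachable(friends, "mallory"))   # empty set: the outsider reaches no one
```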
REFERENCES
1. Adar, E., & Huberman, B. A. (2000). Free riding on Gnutella.
2. Bloom, B. (1970, July). Space/time trade-offs in hash coding with
allowable errors. CACM, 13(7), 422-426.
3. Druschel, P., & Rowstron, A. (2001a). Pastry: Scalable, decentralized
object location and routing for large-scale peer-to-peer systems.
Middleware.
4. Druschel, P. & Rowstron, A. (2001b May). PAST: A large-scale,
persistent peer-to-peer storage utility. Proceedings of the eighth
workshop on hot topics in operating systems (HotOS-VIII)(pp. 75-80).
Germany:Schloss Elmau.
5. Rohrs, C. (2001). Query routing for the gnutella network. Retrieved from
http://www.limewire.com/developer/query_ routing/keyword%20
routing.htm. dec
CHAPTER 7
CONTENTS
7.1 Concepts of Security.............................................................................................. 138
7.2 Kerberos................................................................................................................ 144
7.3 SSL Security........................................................................................................... 145
7.4 Firewall................................................................................................................. 145
7.5 Authorization......................................................................................................... 146
7.6 Auditing................................................................................................................ 148
7.7 Digital Signatures.................................................................................................. 148
7.8 Cryptography......................................................................................................... 152
References................................................................................................................... 161
7.1 CONCEPTS OF SECURITY
Security techniques are an important issue in distributed systems. Several
elements of distributed system security can be identified, such as authentication,
authorization, encryption, and system protection. In the early days, security
management was based on single-authority systems, but the focus has since
shifted to per-activity authorities and groups with shared responsibilities.
The general security attacks on distributed systems are eavesdropping,
masquerading, message tampering, message replay, and denial of service.
The trustworthiness of distributed systems is important in a number of
environments. Having discussed distributed system architecture in the
previous chapters, this chapter enumerates and analyzes the factors affecting
security in distributed systems and the various tools and frameworks used in
securing these systems.
7.1.1 Encryption
We have already identified the need for authentication, of both the client
and the server. Client authentication usually involves authentication of the
principal. Server authentication is important not only because of the threat
of active tapping (a Trojan horse) but also because many RPC binders only
offer hints as to where a service is located. If it crashes, it may be replaced by
another, totally different server, but the client needs to know of this mismatch
so that the binder may be consulted again. Cryptography can be used for
such purposes. A related term is message integrity, often called message
authentication. It is often important to be sure that a received message has
not been tampered with in transit. Sometimes there is no requirement for
secrecy of the message in transit; sometimes there is. A form of strong
checksum based on cryptographic techniques is appended to messages for
such purposes. This is sometimes called a digital signature or a message
authentication code (MAC).
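As a sketch of how such a message authentication code is computed and checked, using Python's standard hmac and hashlib modules (the key and message values are illustrative):

```python
import hashlib
import hmac

def make_mac(key: bytes, message: bytes) -> bytes:
    # The MAC is a keyed hash appended to the message; without the shared
    # key, an attacker cannot produce a valid tag for a tampered message.
    return hmac.new(key, message, hashlib.sha256).digest()

def verify_message(key: bytes, message: bytes, tag: bytes) -> bool:
    # compare_digest avoids timing side channels when checking the tag.
    return hmac.compare_digest(make_mac(key, message), tag)
```

Note that the message itself may travel in the clear; the MAC guarantees integrity, not secrecy.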
Encryption is widely used on the Internet to protect user information
being sent between a browser and a server, including passwords, payment
information and other personal information that should be considered
private. Organizations and individuals also commonly use encryption to
protect sensitive data stored on computers, servers and mobile devices like
phones or tablets.
Unencrypted data, often referred to as plaintext, is encrypted using an
encryption algorithm and an encryption key. This process generates
ciphertext that can only be viewed in its original form if decrypted with the
correct key. Decryption is simply the inverse of encryption, following the
same steps but reversing the order in which the keys are applied. Today's most
widely used encryption algorithms fall into two categories: symmetric and
asymmetric. These two categories are explained succinctly later in this
chapter. The primary purpose of encryption is to protect the confidentiality
of digital data stored on computer systems or transmitted via the Internet
or any other computer network. A number of organizations and standards
bodies either recommend or require sensitive data to be encrypted to
prevent unauthorized third parties or threat actors from accessing the data.
For example, the Payment Card Industry Data Security Standard requires
merchants to encrypt customers’ payment card data when it is both stored at
rest and transmitted across public networks.
7.1.2 Authentication
There are several steps necessary in providing authentication in a distributed
environment. The first step is authenticating or verifying the identity of a
user. There are three basic methods for accomplishing this. First, one may
verify a user by something that is known by the user such as a password, the
most popular method although not necessarily the most secure. The second
method involves something that is possessed by a user, such as a key. The
Security and Privacy 141
third method involves something the user is, such as the user’s fingerprints
or retina patterns, the most secure and expensive method. Observe that
up until this point, these methods are all possible to use on a centralized
system as well. A distributed operating system must also accommodate the
following concerns.
• Eavesdropping: How do we prevent someone from eavesdropping
on the communication line?
• Multiple password management: If we are accessing multiple
systems, does every system keep a copy of the user ID and
password? Every database, storing authentication information,
is a target for a system security loophole. Furthermore, do we
need to present our password every time we want to accomplish
something?
• Replay: Someone could copy authentication information while it
is transmitted on the network, even if it is encrypted, and replay it
at a later time, thus allowing inappropriate access.
• Trust: Should authentication be unilateral, or should the user also
be able to verify and have confidence that the service being utilized
is legitimate? A centralized system trusts itself; a distributed
system has to find a way to trust others.
A common solution to these concerns is to use a certificate
management system. A certificate is a computer-generated, frequently time-
valid, authenticated packet of information utilized in distributed systems
to allow easy access to the various resources. The time-valid feature helps
prevent replay at a later time. This may be accomplished with a timestamp
or with a nonce. A nonce is a random value unique to each communication.
Thus, no two communications share a nonce, and replays are easy to
detect. There are two basic approaches to certificate management. The first
approach involves utilizing certificate lists that contain a list of certificates
from a trusted certificate authority, and is exemplified by X.509. Any service
wishing to verify an identity must examine a list to verify the authenticity of
the client. The second approach involves a centralized certificate distribution
center where clients obtain a certificate for each service they wish to use.
The client then presents this service certificate when utilizing a service.
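The replay-detection role of nonces can be sketched in a few lines (the names are illustrative; a real system would also bound the set of remembered nonces, for example with the time-valid window described above):

```python
import secrets

# Nonces already seen; a repeated nonce signals a replayed message.
seen_nonces = set()

def fresh_nonce() -> str:
    # A random value unique to each communication.
    return secrets.token_hex(16)

def accept(nonce: str) -> bool:
    # Reject any nonce we have seen before; remember new ones.
    if nonce in seen_nonces:
        return False
    seen_nonces.add(nonce)
    return True
```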
Source: http://csis.pace.edu/~marchese/CS865/Lectures/Chap9/Chap9New/Chapter9.html
message that contains the secret shared key K_A,B that A can use. Moreover,
the KDC also sends the shared key K_A,B to B, encrypted with the secret
key K_B,KDC. The Needham–Schroeder authentication protocol is designed
based on this model.
i. The KDC shares a secret key with each of the hosts.
ii. No pair of hosts is required to have a shared secret key.
iii. Using a KDC requires that we manage N keys instead of N (N - 1)/2.
iv. If Alice wants to set up a secure channel with Bob, she can do so with
the help of a (trusted) KDC.
v. The KDC hands out a key to both Alice and Bob that they can use
for communication.
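Point iii can be checked directly: with a KDC, each of N hosts needs only the one key it shares with the KDC, instead of one key for every pair of hosts:

```python
# Number of keys to manage with and without a key distribution center (KDC).
def keys_with_kdc(n: int) -> int:
    return n                    # one key per host, shared with the KDC

def keys_pairwise(n: int) -> int:
    return n * (n - 1) // 2     # one key per pair of hosts
```

For 100 hosts this means managing 100 keys instead of 4,950.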
7.2 KERBEROS
The Kerberos system is a widely used implementation of secure
communication channels, based on the DES encryption scheme (see Schiller,
Steiner et al.). Integrated into the DCE environment, Kerberos is quite
popular in the UNIX community. The approach genuinely offers a major
improvement in security over that which is traditionally available within
UNIX. Its primary limitations are, first, that SSL security dominates in the
Internet and Web, making Kerberos increasingly “non-standard.” Secondly,
applications using Kerberos must be modified to create communication
channels using the Kerberos secure channel facilities. Although this may
seem to be a minor point, it represents a surprisingly serious one for
potential Kerberos users, as application software using Kerberos is not yet
common. Nonetheless, Kerberos has had some important successes over the
years. Kerberos per se is perhaps less important than the basic ideas it
illustrates. The basic Kerberos protocols revolve around the use of a trusted
authentication server, which creates session keys between clients and servers
upon demand. The basic scheme is as follows. At the time the user logs in,
he or she presents a name and password to a log-in agent, which runs in
a trusted mode on the user’s machine, and establishes a secured channel
to the Kerberos authentication server at boot time. Having logged in, the
user is able to connect securely to various applications using Kerberos as an
intermediary. The role of Kerberos is to mediate during connection setup,
7.4 FIREWALL
So far, we have shown how protection can be established using cryptographic
techniques, combined with some implementation of an access control
matrix. These approaches work fine as long as all communicating parties
play according to the same set of rules. Such rules may be enforced when
developing a standalone distributed system that is isolated from the rest of
the world. However, matters become more complicated when outsiders are
allowed to access the resources controlled by a distributed system. Examples
of such accesses include sending mail, downloading files, uploading tax
forms, and so on. To protect resources under these circumstances, a different
approach is needed. In practice, what happens is that external access to any
part of a distributed system is controlled by a special kind of reference
monitor known as a firewall (Cheswick and Bellovin, 2000; and Zwicky
et al., 2000). Essentially, a firewall disconnects any part of a distributed
system from the outside world. All outgoing, but especially all incoming
packets are routed through a special computer and inspected before they are
passed. Unauthorized traffic is discarded and not allowed to continue. An
important issue is that the firewall itself should be heavily protected against
any kind of security threat: it should never fail.
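The inspect-and-discard behavior can be sketched as an ordered rule list through which every packet is routed (the rule fields and port choices below are illustrative, not a real firewall configuration):

```python
# Ordered firewall rules: the first matching rule decides the packet's fate.
# A dst_port of None acts as a wildcard (the default rule).
RULES = [
    {"action": "allow", "dst_port": 25},    # incoming mail
    {"action": "allow", "dst_port": 443},   # HTTPS uploads/downloads
    {"action": "deny",  "dst_port": None},  # default: discard everything else
]

def filter_packet(packet: dict) -> bool:
    # Return True if the packet may pass, False if it is discarded.
    for rule in RULES:
        if rule["dst_port"] is None or rule["dst_port"] == packet["dst_port"]:
            return rule["action"] == "allow"
    return False
```

The trailing deny-all rule reflects the text's point that unauthorized traffic is discarded rather than passed by default.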
Source: https://docs.microsoft.com/en-us/azure/application-gateway/application-gateway-web-application-firewall-overview
7.5 AUTHORIZATION
Authorization is the process of determining whether an already identified and
authenticated user is allowed to access information resources in a specific
way. Authorization is often the responsibility of the service providing access
to a resource. For example, if a user tries to access a file that resides on a file
server, it is the responsibility of the file service to determine whether
the user will be allowed this type of access. Authorization can provide
granular control and may distinguish between operations such as reading
or writing to a file, deleting a file, launching an executable file, etc.
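A sketch of such granular, per-operation authorization for a file service (the user names, file names, and access-control-list structure are illustrative):

```python
# Access-control list: permitted operations per (user, file) pair.
ACL = {
    ("alice", "report.txt"): {"read", "write", "delete"},
    ("bob",   "report.txt"): {"read"},
}

def authorized(user: str, path: str, operation: str) -> bool:
    # The file service checks the ACL for this specific operation;
    # anything not explicitly granted is denied.
    return operation in ACL.get((user, path), set())
```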
7.6 AUDITING
Auditing is a never-ending process. An auditing system involves two parts:
logging, and auditing, i.e., analyzing and notifying. Logging is the collection
of data and information according to system or configuration parameters.
The logs should note who has entered the system as well as what they have
done. Auditing consists of analysis, i.e., the study of information received
from the log entries, and notification, which is the reporting of the results
of that analysis. An analyzer receives and analyzes recorded data to detect
some unwanted or forbidden event or problem, i.e., to detect violations of
the security policy. The result of the analysis is sent by a notifier as a notification
or alarm to the responsible entity, usually the system administrator. Logs must
contain the security levels of the involved subjects and objects, the action and
constraints, as well as the results, i.e., success or failure [3]. The auditing system
can be configured to record the identity of the subjects and objects as well, and
connect those to the security level, action, and result. Some existing types of
auditing are NFS auditing and the logging and auditing file system. A logging and
auditing file system records user-level actions taken on files
in a human-readable format and allows a system administrator to automate
checks for violations of the security policy; it reports all violations of
the security policy. Traditional logging and auditing file systems remove
important information, such as facility and level, from the audit logs, which can
negatively affect log analysis.
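The two halves described above, logging and auditing (analysis plus notification), can be sketched as follows (the "failed action" policy used by the analyzer is a stand-in for real policy checks, and all names are illustrative):

```python
# Logging: record who did what, and whether it succeeded.
log_entries = []

def log(user: str, action: str, success: bool) -> None:
    log_entries.append({"user": user, "action": action, "success": success})

def analyze(entries):
    # Auditing: study the log entries and collect policy violations.
    # Here the "policy" simply flags every failed action.
    return [e for e in entries if not e["success"]]

def notify(violations) -> list[str]:
    # Notification: report the analysis results to the administrator.
    return [f"ALERT: {v['user']} failed {v['action']!r}" for v in violations]
```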
discussion of them for now.) Formally, a signature scheme, like a public key
encryption scheme, has a key-pair generation function that, given a random
input R, will return two keys, σ_R (the private signing key) and V_R (the
public signature verification key), with the properties that
i. Given the public signature verification key V_R, it is infeasible to
compute the private signing key σ_R.
ii. There is a digital signature function that, given a message M and
a private signing key σ_R, will produce a signature Sig_σR(M).
iii. There is a signature verification function that, given the signature
Sig_σR(M) and the public signature verification key V_R, will
output TRUE if the signature was computed correctly with σ_R;
otherwise, it will output FALSE.
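The three properties can be illustrated with a deliberately tiny RSA-style scheme. The primes, exponents, and function names below are illustrative only; real systems use keys of thousands of bits and a vetted cryptographic library:

```python
import hashlib

def keygen():
    # Fixed toy primes for illustration only; NOT secure.
    p, q = 61, 53
    n = p * q                    # modulus, part of both keys
    phi = (p - 1) * (q - 1)
    e = 17                       # public exponent, coprime with phi
    d = pow(e, -1, phi)          # private exponent (modular inverse of e)
    return (d, n), (e, n)        # (signing key sigma_R, verification key V_R)

def digest(msg: bytes, n: int) -> int:
    # Hash the message and reduce it into the modulus range.
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % n

def sign(signing_key, msg: bytes) -> int:
    d, n = signing_key
    return pow(digest(msg, n), d, n)      # Sig_sigmaR(M)

def verify(verification_key, msg: bytes, sig: int) -> bool:
    e, n = verification_key
    return pow(sig, e, n) == digest(msg, n)
```

Note that the message digest, not the message itself, is signed; the reasons for this are given below.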
Just as handwritten signatures or physical thumbprints are commonly
used to uniquely identify people for legal proceedings or transactions, so
digital signatures (“digital thumbprints”) are commonly used to identify
electronic entities for online transactions. A digital signature uniquely
identifies the originator of digitally signed data and also ensures the integrity
of the signed data against tampering or corruption.
One possible method for creating a digital signature is for the originator of
data to create the signature by encrypting all of the data with the originator’s
private key and enclosing the signature with the original data. Anyone
with the originator’s public key can decrypt the signature and compare the
decrypted message to the original message. Because only someone with the
private key can create the signature, the integrity of the message is verified
when the decrypted message matches the original. If an intruder alters the
original message during transit, the intruder cannot also create a new valid
signature. If an intruder alters the signature during transit, the signature does
not verify properly and is invalid.
However, encrypting all data to provide a digital signature is impractical
for three reasons:
• The ciphertext signature is the same size as the corresponding
plaintext, so message sizes are doubled, consuming large amounts
of bandwidth and storage space.
• Public key encryption is slow and places heavy computational
loads on computer processors, so network and computer
performance can be significantly degraded.
• Encrypting the entire contents of information produces large
Figure 7.5: Basic RSA Data Security Digital Signature Process. Source: https://
technet.microsoft.com/en-us/library/cc962021.aspx
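The first objection is usually addressed by signing a fixed-size digest of the message rather than the message itself. A quick sketch with Python's standard hashlib shows that the digest size is independent of the message size:

```python
import hashlib

# Digests of a 2-byte and a 10 MB message: both 256 bits (64 hex digits),
# so the signature over the digest stays small regardless of message size.
short_digest = hashlib.sha256(b"hi").hexdigest()
long_digest = hashlib.sha256(b"x" * 10_000_000).hexdigest()
```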
7.8 CRYPTOGRAPHY
Encryption is the process of encoding a message in such a way as to hide
its contents. Modern cryptography includes several secure algorithms
for encrypting and decrypting messages. They are all based on the use of
secrets called keys. A cryptographic key is a parameter used in an encryption
algorithm in such a way that the encryption cannot be reversed without
knowledge of the key.
There are two main classes of encryption algorithm in general use.
The first uses shared secret keys: the sender and the recipient must share
knowledge of the key, and it must not be revealed to anyone else. The second
class of encryption algorithms uses public/private key pairs. Here the sender
of a message uses a public key, one that has already been published by
the recipient, to encrypt the message. The recipient uses a corresponding
private key to decrypt the message. Although many principals may examine
the public key, only the recipient can decrypt the message, because they
have the private key.
Both classes of encryption algorithm are extremely useful and are used
widely in the construction of secure distributed systems.
Source: http://www.nucrypt.net/overview_encryption.html
private key must be kept secret. The RSA encryption algorithm is the most
widely used public key algorithm, partly because both the public and the
private keys can encrypt a message; the opposite key from the one used to
encrypt a message is used to decrypt it. This attribute provides a method of
assuring not only confidentiality but also the integrity, authenticity, and non-
repudiation of electronic communications and data at rest through the use of
digital signatures.
Source: https://blockgeeks.com/guides/cryptocurrencies-cryptography/
A message is encrypted by the sender applying some rule to transform
the plaintext message (any sequence of bits) to a ciphertext (a different
sequence of bits). The recipient must know the inverse rule to transform
the ciphertext back into the original plaintext. Other principals are unable to
decipher the message unless they also know the inverse rule. The encryption
transformation is defined with two parts: a function E and a key K. The
resulting encrypted message is written as {M}K:
E(K, M) = {M}K
The encryption function E defines an algorithm that transforms data
items in plaintext into encrypted data items by combining them with the
key and transposing them in a manner that is heavily dependent on the value
of the key. We can think of an encryption algorithm as the specification of
a large family of functions from which a particular member is selected by
any given key. Decryption is carried out using an inverse function D, which
also takes a key as a parameter. For secret-key encryption, the key used for
decryption is the same as that used for encryption:
D(K, E(K, M)) = M
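As a sketch of these definitions, here is a deliberately insecure toy cipher (XOR with a repeating key, for illustration only) that satisfies D(K, E(K, M)) = M:

```python
from itertools import cycle

def E(key: bytes, message: bytes) -> bytes:
    # Combine each plaintext byte with a key byte; the transformation
    # depends entirely on the value of the key.
    return bytes(m ^ k for m, k in zip(message, cycle(key)))

# XOR is its own inverse, so for this toy cipher the decryption function D
# is the same function applied with the same key.
D = E
```

A real secret-key algorithm such as AES uses far more elaborate combining and transposition steps, but the E/D/key structure is the same.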
1. MD5 Hash Algorithm
The MD5 (1992) message-digest algorithm was designed as a strengthened
extension of the MD4 (1990) message-digest algorithm. MD5 is slightly
slower than MD4. Both algorithms were developed by Ron Rivest. The
algorithm accepts an input message of arbitrary length and produces a 128-
bit "message digest," "fingerprint," or "hash result." The following figure
shows the way the input message is turned into a 128-bit message digest.
The MD5 message-digest algorithm is a widely used cryptographic hash
function that produces a 128-bit (16-byte) hash value. It is very simple and
straightforward; the basic idea is to map data sets of variable length to
data sets of a fixed length. The hash generated by MD5 is typically rendered
as a 32-digit hexadecimal number. To do this, the input message is split into
chunks of 512-bit blocks. These blocks are then processed by the MD5
algorithm, which operates on a 128-bit state, and the result is a 128-bit
hash value. Here, the data to be encoded is often called the "message" and
the generated hash value is called the message digest or "digest." MD5 is very
fast, and easy to learn and implement, but it generates fairly weak hashes
that are not very secure. As we know, storing plaintext passwords is one of
the riskiest practices in application security today. MD5 can be
made more secure using salt: salt is some randomly generated text, which is
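A minimal sketch of salted hashing with Python's standard hashlib and secrets modules (the function names are illustrative, and MD5 appears here only because the text discusses it; modern systems should prefer a dedicated password-hashing function):

```python
import hashlib
import secrets

def md5_hex(data: bytes) -> str:
    # 128-bit digest rendered as a 32-digit hexadecimal number.
    return hashlib.md5(data).hexdigest()

def salted_hash(password: str) -> tuple[str, str]:
    # Random salt, stored alongside the hash; identical passwords now
    # produce different stored hashes.
    salt = secrets.token_hex(8)
    return salt, md5_hex(salt.encode() + password.encode())
```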
REFERENCES
1. Pethe, H. B., & Pande, S. R. (2016). An overview of cryptographic hash
functions MD-5 and SHA. National Conference on Recent Trends in
Computer Science and Information Technology (NCRTCSIT-2016), 38.
Retrieved January 2018.
2. Goldman, J. (2006). Firewall basics. In H. Bidgoli (Ed.), Handbook
of information security (pp. 2-14). Hoboken: John Wiley & Sons, Inc.
3. Hamdi, M., Doudriga, N., & Obaidat, M. (2006). Security policy
guidelines. In H. Bidgoli (Ed.), Handbook of information security (pp.
227-241). Hoboken: John Wiley & Sons, Inc.
4. Kakareka, A. (2009). What is vulnerability assessment? In J. Vacca
(Ed.), Computer and information security handbook (pp. 383-393).
Boston: Morgan Kaufmann Inc.
5. Stallings, W. (1999). Cryptography and network security. Upper Saddle
River, NJ: Prentice Hall.
6. Stallings, W. (2011). Operating systems security. In Handbook of
information security (pp. 154-163).
CHAPTER 8
CONTENTS
8.1 Communication Technology.................................................................................. 164
8.2 Traditional System of Distribution.......................................................................... 164
8.3 Nomadic Distribution System................................................................................ 168
8.4 Mobile Adhoc Network......................................................................................... 169
8.5 Theory in Mobile Computing................................................................................. 171
8.6 Models.................................................................................................................. 172
8.7 Palmtop and Laptop Technology............................................................................ 174
8.8 Mobile Middleware............................................................................................... 175
8.9 Ubiquitous Computing.......................................................................................... 181
References................................................................................................................... 185
1. Fault-Tolerant
Distributed systems consist of a large number of software and hardware
components that, over a long period of time, are bound to fail. Such
component failures can result in a loss of service. A fault-tolerant system
therefore aims to detect and remove faulty components and to continue
providing service even in the presence of faults; the system is considered
fault-tolerant if it can mask the presence of failures. The goal of any
fault-tolerance technique is to increase reliability or availability.
Reliability of a system is defined as the probability that the system
survives up to a given time. A reliable system prevents loss of
information, even in the event of component failures. Availability is the
fraction of time during which the system is usable. In general, fault
tolerance is achieved by providing redundancy. Redundancy is defined as those
parts of the system that are not essential for its correct functioning. There
are three types: hardware, software, and time redundancy. Hardware redundancy
is achieved by adding extra hardware to the system, which takes over the role
of a failed component when an error occurs.
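Availability can be quantified; as an illustrative sketch (the MTBF/MTTR metrics below are standard in the fault-tolerance literature, though not named in the text), it is the fraction of time the system is operational:

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    # Fraction of time the system is up: mean time between failures
    # divided by the full failure-and-repair cycle.
    return mtbf_hours / (mtbf_hours + mttr_hours)
```

For example, a system that fails on average every 999 hours and takes 1 hour to repair is available 99.9% of the time.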
2. Scalable
A distributed system should work correctly even when some aspects of the
system are scaled up. Scale has three components:
the number of users and other objects that are part of the system, the
distance between the most remote nodes in the system, and the number
of organizations that exercise administrative control over parts of the
system. These three elements affect the distributed system in many ways.
Among the components affected are naming, authentication, authorization,
communication, remote access, and the mechanisms users employ to view
the system. Three techniques for handling scale are commonly used: replication,
distribution, and caching (Clifford N., 1994). Replication creates multiple copies
of the resource. Replication of name services and authentication services
helps reduce the load on individual servers and improves the reliability and
availability of services in general. The two main issues with replication are
the placement of the replicas and the mechanism by which they are kept
consistent.
Distribution is another mechanism that allows the information maintained
by a distributed service to be spread across multiple servers. Distributing
data among multiple servers reduces the size of the database that must be
maintained by each server, reducing the time needed to search the database.
Distribution also spreads the load across the hosting servers by reducing
the number of queries each of them processes. If requests can be
distributed among servers in proportion to their capacity, the load on each
server can be effectively controlled. Caching is another important technique
for building scalable systems. Caching reduces server and network load, and
cached data can be accessed faster when new requests arrive. The difference
between replication and caching is that cached data is short-lived. Instead of
propagating updates to caches, consistency is maintained by discarding
cached data when it becomes stale. Caching is generally performed by clients,
reducing the need for repeated requests to the service over the network.
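The caching behavior described above, where stale entries are discarded rather than updated, can be sketched as a small client-side cache with a time-to-live (all names are illustrative):

```python
import time

class TTLCache:
    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self.store = {}                     # key -> (value, time cached)

    def put(self, key, value) -> None:
        self.store[key] = (value, time.monotonic())

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None                     # miss: caller asks the server
        value, cached_at = entry
        if time.monotonic() - cached_at > self.ttl:
            del self.store[key]             # stale: discard, don't update
            return None
        return value
```

Every hit served from the cache is a request the server and network never see, which is exactly the scalability benefit the text describes.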
3. Predictive Performance
Several metrics are used to evaluate performance, such as response time
(the time elapsed between the end of a request to a computer system and
the start of its answer), throughput (the rate at which a network sends
and receives data), network capacity, and so on. Predictable performance
is the ability to provide the desired response in time.
4. Openness
The openness attribute ensures that a subsystem is always open to
interaction with other systems. Web services, for example, are protocols
designed to support machine-to-machine interaction over the web.
Such protocols allow a distributed system to be extended and scaled.
A scalable, open system has advantages over closed and self-contained
systems. Openness is achieved by making the distributed system independent
of heterogeneous underlying factors such as hardware platforms and
software. Thus, each service can be accessed equally by each client (local
or remote) in the system.
Mobile and Ubiquitous Computing 167
5. Security
The system must allow for the sharing of communication between the
applications in use and/or other computing resources while implementing
the necessary security measures. Security features are designed primarily
to ensure confidentiality, integrity, and availability. Confidentiality is
protection against disclosure to unauthorized persons; the consequences of
such disclosure can range from embarrassing to damaging. Integrity provides
protection against alteration and corruption. Availability keeps resources
accessible. Many cases of hacking compromise the integrity of databases and
other resources, and denial-of-service attacks are attacks against availability.
Other key security issues include access control and nonrepudiation.
6. Transparency
A distributed system should be perceived by users and application developers
as a single whole rather than as a collection of cooperating components. The
locations of the computers involved in an operation, the replication of data,
the use of resources at multiple sites, and recovery from partial failures are
hidden from the user. Transparency hides the distributed nature of the system
from its users and makes the system appear and behave like a centralized
system. Transparency can be applied in different ways in distributed systems
(Kazi et al., 2000; Pradeep, K., 2002).
Transparency    Description
Access          Hide differences in data representation and how a resource is accessed
Location        Hide where a resource is located
Migration       Hide that a resource may be moved to another location
Relocation      Hide that a resource may be moved to another location while in use
Replication     Hide that a resource is replicated
Concurrency     Hide that a resource may be shared by several competitive users
Failure         Hide the failure and recovery of a resource
Source: Andrew S. Tanenbaum & Maarten van Steen, 2002. Distributed Systems: Principles and Paradigms
8.4.3 Scalability
The scalability problem occurs in ad hoc networks due to their multi-hop
nature (Wattenhofer et al., 2001). Scalability in an ad hoc network
depends on the network size and the packet-forwarding capacity of the network.
Scalability can be preserved by dividing the network area according to
geographical location: the whole network area can be divided into
small virtual grid cells such that every node in each virtual grid cell can
communicate with the other nodes in the same cell. Scalability also depends on
equal node priority, uniform distribution of destinations, reliable node
placement, and shortest-path routing.
8.4.5 Security
Security is a major concern in ad hoc networking standards. Exchange of
data on an ad hoc network should be done safely. Ad hoc network security
issues arise from dynamic topologies, mobility, device size, and limited battery
life. Due to this dynamic nature, it is difficult to maintain security across
the network. Ad hoc networks do not rely on existing infrastructure, so nodes
can leave and join the network in ways that may weaken security. There are
two types of attacks on ad hoc networks. The first is the passive attack,
which does not alter the data in the network but can allow unauthorized
users to read messages. The second is the active attack; it is more serious
and disrupts or modifies the flow of messages between the nodes on the
network, allowing unauthorized users to edit messages.
term multipath fading is caused by the same signal taking different paths and
arriving at the receiver shifted in phase. On the other hand, long-term fading,
or radio shadow, is caused by the topography of the terrain (such as mountains)
and can lead to signal dropouts. In wireless environments, management of
transmission power is critical. The importance of power arises from two
factors: i. Energy is a scarce resource in mobile units. Therefore, transmission
with power, as low as feasible, is highly desirable. ii. Signals from other
terminals are considered by a given terminal as interference, which has a
negative effect on the signal-to-noise ratio. Thus, it is important that each
terminal use the “right” amount of power to transmit successfully without
unnecessary degradation of others’ transmissions.
8.6 MODELS
It is important to develop a good understanding of mobile system models
and their unique research issues. There are many network topologies
for mobile systems: for example, cellular networks, mobile ad
hoc networks (MANETs), body area networks (BANs), and WLANs. It is
important to understand, capture, and describe the features of each of these
models and to identify the common architectures that combine these
basic patterns into more sophisticated ones. WLANs, LANs, PANs, and
MANs were already discussed in the first chapter of this book (see chapter one,
Introduction to the Internet).
cluster. Analogously, in the 12-group frequency plan each cell uses only 1/12
of the total bandwidth and its bandwidth allocation is disjoint with that of
11 neighboring cells. The Cellular Digital Packet Data (CDPD) is designed
to provide packet data services on the top of the existing AMPS. The CDPD
system provides data services without interfering with the existing analog
and digital telephone services that use the same AMPS channels. In CDPD,
voice transmission has a higher priority than data transmission. Data can be
transmitted on a channel only if there is no voice call using that channel,
and voice transmission can preempt data transmission.
disk completely in favor of flash memory) and offer CPUs with an energy-
efficient doze mode. For example, the Hobbit chip consumes 5,000 times less
energy while in doze mode than in the active mode (250 mW in the active
mode as opposed to 50 μW in doze mode). There is growing pressure on
software vendors to incorporate energy management features.
including those presented above. To achieve this goal, Lime borrows and
adapts the communication model made popular by Linda. In Linda, processes
communicate through a shared tuple space, a repository of elementary
data structures, called tuples that can be accessed concurrently by several
processes. Each tuple is an ordered sequence of typed data. Tuples are
inserted using the out(t) operation on the tuple space, and can be removed
by executing in(p), where p is a template used to identify tuples based on
pattern matching against their content. Tuples can also be read from the
tuple space using the rd operation. Both in and rd are blocking. A typical
extension to this synchronous model is the provision of the asynchronous
primitives inp and rdp, called probes, that allow non-blocking access to
the tuple space. Linda characteristics resonate well with the mobile setting.
Communication in Linda is decoupled in time and space, i.e., senders
and receivers do not need to be available at the same time, and mutual
knowledge of their location is not necessary for data exchange. Decoupling
is of paramount importance in mobility, where the parties involved in
communication change dynamically due to their migration. Nevertheless,
when mobility is fully exploited, as with ad hoc networks, there is no
predefined, static, global context for the computation, as assumed by Linda.
Rather, the current global context is defined by the transient community of
mobile units that are currently present, to which each unit is contributing its
own individual context. As these communities are dynamically changing
according to connectivity and migration, the context changes as well. This
observation alone leads to the model underlying Lime. Although still based
on the Linda notion of a tuple space, Lime exploits it in a radically different
way. Lime is fully implemented in Java, with support for version 1.1 and
higher. Communication is handled entirely at the socket level—no support
for RMI or other additional communication mechanisms is needed or
exploited in Lime. The lime package is about 5,000 non-commented source
statements, for about 100 Kbyte of jar file. The companion lighTS package
provides a lightweight tuple space implementation plus an adapter layer
integrating other tuple space engines, for an additional 20 Kbyte. Thus far,
Lime has been tested successfully on mobile hosts running Windows9x/NT/
CE networked with WaveLAN wireless technology.
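To make the primitives above concrete, here is a minimal, illustrative tuple space in Java (the language Lime itself is implemented in). This is a sketch of the Linda model only, not Lime's actual API; the class and method names are our own, and templates use null as a wildcard.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal Linda-style tuple space sketch (not the real Lime API).
// A tuple is an ordered sequence of values; a template may contain
// nulls, which act as wildcards during pattern matching.
public class TupleSpace {
    private final List<Object[]> tuples = new ArrayList<>();

    // out(t): insert a tuple into the space.
    public synchronized void out(Object... tuple) {
        tuples.add(tuple);
        notifyAll(); // wake any callers blocked in in()/rd()
    }

    // in(p): remove and return a matching tuple; blocks until one exists.
    public synchronized Object[] in(Object... template) throws InterruptedException {
        while (true) {
            Object[] t = find(template);
            if (t != null) { tuples.remove(t); return t; }
            wait();
        }
    }

    // rd(p): return a matching tuple without removing it; blocks.
    public synchronized Object[] rd(Object... template) throws InterruptedException {
        while (true) {
            Object[] t = find(template);
            if (t != null) return t;
            wait();
        }
    }

    // inp(p): non-blocking probe; returns null if nothing matches.
    public synchronized Object[] inp(Object... template) {
        Object[] t = find(template);
        if (t != null) tuples.remove(t);
        return t;
    }

    private Object[] find(Object[] template) {
        for (Object[] t : tuples) {
            if (t.length != template.length) continue;
            boolean match = true;
            for (int i = 0; i < t.length; i++) {
                if (template[i] != null && !template[i].equals(t[i])) {
                    match = false;
                    break;
                }
            }
            if (match) return t;
        }
        return null;
    }
}
```

For example, out("temp", 21) followed by in("temp", null) removes and returns the tuple; the null field matches any value, illustrating Linda's content-based, space- and time-decoupled communication.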
Bayou
The Bayou storage system provides an infrastructure for collaborative
applications. Bayou manages conflicts introduced by concurrent activity
while relying only on the fluctuating connectivity available in mobile
computing. Replication is seen as a requirement in the mobile scenario as
a single storage site may not be reachable by some mobile clients or within
disconnected work-groups. Bayou allows arbitrary read and write operations
to any replica without explicit coordination with the other replicas: every
computer eventually receives updates from every other, either directly or
indirectly, through a chain of peer interactions. The weak consistency of
the replicated data is not transparent to applications; instead, they are aware
180 Internet & Distributed Systems
they may be using weakly consistent data and that their write operations may
conflict with those of other users and applications. Moreover, applications
are involved in the detection and resolution of conflicts as these naturally
depend on the semantics of the application. In particular, the system provides
the application with ways of specifying its own notion of a conflict, along with
its policy for resolving it. In return, the system implements the mechanisms
for reliable detection of conflicts, as specified by the application, and for
automatic resolution when possible. However, one of the major drawbacks
of Bayou is its client-server architecture. Although in principle client and
server may co-exist on a host in practice, the system requires that each data
collection is replicated in full on a number of servers.
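The division of labor described above, where the application defines what a conflict is and how to resolve it while the system applies those rules mechanically, can be sketched as follows. This is an illustrative Java sketch in the spirit of Bayou, not Bayou's actual API; all names here are our own.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

// Illustrative sketch of Bayou-style application-defined conflict
// handling. A write carries an application-supplied dependency check
// (the app's notion of a conflict) and a merge procedure (the app's
// resolution policy); the replica applies them mechanically.
public class BayouWrite {
    final Map<String, String> update;                                    // the tentative update
    final Predicate<Map<String, String>> dependencyCheck;                // app-defined conflict test
    final Function<Map<String, String>, Map<String, String>> mergeProc;  // app-defined resolver

    BayouWrite(Map<String, String> update,
               Predicate<Map<String, String>> dependencyCheck,
               Function<Map<String, String>, Map<String, String>> mergeProc) {
        this.update = update;
        this.dependencyCheck = dependencyCheck;
        this.mergeProc = mergeProc;
    }

    // Apply the write to a replica's state, resolving a conflict if the
    // application's dependency check fails.
    public void applyTo(Map<String, String> replicaState) {
        if (dependencyCheck.test(replicaState)) {
            replicaState.putAll(update);                          // no conflict: apply as-is
        } else {
            replicaState.putAll(mergeProc.apply(replicaState));   // conflict: app resolves
        }
    }
}
```

For example, a meeting-room reservation write might check "room A is still free" and, when that check fails at some replica, have its merge procedure book room B instead.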
Hermes
A primary feature of the Hermes event-based middleware is scalability, as
it is targeted at the development of large-scale distributed systems. Hermes
includes two content-based routing algorithms to disseminate events from
event publishers to subscribers. The type-based routing algorithm only
supports subscriptions depending on the event type of event publications.
It is comparable to a topic-based publish/subscribe service but differs by
observing inheritance relationships between event types. The second
algorithm is type- and attribute-based routing, which extends type-based
Mobile and Ubiquitous Computing 181
applications and systems is that they involve devices with limited resources.
With ubicomp, however, a wide range of new devices are built and
introduced, which are much more resource-constrained. Devices such as
PDAs, mobile phones, and music players have limited CPU, memory, and
network connectivity compared to a standard PC, and embedded platforms
such as sensor networks and smart cards are very limited compared to a PC
or even a smart phone. Hence, when creating systems support in a ubicomp
setting, it is important to recognize the constraints of the target devices,
and to recognize that hardware platforms are highly heterogeneous and
incompatible with respect to hardware specifications, operating system,
input/output capabilities, network, etc.
Unlike the traditional use of the term application, which usually refers to
software running on at most a couple of physical nodes, a ubicomp
application typically encompasses several devices that need to interact
closely and consistently to make up the application. For example, Smart
Room is an application that depends
on several devices, services, communications links, software components,
and an end-user application that needs to work in full concert to fulfill the
full functionality of a smart room. Therefore, managing heterogeneity is
not just a matter of the ability to compile, build, and deploy application to
different target platforms - such as building a desktop application that works
on different versions of Windows, Mac OS, and Linux. The challenge of
heterogeneity arises partly because ubicomp is a young research field in
which a standard technology stack, including hardware, operating systems,
etc., has not yet emerged. In the above scenario, it can be argued that the
variety of related technologies is unnecessarily complex. This is fairly
accurate, and existing or new technology platforms may progressively be
able to address ubicomp requirements in a more homogeneous and uniform
way.
CHAPTER 9
DISTRIBUTED PROGRAMMING
SYSTEMS AND ENVIRONMENT
CONTENTS
9.1 Distributed Programming Environment.................................................................. 188
9.2 Occam 2............................................................................................................... 188
9.3 Ada 95.................................................................................................................. 189
9.4 Java....................................................................................................................... 194
9.5 Corba.................................................................................................................... 198
References................................................................................................................... 206
Recently, owing to advances in computing facilities and the development
of networks in which computing resources are connected and shared, the
programming environment has migrated from single, local environments to
distributed, global ones. A programming environment that can smoothly
support cooperative work over the network is therefore necessary.
Computing with distributed objects provides flexible and scalable
programming in distributed and parallel environments. Many systems, such
as CORBA, DCOM, and Java, support distributed object computing. This
chapter enlightens readers on some of these programming environments,
analyzing them in terms of performance, support, ease of coding, and ease
of debugging.
9.2 OCCAM 2
The Occam programming language was designed with the philosophy of
eliminating unnecessary complexities, thus keeping the language simple
and elegant. Concurrency and communication are the prime concepts of
the Occam model. Occam captures the hierarchical structure of a system
by allowing an interconnected set of processes to be regarded as a unified,
single process. At any level of detail, the programmer is only concerned with
a small, manageable set of processes. Occam is an ideal introduction to a
number of key methodologies in recent computer science. Occam programs
can provide a degree of security unknown in conventional programming
languages such as C, FORTRAN, or Pascal. Occam semantics simplify
Distributed Programming Systems and Environment 189
9.3 ADA 95
Ada defines a distributed system as an interconnection of one or more
processing nodes (a system resource that possesses both processing and
storage capabilities) and zero or more storage nodes (a system resource that
has only storage capabilities, with the storage addressable by more than one
processing node). The Ada model for programming distributed systems
specifies a partition as the unit of distribution. Partitions comprise
collections of library units (separately compiled library packages or
subprograms) that may execute together in a distributed target execution
environment. The
configuration of library units into partitions is not defined by the language;
it is presumed that an implementation will provide this, along with facilities
for allocation and, if necessary, reconfiguration.
Each partition resides at a single execution site where all its library
units occupy the same logical address space. More than one partition may,
however, reside on the same execution site. Partitions may be either active or
passive. The library units comprising an active partition reside and execute
upon the same processing element. In contrast, library units comprising a
passive partition reside at a storage element that is directly accessible to
the nodes of different active partitions that reference them. This model
ensures that active partitions cannot directly access variables in other active
partitions. Variables can only be shared directly between active partitions by
The with Ada.Text_IO clause on line 1 makes the contents of the standard
Ada.Text_IO library available inside the Tasking_Example procedure. This
package contains the Put_Line procedure for displaying text. The "finish"
variable is used for communication between the main procedure and the
Outputter task. The pragma Atomic prevents the compiler from caching the
variable locally within the Outputter task; such an optimization would
defeat the program's purpose. The pragma also guarantees that the variable
is accessed atomically, so a context switch cannot occur in the middle of a
read or write. The Outputter task is simply a loop that, until completion is
signaled, displays the current value of Count, increments it, and then pauses
for at least one second. When the loop exits, control transfers to a statement
that prints an "end" message, and the task finishes. The Outputter task is
activated when the Tasking_Example procedure starts. The body of the
procedure delays for 20 seconds and then signals that the Outputter should
end. The Tasking_Example procedure will not return until the Outputter
terminates.

Transputers in a development environment are usually used as an add-on to
an existing computer, called the host. Through the host, transputer programs
can obtain file, screen, and keyboard services. The discussion here assumes
an IBM PC or compatible computer running MS-DOS; there are equivalents
for other systems. The transputer communicates with the host over an
INMOS link. A program called a server runs on the host at the same time as
the application runs on the transputer network. All communication between
the application running on the transputers and the host services (such as
keyboard, screen, and file access) takes the form of messages. The standard
C, Pascal, and FORTRAN development systems use a server called the
afserver. The D705B occam toolset and the INMOS Parallel C and Parallel
FORTRAN development systems use a server called the iserver. The root
transputer in the network is the one connected to the host bus through a link
adapter. All the other transputers in the network are connected by INMOS
links to the root transputer. The transputer network may be of any size and
may contain a mixture of transputer types. The relationship between host
and transputers during program development does not impose any
restriction on how the transputers are used in the target environment.
Source: http://www.transputer.net/tn/55/tn55.html
generic
   type Message_Type is private;
package Mailbox_Pkg is
   type Message_Array is array (Positive range <>) of Message_Type;
   protected type Mailbox (Size : Natural) is
      function Count return Natural;
      -- Count of messages in the mailbox.
      procedure Discard_All;
      -- Discard all messages in the mailbox.
      entry Put (Message : in Message_Type);
      -- Put a message into the mailbox;
      -- suspend until there is room.
      entry Get (Message : out Message_Type);
      -- Get a message from the mailbox;
      -- suspend until there is a message.
   private
      -- Define the components of the protected record.
      Contents : Message_Array (1 .. Size);
      -- This array holds the messages.
      Current_Count : Natural := 0;
      -- Count of messages in the mailbox.
      Put_Index : Positive := 1;
      -- Index into the array for adding the next message.
      Get_Index : Positive := 1;
      -- Index into the array for removing the next message.
   end Mailbox;
end Mailbox_Pkg;
This generic package may be implemented as follows:
package body Mailbox_Pkg is
   protected body Mailbox is
      function Count return Natural is
         -- Count of messages in the mailbox.
      begin
         return Current_Count;
      end Count;
      procedure Discard_All is
         -- Discard all messages in the mailbox.
      begin
         Current_Count := 0;
         -- Reset the array indices as well.
         Put_Index := 1;
         Get_Index := 1;
      end Discard_All;
      entry Put (Message : in Message_Type) when Current_Count < Size is
         -- Put a message into the mailbox.
         -- "Current_Count < Size" is the "barrier" for this entry.
         -- Any caller is suspended until it is true.
      begin
         -- Insert the new message, bump the index (cyclically),
         -- and increment the counter.
         Contents (Put_Index) := Message;
         Put_Index := Put_Index mod Size + 1;
         Current_Count := Current_Count + 1;
      end Put;
      entry Get (Message : out Message_Type) when Current_Count > 0 is
         -- Get a message from the mailbox; the barrier suspends
         -- callers until a message is available.
      begin
         Message := Contents (Get_Index);
         Get_Index := Get_Index mod Size + 1;
         Current_Count := Current_Count - 1;
      end Get;
   end Mailbox;
end Mailbox_Pkg;
9.4 JAVA
The name Java is applied to a variety of technologies created by Sun
Microsystems. While the reader will instantly associate Java with a
programming language (which it most certainly is), in actuality Java is also
much more. There are three main components of Java:
- The Java programming language: a programming language used to write
software for the Java platform
- The Java platform: a range of runtime environments that support execution
of software written in Java
- The Java API: a rich, fully featured class library that provides graphical
user interface and other facilities
9.4.2 Simplicity
No language is simple, but Java is a bit easier than the popular
object-oriented programming language C++, which was the dominant
software-development language before Java. Java is partially modeled
on C++, but greatly simplified and improved. For instance, pointers and
multiple inheritance often make programming complicated. Java replaces
the multiple inheritance in C++ with a simple language construct called an
interface, and eliminates pointers. Java uses automatic memory allocation
and garbage collection, whereas C++ requires the programmer to allocate
memory and collect garbage. Also, the number of language constructs is
small for such a powerful language. The clean syntax makes Java programs
easy to write and read. Some people refer to Java as “C++--” because it is
like C++ but with more functionality and fewer negative aspects. One of the
central issues in software development is how to reuse code. Object-oriented
programming provides great flexibility, modularity, clarity, and reusability
through encapsulation, inheritance, and polymorphism. For years, object-
oriented technology was perceived as elitist, requiring a sizable investment
in training and infrastructure. Java has helped object-oriented technology
enter the mainstream of computing. Its simple, clean syntax makes programs
easy to write and read. Java programs are quite communicative in terms of
designing and developing applications.
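As a small illustration of the interface mechanism mentioned above, a Java class can implement several interfaces at once, acquiring multiple types without inheriting state from two parent classes as C++ multiple inheritance would. The type and method names below are invented for the example.

```java
// A class can implement any number of interfaces. Each interface
// specifies a contract (methods) but carries no state, which avoids
// the complications of C++ multiple inheritance.
interface Drawable {
    String draw();
}

interface Persistable {
    String serialize();
}

class Circle implements Drawable, Persistable {
    private final double radius;

    Circle(double radius) {
        this.radius = radius;
    }

    // Fulfills the Drawable contract.
    public String draw() {
        return "circle r=" + radius;
    }

    // Fulfills the Persistable contract.
    public String serialize() {
        return "{\"radius\":" + radius + "}";
    }
}
```

A Circle can then be passed to any code expecting a Drawable or a Persistable, giving the flexibility of multiple supertypes with none of the diamond-inheritance ambiguity.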
9.4.4 Portability
Because Java is architecture neutral, Java programs are portable. They can
be run on any platform without being recompiled. Moreover, there are no
platform-specific features in the Java language. In some languages, such as
Ada, the largest integer varies on different platforms. But in Java, the range
of the integer is the same on every platform, as is the behavior of arithmetic.
The fixed range of the numbers makes the program portable. The Java
environment is portable to new hardware and operating systems. In fact, the
Java compiler itself is written in Java. Of course, this portability comes at a
cost. Java source code is compiled into bytecode, which is executed by the
JVM. This means that Java code does not run as fast as native code compiled
to machine language instructions. While some efforts have been made to
boost the performance of Java software, such as just-in-time (JIT) compilers
that convert Java bytecode to native code, developers and users will find
that performance is not as fast when compared to C++ code, and that a
greater amount of memory is used up. There are also, of course, operating-
system-specific differences between Java applications. A Macintosh applet
or application will have a different GUI than that of a Windows or Unix
system. Glitches in initial releases caused problems when running software
on different platforms due to defects in early JVM implementations, but for
the most part, the potential of Java portability remains resilient.
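For instance, Java's int is defined by the language to be exactly 32 bits with two's-complement arithmetic on every platform, so its range and even its overflow behavior are identical everywhere:

```java
// Java fixes primitive sizes in the language specification: int is
// always 32 bits regardless of platform, so arithmetic is portable.
public class IntRange {
    public static void main(String[] args) {
        System.out.println(Integer.MIN_VALUE); // -2147483648 on every platform
        System.out.println(Integer.MAX_VALUE); //  2147483647 on every platform
        // Overflow wraps around identically everywhere:
        System.out.println(Integer.MAX_VALUE + 1); // -2147483648
    }
}
```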
9.4.5 Multithreaded-programming
Multithreading is a program’s capability to perform several tasks
simultaneously. For example, downloading a video file while playing the
video would be considered multithreading. Multithread programming is
smoothly integrated in Java, whereas in other languages you have to call
procedures specific to the operating system to enable multithreading.
Multithreading is particularly useful in graphical user interface (GUI) and
network programming. In GUI programming, there are many things going
on at the same time. A user can listen to an audio recording while surfing
a Web page. In network programming, a server can serve multiple clients
at the same time. Multithreading is a necessity in multimedia and network
programming. A multi-threaded language supports concurrent processing,
but with shared memory for application code and data. This allows threads to
conserve memory and interact with each other to work together if required.
The importance of a multi-threaded language for network programming
cannot be overstated. Though it is possible to write trivial client and server
applications without using multiple threads of execution, even a moderately
complex server will typically use the method of multi-threading. Having
this support within Java is useful, and makes it an attractive choice for
almost any type of programming. Other languages, too, have multi-threaded
support (often in the form of an add-on API or operating system calls), but
Java has been designed from the ground up to support such programming,
and provides language keywords to simplify writing thread-safe code.
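A minimal sketch of that built-in support: two threads incrementing a shared counter, with the synchronized keyword (one of the language keywords mentioned above) providing thread safety. The class name is ours, chosen for the example.

```java
// Two tasks run concurrently using Java's built-in Thread class,
// sharing a counter protected by synchronized methods.
public class Counter {
    private int count = 0;

    public synchronized void increment() { count++; }

    public synchronized int get() { return count; }

    public static void main(String[] args) throws InterruptedException {
        Counter c = new Counter();
        Runnable task = () -> {
            for (int i = 0; i < 10_000; i++) c.increment();
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();            // both threads now run concurrently
        t2.start();
        t1.join();             // wait for both to finish
        t2.join();
        System.out.println(c.get()); // prints 20000
    }
}
```

Without the synchronized keyword the two threads could interleave their read-modify-write sequences and lose updates; with it, the final count is always deterministic.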
9.4.6 Security
In a networked world, security is an extremely important issue for software
developers. After all, systems connected to a public network are far more
vulnerable to attack than a system connected to a closed private network.
When systems contain sensitive data, they are prime targets for attacks, but
even systems with little or no useful data can be compromised and used
to hijack other systems. Because of such risks, developers must take the
issue of security very seriously. Of course, while many developers write
their own security mechanisms, it is often useful if a language administers
some form of security of its own. This can save developers both time and
effort, and it is reassuring to know that users will have an equitable level of
security installed by default. Java is often billed as a “secure” language, and
while it is impossible for a language to guarantee absolute security (much
of this must be the responsibility of individual programmers and of the
implementers of the JVM), the Java security model makes it an attractive
choice for network developers. As an Internet programming language, Java
is used in a networked and distributed environment. If you download a Java
applet (a special kind of program) and run it on your computer, it will not
damage your system because Java implements several security mechanisms
to protect your system against harm caused by stray programs. The security
is based on the premise that nothing should be trusted.
9.4.7 Performance
Java’s performance is sometimes criticized. The execution of the bytecode is
never as fast as it would be with a compiled language, such as C++. Because
Java is interpreted, the bytecode is not directly executed by the system, but
is run through the interpreter. However, its speed is more than adequate for
most interactive applications, where the CPU is often idle, waiting for input
or for data from other sources. CPU speed has increased dramatically in the
past few years, and this trend will continue. There are many ways to improve
performance. Users of the earlier Sun Java Virtual Machine certainly noticed
that Java was slow. However, the new JVM is significantly faster. The new
JVM uses the technology known as just-in-time compilation. It compiles
bytecode into native machine code, stores the native code, and reinvokes the
native code when its bytecode is executed. Sun recently developed the Java
HotSpot Performance Engine, which includes a compiler for optimizing the
frequently used code. The HotSpot Performance Engine can be plugged into
a JVM to dramatically boost its performance.
9.5 CORBA
The Common Object Request Broker Architecture (CORBA) is a
standard developed by the Object Management Group (OMG) to provide
interoperability among distributed objects. CORBA is the world’s leading
repository also contains type codes. Type codes are not explicitly listed in
IDL files, but are automatically derived from the types (long, string, struct,
and so on) that are defined or mentioned in IDL files. Type codes are used
to encode and decode instances of the CORBA any type--a generic type that
stands for any type and is used with the dynamic invocation interface.
IDL is used to describe interfaces only, not implementations. Despite the fact that
IDL syntax is similar to C++ and Java, IDL is not a programming language.
Through IDL, a particular object implementation tells its potential clients
what operations are available and how they should be invoked. From IDL
definitions, the CORBA objects are mapped into different programming
languages. Some of the programming languages with IDL mapping include
C, C++, Java, Smalltalk, Lisp, and Python. Thus, once you define an
interface to objects in IDL, you are free to implement the object using any
suitable programming language that has IDL mapping. And, consequently,
if you want to use that object, you can use any programming language to
make remote requests to the object.
1. Basic Types
The integer types are short, long, and long long for signed integer numbers
of 16, 32, and 64 bits, and unsigned short, unsigned long, and unsigned
long long for their unsigned counterparts.
const short aShortConstant = 6 * 7;
The floating point types are float, double, and long double for ANSI/IEEE
754-1985 single precision, double precision, and double extended precision
floating point numbers.
const float aFloatConstant = 3.141593;
The character types are char for a single character in a single-byte character
set and wchar for a single character in a multiple-byte character set. The
interface definition language itself uses ISO 8859-1 Latin 1.
const char aTab = ‘\t’;
const wchar aWideTab = L’\t’;
The logical type is boolean, with values of true and false.
const boolean aTrueValue = TRUE;
const boolean aFalseValue = FALSE;
The special types are octet for 8 bits of raw data and any for a container of
another arbitrary type.
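For reference, under the standard OMG IDL-to-Java mapping these basic types come out roughly as follows: IDL short maps to Java short, long to int, long long to long, float and double to their Java namesakes, char and wchar to char, boolean to boolean, and octet to byte. A sketch using the constants above (the interface name is ours; module-level IDL constants map to fields of a generated Java interface):

```java
// Java renditions of the IDL constant examples, following the
// standard OMG IDL-to-Java mapping of the basic types.
// Interface fields are implicitly public static final.
public interface IdlConstants {
    short aShortConstant = 6 * 7;       // IDL: const short aShortConstant = 6 * 7;
    float aFloatConstant = 3.141593f;   // IDL: const float aFloatConstant = 3.141593;
    char aTab = '\t';                   // IDL: const char aTab = '\t';
    boolean aTrueValue = true;          // IDL: const boolean aTrueValue = TRUE;
}
```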
CHAPTER 10
“Wireless sensor and actuator nets, also known as motes and smart dust, are
an emerging computer class based on a new platform, networking structure,
and interface that enable novel, low cost, high volume, applications.”
CONTENTS
10.1 Sensing and Sensors............................................................................................ 208
10.2 Wireless Sensor Networks.................................................................................... 213
10.3 Medium Access Control (MAC)............................................................................ 217
10.4 Operating Systems............................................................................................... 218
10.5 Middleware......................................................................................................... 223
10.6 Challenges and Constraints.................................................................................. 224
10.7 Applications of Wireless Sensor Networks........................................................... 227
References................................................................................................................... 229
Sensors link the physical with the digital world by capturing and revealing
real-world phenomena and converting these into a form that can be processed,
stored, and acted upon. Integrated into numerous devices, machines, and
environments, sensors provide a tremendous societal benefit. This chapter
describes the major concepts behind wireless sensor networks (WSN). It is
a good reference for anyone interested in learning how wireless sensors and
wireless sensor networks work and what different solutions exist in terms
of operating software, network layouts, and routing. It starts by discussing
sensing and sensors and goes on to explain the main components of wireless
sensor networks (WSNs). It also gives an exposition of the fundamental
aspects of wireless sensor networks and communication techniques. An
extensive discussion of information processing and applications
of WSNs is given towards the end of this chapter.
Type               Examples
Temperature        Thermistors, thermocouples
Pressure           Pressure gauges, barometers, ionization gauges
Optical            Photodiodes, phototransistors, infrared sensors, CCD sensors
Acoustic           Piezoelectric resonators, microphones
Mechanical         Strain gauges, tactile sensors, capacitive diaphragms, piezoresistive cells
Motion, vibration  Accelerometers, gyroscopes, photo sensors
Flow               Anemometers, mass air flow sensors
Position           GPS, ultrasound-based sensors, infrared sensors, inclinometers
Electromagnetic    Hall-effect sensors, magnetometers
Chemical           pH sensors, electrochemical sensors, infrared gas sensors
Humidity           Capacitive and resistive sensors, hygrometers, MEMS-based humidity sensors
Radiation          Ionization detectors, Geiger-Mueller counters
Source: http://bayanbox.ir/view/4489134451971294360/Fundamentals-of-Wireless-Sensor-Networks-Waltenegus-Dargie.pdf
R = ρℓ / A    (10.1)

where ρ is the resistivity of the conductor material, ℓ is the length of the
conductor, and A is the area of the cross-section.
For example, the well-known Wheatstone bridge as indicated in the figure
below is a simple circuit that can be used to convert a physical property
into an apparent electric effect. In this bridge, R1, R2, and R3 are resistors
of known resistance (where the resistance of R2 is adjustable) and Rx is a
resistor of unknown value. If the ratio R2/R1 is identical to the ratio Rx/R3, the
measured voltage VG will be zero. However, if the resistance of Rx changes
(e.g., due to changes in temperature), there will be an imbalance, which will
be reflected by a change in the voltage VG. If all four resistor values and the
supply voltage (VS) are known, and the resistance of the galvanometer is high
enough that IG is negligible, the voltage across the bridge (VG) can be found
by working out the voltage from each potential divider and subtracting one
from the other. The equation for the relationship between these parameters
is:
VG = VS × (R2 / (R1 + R2) − RX / (RX + R3))    (10.2)

where VG is the voltage of node D relative to node B.
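Equation (10.2) can be checked numerically. The following Java sketch (a helper of our own, not from the chapter) computes the bridge voltage for a balanced bridge and for one where Rx has drifted, e.g., with temperature:

```java
// Numerical check of the Wheatstone bridge equation (10.2):
// VG = VS * (R2/(R1+R2) - RX/(RX+R3)).
public class Wheatstone {
    static double bridgeVoltage(double vs, double r1, double r2,
                                double rx, double r3) {
        return vs * (r2 / (r1 + r2) - rx / (rx + r3));
    }

    public static void main(String[] args) {
        // Balanced bridge: R2/R1 equals RX/R3, so VG is zero.
        System.out.println(bridgeVoltage(5.0, 100, 100, 100, 100)); // 0.0
        // RX drifts (e.g., a thermistor warming up): the bridge unbalances
        // and VG becomes nonzero, revealing the change in RX.
        System.out.println(bridgeVoltage(5.0, 100, 100, 110, 100));
    }
}
```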
C = εA / d    (10.3)

where ε is the permittivity of the dielectric, A is the plate area, and d is the
distance between the two plates.
Similar to the resistive model, changes in any of these parameters will
change the capacitance. For example, if pressure is applied to one of the two
plates, the separation d can be reduced, thereby increasing the capacitance.
Similarly, a change in the permittivity of the dielectric can be caused by
an increase in temperature or humidity, thereby resulting in a change in
capacitance.
Existing and potential applications of sensor networks include,
among others, military sensing, physical security, air traffic control, traffic
surveillance, video surveillance, industrial and manufacturing automation,
process control, inventory management, distributed robotics, weather
sensing, environment monitoring, national border monitoring, and building
and structures monitoring.
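Returning to the capacitive model of Eq. (10.3), a quick numeric check confirms the pressure-sensing argument above (the plate dimensions are illustrative assumptions):

```python
# Parallel-plate capacitance, Eq. (10.3): C = epsilon * A / d
EPS0 = 8.854e-12  # vacuum permittivity, F/m

def capacitance(eps_r, area, d):
    """eps_r: relative permittivity, area: plate area (m^2),
    d: plate separation (m)."""
    return eps_r * EPS0 * area / d

# Pressure on one plate halves the separation d, which doubles
# the capacitance -- the basis of a capacitive pressure sensor
c1 = capacitance(1.0, 1e-4, 1e-4)    # 1 cm^2 plates, 0.1 mm apart
c2 = capacitance(1.0, 1e-4, 0.5e-4)  # same plates, half the gap
print(c2 / c1)  # -> 2.0
```

A change in eps_r (temperature or humidity altering the dielectric) changes C in exactly the same multiplicative way.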
Traditional networks compared with wireless sensor networks:

Traditional networks: Typical primary design concerns are network performance and latencies; energy is not a primary concern.
Wireless sensor networks: Energy is the main constraint in the design of all node and network components.

Traditional networks: Networks are designed and engineered according to plans.
Wireless sensor networks: Deployment, network structure, and resource use are often ad hoc (without planning).

Traditional networks: Devices and networks operate in controlled and mild environments.
Wireless sensor networks: Sensor networks often operate in environments with harsh conditions.

Traditional networks: Maintenance and repair are common, and networks are typically easy to access.
Wireless sensor networks: Physical access to sensor nodes is often difficult or even impossible.

Traditional networks: Component failure is addressed through maintenance and repair.
Wireless sensor networks: Component failure is expected and addressed in the design of the network.

Traditional networks: Obtaining global network knowledge is typically feasible, and centralized management is possible.
Wireless sensor networks: Most decisions are made locally, without the support of a central manager.
The structure of a wireless sensor network includes different topologies for
radio communication networks. A succinct discussion of the network
topologies that apply to wireless sensor networks is outlined below. In a mesh
topology, where each node can communicate with multiple other nodes, if a node
fails or if a radio link goes down (e.g., due to interference or a depleted
battery), the network will reconfigure itself around the remaining nodes.
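The self-healing behavior of a mesh can be sketched as a reachability check repeated after a node failure (the topology and node names below are invented for illustration):

```python
from collections import deque

# Toy mesh as an adjacency list; links are bidirectional
mesh = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A", "D"},
    "D": {"B", "C"},
}

def reachable(graph, src, dst, down=frozenset()):
    """BFS reachability, skipping failed nodes listed in `down`."""
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in graph[node] - down:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

print(reachable(mesh, "A", "D"))              # -> True
print(reachable(mesh, "A", "D", down={"B"}))  # -> True (rerouted via C)
```

Only when every alternative path is gone (here, both B and C failing) does the destination become unreachable, which is the redundancy argument for mesh topologies.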
10.4.1 TinyOS
TinyOS is an open-source, flexible, component-based, and application-
specific operating system designed for sensor networks. TinyOS can support
concurrent programs with very low memory requirements; the OS has a
footprint that fits in 400 bytes. The TinyOS component library includes
network protocols, distributed services, sensor drivers, and data acquisition
tools. TinyOS falls under the monolithic architecture class. TinyOS uses
the component model: according to the requirements of an application,
different components are glued together with the scheduler to compose
an application.
Wireless Sensor Network 219
10.4.3 Contiki
Contiki (Dunkels et al. 2004) is a hybrid operating system. By default, its
kernel functions as an event-driven kernel but multithreading support is
implemented as an application library. There is a dynamic linking strategy
to couple the multithreading library with applications that explicitly require
it. Like SOS, Contiki realizes the separation of the basic system support
(by the kernel) from the rest of dynamically loadable and reprogrammable
services, which are called processes. The services communicate with each
other through the kernel by posting events. The kernel itself does not provide
any hardware abstraction; instead it allows device drivers and applications
to communicate directly with the hardware. This restricted scope of the
kernel makes it easy to reprogram and replace services. Each Contiki
service manages its own state in a private memory and the kernel keeps a
pointer to the process state. However, a service shares the same address
space with other services. Each service also implements an event handler
and an optional poll handler.
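Contiki itself is written in C; the Python sketch below only models the event-posting idea described above, in which services never call each other directly but communicate through the kernel (process and event names are invented for illustration):

```python
from collections import deque

class Kernel:
    """Minimal event-driven kernel: services communicate only by
    posting events through the kernel, as in Contiki's design."""
    def __init__(self):
        self.queue = deque()
        self.handlers = {}  # process name -> event handler

    def register(self, name, handler):
        self.handlers[name] = handler

    def post(self, target, event, data=None):
        self.queue.append((target, event, data))

    def run(self):
        while self.queue:
            target, event, data = self.queue.popleft()
            self.handlers[target](self, event, data)

# Two illustrative "processes": a sensor driver and a radio service
readings = []

def sensor_process(kernel, event, data):
    if event == "sample":
        kernel.post("radio", "send", 21.5)  # pretend ADC reading

def radio_process(kernel, event, data):
    if event == "send":
        readings.append(data)

k = Kernel()
k.register("sensor", sensor_process)
k.register("radio", radio_process)
k.post("sensor", "sample")
k.run()
print(readings)  # -> [21.5]
```

Because all interaction flows through the kernel's event queue, a service can be replaced or reprogrammed without touching the others — the property the text attributes to Contiki's restricted kernel scope.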
10.4.4 LiteOS
LiteOS is a Unix-like operating system designed for WSNs at the University
of Illinois at Urbana-Champaign. The motivations behind the design
of a new OS for WSN are to provide a Unix-like OS for WSN, provide
system programmers with a familiar programming paradigm (thread-based
programming mode, although it provides support to register event handlers
using callbacks), a hierarchical file system, support for object-oriented
programming in the form of LiteC++, and a Unix-like shell. The footprint
of LiteOS is small enough to run on MicaZ nodes, which have an 8 MHz CPU,
128 Kbytes of program flash, and 4 Kbytes of RAM. LiteOS is primarily
composed of three components: LiteShell, LiteFS, and the Kernel.
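LiteFS presents the sensor network through a file-system metaphor; the sketch below imitates that idea in Python (the directory layout, node names, and readings are invented for illustration, not LiteFS's actual API):

```python
# A toy hierarchical "file system" view of a sensor network,
# loosely imitating the LiteFS idea of browsing nodes like files
tree = {
    "sensors": {
        "node01": {"temp": 21.4},
        "node02": {"temp": 19.8},
    }
}

def read(path):
    """Resolve a slash-separated path against the tree."""
    cur = tree
    for part in path.strip("/").split("/"):
        cur = cur[part]
    return cur

print(read("/sensors/node01/temp"))  # -> 21.4
```

The appeal of the metaphor is familiarity: a Unix-trained programmer can navigate nodes and data with path-like names instead of learning a new addressing scheme.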
10.5 MIDDLEWARE
The need for middleware derives from the gap between the high-level
requirements of pervasive computing applications and the complexity of the operations in
the underlying WSNs. The application requirements include high flexibility,
re-usability, and reliability. The complexity of the operations with a WSN is
characterized by constrained resources, dynamic network topology, and low
level embedded OS APIs. WSN middleware provides a probable solution to
bridging this gap and removing these obstacles. In the initial stage of
WSN research, people paid little attention to middleware because
the simplicity of the early applications created little demand for
middleware support. With the rapid evolution of the area, however, the
gap has become increasingly obvious and hinders the popularity of WSN-based
applications. WSN middleware helps the programmer develop applications
in several ways. First, it provides appropriate system abstractions, so that the
application programmer can focus on the application logic without caring
too much about the lower level implementation details. Second, it provides
reusable code services, such as code update, and data services, such as data
filtering, so that the application programmer can deploy and execute the
application without being troubled with complex and tedious functions.
Third, it helps the programmer in network infrastructure management and
adaptation by providing efficient resource services, e.g., power management.
It also supports system integration, monitoring, and system security.
Compared with traditional distributed computing systems, however, WSNs pose
new challenges to middleware research. Traditional middleware techniques cannot be applied directly
to WSNs. First, most distributed system middleware techniques aim at
providing transparency abstractions by hiding the context information, but
WSN-based applications should usually be context aware. Second, although
many mobile computing middleware systems support context awareness, their major
concern is how to continuously satisfy the interests of individual mobile
nodes in the presence of mobility. In contrast, WSN-based systems are data
centric, reflecting the whole application’s interests. Thus, the locations and
mobility of the sensor nodes should be handled by WSN middleware in a
different way. For example, a node moving away from a phenomenon may
choose to hand off the monitoring responsibility to a nearby node. Also,
WSNs mostly use attribute-based addressing rather than relying on
network-wide unique node addresses. Third, data aggregation in intermediate nodes
of the forwarding path is desirable in a WSN, but no such kind of support is
provided in traditional distributed system middleware because of the end-to-
end paradigm used. Finally, WSNs require the middleware to be lightweight
for implementation in sensor nodes with limited processing and energy
resources. WSNs also have new requirements on hardware (e.g., various
sensors and computing nodes), operating systems and routing protocols, as
well as the applications.
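The data-centric, attribute-based addressing style mentioned above can be illustrated with a small sketch in which an application selects nodes by what they sense rather than by a network address (all attribute names and values are invented):

```python
# Hypothetical node descriptions: attributes instead of addresses
nodes = [
    {"id": 1, "senses": "temperature", "region": "north", "value": 18.2},
    {"id": 2, "senses": "humidity",    "region": "north", "value": 61.0},
    {"id": 3, "senses": "temperature", "region": "south", "value": 25.7},
]

def attribute_query(nodes, **attrs):
    """Return nodes whose attributes match the query: the
    application names data, not node addresses."""
    return [n for n in nodes
            if all(n.get(k) == v for k, v in attrs.items())]

south_temps = attribute_query(nodes, senses="temperature", region="south")
print([n["id"] for n in south_temps])  # -> [3]
```

A middleware layer offering this kind of query can transparently hand the monitoring role to a different node as long as the attributes still match — the mobility behavior described above.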
increased at the cost of loss of detail. The intuition behind the system is that,
as data get older, the importance of detail decreases. Another model is the
sensor-as-database model, which turns the on-board storage into a database,
from which records can be retrieved with SQL-like queries.
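The sensor-as-database model can be imitated with an ordinary embedded database; the schema and readings below are invented for illustration:

```python
import sqlite3

# In-memory stand-in for a node's on-board storage
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE readings (ts INTEGER, temp REAL)")
db.executemany("INSERT INTO readings VALUES (?, ?)",
               [(1, 20.0), (2, 22.0), (3, 24.0)])

# SQL-like retrieval, as in the sensor-as-database model
(avg_temp,) = db.execute(
    "SELECT AVG(temp) FROM readings WHERE ts >= 2").fetchone()
print(avg_temp)  # -> 23.0
```

In a real deployment the query would be pushed to the node and only the aggregate shipped back, saving the radio energy that raw records would cost.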
10.6.4 Deployment
Node deployment is a fundamental issue to be solved in wireless sensor
networks. A proper node deployment scheme can reduce the complexity of
many problems. Deploying and managing a large number of nodes in a relatively
bounded environment requires special techniques. Hundreds to several
thousands of sensors may be deployed in a sensor region, within tens of feet
of each other; the node densities may be as high as 20 nodes/m³. Deploying
a large number of nodes densely requires careful handling of topology
maintenance. There are two deployment models at present: static and dynamic
deployment. Static deployment chooses the best locations according to an
optimization strategy, and the locations of the sensor nodes do not change
during the lifetime of the WSN. Dynamic deployment scatters the nodes
randomly and then optimizes their locations.
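A toy sketch of the two deployment models follows; the field size, node count, and planned layout are assumptions for illustration only:

```python
import random

FIELD = 100.0  # side of a square field, m
N = 25         # nodes to deploy

def static_deployment(n, field):
    """Static model: positions chosen by plan (here, a simple grid)
    and fixed for the lifetime of the WSN."""
    side = int(n ** 0.5)
    step = field / side
    return [(step * (i + 0.5), step * (j + 0.5))
            for i in range(side) for j in range(side)]

def dynamic_deployment(n, field, seed=0):
    """Dynamic model: nodes scattered at random, with positions to
    be optimized afterwards."""
    rng = random.Random(seed)
    return [(rng.uniform(0, field), rng.uniform(0, field))
            for _ in range(n)]

grid = static_deployment(N, FIELD)
scatter = dynamic_deployment(N, FIELD)
print(len(grid), len(scatter))  # -> 25 25
```

The grid guarantees uniform coverage up front; the random scatter matches aerial or ad hoc deployment, where coverage holes must be detected and corrected after the fact.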
should be much less than $1 in order for the sensor network to be feasible.
Even the cost of a Bluetooth radio, which is known as a low-cost device, is
about ten times the targeted price for a sensor node.
REFERENCES
1. Akyildiz, I., Su, W., Sankarasubramaniam, Y., & Cayirci, E. (2002). Wireless sensor networks: A survey. Computer Networks, 38, 393-422.
2. Chiara, B., Andrea, C., Davide, D., & Roberto, V. (2009). An overview on wireless sensor networks technology and evolution. Sensors, 9, 6869-6896.
3. Dunkels, A., Schmidt, O., Voigt, T., & Ali, M. (2006, October 31-November 2). Protothreads: Simplifying event-driven programming of memory-constrained embedded systems. Proceedings of the International Conference on Embedded Networked Sensor Systems (ACM SenSys). Boulder, CO: ACM.
4. Nath, S. (2009, April 13-16). Energy efficient sensor data logging with amnesic flash storage. Proceedings of the International Conference on Information Processing in Sensor Networks (ACM/IEEE IPSN). San Francisco, CA: IEEE Computer Society.
5. Sohraby, K., Minoli, D., & Znati, T. (2007). Wireless sensor networks: Technology, protocols, and applications. USA: John Wiley & Sons. Retrieved January 30, 2018, from http://dx.doi.org/10.1002/047011276X
6. Wilson, J. (2005). Sensor technology handbook. Burlington, MA, USA: Elsevier/Newnes.
A
Advanced Encryption Standard (AES) 155
Analog-to-digital converter (ADC) 209
Asynchronous Transfer Mode (ATM) 85

B
Body area networks (BANs) 172

C
Client-server communication 44
Commercial-off-the-shelf (COTS) 96
Common Object Request Broker Architecture (CORBA) 198
Communication channel 217
Communications topology 215
Computer communication 45
Computer networking 85
Concurrent programming language 189
Cyclic redundancy check (CRC) 39

D
Data Encryption Standard (DES) 155
Data processing system 101
Decentralized object location and routing (DOLR) 131
Digital Signature Algorithm (DSA) 150, 151
Distributed file systems (DFS) 108
Distributed hash table (DHT) 132
Distributed object middleware (DOM) 32
Distributed operating system 141
Distributed Programming Environment (DPE) 188