
BCS HIGHER EDUCATION QUALIFICATIONS
BCS Level 6 Professional Graduate Diploma in IT
April 2011
EXAMINERS' REPORT
Distributed & Parallel Systems
Overall Examiners' Comments
Overall, the cohort of candidates sitting the April 2011 paper performed very well, achieving the highest pass rate in several years.

Section A
A1.
a) Outline a transaction scenario that may lead to deadlock in a parallel/distributed system; propose a scheme whereby deadlocks may be avoided. (10 marks)
b) Briefly describe the differences between processes and threads, stating where each is an appropriate choice in the development of parallel/distributed applications. (5 marks)
c) Distinguish between synchronous and asynchronous inter-process communication, providing one example of each. (5 marks)
d) Distinguish between datagram and stream sockets, suggesting where each would be an appropriate choice for a parallel/distributed application. (5 marks)

Answer Pointers
A1. a) Deadlocks occur when a process waits indefinitely for a resource that is not forthcoming because it has, for one reason or another, not been relinquished by another process. For example, two competing processes P1 and P2 both require resources A and B to complete. P1 has acquired resource A, P2 has acquired resource B, and each waits indefinitely for the other resource to become available without relinquishing the lock on the resource it currently holds. Deadlocks can also occur when a process is in a blocked state, awaiting a message from a cooperating process that does not arrive. In either case, this may happen for a variety of reasons, such as race conditions, communication link failure, host failure, or messages arriving out of order or corrupted.

Deadlocks may be avoided in a number of ways. Examples include incorporating a time limit when blocking/waiting for messages, after which contingency code is executed to work around the missing message (such as a re-request), and employing schemes such as semaphores to ensure that critical sections of code and resource acquisitions are coordinated in a way that reduces the risk of deadlock arising from the scenarios outlined above. In a similar fashion, locked resources could be relinquished after a period spent without successfully securing the other resources needed to continue executing. Alternatively, we might require that a process acquires all necessary resources simultaneously before locking them, or implement a system that supports the pre-emptive acquisition of resources from competing processes. These schemes are difficult to accomplish in distributed systems, since there is no global view of the system; for this reason, it is common to employ deadlock detection systems with the capability to roll back processes to an epoch before the deadlock occurred, in an attempt to avoid the original issue.

A1. b) Threads and processes are both mechanisms for achieving concurrent execution. Processes, which are essentially programs (or largely separate program components) in a state of execution, contain state information, possess their own address space, and interact via O/S-mediated inter-process communication (IPC) mechanisms. Multiple processes may be generated within a single application (e.g., in Unix, via the fork system call) to create an application architecture; for example, larger applications may be formed from multiple processes performing different tasks that share data via IPC. A process may contain multiple threads. A thread is, relative to a process, not an architectural construct: the threads of a process share state and memory space, and may communicate directly with one another. Processes consume more resources than threads, and switching between threads is faster than switching between processes.

A1. c) Blocking (synchronous) inter-process communication routines do not return until the attempted communication has completed successfully (e.g., for a producer, the message has been transmitted; for a consumer, the message has been received). They enforce synchronisation, since the receiving application will not continue without the expected data. Conversely, non-blocking (asynchronous) routines do not wait for the data transfer to complete, returning regardless. This enables an application to continue if an expected message is not forthcoming, avoiding deadlocks caused by lost messages, but potentially requires extra code to re-request missing messages or otherwise deal with their absence. Blocking communication is used in scenarios in which parallel tasks must be completed in a specific order; non-blocking communication may be used where tasks can continue with other work whilst waiting for a result, perhaps periodically checking whether an awaited result has arrived, reducing time spent in an inactive state.
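One of the avoidance schemes from A1 a), acquiring resources in a single global order so that circular wait can never arise, can be sketched in Python using threads and locks (a hypothetical two-process example; all names are illustrative):

```python
import threading

# Two resources that both competing workers need.
lock_a = threading.Lock()
lock_b = threading.Lock()

def worker(name, results):
    # Every worker acquires the locks in the same global order (A before B),
    # so the circular-wait condition for deadlock cannot arise. (The
    # time-limit scheme mentioned above could instead use
    # lock.acquire(timeout=...) and release on failure.)
    with lock_a:          # always acquired first
        with lock_b:      # always acquired second
            results.append(name)

results = []
threads = [threading.Thread(target=worker, args=(n, results))
           for n in ("P1", "P2")]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)  # both workers complete; order may vary
```

Had P1 taken the locks in the order A then B while P2 took B then A, the original deadlock scenario could occur; the single global order removes that possibility.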
A1. d) Datagram sockets do not require that a connection is established prior to transmission (i.e., they are connectionless), with each message being stamped with the destination address. Two-way communication is supported, but data transfer is unreliable and messages may arrive in a different order from the original transmission order, which may present a deadlock risk if not properly controlled. Stream sockets are also two-way, but, unlike datagram sockets, they are connection-oriented, and are both sequenced and reliable.

Examiners' Comments
This question was jointly the second most popular in this year's examination paper, although it accrued the second lowest average marks.
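The connectionless style described in A1 d) can be sketched with Python's standard socket module (the loopback address here is illustrative; a stream socket would instead establish a connection with connect()/accept() before exchanging data):

```python
import socket

# Datagram (UDP) sockets: connectionless; each sendto() names the
# destination, and delivery/ordering are not guaranteed. On loopback,
# as here, loss is unlikely, so this sketch normally succeeds.
recv_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
recv_sock.bind(("127.0.0.1", 0))      # let the OS pick a free port
addr = recv_sock.getsockname()

send_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send_sock.sendto(b"hello", addr)      # no prior connection required

data, _ = recv_sock.recvfrom(1024)
print(data)                           # b'hello'

send_sock.close()
recv_sock.close()
```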

A2.
a) Contrast distributed shared memory (DSM) and message passing approaches in parallel/distributed computing. (10 marks)
b) A sequential program has three principal sections. The input section takes 25% of the total time, the processing section takes 50%, and the output section takes the remaining 25%. What is the maximum attainable speedup if only the processing section can be parallelised? (5 marks)
c) How does the concept of efficiency differ from speedup in a parallel/distributed application? How is efficiency calculated? (5 marks)
d) Propose a scheme for load balancing on a heterogeneous high-performance compute cluster with a head node generating a continuous stream of variable-sized tasks. (5 marks)

Answer Pointers
A2. a) DSM and message passing may be distinguished as follows. In DSM, physically distributed memories appear to the programmer as a single resource (i.e., as a single logical address space). Internally (and transparently), DSM systems use message passing techniques to maintain this illusion, relieving the programmer of the onus of inter-host or inter-task communication, with good potential to reduce the size and complexity of program code. In a conventional message passing system, explicit calls to transmit/receive messages are required between communicating parties, such that the programmer is explicitly aware that different data will be stored on different processes that may be physically spread among the hosts of the distributed system (although the programmer will not necessarily know on which host the required data resides, he/she is typically required to know with which process/task communication is required).

A2. b) If only the processing section of the program described can be parallelised, the maximum attainable speedup is 2 times, since even if the time taken to execute that section were reduced to 0, the remaining half of the program (the input and output sections, collectively accounting for 50% of the original execution time) cannot be improved upon. An upper bound on speedup is in effect, as dictated by Amdahl's Law, beyond which no further improvement can occur.

A2. c) Unlike speedup, which simply measures how much more quickly a parallel application executes relative to a serial implementation (for a given number of processors), efficiency takes the specific number of processors used into consideration, and is thus indicative of how well these additional resources are being utilised. The formula for speedup is S = TS/TP, where TS is the serial execution time and TP the parallel execution time. The formula for efficiency is E = S/N, where N is the number of processors; E will vary between 0 and 1.
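The speedup bound from A2 b) and the efficiency formula from A2 c) can be checked numerically; a small sketch, where the serial fraction of 0.5 comes from the question's input and output sections:

```python
# Amdahl's Law: with serial fraction f, speedup on N processors is
#   S(N) = 1 / (f + (1 - f) / N)
def speedup(serial_fraction, n_procs):
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_procs)

# Efficiency divides speedup by the number of processors used.
def efficiency(serial_fraction, n_procs):
    return speedup(serial_fraction, n_procs) / n_procs

f = 0.5                       # input (25%) + output (25%) are serial
print(speedup(f, 4))          # 1.6
print(efficiency(f, 4))       # 0.4

# As N grows without bound, S approaches 1/f = 2: the ceiling from A2 b).
print(speedup(f, 10**6))
```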
A2. d) A sensible scheme for load balancing on a HPC cluster with a head node generating a continuous stream of variable-sized tasks would be to allocate tasks dynamically, depending upon how quickly existing tasks return (as opposed to, for example, allocating the same number of tasks to each host). This is because the question specifically states that the hosts are heterogeneous, and may therefore have differing processing capabilities that, if a static scheduling approach were used, would need to be explicitly measured beforehand and taken into consideration by the task allocation algorithm in the head node. Furthermore, it is difficult to know in advance how quickly a particular task will execute, since this may be highly dependent upon the specific data to be processed (and may be independent of the volume of data to be processed).
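The dynamic allocation described in A2 d) can be sketched as self-scheduling worker threads pulling from a shared task queue (a simulation with hypothetical task sizes and slowdown factors, not real cluster code):

```python
import queue
import threading
import time

tasks = queue.Queue()
for size in [0.02, 0.01, 0.03, 0.01, 0.02, 0.01]:  # variable-sized tasks
    tasks.put(size)

completed = {}                 # worker name -> number of tasks processed
lock = threading.Lock()

def worker(name, slowdown):
    # Self-scheduling: each worker pulls the next task as soon as it is
    # free, so faster (or less loaded) hosts automatically receive more
    # work, with no need to measure capabilities beforehand.
    count = 0
    while True:
        try:
            size = tasks.get_nowait()
        except queue.Empty:
            break
        time.sleep(size * slowdown)   # simulate heterogeneous speed
        count += 1
    with lock:
        completed[name] = count

threads = [threading.Thread(target=worker, args=("fast", 1)),
           threading.Thread(target=worker, args=("slow", 5))]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(completed)   # the fast worker typically completes more tasks
```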

Examiners' Comments
This was the second least popular question, and produced the lowest average marks, despite the fact that several similar questions have appeared in earlier exams.

A3.
a) Identify the basic quality of service parameters, and state how they would be configured differently for FTP and VOIP applications. (10 marks)
b) Outline the main features of a grid computing system. (5 marks)
c) Distinguish between peer-to-peer and client-server distributed system architectures. (5 marks)
d) Why is a distributed system usually more reliable than a non-distributed system? (5 marks)

Answer Pointers
A3. a) The three applicable QoS parameters are bandwidth, latency and loss rate. For an FTP application, bandwidth and latency do not need specific minimums (though, clearly, higher bandwidth will decrease the time taken to transmit the data file), but the loss rate needs to be 0, since data loss is intolerable in the transmission of a data file (i.e., corruption of the file is likely to render it unusable once received). For a VOIP application, bandwidth will need a minimum tolerable level in order that the voice audio stream is serviceable. Also, to avoid delay/jitter, latency should be minimised. In VOIP, a non-zero loss rate may be tolerated, since data corresponding to a particular audio time-frame is only useful if that frame is still current to the client. Where data that has not already been transmitted has been lost, or is unlikely to be played at the client's machine because of its age, it can be safely abandoned.

A3. b) A grid computing system, relative to a conventional cluster, utilises processing units (typically entire PCs from different administrative domains) that are heterogeneous and widely geographically dispersed. It is common to use middleware such as BOINC to oversee the allocation of work and the collection of results; example applications include SETI@Home and Folding@Home, which are volunteer computing projects in the sense that home users can elect to join BOINC to donate their spare CPU cycles (such as during the times when a screensaver would normally be activated). The workload allocated to participating hosts in a grid does not require communication with other hosts, which distinguishes grid computing from conventional high-performance cluster computing, wherein the entire cluster is under central management and hosts may interact if required.

A3. c) In a peer-to-peer architecture, each participating computer is equally privileged.
Peer-to-peer networks are typically ad hoc, with participants joining and leaving as required. Each participant provides a resource, which may include storage capacity, files for download or processor time. Resources are offered directly to peers without the need for an intermediate server. Peer-to-peer networks are commonly used in file sharing services. In client-server architectures, one or more of the machines in the network assume server responsibilities, offering services to clients that may make requests. Clients do not share resources. Though servers may be dedicated machines, the term may also be used to describe services running (as background processes) on any standard machine. Relative to peer-to-peer networks, the client-server architecture tends to be less dynamic. Examples include HTTP, SMTP and telnet services.

A3. d) The potential for increased reliability in a distributed system stems from the distribution of workload and processing power across multiple hosts. With defensive programming, it is possible that the failure of one or more hosts will not jeopardise the global execution of the application, since the work allocated to these hosts may be redistributed to hosts with continued availability, i.e., there is no single point of failure. We can think of this as fault tolerance, but it can also lead to high availability.

Examiners' Comments
This was both the most popular question, and the question for which students accrued the highest marks, on average. Some candidates were not able to clearly distinguish between general cluster computing and grid computing.

Section B
B4.
a) Describe a scheme by which email communication can be protected from masquerading, tampering, replay and denial of service attacks. (10 marks)
b) Distinguish between symmetric and asymmetric security algorithms. (5 marks)
c) Distinguish between steganography and cryptography. (5 marks)
d) Compare public and private key encryption methods. (5 marks)

Answer Pointers
B4. a) For a typical email system (for example, based upon SMTP using POP/IMAP) retrieving mail from a local host:

Weakness: Sender is not authenticated. Type of attack: masquerading, denial of service. Remedy: implement end-to-end authentication with digital signatures (for example, using PGP).
Weakness: Message contents are not authenticated. Type of attack: tampering, masquerading. Remedy: implement end-to-end authentication with digital signatures (for example, using PGP).
Weakness: Message contents are in plain text. Type of attack: eavesdropping. Remedy: implement end-to-end encryption (for example, using PGP).
Weakness: Delivery and deletion from the POP or IMAP server is authenticated by a login with password only. Type of attack: masquerading. Remedy: utilise Kerberos or SSL for the authentication of clients.

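As a minimal illustration of the message-authentication remedy in B4 a), the sketch below uses a shared-secret HMAC from Python's standard library. Note that PGP's digital signatures use public-key cryptography rather than a shared secret, so this is a simplified stand-in showing only the tamper-detection idea; the key and message are illustrative:

```python
import hashlib
import hmac

# A shared secret, assumed to have been exchanged out of band.
secret = b"shared-secret-key"

def sign(message):
    # Compute an authentication tag bound to both the secret and the message.
    return hmac.new(secret, message, hashlib.sha256).hexdigest()

message = b"Please transfer the April results."
tag = sign(message)

# The receiver recomputes the tag over the received message;
# compare_digest avoids leaking information through timing.
assert hmac.compare_digest(tag, sign(message))           # authentic
assert not hmac.compare_digest(tag, sign(b"tampered"))   # tampering detected
```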
B4. b) In a symmetric security algorithm, both encryption and decryption require the same key (or a key that can be transformed trivially between two states, one for each task). The key (and any required transform) is held secretly by both communicating parties, and must somehow be exchanged beforehand. In contrast, asymmetric security algorithms entail different keys for encryption and decryption, such that only the recipient holds the decryption key. The two keys are mathematically related, but deriving the private key from the public key is a prohibitively expensive operation for an attacker to attempt.

B4. c) The steganography approach to secure communication involves embedding a sensitive message in a carrier, which is typically a larger segment of data with an apparently different use; for example, embedding an email message at intervals within a sound file, such that only the recipient has the correct information to extract the original message. If the carrier is large and is used merely as a capsule in which to embed the sensitive message, the volume of data to be transmitted is large. Cryptography, on the other hand, requires that the original message is modified in some way so that it is no longer directly readable. This may not increase the volume of data to be processed, but requires that the recipient has the correct algorithm and key to decrypt the message. These techniques may also be used in unison: encrypting a sensitive message and then embedding it in a carrier using steganography.

B4. d) This question is related to the one posed above (b), which focussed upon symmetric and asymmetric algorithms; these are typically implemented by means of private and public key encryption (i.e., these terms are more specific).
Using private key encryption, both communicating parties share a secret key before the exchange takes place to enable encryption/decryption (it is common for the encryption and decryption keys to be identical, but they might also be related via a trivial transform). Conversely, in the public key approach, two different keys exist, one public and one private. It is often the case that neither key may be inferred from the other, though this is not always so. The public key, which is freely available, enables messages to be encrypted, but messages cannot be decrypted without the private key. Interestingly, the sender does not possess the private decryption key, so even they cannot decrypt the message that they have prepared for transmission.

Examiners' Comments
This was the least popular question on this year's paper, and accrued the second lowest mean score. Few students were able to correctly distinguish between the steganography and cryptography approaches to security. Question a) was adapted from an exercise in the recommended textbook, but despite this it was generally not well answered.
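The embedding idea from B4 c) can be sketched as a toy least-significant-bit scheme, hiding a short message in an arbitrary byte-string carrier (illustrative only; practical steganography would use image or audio data, and would normally encrypt the message first):

```python
def embed(carrier, message):
    # Expand the message into bits, least significant bit of each byte first.
    bits = [(byte >> i) & 1 for byte in message for i in range(8)]
    if len(bits) > len(carrier):
        raise ValueError("carrier too small for message")
    out = bytearray(carrier)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit   # overwrite only the LSB
    return bytes(out)

def extract(carrier, length):
    # Recover `length` bytes by reading the LSBs back in the same order.
    bits = [b & 1 for b in carrier[:length * 8]]
    return bytes(sum(bits[i * 8 + j] << j for j in range(8))
                 for i in range(length))

carrier = bytes(range(256)) * 2          # 512 bytes of cover data
stego = embed(carrier, b"hi")
print(extract(stego, 2))                 # b'hi'
```

The stego output differs from the carrier only in its low-order bits, so the cover data remains superficially unchanged, which is the essential property the answer above describes.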

B5.

Your employer has asked you to prepare an oral presentation comparing the relative merits of high-availability and high-performance cluster computing. Provide 8-10 content-rich slides for your presentation, with explanatory notes where necessary. Note: your answer will be assessed for its quality of approach, accuracy of content, clarity of expression, range of discussion, and depth of argument. (25 marks)

Answer Pointers
This question format features regularly in the Distributed & Parallel Systems examination. In this instance, candidates were required to discuss HA and HP clustering. Candidates would be expected to spend approximately 5 minutes on each slide, focusing on quality rather than quantity, and to ensure that the slides presented are succinct, factual and informative. Furthermore, candidates should endeavour to ensure that all slides presented are relevant to the topic to be discussed. Credit is given for identifying relevant issues, illustrating these carefully (with both words and diagrams), and having a logical structure to the presentation that leads the reader through the topics identified in a thoughtful manner.

Examiners' Comments
This question remained popular, being jointly the second most commonly attempted question, and it attracted the second highest mean mark. Candidates who carefully selected an appropriate number of relevant topics and presented information clearly and succinctly scored the highest marks. Candidates who used a scattershot approach, presenting a large number of slides containing barely relevant material, scored considerably less well.