
Master of Computer Application (MCA) Semester 4 MC0076 Management Information Systems

Assignment Set 1

Q. 1. What Do You Understand by Information as Processed Data?

Information as Processed Data
Data are generally considered to be raw facts that have undefined uses and applications; information is considered to be processed data that influences choices, that is, data that have somehow been formatted, filtered, and summarized; and knowledge is considered to be an understanding derived from information. The distinctions among data, information, and knowledge may be derived from scientific terminology. The researcher collects data to test hypotheses; thus, data refer to unprocessed and unanalysed numbers. When the data are analysed, scientists talk about the information contained in the data and the knowledge acquired from their analyses. The confusion often extends to the information systems context, where the three terms may be used interchangeably.

Information as the Opposite of Uncertainty
A different perspective on information derives from economic theory and defines information as the negative measure of uncertainty; that is, the less information is available, the more uncertainty exists, and conversely, the more information is available, the less uncertainty exists. In microeconomic theory the equilibrium of supply and demand depends on a market known as a perfect market, where all buyers and sellers have complete knowledge about one another and where uncertainty does not exist. Information makes a market perfect by eliminating uncertainties about supply and demand. In macroeconomic theory, firms behave according to how they read the economic climate. Economic signals that measure and predict the direction of the economy provide information about the economic climate. The firm reduces its uncertainty by decoding these signals.
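The idea that more information means less uncertainty can be made concrete with a small calculation. The sketch below uses Shannon entropy, a measure borrowed from information theory rather than from the economic literature referred to here, so it is only one possible way to put a number on uncertainty. The arrival windows and their probabilities are hypothetical figures chosen for illustration.

```python
import math


def entropy_bits(probabilities):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Hypothetical figures: before any signal is received, four arrival
# windows are considered equally likely.
before_signal = [0.25, 0.25, 0.25, 0.25]

# Hypothetical figures: a signal rules out two of the windows, and the
# remaining two are equally likely.
after_signal = [0.5, 0.5]

uncertainty_before = entropy_bits(before_signal)
uncertainty_after = entropy_bits(after_signal)

# The drop in uncertainty is one way to express the information gained.
information_gained = uncertainty_before - uncertainty_after

print(f"Uncertainty before the signal: {uncertainty_before:.2f} bits")
print(f"Uncertainty after the signal:  {uncertainty_after:.2f} bits")
print(f"Information gained:            {information_gained:.2f} bits")
```

Under these assumed figures, the signal halves the number of equally likely outcomes, which corresponds to a reduction of one bit of uncertainty.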

Taking the example of Federal Express in the USA, each incoming aircraft has a scheduled arrival time. However, its actual arrival depends on unforeseen conditions. Data about when an aircraft departed from its origin is information in the economic sense because it reduces uncertainty about the aircraft's arrival time, thereby increasing Federal Express's ability to handle arriving packages. Managers also define information in terms of its reducing uncertainty. Because managers must project the outcomes of alternatives in making decisions, the reduction of uncertainty about the outcomes of various alternatives improves the effectiveness of the decision-making process and the quality of the decision.

Information as a Meaningful Signal
Information theory, a branch of statistics concerned with measuring the efficiency of communication between people and/or machines, defines information as the inputs and outputs of communication. Electronic, auditory, visual, or other signals that a sender and receiver interpret similarly convey information. For example, in a recruitment scenario, the resumes and applications for the open positions are information because they are signals sent by the applicants and interpreted similarly by both sender and receiver. Managers, in their roles as communicators, both generate and receive information. They receive reports that organize signals or data in a way that conveys their meaning. Reports of sales trends become information; so do reports about hazardous waste sites. Managers derive meaning from the information they see and hear as part of communication and use it to make decisions. This definition of information requires a manager to interpret a given signal as it was intended.
For example, a manager's incorrect interpretation of body language in a negotiation would not be considered to be information from this perspective, although we know that managers use both correct and incorrect perceptions as information in decision making and other managerial functions. Again, this view of information suggests the complexity of the concept and the value of a multifaceted definition.

Q. 2. Discuss the Components of an Organizational Information System.

Ans. Components of an Organizational Information System
The environment in which organizations operate can be viewed from the informational perspective in terms proposed by George Huber of the University of Texas, who has studied the organizational design required by an information society. His conclusions provide a framework for determining what is required of an organizational information system. These, according to Huber, are the hallmarks of an information society:
1) Dramatic Increase of Available Knowledge
Whether measured in terms of the number of scholarly journals, patents and copyrights, or in terms of the volumes of corporate communications, both the production and the distribution of knowledge have undergone a manifold increase.
2) Growth of Complexity
Huber characterizes complexity in terms of numerosity, diversity, and interdependence. A growing world population and the industrial revolution combined to produce numerosity, or a growing number of human organizations. To succeed, people and organizations learned to specialize: they do things differently and organize themselves differently to accomplish specialized tasks. These differences lead to diversity. Two principal factors have led to increased interdependence. The first has been the revolution in the infrastructure of transportation and communication. The second factor is specialization in firms that make narrowly defined products, as opposed to the self-sufficiency of companies producing a complex product down to its minute elements. A company's product is typically a part of a larger system, produced with contributions from a number of interdependent firms (consider a car or a computer). Moreover, interdependence has increased on a global scale. Even the most isolated of countries participates in some way in the international division of labor. Organizations operating in the public sector, while rarely in a competitive situation, are still governed by the demands of society.
Pressures on the public sector in democratic societies, along with the pressures conveyed from the private sector, also make the environment in which public organizations operate more complex.
3) Increased Turbulence
The pace of events in an information society is set by technologies. The speeds of today's computer and communication technologies have resulted in a dramatic increase in the number of events occurring within a given time. Consider the volumes and speed of trades in the securities and currency markets. Widespread use of telefacsimile, as another example, has removed the "float" (the lag between sending and receiving) in written communications. Equally important, because of the infrastructure discussed earlier, the number of events that actually influence an organization's activities (effective events) has also grown rapidly. The great amount of change and turbulence pressuring organizations today thus calls for rapid innovation in both product and organizational structure. To thrive, an organization must have information systems able to cope with large volumes of information in a selective fashion.
Huber concludes that these factors (an increase of available knowledge, growth of complexity, and increased turbulence) are not simply ancillary to a transition to the new societal form. Rather, they will be a permanent characteristic of the information society in the future. Moreover, we should expect that these factors will continue to expand at an accelerating rate (a positive feedback exists). Barring some catastrophic event, we expect that the rapidly changing environment will be not only "more so" but also "much more so." To succeed in an information society, organizations must be compatible with this environment.

Q. 3. What are the Features Contributing to the Success and Failure of MIS Models?

Ans. The Components of MIS
The physical components of MIS comprise the computer and communications hardware, software, database, personnel, and procedures. Almost all organizations employ multiple computer systems, ranging from powerful mainframe machines (sometimes including supercomputers) through minicomputers, to widely spread personal computers (also known as microcomputers). The use of multiple computers, usually interconnected into networks by means of telecommunications, is called distributed processing. The driving forces that have changed the information processing landscape from centralized processing, relying on single powerful mainframes, to distributed processing have been the rapidly increasing power and decreasing costs of smaller computers.
Though the packaging of hardware subsystems differs among the three categories of computers (mainframes, minicomputers, and microcomputers), all of them are similarly organized. Thus, a computer system comprises a central processor (though multiprocessors with several central processing units are also used), which controls all other units by executing machine instructions; a hierarchy of memories; and devices for accepting input (for example, a keyboard or a mouse) and producing output (say, a printer or a video display terminal). The memory hierarchy ranges from a fast primary memory from which the central processor can fetch instructions for execution; through secondary memories (such as disks) where on-line databases are maintained; to the ultra high capacity archival memories that are also employed in some cases.

COMPONENT - DESCRIPTION
Hardware - Multiple computer systems (mainframes, minicomputers, personal computers). Computer system components (central processor(s), memory hierarchy, input and output devices). Communications (local area networks, metropolitan area networks, and wide area networks).
Software - Systems software and applications software.
Database - Organized collections of data used by applications software.
Personnel - Professional cadre of computer specialists; end users in certain aspects of their work.
Procedures - Specifications for the use and operation of computerized information systems collected in user manuals, operator manuals, and similar documents.

Multiple computer systems are organized into networks in most cases. Various network configurations are possible, depending upon an organization's need. Fast local area networks join machines, most frequently clusters of personal computers, at a particular organizational site such as a building or a campus. The emerging metropolitan area networks serve large urban communities. Wide area networks connect machines at remote sites, both within the company and in its environment. Through networking, personal-computer users gain access to the broad computational capabilities of large machines and to the resources maintained there, such as large databases. This connectivity converts personal computers into powerful workstations.
Computer software falls into two classes: systems software and applications software. Systems software manages the resources of the system and simplifies programming. Operating systems (UNIX, for example) control all the resources of a computer system and enable multiple users to run their programs on a computer system without being aware of the complexities of resource allocation. Even if you are just using a personal computer, a complex series of actions takes place when, for example, you start the machine, check out its hardware, and call up a desired program. All of these actions fall under the control of an operating system, such as DOS or IBM OS/2. Telecommunications monitors manage computer communications; database management systems make it possible to organize vast collections of data so that they are accessible for fast and simple queries and the production of reports. Software translators (compilers or interpreters) make it possible to program an application in a higher-level language, such as COBOL or C. The translator converts program statements into machine instructions ready for execution by the computer's central processor.
Applications software directly assists end users in their functions. Many categories of applications software are purchased as ready-to-use packages. Examples include general-purpose spreadsheet or word processing programs, as well as the so-called vertical applications serving a specific industry segment (for example, manufacturing resource planning systems or accounting packages for small service businesses). The use of purchased application packages is increasing. However, the bulk of applications software used in large organizations is developed to meet a specific need. Large application systems consist of a number of programs integrated by the database.

To be accessible, data items must be organized so that individual records and their components can be identified and, if needed, related to one another. A simple way to organize data is to create files. A file is a collection of records of the same type. For example, the employee file contains employee records, each containing the same fields (for example, employee name and annual pay), albeit with different values. Multiple files may be organized into a database, or an integrated collection of persistent data that serves a number of applications. The individual files of a database are interrelated. Professional MIS personnel include development and maintenance managers, systems analysts, programmers, and operators, often with highly specialized skills. The hallmark of the present stage in organizational computing is the involvement of end users to a significant degree in the development of information systems. Procedures to be followed in using, operating, and maintaining computerized systems are a part of the system documentation.
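The organization of data into records, files, and a database described above can be sketched in a few lines of Python. This is only an illustrative model: the record types, field names, and sample values below are hypothetical, and a real database management system would add persistence, indexing, and a query language.

```python
from dataclasses import dataclass


@dataclass
class Employee:
    """One record in the (hypothetical) employee file."""
    employee_id: int
    name: str
    annual_pay: float
    department_id: int  # field that relates this record to the department file


@dataclass
class Department:
    """One record in the (hypothetical) department file."""
    department_id: int
    name: str


# A file is a collection of records of the same type.
employee_file = [
    Employee(1, "A. Rao", 52000.0, 10),
    Employee(2, "B. Singh", 61000.0, 20),
    Employee(3, "C. Mehta", 48000.0, 10),
]
department_file = [
    Department(10, "Sales"),
    Department(20, "Finance"),
]


def employees_in(department_name):
    """Relate the two files through the shared department_id field."""
    ids = {d.department_id for d in department_file if d.name == department_name}
    return [e.name for e in employee_file if e.department_id in ids]


print(employees_in("Sales"))
```

Every employee record has the same fields with different values, and the shared department_id field is what makes the two files interrelated rather than independent.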

Q. 4. List the Potential External Opportunities and Potential Internal Weaknesses.

Ans. Potential External Opportunities
1) Serve additional customer groups
2) Enter new markets or segments
3) Expand product line to meet broader range of customer needs
4) Diversify into related products
5) Vertical integration
6) Falling trade barriers in attractive foreign markets
7) Complacency among rival firms
8) Faster market growth

Potential Internal Weaknesses
1) No clear strategic direction
2) Obsolete facilities
3) Lack of managerial depth and talent
4) Missing key skills or competence
5) Poor track record in implementing strategy
6) Plagued with internal operating problems
7) Falling behind in R&D
8) Too narrow a product line
9) Weak market image
10) Weaker distribution network
11) Below-average marketing skills
12) Unable to finance needed changes in strategy
13) Higher overall unit costs relative to key competitors

Q. 5. What do you Understand by Multinational Corporation, Global Corporation, International Corporation, and Transnational Corporation?

Ans. Multinational Corporation
A multinational corporation has built or acquired a portfolio of national companies that it operates and manages with sensitivity to its subsidiaries' local environments. The subsidiaries operate autonomously, often in different business areas. A company that follows a multinational strategy has little need to share data among its subsidiaries or between the parent and subsidiaries except to consolidate financial positions at year's end.

Global Corporation
A global corporation has rationalized its international operations to achieve greater efficiencies through central control. Although its strategy and marketing are based on the concept of a global market, a headquarters organization makes all major decisions. A company pursuing a global strategy needs to transfer the operational and financial data of its foreign subsidiaries to headquarters in real time or on a frequent basis. A high level of information flows from subsidiary to parent, while limited data move from parent to subsidiary.

International Corporation
An international corporation exports the expertise and knowledge of the parent company to subsidiaries. Here subsidiaries operate more autonomously than in global corporations. Ideally, information flows from the parent to its subsidiaries. In practice, subsidiaries often rely on the parent to exercise its knowledge for the subsidiaries' benefit rather than simply to export it to the subsidiaries. For example, a subsidiary without a great deal of human resources expertise may "pay" its parent to operate its human resources function. Although the information theoretically should stay within the subsidiary, in this case it may flow back and forth between the parent's location and the subsidiary's location.

Transnational Corporation
A transnational corporation incorporates and integrates multinational, global, and international strategies. By linking local operations to one another and to headquarters, a transnational company attempts to retain the flexibility to respond to local needs and opportunities while achieving global integration. Because transnationals operate on the premise of teamwork, they demand the ability to share both information and information services.

Q. 6. What are the Limitations of ERP Systems? How do ERP Packages Help in Overcoming These Limitations?

Ans. ERP Selection
Since the market offers a number of ERP packages, the buyer has a choice to make. Each product has its own USP and differs in a number of ways in content, scope, ease of implementation, etc. The selection can be made on three dimensions, viz., the vendor, the technology, and the solution scope and architecture.

Vendor Evaluation Factors
1) Business strength of the vendor.
2) Product share in total business of the vendor.
3) R&D investment in the product.
4) Business philosophy of the vendor.
5) Future plans of the vendor.
6) Market reach and resource strength of the vendor.
7) Ability to execute the ERP solution.
8) Strength in the other technology knowledge and the ability to use them.
9) Perspective plan of the ERP improvement with technology development.
10) Image in the business and in the information technology world.
11) Financial strength of the vendor to sustain and handle the business and technology risk.
12) Organisation for product development and support.
13) The global experience of the vendor and commitment to the product for the long term.

Technology Evaluation Factors
1) Client-server architecture and its implementation: two-tier or three-tier.
2) Object orientation in development and methodology.
3) Handling of server and client based data and application logic.
4) Application and use of standards in all the phases of development and in the product.
5) Front-end tools and back-end database management system tools for data, process, and presentation management.
6) Interface mechanism: data transfer, real-time access, OLE/ODBC compliance.
7) Use of CASE tools, screen generators, report writers, screen painters, and batch processors.
8) Support system technologies like bar coding, EDI, imaging, communication, network.
9) Downloading to PC-based packages, MS-Office, Lotus Notes, etc.
10) Operating system and its level of usage in the system.
11) Hardware-software configuration management.

ERP Solution Evaluation Factors
1) ERP fit for the business of the organization in terms of the functions, features, and processes.
2) Business scope versus application scope, and so on.
3) The degree of deviation from the standard ERP product.
4) Ease of use: easy to learn, implement, and train.
5) The ability to migrate to the ERP environment from present status.
6) Flexible design.
7) The level of intelligent usage of help, error messages, and dictionaries.
8) The ability for a quick start on implementation.
9) Versatility of the solution for implementation on a platform with the prospect of saving the investment.
10) Rating on performance, response, and integration.
11) Product quality in terms of security, reliability, and precision in results.
12) Documentation for system handling and administration.
13) Product rating in its class of products.
14) Solution architecture and technology.

The methodology of selection will begin with the study of the organisation in terms of the business focus, critical applications, sensitive business processes, etc. Since the ERP solution is a tool to change the style of business management, it requires a thorough understanding of the business, the business issues, the management criticalities, and the socio-cultural factors. Such a study will help find out if the ERP is fit for the organisation. It is very important to find out whether the ERP is fit or not, as this is the most important and critical success factor. The price of the ERP package is difficult to judge, and often it is a negotiable point in favour of the buyer in a competitive scenario. Since ERP implementation is a two-to-three-year project, the ERP solution will sustain and be adequate for the current and the future business needs for a period of five to seven years. After that, it would become a platform for future expansion and growth.

It is advisable for the organisation to form a committee for selection of the ERP solution. It should include important functional heads, a strong Information Technology person, and a person from the corporate planning function. The committee should be headed by the CEO or his designated authority. This committee should prepare a requirement document spelling out the business goals and objectives, the futuristic scenario of the business, the critical functions, processes, business focus, and customer deliverables. A note on the management philosophy, procedures, practices, and style will be a valuable input. When such a document is ready, the selected ERP vendors should be invited to make an ERP offer. The document should be given to the vendors, and they should be allowed to study the organisation and its business. All the vendors should be asked to submit a technical proposal explaining the fit of the ERP to the organisation. The submissions of the vendors should be scrutinized by the committee for short-listing. The short-listed vendors should then be asked to give a product presentation to the selected group of decision makers to seek their opinion on the product. When the product presentation is over, a product demonstration should be arranged for detailed scrutiny and evaluation. In this process, the committee should confirm whether the critical requirements of the business, in terms of information, process handling facilities, features, etc., are available or not. If some of them are not available, a workaround may be possible to achieve the same result. A second evaluation note should be made for a comparative analysis of the ERP solutions, and a critical evaluation of this analysis should lead to the choice list. Simultaneously, the committee should gather information on the experience of other organisations where the ERP is implemented. This information should cover how successful the vendor is in implementing the ERP.
The strengths and the weaknesses of the vendor, the product, and the post-sales processes should be ascertained. The choice list should be weighed by these points. Though such an approach is appropriate, it is not always possible to bring out a clear winner in the evaluation, as many factors are intangible in nature. In such an event, the committee should examine the trade-offs involved in the selection. It should not happen that organisational issues dominate the choice of the ERP and in the process the best product is rejected. Ideally, the organisation should carry out a business process engineering and reengineering study, restructure the organisation, and modify the process functionalities before the ERP decision is made. Once the committee makes the decision, the vendor should be asked to resubmit the technical and commercial proposal with price and the terms of offer. The proposal should have the following details:
1) Scope of supply
2) Objectives
3) Modules and deliverables
4) Implementation methodology
5) Plan and schedules of hardware and software implementation
6) Resource allocation
7) Responsibility division between the organisation and the vendor
8) Process of implementation
9) Organisation of implementation
10) Progress monitoring and control of the important events
11) Process of resolving issues at all levels
12) The official product literature
13) Association with other vendors and its purpose
14) Commercial submission:
i. Price by module and number of users
ii. Payment terms
15) Process of acceptance of the ERP by stages and linking with the payments

Once the ERP decision is made, the vendor and the organisation enter into a legal contract. Such a legal contract should list the obligations, duties, responsibilities, deliverables, and the value components. It should also include clauses on issues arising out of unforeseen circumstances and how to resolve them with the legal remedy available to both parties. Since the ERP is a product of several technologies, there should be clauses relating to safeguarding the interests of each party to cover the risk arising out of technology failure.
The ERP is a tool to manage the enterprise resources to achieve the business objective. It is a supporting system and does not solve all the problems of business management. The success of the ERP lies in its implementation with commitment. It requires full participation of the organisation. It is to be appreciated as a managerial tool and not as a labour-saving device. Since the ERP is potentially designed for a rise in productivity, the management must exploit it to its advantage by adopting the best practices or changing the practices through business process reengineering.
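The comparative analysis that leads to the choice list can be sketched as a weighted scoring matrix. This is one common way to structure such a comparison, not a method prescribed in the text above. The package names, the weights for the three selection dimensions, and the ratings are all hypothetical values that a selection committee would set for itself.

```python
def weighted_score(ratings, weights):
    """Weighted average of the ratings, using the committee's weights."""
    total_weight = sum(weights.values())
    return sum(ratings[factor] * weight for factor, weight in weights.items()) / total_weight


# Hypothetical weights for the three selection dimensions.
weights = {"vendor": 3, "technology": 3, "solution": 4}

# Hypothetical ratings (1 to 10) given by the committee to each package.
ratings_by_package = {
    "Package A": {"vendor": 8, "technology": 6, "solution": 7},
    "Package B": {"vendor": 6, "technology": 9, "solution": 8},
}

scores = {
    name: weighted_score(ratings, weights)
    for name, ratings in ratings_by_package.items()
}

# Rank the packages from highest to lowest score to form the choice list.
choice_list = sorted(scores, key=scores.get, reverse=True)

for name in choice_list:
    print(f"{name}: {scores[name]:.1f}")
```

A score of this kind only organizes the tangible factors. Intangible factors and trade-offs still need the committee's judgement, which is why the result is better read as a starting point for discussion than as the decision itself.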

MC0076 Management and Information Systems


Assignment Set 2 (60 Marks)

1. How do you define Management Reporting Systems?


Ans. Management Reporting Systems
Management reporting systems (MRS) are the most elaborate of the management-oriented MIS components. Indeed, some writers call MRS management information systems, the name we reserve for the entire area of informational support of operations and management. The main objective of MRS is to provide lower and middle management with printed reports and inquiry capabilities to help maintain operational and management control of the enterprise.

Characteristics of MRS
1) MRS are usually designed by MIS professionals, rather than end users, over an extensive period of time, with the use of life-cycle oriented development methodologies (as opposed to first building a simpler prototype system and then refining it in response to user experience). Great care is exercised in developing such systems because MRS are large and complex in terms of the number of system interfaces with various users and databases.
2) MRS are built for situations in which information requirements are reasonably well known and are expected to remain relatively stable. Modification of such systems, like their development, is a rather elaborate process. This limits the informational flexibility of MRS but ensures a stable informational environment.
3) MRS do not directly support the decision-making process as a search for alternative solutions to problems. Naturally, information gained through MRS is used in the manager's decision-making process. Well-structured decision rules, such as economic order quantities for ordering inventory or accounting formulas for computing various forms of return on equity, are built into the MRS itself.
4) MRS are oriented towards reporting on the past and the present, rather than projecting the future.
5) MRS generally have limited analytical capabilities: they are not built around elaborate models, but rather rely on summarization and extraction from the database according to given criteria. Based on simple processing of the data summaries and extracts, report information is obtained and printed (or, if of limited size, displayed as a screen) in a prespecified format.
6) MRS generally report on internal company operations rather than spanning the company's boundaries by reporting external information.

Reporting by MRS
MRS may produce reports either directly from the database collected through transaction processing systems, or from databases spun off for the purpose. Separate spin-off databases may be created for several reasons, such as avoiding interference and delays in transaction processing, maintaining the security of central databases, or economizing by using local databases accessible to local managers to counter the heavy telecommunication costs of working with a central database. MRS provide the following report forms:
1) Scheduled (Periodic) Reports
These reports are furnished on a daily, weekly, biweekly, or other basis depending on the decision-making need. A sales manager may use a weekly sales analysis report to assess the performance of sales districts or individual salespeople. A brand manager responsible for a particular product might obtain a weekly sales report containing information useful in his or her decision making, showing regional sales and sales to various market segments.
2) Exception Reports
Another means of preventing information overload is resorting to exception reports, produced only when preestablished "out-of-bounds" conditions occur and containing solely the information regarding these conditions. For example, purchasing managers may need an exception report when suppliers are a week late in deliveries. Such a report may be triggered automatically by the delay of an individual supplier, or produced on a scheduled basis, but only if there are late suppliers. The report might include a list of late suppliers, the extent to which each is late, and the supplies ordered from each. Exception reporting helps managers avoid perusal of unneeded figures.
3) Demand (Ad Hoc) Reports
The ability of a manager to request a report or screen output as needed enhances the flexibility of MRS use and gives the end user (the individual manager) the capability to request the information and format that best suit his or her needs. Query languages provided by DBMS make data accessible for demand reporting.
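The late-supplier exception report described above can be sketched as follows. This is a minimal illustration under stated assumptions: the order records, supplier names, dates, and the seven-day threshold constant are hypothetical, and a real MRS would draw the orders from the transaction processing database rather than from a list in memory.

```python
from datetime import date

# Hypothetical "out-of-bounds" condition: a delivery at least a week overdue.
LATE_THRESHOLD_DAYS = 7

# Hypothetical open purchase orders that have not yet been delivered.
open_orders = [
    {"supplier": "Supplier A", "item": "steel rods", "due": date(2024, 3, 1)},
    {"supplier": "Supplier B", "item": "packaging", "due": date(2024, 3, 8)},
    {"supplier": "Supplier C", "item": "fasteners", "due": date(2024, 2, 20)},
]


def late_supplier_report(orders, today, threshold_days=LATE_THRESHOLD_DAYS):
    """Return one row per order that is overdue by at least threshold_days."""
    rows = []
    for order in orders:
        days_late = (today - order["due"]).days
        if days_late >= threshold_days:
            rows.append({
                "supplier": order["supplier"],
                "item": order["item"],
                "days_late": days_late,
            })
    return rows


report = late_supplier_report(open_orders, today=date(2024, 3, 10))

# The report is produced only when the exception condition actually occurs.
if report:
    for row in report:
        print(f"{row['supplier']}: {row['item']} is {row['days_late']} days late")
```

Orders that are on time or only slightly late never reach the manager, which is the point of exception reporting: an empty result means no report is issued at all.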

2. Explain with relevant examples the concept of business process. Also mention their elements.

Ans. Strategic management level
# Strategic network optimization, including the number, location, and size of warehousing, distribution centers, and facilities.
# Strategic partnerships with suppliers, distributors, and customers, creating communication channels for critical information and operational improvements such as cross docking, direct shipping, and third-party logistics.
# Product life cycle management, so that new and existing products can be optimally integrated into the supply chain and capacity management activities.
# Information technology chain operations.
# Where-to-make and make-buy decisions.
# Aligning overall organizational strategy with supply strategy.
# It is for the long term and needs resource commitment.

Tactical management level
# Sourcing contracts and other purchasing decisions.
# Production decisions, including contracting, scheduling, and planning process definition.
# Inventory decisions, including quantity, location, and quality of inventory.
# Transportation strategy, including frequency, routes, and contracting.
# Benchmarking of all operations against competitors and implementation of best practices throughout the enterprise.
# Milestone payments.
# Focus on customer demand and habits.

Operational management level
# Daily production and distribution planning, including all nodes in the supply chain.
# Production scheduling for each manufacturing facility in the supply chain (minute by minute).
# Demand planning and forecasting, coordinating the demand forecast of all customers and sharing the forecast with all suppliers.
# Sourcing planning, including current inventory and forecast demand, in collaboration with all suppliers.
# Inbound operations, including transportation from suppliers and receiving inventory.
# Production operations, including the consumption of materials and flow of finished goods.
# Outbound operations, including all fulfillment activities, warehousing, and transportation to customers.
# Order promising, accounting for all constraints in the supply chain, including all suppliers, manufacturing facilities, distribution centers, and other customers.
# Accounting for all transit damage cases from the production level to the supply level, and arranging settlement at the customer level, with the company's loss covered through an insurance company.

Consider the information required at each of the above levels by the different departments, i.e. the human resource department, financial department, marketing department, and production/operations department.

4. Explain various Organizational limits to Rational Decision Making.


Ans A manager is a problem solver, and the fundamental activity in problem solving is decision making. Decision making is the process of identifying a problem, developing alternative solutions, and choosing and implementing one of them. Problems that require decisions are sometimes difficult to perceive, and even more difficult to define (or "frame"); often they require multiple decisions to solve. A well-established model of the decision-making process has been proposed by Herbert Simon (1960), based on the formulation of methodical thinking by the philosopher John Dewey (though it can be traced back to Aristotle). Simon's three-step model is shown in figure 9.1. The process begins with a search for a problem or an opportunity; bold people do call problems "opportunities in disguise." Effective managers thus do not avoid problems; they seek them out. Innovative companies seek out customer opinions about their products. Proactive managers of information systems work closely with end users to see what problems they can solve for them.

Fig. 9.1: Simon's three-step model

More and more, chief executives turn to their executive information systems each morning to look for first signs of developing problems or opportunities. This first stage of the decision-making process is called intelligence, because problem finding requires a search of the environment: problems frequently do not present themselves for some time, and opportunities do so even more rarely. Executive information systems and carefully designed management reporting systems contain built-in triggers and exceptions that help alert a manager to a problem. Systems developed to address the critical success factors (CSF) of an individual manager are likely to spotlight a problem. Once a manager finds a problem, he or she needs to formulate or "frame" it. An experienced manager often recognizes a problem as similar to one he or she has already encountered. The intuitive grasp of a problem most often relies on such an ability to establish an analogy.

The activity that Simon called design involves the development of alternative solutions to a problem. This is a creative, divergent (leading in many directions) process. Some of the solutions may require more intelligence, that is, more information gathering about the problem. Solutions to the problems we are discussing are actually courses of action; there are many aspects to such a solution, and the phases of intelligence and design may be rather tightly interlocked iterations: garnering more information leads to new alternatives, which in turn call for more information. Problem framing and development of alternatives, highly creative processes, find rather scant support from automated systems. Some decision support systems offer a certain assistance here, but most of the tools we shall discuss below rely on human ingenuity informed by MIS.

The choice of an alternative often has to be made in an environment of considerable risk or uncertainty. No satisfactory solution may be found among the available alternatives, in which case the decision makers may have to go back to the design stage to develop other alternatives, or even to the intelligence stage to reformulate the problem. The what-if mode of decision support systems directly supports this phase. Expert systems begin to support it as well.

Implementation of a decision is a broad issue. In general, both the quality of the decision and of its implementation are higher if the people who make the decision are also responsible for its implementation. Many implementation difficulties have been traced to the separation of these functions. Organizational change processes may need to be activated to implement more far-reaching decisions. The effects of such decisions may be tracked with management reporting and executive support systems. As in any control process, corrective actions may have to be initiated when necessary; indeed, managers may have to rethink the decision. Project management software is used to schedule human resources and to track project timelines. Simon has also classified all decisions into two classes, now called structured and unstructured. Structured decisions are repetitive and can be represented as algorithms, that is, programmable procedures. Thus, they can be relegated with ease to computer processing. Unstructured decisions require human judgment, although the decision-making process can usually profit from computerized support.

Rational decision making and its limits
The classical concept of a perfectly rational decision maker does not apply to the plethora of situations in organizational decision making. Most decision making is subject to bounded, limited rationality. Some of the limits arise from the way organizations function, some from our cognitive limitations as human individuals. The classical model of a decision maker was formulated in economic theory and is usually attributed to Adam Smith. This model is normative (prescriptive); that is, it describes how a person should make a decision. The process the model describes is known as rational decision making. The model makes the following very strong assumptions:
- The rational decision maker seeks to maximize the payoff from a decision (for example, profit or market share attained by a firm); in more general terms, the decision maker seeks to optimize.
- The decision maker knows all possible courses of action (alternatives).
- The decision maker knows the outcome of each course of action.

The model approximates certain real situations, and we shall discuss techniques for applying it. However, with more complex, less structured decisions, it is impossible to specify all the alternatives and their outcomes, owing both to their excessive number and to lack of information. The impossibility of centralized planning of a nation's economy, even with the use of any computing power available in the foreseeable future, is just one indication that the decision-making model based on full rationality is, in general, not realistic. Since full rationality, with its goal of optimizing, is an impossibility in most realistic situations, an alternative theory of decision-making behavior has evolved. In this theory, proposed by Herbert Simon (1960), the decision maker exhibits bounded rationality. Since a decision maker's ability to perceive all the alternatives and their outcomes is limited by cognitive abilities, financial resources, and time pressures, this model suggests that decision makers do not actually optimize when making decisions. Rather, they satisfice (a word coined by Simon). That is, decision makers choose the first alternative that moves them toward their goal; the goal itself may be adjusted as incremental decisions succeed or fail. The alternative chosen by a satisficing decision maker satisfies his or her aspiration level and risk-taking propensity. Therefore, raising the aspiration levels of managers and heightening their expectations is one technique for teaching them innovative decision making.
This prevents them from settling on minimalistic departures from standard operating procedures. If we consider the concept of bounded rationality more broadly, we should be able to identify both the organizational and individual factors that limit it.

Organizational Limits to Rational Decision Making

The rational model of organizational decision making reflects only some aspects of the decision-making environment: those that lend themselves most readily to receiving support from information systems. Other aspects include incrementalism, chance-driven choice making, political/competitive behavior, and programmed choice making. As you shall see, most of these decision-making behaviors are rooted in the divergent interests of the people involved in making a decision. Therefore, various types of group decision support systems (GDSSs) can help these groups to negotiate, foresee, and manage a crisis, or to look at a broad array of alternatives before arriving at a decision.

Charles Lindblom analyzed how the decision-making process, particularly in large organizations (including governments), differs from the rational model. He contended that decision making in large organizations under ordinary circumstances is a process of "muddling through": making small, incremental changes from existing actions and policies. The important criteria in this decision-making mode are avoiding the uncertainty of major changes and maintaining the consensus of all involved. Making a decision is not concluded by the "choice" of an alternative; it is rather a continuous process, during which any chosen course of action may be modified as it is implemented. The more recent, and most pessimistic, so-called garbage can theory of organizational decision making is based on the premise that not all organizations are destined to succeed; many companies (even those considered excellent at some point) will fail. These firms are unable to adapt to the changing environment, and much of their decision making consists of attaching solutions to problems in a rather random manner. In one sense, "garbage-can" decision making is present to some extent in all companies: because of the difficulty in forecasting outcomes, chance does play a role in providing a solution to many an organizational problem. Other aspects of organizational decision making are reflected by what George Huber called the political/competitive model. A decision process generally includes several participants, each of whom may seek to influence the decision in a direction favorable to themselves or to the unit they represent. For example, several studies of budget development clearly point to it being a political process. The need to reconcile the diverging interests of various stakeholders (for example, senior management, labor, government, and others) often leads participants to avoid making major departures from current policies, and is thus one of the reasons for incremental decision making.
Rational decision making in organizations is also limited by programmed behavior. When decision makers engage in this type of behavior, they follow standard operating procedures, which constrains their choices and prevents creative problem solving as they opt for the "safe and tried." An analysis of the results of previous choices, assisted by information systems, may help decision makers relax the constraints of programmed choice making.

Individual Limits to Rational Decision Making

Individual capability to make rational decisions is also limited. Individuals have frames of reference based on their experience, knowledge, and cultural backgrounds. These frames of reference act as filters, blocking out certain types of information or certain alternative courses of action, to the possible detriment of quality decision making. Human ability to process information is limited by what Princeton University psychologist George Miller called "the magical number seven, plus or minus two". In other words, we cannot retain in short-term memory and consider simultaneously during decision making more than five to nine "chunks" of information. A simple example is the number of digits in an international long-distance telephone number, which we usually need to dial right after being told what it is. To cope, we organize the individual digits into larger chunks (a familiar area code or country code is such a chunk; we remember it as a single unit). It is partly because of this limitation that we analyze or design information systems through a process of stepwise refinement; we are thus able to handle the system by dealing with only a few components at a time.

Human decision making is distorted by a variety of biases. Amos Tversky and Daniel Kahneman have established that people are highly averse to possible loss and will undergo significant risk to prevent it, even though they would not incur such a risk when seeking gain. People frequently perceive a causal relationship between two factors when there are no grounds for doing so. Vivid events that are easily recalled or events in the recent past are unjustifiably assigned higher probability and weigh more heavily in the decision. People more readily accept information that confirms, rather than challenges, their current convictions. Our understanding of probabilistic information is generally poor. For example, unwarranted inferences are frequently drawn from small samples, with neglect of available statistical techniques for ensuring the reliability of such conclusions. This is why "lying with statistics" often encounters low resistance. Decision makers are more likely to use only readily available information rather than transform that information into a potentially more useful form. The form in which information is displayed influences people's understanding of it. For example, items listed first or last have a greater impact than those in the middle. All this means that people's perception of information and their decision making based on that information may be manipulated. It also points up the need to consider carefully the way information is presented in order to avoid biasing decision making. On the other hand, the skill of decision making can and must be acquired through training and reflective practice.

5. Explain different components of DSS.


Ans The three principal DSS subsystems and their principal capabilities are shown in figure 10.1. Various commercial systems support DSS development and package these DSS capabilities in a variety of ways by distributing them among a series of optional modules.

Fig. 10.1: Components of DSS

Data Management Subsystem

The data management subsystem of a DSS relies, in general, on a variety of internal and external databases. Indeed, we have said that the power of DSS derives from their ability to provide easy access to data. This is not to say that a simple, usually spreadsheet-based DSS for the personal use of a manager cannot rely on the manager's limited personal database. It is simply that maintaining the currency and integrity of a significant database of this kind is usually a daunting task. Proliferation of personal databases also contradicts the principles of information resource management.

Fig. 10.2: Data Management Subsystem

On the other hand, it is usually undesirable to provide a DSS with direct access to corporate databases. The performance of the transaction processing systems that access these databases, as well as the responsiveness of the DSS, would both be degraded. Usually, therefore, the database component of DSS relies on extracts from the relevant internal and external databases. The user is able to add to these data at will. This is shown in figure 10.2. The extraction procedure itself is generally specified by a specialist rather than an end user. The specialist needs to pay particular attention to data consistency across multiple decision support systems that extract data from the corporate databases. If extracts for the DSS serving the same functional area are made at different times, the extracted databases will differ and "battles of the printout" may result.

The Model Management Subsystem

The power of DSS rests on the user's ability to apply quantitative, mathematical models to data. Models have different areas of application and come from a variety of sources. Software packages for developing DSS (so-called DSS generators) contain libraries of statistical models. These models include tools for the exploratory analysis of data: tools designed to obtain summarized measures such as mean and median values, variances, scatter plots, and so forth. Other statistical models help analyze series of data and forecast future outcomes by approximating a set of data with a mathematical equation, by extending the trend of a curve by extrapolation techniques, or by providing for seasonal adjustment. The capabilities of the model management component of DSS are summarized in figure 10.3. Other models help establish (or reject) causal relationships between various factors (for example, whether the drop in sales volume is caused by the aging of our target market segment). Market response models show how sales depend on such factors as price and promotion. Simulation models that generate input values randomly from a certain probability distribution (also called Monte Carlo models, after the city where the famous casino is) are employed for waiting-line problems, such as establishing the number of operators needed for order taking or deciding on staffing levels for a service center.
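The waiting-line use of Monte Carlo simulation mentioned above can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the function name `average_wait`, the exponential arrival and service times, and the default rates are not taken from the text or from any DSS product.

```python
import heapq
import random


def average_wait(num_operators, num_calls=5000,
                 mean_interarrival=1.0, mean_service=2.0, seed=42):
    """Estimate the average waiting time (in minutes) per call.

    Assumed model: calls arrive at random (exponential gaps), each call
    takes a random (exponential) service time, and calls are answered
    first come, first served by whichever operator becomes free first.
    """
    rng = random.Random(seed)             # fixed seed: repeatable runs
    free_at = [0.0] * num_operators       # time at which each operator is free
    heapq.heapify(free_at)
    clock = 0.0
    total_wait = 0.0
    for _ in range(num_calls):
        clock += rng.expovariate(1.0 / mean_interarrival)   # next arrival
        service = rng.expovariate(1.0 / mean_service)       # its service time
        earliest = heapq.heappop(free_at)                   # first free operator
        start = max(clock, earliest)
        total_wait += start - clock
        heapq.heappush(free_at, start + service)
    return total_wait / num_calls


if __name__ == "__main__":
    for operators in (1, 2, 3, 4):
        print(operators, "operators -> average wait",
              round(average_wait(operators), 2), "minutes")
```

Running the loop for several staffing levels shows how the estimated wait falls as operators are added, which is the kind of comparison a manager would use when deciding on staffing.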

Fig. 10.3: Model Management Subsystem

Optimization models, developed by management scientists, are available for use in DSS. These models aim to allocate resources to maximize profit or minimize cost or time. A number of such models are based on a linear programming technique. These include models that allocate input resources (labor, materials, capital) among various products; models that assign activities to personnel or equipment; and models that determine the best shipping schedules from several points of origin to several destinations. Other models optimize inventory levels or determine optimal network configurations. Specialized model libraries are available for financial modeling, risk analysis, or marketing.

A particular advantage of DSS is the decision maker's ability to use a model to explore the influence of various factors on outcomes (a process known as sensitivity analysis). Two forms of such analysis are what-if analysis and goal seeking. When doing what-if analysis, the decision maker creates multiple scenarios by assuming various realistic values for input data. Thus, the decision maker asks "What if these are the values of the inputs?" The model recomputes outputs for each case. Here are some examples of questions that can be directed toward appropriate models: What will be the cost of goods sold if the cost of raw materials increases by 10 percent? What will be the effects on the company bonus program if sales increase by 3 percent and direct expenses increase by 5 percent?

When goal seeking, the decision maker works backward from the assumed results to the needed input values. Thus, the decision maker asks "What will it take to achieve this goal?" Some examples of questions asked in this mode are: What sales volume will be necessary to ensure a revenue growth of 10 percent next year? How many service center employees will it take to ensure that every order is handled within three minutes? What quarterly revenues will we need from each of our three products to generate the desired profits during these quarters? The actual form in which these questions may be asked depends on the options offered by the dialog management subsystem of the DSS, which we shall discuss next.

There is significant research interest in providing a degree of automated model management. The user would be able to present the problem in a system of this kind, and the system would automatically select an appropriate model or construct one from the existing models and "building blocks."
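The two forms of sensitivity analysis described above can be sketched in a few lines of Python. The profit model, its default figures, and the helper names `what_if` and `goal_seek` are hypothetical and chosen only for illustration; goal seeking is done here by simple bisection, which assumes the model output rises as the input rises.

```python
def profit(units_sold, price=20.0, unit_cost=12.0, fixed_cost=5000.0):
    """Hypothetical model: profit for a given sales volume."""
    return units_sold * (price - unit_cost) - fixed_cost


def what_if(model, scenarios):
    """What-if analysis: recompute the model output for each named scenario."""
    return {name: model(**inputs) for name, inputs in scenarios.items()}


def goal_seek(model, target, low, high, tolerance=1e-6):
    """Goal seeking: work backward from a target output to the needed input.

    Uses bisection and assumes model(x) increases with x on [low, high].
    """
    if not model(low) <= target <= model(high):
        raise ValueError("target is not reachable within the given range")
    while high - low > tolerance:
        middle = (low + high) / 2
        if model(middle) < target:
            low = middle
        else:
            high = middle
    return (low + high) / 2


if __name__ == "__main__":
    # "What if the cost of raw materials increases by 10 percent?"
    print(what_if(profit, {
        "base case": {"units_sold": 1000},
        "unit cost +10%": {"units_sold": 1000, "unit_cost": 13.2},
    }))
    # "What sales volume will it take to reach a profit of 10,000?"
    print(goal_seek(profit, target=10000.0, low=0.0, high=10000.0))
```

What-if analysis runs the model forward for each assumed set of inputs, while goal seeking searches over one input until the model reproduces the desired result.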
The Dialog Management Subsystem

Along with DSS's ability to apply models to large volumes of data from a variety of sources, a major advantage of DSS is the user-friendly and flexible interface between the human decision maker and such a system. This stands in contrast to management reporting systems. The notable feature is support of multiple forms of input and output. By combining various input and output capabilities of a DSS, users can engage in the individual dialog styles that best support their decision-making styles. The field of artificial intelligence has made some notable contributions to dialog management, such as the ability to specify what is wanted in a subset of natural language or to activate the system by voice. The window capability enables the user to maintain several activities at the same time, with the results displayed in screen windows (the user employs a mouse to move between the windows). A variety of help and even training-by-example capabilities may be offered.

Significant attention has been devoted by researchers to the effectiveness of computer graphics, as opposed to the tabular display of data. Gary Dickson and his colleagues found that, in general, one cannot claim an advantage (however intuitively appealing it may be) for graphics throughout all decision-related activities. They did find, however, that graphs outperform tables when a large amount of information must be presented and a relatively simple impression is desired. This is very often the case, and it is the main reason why executive information systems, discussed later in this chapter, rely heavily on graphics. By analyzing the results of research in this area, Ali Montazemi and Shuohong Wang concluded that line graphics have time-saving effects on decision making for more complex decision tasks only, and are less effective at providing precise information. Color graphics were found to improve decision quality, but they did not reduce the time necessary to arrive at a decision. Graphic representation of quantitative information requires considerable care to prevent distorted perception; Edward Tufte gives a thorough and exciting presentation of the subject. Summarizing the uses of graphical presentation of business information, Richard Scovill tells us that most business graphs are designed to answer just four questions:
1. Who is the biggest?
2. How do circumstances change over time?
3. What is typical or exceptional?
4. How well does one fact predict another?

In general, it has been established that different decision makers and tasks are best supported by different display formats. This again shows that the advantage of DSS in the area of dialog management lies in providing a variety of dialog styles.

6. Write a note on Ethical and Social issues with E-Commerce

Ans
1. The Internet can be used in illegal ways, as there are no laws related to its use. Many servers contain illegal, immoral, or defamatory information (which cannot be legally communicated using facilities like TV, radio, etc.).
2. There is minimal or no control over the Internet (unlike telephone, radio, TV, etc.). Limited banning of material on the Internet is not possible, i.e. it is an all-or-none rule.
3. Free speech advocates say that screening of incoming material is the responsibility of the receiving end.
4. There is no law against spamming, i.e. sending unsolicited mail.
5. Massive flaming: sending a large quantity of e-mail to one address. The question arises: is sending/receiving a large quantity of mail ethical?

Master of Computer Application (MCA) Semester 4 MC0077 Advanced Database Systems 4 Credits Assignment Set -1

Q. 1. Describe the Following.

A. Dimensional Model

The dimensional model is a specialized adaptation of the relational model used to represent data in data warehouses in a way that data can be easily summarized using OLAP queries. In the dimensional model, a database consists of a single large table of facts that are described using dimensions and measures. A dimension provides the context of a fact (such as who participated, when and where it happened, and its type) and is used in queries to group related facts together. Dimensions tend to be discrete and are often hierarchical; for example, the location might include the building, state, and country. A measure is a quantity describing the fact, such as revenue. It is important that measures can be meaningfully aggregated; for example, the revenue from different locations can be added together. In an OLAP query, dimensions are chosen and the facts are grouped and added together to create a summary. The dimensional model is often implemented on top of the relational model using a star schema, consisting of one table containing the facts and surrounding tables containing the dimensions. Particularly complicated dimensions might be represented using multiple tables, resulting in a snowflake schema. A data warehouse can contain multiple star schemas that share dimension tables, allowing them to be used together. Coming up with a standard set of dimensions is an important part of dimensional modeling.
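As a rough illustration of a star schema and an OLAP-style summary, the following Python sketch uses the standard `sqlite3` module with an in-memory database. The table names (`fact_sales`, `dim_location`, `dim_date`), the sample rows, and the helper `revenue_by_state` are all invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: the context of each fact
    CREATE TABLE dim_location (
        location_id INTEGER PRIMARY KEY,
        building TEXT, state TEXT, country TEXT);
    CREATE TABLE dim_date (
        date_id INTEGER PRIMARY KEY,
        year INTEGER, quarter INTEGER);

    -- Fact table: one row per fact, with revenue as the measure
    CREATE TABLE fact_sales (
        location_id INTEGER REFERENCES dim_location(location_id),
        date_id INTEGER REFERENCES dim_date(date_id),
        revenue REAL);

    INSERT INTO dim_location VALUES
        (1, 'Store A', 'NY', 'USA'),
        (2, 'Store B', 'NY', 'USA'),
        (3, 'Store C', 'CA', 'USA');
    INSERT INTO dim_date VALUES (1, 2023, 1), (2, 2023, 2);
    INSERT INTO fact_sales VALUES
        (1, 1, 100.0), (2, 1, 150.0), (3, 1, 200.0), (1, 2, 50.0);
""")


def revenue_by_state(connection):
    """Group the facts by one level of the location dimension and sum the measure."""
    rows = connection.execute("""
        SELECT l.state, SUM(f.revenue)
        FROM fact_sales AS f
        JOIN dim_location AS l ON f.location_id = l.location_id
        GROUP BY l.state
        ORDER BY l.state
    """).fetchall()
    return dict(rows)


if __name__ == "__main__":
    print(revenue_by_state(conn))
```

Grouping by `state` rather than `building` moves up the location hierarchy; the same fact table can be summarized along `dim_date` in the same way.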

B. Object Database Model

In recent years, the object-oriented paradigm has been applied to database technology, creating a new programming model known as object databases. These databases attempt to bring the database world and the application programming world closer together, in particular by ensuring that the database uses the same type system as the application program. This aims to avoid the overhead (sometimes referred to as the impedance mismatch) of converting information between its representation in the database (for example as rows in tables) and its representation in the application program (typically as objects). At the same time, object databases attempt to introduce the key ideas of object programming, such as encapsulation and polymorphism, into the world of databases. A variety of ways have been tried for storing objects in a database. Some products have approached the problem from the application programming end, by making the objects manipulated by the program persistent. This also typically requires the addition of some kind of query language, since conventional programming languages do not have the ability to find objects based on their information content. Others have attacked the problem from the database end, by defining an object-oriented data model for the database, and defining a database programming language that allows full programming capabilities as well as traditional query facilities. Object databases suffered because of a lack of standardization: although standards were defined by ODMG, they were never implemented well enough to ensure interoperability between products. Nevertheless, object databases have been used successfully in many applications: usually specialized applications such as engineering databases or molecular biology databases rather than mainstream commercial data processing. However, object database ideas were picked up by the relational vendors and influenced extensions made to these products and indeed to the SQL language.

C. Post Relational Database Model.

Several products have been identified as post-relational because the data model incorporates relations but is not constrained by the Information Principle, which requires that all information be represented by data values in relations. Products using a post-relational data model typically employ a model that actually pre-dates the relational model. These might be identified as a directed graph with trees on the nodes. Post-relational databases could be considered a subset of object databases, as there is no need for object-relational mapping when using a post-relational data model. In spite of many attacks on this class of data models, with designations such as hierarchical or legacy, the post-relational database industry continues to grow as a multi-billion dollar industry, even if the growth stays below the relational database radar. Examples of models that could be classified as post-relational are PICK (also known as MultiValue) and MUMPS (also known as M).

Q. 2. Explain the Concept of Query. How does a Query Optimizer Work?

Ans. The aim of query processing is to find information in one or more databases and deliver it to the user quickly and efficiently. Traditional techniques work well for databases with standard, single-site relational structures, but databases containing more complex and diverse types of data demand new query processing and optimization techniques. Most real-world data is not well structured. Today's databases typically contain much non-structured data such as text, images, video, and audio, often distributed across computer networks. In this complex milieu (typified by the World Wide Web), efficient and accurate query processing becomes quite challenging. Principles of Database Query Processing for Advanced Applications teaches the basic concepts and techniques of query processing and optimization for a variety of data forms and database systems, whether structured or unstructured.

Query Optimizer

The Query Optimizer is the component of a database management system that attempts to determine the most efficient way to execute a query. The optimizer considers the possible query plans (discussed below) for a given input query, and attempts to determine which of those plans will be the most efficient. Cost-based query optimizers assign an estimated "cost" to each possible query plan, and choose the plan with the least cost. Costs are used to estimate the runtime cost of evaluating the query, in terms of the number of I/O operations required, the CPU requirements, and other factors.

Query plan

A Query Plan (or Query Execution Plan) is a set of steps used to access information in a SQL relational database management system. This is a specific case of the relational model concept of access plans. Since SQL is declarative, there are typically a large number of alternative ways to execute a given query, with widely varying performance. When a query is submitted to the database, the query optimizer evaluates some of the different, correct possible plans for executing the query and returns what it considers the best alternative. Because query optimizers are imperfect, database users and administrators sometimes need to manually examine and tune the plans produced by the optimizer to get better performance. The set of query plans examined is formed by examining the possible access paths (e.g. index scan, sequential scan) and join algorithms (e.g. sort-merge join, hash join, nested loops). The search space can become quite large depending on the complexity of the SQL query. The query optimizer cannot be accessed directly by users. Instead, once queries are submitted to the database server and parsed by the parser, they are passed to the query optimizer, where optimization occurs.

Implementation

Most query optimizers represent query plans as a tree of "plan nodes". A plan node encapsulates a single operation that is required to execute the query.
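The choice between access paths described above can be observed in practice. The sketch below is only an illustration and assumes SQLite, reached through Python's standard `sqlite3` module, as the database system; the text itself does not name one. It uses SQLite's `EXPLAIN QUERY PLAN` statement to show the plan chosen for the same query before and after an index is created. The table, index, and helper names are made up for the example, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3


def plan_details(connection, sql, params=()):
    """Return the human-readable steps of the plan SQLite chose for a query."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]   # the last column holds the description


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE deposit (bname TEXT, account_no INTEGER, cname TEXT, balance REAL)"
)
query = "SELECT * FROM deposit WHERE cname = ?"

# Without an index the only access path is a sequential scan of the table.
plan_before = plan_details(conn, query, ("Smith",))

# Once an index on cname exists, the optimizer can choose an index search.
conn.execute("CREATE INDEX idx_deposit_cname ON deposit (cname)")
plan_after = plan_details(conn, query, ("Smith",))

if __name__ == "__main__":
    print("before index:", plan_before)
    print("after index: ", plan_after)
```

The query text never changes; only the available access paths do, which is why the same declarative query can end up with different plans.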
The nodes are arranged as a tree, in which intermediate results flow from the bottom of the tree to the top. Each node has zero or more child nodes; these are the nodes whose output is fed as input to the parent node. For example, a join node will have two child nodes, which represent the two join operands, whereas a sort node would have a single child node (the input to be sorted). The leaves of the tree are nodes which produce results by scanning the disk, for example by performing an index scan or a sequential scan.

Q. 3. Explain the following with respect to Heuristics of Query Optimizations:

A. Equivalence of Expressions

The first step in selecting a query-processing strategy is to find a relational algebra expression that is equivalent to the given query and is efficient to execute. We'll use the following relations as examples:
Customer-scheme = (cname, street, ccity)
Deposit-scheme = (bname, account#, cname, balance)
Branch-scheme = (bname, assets, bcity)

B. Selection Operation

1. Consider the query to find the assets and branch-names of all banks who have depositors living in Port Chester. In relational algebra, this is

\[ \Pi_{bname,\,assets}(\sigma_{ccity = \text{``Port Chester''}}(customer \bowtie deposit \bowtie branch)) \]

- This expression constructs a huge relation, \(customer \bowtie deposit \bowtie branch\), of which we are only interested in a few tuples.
- We also are only interested in two attributes of this relation.
- We can see that we only want tuples for which ccity = "Port Chester".
- Thus we can rewrite our query as:

\[ \Pi_{bname,\,assets}(\sigma_{ccity = \text{``Port Chester''}}(customer) \bowtie deposit \bowtie branch) \]

- This should considerably reduce the size of the intermediate relation.

2. Suggested Rule for Optimization:
- Perform select operations as early as possible.
- If our original query was restricted further to customers with a balance over $1000, the selection cannot be done directly to the customer relation above.
- The new relational algebra query is

\[ \Pi_{bname,\,assets}(\sigma_{ccity = \text{``Port Chester''} \wedge balance > 1000}(customer \bowtie deposit \bowtie branch)) \]

- The selection cannot be applied to customer, as balance is an attribute of deposit. We can still rewrite as

\[ \Pi_{bname,\,assets}((\sigma_{ccity = \text{``Port Chester''} \wedge balance > 1000}(customer \bowtie deposit)) \bowtie branch) \]

- If we look further at the subquery (the selection over \(customer \bowtie deposit\) above), we can split the selection predicate in two:

\[ \sigma_{ccity = \text{``Port Chester''}}(\sigma_{balance > 1000}(customer \bowtie deposit)) \]

- This rewriting gives us a chance to use our "perform selections early" rule again.
- We can now rewrite our subquery as:

\[ \sigma_{ccity = \text{``Port Chester''}}(customer) \bowtie \sigma_{balance > 1000}(deposit) \]

3. Second Transformational Rule:
- Replace expressions of the form \(\sigma_{P_1 \wedge P_2}(e)\) by \(\sigma_{P_1}(\sigma_{P_2}(e))\), where \(P_1\) and \(P_2\) are predicates and \(e\) is a relational algebra expression.
- Generally, \(\sigma_{P_1}(\sigma_{P_2}(e)) = \sigma_{P_2}(\sigma_{P_1}(e)) = \sigma_{P_1 \wedge P_2}(e)\)
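The effect of the "perform selections early" rule can be checked on toy data. The following is a minimal sketch, not a real query engine: relations are plain lists of dictionaries, the helper names (`natural_join`, `select`, `project`) and all sample tuples are invented for illustration, and the join is a naive nested loop.

```python
from itertools import product

def natural_join(r, s):
    """Naive natural join: match tuples on all attributes they share."""
    out = []
    for a, b in product(r, s):
        if all(a[k] == b[k] for k in a.keys() & b.keys()):
            out.append({**a, **b})
    return out

def select(r, pred):
    """Relational selection: keep tuples satisfying the predicate."""
    return [t for t in r if pred(t)]

def project(r, attrs):
    """Relational projection with duplicate elimination."""
    rows = []
    for t in r:
        row = {k: t[k] for k in attrs}
        if row not in rows:
            rows.append(row)
    return rows

# Hypothetical sample data following the three schemes above.
customer = [
    {"cname": "Ann", "street": "Main", "ccity": "Port Chester"},
    {"cname": "Bob", "street": "Oak", "ccity": "Rye"},
    {"cname": "Cy", "street": "Elm", "ccity": "Rye"},
]
deposit = [
    {"bname": "Downtown", "account#": 1, "cname": "Ann", "balance": 1500},
    {"bname": "Uptown", "account#": 2, "cname": "Bob", "balance": 700},
    {"bname": "Uptown", "account#": 3, "cname": "Cy", "balance": 2500},
]
branch = [
    {"bname": "Downtown", "assets": 9000, "bcity": "Port Chester"},
    {"bname": "Uptown", "assets": 4000, "bcity": "Rye"},
]

def in_port_chester(t):
    return t["ccity"] == "Port Chester"

# Late selection: join all three relations, then filter.
late_mid = natural_join(natural_join(customer, deposit), branch)
late = project(select(late_mid, in_port_chester), ["bname", "assets"])

# Early selection: filter customer first, then join.
early_mid = natural_join(
    natural_join(select(customer, in_port_chester), deposit), branch
)
early = project(early_mid, ["bname", "assets"])

print(len(late_mid), len(early_mid), late == early)
```

Both plans return the same answer, but the early-selection plan builds a smaller intermediate relation, which is the point of the rule.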

C. Projection Operation

1. Like selection, projection reduces the size of relations. It is advantageous to apply projections early. Consider this form of our example query:

\[ \Pi_{bname,\,assets}(((\sigma_{ccity = \text{``Port Chester''}}(customer)) \bowtie deposit) \bowtie branch) \]

2. When we compute the subexpression

\[ (\sigma_{ccity = \text{``Port Chester''}}(customer)) \bowtie deposit \]

we obtain a relation whose scheme is (cname, street, ccity, bname, account#, balance).

3. We can eliminate several attributes from this scheme. The only ones we need to retain are those that
- appear in the result of the query or
- are needed to process subsequent operations.

4. By eliminating unneeded attributes, we reduce the number of columns of the intermediate result, and thus its size.

5. In our example, the only attribute we need is bname (to join with branch). So we can rewrite our expression as:

\[ \Pi_{bname,\,assets}((\Pi_{bname}((\sigma_{ccity = \text{``Port Chester''}}(customer)) \bowtie deposit)) \bowtie branch) \]

6. Note that there is no advantage in doing an early project on a relation before it is needed for some other operation:
- We would access every block for the relation to remove attributes.
- Then we access every block of the reduced-size relation when it is actually needed.
- We do more work in total, rather than less!

D. Natural Join Operation

Another way to reduce the size of temporary results is to choose an optimal ordering of the join operations. Natural join is associative:

\[ (r_1 \bowtie r_2) \bowtie r_3 = r_1 \bowtie (r_2 \bowtie r_3) \]

Although these expressions are equivalent, the costs of computing them may differ. Look again at our expression

\[ \Pi_{bname,\,assets}((\sigma_{ccity = \text{``Port Chester''}}(customer)) \bowtie deposit \bowtie branch) \]

We see that we can compute \(deposit \bowtie branch\) first and then join with the first part. However, \(deposit \bowtie branch\) is likely to be a large relation as it contains one tuple for every account. The other part, \(\sigma_{ccity = \text{``Port Chester''}}(customer)\), is probably a small relation (comparatively).

So, if we compute \((\sigma_{ccity = \text{``Port Chester''}}(customer)) \bowtie deposit\) first, we get a reasonably small relation. It has one tuple for each account held by a resident of Port Chester. This temporary relation is much smaller than \(deposit \bowtie branch\).

Natural join is commutative:

\[ r_1 \bowtie r_2 = r_2 \bowtie r_1 \]

Thus we could rewrite our relational algebra expression as:

\[ \Pi_{bname,\,assets}(((\sigma_{ccity = \text{``Port Chester''}}(customer)) \bowtie branch) \bowtie deposit) \]

But there are no common attributes between customer and branch, so this is a Cartesian product. Lots of tuples! If a user entered this expression, we would want to use the associativity and commutativity of natural join to transform this into the more efficient expression we have derived earlier (join with deposit first, then with branch).
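The cost difference between the two join orders can be seen by counting intermediate tuples. This is a small sketch on invented data: `natural_join` is a hypothetical nested-loop helper, and `pc_customers` stands for the already-selected Port Chester customers.

```python
def natural_join(r, s):
    """Naive natural join; with no shared attributes it degenerates
    into a Cartesian product, exactly as described above."""
    out = []
    for a in r:
        for b in s:
            if all(a[k] == b[k] for k in a.keys() & b.keys()):
                out.append({**a, **b})
    return out

# Hypothetical data: customers already restricted to Port Chester.
pc_customers = [
    {"cname": "Ann", "ccity": "Port Chester"},
    {"cname": "Dee", "ccity": "Port Chester"},
]
deposit = [
    {"bname": "Downtown", "account#": 1, "cname": "Ann", "balance": 1500},
    {"bname": "Uptown", "account#": 2, "cname": "Bob", "balance": 700},
    {"bname": "Midtown", "account#": 3, "cname": "Dee", "balance": 300},
]
branch = [
    {"bname": "Downtown", "assets": 9000},
    {"bname": "Uptown", "assets": 4000},
    {"bname": "Midtown", "assets": 6000},
]

# Good order: join with deposit first (shared attribute cname).
good_mid = natural_join(pc_customers, deposit)
good = natural_join(good_mid, branch)

# Bad order: join with branch first (no shared attribute).
bad_mid = natural_join(pc_customers, branch)
bad = natural_join(bad_mid, deposit)

by_name = lambda t: t["cname"]
print(len(good_mid), len(bad_mid))
print(sorted(good, key=by_name) == sorted(bad, key=by_name))
```

The final results agree, but the bad order materializes every customer-branch pair before deposit can prune them.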

Q. 4. There are a number of historical, organizational, and technological reasons that explain the lack of an all-encompassing data management system. Discuss a few of them with appropriate examples.

Ans. Models of Failures

Failures can be classified as:

1) Transaction Failures
a) Error in transaction due to incorrect data input.
b) Present or potential deadlock.
c) Abort of transactions due to non-availability of resources or deadlock.

2) Site Failures: From the recovery point of view, a failure has to be judged by the loss of memory it causes. So failures can be classified as:
a) Failure with Loss of Volatile Storage: In these failures, the content of main memory is lost; however, all the information which is recorded on disks is not affected by the failure. Typical failures of this kind are system crashes.
b) Media Failures (Failures with Loss of Nonvolatile Storage): In these failures the content of disk storage is lost. Failures of this type can be reduced by replicating the information on several disks having independent failure modes.

Stable storage is the most resilient storage medium available in the system. It is implemented by (i) replicating the same information on several disks with independent failure modes, and (ii) using the so-called careful replacement strategy: at every update operation, first one copy of the information is updated, then the correctness of the update is verified, and finally the second copy is updated.

3) Communication Failures: There are two basic types of possible communication errors: lost messages and partitions. When a site X does not receive an acknowledgment of a message from a site Y within a predefined time interval, X is uncertain about the following things:
i) Did a failure occur at all, or is the system simply slow?
ii) If a failure occurred, was it a communication failure, or a crash of site Y?
iii) Has the message been delivered at Y or not? (The communication failure or the crash can happen before or after the delivery of the message.)

Thus all failures can be regrouped as:
i) Failure of a site.
ii) Loss of message(s), with or without site failures but no partitions.
iii) Network Partition: Dealing with network partitions is a harder problem than dealing with site crashes or lost messages.
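The careful replacement strategy described under media failures can be sketched as follows. This is an illustrative in-memory model only: the class name `StableStorage`, the use of SHA-256 checksums for verification, and the `read` fallback are assumptions, and two dictionaries stand in for two disks with independent failure modes.

```python
import hashlib

class StableStorage:
    """Toy model of stable storage: two copies, updated one at a time."""

    def __init__(self):
        # Two dicts stand in for two disks (an assumption of this sketch).
        self.copies = [{}, {}]

    @staticmethod
    def _checksum(data: bytes) -> str:
        return hashlib.sha256(data).hexdigest()

    def _write_copy(self, i, key, data):
        self.copies[i][key] = (data, self._checksum(data))

    def _verify(self, i, key, data):
        entry = self.copies[i].get(key)
        return (
            entry is not None
            and entry[0] == data
            and entry[1] == self._checksum(entry[0])
        )

    def write(self, key, data: bytes):
        self._write_copy(0, key, data)        # 1. update the first copy
        if not self._verify(0, key, data):    # 2. verify the update
            raise IOError("first copy failed verification")
        self._write_copy(1, key, data)        # 3. update the second copy

    def read(self, key):
        # Return the first copy whose checksum still matches its data.
        for copy in self.copies:
            entry = copy.get(key)
            if entry is not None and entry[1] == self._checksum(entry[0]):
                return entry[0]
        raise IOError("no valid copy available")

store = StableStorage()
store.write("x", b"v1")
# Simulate a media failure that corrupts the first copy only.
store.copies[0]["x"] = (b"garbage", store.copies[0]["x"][1])
print(store.read("x"))
```

Because the second copy is only touched after the first has been verified, at least one copy holds a consistent value at every point during an update.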

Q. 5. Describe the Structural Semantic Data Model (SSM) with relevant examples.

Ans. The Structural Semantic Model, SSM, first described in Nordbotten (1993a & b), is an extension and graphic simplification of the EER modeling tool first presented in the '89 edition of (Elmasri & Navathe, 2003). SSM was developed as a teaching tool and has been, and can continue to be, modified to include new modeling concepts. A particular requirement today is the inclusion of concepts and syntax symbols for modeling multimedia objects.

4.7.1 SSM Concepts

The current version of SSM belongs to the class of Semantic Data Model types extended with concepts for specification of user-defined data types and functions, UDT and UDF. It supports the modeling concepts defined and compared below. The following diagram shows the concepts and graphic syntax of SSM, which include:

Data Modeling Concepts

1. Three types of entity specifications: base (root), subclass, and weak.
2. Four types of inter-entity relationships: n-ary associative, and 3 types of classification hierarchies.
3. Four attribute types: atomic, multi-valued, composite, and derived.
4. Domain type specifications in the graphic model, including standard data types, binary large objects (blob, text, image, ...), and user-defined types (UDT) and functions (UDF).
5. Cardinality specifications for entity to relationship-type connections and for multi-valued attribute types.
6. Data value constraints.

Q. 6. Describe the following with respect to Fuzzy querying to relational databases:

A. Proposed Model

The easiest way of introducing fuzziness in the database model is to use classical relational databases and formulate a front end to it that allows fuzzy querying of the database. A limitation imposed on the system is that, because we are neither extending the database model nor defining a new model in any way, the underlying database model is crisp and hence the fuzziness can only be incorporated in the query. To incorporate fuzziness we introduce fuzzy sets / linguistic terms on the attribute domains / linguistic variables, e.g. on the attribute domain AGE we may define the fuzzy sets YOUNG, MIDDLE and OLD. These are defined as the following:

Figure: Age

For this we take the example of a student database which has a table STUDENTS with the following attributes:

Figure: A snapshot of the data existing in the database

B. Meta knowledge

At the level of meta knowledge we need to add only a single table, LABELS, with the following structure:

Figure: Meta Knowledge

This table is used to store the information of all the fuzzy sets defined on all the attribute domains. A description of each column in this table is as follows:

Label: This is the primary key of this table and stores the linguistic term associated with the fuzzy set.
Column_Name: Stores the linguistic variable associated with the given linguistic term.
Alpha, Beta, Gamma, Delta: Store the range of the fuzzy set.
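One common reading of four range parameters is a trapezoidal membership function, which the sketch below assumes: membership rises from 0 at Alpha to 1 at Beta, stays 1 up to Gamma, and falls back to 0 at Delta. The text does not confirm this shape, and the LABELS rows shown (including every number) are hypothetical.

```python
def membership(x, alpha, beta, gamma, delta):
    """Degree to which x belongs to a trapezoidal fuzzy set (assumed shape)."""
    if beta <= x <= gamma:
        return 1.0                           # plateau of the trapezoid
    if x <= alpha or x >= delta:
        return 0.0                           # outside the support
    if x < beta:
        return (x - alpha) / (beta - alpha)  # rising edge
    return (delta - x) / (delta - gamma)     # falling edge

# Hypothetical LABELS rows: Label -> (Column_Name, Alpha, Beta, Gamma, Delta)
LABELS = {
    "YOUNG": ("AGE", 0, 0, 25, 35),
    "MIDDLE": ("AGE", 25, 35, 45, 55),
    "OLD": ("AGE", 45, 55, 120, 120),
}

def degree(label, value):
    """Look up a linguistic term and evaluate its membership degree."""
    _column, alpha, beta, gamma, delta = LABELS[label]
    return membership(value, alpha, beta, gamma, delta)

print(degree("YOUNG", 20), degree("YOUNG", 30), degree("MIDDLE", 30))
```

Under these assumed ranges a 30-year-old is partly YOUNG and partly MIDDLE, which is the kind of overlap a crisp range condition cannot express.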

C. Implementation

The main issue in the implementation of this system is the parsing of the input fuzzy query. As the underlying database is crisp, i.e. no fuzzy data is stored in the database, the INSERT query will not change and need not be parsed; it can therefore be presented to the database as it is. During parsing, the query is divided into the following:

1. Query Type: Whether the query is a SELECT, DELETE or UPDATE.
2. Result Attributes: The attributes that are to be displayed; used only in the case of the SELECT query.
3. Source Tables: The tables on which the query is to be applied.
4. Conditions: The conditions that have to be satisfied before the operation is performed. Each condition is further sub-divided into Query Attributes (i.e. the attributes on which the query is to be applied) and the linguistic term. If the condition is not fuzzy, i.e. it does not contain a linguistic term, then it need not be subdivided.
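The four-part decomposition above can be sketched for the SELECT case. This is a deliberately small illustration, not the system's actual parser: the function name `parse_select`, the regular expressions, the restriction to AND-connected conditions, and the STUDENTS attribute names NAME, AGE and GPA are all assumptions. A condition is treated as fuzzy when its right-hand side is a known linguistic term.

```python
import re

# Hypothetical set of linguistic terms, as would be read from LABELS.
LINGUISTIC_TERMS = {"YOUNG", "MIDDLE", "OLD"}

def parse_select(query):
    """Split a simple SELECT into type, attributes, tables and conditions."""
    m = re.fullmatch(
        r"\s*SELECT\s+(.+?)\s+FROM\s+(.+?)(?:\s+WHERE\s+(.+?))?\s*;?\s*",
        query,
        re.IGNORECASE | re.DOTALL,
    )
    if m is None:
        raise ValueError("only simple SELECT queries are handled here")
    conditions = []
    for cond in re.split(r"\s+AND\s+", m.group(3) or "", flags=re.IGNORECASE):
        if not cond:
            continue
        c = re.fullmatch(r"(\w+)\s*(=|<>|<=|>=|<|>)\s*(.+)", cond.strip())
        if c is None:
            raise ValueError(f"unsupported condition: {cond!r}")
        attribute, operator, term = c.groups()
        conditions.append({
            "attribute": attribute,
            "operator": operator,
            "term": term,
            # Fuzzy only if the right-hand side is a linguistic term.
            "fuzzy": operator == "=" and term.upper() in LINGUISTIC_TERMS,
        })
    return {
        "type": "SELECT",
        "attributes": [a.strip() for a in m.group(1).split(",")],
        "tables": [t.strip() for t in m.group(2).split(",")],
        "conditions": conditions,
    }

parsed = parse_select(
    "SELECT NAME, AGE FROM STUDENTS WHERE AGE = YOUNG AND GPA >= 3;"
)
print(parsed)
```

A fuzzy condition such as AGE = YOUNG would then be rewritten against the stored range for YOUNG before the query is passed to the crisp database, while GPA >= 3 goes through unchanged.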

Master of Computer Application (MCA) Semester 4 MC0077 Advanced Database Systems 4 Credits
Assignment Set 2 (60 Marks)

1. How are costs computed for the execution of a query? Discuss the method of measuring index selectivity.

Ans 1: Heuristics of Query Optimizations

Equivalence of Expressions

The first step in selecting a query-processing strategy is to find a relational algebra expression that is equivalent to the given query and is efficient to execute. We'll use the following relations as examples:

Customer-scheme = (cname, street, ccity)
Deposit-scheme = (bname, account#, cname, balance)
Branch-scheme = (bname, assets, bcity)

Selection Operation

1. Consider the query to find the assets and branch-names of all banks who have depositors living in Port Chester. In relational algebra, this is

\[ \Pi_{bname,\,assets}(\sigma_{ccity = \text{``Port Chester''}}(customer \bowtie deposit \bowtie branch)) \]

o This expression constructs a huge relation, \(customer \bowtie deposit \bowtie branch\), of which we are only interested in a few tuples.
o We also are only interested in two attributes of this relation.
o We can see that we only want tuples for which ccity = "Port Chester".
o Thus we can rewrite our query as:

\[ \Pi_{bname,\,assets}(\sigma_{ccity = \text{``Port Chester''}}(customer) \bowtie deposit \bowtie branch) \]

o This should considerably reduce the size of the intermediate relation.

2. Suggested Rule for Optimization:
o Perform select operations as early as possible.
o If our original query was restricted further to customers with a balance over $1000, the selection cannot be done directly to the customer relation above.
o The new relational algebra query is

\[ \Pi_{bname,\,assets}(\sigma_{ccity = \text{``Port Chester''} \wedge balance > 1000}(customer \bowtie deposit \bowtie branch)) \]

o The selection cannot be applied to customer, as balance is an attribute of deposit. We can still rewrite as

\[ \Pi_{bname,\,assets}((\sigma_{ccity = \text{``Port Chester''} \wedge balance > 1000}(customer \bowtie deposit)) \bowtie branch) \]

o If we look further at the subquery (the selection over \(customer \bowtie deposit\) above), we can split the selection predicate in two:

\[ \sigma_{ccity = \text{``Port Chester''}}(\sigma_{balance > 1000}(customer \bowtie deposit)) \]

o This rewriting gives us a chance to use our "perform selections early" rule again.
o We can now rewrite our subquery as:

\[ \sigma_{ccity = \text{``Port Chester''}}(customer) \bowtie \sigma_{balance > 1000}(deposit) \]

3. Second Transformational Rule:
o Replace expressions of the form \(\sigma_{P_1 \wedge P_2}(e)\) by \(\sigma_{P_1}(\sigma_{P_2}(e))\), where \(P_1\) and \(P_2\) are predicates and \(e\) is a relational algebra expression.
o Generally, \(\sigma_{P_1}(\sigma_{P_2}(e)) = \sigma_{P_2}(\sigma_{P_1}(e)) = \sigma_{P_1 \wedge P_2}(e)\)

Projection Operation

1. Like selection, projection reduces the size of relations. It is advantageous to apply projections early. Consider this form of our example query:

\[ \Pi_{bname,\,assets}(((\sigma_{ccity = \text{``Port Chester''}}(customer)) \bowtie deposit) \bowtie branch) \]

2. When we compute the subexpression

\[ (\sigma_{ccity = \text{``Port Chester''}}(customer)) \bowtie deposit \]

we obtain a relation whose scheme is (cname, street, ccity, bname, account#, balance).

3. We can eliminate several attributes from this scheme. The only ones we need to retain are those that
o appear in the result of the query or
o are needed to process subsequent operations.

4. By eliminating unneeded attributes, we reduce the number of columns of the intermediate result, and thus its size.

5. In our example, the only attribute we need is bname (to join with branch). So we can rewrite our expression as:

\[ \Pi_{bname,\,assets}((\Pi_{bname}((\sigma_{ccity = \text{``Port Chester''}}(customer)) \bowtie deposit)) \bowtie branch) \]

6. Note that there is no advantage in doing an early project on a relation before it is needed for some other operation:
o We would access every block for the relation to remove attributes.
o Then we access every block of the reduced-size relation when it is actually needed.
o We do more work in total, rather than less!

Natural Join Operation

Another way to reduce the size of temporary results is to choose an optimal ordering of the join operations. Natural join is associative:

\[ (r_1 \bowtie r_2) \bowtie r_3 = r_1 \bowtie (r_2 \bowtie r_3) \]

Although these expressions are equivalent, the costs of computing them may differ. Look again at our expression

\[ \Pi_{bname,\,assets}((\sigma_{ccity = \text{``Port Chester''}}(customer)) \bowtie deposit \bowtie branch) \]

We see that we can compute \(deposit \bowtie branch\) first and then join with the first part. However, \(deposit \bowtie branch\) is likely to be a large relation as it contains one tuple for every account. The other part, \(\sigma_{ccity = \text{``Port Chester''}}(customer)\), is probably a small relation (comparatively).

So, if we compute \((\sigma_{ccity = \text{``Port Chester''}}(customer)) \bowtie deposit\) first, we get a reasonably small relation. It has one tuple for each account held by a resident of Port Chester. This temporary relation is much smaller than \(deposit \bowtie branch\).

Natural join is commutative:

\[ r_1 \bowtie r_2 = r_2 \bowtie r_1 \]

Thus we could rewrite our relational algebra expression as:

\[ \Pi_{bname,\,assets}(((\sigma_{ccity = \text{``Port Chester''}}(customer)) \bowtie branch) \bowtie deposit) \]

But there are no common attributes between customer and branch, so this is a Cartesian product. Lots of tuples!

If a user entered this expression, we would want to use the associativity and commutativity of natural join to transform this into the more efficient expression we have derived earlier (join with deposit first, then with branch).
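The projection heuristic from this answer can also be illustrated on toy data. In this sketch the helper names and all tuples are invented, and the only thing measured is the width of the intermediate relation before it is joined with branch.

```python
def natural_join(r, s):
    """Naive natural join on all shared attribute names."""
    return [
        {**a, **b}
        for a in r
        for b in s
        if all(a[k] == b[k] for k in a.keys() & b.keys())
    ]

def project(r, attrs):
    """Projection with duplicate elimination."""
    rows = []
    for t in r:
        row = {k: t[k] for k in attrs}
        if row not in rows:
            rows.append(row)
    return rows

# Hypothetical data: customer is already restricted to Port Chester.
pc_customer = [{"cname": "Ann", "street": "Main", "ccity": "Port Chester"}]
deposit = [
    {"bname": "Downtown", "account#": 1, "cname": "Ann", "balance": 1500},
    {"bname": "Downtown", "account#": 4, "cname": "Ann", "balance": 200},
    {"bname": "Uptown", "account#": 2, "cname": "Bob", "balance": 700},
]
branch = [
    {"bname": "Downtown", "assets": 9000, "bcity": "Port Chester"},
    {"bname": "Uptown", "assets": 4000, "bcity": "Rye"},
]

wide = natural_join(pc_customer, deposit)   # six attributes per tuple
narrow = project(wide, ["bname"])           # only the join attribute

result_wide = project(natural_join(wide, branch), ["bname", "assets"])
result_narrow = project(natural_join(narrow, branch), ["bname", "assets"])

print(len(wide[0]), len(narrow[0]), result_wide == result_narrow)
```

Projecting onto bname shrinks each intermediate tuple from six attributes to one and also removes a duplicate row, without changing the final answer.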

2. Describe the following with respect to SQL3 DB specification: A) Complex Structures B) Hierarchical Structures C) Relationships D) Large Objects, LOBs E) Storage of LOBs

Ans 2:

(A) Complex structures

1. Create row type Address_t defines the address structure that is used in line 8.
2. Street#, Street, ... are regular SQL2 specifications for atomic attributes.
3. PostCode and Geo-Loc are both defined as having user-defined data types, Pcode and Point respectively. Pcode is typically locally defined as a list or table of valid postal codes, perhaps with the post office name.
4. Create function Age_f defines a function for calculation of an age, as a decimal value, given a start date as the input argument and using a simple algorithm based on the current date. This function is used as the data type in line 9 and will be activated each time the Person.age attribute is retrieved. The function can also be used as a condition clause in a SELECT statement.
5. Create table PERSON initiates specification of the implementation structure for the Person entity-type.
6. Id is defined as the primary key. The not null phrase only ensures that some 'not null' value is given. The primary key phrase indicates that the DBMS is to guarantee that the set of values for Id is unique.
7. Name has a data-type, PersName, defined as a Row type similar to the one defined in lines 1-3. BirthDate is a date that can be used as the argument for the function Age_f defined in line 4.
8. Address is defined using the row type Address_t, defined in lines 1-3. Picture is defined as a BLOB, or Binary Large Object. Note that there are no functions for content search, manipulation or presentation which support BLOB data types. These must be defined either by the user as user-defined functions, UDFs, or by the ORDBMS vendor in a supplementary subsystem. In this case, we need functions for image processing.
9. Age is defined as a function, which will be activated each time the attribute is retrieved. This costs processing time (though this algorithm is very simple), but gives a correct value each time the attribute is used.
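The behaviour of a derived Age attribute can be mimicked outside the database. The sketch below is an analogy in Python rather than SQL3: `age_f` and the `Person` class are illustrative names, the 365.25-day year is an assumed "simple algorithm", and the optional `today` parameter exists only to make the function testable.

```python
from datetime import date
from typing import Optional

def age_f(start: date, today: Optional[date] = None) -> float:
    """Age in years as a decimal value, based on the current date."""
    today = today or date.today()
    return round((today - start).days / 365.25, 2)

class Person:
    def __init__(self, name: str, birth_date: date):
        self.name = name
        self.birth_date = birth_date

    @property
    def age(self) -> float:
        # Recomputed on every access, like a derived attribute: it costs
        # a little processing time but is never stale.
        return age_f(self.birth_date)

p = Person("Example Person", date(2000, 1, 1))
print(p.age)
print(age_f(date(2000, 1, 1), date(2010, 1, 1)))
```

Storing the age as a plain column would avoid the recomputation but would go out of date, which is the trade-off item 9 describes.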

(B) Hierarchical Structures

1. Create table STUDENT initiates specification of the implementation of a subclass entity type.
2. GPA, Level, ... are the attributes for the subclass, here with simple SQL2 data types.
3. under PERSON specifies the table as a subclass of the table PERSON. The DBMS thus knows that when the STUDENT table is requested, all attributes and functions in PERSON are also relevant. An OR-DBMS will store and use the primary key of PERSON as the key for STUDENT, and execute a join operation to retrieve the full set of attributes.
4. Create table COURSE specifies a new table specification, as done for statements in lines 5 and 10 above.
5. Id, Name, and Level are standard atomic attribute types with SQL2 data types. Id is defined as requiring a unique, non-null value, as specified for PERSON in line 6 above.
6. Note that attributes must have unique names within their tables, but the name may be reused, with different data domains, in different tables. Both Id and Name are such attribute-names, appearing in both PERSON and COURSE, as is Level, used in STUDENT and COURSE.
7. Course.Description is defined as a character large object, CLOB. A CLOB data type has the same defined character-string functions as char, varchar, and long char, and can be compared to these. User_id is defined as Ucode, which is the name of a user-defined data type, presumably a list of acceptable user codes. The DB implementer must define both the data type and the appropriate functions for processing this type.
8. User_Id is also specified as a foreign key which links the Course records to their "user" record, modeled as a category sub-entity-type, through the primary key in the User table.

(C) Relationships

The relationship TakenBy is defined in Figure b. This definition needs only SQL2 specifications. Note that:
- {Sid, Cid, and Term} form the primary key, PK. Since the key is composite, a separate Primary key clause is required (as compared with the single-attribute PK specifications for PERSON.Id and COURSE.Id).
- The 2 foreign key attributes in the PK must be defined separately.
- TakenBy.Report is a foreign key to a report entity-type, forming a ternary relationship as modeled in Figure a. The ON DELETE trigger is activated if the Report relation is deleted and assures that the FK link has a valid value, in this case 'null'.

(D) Large OBjects, LOBs

The SSM syntax includes data types for potentially very long media types, such as text, image, audio and video, as shown in Figure 6.8. If this model is to be realized in a single database, the DMS will have to have the capability to manage (store, search, retrieve, and manipulate) different media types. Object-relational DBMS vendors claim to be able to do this.

Figure: Media objects as attributes

SQL3 provides support for storage of Binary Large OBjects, BLOBs. A BLOB is simply a very long bit string, limited in many systems today to 2 or 4 GB. Several OR-DBMS vendors differentiate BLOBs into data-types that give more information about the format of the content and provide basic/primitive manipulation functions for these large object, LOB, types. For example, IBM's DB2 has 3 LOB types: BLOB for long bit strings, CLOB for long character strings, and DBCLOB for double-byte character strings. Oracle data types for large objects are BLOB, CLOB, NCLOB (fixed-width multi-byte CLOB) and BFILE (binary file stored outside the DB). Note that the first 3 are equivalent to the DB2 LOBs, while the last is really not a data-type, but rather a link to an externally stored media object.

SQL3 has no functions for processing, for example indexing, the content of a BLOB, and provides only functions to store and retrieve it given an external identifier. For example, if the BLOB is an image, SQL3 does not 'know' how to display it, i.e. it has no functions for image presentation. DBMS vendors who provide differentiated blob types have also extended the basic SQL string comparison operators so that they will function for LOBs, or at least CLOBs. These operators include the pattern match function "LIKE", which gives a true/false response if the search string is found/not found in the *LOB attribute.

Note: "LIKE" is a standard SQL predicate that simply has been extended to search very long data domains.

(E) Storage of LOBs

There are 3 strategies for storing LOBs in an OR-DB:
1. Embedded in a column of the defining relation, or
2. Stored in a separate table within the DB, linked from the *LOB column of the defining relation, or
3. Stored on an external (local or geographically distant) medium, again linked from the *LOB column of the defining relation.

Embedded storage in the defining relation closely maps the logical view of the media object with its physical storage. This strategy is best if the other attributes of the table are primarily structural metadata used to specify display characteristics, for example length, language, format. The problem with embedded storage is that a DMS must transfer at least a whole tuple, more commonly a block of tuples, from storage for processing. If blobs are embedded in the tuples, a great deal of data must be transmitted even if the LOB objects are not part of the query selection criteria or the result. For example, a query retrieving the name and address of persons living in Bergen, Norway, would also retrieve large quantities of image data if the data for the Person.Picture attribute of Figure 8 were stored as an embedded column in the Person table.

Separate table storage gives indirect access via a link in the defining relation and delays retrieval of the LOB until it is to be part of the query result set. Though this gives a two-step retrieval, for example when requesting an image of Joan Nordbotten, it will reduce general or average transfer time for the query processing system. A drawback of this storage strategy is a likely fragmentation of the DB area, as LOBs can be stored 'anywhere'. This will decrease the efficiency of any algorithm searching the content of a larger set of LOBs, for example to find images that are similar to or contain a given image segment. As usual, the storage structure chosen for a DB should be based on an analysis of anticipated user queries.

External storage is useful if the DB data is 'connected' to established media databases, either locally on CD, DVD, ... or on other computers in a network, as will most likely be the case when sharing media data stored in autonomous applications, such as cooperating museums, libraries, archives, or government agencies. This storage structure eliminates the need for duplication of large quantities of data that are normally offered in read-only mode. The cost is in access time, which may currently be nearly unnoticeable. A good multimedia DMS should support each of these storage strategies.
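The second strategy (separate table storage with a link from the defining relation) can be sketched with Python's built-in `sqlite3` module. SQLite is used here only as a convenient stand-in for an OR-DBMS, and the table names, column names and dummy image bytes are assumptions made for the example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- LOBs live in their own table ...
    CREATE TABLE person_picture (
        pic_id INTEGER PRIMARY KEY,
        image  BLOB
    );
    -- ... and the defining relation only holds a link to them.
    CREATE TABLE person (
        id     INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        city   TEXT,
        pic_id INTEGER REFERENCES person_picture(pic_id)
    );
""")
fake_image = b"\x89PNG" + bytes(1000)   # dummy stand-in for image data
con.execute("INSERT INTO person_picture VALUES (1, ?)", (fake_image,))
con.execute("INSERT INTO person VALUES (1, 'Joan Nordbotten', 'Bergen', 1)")

# Step 1: the structural query reads only the narrow person table.
rows = con.execute(
    "SELECT name, pic_id FROM person WHERE city = 'Bergen'"
).fetchall()

# Step 2: the LOB is fetched only when it is actually wanted.
image = con.execute(
    "SELECT image FROM person_picture WHERE pic_id = ?", (rows[0][1],)
).fetchone()[0]

print(rows, len(image))
```

The name-and-city query never touches the image bytes; the price is the second lookup when the picture itself is requested.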

3. Explain: A) Data Warehouse Architecture B) Data Storage Methods

Ans 3:

A. Data Warehouse Architecture

The term Data Warehouse Architecture is primarily used today to describe the overall structure of a Business Intelligence system. Other historical terms include Decision Support Systems (DSS), Management Information Systems (MIS), and others. The Data Warehouse Architecture describes the overall system from various perspectives such as data, process, and infrastructure needed to communicate the structure, function and interrelationships of each component.

The infrastructure or technology perspective details the various hardware and software products used to implement the distinct components of the overall system. The data perspective typically diagrams the source and target data structures and aids the user in understanding what data assets are available and how they are related. The process perspective is primarily concerned with communicating the process and flow of data from the originating source system through the process of loading the data warehouse, and often the process that client products use to access and extract data from the warehouse.

B. Data Storage Methods

In OLTP (Online Transaction Processing) systems, relational database designs use the discipline of data modeling and generally follow the Codd rules of data normalization in order to ensure absolute data integrity. Complex information is broken down into its most simple structures (a table) where all of the individual atomic level elements relate to each other and satisfy the normalization rules. Codd defines 5 increasingly stringent rules of normalization, and typically OLTP systems achieve 3rd level normalization. Fully normalized OLTP database designs often result in having information from a business transaction stored in dozens to hundreds of tables. Relational database managers are efficient at managing the relationships between tables and deliver very fast insert/update performance because only a little bit of data is affected in each relational transaction.

OLTP databases are efficient because they are typically only dealing with the information around a single transaction. In reporting and analysis, thousands to billions of transactions may need to be reassembled, imposing a huge workload on the relational database. Given enough time the software can usually return the requested results, but because of the negative performance impact on the machine and all of its hosted applications, data warehousing professionals recommend that reporting databases be physically separated from the OLTP database.

Designing the data architecture of the data warehouse is the realm of Data Warehouse Architects. The goal of a data warehouse is to bring data together from a variety of existing databases to support management and reporting needs. The generally accepted principle is that data should be stored at its most elemental level because this provides the most useful and flexible basis for use in reporting and information analysis. However, because of different focus on specific requirements, there can be alternative methods for designing and implementing data warehouses.

There are two leading approaches to organizing the data in a data warehouse. In the "dimensional" approach, transaction data is partitioned into either measured "facts", which are generally numeric data that capture specific values, or "dimensions", which contain the reference information that gives each transaction its context. As an example, a sales transaction would be broken up into facts such as the number of products ordered and the price paid, and dimensions such as date, customer, product, geographical location and salesperson. The main advantages of a dimensional approach are that the data warehouse is easy for business staff with limited information technology experience to understand and use. Also, because the data is pre-joined into the dimensional form, the data warehouse tends to operate very quickly. The main disadvantage of the dimensional approach is that it is quite difficult to add or change later if the company changes the way in which it does business.

In the "normalized" approach, the data in the warehouse is stored following database normalization rules, with tables grouped together by subject areas. The main advantage of this approach is that it is quite straightforward to add new information into the database. The primary disadvantage of this approach is that, because of the number of tables involved, it can be rather slow to produce information and reports.

Subject areas are just a method of organizing information and can be defined along any lines. The traditional approach has subjects defined as the subjects or nouns within a problem space. For example, in a financial services business, you might have customers, products and contracts. An alternative approach is to organize around the business transactions, such as customer enrollment, sales and trades.
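The dimensional approach described above can be made concrete with a tiny star schema. The sketch uses Python's built-in `sqlite3` module as a stand-in for a warehouse engine; the table and column names (`fact_sales`, `dim_customer`, `dim_product`) and every data row are invented for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- Dimensions: reference information that gives a sale its context.
    CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_product  (product_id  INTEGER PRIMARY KEY, name TEXT);

    -- Facts: the numeric measurements of each sales transaction.
    CREATE TABLE fact_sales (
        sale_date   TEXT,
        customer_id INTEGER REFERENCES dim_customer(customer_id),
        product_id  INTEGER REFERENCES dim_product(product_id),
        quantity    INTEGER,
        price_paid  REAL
    );

    INSERT INTO dim_customer VALUES (1, 'Acme');
    INSERT INTO dim_product  VALUES (1, 'Widget'), (2, 'Gadget');
    INSERT INTO fact_sales VALUES
        ('2024-01-05', 1, 1, 3, 30.0),
        ('2024-01-06', 1, 1, 2, 20.0),
        ('2024-01-06', 1, 2, 1, 15.0);
""")

# A typical report: total quantity and revenue per product.
report = con.execute("""
    SELECT p.name, SUM(f.quantity), SUM(f.price_paid)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.name
    ORDER BY p.name
""").fetchall()

print(report)
```

Each report needs only one join per dimension it uses, which is part of why dimensional warehouses tend to be easy to query; a fully normalized design would spread the same sale over more tables.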

4. Discuss how the process of retrieving text data differs from the process of retrieving an image.

Text Retrieval Using SQL3/TextRetrieval

SQL3 supports storage of multimedia data, such as text documents, in an OR-database using the blob/clob data types. However, the standard SQL3 specification does not include support for such media content processing functions as indexing or searching using elements of the media content. For example, SQL3's support for a query to retrieve documents about famous Norwegian artists is limited to using a serial search of all documents using the pattern match operator 'LIKE'. Queries using this operator are likely to miss the Web sites dedicated to the composer.

Seekers of information from text-based documents commonly use 'free text' queries, i.e. queries that consist of a set of selection terms, as illustrated above. Depending on the underlying query processing system, the input can vary from a single search term to a longer document. This is a 'normal' input format for information retrieval, IR, systems, such as the web search engines, but not for systems based on SQL.

Therefore, most of the larger OR-DBMS vendors (IBM, Oracle, Ingres, Postgres, etc.) have used SQL3's UDT/UDF support to extend their OR-DBMS with sub-systems for the management of media data. The approach used has been to add on their own or purchased specialized media management systems to the basic OR-DBMS. Basically, the new (to SQL3) functionality includes:
- Indexing Routines for the various types of media data, as discussed in CH.6, for example using:
  o Content terms for text data and
  o Color, shape, and texture features for image data.
- Selection Operators for the SQL3 WHERE clause for specification of selection criteria for media retrieval.
- Text Processing Sub-Systems for similarity evaluation and result ranking.

Unfortunately, the result of this 'independent' activity is non-standard OR-DBMS/mm (multimedia) systems that differ in the functionality included and limit data retrieval from multiple OR-DBMS types. For example, unified access to data stored in Oracle and DB2 systems is difficult, both in query formulation and result presentation. Since the syntax of the SQL3 extensions varies between OR-DBMS/mm implementations, the examples used in the following are given in generic SQL3/TextRetrieval (or sql3/tr) statements.

Text Document Retrieval

Text-based documents are basically unstructured and can be complex. They can consist of the raw text only, have a tagged structure (such as for html documents), include embedded images, and can have a number of fixed attributes containing the metadata describing aspects of the document. They may also include links to supplementary materials. For example, a news report for an election could include the following components:
1. Identifier, date, and author(s) of the report,
2. n* text blocks - (titles, abstract, content text),
3. m* images - example: image_of_candidate,
4. k* charts, and
5. x* maps,
where n, m, k, and x are the number of occurrences of each component type.

Note that the document elements listed in pt. 1 above function as context metadata for the report, while the text itself can function as semantic metadata for both the text (through indexing) and the image materials. A Web document of this kind illustrates the elements of a semi-structured document. Since an OR-DB can contain text documents such as web pages, SQL3 should be extended with processing operators that support access to each of the element types listed above.

Retrieval using Context Metadata

In an OR-DB, document descriptors such as Document ID, Date, and Author(s) function as context metadata. The metadata can be implemented as standard atomic attributes and relationships, thus enabling use of standard SQL queries for retrieval of the document(s). For example, an SQL query to find recent articles on database management by Joan Nordbotten could be expressed as:

SELECT R.*
FROM Person P, Author A, Report R
WHERE P.id = A.Pid AND A.Rid = R.id
AND Name = 'Joan Nordbotten'
AND A.Date > '1999-12-31'
AND Title LIKE '%Database%';

Note that this query assumes that there could be reports on different topics and therefore requires use of a semantic descriptor to select only those documents that indicate that the report has something to do with databases. The Title attribute was used in this query, but other semantic metadata, such as the summary and/or keyword attributes, could also have been chosen - alone or in combination. Execution optimization of this query will place the LIKE operator 'last' so that its time-consuming serial search of the Report.title attribute will be restricted to those reports that satisfy the Author.name and date conditions. However, as noted previously, no term index functionality for multiple term attributes has been included in the standard SQL3, thus there is no alternative to the serial search for the LIKE operator.

Information retrieval using the standard SQL exact match operators functions well for the context metadata of all media types and moderately well for the semantic content metadata attributes. The problem is that the user must know the DB structure, the attribute names and the DB values in order to form a query. This will not be the case for Internet searchers.

Text Retrieval by Semantic Content

Researchers and developers of document collections strongly recommend that the semantic information content of the documents be described using such semantic content metadata attributes as a title, (a list of) subject keywords, and a content description - all multiple term descriptors. This information can be stored with the document as standard SQL attributes using variable length character data types. For example, an OR-DB for web-site maintenance could be developed to contain Web documents described using Dublin Core metadata elements. If the DB contained the Web page, it could be retrieved using the following SQL statement based on the semantic metadata and the text itself.

    SELECT *
    FROM   Document
    WHERE  (Title LIKE '%Edvard Grieg%'
    OR      Text LIKE '%Edvard Grieg%');

In this case, the document was selected by a match in the title, since Edvard Grieg is not mentioned by full name in the text of the article. However, the following SQL3 query will not return this document, though it is relevant to the intent of the query, unless the phrase Norwegian composer has been defined in the Keywords list.

    SELECT *
    FROM   Document
    WHERE  (Title LIKE '%Norwegian composer%'
    OR      Keywords LIKE '%Norwegian composer%'
    OR      Text LIKE '%Norwegian composer%');

The most obvious problems using a standard SQL3 system for text search include the:
- Lack of utilization of the document structure.
- Dependency on the serial search of the LIKE operator for the multiple term semantic metadata attributes and text body.
- The potential mismatch between the user query terms and the terms in the document descriptors.

As noted earlier, SQL3 has no concept of a document or words and therefore there are no search operators for specifying the placement of search terms in a document (adjacent, near, before, after, ...). Since data retrieval in SQL3 is based on an exact match of the query terms and the DB values, no support is provided for similarity evaluation between the query terms and the document content. Obviously, more powerful operators are needed for text retrieval. Ideally, a query language that supports text search and retrieval by the semantic content of text documents must provide at least the following functionality.

Search Criteria       Example
List of terms         Norwegian, composer, Grieg
Term proximity        Edvard near Grieg
Synonym concepts      about "Norwegian composers"
Similar documents     like this document

To help avoid problems with the use of various term forms, a root extraction function must be available for both document indexing and query preprocessing. Using the above examples, some elements in the root-term table could be:

Root Term     Variations
Norway        Norwegian, Norge, Norsk, ...
Compose       composer, composers, composes, ...
Music         tune, tunes, song, songs, ...

Note that there exist numerous electronic dictionaries, thesauri, taxonomies, and ontologies that can be incorporated into a text query processor.

SQL3/Text

Information Retrieval Systems (IRS) have been under development since the mid 1950s. They provide search and retrieval functions for text document collections based on document structure, concepts of words, and grammar. It is functionality from these systems that has been added by OR-DBMS vendors to support management of multimedia data. The resulting ORDBMS/MM (Multimedia) conforms (to some degree) to the Multimedia Information Retrieval Systems, MIRS, envisioned by Lu (1999). Basic ORDBMS/MM text retrieval functionality includes generation of multiple types of term indexes, as well as a contains operator with suboperators for the WHERE clause. The contains operator differs from an exact match query in that it gives a probability for a match - a similarity score - between the query search terms and the documents in the database, rather than a true/false result. This operator can be used with multiple search terms and operators that specify relationships between the search terms, for example the Boolean operators AND, OR, NOT and location operators such as adjacent, within same sentence or paragraph for text documents, as illustrated in the following table.

Term combination           AND, OR, NOT
Term location              ADJACENT, NEAR, WITHIN, ...
Concept                    ABOUT, SIMILAR
Various other operators    FUZZY, LIKE, ...

Assuming that whole Web pages are stored in an OR-DB attribute Document.text, the following examples will retrieve the document, in addition to other documents containing the search terms.

1)  SELECT * FROM Document
    WHERE Text CONTAINS ('Edvard' AND 'Grieg');
2)  SELECT * FROM Document
    WHERE Text CONTAINS ('Edvard' ADJACENT 'Grieg');
3)  SELECT * FROM Document
    WHERE Text ABOUT ('composers');
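The term indexes behind queries 1 and 2 can be illustrated with a minimal Java sketch. It assumes a hypothetical in-memory positional index (the names TermIndex, add, and, and adjacent are illustrative and do not belong to any ORDBMS product), and it reads ADJACENT as "the second term immediately follows the first", which is only one possible interpretation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical positional term index; not the API of any real ORDBMS.
class TermIndex {
    // term -> (document id -> word positions of the term in that document)
    private final Map<String, Map<Integer, List<Integer>>> postings = new HashMap<>();

    // Index a document: split into lower-case words and record each position.
    void add(int docId, String text) {
        String[] words = text.toLowerCase().split("\\W+");
        for (int pos = 0; pos < words.length; pos++) {
            if (words[pos].isEmpty()) continue;
            postings.computeIfAbsent(words[pos], t -> new HashMap<>())
                    .computeIfAbsent(docId, d -> new ArrayList<>())
                    .add(pos);
        }
    }

    private Map<Integer, List<Integer>> lookup(String term) {
        return postings.getOrDefault(term.toLowerCase(), Map.of());
    }

    // CONTAINS (a AND b): documents in which both terms occur anywhere.
    Set<Integer> and(String a, String b) {
        Set<Integer> result = new TreeSet<>(lookup(a).keySet());
        result.retainAll(lookup(b).keySet());
        return result;
    }

    // CONTAINS (a ADJACENT b): assumed to mean b directly follows a.
    Set<Integer> adjacent(String a, String b) {
        Set<Integer> result = new TreeSet<>();
        Map<Integer, List<Integer>> second = lookup(b);
        for (Map.Entry<Integer, List<Integer>> e : lookup(a).entrySet()) {
            List<Integer> next = second.getOrDefault(e.getKey(), List.of());
            for (int pos : e.getValue()) {
                if (next.contains(pos + 1)) {
                    result.add(e.getKey());
                    break;
                }
            }
        }
        return result;
    }
}
```

The AND query only needs to know which documents contain a term, whereas the ADJACENT query needs the stored word positions, which is why a term location index is a separate requirement.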

In processing the above queries, the SQL3/Text processing system utilizes the term indexes generated for the document set, as well as a thesaurus for query 3. Note that a term location index is required for query 2, while query 1 needs a frequency index if the retrieved documents are to be ranked/ordered by the frequency of the search terms within the documents.

Image Retrieval

Popular knowledge claims that an image is worth 1000 words. Unfortunately, these 1000 words may differ from one individual to another depending on their perspective and/or knowledge of the image context. For example, Figure 6 gives a familiar demonstration that an image can have multiple, quite different interpretations. Thus, even if a 1000-word image description were available, it is not certain that the image could be retrieved by a user with a different description.

The problem is fundamentally one of communication between an information/image seeker/user and the image retrieval system. Since the user may have differing needs and knowledge about the image collection, an image retrieval system must support various forms for query formulation. In general, image retrieval queries can be classified as:
1. Attribute-Based Queries: which use context and/or structural metadata values to retrieve images, for example:
   o Find image number 'x' or
   o Find images from the 17th of May (the Norwegian national holiday).
2. Textual Queries: which use a term-based specification of the desired images that can be matched to textual image descriptors, for example:
   o Find images of Hawaiian sunsets or
   o Find images of President Bush delivering a campaign speech
3. Visual Queries: which give visual characteristics (color, texture) or an image that can be compared to visual descriptors. Examples include:
   o Find images where the dominant color is blue and gold or
   o Find images like <this one>.

These query types utilize different image descriptors and require different processing functions. Image descriptors can be classified into: Metadata Descriptors: those that describe the image, as recommended in the numerous metadata standards, such as Dublin Core, CIDOC/CRM and MPEG-7, from the library, museum and motion picture communities respectively. These metadata can again be classified as: 1. Attribute-based context and structural metadata, such as creator, dates, genre, (source) image type, size, file name, ..., or 2. Text-based semantic metadata, such as title/caption, subject/keyword lists, free-text descriptions and/or the text surrounding embedded images, for example as used in a html document. Note that for embedded images, content indexing can be generated using the nearby text.

5. What are differences in Centralized and Distributed Database Systems? List the relative advantages of data distribution.

Ans 5: Features of Distributed vs. Centralized Databases

Centralized Control vs. Decentralized Control
In centralized control one "database administrator" ensures safety of data, whereas in distributed control it is possible to use a hierarchical control structure based on a "global database administrator", who has central responsibility for the whole data, along with "local database administrators", who have responsibility for the local databases.

Data Independence
In centralized databases it means the actual organization of data is transparent to the application programmer. The programs are written with a "conceptual" view of the data (called the "conceptual schema"), and the programs are unaffected by the physical organization of data. In distributed databases, another aspect, "distribution transparency", is added to the notion of data independence as used in centralized databases. Distribution transparency means programs are written as if the data were not distributed. Thus correctness of programs is unaffected by the movement of data from one site to another; however, their speed of execution is affected.

Reduction of Redundancy
In centralized databases redundancy is reduced for two reasons: (a) inconsistencies among several copies of the same logical data are avoided, (b) storage space is saved. Reduction of redundancy is obtained by data sharing. In distributed databases data redundancy is desirable as (a) locality of applications can be increased if data is replicated at all sites where applications need it, (b) the availability of the system can be increased, because a site failure does not stop the execution of applications at other sites if the data is replicated. With data replication, retrieval can be performed on any copy, while updates must be performed consistently on all copies.

Complex Physical Structures and Efficient Access
In centralized databases complex accessing structures like secondary indexes and interfile chains are used. All these features provide efficient access to data. In distributed databases efficient access requires accessing data from different sites. For this an efficient distributed data access plan is required, which can be written by the programmer or produced automatically by an optimizer. Problems faced in the design of an optimizer can be classified in two categories:
a) Global optimization consists of determining which data must be accessed at which sites and which data files must consequently be transmitted between sites.
b) Local optimization consists of deciding how to perform the local database accesses at each site.

Integrity, Recovery and Concurrency Control
A transaction is an atomic unit of execution and atomic transactions are the means to obtain database integrity. Failures and concurrency are the two threats to atomicity. Failures may cause the system to stop in the midst of transaction execution, thus violating the atomicity requirement. Concurrent execution of different transactions may permit one transaction to observe an inconsistent, transient state created by another transaction during its execution. Concurrent execution requires synchronization amongst the transactions, which is much harder in distributed systems.
Privacy and Security
In traditional databases, the database administrator, having centralized control, can ensure that only authorized access to the data is performed. In distributed databases, local administrators face the same problem as well as two new aspects of it: (a) security (protection) problems that are intrinsic to distributed systems because of their communication networks; (b) in databases with a high degree of "site autonomy", the owners of local data may feel more protected because they can enforce their own protections instead of depending on a central database administrator.

Distributed Query Processing

The DDBMS should be capable of gathering and presenting data from more than one site to answer a single query. In theory a distributed system can handle queries more quickly than a centralized one, by exploiting parallelism and reducing disk contention; in practice the main delays (and costs) will be imposed by the communications network. Routing algorithms must take many factors into account to determine the location and ordering of operations. Communications costs for each link in the network are relevant, as also are variable processing capabilities and loadings for different nodes, and (where data fragments are replicated) trade-offs between cost and currency. If some nodes are updated less frequently than others there may be a choice between querying the local out-of-date copy very cheaply and getting a more up-to-date answer by accessing a distant location.

Distributed Directory (Catalog) Management
Catalogs for distributed databases contain information like fragmentation description, allocation description, mappings to local names, access method description, statistics on the database, and protection and integrity constraints (consistency information), which are more detailed as compared to centralized databases.

Relative Advantages of Distributed Databases over Centralized Databases

Organizational and Economic Reasons
Many organizations are decentralized, and a distributed database approach fits more naturally the structure of the organization. The organizational and economic motivations are amongst the main reasons for the development of distributed databases. In organizations already having several databases and feeling the necessity of global applications, a distributed database is the natural choice.

Incremental Growth
In a distributed environment, expansion of the system in terms of adding more data, increasing database size, or adding more processors is much easier.
Reduced Communication Overhead
Many applications are local, and these applications do not have any communication overhead. Therefore, the maximization of the locality of applications is one of the primary objectives in distributed database design.

Performance Considerations
Data localization reduces the contention for CPU and I/O services and simultaneously reduces access delays involved in wide area networks. Local queries and transactions accessing data at a single site have better performance because of the smaller local databases. In addition, each site has a smaller number of transactions executing than if all transactions are submitted to a single centralized database. Moreover, inter-query and intra-query parallelism can be achieved by executing multiple queries at different sites, or breaking up a query into a number of subqueries that execute in parallel. This contributes to improved performance.

Reliability and Availability
Reliability is defined as the probability that a system is running (not down) at a certain time point. Availability is the probability that the system is continuously available during a time interval. When the data and DBMS software are distributed over several sites, one site may fail while other sites continue to operate. Only the data and software that exist at the failed site cannot be accessed. This improves both reliability and availability. Further improvement is achieved by judiciously replicating data and software at more than one site.

Management of Distributed Data with Different Levels of Transparency
In a distributed database, the following types of transparencies are possible:

Distribution or Network Transparency
This refers to freedom for the user from the operational details of the network. It may be divided into location and naming transparency. Location transparency refers to the fact that the command used to perform a task is independent of the location of data and the location of the system where the command was issued. Naming transparency implies that once a name is specified, the named objects can be accessed unambiguously without additional specification.

Replication Transparency
Copies of the data may be stored at multiple sites for better availability, performance, and reliability. Replication transparency makes the user unaware of the existence of copies.

Fragmentation Transparency
Two main types of fragmentation are horizontal fragmentation, which distributes a relation into sets of tuples (rows), and vertical fragmentation, which distributes a relation into subrelations where each subrelation is defined by a subset of the columns of the original relation. A global query by the user must be transformed into several fragment queries. Fragmentation transparency makes the user unaware of the existence of fragments.
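Horizontal and vertical fragmentation, and the rewriting of a global query into fragment queries, can be sketched in Java as follows. This is only an illustration under assumed names: the Emp relation, its columns, and the Fragments helper are hypothetical and are not taken from any particular DDBMS.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical global relation EMP(id, name, city, salary).
class Emp {
    final int id;
    final String name;
    final String city;
    final int salary;

    Emp(int id, String name, String city, int salary) {
        this.id = id;
        this.name = name;
        this.city = city;
        this.salary = salary;
    }
}

class Fragments {
    // Horizontal fragmentation: each fragment holds a subset of the tuples (rows).
    static List<Emp> horizontal(List<Emp> relation, String city) {
        List<Emp> fragment = new ArrayList<>();
        for (Emp e : relation) {
            if (e.city.equals(city)) fragment.add(e);
        }
        return fragment;
    }

    // Vertical fragmentation: each fragment holds a subset of the columns,
    // plus the key (id) so that the relation can be rebuilt by a join.
    static Map<Integer, Integer> salaryColumn(List<Emp> relation) {
        Map<Integer, Integer> fragment = new LinkedHashMap<>();
        for (Emp e : relation) fragment.put(e.id, e.salary);
        return fragment;
    }

    // A global query ("names of employees earning more than limit") run as
    // one fragment query per horizontal fragment; the results are combined.
    static List<String> namesEarningOver(int limit, List<List<Emp>> fragments) {
        List<String> names = new ArrayList<>();
        for (List<Emp> fragment : fragments) {
            for (Emp e : fragment) {
                if (e.salary > limit) names.add(e.name);
            }
        }
        return names;
    }
}
```

With fragmentation transparency the caller would only ever pose the global query; the split into per-fragment loops inside namesEarningOver stands in for the work the system does on the user's behalf.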

6. What are Commit Protocols? Explain how the Two-Phase Commit Protocol responds to the following types of failures: i) Failure of Participating Site, ii) Failure of Coordinator

Ans 6: Commit Protocols
In distributed database and transaction systems a distributed commit protocol is required to ensure that the effects of a distributed transaction are atomic, that is, either all the effects of the transaction persist or none persist, whether or not failures occur. Several commit protocols have been proposed in the literature. These are variations of what has become a standard, known as the two-phase commit (2PC) protocol.

Two-phase commit protocol
In transaction processing, databases, and computer networking, the two-phase commit protocol (2PC) is a type of atomic commitment protocol (ACP). It is a distributed algorithm that coordinates all the processes that participate in a distributed atomic transaction on whether to commit or abort (roll back) the transaction (it is a specialized type of consensus protocol). The protocol achieves its goal even in many cases of temporary system failure (involving process, network node, or communication failures), and is thus widely utilized. However, it is not resilient to all possible failure configurations, and in rare cases user (e.g., a system's administrator) intervention is needed to remedy the outcome. To accommodate recovery from failure (automatic in most cases) the protocol's participants use logging of the protocol's states. Log records, which are typically slow to generate but survive failures, are used by the protocol's recovery procedures. Though usually intended to be used infrequently, recovery procedures comprise a substantial portion of the protocol, due to the many possible failure scenarios to be considered and supported by the protocol.

The protocol consists of two phases:
1. The commit-request phase (or voting phase), in which a coordinator process attempts to prepare all the transaction's participating processes (named participants, cohorts, or workers) to take the necessary steps for either committing or aborting the transaction and to vote, either "Yes": commit (if the transaction participant's local portion execution has ended properly), or "No": abort (if a problem has been detected with the local portion).
2. The commit phase, in which, based on the voting of the cohorts, the coordinator decides whether to commit (only if all have voted "Yes") or abort the transaction (otherwise), and notifies the result to all the cohorts. The cohorts then follow with the needed actions (commit or abort) with their local transactional resources (also called recoverable resources; e.g., database data) and their respective portions in the transaction's other output (if applicable).

(i) Failure of Participating Site
If any cohort votes No during the commit-request phase (or the coordinator's timeout expires):
(1) The coordinator sends a rollback message to all the cohorts.
(2) Each cohort undoes the transaction using the undo log, and releases the resources and locks held during the transaction.
(3) Each cohort sends an acknowledgement to the coordinator.
(4) The coordinator undoes the transaction when all acknowledgements have been received.

(ii) Failure of Coordinator
If the coordinator fails after a cohort has voted Yes but before that cohort has received the decision, the cohort cannot decide on its own and must wait, holding its resources and locks, until the coordinator recovers. On recovery the coordinator reads its log records to determine the state of the transaction and resends the commit or rollback decision to the cohorts.
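The voting and decision steps can be sketched in Java as below. This is a simplified illustration rather than a complete protocol: the Cohort interface, the Coordinator class, and the RecordingCohort helper are hypothetical names, a cohort that fails or times out is modelled as throwing a RuntimeException (an assumption), and the logging and acknowledgement messages described above are omitted.

```java
import java.util.List;

// Hypothetical participant interface; names are illustrative only.
interface Cohort {
    boolean prepare();   // vote: true = "Yes", false = "No"
    void commit();
    void rollback();
}

class Coordinator {
    // Returns true if the transaction committed, false if it was rolled back.
    static boolean run(List<? extends Cohort> cohorts) {
        boolean allYes = true;
        // Phase 1: commit-request (voting) phase.
        for (Cohort c : cohorts) {
            boolean vote;
            try {
                vote = c.prepare();
            } catch (RuntimeException failure) {
                vote = false;    // assumed: a failed or timed-out cohort counts as "No"
            }
            if (!vote) {
                allYes = false;
                break;
            }
        }
        // Phase 2: commit phase. The same decision is sent to every cohort.
        for (Cohort c : cohorts) {
            if (allYes) c.commit();
            else c.rollback();
        }
        return allYes;
    }
}

// Small in-memory cohort used to try the sketch out.
class RecordingCohort implements Cohort {
    private final boolean vote;
    String state = "active";

    RecordingCohort(boolean vote) { this.vote = vote; }

    public boolean prepare() { return vote; }
    public void commit() { state = "committed"; }
    public void rollback() { state = "rolled back"; }
}
```

Calling Coordinator.run with two cohorts that both vote Yes leaves both in the committed state; if either votes No, both end up rolled back, which is the all-or-nothing outcome the protocol is meant to guarantee.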

Master of Computer Application (MCA) Semester 4 MC0078 - Java Programming


Assignment Set -1

Q. 1. Write a program to perform the basic arithmetic operations:

a) Addition

    class Prog21 {
        public static void main(String[] args) {
            int i = 10;
            int j = 20;
            i = i + j;
            System.out.println("The sum is: " + i);
        }
    }

b) Subtraction

    class Prog21 {
        public static void main(String[] args) {
            int i = 20;
            int j = 10;
            i = i - j;
            System.out.println("The difference is: " + i);
        }
    }

c) Multiplication

    class Prog21 {
        public static void main(String[] args) {
            int i = 20;
            int j = 10;
            i = i * j;
            System.out.println("The product is: " + i);
        }
    }

d) Division

    class Prog21 {
        public static void main(String[] args) {
            int i = 20;
            int j = 10;
            i = i / j;   // integer division: any remainder is discarded
            System.out.println("The quotient is: " + i);
        }
    }

Q. 2. Discuss the following with suitable example programs for each: A) Data Types in Java

There are two kinds of data types in Java:
- Primitive/standard data types.
- Abstract/derived data types.

Primitive Data Types
Primitive data types (also known as standard data types) are the data types that are built into the Java language. The Java compiler holds detailed instructions on each legal operation the data type supports. There are eight primitive data types in Java.

The data types byte, short, int, long, float and double are numeric data types. The first four of these can hold only whole numbers, whereas the last two (float and double) can hold decimal values like 5.05. All these data types can hold negative values. Java has no unsigned keyword; among the primitive types only char is unsigned. Of the remaining types, boolean can hold only the value true or false and char can hold only a single character.

Abstract/Derived Data Types
Abstract data types are based on primitive data types and have more functionality than the primitive data types. For example, String is an abstract data type that can store letters, digits and other characters like /, (); :$#. You cannot perform calculations on a variable of the String data type even if the data stored in it has digits.
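A short example program for the data types described above is given here as a sketch. The class name DataTypesDemo and its helper methods are illustrative choices, not part of the original answer.

```java
class DataTypesDemo {
    // Dividing two ints discards the fractional part.
    static int wholeQuotient(int a, int b) {
        return a / b;
    }

    // Converting one operand to double keeps the fractional part.
    static double exactQuotient(int a, int b) {
        return (double) a / b;
    }

    // A String holding digits must be converted before doing arithmetic.
    static int addToDigits(String digits, int n) {
        return Integer.parseInt(digits) + n;
    }

    public static void main(String[] args) {
        byte b = 127;                    // 8-bit whole number
        short s = 32767;                 // 16-bit whole number
        int i = 2147483647;              // 32-bit whole number
        long l = 9223372036854775807L;   // 64-bit whole number
        float f = 5.05f;                 // 32-bit decimal value
        double d = 5.05;                 // 64-bit decimal value
        char c = 'A';                    // a single character
        boolean flag = true;             // true or false
        String text = "12";              // derived type built on characters

        System.out.println(b + " " + s + " " + i + " " + l);
        System.out.println(f + " " + d + " " + c + " " + flag);
        System.out.println(wholeQuotient(7, 2));    // prints 3
        System.out.println(exactQuotient(7, 2));    // prints 3.5
        System.out.println(text + 3);               // concatenation, prints 123
        System.out.println(addToDigits(text, 3));   // arithmetic, prints 15
    }
}
```

The last two lines show why a String of digits is not a number: the + operator joins the text "12" and 3 into "123", and only after Integer.parseInt converts the text is the sum 15 obtained.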

B) Variables in Java

When you learned algebraic equations in school, you used x and y to represent values in equations. Unlike pi, which has a constant value of approximately 3.14, the values of x and y are not constant in equations. Java provides constants and variables to store data in programs. Java allocates memory to each variable and constant you use in your program. As in algebra, the values of variables may change in a program, but the values of constants, as the name suggests, do not change. You must assign unique names to variables and constants. Variable names are used in a program in much the same way as they are in ordinary algebra. Each variable used in a program must be declared. That is to say, the program must contain a statement specifying precisely what kind of information (data type) the variable will contain. This applies to every variable used in the program, regardless of the type.

Naming Variables
A program refers to a variable using its name. Certain rules and conventions govern the naming of variables. You must adhere to the rules. Conventions help improve the readability of the program, but following them is not mandatory.

Rules for Naming Variables in Java
A variable name:
- Must not be a keyword in Java.
- Must not begin with a digit.
- Must not contain embedded spaces.
- Can contain characters from various alphabets, like Japanese, Greek, and Cyrillic.

Syntax for Defining Variables
All the attributes of a class are defined as data members. The syntax used to declare a class variable is:

    <data_type> <variable_name>;

Just as the braces { } are used to mark the beginning and end of a class, a semicolon ; is used to mark the end of a statement.
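The declaration syntax and the difference between a variable and a constant can be seen in the following sketch. The class name VariablesDemo, the constant PI, and the method circleArea are illustrative assumptions.

```java
class VariablesDemo {
    // A constant: the final keyword prevents any later reassignment.
    static final double PI = 3.14159;

    static double circleArea(double radius) {
        return PI * radius * radius;
    }

    public static void main(String[] args) {
        int count;             // declaration: <data_type> <variable_name>;
        count = 10;            // first assignment
        count = count + 5;     // the value of a variable may change
        double radius = 2.0;   // declaration combined with an initial value

        System.out.println("count = " + count);   // prints count = 15
        System.out.println("area = " + circleArea(radius));
    }
}
```

Adding a line such as PI = 3.0; would be rejected by the compiler, because a final variable cannot be assigned a second time.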

Q. 3. What are the different types of control statements?

Ans. The following statements are used to control the flow of execution in a program.
1. Decision Making Statements
   - If-else statement
   - Switch case statement
2. Looping Statements
   - For loop
   - While loop
   - Do-while loop
3. Other Statements
   - Break
   - Continue
   - Label

If-else statement
The if statement is Java's conditional branch statement. It can be used to route program execution through two different paths. Here is the general form of the if statement:

    if (condition) statement1;
    else statement2;

Here, each statement may be a single statement or a compound statement enclosed in curly braces (that is, a block). The condition is any expression that returns a boolean value. The else clause is optional. The if works like this: If the condition is true, then statement1 is executed. Otherwise, statement2 (if it exists) is executed. In no case will both statements be executed. For example, consider the following:
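A minimal sketch of this form, in which exactly one of the two branches runs for any input. The class name IfElseDemo and the method classify are illustrative assumptions.

```java
class IfElseDemo {
    // Exactly one of the two return statements is executed.
    static String classify(int number) {
        if (number % 2 == 0)
            return "even";
        else
            return "odd";
    }

    public static void main(String[] args) {
        System.out.println("4 is " + classify(4));   // prints 4 is even
        System.out.println("7 is " + classify(7));   // prints 7 is odd
    }
}
```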

Figure 3.6

Most often, the expression used to control the if will involve the relational operators. However, this is not technically necessary. It is possible to control the if using a single boolean variable, as shown in this code fragment:

    boolean dataAvailable;
    // ...
    if (dataAvailable)
        ProcessData();
    else
        waitForMoreData();

Remember, only one statement can appear directly after the if or the else. If you want to include more statements, you'll need to create a block, as in this fragment:

    int bytesAvailable;
    // ...
    if (bytesAvailable > 0) {
        ProcessData();
        bytesAvailable -= n;
    } else
        waitForMoreData();

Here, both statements within the if block will execute if bytesAvailable is greater than zero. Some programmers find it convenient to include the curly braces when using the if, even when there is only one statement in each clause. This makes it easy to add another statement at a later date, and you don't have to worry about forgetting the braces. In fact, forgetting to define a block when one is needed is a common cause of errors. For example, consider the following code fragment:

    int bytesAvailable;
    // ...
    if (bytesAvailable > 0) {
        ProcessData();
        bytesAvailable -= n;
    } else
        waitForMoreData();
        bytesAvailable = n;

It seems clear that the statement bytesAvailable = n; was intended to be executed inside the else clause, because of the indentation level. However, as you recall, whitespace is insignificant to Java, and there is no way for the compiler to know what was intended. This code will compile without complaint, but it will behave incorrectly when run. The preceding example is fixed in the code that follows:

    int bytesAvailable;
    // ...
    if (bytesAvailable > 0) {
        ProcessData();
        bytesAvailable -= n;
    } else {
        waitForMoreData();
        bytesAvailable = n;
    }

The if-else-if Ladder
A common programming construct that is based upon a sequence of nested ifs is the if-else-if ladder. It looks like this:

    if(condition)
        statement;
    else if(condition)
        statement;
    else if(condition)
        statement;
    .
    .
    .
    else
        statement;

The if statements are executed from the top down. As soon as one of the conditions controlling the if is true, the statement associated with that if is executed, and the rest of the ladder is bypassed. If none of the conditions is true, then the final else statement will be executed. The final else acts as a default condition; that is, if all other conditional tests fail, then the last else statement is performed. If there is no final else and all other conditions are false, then no action will take place. Here is a program that uses an if-else-if ladder to determine which season a particular month is in.

    // Demonstrate if-else-if statements.
    class IfElse {
        public static void main(String args[]) {
            int month = 4; // April
            String season;
            if(month == 12 || month == 1 || month == 2)
                season = "Winter";
            else if(month == 3 || month == 4 || month == 5)
                season = "Spring";
            else if(month == 6 || month == 7 || month == 8)
                season = "Summer";
            else if(month == 9 || month == 10 || month == 11)
                season = "Autumn";
            else
                season = "Bogus Month";
            System.out.println("April is in the " + season + ".");
        }
    }

Here is the output produced by the program:

    April is in the Spring.

You might want to experiment with this program before moving on. As you will find, no matter what value you give month, one and only one assignment statement within the ladder will be executed.

Switch Statement
The switch statement is Java's multiway branch statement. It provides an easy way to dispatch execution to different parts of your code based on the value of an expression. As such, it often provides a better alternative than a large series of if-else-if statements. Here is the general form of a switch statement:

    switch (expression) {
        case value1:
            // statement sequence
            break;
        case value2:
            // statement sequence
            break;
        .
        .
        .
        case valueN:
            // statement sequence
            break;
        default:
            // default statement sequence
    }

The expression must be of type byte, short, int, or char (Java 5 and later also accept enum values, and Java 7 and later accept String); each of the values specified in the case statements must be of a type compatible with the expression. Each case value must be a unique literal (that is, it must be a constant, not a variable). Duplicate case values are not allowed. The switch statement works like this: The value of the expression is compared with each of the literal values in the case statements. If a match is found, the code sequence following that case statement is executed. If none of the constants matches the value of the expression, then the default statement is executed. However, the default statement is optional. If no case matches and no default is present, then no further action is taken. The break statement is used inside the switch to terminate a statement sequence. When a break statement is encountered, execution branches to the first line of code that follows the entire switch statement. This has the effect of "jumping out" of the switch.

Example
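A minimal sketch of a switch in which every case ends with a break, so that only the matching statement sequence runs. The class name SwitchDemo and the method name are illustrative assumptions rather than part of the original text.

```java
class SwitchDemo {
    // Maps a small number to a word; any other value reaches default.
    static String name(int i) {
        String result;
        switch (i) {
            case 0:
                result = "zero";
                break;
            case 1:
                result = "one";
                break;
            case 2:
                result = "two";
                break;
            default:
                result = "something else";
        }
        return result;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 4; i++)
            System.out.println(i + " is " + name(i));
    }
}
```

For i from 0 to 3 the program prints "0 is zero", "1 is one", "2 is two" and "3 is something else", one per line.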

Figure 3.7

The break statement is optional. If you omit the break, execution will continue on into the next case. It is sometimes desirable to have multiple cases without break statements between them. For example, consider the following program:

    // In a switch, break statements are optional.
    class MissingBreak {
        public static void main(String args[]) {
            for(int i=0; i<12; i++)
                switch(i) {
                    case 0:
                    case 1:
                    case 2:
                    case 3:
                    case 4:
                        System.out.println("i is less than 5");
                        break;
                    case 5:
                    case 6:
                    case 7:
                    case 8:
                    case 9:
                        System.out.println("i is less than 10");
                        break;
                    default:
                        System.out.println("i is 10 or more");
                }
        }
    }

This program generates the following output:

    i is less than 5
    i is less than 5
    i is less than 5
    i is less than 5
    i is less than 5
    i is less than 10
    i is less than 10
    i is less than 10
    i is less than 10
    i is less than 10
    i is 10 or more
    i is 10 or more

Nested switch Statements
You can use a switch as part of the statement sequence of an outer switch. This is called a nested switch. Since a switch statement defines its own block, no conflicts arise between the case constants in the inner switch and those in the outer switch. For example, the following fragment is perfectly valid:

    switch(count) {
        case 1:
            switch(target) { // nested switch
                case 0:
                    System.out.println("target is zero");
                    break;
                case 1: // no conflicts with outer switch
                    System.out.println("target is one");
                    break;
            }
            break;
        case 2: // ...

Here, the case 1: statement in the inner switch does not conflict with the case 1: statement in the outer switch. The count variable is only compared with the list of cases at the outer level. If count is 1, then target is compared with the inner list cases. In summary, there are three important features of the switch statement to note:
- The switch differs from the if in that switch can only test for equality, whereas if can evaluate any type of Boolean expression. That is, the switch looks only for a match between the value of the expression and one of its case constants.
- No two case constants in the same switch can have identical values. Of course, a switch statement enclosed by an outer switch can have case constants in common.
- A switch statement is usually more efficient than a set of nested ifs.

The last point is particularly interesting because it gives insight into how the Java compiler works. When it compiles a switch statement, the Java compiler will inspect each of the case constants and create a "jump table" that it will use for selecting the path of execution depending on the value of the expression. Therefore, if you need to select among a large group of values, a switch statement can run much faster than the equivalent logic coded using a sequence of if-elses. The compiler can do this because it knows that the case constants are all the same type and simply must be compared for equality with the switch expression. The compiler has no such knowledge of a long list of if expressions.

for Loop

The general form of the for loop is as follows:

    for (initial statement; termination condition; increment instruction)
        statement;

When multiple statements are to be included in the for loop, the statements are enclosed in curly braces:

    for (initial statement; termination condition; increment instruction) {
        statement1;
        statement2;
    }

The example below prints the numbers from 1 to 10.
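A minimal sketch of such a loop follows. The class and method names are illustrative and are not taken from the original figure; the output is built in a string so that it can be inspected as well as printed.

```java
// Hypothetical example: prints the integers 1 through 10, one per line.
class PrintOneToTen {
    // Builds the output with a for loop: initialization, condition, increment.
    static String numbers() {
        StringBuilder out = new StringBuilder();
        for (int i = 1; i <= 10; i++) {
            out.append(i).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(numbers());
    }
}
```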

Figure 3.8

The results of the above program are shown below.

Figure 3.9

Like most other programming languages, Java allows loops to be nested. That is, one loop may be inside another. For example, here is a program that nests for loops:

    // Loops may be nested.
    class Nested {
        public static void main(String args[ ]) {
            int i, j;
            for(i=0; i<10; i++) {
                for(j=i; j<10; j++)
                    System.out.print(".");
                System.out.println();
            }
        }
    }

The output produced by this program is shown here:

    ..........
    .........
    ........
    .......
    ......
    .....
    ....
    ...
    ..
    .

While Statement

The while loop is Java's most fundamental looping statement. It repeats a statement or block while its controlling expression is true. Here is its general form:

    while (condition) {
        // body of loop
    }

The condition can be any Boolean expression. The body of the loop will be executed as long as the conditional expression is true. When condition becomes false, control passes to the next line of code immediately following the loop. The curly braces are unnecessary if only a single statement is being repeated.

Example
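A minimal sketch of a while loop, assuming a simple countdown; the class name WhileDemo and the "tick" text are illustrative only. Note that when the starting value is zero or less, the condition is false on the first test and the body never runs.

```java
// Hypothetical example: counts down from a starting value using while.
class WhileDemo {
    static String countdown(int start) {
        StringBuilder out = new StringBuilder();
        int n = start;
        // The condition is tested before every pass through the body.
        while (n > 0) {
            out.append("tick ").append(n).append('\n');
            n--;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(countdown(5));
    }
}
```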

do-while Statement

As you just saw, if the conditional expression controlling a while loop is initially false, then the body of the loop will not be executed at all. However, sometimes it is desirable to execute the body of a loop at least once, even if the conditional expression is false to begin with. In other words, there are times when you would like to test the termination expression at the end of the loop rather than at the beginning. Fortunately, Java supplies a loop that does just that: the do-while. The do-while loop always executes its body at least once, because its conditional expression is at the bottom of the loop. Its general form is:

    do {
        // body of loop
    } while (condition);

Each iteration of the do-while loop first executes the body of the loop and then evaluates the conditional expression. If this expression is true, the loop will repeat. Otherwise, the loop terminates. As with all of Java's loops, condition must be a boolean expression.

Example
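A minimal sketch of the at-least-once behaviour described above. The class and method names are hypothetical; the method counts how many times the body runs for a given starting value.

```java
// Hypothetical example: the do-while body runs once even when n starts at 0.
class DoWhileDemo {
    static int iterations(int n) {
        int count = 0;
        do {
            count++;   // body executes before the condition is ever tested
            n--;
        } while (n > 0);
        return count;
    }

    public static void main(String[] args) {
        System.out.println("n = 3: body ran " + iterations(3) + " times");
        System.out.println("n = 0: body ran " + iterations(0) + " time");
    }
}
```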

Figure 3.11

The do-while loop is especially useful when you process a menu selection, because you will usually want the body of a menu loop to execute at least once. Consider the following program, which implements a very simple help system for Java's selection and iteration statements:

    // Using a do-while to process a menu selection
    class Menu {
        public static void main(String args[]) throws java.io.IOException {
            char choice;

            do {
                System.out.println("Help on:");
                System.out.println(" 1. if");
                System.out.println(" 2. switch");
                System.out.println(" 3. while");
                System.out.println(" 4. do-while");
                System.out.println(" 5. for\n");
                System.out.println("Choose one:");
                choice = (char) System.in.read();
            // choice holds a character, so it is compared with '1'..'5', not with the integers 1..5
            } while( choice < '1' || choice > '5');

            System.out.println("\n");

            switch(choice) {
                case '1':
                    System.out.println("The if:\n");
                    System.out.println("if(condition) statement;");
                    System.out.println("else statement;");
                    break;
                case '2':
                    System.out.println("The switch:\n");
                    System.out.println("switch(expression) {");
                    System.out.println(" case constant:");
                    System.out.println(" statement sequence");
                    System.out.println(" break;");
                    System.out.println(" // ");
                    System.out.println("}");
                    break;
                case '3':
                    System.out.println("The while:\n");
                    System.out.println("while(condition) statement;");
                    break;
                case '4':
                    System.out.println("The do-while:\n");
                    System.out.println("do {");
                    System.out.println(" statement;");
                    System.out.println("} while (condition);");
                    break;
                case '5':
                    System.out.println("The for:\n");
                    System.out.print("for(init; condition; iteration)");
                    System.out.println(" statement;");
                    break;
            }
        }
    }

Here is a sample run produced by this program:

    Help on:
     1. if

     2. switch
     3. while
     4. do-while
     5. for

    Choose one:
    4

    The do-while:

    do {
     statement;
    } while (condition);

In the program, the do-while loop is used to verify that the user has entered a valid choice. If not, then the user is reprompted. Since the menu must be displayed at least once, the do-while is the perfect loop to accomplish this.

A few other points about this example: Notice that characters are read from the keyboard by calling System.in.read( ). This is one of Java's console input functions. Although Java's console I/O methods are not discussed in detail here, System.in.read( ) is used to obtain the user's choice. It reads characters from standard input (returned as integers, which is why the return value was cast to char). By default, standard input is line buffered, so you must press ENTER before any characters that you type will be sent to your program. Java's console input is quite limited and awkward to work with. Further, most real-world Java programs and applets will be graphical and window-based. For these reasons, not much use of console input has been made in this book. However, it is useful in this context.

One other point: Because System.in.read( ) is being used, the program must specify the throws java.io.IOException clause. This line is necessary to handle input errors.

Break Statement

By using break, you can force immediate termination of a loop, bypassing the conditional expression and any remaining code in the body of the loop. When a break statement is encountered inside a loop, the loop is terminated and program control resumes at the next statement following the loop. Here is a simple example:

    // Using break to exit a loop.
    class BreakLoop {
        public static void main(String args[ ]) {
            for(int i=0; i<100; i++) {
                if(i == 10) break; // terminate loop if i is 10
                System.out.println("i: " + i);
            }
            System.out.println("Loop complete.");
        }
    }

This program generates the following output:

    i: 0
    i: 1
    i: 2
    i: 3
    i: 4
    i: 5
    i: 6
    i: 7
    i: 8
    i: 9
    Loop complete.

As you can see, although the for loop is designed to run from 0 to 99, the break statement causes it to terminate early, when i equals 10.

Continue Statement

Sometimes it is useful to force an early iteration of a loop. That is, you might want to continue running the loop, but stop processing the remainder of the code in its body for this particular iteration. This is, in effect, a goto just past the body of the loop, to the loop's end. The continue statement performs such an action. In while and do-while loops, a continue statement causes control to be transferred directly to the conditional expression that controls the loop. In a for loop, control goes first to the iteration portion of the for statement and then to the conditional expression. For all three loops, any intermediate code is bypassed.

Here is an example program that uses continue to cause two numbers to be printed on each line:

    // Demonstrate continue.
    class Continue {
        public static void main (String args[]) {
            for (int i=0; i<10; i++) {
                System.out.print (i + " ");
                if (i%2 == 0) continue;
                System.out.println ("");
            }
        }
    }

This code uses the % operator to check if i is even. If it is, the loop continues without printing a newline. Here is the output from this program:

    0 1
    2 3
    4 5
    6 7
    8 9

As with the break statement, continue may specify a label to describe which enclosing loops to continue. Here is an example program that uses continue to print a triangular multiplication table for 0 through 9.
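A sketch of such a program is given here, assuming a label named outer and a class named ContinueLabel (both chosen for illustration). When the inner index passes the outer index, the row is ended and continue outer jumps to the next iteration of the labeled loop, which is what produces the triangular shape.

```java
// Hypothetical example: triangular multiplication table using a labeled continue.
class ContinueLabel {
    static String table() {
        StringBuilder out = new StringBuilder();
        outer:
        for (int i = 0; i < 10; i++) {
            for (int j = 0; j < 10; j++) {
                if (j > i) {
                    out.append('\n');   // end the current row
                    continue outer;     // resume with the next value of i
                }
                out.append(' ').append(i * j);
            }
        }
        out.append('\n');               // the last row fills all ten columns
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(table());
    }
}
```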

Q. 4. Describe the following with respect to Exception Handling:

A) Exception Classes

The class at the top of the exception class hierarchy is called Throwable. Two classes are derived from the Throwable class: Error and Exception. The Exception class is used for the exceptional conditions that have to be trapped in a program. The Error class defines a condition that does not occur under normal circumstances. In other words, the Error class is used for catastrophic failures such as VirtualMachineError. These classes are available in the java.lang package.

B) Common Exceptions

Java has several predefined exceptions. The most common exceptions that you may encounter are described below.

ArithmeticException

This exception is thrown when an exceptional arithmetic condition has occurred. For example, an integer division by zero generates such an exception.

NullPointerException

This exception is thrown when an application attempts to use null where an object is required. A reference variable that has not been assigned an object holds the value null. The situations in which this exception is thrown include:

- Using an object without allocating memory for it.
- Calling the methods of a null object.
- Accessing or modifying the attributes of a null object.

ArrayIndexOutOfBoundsException

This exception is thrown when an attempt is made to access an array element with an index outside the bounds of the array. For example, if you try to access the eleventh element of an array that has only ten elements, the exception will be thrown.
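The three exceptions above can be triggered and caught as in the following sketch. The class and method names are hypothetical; each method returns the name of the exception it caught so that the behaviour is easy to check.

```java
// Hypothetical example: triggering and catching three common runtime exceptions.
class CommonExceptionsDemo {
    static String divide(int a, int b) {
        try {
            return String.valueOf(a / b);          // integer division by zero throws
        } catch (ArithmeticException e) {
            return "ArithmeticException";
        }
    }

    static String lengthOf(String s) {
        try {
            return String.valueOf(s.length());     // calling a method on null throws
        } catch (NullPointerException e) {
            return "NullPointerException";
        }
    }

    static String element(int[] values, int index) {
        try {
            return String.valueOf(values[index]);  // index outside 0..length-1 throws
        } catch (ArrayIndexOutOfBoundsException e) {
            return "ArrayIndexOutOfBoundsException";
        }
    }

    public static void main(String[] args) {
        System.out.println(divide(1, 0));
        System.out.println(lengthOf(null));
        System.out.println(element(new int[10], 10)); // eleventh element of a ten-element array
    }
}
```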

Q.5 What is the difference between bound property and constraint property?

Ans.

Bound Property

Bound properties support the PropertyChangeListener interface. Sometimes when a Bean property changes, another object might need to be notified of the change, and react to the change. Whenever a bound property changes, notification of the change is sent to interested listeners.

The accessor methods for a bound property are defined in the same way as those for simple properties. However, you also need to provide the event listener registration methods for PropertyChangeListener objects and fire a PropertyChangeEvent to the PropertyChangeListener objects by calling their propertyChange methods.

The convenience class PropertyChangeSupport enables your bean to implement these methods. Your bean can inherit from the PropertyChangeSupport class, or use an instance of it as a member.

In order to listen for property changes, an object must be able to add and remove itself from the listener list on the bean containing the bound property. It must also be able to respond to the event notification method that signals a property change.

The PropertyChangeEvent class encapsulates property change information, and is sent from the property change event source to each object in the property change listener list with the propertyChange method.

Implementing Bound Property Support Within a Bean

To implement a bound property in your application, follow these steps:

1. Import the java.beans package. This gives you access to the PropertyChangeSupport class.
2. Instantiate a PropertyChangeSupport object. This object maintains the property change listener list and fires property change events. You can also make your class a PropertyChangeSupport subclass.
3. Implement methods to maintain the property change listener list. Since PropertyChangeSupport implements these methods, you merely wrap calls to the property-change support object's methods.
4. Modify a property's set method to fire a property change event when the property is changed.

Creating a Bound Property

To create the title property as a bound property for the MyBean component in the NetBeans GUI Builder, perform the following sequence of operations:

1. Right-click the Bean Patterns node in the MyBean class hierarchy.
2. Select Add|Property from the pop-up menu.
3. Fill the New Property Pattern form as shown on the following figure and click OK.

Fig. 6.6.1

Note that the title property and the multicast event source pattern PropertyChangeListener were added to the Bean Patterns structure.

You can also modify existing code generated in the previous lesson to convert the title and lines properties to the bound type as follows:

    import java.awt.Graphics;
    import java.beans.PropertyChangeListener;
    import java.beans.PropertyChangeSupport;
    import java.io.Serializable;
    import javax.swing.JComponent;

    /**
     * Bean with bound properties.
     */
    public class MyBean extends JComponent implements Serializable {
        private String title;
        private String[] lines = new String[10];

        // Maintains the listener list and fires the change events.
        private final PropertyChangeSupport pcs = new PropertyChangeSupport( this );

        public String getTitle() {
            return this.title;
        }

        public void setTitle( String title ) {
            String old = this.title;
            this.title = title;
            this.pcs.firePropertyChange( "title", old, title );
        }

        public String[] getLines() {
            return this.lines.clone();
        }

        public String getLines( int index ) {
            return this.lines[index];
        }

        public void setLines( String[] lines ) {
            String[] old = this.lines;
            this.lines = lines;
            this.pcs.firePropertyChange( "lines", old, lines );
        }

        public void setLines( int index, String line ) {
            String old = this.lines[index];
            this.lines[index] = line;
            // Report the single changed element, not the whole array.
            this.pcs.fireIndexedPropertyChange( "lines", index, old, line );
        }

        public void addPropertyChangeListener( PropertyChangeListener listener ) {
            this.pcs.addPropertyChangeListener( listener );
        }

        public void removePropertyChangeListener( PropertyChangeListener listener ) {
            this.pcs.removePropertyChangeListener( listener );
        }

        protected void paintComponent( Graphics g ) {
            g.setColor( getForeground() );
            int height = g.getFontMetrics().getHeight();
            paintString( g, this.title, height );
            if ( this.lines != null ) {
                int step = height;
                for ( String line : this.lines )
                    paintString( g, line, height += step );
            }
        }

        private void paintString( Graphics g, String str, int height ) {
            if ( str != null )
                g.drawString( str, 0, height );
        }
    }
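To see a bound property from the listener's side, the following sketch registers a listener and records the events it receives. TitleBean and BoundPropertyDemo are hypothetical names, and the bean is a cut-down stand-in for MyBean without the Swing parts. It relies on the documented behaviour of PropertyChangeSupport.firePropertyChange, which fires no event when the old and new values are equal and non-null.

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example: observing a bound property through a PropertyChangeListener.
class BoundPropertyDemo {
    // Minimal bean with a single bound property named "title".
    static class TitleBean {
        private String title;
        private final PropertyChangeSupport pcs = new PropertyChangeSupport(this);

        public String getTitle() { return title; }

        public void setTitle(String title) {
            String old = this.title;
            this.title = title;
            pcs.firePropertyChange("title", old, title);
        }

        public void addPropertyChangeListener(PropertyChangeListener listener) {
            pcs.addPropertyChangeListener(listener);
        }

        public void removePropertyChangeListener(PropertyChangeListener listener) {
            pcs.removePropertyChangeListener(listener);
        }
    }

    static List<String> run() {
        List<String> log = new ArrayList<>();
        TitleBean bean = new TitleBean();
        // The listener is notified after the property has already changed.
        bean.addPropertyChangeListener(evt ->
            log.add(evt.getPropertyName() + ": " + evt.getOldValue() + " -> " + evt.getNewValue()));
        bean.setTitle("Draft");
        bean.setTitle("Draft");   // same value again: no event is fired
        bean.setTitle("Final");
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```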

Constrained Property

A bean property is constrained if the bean supports the VetoableChangeListener and PropertyChangeEvent classes, and if the set method for this property throws a PropertyVetoException. Constrained properties are more complicated than bound properties because they also support property change listeners which happen to be vetoers.

The following operations in the setXXX method for the constrained property must be implemented in this order:

1. Save the old value in case the change is vetoed.
2. Notify listeners of the new proposed value, allowing them to veto the change.
3. If no listener vetoes the change (no exception is thrown), set the property to the new value.

The accessor methods for a constrained property are defined in the same way as those for simple properties, with the addition that the setXXX method throws a PropertyVetoException. The syntax is as follows:

    public void setPropertyName(PropertyType pt) throws PropertyVetoException {code}

Handling Vetoes

If a registered listener vetoes a proposed property change by throwing a PropertyVetoException, the source bean with the constrained property is responsible for the following actions:

- Catching exceptions.
- Reverting to the old value for the property.
- Issuing a new VetoableChangeListener.vetoableChange call to all listeners to report the reversion.

A VetoableChangeListener throws a PropertyVetoException to reject a change, and handles the PropertyChangeEvent fired by the bean with the constrained property.

The VetoableChangeSupport class provides the following operations:

- Keeping track of VetoableChangeListener objects.
- Issuing the vetoableChange method on all registered listeners.
- Catching any vetoes (exceptions) thrown by listeners.
- Informing all listeners of a veto by calling vetoableChange again, but with the old property value as the proposed "new" value.

Creating a Constrained Property

To create a constrained property, set the appropriate option in the New Property Pattern form as shown on the following figure.

Fig. 6.7.1 Note that the Multicast Source Event Pattern vetoableChangeListener was added to the Bean Patterns hierarchy.
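The veto protocol can be tried out with a small sketch before looking at the full bean. TitleBean, ConstrainedPropertyDemo and the rule "the title must not be empty" are assumptions made for illustration. The setter notifies the vetoable listeners first and assigns the new value only if none of them throws.

```java
import java.beans.PropertyVetoException;
import java.beans.VetoableChangeListener;
import java.beans.VetoableChangeSupport;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example: a listener vetoing a change to a constrained property.
class ConstrainedPropertyDemo {
    static class TitleBean {
        private String title = "untitled";
        private final VetoableChangeSupport vcs = new VetoableChangeSupport(this);

        public String getTitle() { return title; }

        public void setTitle(String title) throws PropertyVetoException {
            // Listeners see the proposed value first; a veto leaves this.title untouched.
            vcs.fireVetoableChange("title", this.title, title);
            this.title = title;
        }

        public void addVetoableChangeListener(VetoableChangeListener listener) {
            vcs.addVetoableChangeListener(listener);
        }
    }

    static List<String> run() {
        List<String> log = new ArrayList<>();
        TitleBean bean = new TitleBean();
        // Assumed rule for this sketch: reject null or empty titles.
        bean.addVetoableChangeListener(evt -> {
            Object proposed = evt.getNewValue();
            if (proposed == null || proposed.toString().isEmpty()) {
                throw new PropertyVetoException("title must not be empty", evt);
            }
        });
        for (String candidate : new String[] { "Report", "" }) {
            try {
                bean.setTitle(candidate);
                log.add("accepted: " + bean.getTitle());
            } catch (PropertyVetoException e) {
                log.add("vetoed, title is still: " + bean.getTitle());
            }
        }
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```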

You can also modify the existing code generated in the previous lesson to make the title and lines properties constrained as follows:

    import java.io.Serializable;
    import java.beans.PropertyChangeListener;
    import java.beans.PropertyChangeSupport;
    import java.beans.PropertyVetoException;
    import java.beans.VetoableChangeListener;
    import java.beans.VetoableChangeSupport;
    import java.awt.Graphics;
    import javax.swing.JComponent;

    /**
     * Bean with constrained properties.
     */
    public class MyBean extends JComponent implements Serializable {
        private String title;
        private String[] lines = new String[10];

        private final PropertyChangeSupport pcs = new PropertyChangeSupport( this );
        private final VetoableChangeSupport vcs = new VetoableChangeSupport( this );

        public String getTitle() {
            return this.title;
        }

        /**
         * This method was modified to throw the PropertyVetoException
         * if some vetoable listeners reject the new title value
         */
        public void setTitle( String title ) throws PropertyVetoException {
            String old = this.title;
            this.vcs.fireVetoableChange( "title", old, title );
            this.title = title;
            this.pcs.firePropertyChange( "title", old, title );
        }

        public String[] getLines() {
            return this.lines.clone();
        }

        public String getLines( int index ) {
            return this.lines[index];
        }

        /**
         * This method throws the PropertyVetoException
         * if some vetoable listeners reject the new lines value
         */
        public void setLines( String[] lines ) throws PropertyVetoException {
            String[] old = this.lines;
            this.vcs.fireVetoableChange( "lines", old, lines );
            this.lines = lines;
            this.pcs.firePropertyChange( "lines", old, lines );
        }

        public void setLines( int index, String line ) throws PropertyVetoException {
            String old = this.lines[index];
            this.vcs.fireVetoableChange( "lines", old, line );
            this.lines[index] = line;
            this.pcs.fireIndexedPropertyChange( "lines", index, old, line );
        }

        public void addPropertyChangeListener( PropertyChangeListener listener ) {
            this.pcs.addPropertyChangeListener( listener );
        }

        public void removePropertyChangeListener( PropertyChangeListener listener ) {
            this.pcs.removePropertyChangeListener( listener );
        }

        /**
         * Registration of the VetoableChangeListener
         */
        public void addVetoableChangeListener( VetoableChangeListener listener ) {
            this.vcs.addVetoableChangeListener( listener );
        }

        public void removeVetoableChangeListener( VetoableChangeListener listener ) {
            this.vcs.removeVetoableChangeListener( listener );
        }

        protected void paintComponent( Graphics g ) {
            g.setColor( getForeground() );
            int height = g.getFontMetrics().getHeight();
            paintString( g, this.title, height );
            if ( this.lines != null ) {
                int step = height;
                for ( String line : this.lines )
                    paintString( g, line, height += step );
            }
        }

        private void paintString( Graphics g, String str, int height ) {
            if ( str != null )
                g.drawString( str, 0, height );
        }
    }

Q-6. Define RMI. Define the architecture of RMI invocation.

Ans. RMI applications often comprise two separate programs, a server and a client. A typical server program creates some remote objects, makes references to these objects accessible, and waits for clients to invoke methods on these objects. A typical client program obtains a remote reference to one or more remote objects on a server and then invokes methods on them. RMI provides the mechanism by which the server and the client communicate and pass information back and forth. Such an application is sometimes referred to as a distributed object application.

Designing a Remote Interface

At the core of the compute engine is a protocol that enables tasks to be submitted to the compute engine, the compute engine to run those tasks, and the results of those tasks to be returned to the client. This protocol is expressed in the interfaces that are supported by the compute engine. The remote communication for this protocol is illustrated in the following figure.

Fig. 2.2.1.1

Each interface contains a single method. The compute engine's remote interface, Compute, enables tasks to be submitted to the engine. The client interface, Task, defines how the compute engine executes a submitted task.

The compute.Compute interface defines the remotely accessible part, the compute engine itself. Here is the source code for the Compute interface:

    package compute;

    import java.rmi.Remote;
    import java.rmi.RemoteException;

    public interface Compute extends Remote {
        <T> T executeTask(Task<T> t) throws RemoteException;
    }

By extending the interface java.rmi.Remote, the Compute interface identifies itself as an interface whose methods can be invoked from another Java virtual machine. Any object that implements this interface can be a remote object.

As a member of a remote interface, the executeTask method is a remote method. Therefore, this method must be defined as being capable of throwing a java.rmi.RemoteException. This exception is thrown by the RMI system from a remote method invocation to indicate that either a communication failure or a protocol error has occurred. A RemoteException is a checked exception, so any code invoking a remote method needs to handle this exception by either catching it or declaring it in its throws clause.

The second interface needed for the compute engine is the Task interface, which is the type of the parameter to the executeTask method in the Compute interface. The compute.Task interface defines the interface between the compute engine and the work that it needs to do, providing the way to start the work. Here is the source code for the Task interface:

    package compute;

    public interface Task<T> {
        T execute();
    }

The Task interface defines a single method, execute, which has no parameters and throws no exceptions. Because the interface does not extend Remote, the method in this interface doesn't need to list java.rmi.RemoteException in its throws clause.

The Task interface has a type parameter, T, which represents the result type of the task's computation. This interface's execute method returns the result of the computation and thus its return type is T.
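As an illustration of the Task interface, the sketch below defines a hypothetical task that sums the integers from 1 to n. The Task interface is redeclared locally so that the sketch is self-contained, and the executeTask method here is a plain local stand-in for Compute.executeTask with no remote call involved. SumTask also implements java.io.Serializable, since RMI passes task objects to the server by value.

```java
import java.io.Serializable;

// Hypothetical example: a concrete Task and a local stand-in for executeTask.
class TaskSketch {
    // Same shape as compute.Task in the text, redeclared for a self-contained sketch.
    interface Task<T> {
        T execute();
    }

    // Illustrative task: computes 1 + 2 + ... + n. The result type T is Long.
    static class SumTask implements Task<Long>, Serializable {
        private static final long serialVersionUID = 1L;
        private final int n;

        SumTask(int n) {
            this.n = n;
        }

        @Override
        public Long execute() {
            long total = 0;
            for (int i = 1; i <= n; i++) {
                total += i;
            }
            return total;
        }
    }

    // Local stand-in: the return type follows the type parameter of the task passed in.
    static <T> T executeTask(Task<T> t) {
        return t.execute();
    }

    public static void main(String[] args) {
        Long result = executeTask(new SumTask(100));
        System.out.println("Sum of 1..100 = " + result);
    }
}
```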

The Compute interface's executeTask method, in turn, returns the result of the execution of the Task instance passed to it. Thus, the executeTask method has its own type parameter, T, that associates its own return type with the result type of the passed Task instance.

RMI uses the Java object serialization mechanism to transport objects by value between Java virtual machines. For an object to be considered serializable, its class must implement the java.io.Serializable marker interface. Therefore, classes that implement the Task interface must also implement Serializable, as must the classes of objects used for task results.

Different kinds of tasks can be run by a Compute object as long as they are implementations of the Task type. The classes that implement this interface can contain any data needed for the computation of the task and any other methods needed for the computation.

Here is how RMI makes this simple compute engine possible. Because RMI can assume that the Task objects are written in the Java programming language, implementations of the Task object that were previously unknown to the compute engine are downloaded by RMI into the compute engine's Java virtual machine as needed. This capability enables clients of the compute engine to define new kinds of tasks to be run on the server machine without needing the code to be explicitly installed on that machine.

The compute engine, implemented by the ComputeEngine class, implements the Compute interface, enabling different tasks to be submitted to it by calls to its executeTask method. These tasks are run using the task's implementation of the execute method, and the results are returned to the remote client.

Implementing a Remote Interface

This section discusses the task of implementing a class for the compute engine. In general, a class that implements a remote interface should at least do the following:

- Declare the remote interfaces being implemented
- Define the constructor for each remote object
- Provide an implementation for each remote method in the remote interfaces

An RMI server program needs to create the initial remote objects and export them to the RMI runtime, which makes them available to receive incoming remote invocations. This setup procedure can be either encapsulated in a method of the remote object implementation class itself or included in another class entirely. The setup procedure should do the following:

- Create and install a security manager
- Create and export one or more remote objects
- Register at least one remote object with the RMI registry (or with another naming service, such as a service accessible through the Java Naming and Directory Interface) for bootstrapping purposes

The complete implementation of the compute engine follows. The engine.ComputeEngine class implements the remote interface Compute and also includes the main method for setting up the compute engine. Here is the source code for the ComputeEngine class:

    package engine;

    import java.rmi.RemoteException;
    import java.rmi.registry.LocateRegistry;
    import java.rmi.registry.Registry;
    import java.rmi.server.UnicastRemoteObject;
    import compute.Compute;
    import compute.Task;

    public class ComputeEngine implements Compute {

        public ComputeEngine() {
            super();
        }

        public <T> T executeTask(Task<T> t) {
            return t.execute();
        }

        public static void main(String[] args) {
            if (System.getSecurityManager() == null) {
                System.setSecurityManager(new SecurityManager());
            }
            try {
                String name = "Compute";
                Compute engine = new ComputeEngine();
                Compute stub =
                    (Compute) UnicastRemoteObject.exportObject(engine, 0);
                Registry registry = LocateRegistry.getRegistry();
                registry.rebind(name, stub);
                System.out.println("ComputeEngine bound");
            } catch (Exception e) {
                System.err.println("ComputeEngine exception:");
                e.printStackTrace();
            }
        }
    }

The following sections discuss each component of the compute engine implementation.

Declaring the Remote Interfaces Being Implemented

The implementation class for the compute engine is declared as follows:

    public class ComputeEngine implements Compute

This declaration states that the class implements the Compute remote interface and therefore can be used for a remote object.

The ComputeEngine class defines a remote object implementation class that implements a single remote interface and no other interfaces. The ComputeEngine class also contains two executable program elements that can only be invoked locally. The first of these elements is a constructor for ComputeEngine instances. The second of these elements is a main method that is used to create a ComputeEngine instance and make it available to clients.

Defining the Constructor for the Remote Object

The ComputeEngine class has a single constructor that takes no arguments. The code for the constructor is as follows:

    public ComputeEngine() {
        super();
    }

This constructor just invokes the superclass constructor, which is the no-argument constructor of the Object class. Although the superclass constructor gets invoked even if omitted from the ComputeEngine constructor, it is included for clarity.

Providing Implementations for Each Remote Method

The class for a remote object provides implementations for each remote method specified in the remote interfaces. The Compute interface contains a single remote method, executeTask, which is implemented as follows:

    public <T> T executeTask(Task<T> t) {
        return t.execute();
    }

This method implements the protocol between the ComputeEngine remote object and its clients. Each client provides the ComputeEngine with a Task object that has a particular implementation of the Task interface's execute method. The ComputeEngine executes each client's task and returns the result of the task's execute method directly to the client.

Passing Objects in RMI

Arguments to or return values from remote methods can be of almost any type, including local objects, remote objects, and primitive data types. More precisely, any entity of any type can be passed to or from a remote method as long as the entity is an instance of a type that is a primitive data type, a remote object, or a serializable object, which means that it implements the interface java.io.Serializable.

Some object types do not meet any of these criteria and thus cannot be passed to or returned from a remote method. Most of these objects, such as threads or file descriptors, encapsulate information that makes sense only within a single address space. Many of the core classes, including the classes in the packages java.lang and java.util, implement the Serializable interface.

The rules governing how arguments and return values are passed are as follows:

- Remote objects are essentially passed by reference. A remote object reference is a stub, which is a client-side proxy that implements the complete set of remote interfaces that the remote object implements.
- Local objects are passed by copy, using object serialization. By default, all fields are copied except fields that are marked static or transient. Default serialization behavior can be overridden on a class-by-class basis.

Passing a remote object by reference means that any changes made to the state of the object by remote method invocations are reflected in the original remote object. When a remote object is passed, only those interfaces that are remote interfaces are available to the receiver. Any methods defined in the implementation class or defined in non-remote interfaces implemented by the class are not available to that receiver.

For example, if you were to pass a reference to an instance of the ComputeEngine class, the receiver would have access only to the compute engine's executeTask method. That receiver would not see the ComputeEngine constructor, its main method, or its implementation of any methods of java.lang.Object.

In the parameters and return values of remote method invocations, objects that are not remote objects are passed by value. Thus, a copy of the object is created in the receiving Java virtual machine. Any changes to the object's state by the receiver are reflected only in the receiver's copy, not in the sender's original instance. Any changes to the object's state by the sender are reflected only in the sender's original instance, not in the receiver's copy.

Implementing the Server's main Method

The most complex method of the ComputeEngine implementation is the main method. The main method is used to start the ComputeEngine and therefore needs to do the necessary initialization and housekeeping to prepare the server to accept calls from clients. This method is not a remote method, which means that it cannot be invoked from a different Java virtual machine. Because the main method is declared static, the method is not associated with an object at all but rather with the class ComputeEngine.

Creating and Installing a Security Manager

The main method's first task is to create and install a security manager, which protects access to system resources from untrusted downloaded code running within the Java virtual machine. A security manager determines whether downloaded code has access to the local file system or can perform any other privileged operations.

If an RMI program does not install a security manager, RMI will not download classes (other than from the local class path) for objects received as arguments or return values of remote method invocations. This restriction ensures that the operations performed by downloaded code are subject to a security policy.

Here's the code that creates and installs a security manager:

    if (System.getSecurityManager() == null) {
        System.setSecurityManager(new SecurityManager());
    }

Making the Remote Object Available to Clients

Next, the main method creates an instance of ComputeEngine and exports it to the RMI runtime with the following statements:

    Compute engine = new ComputeEngine();

Compute stub = (Compute) UnicastRemoteObject.exportObject(engine, 0); The static UnicastRemoteObject.exportObject method exports the supplied remote object so that it can receive invocations of its remote methods from remote clients. The second argument, an int, specifies which TCP port to use to listen for incoming remote invocation requests for the object. It is common to use the value zero, which specifies the use of an anonymous port. The actual port will then be chosen at runtime by RMI or the underlying operating system. However, a non-zero value can also be used to specify a specific port to use for listening. Once the exportObject invocation has returned successfully, the ComputeEngine remote object is ready to process incoming remote invocations. The exportObject method returns a stub for the exported remote object. Note that the type of the variable stub must be Compute, not ComputeEngine, because the stub for a remote object only implements the remote interfaces that the exported remote object implements. The exportObject method declares that it can throw a RemoteException, which is a checked exception type. The main method handles this exception with its try/catch block. If the exception were not handled in this way, RemoteException would have to be declared in the throws clause of the main method. An attempt to export a remote object can throw a RemoteException if the necessary communication resources are not available, such as if the requested port is bound for some other purpose. Before a client can invoke a method on a remote object, it must first obtain a reference to the remote object. Obtaining a reference can be done in the same way that any other object reference is obtained in a program, such as by getting the reference as part of the return value of a method or as part of a data structure that contains such a reference.
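The export step described above can be tried in isolation. The following is a minimal sketch, not the ComputeEngine code itself: the Greeter interface, the GreeterImpl class, and the method names are hypothetical and exist only for illustration. It exports an object on an anonymous port (port 0), checks that the returned stub implements the remote interface but is not an instance of the implementation class, and then unexports the object so the JVM can exit.

```java
import java.io.UncheckedIOException;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

public class ExportSketch {
    /** Hypothetical remote interface: only its methods are visible through a stub. */
    public interface Greeter extends Remote {
        String greet(String name) throws RemoteException;
    }

    /** Hypothetical implementation class; instances never leave this JVM. */
    public static class GreeterImpl implements Greeter {
        @Override
        public String greet(String name) {
            return "Hello, " + name;
        }

        // Not part of any remote interface, so a stub does not expose it.
        public String localOnly() {
            return "not reachable through the stub";
        }
    }

    /** Exports an object on an anonymous port and inspects the returned stub. */
    public static boolean stubExposesOnlyRemoteInterface() {
        GreeterImpl impl = new GreeterImpl();
        try {
            Remote stub = UnicastRemoteObject.exportObject(impl, 0);
            try {
                // The stub implements Greeter but is not a GreeterImpl.
                return (stub instanceof Greeter) && !(stub instanceof GreeterImpl);
            } finally {
                // Unexport so that the RMI runtime does not keep the JVM alive.
                UnicastRemoteObject.unexportObject(impl, true);
            }
        } catch (RemoteException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(stubExposesOnlyRemoteInterface());
    }
}
```

This is why the variable that holds the stub has to be declared with the remote interface type: the stub object simply has no relationship to the implementation class.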

The system provides a particular type of remote object, the RMI registry, for finding references to other remote objects. The RMI registry is a simple remote object naming service that enables clients to obtain a reference to a remote object by name. The registry is typically only used to locate the first remote object that an RMI client needs to use. That first remote object might then provide support for finding other objects. The java.rmi.registry.Registry remote interface is the API for binding (or registering) and looking up remote objects in the registry. The java.rmi.registry.LocateRegistry class provides static methods for synthesizing a remote reference to a registry at a particular network address (host and port). These methods create the remote reference object containing the specified network address without performing any remote communication. LocateRegistry also provides static methods for creating a new registry in the current Java virtual machine, although this example does not use those methods. Once a remote object is registered with an RMI registry on the local host, clients on any host can look up the remote object by name, obtain its reference, and then invoke remote methods on the object. The registry can be shared by all servers running on a host, or an individual server process can create and use its own registry. The ComputeEngine class creates a name for the object with the following statement: String name = "Compute"; The code then adds the name to the RMI registry running on the server. This step is done later with the following statements: Registry registry = LocateRegistry.getRegistry(); registry.rebind(name, stub);

This rebind invocation makes a remote call to the RMI registry on the local host. Like any remote call, this call can result in a RemoteException being thrown, which is handled by the catch block at the end of the main method. Note the following about the Registry.rebind invocation:

The no-argument overload of LocateRegistry.getRegistry synthesizes a reference to a registry on the local host and on the default registry port, 1099. You must use an overload that has an int parameter if the registry is created on a port other than 1099.

When a remote invocation on the registry is made, a stub for the remote object is passed instead of a copy of the remote object itself. Remote implementation objects, such as instances of ComputeEngine, never leave the Java virtual machine in which they were created. Thus, when a client performs a lookup in a server's remote object registry, a copy of the stub is returned. Remote objects in such cases are thus effectively passed by (remote) reference rather than by value.

For security reasons, an application can only bind, unbind, or rebind remote object references with a registry running on the same host. This restriction prevents a remote client from removing or overwriting any of the entries in a server's registry. A lookup, however, can be requested from any host, local or remote.

Once the server has registered with the local RMI registry, it prints a message indicating that it is ready to start handling calls. Then, the main method completes. It is not necessary to have a thread wait to keep the server alive. As long as there is a reference to the ComputeEngine object in another Java virtual machine, local or remote, the ComputeEngine object will not be shut down or garbage collected. Because the program binds a reference to the ComputeEngine in the registry, it is reachable from a remote client, the registry itself. The RMI system keeps the ComputeEngine's process

running. The ComputeEngine is available to accept calls and won't be reclaimed until its binding is removed from the registry and no remote clients hold a remote reference to the ComputeEngine object.

The final piece of code in the ComputeEngine.main method handles any exception that might arise. The only checked exception type that could be thrown in the code is RemoteException, either by the UnicastRemoteObject.exportObject invocation or by the registry rebind invocation. In either case, the program cannot do much more than exit after printing an error message. In some distributed applications, recovering from the failure to make a remote invocation is possible. For example, the application could attempt to retry the operation or choose another server to continue the operation.

Q. 7. Define the following terms:

A) Socket

A socket is one endpoint of a two-way communication link between two programs running on the network. A socket is bound to a port number so that the TCP layer can identify the application to which the data is destined to be sent. An endpoint is a combination of an IP address and a port number. Every TCP connection can be uniquely identified by its two endpoints. That way you can have multiple connections between your host and the server.

The java.net package in the Java platform provides a class, Socket, that implements one side of a two-way connection between your Java program and another program on the network. The Socket class sits on top of a platform-dependent implementation, hiding the details of any particular system from your Java program. By using the

java.net.Socket class instead of relying on native code, your Java programs can communicate over the network in a platform-independent fashion. Additionally, java.net includes the ServerSocket class, which implements a socket that servers can use to listen for and accept connections from clients. This lesson shows you how to use the Socket and ServerSocket classes.

If you are trying to connect to the Web, the URL class and related classes (URLConnection, URLEncoder) are probably more appropriate than the socket classes. In fact, URLs are a relatively high-level connection to the Web and use sockets as part of the underlying implementation. See Working with URLs for information about connecting to the Web via URLs.

B) Port

Generally speaking, a computer has a single physical connection to the network. All data destined for a particular computer arrives through that connection. However, the data may be intended for different applications running on the computer. So how does the computer know to which application to forward the data? Through the use of ports.

Data transmitted over the Internet is accompanied by addressing information that identifies the computer and the port for which it is destined. The computer is identified by its IP address (32 bits in IPv4), which IP uses to deliver data to the right computer on the network. Ports are identified by a 16-bit number, which TCP and UDP use to deliver the data to the right application.

In connection-based communication such as TCP, a server application binds a socket to a specific port number. This has the effect of registering the server with the system to receive all data destined for that port. A client can then rendezvous with the server at the server's port, as illustrated here:

Fig. 3.1.1

Definition: The TCP and UDP protocols use ports to map incoming data to a particular process running on a computer.

In datagram-based communication such as UDP, the datagram packet contains the port number of its destination and UDP routes the packet to the appropriate application, as illustrated in this figure:

Fig. 3.1.2

Port numbers range from 0 to 65,535 because ports are represented by 16-bit numbers. The port numbers ranging from 0 to 1023 are restricted; they are reserved for use by well-known services such as HTTP and FTP and other system services. These ports are called well-known ports. Your applications should not attempt to bind to them.

C) Datagram

Clients and servers that communicate via a reliable channel, such as a TCP socket, have a dedicated point-to-point channel between themselves, or at least the illusion of one. To communicate, they establish a connection, transmit the data, and then close the connection. All data sent over the channel is received in the same order in which it was sent. This is guaranteed by the channel.

In contrast, applications that communicate via datagrams send and receive completely independent packets of information. These clients and servers do not have and do not need a dedicated point-to-point channel. The delivery of datagrams to their destinations is not guaranteed. Nor is the order of their arrival.

Definition: A datagram is an independent, self-contained message sent over the network whose arrival, arrival time, and content are not guaranteed.

The java.net package contains three classes to help you write Java programs that use datagrams to send and receive packets over the network: DatagramSocket, DatagramPacket, and MulticastSocket. An application can send and receive DatagramPackets through a DatagramSocket. In addition, DatagramPackets can be broadcast to multiple recipients all listening to a MulticastSocket.
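As a minimal sketch of the DatagramSocket and DatagramPacket classes described above, the following program sends one datagram to itself over the loopback interface. The class and method names are hypothetical, and the two-second timeout is an arbitrary choice so that a lost packet cannot block the program forever. No connection is established: the destination address and port travel inside each packet.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

public class DatagramSketch {
    /** Sends one UDP datagram over loopback and returns the text that was received. */
    public static String sendAndReceive(String message) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the operating system for any free port.
        try (DatagramSocket receiver = new DatagramSocket(0, loopback);
             DatagramSocket sender = new DatagramSocket()) {
            receiver.setSoTimeout(2000); // give up after two seconds

            // Each packet carries its own destination address and port.
            byte[] out = message.getBytes(StandardCharsets.UTF_8);
            sender.send(new DatagramPacket(out, out.length, loopback, receiver.getLocalPort()));

            byte[] buffer = new byte[1024];
            DatagramPacket in = new DatagramPacket(buffer, buffer.length);
            receiver.receive(in); // blocks until a packet arrives or the timeout expires
            return new String(in.getData(), 0, in.getLength(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sendAndReceive("hello, datagram"));
    }
}
```

Over a real network the receive call may time out or packets may arrive out of order, which is exactly the lack of guarantees that the definition above describes.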

1. Write a complete program for each of the following: Decision Making Statements Ans: If-else statement

Master of Computer Application (MCA) Java Programming Assignment Set 2

The if statement is Java's conditional branch statement. It can be used to route program execution through two different paths. Here is the general form of the if statement:

    if (condition) statement1;
    else statement2;

Here, each statement may be a single statement or a compound statement enclosed in curly braces (that is, a block). The condition is any expression that returns a boolean value. The else clause is optional.

The if works like this: If the condition is true, then statement1 is executed. Otherwise, statement2 (if it exists) is executed. In no case will both statements be executed.

Most often, the expression used to control the if will involve the relational operators. However, this is not technically necessary. It is possible to control the if using a single boolean variable, as shown in this code fragment:

    boolean dataAvailable;
    // ...
    if (dataAvailable)
        ProcessData();
    else
        waitForMoreData();

Remember, only one statement can appear directly after the if or the else. If you want to include more statements, you'll need to create a block, as in this fragment:

    int bytesAvailable;
    // ...
    if (bytesAvailable > 0) {
        ProcessData();
        bytesAvailable -= n;
    } else
        waitForMoreData();

Here, both statements within the if block will execute if bytesAvailable is greater than zero. Some programmers find it convenient to include the curly braces when using the if, even when there is only one statement in each clause. This makes it easy to add another statement at a later date, and you don't have to worry about forgetting the braces. In fact, forgetting to define a block when one is needed is a common cause of errors. For example, consider the following code fragment:

    int bytesAvailable;
    // ...
    if (bytesAvailable > 0) {
        ProcessData();
        bytesAvailable -= n;
    } else
        waitForMoreData();
        bytesAvailable = n;

It seems clear that the statement bytesAvailable = n; was intended to be executed inside the else clause, because of the indentation level. However, as you recall, whitespace is insignificant to Java, and there is no way for the compiler to know what was intended. This code will compile without complaint, but it will behave incorrectly when run. The preceding example is fixed in the code that follows:

    int bytesAvailable;
    // ...
    if (bytesAvailable > 0) {
        ProcessData();
        bytesAvailable -= n;
    } else {
        waitForMoreData();
        bytesAvailable = n;
    }

The if-else-if Ladder

A common programming construct that is based upon a sequence of nested ifs is the if-else-if ladder. It looks like this:

    if(condition)
        statement;
    else if(condition)
        statement;
    else if(condition)
        statement;
    ...
    else
        statement;

The if statements are executed from the top down. As soon as one of the conditions controlling the if is true, the statement associated with that if is executed, and the rest of the ladder is bypassed. If none of the conditions is true, then the final else statement will be executed. The final else acts as a default condition; that is, if all other conditional tests fail, then the last else statement is performed. If there is no final else and all other conditions are false, then no action will take place.

Here is a program that uses an if-else-if ladder to determine which season a particular month is in.

    // Demonstrate if-else-if statements.
    class IfElse {
        public static void main(String args[]) {
            int month = 4; // April
            String season;
            if(month == 12 || month == 1 || month == 2)
                season = "Winter";
            else if(month == 3 || month == 4 || month == 5)
                season = "Spring";
            else if(month == 6 || month == 7 || month == 8)
                season = "Summer";
            else if(month == 9 || month == 10 || month == 11)
                season = "Autumn";
            else
                season = "Bogus Month";
            System.out.println("April is in the " + season + ".");
        }
    }

Here is the output produced by the program:

    April is in the Spring.

You might want to experiment with this program before moving on. As you will find, no matter what

value you give month, one and only one assignment statement within the ladder will be executed.

Switch Statement

The switch statement is Java's multiway branch statement. It provides an easy way to dispatch execution to different parts of your code based on the value of an expression. As such, it often provides a better alternative than a large series of if-else-if statements. Here is the general form of a switch statement:

    switch (expression) {
        case value1:
            // statement sequence
            break;
        case value2:
            // statement sequence
            break;
        ...
        case valueN:
            // statement sequence
            break;
        default:
            // default statement sequence
    }

The expression must be of type byte, short, int, or char (later versions of Java also accept enum types and, from Java 7, String); each of the values specified in the case statements must be of a type compatible with the expression. Each case value must be a unique literal (that is, it must be a constant, not a variable). Duplicate case values are not allowed.

The switch statement works like this: The value of the expression is compared with each of the literal values in the case statements. If a match is found, the code sequence following that case statement is executed. If none of the constants matches the value of the expression, then the default statement is executed. However, the default statement is optional. If no case matches and no default is present, then no further action is taken.

The break statement is used inside the switch to terminate a statement sequence. When a break statement is encountered, execution branches to the first line of code that follows the entire switch statement. This has the effect of "jumping out" of the switch.
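A minimal sketch of the basic switch form follows. The class name, method name, and message strings are hypothetical; every case ends with break, so exactly one branch runs for each value.

```java
public class SwitchSketch {
    /** Maps a small integer to a description using a switch with break in every case. */
    public static String describe(int i) {
        String text;
        switch (i) {
            case 0:
                text = "i is zero";
                break;
            case 1:
                text = "i is one";
                break;
            case 2:
                text = "i is two";
                break;
            default:
                // Reached when no case constant matches.
                text = "i is something else";
        }
        return text;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 4; i++) {
            System.out.println(describe(i));
        }
    }
}
```

For i equal to 3 none of the case constants match, so the default branch supplies the result.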

The break statement is optional. If you omit the break, execution will continue on into the next case. It is sometimes desirable to have multiple cases without break statements between them. For example, consider the following program:

    // In a switch, break statements are optional.
    class MissingBreak {
        public static void main(String args[]) {
            for(int i=0; i<12; i++)
                switch(i) {
                    case 0:
                    case 1:
                    case 2:
                    case 3:
                    case 4:
                        System.out.println("i is less than 5");
                        break;
                    case 5:
                    case 6:
                    case 7:
                    case 8:
                    case 9:
                        System.out.println("i is less than 10");
                        break;
                    default:
                        System.out.println("i is 10 or more");
                }
        }
    }

This program generates the following output:

    i is less than 5
    i is less than 5
    i is less than 5
    i is less than 5
    i is less than 5
    i is less than 10
    i is less than 10
    i is less than 10
    i is less than 10
    i is less than 10
    i is 10 or more
    i is 10 or more

Nested switch Statements

You can use a switch as part of the statement sequence of an outer switch. This is called a nested switch. Since a switch statement defines its own block, no conflicts arise between the case constants in the inner switch and those in the outer switch. For example, the following fragment is perfectly valid:

    switch(count) {
        case 1:
            switch(target) { // nested switch
                case 0:
                    System.out.println("target is zero");
                    break;
                case 1: // no conflicts with outer switch
                    System.out.println("target is one");
                    break;
            }
            break;
        case 2: // ...

Here, the case 1: statement in the inner switch does not conflict with the case 1: statement in the outer switch. The count variable is only compared with the list of cases at the outer level. If count is 1, then target is compared with the inner list of cases.

In summary, there are three important features of the switch statement to note:

The switch differs from the if in that switch can only test for equality, whereas if can evaluate any type of Boolean expression. That is, the switch looks only for a match between the value of the expression and one of its case constants.

No two case constants in the same switch can have identical values. Of course, a switch statement enclosed by an outer switch can have case constants in common.

A switch statement is usually more efficient than a set of nested ifs.

The last point is particularly interesting because it gives insight into how the Java compiler works. When it compiles a switch statement, the Java compiler will inspect each of the case constants and create a "jump table" that it will use for selecting the path of execution depending on the value of the expression. Therefore, if you need to select among a large group of values, a switch statement will usually run faster than the equivalent logic coded using a sequence of if-elses. The compiler can do this

because it knows that the case constants are all the same type and simply must be compared for equality with the switch expression. The compiler has no such knowledge of a long list of if expressions.

Looping Statements

Ans: for Loop

The usage of the for loop is as follows:

    for (initial statement; termination condition; increment instruction)
        statement;

When multiple statements are to be included in the for loop, the statements are enclosed in curly braces.

    for (initial statement; termination condition; increment instruction) {
        statement1;
        statement2;
    }

[Example and output omitted: a program that prints the numbers from 1 to 10.]

Like most other programming languages, Java allows loops to be nested. That is, one loop may be inside another. For example, here is a program that nests for loops:

    // Loops may be nested.
    class Nested {
        public static void main(String args[]) {
            int i, j;
            for(i=0; i<10; i++) {
                for(j=i; j<10; j++)
                    System.out.print(".");
                System.out.println();
            }
        }
    }

The output produced by this program is shown here:

    ..........
    .........
    ........
    .......
    ......
    .....
    ....
    ...
    ..
    .

While Statement

The while loop is Java's most fundamental looping statement. It repeats a statement or block while its controlling expression is true. Here is its general form:

    while (condition) {

        // body of loop
    }

The condition can be any Boolean expression. The body of the loop will be executed as long as the conditional expression is true. When condition becomes false, control passes to the next line of code immediately following the loop. The curly braces are unnecessary if only a single statement is being repeated.

do-while statement

As you just saw, if the conditional expression controlling a while loop is initially false, then the body of the loop will not be executed at all. However, sometimes it is desirable to execute the body of a while loop at least once, even if the conditional expression is false to begin with. In other words, there are times when you would like to test the termination expression at the end of the loop rather than at the beginning. Fortunately, Java supplies a loop that does just that: the do-while. The do-while loop always executes its body at least once, because its conditional expression is at the bottom of the loop. Its general form is

    do {
        // body of loop
    } while (condition);

Each iteration of the do-while loop first executes the body of the loop and then evaluates the conditional expression. If this expression is true, the loop will repeat. Otherwise, the loop terminates. As with all of Java's loops, condition must be a boolean expression.

The do-while loop is especially useful when you process a menu selection, because you will usually want the body of a menu loop to execute at least once. Consider the following program, which implements a very simple help system for Java's selection and iteration statements:

    // Using a do-while to process a menu selection
    class Menu {
      public static void main(String args[])

        throws java.io.IOException {
        char choice;
        do {
          System.out.println("Help on:");
          System.out.println(" 1. if");
          System.out.println(" 2. switch");
          System.out.println(" 3. while");
          System.out.println(" 4. do-while");
          System.out.println(" 5. for\n");
          System.out.println("Choose one:");
          choice = (char) System.in.read();
        } while( choice < '1' || choice > '5'); // compare against character literals
        System.out.println("\n");
        switch(choice) {
          case '1':
            System.out.println("The if:\n");
            System.out.println("if(condition) statement;");
            System.out.println("else statement;");
            break;
          case '2':
            System.out.println("The switch:\n");
            System.out.println("switch(expression) {");
            System.out.println(" case constant:");
            System.out.println(" statement sequence");
            System.out.println(" break;");
            System.out.println(" // ...");
            System.out.println("}");
            break;
          case '3':
            System.out.println("The while:\n");
            System.out.println("while(condition) statement;");
            break;
          case '4':
            System.out.println("The do-while:\n");
            System.out.println("do {");
            System.out.println(" statement;");
            System.out.println("} while (condition);");
            break;
          case '5':
            System.out.println("The for:\n");
            System.out.print("for(init; condition; iteration)");
            System.out.println(" statement;");
            break;
        }
      }
    }

Here is a sample run produced by this program:

    Help on:
     1. if
     2. switch
     3. while
     4. do-while
     5. for

    Choose one:
    4

    The do-while:

    do {
     statement;
    } while (condition);

In the program, the do-while loop is used to verify that the user has entered a valid choice. If not, then the user is reprompted. Since the menu must be displayed at least once, the do-while is the perfect loop to accomplish this.

A few other points about this example: Notice that characters are read from the keyboard by calling System.in.read(). This is one of Java's console input functions. Although Java's console I/O methods are not discussed in detail here, System.in.read() is used to obtain the user's choice. It reads characters from standard input (returned as integers, which is why the return value was cast to char). By default, standard input is line buffered, so you must press ENTER before any characters that you type will be sent to your program. Java's console input is quite limited and awkward to work with. Further, most real-world Java programs and applets will be graphical and window-based. For these reasons, not much use of console input has been made in this book. However, it is useful in this context.

One other point: Because System.in.read() is being used, the program must specify the throws java.io.IOException clause. This line is necessary to handle input errors.

2. How do you implement inheritance in Java?

Ans: Inheritance is one of the cornerstones of object-oriented programming because it allows the creation of hierarchical classifications. Using inheritance, you can create a general class that defines traits common to a set of related items. This class can then be inherited by other, more specific classes, each adding those things that are unique to it. In the terminology of Java, a class that is inherited is

called a superclass. The class that does the inheriting is called a subclass. Therefore, a subclass is a specialized version of a superclass. It inherits all of the instance variables and methods defined by the superclass and adds its own, unique elements.

Implementing Inheritance in Java

The extends keyword is used to derive a class from a superclass, or in other words, extend the functionality of a superclass.

Syntax

    public class <subclass_name> extends <superclass_name>

Example

    public class confirmed extends ticket { }

Rules for Overriding Methods

The method name and the types and order of the parameters must be identical to those of the superclass method.

The return type of both methods must be the same (since Java 5, the overriding method may also return a subtype of the original return type).

The overriding method cannot be less accessible than the method it overrides. For example, if the method to override is declared as public in the superclass, you cannot override it with the private keyword in the subclass.

An overriding method cannot throw checked exceptions that are new or broader than those declared by the superclass method.

Example

    // Create a superclass.
    class A {
        int i, j;
        void showij() {
            System.out.println("i and j: " + i + " " + j);
        }
    }

    // Create a subclass by extending class A.
    class B extends A {
        int k;
        void showk() {
            System.out.println("k: " + k);
        }
        void sum() {
            System.out.println("i+j+k: " + (i+j+k));
        }
    }

    class SimpleInheritance {

        public static void main(String args[]) {
            A superOb = new A();
            B subOb = new B();

            // The superclass may be used by itself.
            superOb.i = 10;
            superOb.j = 20;
            System.out.println("Contents of superOb: ");
            superOb.showij();
            System.out.println();

            /* The subclass has access to all public members of its superclass. */
            subOb.i = 7;
            subOb.j = 8;
            subOb.k = 9;
            System.out.println("Contents of subOb: ");
            subOb.showij();
            subOb.showk();
            System.out.println();
            System.out.println("Sum of i, j and k in subOb:");
            subOb.sum();
        }
    }

The output from this program is shown here:

    Contents of superOb:
    i and j: 10 20

    Contents of subOb:
    i and j: 7 8
    k: 9

    Sum of i, j and k in subOb:
    i+j+k: 24

As you can see, the subclass B includes all of the members of its superclass, A. This is why subOb can access i and j and call showij(). Also, inside sum(), i and j can be referred to directly, as if they were part of B. Even though A is a superclass for B, it is also a completely independent, stand-alone class. Being a superclass for a subclass does not mean that the superclass cannot be used by itself. Further, a subclass can be a superclass for another subclass.

The general form of a class declaration that inherits a superclass is shown here:

    class subclass-name extends superclass-name {
        // body of class
    }
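The overriding rules listed earlier can be illustrated with a minimal sketch. The Ticket and ConfirmedTicket classes below are hypothetical, loosely modeled on the confirmed extends ticket example; the subclass keeps the method name and parameter list, and widens the access from protected to public, which the rules permit.

```java
public class OverrideSketch {
    /** Hypothetical superclass with a protected method. */
    public static class Ticket {
        protected String describe() {
            return "ticket";
        }
    }

    /** Hypothetical subclass: same signature, more accessible (public), reuses super. */
    public static class ConfirmedTicket extends Ticket {
        @Override
        public String describe() {
            return "confirmed " + super.describe();
        }
    }

    public static void main(String[] args) {
        Ticket t = new ConfirmedTicket();   // superclass reference, subclass object
        System.out.println(t.describe());   // the overriding version is the one that runs
    }
}
```

Declaring the overriding method private or package-private instead would be rejected by the compiler, because that would make it less accessible than the method it overrides.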

You can only specify one superclass for any subclass that you create. Java does not support the inheritance of multiple superclasses into a single subclass. (This differs from C++, in which you can inherit multiple base classes.) You can, as stated, create a hierarchy of inheritance in which a subclass becomes a superclass of another subclass. However, no class can be a superclass of itself.

3. Draw and explain the JDBC Application Architecture?

Ans: The JDBC API is a Java API that can access any kind of tabular data, especially data stored in a relational database. JDBC helps you to write Java applications that manage these three programming activities:

1. Connect to a data source, like a database
2. Send queries and update statements to the database
3. Retrieve and process the results received from the database in answer to your query

JDBC Architecture

The JDBC API supports both two-tier and three-tier processing models for database access.

Two-tier Architecture for Data Access

In the two-tier model, a Java application talks directly to the data source. This requires a JDBC driver that can communicate with the particular data source being accessed. A user's commands are delivered to the database or other data source, and the results of those statements are sent back to the user. The data source may be located on another machine to which the user is connected via a network. This is referred to as a client/server configuration, with the user's machine as the client, and the machine housing the data source as the server. The network can be an intranet, which, for example, connects employees within a corporation, or it can be the Internet.

In the three-tier model, commands are sent to a "middle tier" of services, which then sends the commands to the data source. The data source processes the commands and sends the results back to the middle tier, which then sends them to the user.
MIS directors find the three-tier model very attractive because the middle tier makes it possible to maintain control over access and the kinds of updates that can be made to corporate data. Another advantage is that it simplifies the deployment

of applications. Finally, in many cases, the three-tier architecture can provide performance advantages.

[Figure labels: Client Machine (Java Application, JDBC); DBMS Proprietary Protocol; Database Server (DBMS)]

Until recently, the middle tier has often been written in languages such as C or C++, which offer fast performance. However, with the introduction of optimizing compilers that translate Java bytecode into efficient machine-specific code and technologies such as Enterprise JavaBeans, the Java platform is fast becoming the standard platform for middle-tier development. This is a big plus, making it possible to take advantage of Java's robustness, multithreading, and security features.

With enterprises increasingly using the Java programming language for writing server code, the JDBC API is being used more and more in the middle tier of a three-tier architecture. Some of the features that make JDBC a server technology are its support for connection pooling, distributed transactions, and disconnected rowsets. The JDBC API is also what allows access to a data source from a Java middle tier.

[Figure: Three-tier Architecture for Data Access]

4. What is the difference between an interface and an abstract class?

Ans: An abstract class is a class that leaves one or more method implementations unspecified by declaring one or more methods abstract. An abstract method has no body (i.e., no implementation). A subclass is required to override the abstract method and provide an implementation. Hence, an abstract class is incomplete and cannot be instantiated, but can be used as a base class.

    abstract public class abstract-base-class-name {
        // an abstract class typically declares at least one abstract method
        public abstract return-type abstract-method-name ( formal-params );
        ... // other abstract methods, object methods, class methods
    }

    public class derived-class-name extends abstract-base-class-name {
        public return-type abstract-method-name (formal-params) { stmt-list; }

... // other method implementations } It would be an error to try to instantiate an object of an abstract type: abstract-class-name obj = new abstract-class-name(); // ERROR! That is, operator new is invalid when applied to an abstract class. An interface is a specification, or contract, for a set of methods that a class that implements the interface must conform to in terms of the type signature of the methods. The class that implements the interface provides an implementation for each method, just as with an abstract method in an abstract class. So, you can think of an interface as an abstract class with all abstract methods. The interface itself can have either public, package, private or protected access defined. All methods declared in an interface are implicitly abstract and implicitly public. It is not necessary, and in fact considered redundant to declare a method in an interface to be abstract. You can define data in an interface, but it is less common to do so. If there are data fields defined in an interface, then they are implicitly defined to be: Public, Static, and Final In other words, any data defined in an interface are treated as public constants. Note that a class and an interface in the same package cannot share the same name. Methods declared in an interface cannot be declared final. Why? Interface declaration Interface names and class names in the same package must be distinct. public interface interface-name { // if any data are defined, they must be constants public static final type-name var-name = constant-expr; // one or more implicitly abstract and public methods return-type method-name ( formal-params );} When to use an Interface vs when to use an abstract class Having reviewed their basic properties, there are two primary differences between interfaces and abstract classes: An abstract class can have a mix of abstract and non-abstract methods, so some default implementations can be defined in the abstract base class. 
An abstract class can also have static

methods, static data, private and protected methods, etc. In other words, a class is a class, so it can contain features inherent to a class. The downside to an abstract base class, is that since their is only single inheritance in Java, you can only inherit from one class. An interface has a very restricted use, namely, to declare a set of public abstract method signatures that a subclass is required to implement. An interface defines a set of type constraints, in the form of type signatures that impose a requirement on a subclass to implement the methods of the interface. Since you can inherit multiple interfaces, they are often a very useful mechanism to allow a class to have different behaviors in different situations of usage by implementing multiple interfaces. It is usually a good idea to implement an interface when you need to define methods that are to be explicitly overridden by some subclass. If you then want some of the methods implemented with default implementations that will be inherited by a subclass, then create an implementation class for the interface, and have other class inherit (extend) that class, or just use an abstract base class instead of an interface 5. Explain the working of struts with an example. Ans: Struts is modeled after the MVC design pattern, you can follow a standard development process for all of your Struts Web applications. Identificaty of the application Views, the Controller objects that will service those Views, and the Model components being operated on. 1. Define and create all of the Views, in relation to their purpose, that will represent the user interface of our application. Add all Action Forms used by the created Views to the strutsconfig. xml file. 2. Create the components of the applications Controller. 3. Define the relationships that exist between the Views and the Controllers (struts-config.xml). 4. Make the appropriate modifications to the web.xml file, describe the Struts components to the Web application. 
Let's start with step one. We will create the view file named index.jsp.

index.jsp

    <%@ page language="java" %>
    <%@ taglib uri="/WEB-INF/struts-html.tld" prefix="html" %>
    <html>
    <head>
    <title>Sample Struts Application</title>
    </head>
    <body>
    <html:form action="Name" name="nameForm" type="example.NameForm">
    <table width="80%" border="0">
    <tr>
    <td>Name:</td>
    <td><html:text property="name" /></td>
    </tr>
    <tr>
    <td><html:submit /></td>
    </tr>
    </table>
    </html:form>
    </body>
    </html>

The attributes of the html:form tag are:

Action: Represents the URL to which this form will be submitted. This attribute is also used to find the appropriate Action Mapping in the Struts configuration file, which we will describe later in this section. The value used in our example is Name, which will map to an Action Mapping with a path attribute equal to Name.

Name: Identifies the key by which the ActionForm will be referenced. We use the value nameForm. An ActionForm is an object that is used by Struts to represent the form data as a JavaBean. Its main purpose is to pass form data between View and Controller components. We will discuss NameForm later in this section.

Type: Gives the fully qualified class name of the form bean to use in this request. For this example, we use the value example.NameForm, which is an ActionForm object containing data members matching the inputs of this form.

NameForm.java

    package example;

    // import statements
    import javax.servlet.http.HttpServletRequest;
    import org.apache.struts.action.ActionForm;
    import org.apache.struts.action.ActionMapping;

    public class NameForm extends ActionForm {
        private String name = null;

        public String getName() {
            return (name);
        }

        public void setName(String name) {
            this.name = name;
        }

        public void reset(ActionMapping mapping, HttpServletRequest request) {
            this.name = null;
        }
    }

displayname.jsp

    <html>
    <head>
    <title>Sample Struts Display Name</title>
    </head>
    <body>
    <table width="80%" border="0">
    <tr>
    <td>Hello <%= request.getAttribute("NAME") %> !!</td>
    </tr>
    </table>
    </body>
    </html>

NameAction.java (step two, the Controller component)

    package example;

    import java.io.IOException;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    import org.apache.struts.action.Action;
    import org.apache.struts.action.ActionForm;
    import org.apache.struts.action.ActionForward;
    import org.apache.struts.action.ActionMapping;

    public class NameAction extends Action {
        public ActionForward execute(ActionMapping mapping, ActionForm form,
                HttpServletRequest request, HttpServletResponse response)
                throws IOException, ServletException {
            String target = "success";
            // Declared outside the if block so it is still in scope below
            String name = null;
            if (form != null) {
                // Use the NameForm to get the request parameters
                NameForm nameForm = (NameForm) form;
                name = nameForm.getName();
            }
            // If no name was supplied, set the target to failure
            if (name == null) {
                target = "failure";
            } else {
                request.setAttribute("NAME", name);
            }
            return (mapping.findForward(target));
        }
    }

Moving to step three: to deploy the NameAction to our Struts application, we need to compile the NameAction class, move the class file to the /WEB-INF/classes/example directory, and add the following entry to the <action-mappings> section of the /WEB-INF/struts-config.xml file:

    <action path="/Name" type="example.NameAction" name="nameForm" input="/index.jsp">
        <forward name="success" path="/displayname.jsp"/>
        <forward name="failure" path="/index.jsp"/>
    </action>

For step four we modify the web.xml file. We have to tell the Web application about our ActionServlet. This is accomplished by adding the following servlet definition to the /WEB-INF/web.xml file:

    <servlet>
        <servlet-name>action</servlet-name>
        <servlet-class>org.apache.struts.action.ActionServlet</servlet-class>
        <init-param>
            <param-name>config</param-name>
            <param-value>/WEB-INF/struts-config.xml</param-value>
        </init-param>
        <load-on-startup>1</load-on-startup>
    </servlet>

Once we have told the container about the ActionServlet, we need to tell it when the action should be executed. To do this, we have to add a <servlet-mapping> element to the /WEB-INF/web.xml file:

    <servlet-mapping>
        <servlet-name>action</servlet-name>
        <url-pattern>*.do</url-pattern>
    </servlet-mapping>

6. Write a program in Java to demonstrate the complete life cycle of a Servlet.

Ans: CODE:

    import database.BookDB;
    import javax.servlet.*;
    import util.Counter;

    public final class ContextListener implements ServletContextListener {
        private ServletContext context = null;

        public void contextInitialized(ServletContextEvent event) {
            context = event.getServletContext();
            try {
                BookDB bookDB = new BookDB();
                context.setAttribute("bookDB", bookDB);
            } catch (Exception ex) {
                System.out.println("Couldn't create database: " + ex.getMessage());
            }
            Counter counter = new Counter();
            context.setAttribute("hitCounter", counter);
            context.log("Created hitCounter" + counter.getCounter());
            counter = new Counter();
            context.setAttribute("orderCounter", counter);
            context.log("Created orderCounter" + counter.getCounter());
        }

        public void contextDestroyed(ServletContextEvent event) {
            context = event.getServletContext();
            // getAttribute returns Object, so a cast is required
            BookDB bookDB = (BookDB) context.getAttribute("bookDB");
            bookDB.remove();
            context.removeAttribute("bookDB");
            context.removeAttribute("hitCounter");
            context.removeAttribute("orderCounter");
        }
    }

7. Explain the life cycle of a Servlet?

Ans: Servlet Life Cycle: The life cycle of a servlet is controlled by the container in which the servlet has been deployed. When a request is mapped to a servlet, the container performs the following steps.
1. If an instance of the servlet does not exist, the Web container
   a. Loads the servlet class.
   b. Creates an instance of the servlet class.
   c. Initializes the servlet instance by calling the init method.
2. Invokes the service method, passing a request and response object.
If the container needs to remove the servlet, it finalizes the servlet by calling the servlet's destroy method.

Handling Servlet Life-Cycle Events

You can monitor and react to events in a servlet's life cycle by defining listener objects whose methods get invoked when life-cycle events occur. To use these listener objects, you must define the listener class and specify it in the Web application deployment descriptor. The listeners.ContextListener class creates and removes the database helper and counter objects used in the Duke's Bookstore application. The methods retrieve the Web context object from ServletContextEvent and then store (and remove) the objects as servlet context attributes.

CODE:

    import database.BookDB;
    import javax.servlet.*;
    import util.Counter;

    public final class ContextListener implements ServletContextListener {
        private ServletContext context = null;

        public void contextInitialized(ServletContextEvent event) {
            context = event.getServletContext();
            try {
                BookDB bookDB = new BookDB();
                context.setAttribute("bookDB", bookDB);
            } catch (Exception ex) {
                System.out.println("Couldn't create database: " + ex.getMessage());
            }
            Counter counter = new Counter();
            context.setAttribute("hitCounter", counter);
            context.log("Created hitCounter" + counter.getCounter());
            counter = new Counter();
            context.setAttribute("orderCounter", counter);
            context.log("Created orderCounter" + counter.getCounter());
        }

        public void contextDestroyed(ServletContextEvent event) {
            context = event.getServletContext();
            // getAttribute returns Object, so a cast is required
            BookDB bookDB = (BookDB) context.getAttribute("bookDB");
            bookDB.remove();
            context.removeAttribute("bookDB");
            context.removeAttribute("hitCounter");
            context.removeAttribute("orderCounter");
        }
    }
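The listener only runs if the container knows about it. As a minimal sketch, assuming the class lives in the listeners package as the text above says, it would be registered in the /WEB-INF/web.xml deployment descriptor roughly as follows (the surrounding web-app element is omitted):

```xml
<!-- Registers the context listener so the container calls
     contextInitialized and contextDestroyed at startup and shutdown. -->
<listener>
    <listener-class>listeners.ContextListener</listener-class>
</listener>
```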

8. Explain the importance, applications and working of Java Struts.

Ans: Struts is an application development framework that is designed for and used with the popular J2EE (Java 2, Enterprise Edition) platform. It cuts time out of the development process and makes developers more productive by providing them with a series of tools and components with which to build applications. Struts falls under the Jakarta subproject of the Apache Software Foundation and comes with an Open Source license.

An example of a Struts Flow server-side script which logs the user on to an application:

    function login() {
        userManager = struts.applicationScope["userManager"];
        error = "";
        while (struts.sessionScope["curUser"] == null) {
            forwardAndWait("loginForm", {"error" : error});
            user = struts.param["user"];
            passwd = struts.param["passwd"];
            if (userManager.login(user, passwd)) {
                struts.sessionScope["curUser"] = user;
            } else {
                error = "Invalid login, please try again";
            }
        }
    }

Features of Struts Flow are as follows:
- Easily script complex workflows
- Full access to Struts features
- Can exist side by side with regular Struts actions
- Ability to run in non-Struts environments (uses Jakarta's Commons-Chain)
- Enhanced Java API methods and Collections integration
- Remote RPC support (termed Ajax but with JSON instead of XML) for calling flow methods from the client
- Includes a Wizard library to help easily create complex wizards
- Includes a Javascript Templates library to replace JSP for a 100% Javascript view layer
- Includes number guessing, remote RPC, Javascript Templates, and wizard examples

Struts is a Web application framework: a collection of Java code designed to help you build solid applications while saving time. Struts is based on the time-proven Model-View-Controller (MVC) design pattern. The MVC pattern is widely recognized as being among the most well-developed and mature design patterns in use. By using the MVC design pattern, processing is broken into three distinct sections aptly named the Model, the View, and the Controller.

Model Components: Model components are generally standard Java classes.

View Components: View components are generally built using Java Server Page (JSP) files.

Controller Components: Controller components in Struts are Java classes and must be built using specific rules. They are usually referred to as "Action classes."

Master of Computer Application (MCA) Semester 4 Set 1 Computer Based Optimization Methods
1. Briefly describe the structure of a mathematical model in OR.
Ans:

The Structure of Mathematical Model

Many industrial and business situations are concerned with planning activities. In each case of planning there are limited resources, such as men, machines, material and capital, at the disposal of the planner. One has to make decisions regarding these resources in order to maximize production, minimize the cost of production, maximize the profit, and so on. These problems are referred to as problems of constrained optimization. Linear programming is a technique for determining an optimal schedule of interdependent activities for the given resources. Programming thus means planning and refers to the process of deciding on a particular plan of action from among several available alternatives.

Any business or production activity to be formulated as a mathematical model can best be discussed through its constituents:
- Decision variables,
- Objective function,
- Constraints.

1.6.1 Decision variables and parameters

The decision variables are the unknowns to be determined from the solution of the model. The parameters represent the controlled variables of the system.

1.6.2 Objective functions

This defines the measure of effectiveness of the system as a mathematical function of its decision variables. The optimal solution to the model is obtained when the corresponding values of the decision variables yield the best value of the objective function while satisfying all constraints. Thus the objective function acts as an indicator of whether the optimal solution has been achieved. While formulating a problem, the desire of the decision-maker is expressed as a function of n decision variables. In a linear programming problem this function is linear (i.e., each of its terms contains only one variable, raised to the power one). Some of the objective functions in practice are:
- Maximization of contribution or profit
- Minimization of cost
- Maximization of production rate or minimization of production time
- Minimization of labour turnover
- Minimization of overtime
- Maximization of resource utilization
- Minimization of risk to environment or factory, etc.

1.6.3 Constraints

To account for the physical limitations of the system, the model must include constraints, which limit the decision variables to their feasible range or permissible values. These are expressed in the form of constraining mathematical functions. For example, in chemical industries, restrictions come from the government about releasing gases into the environment. Restrictions from the sales department about the marketability of some products are also treated as constraints. A linear programming problem in practice therefore has a set of constraints.

The mathematical models in OR may be viewed generally as determining the values of the decision variables $x_j$, $j = 1, 2, 3, \ldots, n$, which will optimize $Z = f(x_1, x_2, \ldots, x_n)$ subject to the constraints $g_i(x_1, x_2, \ldots, x_n) \sim b_i$, $i = 1, 2, \ldots, m$, and $x_j \ge 0$, $j = 1, 2, 3, \ldots, n$, where $\sim$ is $\le$, $\ge$ or $=$.

The function $f$ is called the objective function, and $g_i \sim b_i$ represents the $i$-th constraint for $i = 1, 2, 3, \ldots, m$, where $b_i$ is a known constant. The constraints $x_j \ge 0$ are called the non-negativity conditions, which restrict the variables to zero or positive values only.
2. Find all basic solutions for the system

$x_1 + 2x_2 + x_3 = 4$
$2x_1 + x_2 + 5x_3 = 5$

Solution: Here $A = \begin{pmatrix} 1 & 2 & 1 \\ 2 & 1 & 5 \end{pmatrix}$, $X = (x_1, x_2, x_3)^T$ and $b = (4, 5)^T$.

i) If $x_1 = 0$, then the basis matrix is $B = \begin{pmatrix} 2 & 1 \\ 1 & 5 \end{pmatrix}$. In this case $2x_2 + x_3 = 4$ and $x_2 + 5x_3 = 5$. If we solve this, then $x_2 = 5/3$ and $x_3 = 2/3$. Therefore $x_2 = 5/3$, $x_3 = 2/3$ is a basic feasible solution.

ii) If $x_2 = 0$, then the basis matrix is $B = \begin{pmatrix} 1 & 1 \\ 2 & 5 \end{pmatrix}$. In this case $x_1 + x_3 = 4$ and $2x_1 + 5x_3 = 5$. If we solve this, then $x_1 = 5$ and $x_3 = -1$. Therefore $x_1 = 5$, $x_3 = -1$ is a basic solution. (Note that this solution is not feasible, because $x_3 = -1 < 0$.)

iii) If $x_3 = 0$, then the basis matrix is $B = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}$. In this case $x_1 + 2x_2 = 4$ and $2x_1 + x_2 = 5$. If we solve this, then $x_1 = 2$ and $x_2 = 1$. Therefore $x_1 = 2$, $x_2 = 1$ is a basic feasible solution.

Therefore i) $(x_2, x_3) = (5/3, 2/3)$, ii) $(x_1, x_3) = (5, -1)$, and iii) $(x_1, x_2) = (2, 1)$ are the complete collection of basic solutions.
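The three cases above can be checked mechanically. The following is a minimal sketch, not part of the original answer: it sets each variable to zero in turn and solves the remaining 2x2 system by Cramer's rule. The class name BasicSolutions and the method solvePair are hypothetical names chosen for illustration.

```java
// Sketch: enumerate the basic solutions of the 2x3 system A x = b.
// Class and method names are illustrative, not taken from the source text.
final class BasicSolutions {

    // Solves the 2x2 system formed by columns p and q of A (Cramer's rule).
    // Returns {x_p, x_q}, or null when the two columns are linearly dependent.
    static double[] solvePair(double[][] a, double[] b, int p, int q) {
        double det = a[0][p] * a[1][q] - a[0][q] * a[1][p];
        if (Math.abs(det) < 1e-12) {
            return null; // columns p and q do not form a basis
        }
        double xp = (b[0] * a[1][q] - a[0][q] * b[1]) / det;
        double xq = (a[0][p] * b[1] - b[0] * a[1][p]) / det;
        return new double[] {xp, xq};
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2, 1}, {2, 1, 5}};
        double[] b = {4, 5};
        // Every pair of columns is a candidate basis; the third variable is zero.
        for (int p = 0; p < 3; p++) {
            for (int q = p + 1; q < 3; q++) {
                double[] x = solvePair(a, b, p, q);
                if (x == null) {
                    continue;
                }
                boolean feasible = x[0] >= 0 && x[1] >= 0;
                System.out.printf("x%d = %.4f, x%d = %.4f, feasible = %b%n",
                        p + 1, x[0], q + 1, x[1], feasible);
            }
        }
    }
}
```

A basic solution is feasible only when both basic variables are non-negative, which is why the pair $(x_1, x_3)$ is reported as infeasible.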

3. Use simplex method to solve the LPP

Maximize $Z = 2x_1 + 4x_2 + x_3 + x_4$
subject to
$x_1 + 3x_2 + x_4 \le 4$
$2x_1 + x_2 \le 3$
$x_2 + 4x_3 + x_4 \le 3$
$x_1, x_2, x_3, x_4 \ge 0$

Answer:

Rewriting in the standard form:

Maximize $Z = 2x_1 + 4x_2 + x_3 + x_4 + 0 \cdot S_1 + 0 \cdot S_2 + 0 \cdot S_3$
subject to
$x_1 + 3x_2 + x_4 + S_1 = 4$
$2x_1 + x_2 + S_2 = 3$
$x_2 + 4x_3 + x_4 + S_3 = 3$
$x_1, x_2, x_3, x_4, S_1, S_2, S_3 \ge 0$.

The initial basic solution is $S_1 = 4$, $S_2 = 3$, $S_3 = 3$, so $X_0 = (S_1, S_2, S_3)^T = (4, 3, 3)^T$ and $C_0 = (0, 0, 0)^T$.

The initial table is given by

S1 is the outgoing variable, x2 is the incoming variable to the basic set. The first iteration gives the following table :

x3 enters the new basic set replacing S3, the second iteration gives the following table :

x1 enters the new basic set replacing S2, the third iteration gives the following table:

Since all elements of the last row are non-negative, the optimal solution is $Z = 13/2$, which is achieved for $x_2 = 1$, $x_1 = 1$, $x_3 = 1/2$ and $x_4 = 0$.
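As a consistency check (not a substitute for the simplex tables, which are not reproduced here), the stated values $x_1 = 1$, $x_2 = 1$, $x_4 = 0$ can be substituted back into the constraints. The third constraint then fixes $x_3$, since $x_3$ has a positive coefficient in $Z$ and appears in no other constraint:

```latex
\begin{aligned}
x_1 + 3x_2 + x_4 &= 1 + 3 + 0 = 4 \le 4, \\
2x_1 + x_2 &= 2 + 1 = 3 \le 3, \\
x_2 + 4x_3 + x_4 &= 1 + 4x_3 \le 3 \;\Rightarrow\; x_3 \le \tfrac{1}{2}, \text{ so } x_3 = \tfrac{1}{2}, \\
Z &= 2(1) + 4(1) + \tfrac{1}{2} + 0 = \tfrac{13}{2}.
\end{aligned}
```

All three constraints hold with equality, which agrees with $S_1$, $S_2$ and $S_3$ having left the basis in the three iterations.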

4. Solve the following problem by MODI method

A firm owns facilities at six places. It has manufacturing plants at places A, B and C with daily production of 50, 40 and 60 units respectively. At points D, E and F it has three warehouses with daily demands of 20, 95 and 35 units respectively. Per unit shipping costs are given in the following table. If the firm wants to minimize its total transportation cost, how should it route its products?

              Warehouses
    Plants    D    E    F
    A         6    4    1
    B         3    8    7
    C         4    4    2

Ans:

Solution: We use the North-West corner rule for the initial basic feasible solution, given in the following table; the basic cells are marked in the top-right square boxes. Let $u_1 = 0$. For the occupied cell (1, 1), $u_1 + v_1 = c_{11} \Rightarrow 0 + v_1 = 6 \Rightarrow v_1 = 6$. Similarly, $u_1 + v_2 = c_{12} \Rightarrow 0 + v_2 = 4 \Rightarrow v_2 = 4$. Now with $v_2 = 4$, $u_2 + 4 = 8 \Rightarrow u_2 = 4$, and $u_3 + 4 = 4 \Rightarrow u_3 = 0$. From $u_3 = 0$ and $c_{33} = 2$ we get $v_3 = 2$. Therefore we have determined all $u_i$ and $v_j$ values. Next we calculate the $\Delta_{ij} = u_i + v_j - c_{ij}$ values for the unoccupied cells, which are given in the bottom-left squares (opportunity costs). Now $\Delta_{13} = 1$, $\Delta_{21} = 7$, $\Delta_{23} = -1$ and $\Delta_{31} = 2$.

Therefore the cell (2, 1) has the largest positive opportunity cost, and so we select $x_{21}$ for inclusion as a basic variable. A closed loop (indicated by arrows in the above table) is traced starting with cell (2, 1). The revised solution is shown in the following table.

This solution is tested for optimality and is found to be non-optimal. Here the cell (1, 3) has positive opportunity cost and so a closed loop is traced starting with this. The resulting solution, when tested is found to be optimal (given in the following table). (observe that in this table no opportunity cost is positive).

Therefore the optimal transportation cost is Rs. (4 × 15) + (1 × 35) + (3 × 20) + (8 × 20) + (4 × 60) = 555.
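The starting and final costs can be cross-checked with a short sketch. It assumes the cost matrix A: 6, 4, 1; B: 3, 8, 7; C: 4, 4, 2 (reconstructed from the flattened table and the $c_{ij}$ values used in the solution) and a final allocation read off the cost expression above; the class TransportCheck and its methods are hypothetical names. It does not perform the MODI iterations themselves.

```java
// Sketch: North-West corner start and cost evaluation for the transport problem.
// The cost matrix is an assumption reconstructed from the worked solution.
final class TransportCheck {

    // Unit shipping costs: rows = plants A, B, C; columns = warehouses D, E, F.
    static final int[][] COST = {{6, 4, 1}, {3, 8, 7}, {4, 4, 2}};

    // North-West corner rule: fill from the top-left cell, moving down when a
    // plant is exhausted and right when a warehouse is satisfied.
    static int[][] northWestCorner(int[] supply, int[] demand) {
        int[] s = supply.clone();
        int[] d = demand.clone();
        int[][] x = new int[s.length][d.length];
        int i = 0;
        int j = 0;
        while (i < s.length && j < d.length) {
            int q = Math.min(s[i], d[j]);
            x[i][j] = q;
            s[i] -= q;
            d[j] -= q;
            if (s[i] == 0) {
                i++;
            } else {
                j++;
            }
        }
        return x;
    }

    // Total cost of an allocation under COST.
    static int totalCost(int[][] x) {
        int total = 0;
        for (int i = 0; i < x.length; i++) {
            for (int j = 0; j < x[i].length; j++) {
                total += COST[i][j] * x[i][j];
            }
        }
        return total;
    }

    public static void main(String[] args) {
        int[][] start = northWestCorner(new int[] {50, 40, 60}, new int[] {20, 95, 35});
        // Final allocation implied by the cost expression in the text.
        int[][] optimal = {{0, 15, 35}, {20, 20, 0}, {0, 60, 0}};
        System.out.println("North-West corner cost: " + totalCost(start));
        System.out.println("Final allocation cost:  " + totalCost(optimal));
    }
}
```

Under these assumptions the North-West corner start costs more than the final allocation, which is what the two MODI improvement steps are meant to achieve.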

5. Solve the assignment problem given in the following table for optimal solution using HAM:
              Job
    Worker    A     B     C     D
    1         45    40    51    67
    2         57    42    63    55
    3         49    52    48    64
    4         41    45    60    55

Solution: Step 1: The minimum value of each row is subtracted from all elements in the row. It is shown in the reduced cost table, which is also called the opportunity cost table. Reduced Cost Table 1:

Step 2: For each column of this table, the minimum value is subtracted from all the other values. Clearly, the columns that contain a zero would remain unaffected by this operation. Here only the fourth column values would change. Consider the following table. Reduced Cost Table 2:

Step 3: Draw the minimum number of lines covering all zeros. As a general rule, we should first cover those rows/columns which contain larger number of zeros. The following table shows reduced cost and the lines are drawn. Reduced Cost Table 3:

Step 4: Since the number of lines drawn is equal to 4 (= n), the optimal solution is obtained. The assignments are made after scanning rows and columns for unit zeros. Assignments made are shown with brackets, below. Reduced Cost Table 4:

Assignments are made in the following order. Rows 1, 3 and 4 contain only one zero each, so we assign 1-B, 3-C, and 4-A. Since worker 1 has been assigned job B, we cross out the zero in the second column of the second row. After making these assignments, only worker 2 and job D are left for assignment. The final pattern of assignments is 1-B, 2-D, 3-C, and 4-A, involving a total time of 40 + 55 + 48 + 41 = 184 minutes.
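Steps 1 and 2 are simple enough to reproduce in code, which also regenerates the reduced cost tables referred to above. This is a minimal sketch of the two reduction steps only (it does not draw covering lines or make assignments); the class HungarianReduction and its methods are hypothetical names, and the cost matrix is read row by row from the question, with workers 1 to 4 as rows and jobs A to D as columns.

```java
// Sketch: row and column reduction steps of the Hungarian assignment method.
final class HungarianReduction {

    // Step 1: subtract each row's minimum from every element of that row.
    static int[][] reduceRows(int[][] cost) {
        int[][] out = new int[cost.length][];
        for (int i = 0; i < cost.length; i++) {
            int min = cost[i][0];
            for (int v : cost[i]) {
                min = Math.min(min, v);
            }
            out[i] = new int[cost[i].length];
            for (int j = 0; j < cost[i].length; j++) {
                out[i][j] = cost[i][j] - min;
            }
        }
        return out;
    }

    // Step 2: subtract each column's minimum from every element of that column.
    static int[][] reduceColumns(int[][] m) {
        int rows = m.length;
        int cols = m[0].length;
        int[][] out = new int[rows][cols];
        for (int j = 0; j < cols; j++) {
            int min = m[0][j];
            for (int i = 1; i < rows; i++) {
                min = Math.min(min, m[i][j]);
            }
            for (int i = 0; i < rows; i++) {
                out[i][j] = m[i][j] - min;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[][] cost = {
            {45, 40, 51, 67},
            {57, 42, 63, 55},
            {49, 52, 48, 64},
            {41, 45, 60, 55}
        };
        int[][] table2 = reduceColumns(reduceRows(cost));
        for (int[] row : table2) {
            System.out.println(java.util.Arrays.toString(row));
        }
    }
}
```

In the resulting table only the fourth column differs from the row-reduced table, and rows 1, 3 and 4 each contain a single zero, matching the description in Steps 2 and 4.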

6. Write briefly about the concept Probability and Cost Consideration in Project Scheduling with examples wherever necessary?

Answer:

Probability and Cost Consideration in Project Scheduling

The analysis in CPM does not take into account the case where time estimates for the different activities are probabilistic. Also, it does not consider explicitly the cost of schedules. Here we will consider both probability and cost aspects in project scheduling. Probability considerations are incorporated in project scheduling by assuming that the time estimate for each activity is based on 3 different values. They are:

a = the optimistic time, which will be required if the execution of the project goes extremely well.
b = the pessimistic time, which will be required if everything goes badly.
m = the most likely time, which will be required if execution is normal.

The most likely estimate m need not coincide with the mid-point of a and b. The expected duration of each activity can then be obtained as the mean of $(a + b)/2$ and $2m$, i.e. $D = \frac{1}{3}\left(\frac{a + b}{2} + 2m\right) = \frac{a + b + 4m}{6}$. This estimate can be used in place of the single estimate D in the critical path calculation. The variance of each activity, denoted by V, is defined by $V = \left(\frac{b - a}{6}\right)^2$.

The earliest expected time for node i, denoted by $E(\mu_i)$, is obtained by taking the sum of the expected times of all activities leading to node i; when more than one activity leads to node i, the greatest of these sums is chosen. Let $\mu_i$ be the earliest occurrence time of event i; we can consider $\mu_i$ as a random variable. Assuming that all activities of the network are statistically independent, we can calculate the mean and the variance of $\mu_i$ as follows: $E(\mu_i) = ES_i$ and $\mathrm{Var}(\mu_i) = \sum_k V_k$, where k ranges over the activities along the longest path leading to i.

For the latest expected time, we start from the last node and, for each path, move backwards, subtracting the expected duration $D_{ij}$ of each activity (i, j). Thus we have $E(L_i) = E(L_j) - D_{ij}$ if only one path leads from j back to i; otherwise it is the minimum of $\{E(L_j) - D_{ij}\}$ over all j for which the activity (i, j) is defined.

Note: The probability distribution of times for completing an event can be approximated by the normal distribution, due to the central limit theorem. Since $\mu_i$ represents the earliest occurrence time, event i will meet a certain schedule time $ST_i$ (specified by an analyst) with probability $\Pr(\mu_i \le ST_i) = \Pr(Z \le K_i)$, where $Z \sim N(0, 1)$ and $K_i = \frac{ST_i - E(\mu_i)}{\sqrt{\mathrm{Var}(\mu_i)}}$. It is common practice to compute the probability that event i will occur no later than its latest completion time $LC_i$; such a probability then represents the chance that the succeeding events will occur within the $(ES_i, LC_i)$ duration.

Example: A project is represented by the network shown below and has the following data.

    Task                A    B    C    D    E    F    G    H    I
    Optimistic Time     5    18   26   16   15   6    7    7    3
    Pessimistic Time    10   22   40   20   25   12   12   9    5
    Most Likely Time    8    20   33   18   20   9    10   8    4

Determine the following: a) Expected task time and their variance. b) The earliest and latest expected times to reach each event. c) The critical path.

d) The probability of an event occurring at the proposed completion date if the original contract time for completing the project is 41.5 weeks.

e) The duration of the project that will have a 95% chance of being completed.

Solution:

a) Using the formulas $D = (a + b + 4m)/6$ and $V = ((b - a)/6)^2$, we can calculate the expected activity times and variances in the following table.

    Activity    a     b     m     Expected D    V
    1-2         5     10    8     7.8           0.696
    1-3         18    22    20    20.0          0.444
    1-4         26    40    33    33.0          5.429
    2-5         16    20    18    18.0          0.443
    2-6         15    25    20    20.0          2.780
    3-6         6     12    9     9.0           1.000
    4-7         7     12    10    9.8           0.694
    5-7         7     9     8     8.0           0.111
    6-7         3     5     4     4.0           0.111

Forward Pass: E1 = 0, E2 = 7.8, E3 = 20, E4 = 33, E5 = 25.8, E6 = 29, E7 = 42.8.

Backward Pass: L7 = 42.8, L6 = 38.8, L5 = 34.8, L4 = 33.0, L3 = 29.8, L2 = 16.8, L1 = 0.

b) The E-values and L-values are shown in Fig.

c) The critical path is shown by thick line in fig. The critical path is 1-4-7 and the earliest completion time for the project is 42.8 weeks.

d) The last event 7 will occur only after 42.8 weeks. For this we require only the durations of the critical activities. This will help us in calculating the standard deviation of the last event.

Expected length of critical path = 33 + 9.8 = 42.8
Variance of critical path length = 5.429 + 0.694 = 6.123, so the standard deviation is $\sqrt{6.123} \approx 2.47$.

The probability of meeting the schedule time is given by $P(Z \le K_i)$ with $K_i = (41.5 - 42.8)/2.47 \approx -0.52$, so $P(Z \le -0.52) = 0.30$ (from the normal distribution table). Thus the probability that the project can be completed in less than or equal to 41.5 weeks is 0.30. In other words, the probability that the project will be delayed beyond 41.5 weeks is 0.70.

e) Given that $P(Z \le K_i) = 0.95$. From the normal distribution table, $Z_{0.95} = 1.64$. Then $(ST_i - 42.8)/2.47 = 1.64$, so $ST_i = 1.64 \times 2.47 + 42.8 = 46.85$ weeks.
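Parts a) to c) can be reproduced with a short sketch. It assumes the nine tasks map, in order, to the activities 1-2, 1-3, 1-4, 2-5, 2-6, 3-6, 4-7, 5-7 and 6-7 used in the solution, and that the activity list is ordered so that a single sweep suffices for each pass. The class PertSketch and its members are hypothetical names. Because the code keeps unrounded durations, its values can differ from the tabulated ones in the second or third decimal place.

```java
// Sketch: PERT expected times, variances, and forward/backward passes.
final class PertSketch {

    // Activities (tail, head), listed so every arc into a node precedes arcs out of it.
    static final int[][] ARCS = {
        {1, 2}, {1, 3}, {1, 4}, {2, 5}, {2, 6}, {3, 6}, {4, 7}, {5, 7}, {6, 7}
    };

    // Optimistic a, pessimistic b, most likely m for each activity, in ARCS order.
    static final double[][] TIMES = {
        {5, 10, 8}, {18, 22, 20}, {26, 40, 33}, {16, 20, 18}, {15, 25, 20},
        {6, 12, 9}, {7, 12, 10}, {7, 9, 8}, {3, 5, 4}
    };

    static double expected(double a, double b, double m) {
        return (a + 4.0 * m + b) / 6.0;
    }

    static double variance(double a, double b) {
        double r = (b - a) / 6.0;
        return r * r;
    }

    static double[] durations() {
        double[] d = new double[TIMES.length];
        for (int k = 0; k < TIMES.length; k++) {
            d[k] = expected(TIMES[k][0], TIMES[k][1], TIMES[k][2]);
        }
        return d;
    }

    // Forward pass: earliest time of each node (index 0 is unused).
    static double[] earliest(int[][] arcs, double[] d, int nodes) {
        double[] e = new double[nodes + 1];
        for (int k = 0; k < arcs.length; k++) {
            e[arcs[k][1]] = Math.max(e[arcs[k][1]], e[arcs[k][0]] + d[k]);
        }
        return e;
    }

    // Backward pass: latest time of each node, starting from the finish time.
    static double[] latest(int[][] arcs, double[] d, int nodes, double finish) {
        double[] l = new double[nodes + 1];
        java.util.Arrays.fill(l, finish);
        for (int k = arcs.length - 1; k >= 0; k--) {
            l[arcs[k][0]] = Math.min(l[arcs[k][0]], l[arcs[k][1]] - d[k]);
        }
        return l;
    }

    public static void main(String[] args) {
        double[] d = durations();
        double[] e = earliest(ARCS, d, 7);
        double[] l = latest(ARCS, d, 7, e[7]);
        for (int i = 1; i <= 7; i++) {
            // Nodes with zero slack (E = L) lie on the critical path.
            System.out.printf("node %d: E = %.2f, L = %.2f%n", i, e[i], l[i]);
        }
    }
}
```

Nodes 1, 4 and 7 come out with equal earliest and latest times, which is consistent with the critical path 1-4-7 stated in part c).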

Master of Computer Application (MCA) Semester 4 Computer Based Optimization Methods


Assignment Set 2 (40 Marks)

1. Ships arrive at a port at a rate of one in every three hours, with a negative exponential distribution of inter arrival times. The time a ship occupies a berth for unloading and loading has a negative exponential distribution with an average of 12 hours. If the average delay of ships waiting for berths is to be kept below 6 hours, how many berths should there be at the port?
Answer:

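Solution: $\lambda = 1/3$ ships per hour, $\mu = 1/12$ ships per hour, $\rho = \lambda/\mu = 4$. For multichannel queues we need $\rho/c < 1$ to ensure that the queue does not explode. Therefore $c > 4$. Let us calculate the waiting time when $c = 5$.

$P_0 = \left[\sum_{n=0}^{c-1} \frac{\rho^n}{n!} + \frac{\rho^c}{c!\,(1 - \rho/c)}\right]^{-1}$

Substituting for n, c and $\rho$, we get $P_0 = 1/77 \approx 0.013$.

Average waiting time of a ship: $E(w) = \frac{\mu\,\rho^c}{(c - 1)!\,(c\mu - \lambda)^2}\,P_0 = 6.65$ hours, which is greater than 6 hours and therefore inadequate. When $c = 6$, $P_0 \approx 0.0167$ and the average waiting time of a ship is $E(w) = 1.71$ hours. Hence 6 berths should be provided at the port.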
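The two waiting times quoted above can be reproduced with the standard M/M/c (multi-server, exponential arrivals and service) formulas. This is a minimal sketch under that assumption; the class BerthQueue and its methods are hypothetical names, and times are passed as mean inter-arrival and mean service hours so that the offered load 12/3 = 4 is computed exactly.

```java
// Sketch: mean queueing delay in an M/M/c system, used to size the number of berths.
final class BerthQueue {

    static double factorial(int n) {
        double f = 1.0;
        for (int k = 2; k <= n; k++) {
            f *= k;
        }
        return f;
    }

    // Probability that the system is empty; a = lambda / mu is the offered load.
    static double p0(double a, int c) {
        double sum = 0.0;
        for (int n = 0; n < c; n++) {
            sum += Math.pow(a, n) / factorial(n);
        }
        sum += Math.pow(a, c) / (factorial(c) * (1.0 - a / c));
        return 1.0 / sum;
    }

    // Mean time a ship waits for a berth, in hours.
    static double meanWait(double interArrivalHours, double serviceHours, int c) {
        double a = serviceHours / interArrivalHours;
        double rho = a / c;
        if (rho >= 1.0) {
            throw new IllegalArgumentException("queue is unstable for c = " + c);
        }
        double lq = p0(a, c) * Math.pow(a, c) * rho
                / (factorial(c) * (1.0 - rho) * (1.0 - rho));
        return lq * interArrivalHours; // Wq = Lq / lambda
    }

    // Smallest number of berths whose mean wait is below the limit.
    static int minBerths(double interArrivalHours, double serviceHours, double maxWait) {
        int c = (int) Math.floor(serviceHours / interArrivalHours) + 1;
        while (meanWait(interArrivalHours, serviceHours, c) >= maxWait) {
            c++;
        }
        return c;
    }

    public static void main(String[] args) {
        System.out.printf("c = 5: %.2f hours%n", meanWait(3.0, 12.0, 5));
        System.out.printf("c = 6: %.2f hours%n", meanWait(3.0, 12.0, 6));
        System.out.println("berths needed: " + minBerths(3.0, 12.0, 6.0));
    }
}
```

The search starts at the first berth count for which the queue is stable, then increases it until the delay target is met.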

2. Find the optimum integer solution to the following all IPP


Maximize $Z = x_1 + 2x_2$
subject to the constraints
$x_1 + x_2 \le 7$
$2x_1 \le 11$
$2x_2 \le 7$
$x_1, x_2 \ge 0$ and are integers.

Solution: Step 1: Introducing the slack variables, we get
$2x_2 + x_3 = 7$
$x_1 + x_2 + x_4 = 7$
$2x_1 + x_5 = 11$
$x_1, x_2, x_3, x_4, x_5 \ge 0$.

Step 2: Ignoring the integer condition, we get the initial simplex table as follows:

Introducing x2 and leaving x3 from the basis, we get

Introducing X1 and leaving X4 we get the following optimum table. Optimum table

The optimum solution thus obtained is: $x_1 = 7/2$, $x_2 = 7/2$, max $Z = 21/2$.

Step 3: Since the optimum solution obtained above is not an integer solution, we must go to the next step.

Step 4: Now we select the constraint corresponding to the criterion $\max_i (f_{Bi}) = \max (f_{B1}, f_{B2}, f_{B3}) = \max (1/2, 1/2, 0) = 1/2$. Since in this problem the $x_2$-equation and the $x_1$-equation both have the same value of $f_{Bi}$, i.e. $1/2$, either one of the two equations can be used. Now consider the first row of the optimum table, $x_2 + \frac{1}{2}x_3 = \frac{7}{2}$. The Gomory constraint to be added is $-\frac{1}{2}x_3 + G_1 = -\frac{1}{2}$. Adding this new constraint to the optimum table we get

Step 5: To apply the dual simplex method. Now, in order to remove the infeasibility of the optimum solution ($G_1 = -1/2$), we use the dual simplex method.

i) The leaving vector is $G_1$.

ii) The entering vector is given by the dual simplex ratio test, which yields k = 3. So we must enter $a_3$, corresponding to which $x_3$ is given in the above table. Thus, dropping $G_1$ and introducing $x_3$, we get the following dual simplex table.

Thus clearly the optimum feasible solution is obtained in integers. Finally we get the integer optimum solution to the given IPP as x1 = 4, x2 = 3 and max z = 10.
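Because the feasible region is small, the integer optimum can be double-checked by brute force. This is a sketch of an independent check, not of Gomory's cutting-plane method; the class IntegerCheck and the method best are hypothetical names.

```java
// Sketch: exhaustive search over integer points of the IPP feasible region.
final class IntegerCheck {

    // Returns {x1, x2, Z} for the best integer point.
    static int[] best() {
        int[] result = {0, 0, Integer.MIN_VALUE};
        // x1 + x2 <= 7 with x1, x2 >= 0 bounds both variables by 7.
        for (int x1 = 0; x1 <= 7; x1++) {
            for (int x2 = 0; x2 <= 7; x2++) {
                boolean feasible = x1 + x2 <= 7 && 2 * x1 <= 11 && 2 * x2 <= 7;
                int z = x1 + 2 * x2;
                if (feasible && z > result[2]) {
                    result = new int[] {x1, x2, z};
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int[] r = best();
        System.out.println("x1 = " + r[0] + ", x2 = " + r[1] + ", max Z = " + r[2]);
    }
}
```

Enumeration is only practical for tiny problems like this one; it is used here purely to confirm the result of the cutting-plane steps.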

3. Reduce the following game by dominance and find the game value

                     Player B
                  I     II    III   IV
    Player A  I   3     2     4     0
              II  3     4     2     4
              III 4     2     4     0
              IV  0     4     0     8

Solution: This matrix has no saddle point. We reduce the size of the matrix by the principle of dominance. From player A's point of view, row I is dominated by row III. Therefore row I is deleted and the reduced matrix is

              I     II    III   IV
        II    3     4     2     4
        III   4     2     4     0
        IV    0     4     0     8

From B's point of view, column I is dominated by column III. Therefore column I is deleted and so the matrix becomes

              II    III   IV
        II    4     2     4
        III   2     4     0
        IV    4     0     8

In the above matrix no single row (or column) dominates another row (column). But column II is dominated by the average of columns III and IV, which is (3, 2, 4). Hence column II is deleted. Therefore the reduced matrix is

              III   IV
        II    2     4
        III   4     0
        IV    0     8

Again, row II is dominated by the average of rows III and IV, which gives (2, 4). Therefore row II is deleted and a 2 × 2 matrix results.

              III   IV
        III   4     0
        IV    0     8

The above 2 × 2 matrix has no saddle point. It can be solved by the arithmetic method.

Therefore the complete solution to the given problem is:
Optimal strategy for player A: (0, 0, 2/3, 1/3).
Optimal strategy for player B: (0, 0, 2/3, 1/3).
The value of the game (for A) is 32/12 = 8/3.
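The arithmetic (oddments) step is not spelled out above, so here is a sketch of it for the final $2 \times 2$ matrix with rows and columns III and IV. The symbols $p$ and $q$ for the mixed strategies of A and B, and $a, b, c, d$ for the entries, are notation introduced here for illustration:

```latex
\begin{aligned}
\begin{pmatrix} a & b \\ c & d \end{pmatrix} &= \begin{pmatrix} 4 & 0 \\ 0 & 8 \end{pmatrix},
\qquad (a + d) - (b + c) = 12, \\
p_{III} &= \frac{d - c}{(a + d) - (b + c)} = \frac{8}{12} = \frac{2}{3},
\qquad p_{IV} = 1 - p_{III} = \frac{1}{3}, \\
q_{III} &= \frac{d - b}{(a + d) - (b + c)} = \frac{8}{12} = \frac{2}{3},
\qquad q_{IV} = 1 - q_{III} = \frac{1}{3}, \\
v &= \frac{ad - bc}{(a + d) - (b + c)} = \frac{32}{12} = \frac{8}{3}.
\end{aligned}
```

As a check, A's expected payoff against either of B's remaining columns is the same: $4 \cdot \tfrac{2}{3} + 0 \cdot \tfrac{1}{3} = \tfrac{8}{3}$ and $0 \cdot \tfrac{2}{3} + 8 \cdot \tfrac{1}{3} = \tfrac{8}{3}$.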

4. Write the different applications of simulation?


Answer:

1. The application of simulation in business is extremely wide. Unlike other mathematical models, simulation, though abstract, can be easily understood by the users, which facilitates their active involvement. This, in turn, makes the results more reliable and also ensures easy acceptance for execution. The degree to which a simulation model can be made close to reality depends on the ingenuity of the O.R. team, who should identify the relevant variables as well as their behavior.

2. The use of computer simulation for studying the likely behavior of nuclear reactors during accidents is a case in point. Clearly, testing actual reactors, or even scaled-down reactor models, under emergency conditions would involve excessive risks.

3. Simulation can also be used for a wide diversity of problems encountered in production systems: the policy for optimal maintenance in terms of frequency of replacement of spares or preventive maintenance, number of maintenance crews, number of pieces of equipment for handling materials, job shop scheduling, routing problems, stock control and so forth. Other areas of application include dock facilities, facilities at airports to minimize congestion, hospital appointment systems and even management games.

4. As in the case of other O.R. models, with the help of simulation the manager tries to strike a balance between the opposing costs of providing facilities (which usually mean a long-term commitment of funds) and the opportunity and other costs of not providing them.

5. A firm engaged in producing 2 models, viz., model A and model B, performs only 3 operations: painting, assembly and testing. The relevant data are as follows:

                                 Hours required for each unit
Model      Unit Sale Price       Assembly    Painting    Testing
Model A    Rs 50.00              1.0         0.2         0.0
Model B    Rs 80.00              1.5         0.2         0.1

Answer: Solution: Let us first write the notations as under:

Z : Total revenue
x1 : Number of units of Model A
x2 : Number of units of Model B
b1 : Weekly hours available for assembly
b2 : Weekly hours available for painting
b3 : Weekly hours available for testing.

Since the objective (goal) of the firm is to maximize its revenue, the model can be stated as follows:

Maximize Z = 50x1 + 80x2 (objective function)
Subject to
1.0x1 + 1.5x2 ≤ 600 (assembly constraint)
0.2x1 + 0.2x2 ≤ 100 (painting constraint)
0.0x1 + 0.1x2 ≤ 30 (testing constraint)
and x1 ≥ 0, x2 ≥ 0 (non-negativity conditions).

Here x1 and x2 are the decision variables.
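The answer stops at the formulation. As a rough check of what the model implies, the sketch below evaluates the revenue at every corner point of the feasible region. This brute-force corner-point enumeration is an added illustration and not part of the original answer; it assumes the three constraints are "less than or equal to" with right-hand sides 600, 100 and 30, and the names CONSTRAINTS, PRICES, corner_points and best_corner are hypothetical.

```python
from fractions import Fraction as F
from itertools import combinations

# Each constraint is (a1, a2, b), meaning a1*x1 + a2*x2 <= b.
# The last two rows encode x1 >= 0 and x2 >= 0 as -x1 <= 0 and -x2 <= 0.
CONSTRAINTS = [
    (F(1), F(3, 2), F(600)),      # assembly
    (F(1, 5), F(1, 5), F(100)),   # painting
    (F(0), F(1, 10), F(30)),      # testing
    (F(-1), F(0), F(0)),
    (F(0), F(-1), F(0)),
]
PRICES = (F(50), F(80))

def corner_points(constraints):
    """Intersect every pair of constraint boundaries and keep the feasible points."""
    points = set()
    for (a1, a2, b), (c1, c2, d) in combinations(constraints, 2):
        det = a1 * c2 - a2 * c1
        if det == 0:                      # parallel boundaries never meet
            continue
        x1 = (b * c2 - a2 * d) / det      # Cramer's rule
        x2 = (a1 * d - b * c1) / det
        if all(p * x1 + q * x2 <= r for p, q, r in constraints):
            points.add((x1, x2))
    return points

def best_corner(constraints, prices):
    """Return the feasible corner point with the largest revenue."""
    return max(corner_points(constraints),
               key=lambda pt: prices[0] * pt[0] + prices[1] * pt[1])

x1, x2 = best_corner(CONSTRAINTS, PRICES)
print(x1, x2, PRICES[0] * x1 + PRICES[1] * x2)
```

Exact fractions are used so that the feasibility tests are not affected by floating-point rounding.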

6. Write a C++ program for simplex method.


Answer:

Simplex Algorithm

1) Locate the most negative number in the last (bottom) row of the simplex table, excluding that of the last column, and call the column in which this number appears the work (pivot) column.

2) Form ratios by dividing each positive number in the work column, excluding that of the last row, into the element in the same row and last column. Designate the element in the work column that yields the smallest ratio as the pivot element. If more than one element yields the same smallest ratio, choose one of them arbitrarily. If no element in the work column is positive, the program has no solution.

3) Use elementary row operations to convert the pivot element to unity (1) and then reduce all other elements in the work column to zero.

4) Replace the x-variable in the pivot row and first column by the x-variable in the first row and pivot column. The variable which is replaced is called the outgoing variable and the variable that replaces it is called the incoming variable. This new first column is the current set of basic variables.

5) Repeat steps 1 through 4 until there are no negative numbers in the last row, excluding the last column.

6) The optimal solution is obtained by assigning to each variable in the first column the value in the corresponding row and last column. All other variables are considered non-basic and are assigned the value zero. The associated optimal value of the objective function is the number in the last row and last column for a maximization problem, but the negative of this number for a minimization problem.

3.4.2 Simplex Method Flowchart

Note:
1. The pivot column is the column with the most negative value in the objective function. If there are no negatives, stop, you're done.
2. Find the ratios between the non-negative entries in the right-hand side and the positive entries in the pivot column. If there are no positive entries, stop, there is no solution.
3. The pivot row is the row with the smallest non-negative ratio. Zero counts as a non-negative ratio.
4. Pivot where the pivot row and pivot column meet.
5. Go back to step 1 until there are no more negatives in the bottom row.

Problems:

1) Maximize z = x1 + 9x2 + x3
Subject to
x1 + 2x2 + 3x3 ≤ 9
3x1 + 2x2 + 2x3 ≤ 15
x1, x2, x3 ≥ 0.

Rewriting in the standard form:
Maximize z = x1 + 9x2 + x3 + 0.S1 + 0.S2
Subject to the conditions
x1 + 2x2 + 3x3 + S1 = 9
3x1 + 2x2 + 2x3 + S2 = 15
x1, x2, x3, S1, S2 ≥ 0,
where S1 and S2 are the slack variables.

The initial basic solution is S1 = 9, S2 = 15, so X0 = (9, 15) and C0 = (0, 0). In the initial simplex table, S1 and S2 are the basic variables.

S1 is the outgoing variable and x2 is the incoming variable. Since there are three Zj − Cj which are negative, the solution is not optimal. We choose the most negative of these, i.e. −9; the corresponding column vector x2 enters the basis replacing S1, since its ratio is minimum. We use elementary row operations to reduce the pivot element to 1 and the other elements of the work column to zero.

First Iteration: The variable x2 becomes a basic variable replacing S1. Since all elements of the last row of the resulting table are non-negative, the optimal solution is obtained. The maximum value of the objective function is Z = 81/2, which is achieved for x2 = 9/2, S2 = 6, which are the basic variables. All other variables are non-basic.

2) Use the Simplex method to solve the LPP
Maximize Z = 2x1 + 4x2 + x3 + x4
Subject to
x1 + 3x2 + x4 ≤ 4
2x1 + x2 ≤ 3
x2 + 4x3 + x4 ≤ 3
x1, x2, x3, x4 ≥ 0.

Rewriting in the standard form:
Maximize Z = 2x1 + 4x2 + x3 + x4 + 0.S1 + 0.S2 + 0.S3
Subject to
x1 + 3x2 + x4 + S1 = 4
2x1 + x2 + S2 = 3
x2 + 4x3 + x4 + S3 = 3
x1, x2, x3, x4, S1, S2, S3 ≥ 0.

The initial basic solution is S1 = 4, S2 = 3, S3 = 3, so X0 = (4, 3, 3) and C0 = (0, 0, 0). In the initial table, S1 is the outgoing variable and x2 is the incoming variable to the basic set. In the second iteration, x3 enters the new basic set replacing S3. In the third iteration, x1 enters the new basic set replacing S2. Since all elements of the last row are then non-negative, the optimal solution is Z = 13/2, which is achieved for x2 = 1, x1 = 1, x3 = 1/2 and x4 = 0.
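The six steps of the simplex algorithm can be sketched in code. What follows is a minimal Python sketch rather than the C++ program the question asks for, and it only covers the case treated in the two problems above: maximization with "less than or equal to" constraints and non-negative right-hand sides. The function name simplex_max is hypothetical, exact fractions are used instead of floating point, and no anti-cycling rule is included.

```python
from fractions import Fraction

def simplex_max(c, A, b):
    """Maximise c.x subject to A x <= b, x >= 0, with every b[i] >= 0.

    Returns (optimal value, values of the original variables).
    Raises ValueError if the work column has no positive entry.
    """
    m, n = len(A), len(c)
    # Tableau: one row per constraint (with slack columns), then the objective row.
    T = [[Fraction(v) for v in A[i]]
         + [Fraction(int(i == j)) for j in range(m)]
         + [Fraction(b[i])] for i in range(m)]
    T.append([-Fraction(v) for v in c] + [Fraction(0)] * (m + 1))
    basis = [n + i for i in range(m)]            # slacks start as the basic variables

    while True:
        last = T[-1][:-1]
        col = min(range(n + m), key=lambda j: last[j])    # step 1: most negative entry
        if last[col] >= 0:
            break                                          # step 5: nothing negative left
        rows = [i for i in range(m) if T[i][col] > 0]     # step 2: ratio test
        if not rows:
            raise ValueError("no positive entry in the work column")
        row = min(rows, key=lambda i: T[i][-1] / T[i][col])
        pivot = T[row][col]                                # step 3: row operations
        T[row] = [v / pivot for v in T[row]]
        for i in range(m + 1):
            if i != row and T[i][col] != 0:
                factor = T[i][col]
                T[i] = [v - factor * p for v, p in zip(T[i], T[row])]
        basis[row] = col                                   # step 4: incoming replaces outgoing

    x = [Fraction(0)] * n                                  # step 6: read off the solution
    for i, var in enumerate(basis):
        if var < n:
            x[var] = T[i][-1]
    return T[-1][-1], x

print(simplex_max([1, 9, 1], [[1, 2, 3], [3, 2, 2]], [9, 15]))
```

The call shown corresponds to Problem 1; Problem 2 can be run as simplex_max([2, 4, 1, 1], [[1, 3, 0, 1], [2, 1, 0, 0], [0, 1, 4, 1]], [4, 3, 3]).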

Master of Computer Application (MCA) Semester 4 MC0080 Analysis and Design of Algorithms 4 Credits
Assignment Set 1 (40 Marks)

1. Briefly explain the concept of Dijkstra's Algorithm.

Ans:

Directed Graph

So far we have discussed applications of the Greedy technique to problems involving undirected graphs, in which each edge (a, b) from a to b is equally an edge from b to a. In other words, the two representations (a, b) and (b, a) are for the same edge. Undirected graphs represent symmetrical relations. For example, the relation of brother between the male members of, say, a city is symmetric. However, in the same set, the relation of father is not symmetric. Thus a general relation may be symmetric or asymmetric. A general relation is represented by a directed graph, in which the (directed) edge, also called an arc, (a, b) denotes an edge from a to b. However, the directed edge (a, b) is not the same as the directed edge (b, a). In the context of directed graphs, (b, a) denotes the edge from b to a. Next, we formally define a directed graph and then solve some problems involving directed graphs using the Greedy technique.

Actually, the notation (a, b) in mathematics is used for the ordered pair of the two elements a and b, in which a comes first and b follows. The ordered pair (b, a) denotes a different ordered set in which b comes first and a follows. However, we have misused the notation in the sense that we used (a, b) to denote an unordered set of two elements, i.e., a set in which the order of occurrence of a and b does not matter. In mathematics the usual notation for an unordered set is {a, b}. In this section, we use parentheses, i.e., "(" and ")", to denote ordered sets and braces, i.e., "{" and "}", to denote general (i.e., unordered) sets.

Definition: A directed graph or digraph is a pair G = (V(G), E(G)), where V(G) denotes the set of vertices of G and E(G) the set of directed edges, also called arcs, of G. An arc from a to b is denoted as (a, b). Graphically it is drawn as an arrow from a to b, in which the arrow indicates the direction. In this case, the vertex a is sometimes called the tail and the vertex b is called the head of the arc or directed edge.

Definition: A Weighted Directed Graph is a directed graph in which each arc has an assigned weight. A weighted directed graph may be denoted as G = (V(G), E(G)), where any element of E(G) may be of the form (a, b, w), where w denotes the weight of the arc (a, b).

The directed graph G = ((a, b, c, d, e), ((b, a, 3), (b, d, 2), (a, d, 7), (c, b, 4), (c, d, 5), (d, e, 4), (e, c, 6))) is diagrammatically represented as follows:

Figure 7.7.1 Single-Source Shortest Path
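The weighted digraph G given above can also be written down as an adjacency mapping, and the single-source shortest-path procedure that this section states in pseudocode can then be run on it. The following is a minimal Python sketch under two assumptions that are not in the original: the source is taken to be vertex c, and the names ARCS, adjacency and single_source_dijkstra are hypothetical. It selects the nearest remaining vertex by a plain linear scan, as the pseudocode does, rather than with a priority queue.

```python
import math

# Graph G from the text: each arc is (tail, head, weight).
ARCS = [("b", "a", 3), ("b", "d", 2), ("a", "d", 7), ("c", "b", 4),
        ("c", "d", 5), ("d", "e", 4), ("e", "c", 6)]

def adjacency(arcs):
    """Map each vertex to {head: weight} for its outgoing arcs."""
    graph = {}
    for tail, head, weight in arcs:
        graph.setdefault(tail, {})[head] = weight
        graph.setdefault(head, {})      # make sure every vertex has an entry
    return graph

def single_source_dijkstra(graph, source):
    """Return D, the minimum distance of every vertex from source."""
    D = {v: math.inf for v in graph}    # unreached vertices are at distance infinity
    D[source] = 0
    remaining = set(graph)              # vertices whose final distance is not yet fixed
    while remaining:
        v = min(remaining, key=lambda u: D[u])   # nearest remaining vertex
        remaining.remove(v)
        for x, w in graph[v].items():
            if x in remaining:
                D[x] = min(D[x], D[v] + w)
    return D

G = adjacency(ARCS)
print(single_source_dijkstra(G, "c"))
```

All weights must be positive for the greedy choice to be valid, as the section requires.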

Next, we consider the problem of finding the shortest distances of each of the vertices of a given weighted connected graph from some fixed vertex of the given graph. All the weights between pairs of vertices are taken to be positive numbers only. The fixed vertex is called the source. The problem is known as the Single-Source Shortest Path Problem (SSSPP). One of the well-known algorithms for SSSPP is due to Dijkstra. The algorithm proceeds iteratively, first considering the vertex nearest to the source. Then the algorithm considers the next nearest vertex to the source, and so on. Except for the first vertex and the source, the distances of all vertices are iteratively adjusted, taking into consideration the new minimum distances of the vertices considered earlier. If a vertex is not connected to the source by an edge, then it is considered to have distance ∞ from the source.

Algorithm Single-source-Dijkstra (V, E, s)
// The inputs to the algorithm consist of the set of vertices V, the set of edges E, and s,
// the selected vertex, which is to serve as the source. Further, weights w(i, j) between
// every pair of vertices i and j are given. The algorithm finds and returns dv, the
// minimum distance of each vertex v in V from s. An array D of the size of the
// number of vertices in the graph is used to store the distances of the various vertices
// from the source. Initially the distance of the source from itself is taken as 0
// and the distance D(v) of any other vertex v is taken as ∞.
// Iteratively, distances of other vertices are modified taking into consideration the
// minimum distances of the various nodes from the node with the most recently modified
// distance.
D(s) ← 0
For each vertex v ≠ s do
    D(v) ← ∞
// Let Set-Remaining-Nodes be the set of all those nodes for which the final minimum
// distance is yet to be determined. Initially
Set-Remaining-Nodes ← V
while (Set-Remaining-Nodes ≠ ∅) do
begin
    choose v ∈ Set-Remaining-Nodes such that D(v) is minimum
    Set-Remaining-Nodes ← Set-Remaining-Nodes − {v}
    For each node x ∈ Set-Remaining-Nodes such that w(v, x) ≠ ∞ do
        D(x) ← min {D(x), D(v) + w(v, x)}
end

2. Describe the following with suitable examples for each:

o Binary Search Trees

Ans: We know that for binary search trees and red-black trees, any satellite information associated with a key is stored in the same node as the key. In practice, one might actually store with each key just a pointer to another disk page containing the satellite information for that key. The pseudocode in this chapter implicitly assumes that the satellite information associated with a key, or the pointer to such satellite information, travels with the key whenever the key is moved from node to node. A common variant of a B tree, known as a B+ tree, stores all the satellite information in the leaves and stores only keys and child pointers in the internal nodes, thus maximizing the branching factor of the internal nodes.

Objectives
At the end of this unit the student should be able to:
Find the height of a B-tree.
Recognize a Fibonacci Heap.

Properties of B Trees
A B tree T is a rooted tree (whose root is root[T]) having the following properties:
1. Every node x has the following fields:
   a. n[x], the number of keys currently stored in node x,
   b. the n[x] keys themselves, stored in nondecreasing order, so that key1[x] ≤ key2[x] ≤ ... ≤ keyn[x][x],
   c. leaf[x], a Boolean value that is TRUE if x is a leaf and FALSE if x is an internal node.
2. Each internal node x also contains n[x] + 1 pointers c1[x], c2[x], ..., cn[x]+1[x] to its children. Leaf nodes have no children, so their ci fields are undefined.
3. The keys keyi[x] separate the ranges of keys stored in each subtree: if ki is any key stored in the subtree with root ci[x], then k1 ≤ key1[x] ≤ k2 ≤ key2[x] ≤ ... ≤ keyn[x][x] ≤ kn[x]+1.
4. All leaves have the same depth, which is the tree's height h.
5. There are lower and upper bounds on the number of keys a node can contain. These bounds can be expressed in terms of a fixed integer t ≥ 2 called the minimum degree of the B Tree:
   a. Every node other than the root must have at least t − 1 keys. Every internal node other than the root thus has at least t children. If the tree is nonempty, the root must have at least one key.
   b. Every node can contain at most 2t − 1 keys. Therefore, an internal node can have at most 2t children. We say that a node is full if it contains exactly 2t − 1 keys.

The simplest B tree occurs when t = 2. Every internal node then has either 2, 3, or 4 children, and we have a 2-3-4 tree. In practice, however, much larger values of t are typically used.
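The bounds in property 5 are easy to tabulate for a given minimum degree. The short Python sketch below does so and also checks the key-ordering and key-count conditions (properties 1b and 5) for the keys of a single node. The function names btree_limits and node_is_valid are hypothetical, and a node is represented simply by its list of keys, which is an assumption made for illustration.

```python
def btree_limits(t):
    """Key and child limits for a non-root node of a B tree of minimum degree t."""
    if t < 2:
        raise ValueError("minimum degree t must be at least 2")
    return {
        "min_keys": t - 1,
        "max_keys": 2 * t - 1,
        "min_children": t,        # applies to internal nodes only
        "max_children": 2 * t,    # applies to internal nodes only
    }

def node_is_valid(keys, t, is_root=False):
    """Check nondecreasing key order and the key-count bounds for one node."""
    low = 1 if is_root else t - 1          # a nonempty root needs at least one key
    in_order = all(a <= b for a, b in zip(keys, keys[1:]))
    return in_order and low <= len(keys) <= 2 * t - 1

print(btree_limits(2))
```

For t = 2 the limits give between 2 and 4 children per internal node, which is the 2-3-4 tree mentioned above.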

o Red Black Trees

Properties: A red-black tree is a binary search tree in which:
The root is colored black.
All the paths from the root to the leaves agree on the number of black nodes.
No path from the root to a leaf may contain two consecutive nodes colored red.
Empty subtrees of a node are treated as subtrees with roots of black color.

The relation n > 2^(h/2) − 1, where n is the number of nodes and h the height, implies the bound h < 2 log2(n + 1).

3. Define and explain a context free grammar.

Ans: Earlier in the discussion of grammars we saw context-free grammars. They are grammars whose productions have the form X -> α, where X is a nonterminal and α is a nonempty string of terminals and nonterminals. The set of strings generated by a context-free grammar is called a context-free language, and context-free languages can describe many practically important systems. Most programming languages can be approximated by context-free grammars, and compilers for them have been developed based on properties of context-free languages. Let us define context-free grammars and context-free languages here.

Definition (Context-Free Grammar): A 4-tuple G = < V, Σ, S, P > is a context-free grammar (CFG) if V and Σ are finite sets sharing no elements between them, S ∈ V is the start symbol, and P is a finite set of productions of the form X -> α, where X ∈ V and α ∈ (V ∪ Σ)*. A language is a context-free language (CFL) if all of its strings are generated by a context-free grammar.

Example 1: L1 = { a^n b^n | n is a positive integer } is a context-free language. The following context-free grammar G1 = < V1, Σ, S, P1 > generates L1: V1 = { S }, Σ = { a, b } and P1 = { S -> aSb, S -> ab }.

Example 2: L2 = { w w^r | w ∈ {a, b}+ } is a context-free language, where w is a non-empty string and w^r denotes the reversal of string w, that is, w is spelled backward to obtain w^r. The following context-free grammar G2 = < V2, Σ, S, P2 > generates L2: V2 = { S }, Σ = { a, b } and P2 = { S -> aSa, S -> bSb, S -> aa, S -> bb }.

Example 3: Let L3 be the set of algebraic expressions involving identifiers x and y, operations + and *, and left and right parentheses. Then L3 is a context-free language. The following context-free grammar G3 = < V3, Σ3, S, P3 > generates L3: V3 = { S }, Σ3 = { x, y, (, ), +, * } and P3 = { S -> ( S + S ), S -> S*S, S -> x, S -> y }.

Example 4: Portions of the syntaxes of programming languages can be described by context-free grammars. For example:
{ < statement > -> < if-statement >,
  < statement > -> < for-statement >,
  < statement > -> < assignment >, ...,
  < if-statement > -> if ( < expression > ) < statement >,
  < for-statement > -> for ( < expression > ; < expression > ; < expression > ) < statement >, ...,
  < expression > -> < algebraic-expression >,
  < expression > -> < logical-expression >, ... }.
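The grammars of Examples 1 and 2 are small enough to be explored mechanically. The following Python sketch enumerates every terminal string up to a given length by repeatedly expanding the leftmost nonterminal. It is an illustration only: the function name generate and the dictionary encoding of productions are assumptions, nonterminals are taken to be single characters that appear as dictionary keys, and the length-based pruning relies on no production having an empty right-hand side.

```python
from collections import deque

def generate(productions, start, max_len):
    """All terminal strings of length <= max_len derivable from start.

    productions maps a nonterminal (a single character) to a list of
    right-hand sides; every other character is treated as a terminal.
    Assumes no right-hand side is empty, so sentential forms never shrink.
    """
    found, seen, queue = set(), {start}, deque([start])
    while queue:
        form = queue.popleft()
        idx = next((i for i, ch in enumerate(form) if ch in productions), None)
        if idx is None:                       # no nonterminal left: a generated string
            found.add(form)
            continue
        for rhs in productions[form[idx]]:    # expand the leftmost nonterminal
            new = form[:idx] + rhs + form[idx + 1:]
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return found

G1 = {"S": ["aSb", "ab"]}                     # Example 1
G2 = {"S": ["aSa", "bSb", "aa", "bb"]}        # Example 2

print(sorted(generate(G1, "S", 6)))
print(sorted(generate(G2, "S", 4)))
```

Every string produced from G1 has the form a^n b^n, and every string produced from G2 is some w followed by its reversal, in line with the two examples.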

4. Explain in your own words the concept of Turing machines.

Ans: There are a number of versions of a TM. We consider below the Halt State version of the formal definition of a TM.

Definition: Turing Machine (Halt State Version)
A Turing Machine is a sixtuple of the form (Q, Σ, Γ, δ, q0, h), where
(i) Q is the finite set of states,
(ii) Σ is the finite set of non-blank information symbols,
(iii) Γ is the set of tape symbols, including the blank symbol,
(iv) δ is the next-move partial function from Q × Γ to Q × Γ × {L, R, N}, where L denotes that the tape Head moves to the left adjacent cell, R denotes that the tape Head moves to the right adjacent cell and N denotes that the Head does not move, i.e., continues scanning the same cell. In other words, for qi ∈ Q and ak ∈ Γ, there exists (not necessarily always, because δ is a partial function) some qj ∈ Q and some al ∈ Γ such that δ(qi, ak) = (qj, al, x), where x may assume any one of the values L, R and N. The meaning of δ(qi, ak) = (qj, al, x) is that if qi is the current state of the TM and ak is the symbol in the cell currently under the Head, then the TM writes al in the cell currently under the Head, enters the state qj, and the Head moves to the right adjacent cell if the value of x is R, moves to the left adjacent cell if the value of x is L, and continues scanning the same cell if the value of x is N.
(v) q0 ∈ Q is the initial / start state.
(vi) h ∈ Q is the Halt State, in which the machine stops any further activity.

5. Describe Matrix Chain Multiplication Algorithm using Dynamic Programming.

Ans: It can be seen that if one arrangement is optimal for A1A2...An, then it will be optimal for any pairings of (A1...Ak) and (Ak+1...An). For, if there were a better pairing for, say, A1A2...Ak, then we could replace the pairing of A1A2...Ak in A1A2...Ak Ak+1...An by the better one to get a pairing better than the initially assumed optimal pairing, leading to a contradiction. Hence the principle of optimality is satisfied. Thus, the Dynamic Programming technique can be applied to the problem, and is discussed below.

Let us first define the problem. Let Ai, 1 ≤ i ≤ n, be a d[i−1] × d[i] matrix. Let the vector d[0..n] store the dimensions of the matrices, where the dimension of Ai is d[i−1] × d[i] for i = 1, 2, ..., n. By definition, any subsequence Aj...Ak of A1A2...An for 1 ≤ j ≤ k ≤ n is a well-defined product of matrices. Let us consider a table m[1..n, 1..n] in which the entries m[i, j] for 1 ≤ i ≤ j ≤ n represent the optimal (i.e., minimum) number of operations required to compute the product matrix (Ai...Aj).

We fill up the table diagonal-wise, i.e., in one iteration we fill up one diagonal m[i, i + s] of the table at a time, for some constant s ≥ 0. Initially we consider the biggest diagonal m[i, i], for which s = 0. Then next the diagonal m[i, i + s] for s = 1, and so on.

First, filling up the entries m[i, i], i = 1, 2, ..., n: m[i, i] stands for the minimum number of scalar multiplications required to compute the product of the single matrix Ai. But the number of scalar multiplications required is zero. Hence, m[i, i] = 0 for i = 1, 2, ..., n.

Filling up the entries m[i, i + 1] for i = 1, 2, ..., (n − 1): m[i, i + 1] denotes the minimum number of scalar multiplications required to find the product Ai Ai+1. Ai is a d[i−1] × d[i] matrix and Ai+1 is a d[i] × d[i+1] matrix. Hence, there is a unique number of scalar multiplications for computing Ai Ai+1, giving
m[i, i + 1] = d[i−1] d[i] d[i+1] for i = 1, 2, ..., (n − 1).

The above case is also subsumed by the general case m[i, i + s] for s ≥ 1. For the expression Ai Ai+1...Ai+s, let us consider the top-level pairing (Ai Ai+1...Aj) (Aj+1...Ai+s) for some j with i ≤ j < i + s. Assuming the optimal numbers of scalar multiplications, viz., m[i, j] and m[j + 1, i + s], are already known, we can say that
m[i, i + s] = min over i ≤ j < i + s of (m[i, j] + m[j + 1, i + s] + d[i−1] d[j] d[i+s]) for i = 1, 2, ..., n − s,
where the term d[i−1] d[j] d[i+s] represents the number of scalar multiplications required to multiply the resultant matrices (Ai...Aj) and (Aj+1...Ai+s).

Summing up the discussion, we come to the definition of m[i, i + s] for s = 1, 2, ..., (n − 1) as
m[i, i + s] = min over i ≤ j < i + s of (m[i, j] + m[j + 1, i + s] + d[i−1] d[j] d[i+s]) for i = 1, 2, ..., (n − s).
Then m[1, n] is the final answer.

Let us illustrate the algorithm to compute m[i, i + s] discussed above through an example. Let the given matrices be A1 (14 × 6), A2 (6 × 90), A3 (90 × 4) and A4 (4 × 35). Thus the vector of dimensions d[0..4] is given by [14, 6, 90, 4, 35].
For s = 0, we know m[i, i] = 0, so the main diagonal of the table is zero.
Next, for s = 1, consider the entries m[i, i + 1] = d[i−1] d[i] d[i+1].
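The diagonal-by-diagonal filling of the table can be written compactly in code. The following minimal Python sketch follows the recurrence for m[i, i + s] described in this answer and runs it on the dimension vector [14, 6, 90, 4, 35] of the example; the function name matrix_chain_order is hypothetical, and index 0 of the table is simply left unused so that the indices match the 1-based notation of the text.

```python
def matrix_chain_order(d):
    """Minimum scalar multiplications needed to compute A1 A2 ... An.

    Ai has dimension d[i-1] x d[i], so len(d) == n + 1.
    Returns the table m, with m[i][j] meaningful for 1 <= i <= j <= n.
    """
    n = len(d) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]    # diagonal s = 0: m[i][i] = 0
    for s in range(1, n):                         # fill the diagonal m[i][i + s]
        for i in range(1, n - s + 1):
            m[i][i + s] = min(
                m[i][j] + m[j + 1][i + s] + d[i - 1] * d[j] * d[i + s]
                for j in range(i, i + s)          # top-level split after Aj
            )
    return m

m = matrix_chain_order([14, 6, 90, 4, 35])
print(m[1][2], m[2][3], m[3][4])   # the s = 1 diagonal
print(m[1][4])                     # minimum cost for the whole chain
```

The table has O(n^2) entries and each entry takes the minimum over at most n − 1 splits, so the running time of this sketch is O(n^3).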

6. Show that the clique problem is an NP-complete problem.

Ans: Proof: The verification of whether every pair of vertices is connected by an edge in E is done for different pairs of vertices by a non-deterministic TM, i.e., in parallel. Hence, it takes only polynomial time, because for n vertices we need to verify at most n(n − 1)/2 edges, the maximum number of edges in a graph with n vertices.

We next show that the 3-CNF-SAT problem can be transformed to the clique problem in polynomial time. Take an instance of 3-CNF-SAT. An instance of 3-CNF-SAT consists of a set of n clauses, each consisting of exactly 3 literals, each literal being either a variable or a negated variable. It is satisfiable if we can choose literals in such a way that:
At least one literal from each clause is chosen.
If a literal of the form x is chosen, no literal of the form ¬x is chosen.

For each of the literals, create a graph node, and connect each node to every node in the other clauses, except those with the same variable but different sign. This graph can easily be computed from a Boolean formula in 3-CNF in polynomial time.

Consider an example: for a three-clause formula φ over the variables x1, x2, x3, let G be the graph constructed as above. In the given example, a satisfying assignment of φ is (x1 = 0, x2 = 0, x3 = 1). A corresponding clique of size k = 3 consists of the vertices corresponding to ¬x2 from the first clause, x3 from the second clause, and x3 from the third clause.

The problem of finding an n-element clique is equivalent to finding a set of literals satisfying SAT. Because there are no edges between literals of the same clause, such a clique must contain exactly one literal from each clause. And because there are no edges between literals of the same variable but different sign, if the node of literal x is in the clique, no node of a literal of the form ¬x is. This proves that finding an n-element clique in a 3n-element graph is NP-complete.
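The construction used in the reduction can be tried out on a small instance. The Python sketch below builds the graph from a list of clauses and searches, by brute force, for a clique with one node per clause. The example formula PHI is an assumption: the formula of the original example is not shown in the text, so one was chosen that is consistent with the assignment and the clique quoted above. The names reduction_graph and find_clause_clique are hypothetical, and the search is exponential, which is acceptable only because the instance is tiny.

```python
from itertools import combinations, product

def reduction_graph(clauses):
    """Build the graph of the reduction from a list of clauses.

    A literal is a non-zero int: +i stands for x_i, -i for its negation.
    Nodes are (clause index, position); two nodes are joined when they lie
    in different clauses and their literals are not negations of each other.
    """
    nodes = [(c, p) for c, clause in enumerate(clauses) for p in range(len(clause))]
    edges = set()
    for (c1, p1), (c2, p2) in combinations(nodes, 2):
        if c1 != c2 and clauses[c1][p1] != -clauses[c2][p2]:
            edges.add(((c1, p1), (c2, p2)))
    return nodes, edges

def find_clause_clique(clauses):
    """Return the literals of a clique with one node per clause, or None."""
    _, edges = reduction_graph(clauses)
    for choice in product(*[range(len(cl)) for cl in clauses]):
        picked = list(enumerate(choice))          # one (clause, position) per clause
        if all(pair in edges for pair in combinations(picked, 2)):
            return [clauses[c][p] for c, p in picked]
    return None

# Assumed example: (x1 or not x2 or not x3) and (not x1 or x2 or x3) and (x1 or x2 or x3)
PHI = [(1, -2, -3), (-1, 2, 3), (1, 2, 3)]
print(find_clause_clique(PHI))
```

Setting every literal of a returned clique to true gives a (possibly partial) satisfying assignment, which is the direction of the equivalence argued in the proof.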

Master of Computer Application (MCA) Semester 4 MC0080 Analysis and Design of Algorithms 4 Credits

Assignment Set 2

1. Explain the concept of Recursion in algorithms Ans 1 :


A Recursive Algorithm is an algorithm which calls itself with "smaller (or simpler)" input values, and which obtains the result for the current input by applying simple operations to the returned value for the smaller (or simpler) input. More generally, if a problem can be solved utilizing solutions to smaller versions of the same problem, and the smaller versions reduce to easily solvable cases, then one can use a recursive algorithm to solve that problem. For example, the elements of a recursively defined set, or the value of a recursively defined function, can be obtained by a recursive algorithm.

If a set or a function is defined recursively, then a recursive algorithm to compute its members or values mirrors the definition. The initial steps of the recursive algorithm correspond to the basis clause of the recursive definition and they identify the basic elements. They are then followed by steps corresponding to the inductive clause, which reduce the computation for an element of one generation to that of elements of the immediately preceding generation.

In general, recursive computer programs require more memory and computation compared with iterative algorithms, but they are simpler and in many cases a natural way of thinking about the problem.

Example 1: Algorithm for finding the k-th even natural number
Note that this can be solved very easily by simply outputting 2*(k - 1) for a given k. The purpose here, however, is to illustrate the basic idea of recursion rather than to solve the problem.

Algorithm 1: Even(positive integer k)
Input: k, a positive integer
Output: k-th even natural number (the first even being 0)
Algorithm:
if k = 1, then return 0;
else return Even(k-1) + 2.

Here the computation of Even(k) is reduced to that of Even for a smaller input value, that is Even(k-1). Even(k) eventually becomes Even(1), which is 0 by the first line. For example, to compute Even(3), Algorithm Even(k) is called with k = 2. In the computation of Even(2), Algorithm Even(k) is called with k = 1. Since Even(1) = 0, 0 is returned for the computation of Even(2), and Even(2) = Even(1) + 2 = 2 is obtained. This value 2 for Even(2) is now returned to the computation of Even(3), and Even(3) = Even(2) + 2 = 4 is obtained.

As can be seen by comparing this algorithm with the recursive definition of the set of nonnegative even numbers, the first line of the algorithm corresponds to the basis clause of the definition, and the second line corresponds to the inductive clause.

By way of comparison, let us see how the same problem can be solved by an iterative algorithm.

Algorithm 1-a: Even(positive integer k)
Input: k, a positive integer
Output: k-th even natural number (the first even being 0)
Algorithm:
int i, even;
i := 1;
even := 0;
while( i < k ) {
    even := even + 2;
    i := i + 1;
}
return even.

2. What do you mean by Pushdown Automata?

Ans 2: A PDA is formally defined as a 7-tuple M = (Q, Σ, Γ, δ, q0, Z, F), where

Q is a finite set of states,
Σ is a finite set which is called the input alphabet,
Γ is a finite set which is called the stack alphabet,
δ is a finite subset of Q × (Σ ∪ {ε}) × Γ × Q × Γ*, the transition relation, where Γ* means "a finite (maybe empty) list of elements of Γ" and ε denotes the empty string,
q0 ∈ Q is the start state,
Z ∈ Γ is the initial stack symbol,
F ⊆ Q is the set of accepting states.

An element (p, a, A, q, α) ∈ δ is a transition of M. It has the intended meaning that M, in state p ∈ Q, with a ∈ Σ ∪ {ε} on the input and with A ∈ Γ as topmost stack symbol, may read a, change the state to q, and pop A, replacing it by pushing α ∈ Γ*. The letter ε (epsilon) denotes the empty string, and the Σ ∪ {ε} component of the transition relation is used to formalize that the PDA can either read a letter from the input, or proceed leaving the input untouched.

In many texts the transition relation is replaced by an (equivalent) formalization, where δ is the transition function, mapping Q × (Σ ∪ {ε}) × Γ into finite subsets of Q × Γ*. Here δ(p, a, A) contains all possible actions in state p with A on the stack, while reading a on the input. One writes (q, α) ∈ δ(p, a, A) for the function precisely when (p, a, A, q, α) ∈ δ for the relation. Note that "finite" in this definition is essential.

Example
The following is the formal description of the PDA which recognizes the language {0^n 1^n | n ≥ 0} by final state:

PDA for {0^n 1^n | n ≥ 0} (by final state)

M = (Q, Σ, Γ, δ, p, Z, F), where

Q = {p, q, r}
Σ = {0, 1}
Γ = {A, Z}
F = {r}
δ consists of the following six instructions: (p, 0, Z, p, AZ), (p, 0, A, p, AA), (p, ε, Z, q, Z), (p, ε, A, q, A), (q, 1, A, q, ε), and (q, ε, Z, r, Z).

In words, in state p, for each symbol 0 read, one A is pushed onto the stack. Pushing symbol A on top of another A is formalized as replacing the top A by AA. In state q, for each symbol 1 read, one A is popped. At any moment the automaton may move from state p to state q, while it may move from state q to the accepting state r only when the stack consists of a single Z. There seems to be no generally used representation for PDAs. Here we have depicted the instruction (p, a, A, q, α) by an edge from state p to state q labelled by a; A/α (read a; replace A by α).
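The six instructions of the example can be executed directly. The Python sketch below simulates the PDA by exploring all nondeterministic choices breadth-first and accepting by final state. It is a minimal sketch under stated assumptions: the empty string "" stands for ε, the stack is kept as a string whose first character is the top symbol, and the names DELTA and accepts are hypothetical. The search terminates for this automaton; a PDA whose ε-moves can grow the stack without bound would need an extra guard.

```python
from collections import deque

# The six instructions: (state, input symbol or "" for epsilon,
# stack top, new state, string pushed in place of the top symbol).
DELTA = [
    ("p", "0", "Z", "p", "AZ"), ("p", "0", "A", "p", "AA"),
    ("p", "", "Z", "q", "Z"), ("p", "", "A", "q", "A"),
    ("q", "1", "A", "q", ""), ("q", "", "Z", "r", "Z"),
]

def accepts(word, delta=DELTA, start="p", bottom="Z", final=("r",)):
    """Acceptance by final state, trying every nondeterministic choice.

    A configuration is (state, input position, stack).
    """
    first = (start, 0, bottom)
    seen, queue = {first}, deque([first])
    while queue:
        state, pos, stack = queue.popleft()
        if state in final and pos == len(word):
            return True
        if not stack:                      # nothing on the stack: no move possible
            continue
        for p, a, top, q, push in delta:
            if p != state or top != stack[0]:
                continue
            if a and (pos >= len(word) or word[pos] != a):
                continue                   # the instruction reads a symbol that is not there
            nxt = (q, pos + len(a), push + stack[1:])
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

print([w for w in ["", "01", "0011", "001", "10"] if accepts(w)])
```

The move from p to q is an ε-move, so the simulator in effect guesses the point at which the 0s end, which is the nondeterminism described in words above.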

3. Explain the concept of undecidable problems for context free languages.

Ans 3: The following problems are undecidable:

Does a given Turing machine M halt on all inputs?
Does Turing machine M halt for any input?
Do two Turing machines M1 and M2 accept the same language?
Is the language L(M) finite?
Does L(M) contain any two strings of the same length?
Does L(M) contain a string of length k, for some given k?

If G is an unrestricted grammar:
Does L(G) = ∅?
Is L(G) infinite?

If G is a context sensitive grammar:
Does L(G) = ∅?
Is L(G) infinite?

If L1 and L2 are any context free languages over Σ:
Does L1 ∩ L2 = ∅?
Does L1 = L2?
Does L1 ⊆ L2?

If L is a recursively enumerable language over Σ:
Is L empty?
Is L finite?

4. If L1 and L2 are context-free languages, then L1 ∪ L2 is a context-free language. Ans 4:
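The standard argument for this closure property takes grammars for L1 and L2 with disjoint sets of nonterminals and adds a fresh start symbol S with the two productions S -> S1 and S -> S2, where S1 and S2 are the original start symbols. The Python sketch below illustrates that construction on two small grammars and compares the generated strings up to a length bound. It is a bounded check and not a proof; the names union_grammar and language are hypothetical, nonterminals are assumed to be single characters used as dictionary keys, and no production has an empty right-hand side.

```python
from collections import deque

def union_grammar(p1, start1, p2, start2, new_start="S"):
    """Productions for L1 union L2, given grammars with disjoint nonterminals."""
    if set(p1) & set(p2) or new_start in p1 or new_start in p2:
        raise ValueError("nonterminals must be disjoint and new_start unused")
    combined = {new_start: [start1, start2]}   # S -> S1 | S2
    combined.update(p1)
    combined.update(p2)
    return combined

def language(productions, start, max_len):
    """Terminal strings of length <= max_len derivable from start."""
    found, seen, queue = set(), {start}, deque([start])
    while queue:
        form = queue.popleft()
        idx = next((i for i, ch in enumerate(form) if ch in productions), None)
        if idx is None:
            found.add(form)
            continue
        for rhs in productions[form[idx]]:     # expand the leftmost nonterminal
            new = form[:idx] + rhs + form[idx + 1:]
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return found

P1 = {"A": ["aAb", "ab"]}                      # a^n b^n, n >= 1
P2 = {"B": ["aBa", "bBb", "aa", "bb"]}         # w followed by its reversal
U = union_grammar(P1, "A", P2, "B")
print(sorted(language(U, "S", 4)))
```

Because the nonterminal sets are disjoint, a derivation that starts with S -> A can only ever use productions of the first grammar, and likewise for S -> B, which is why the combined grammar generates exactly the union.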

5. Explain Prim's Algorithm.

Prim's Algorithm

The algorithm due to Prim builds up a minimum spanning tree by adding edges to form a sequence of expanding subtrees. The sequence of subtrees is represented by the pair (VT, ET), where VT and ET respectively represent the set of vertices and the set of edges of a subtree in the sequence. Initially, the subtree in the sequence consists of just a single vertex, which is selected arbitrarily from the set V of vertices of the given graph. The subtree is built up iteratively by adding an edge that has minimum weight among the remaining edges (i.e., the edge is selected greedily) and which, at the same time, does not form a cycle with the earlier selected edges. We illustrate Prim's algorithm through an example before giving a semiformal definition of the algorithm.

Example (of Prim's Algorithm): Let us explain through the following example how Prim's algorithm finds a minimal spanning tree of a given graph. Let us consider the graph on the vertices a, b, c, d, e with the edges ab, ac, ad, dc and de of weights 1, 5, 2, 3 and 1.5 respectively.

Initially
VT = {a}
ET = ∅

In the first iteration, the edge whose weight is the minimum of the weights of the edges having a as one of their vertices is chosen. In this case, the edge ab with weight 1 is chosen out of the edges ab, ac and ad of weights 1, 5 and 2 respectively. Thus, after the first iteration, we have the given graph with the chosen edges in bold and VT and ET as follows:
VT = {a, b}
ET = {(a, b)}

In the next iteration, out of the edges not chosen earlier, not making a cycle with an earlier chosen edge, and having either a or b as one of their vertices, the edge with minimum weight is chosen. In this case the vertex b does not have any other edge incident on it. In such cases, if required, the weight of a non-existent edge may be taken as ∞. Thus the choice is restricted to two edges, viz., ad and ac, of weights 2 and 5 respectively. Hence, in this iteration the edge ad is chosen. After the second iteration, we have the given graph with chosen edges and VT and ET as follows:
VT = {a, b, d}
ET = {(a, b), (a, d)}

In the next iteration, out of the edges not chosen earlier, not making a cycle with earlier chosen edges, and having either a, b or d as one of their vertices, the edge with minimum weight is chosen. Thus the choice is restricted to the edges ac, dc and de with weights 5, 3 and 1.5 respectively. The edge de with weight 1.5 is selected. Hence, after the third iteration we have the given graph with chosen edges and VT and ET as follows:
VT = {a, b, d, e}
ET = {(a, b), (a, d), (d, e)}

In the next iteration, out of the edges not chosen earlier, not making a cycle with earlier chosen edges, and having either a, b, d or e as one of their vertices, the edge with minimum weight is chosen. Thus, the choice is restricted to the edges dc and ac with weights 3 and 5 respectively. Hence the edge dc with weight 3 is chosen. Thus, after the fourth iteration, we have the given graph with chosen edges and VT and ET as follows:
VT = {a, b, d, e, c}
ET = {(a, b), (a, d), (d, e), (d, c)}
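The four iterations of the example can be reproduced with a short program. The following Python sketch grows the tree from vertex a exactly as in the walk-through, treating a missing edge as having weight infinity. The edge list is the one read off from the example (ab 1, ac 5, ad 2, dc 3, de 1.5); the names EDGES and spanning_prim are hypothetical, and the candidate edge is found by a plain scan rather than with a priority queue.

```python
import math

# Edges of the example graph with the weights quoted in the walk-through.
EDGES = {("a", "b"): 1, ("a", "c"): 5, ("a", "d"): 2, ("d", "c"): 3, ("d", "e"): 1.5}

def spanning_prim(vertices, edges, start):
    """Return the tree edges chosen by Prim's algorithm, in order of selection."""
    weight = {}
    for (u, v), w in edges.items():        # undirected graph: store both directions
        weight[(u, v)] = weight[(v, u)] = w
    in_tree, tree_edges = {start}, []
    while len(in_tree) < len(vertices):
        # Cheapest edge with one end inside the tree and the other outside.
        best = min(
            ((u, v) for u in in_tree for v in vertices if v not in in_tree),
            key=lambda e: weight.get(e, math.inf),   # non-existent edge = infinity
        )
        if best not in weight:
            raise ValueError("graph is not connected")
        tree_edges.append(best)
        in_tree.add(best[1])
    return tree_edges

tree = spanning_prim(["a", "b", "c", "d", "e"], EDGES, "a")
print(tree, sum(EDGES[e] for e in tree))
```

Adding only edges that leave the current tree is what guarantees that no cycle is formed, so no separate cycle test is needed.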

At this stage, it can be easily seen that each of the vertices, is on some chosen edge and the chosen edges form a tree. Given below is the semiformal definition of Prims Algorithm Algorithm Spanning-Prim (G) // the algorithm constructs a minimum spanning tree // for which the input is a weighted connected graph G = (V, E) // the output is the set of edges, to be denoted by ET, which together constitute a minimum spanning tree of the given graph G // for the pair of vertices that are not adjacent in the graph to each other, can be given // the label indicating infinite distance between the pair of vertices.

// the set of vertices of the required tree is initialized with the vertex v0 VT {v0} ET // initially ET is empty // let n = number of vertices in V For i = 1 to n 1 do find a minimum-weight edge e = (v1, u1) among all the edges such that v1 is in VT and u1 is in V VT. VT VT { u1} ET = ET { e } Return ET 6. Give an algorithm for Greedy Knapsack problem. Analyze your algorithm? Ans 6: There are n items in a store. For i =1,2, . . . , n, item i has weight wi > 0 and worth vi > 0. Thief can carry a maximum weight of W pounds in a knapsack. In this version of a problem the items can be broken into smaller piece, so the thief may decide to carry only a fraction xi of object i, where 0 xi 1. Item i contributes xiwi to the total weight in the knapsack, and xivi to the value of the load. In Symbol, the fraction knapsack problem can be stated as follows. maximize nSi=1 xivi subject to constraint nSi=1 xiwi W It is clear that an optimal solution must fill the knapsack exactly, for otherwise we could add a fraction of one of the remaining objects and increase the value of the load. Thus in an optimal solution nSi=1 xiwi = W. Greedy-fractional-knapsack (w, v, W) FOR i =1 to n do x[i] =0 weight = 0 while weight < W do i = best remaining item IF weight + w[i] W then x[i] = 1 weight = weight + w[i] else x[i] = (w - weight) / w[i] weight = W return x Analysis

If the items are already sorted into decreasing order of $v_i / w_i$, then the while-loop takes time in O(n); therefore, the total time including the sort is in O(n log n). Alternatively, we can keep the items in a heap with the largest $v_i / w_i$ at the root. Creating the heap then takes O(n) time, and each iteration of the while-loop takes O(log n) time (since the heap property must be restored after the removal of the root). Although this data structure does not alter the worst case, it may be faster if only a small number of items are needed to fill the knapsack.

One variant of the 0-1 knapsack problem arises when the order of the items sorted by increasing weight is the same as their order sorted by decreasing value. The optimal solution to this variant is to sort the items by value in decreasing order. First pick the most valuable item, which also has the least weight, provided its weight is no more than the total weight that can be carried, and then reduce the remaining capacity by the weight of the item just picked. The second item to pick is the most valuable among those remaining. Follow the same strategy until the thief cannot carry any more items (because of weight).

Proof
One way to prove the correctness of the above algorithm is to prove the greedy-choice property and the optimal-substructure property. The proof consists of two steps. The first step proves that there exists an optimal solution that begins with the greedy choice given above. The second step proves that if A is an optimal solution to the original problem S, then A - a is also an optimal solution to the problem S - s, where a is the item the thief picked as the greedy choice and S - s is the subproblem that remains after the first greedy choice has been made. The second step is easy to prove, since the more valuable items have less weight. Note that if the item with value $v'$ and weight $w'$ is not in the solution, it can replace any other item $(v, w)$ in it because $w' < w$, and the replacement increases the value because $v' > v$.

Theorem
The fractional knapsack problem has the greedy-choice property.

Proof
Let the ratio $v'/w'$ be maximal. This supposition implies that $v'/w' \ge v/w$ for any pair $(v, w)$, so $v' w / w' \ge v$ for any $(v, w)$; that is, a weight w of the best-ratio item is worth at least as much as the item $(v, w)$ itself. Now suppose a solution does not contain the full weight $w'$ of the item with the best ratio. Then replacing an amount of any other item with the same weight of the best-ratio item does not decrease the value, and improves it whenever the ratio is strictly larger.
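
The Greedy-fractional-knapsack procedure and its O(n log n) analysis can be illustrated with a short Python sketch. It is a minimal version under stated assumptions: the function name and the sample data are hypothetical, items are sorted once by value per unit weight (the sorted variant of the analysis, not the heap variant), and the loop simply stops if the items run out before the knapsack is full.

```python
def greedy_fractional_knapsack(weights, values, capacity):
    """Return (fractions, total_value) for the fractional knapsack problem.

    fractions[i] is the fraction x_i of item i that is taken, 0 <= x_i <= 1.
    All weights and values are assumed to be positive.
    """
    n = len(weights)
    fractions = [0.0] * n
    # "Best remaining item" = largest value per unit weight; sorting is O(n log n).
    order = sorted(range(n), key=lambda i: values[i] / weights[i], reverse=True)
    remaining = capacity
    total_value = 0.0
    for i in order:  # single O(n) pass over the sorted items
        if remaining <= 0:
            break
        if weights[i] <= remaining:
            # The whole item fits.
            fractions[i] = 1.0
            remaining -= weights[i]
            total_value += values[i]
        else:
            # Take only the fraction that fills the knapsack exactly.
            fractions[i] = remaining / weights[i]
            total_value += fractions[i] * values[i]
            remaining = 0
    return fractions, total_value
```

For example, with the made-up data weights [10, 20, 30], values [60, 100, 120] and capacity 50, the sketch takes the first two items whole and two thirds of the third, for a total value of 240 (up to floating-point rounding).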