
Michael James Which University FYP

BSc. Computer Science 21st March 2008

CHAPTER 1. INTRODUCTION

I certify that the material contained in this dissertation is my own work and does not contain unreferenced or unacknowledged material. I also warrant that the above statement applies to the implementation of the project and all associated documentation. Regarding the electronically submitted version of this submitted work, I consent to this being stored electronically and copied for assessment purposes, including the Department's use of plagiarism detection systems in order to check the integrity of assessed work. I agree to my dissertation being placed in the public domain, with my name explicitly included as the author of the work.

Date: 20th March 2008 Signed


Abstract
The aim of this project is to develop an application which aids a potential University applicant in choosing a suitable University. The complete system comprises a back-end MySQL database and a front-end PHP web application. The web application uses a fuzzy logic querying approach to return non-discrete results to the user based on a closeness-of-match score calculated at runtime.

This project can be broken down into three main areas: research into current applications which use a querying approach based on fuzzy logic; eliciting the requirements and designing a system to meet the required specification; and a prototype system to demonstrate the completed application in use.


Contents ____________________________________
List of Figures
List of Tables

1. Introduction
   1.1 Overall Aim
   1.2 Motivation
   1.3 Project Aims
   1.4 Report Overview

2. Background Research & Related Work
   2.1 Background Research
       2.1.1 Fuzzy Logic
       2.1.2 Fuzzy Set Theory
       2.1.3 Fuzzy Logic Approaches With MySQL
   2.2 Related Work
       2.2.1 Choosing A House

3. Design
   3.1 Requirements
       3.1.1 High Level Requirements
       3.1.2 Detailed Requirements List
   3.2 Use Cases
   3.3 Design Decisions
       3.2.1 MySQL
       3.2.2 PHP
   3.4 System Architecture
       3.4.1 User Interface - PHP Front End
       3.4.2 Administrator Interface - PHP Front End
       3.4.3 Logic Processor
       3.4.4 MySQL Database
   3.5 Which University Web Interface

4. Implementation
   4.1 Method of Implementation
   4.2 MySQL Database
       4.2.1 Database Insertion
   4.3 Fuzzy Logic Calculations
   4.4 Which University Web Interface
       4.4.1 Welcome Screen
       4.4.2 Administrator Control Interface
       4.4.3 User Data Entry Form
       4.4.4 Displaying of the Results

5. System Operation
   5.1 Usage Scenario
       5.1.1 Welcome Screen
       5.1.2 User Data Entry Form
       5.1.3 Displaying of the Results
       5.1.4 Summary

6. Testing and Evaluation
   6.1 Testing
       6.1.2 Black Box Testing
   6.2 User Interface Evaluation
   6.3 Overall Evaluation

References

Appendix
   Appendix 1 - FYP Proposal

Working Documents
   The Working Documents for this project are available at www.lancs.ac.uk/ug/jamesm2


List of Figures ____________________________________


Fig. 1: Bivalent Sets to characterise the Temp. of a room.
Fig. 2: Fuzzy Sets to characterise the Temp. of a room.
Fig. 3: Graph showing partial goodness of match scores applied to house prices.
Fig. 4: Use Case diagram depicting the Users and Administrators interaction with the system.
Fig. 5: Overview of the Proposed System Architecture.
Fig. 6: Entity relationship diagram for the Which University system.
Fig. 7: Proposed User Data Entry Web Interface.
Fig. 8: Proposed Search Results Interface.
Fig. 9: Which University Database UML Diagram.
Fig. 10: Calculating Overall Feature Weighting.
Fig. 11: Numerical Fuzzy Logic Calculation.
Fig. 12: Textual Fuzzy Logic Calculation.
Fig. 13: Which University Application Welcome Page.
Fig. 14: Which University User Data Entry Form.
Fig. 15: Which University Search Results.
Fig. 16: Sequence Diagram Depicting a User Usage Scenario.
Fig. 17: A Model Of The Software Testing Process.
Fig. 18: Black Box Testing Model.


List of Tables ____________________________________


Table 3.1: Derived Requirements List for High Level Requirement A.
Table 3.2: Derived Requirements List for High Level Requirement B.
Table 3.3: Derived Requirements List for High Level Requirement C.
Table 3.4: Derived Requirements List for High Level Requirement D.
Table 3.5: Derived Requirements List for High Level Requirement E.
Table 3.6: Derived Requirements List for High Level Requirement F.
Table 3.7: Derived Requirements List for High Level Requirement G.
Table 3.8: Derived Requirements List for High Level Requirement H.


Introduction 1 ____________________________________
1.1 Overall Aim
The main aim of this project is to develop a system which would be useful to University applicants when initially considering which Universities to apply to. Such an application should present a user with a variety of University-based questions and allow them to input their choices into a web-based form. The application should make use of fuzzy logic querying techniques when dealing with the data collected from the user. This approach should allow a University profile to be generated from the user's preferences. Non-discrete (fuzzy) results should then be calculated by comparing the University profile to each stored University. The results can then be ranked in descending order of closeness-of-match score, allowing the user to see details of their most suitable University first.

1.2 Motivation
Traditional search techniques, which feature in the majority of existing search applications, tend to follow a rigid feature matching approach. These types of systems produce binary (Yes/No) responses to questions and, as a consequence, may yield very few results or none at all. Systems following this approach can also generate far too many results, and the output tends to be ordered in a way which is of no real assistance to the user. These types of search applications can be limiting and unsophisticated to use, especially where Universities are concerned. This is because some features of a University are inherently imprecise, or fuzzy by nature, such as 'near to shops' or 'large campus'. It is possible to convert these into numbers (distances, sizes) but in reality users do not view these characteristics in such a discrete, scientific manner. This provides the real motivation behind the development of a fuzzy logic based University search application: the system must be able to handle vague and fuzzy feature specifications in a natural way. If this is achieved then it should alleviate the boundaries and problems associated with many conventional rigid feature matching systems.

The lack of a current application that allows potential applicants to state their desired characteristics in a University at an early stage in the selection process provides another strong case for a University search application implementing fuzzy logic queries, as it could fill a possible niche. Many users may not have thought through exactly what they want, or may have difficulty expressing exactly what they are looking for. In addition, their wishes may change over time. Rigid feature matching (discrete) search applications require the user to know exactly what they are looking for before the search is carried out. If a University doesn't meet a feature requirement, it is immediately discarded from the search results. This is yet another example that highlights the need for a fuzzy logic system, as it would allow for scope within a search and would not severely penalise Universities should they not exactly meet the profile that the user was trying to create.

1.3 Project Aims


A system for choosing a suitable University has to take into account quite a variety of factors, including (but not necessarily limited to) the following:
- The system should accommodate a wide variety of different user purposes and needs.
- It should incorporate the vast majority of relevant University features.
- Varying importance of different features should be taken into account.
- Inherent fuzziness of some features should be dealt with naturally.
- The system must lead a user into generating a University profile suitable to them in a way which they find easy to understand and, subsequently, navigate.

It is also necessary to take into account that the importance of University features will vary amongst users. For the majority of applicants, location, University reputation, degree course and available accommodation type may be considered the most important factors. Similarly, some features such as proximity to sports facilities or shops may only be considered important by some of the users. In addition, it is likely that there will also be quite a number of nice-to-have features which will be highly individual to each user. Regarding the features deemed nice-to-have, another aim of the project can be stated as follows:
- The system must handle the importance of features in a natural and fair way so that features deemed nice-to-have will not adversely affect a University's closeness of match score.

In addition, there is an almost unlimited range of features which may affect a decision when an applicant is choosing potential Universities. Therefore, the following aim may also be added:
- The system should incorporate a certain amount of uncommon and indirect University features without potentially discouraging the user with a long series of drawn out questions.

Each of the stored University features may be quite diverse in nature. University league placing is numerical; whether en-suite accommodation or student parking is available is symbolic (Yes/No); names of cities or areas of the country are specific and textual; and features such as graduate job opportunities can be classed as indirect. Therefore, the following is necessary:
- The system should ensure that each different feature category uses a different and appropriate type of logic for representation and profile matching.

In addition, there are several general features which the system must aim to include in order for it to be successful:
- The GUI which the system will use needs to be aesthetically pleasing and intuitive to all categories of potential user (age, computer literacy level etc).
- The database should only ever store relevant details which are required for the system's functionality.
- Passwords should only ever be stored server side.
- The system should be web integrated to allow for the greatest possible coverage. This would also be of benefit when carrying out system testing and evaluation.

1.4 Report Overview


The remaining sections of this report are outlined as follows:

Chapter two provides some background information into relevant areas of research and examines existing work and applications which could aid the design and implementation of the system.

Chapter three offers a breakdown and explanation of the major design decisions made and describes the system architecture, interface designs and query structure. Diagrams of the overall architecture are included to illustrate how the system will integrate. A detailed list of specific system requirements to be met by the completed product is also included in this section.

Chapter four gives an overview of the system implementation and main data structures. Important sections of code are also listed in this section.

Chapter five shows the application in operation and gives a preview of how the system would be used in reality. This is achieved via a walkthrough of the main features of a typical session with the finished system.

Chapter six discusses how the system was evaluated and gives an overview of the testing undertaken. This section also considers how successfully the system meets the specified requirements and contains details of any interesting and important findings indicated by the data.

Chapter seven revisits the objectives specified in this chapter and draws conclusions based on how successfully the initial aims were met. It analyses the overall system and discusses possible revisions to the design/implementation, indicating where further work/development could be carried out.


CHAPTER 2. BACKGROUND RESEARCH & RELATED WORK

Background Research & Related Work 2 ____________________________________


2.1 Background Research
Prior to the design phase of the proposed system, background research was carried out into areas and fields which were considered relevant and potentially beneficial to the functionality of the overall application. At this stage no concrete platforms had been decided on and it was deemed necessary to fully understand the concepts behind how fuzzy logic could be used to process and manipulate data in order for the system to function as intended. Various existing implementations of such logic were also researched which provided not only a theoretical understanding of the arithmetic behind fuzzy logic, but also real-life contexts where similar existing systems have been implemented and developed to make best use of the benefits which fuzzy logic queries can offer.

2.1.1 Fuzzy Logic


Kaehler (2008) explains that the concept of Fuzzy Logic (FL) was conceived by Lotfi Zadeh (1965), a professor at the University of California at Berkeley, and presented not as a control methodology, but as a way of processing data by allowing partial set membership rather than crisp set membership or non-membership. Fuzzy logic derives from fuzzy set theory (see section 2.1.2) and deals with reasoning that is approximate rather than precise. This suggests that it would be well suited to the proposed University search, as partial set membership would prevent Universities from being excluded from the final result set should they not match perfectly against the University profile generated from the applicant's criteria.

Partial set membership, made possible by fuzzy logic, allows set membership values to range (inclusively) between 0 and 1, and makes linguistically imprecise concepts like 'slightly', 'quite' and 'very' usable in place of discrete values such as 'yes' or 'no'. This has a positive implication for the proposed application, as it would allow the system to gauge the importance of a University feature to an applicant using values such as 'very important', 'slightly important' and 'averagely important'. This is ideal when considering applying to a University, as the majority of applicants will not be 100% certain of exactly which University characteristics they require; crisp values such as 'important' and 'not important' may therefore not be entirely suited to the question or do justice to an applicant's true feelings.

A common misconception is that fuzzy logic is less precise than other forms of logic. This need not be the case, as it is simply an organised and mathematical method of naturally handling imprecise concepts. For example, the concept of 'tallness' cannot be expressed in an equation because, although height is a quantity, tallness is not. However, there is general agreement about what it means to be tall, together with an acceptance that there are no specific boundaries, and so the answer is deemed to be fuzzy.

2.1.2 Fuzzy Set Theory


Following on from the idea of partial set membership is the concept that provides the premise of this theory: fuzzy sets are those whose elements have varying degrees of membership. As with Fuzzy Logic, Fuzzy Set Theory was formalised by Zadeh (1965) at the University of California. It was seen as very much a paradigm shift from conventional bivalent set theory, where every proposition takes exactly one of two truth values (true or false). Traditional bivalent set theory can be particularly limiting should we wish to describe certain real world (humanistic) problems mathematically, such as characterising the temperature of a room.

Fig. 1: Bivalent Sets to characterise the Temp. of a room.

As can be seen from the diagram above, the most limiting feature of all bivalent sets is that they are mutually exclusive. It is clearly not accurate enough to define the transition from a quantity such as cool to warm by an increase of one degree Fahrenheit. In reality, it is unlikely that such a sharp transition would be noticed; instead a smooth drift from cool to warm would be recognised as occurring. Fig. 2 shows how Fuzzy Set Theory can be used to describe this natural effect more accurately.

Fig. 2: Fuzzy Sets to characterise the Temp. of a room.
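The contrast between the two kinds of set can be illustrated with a short sketch. The temperature boundaries and the triangular shape of the fuzzy sets used here are assumptions chosen purely for illustration and are not taken from the figures; Python is used only for brevity and is not the language of the proposed system.

```python
def bivalent_label(temp_f):
    """Crisp sets: a temperature belongs to exactly one label (boundaries assumed)."""
    if temp_f < 70:
        return "cool"
    return "warm"


def triangular(x, left, peak, right):
    """Degree of membership (0.0 to 1.0) in a triangular fuzzy set."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical overlapping fuzzy sets, in degrees Fahrenheit: (left, peak, right).
FUZZY_SETS = {"cool": (55, 65, 75), "warm": (65, 75, 85)}


def fuzzy_memberships(temp_f):
    """Return the degree to which a temperature belongs to each fuzzy set."""
    return {name: round(triangular(temp_f, *shape), 2)
            for name, shape in FUZZY_SETS.items()}


if __name__ == "__main__":
    for t in (68, 69, 70, 71):
        print(t, bivalent_label(t), fuzzy_memberships(t))
```

Under the crisp definition a room switches abruptly from cool to warm between 69 and 70 degrees, whereas the fuzzy memberships shift gradually, with a room at 70 degrees belonging partly to both sets.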


2.1.3 Fuzzy Logic Approaches With MySQL


Having already gained some experience in using the MySQL database management system, it seemed (personally) perhaps the most appropriate method of data storage for the proposed system. Through the research undertaken, a thorough understanding of the basic concepts behind the fuzzy logic approach had been gained, but it was necessary to explore methods by which fuzzy logic principles could be integrated into queries to retrieve data from the preferred data storage system. An article by a BA student in Rotterdam (Schaap, 2006) provides a good insight into how fuzzy MySQL queries can be created. The problem that the web developer encountered (for a retail company) was that when trying to perform complicated MySQL queries, the results were simply too strict and frequently generated 'no result found' errors when good, although not perfect, results existed within the database. This occurred when a customer was asking for a Sony television AND that is widescreen AND less than 1000.
WHERE tv_manufacturer = 'sony' AND tv_description LIKE '%widescreen%' AND tv_price < 1000;

The results generated from this initial search may be very accurate, but this approach severely limits opportunities when (an otherwise matching) television is 1025. In reality, the customer would probably be satisfied paying the 25 over their initial budget, yet the query used will not allow for this real-life eventuality. It is possible to rewrite this initial query by replacing the AND with OR, but the OR statement produces a different series of inaccurate results: now all televisions below 1000 will be shown, or all Sony televisions will be shown, or all widescreen televisions will appear in the results. This approach is the opposite of the previous query, and an excessive number of often unrelated results are generated. A solution to this problem was achieved using the built-in MATCH AGAINST function provided by MySQL. This function uses text matching, which allows the addition of an individual preference. The query will then allocate points to indicate the score in matching.

Although this is only a text matching system, the web developer from the research article (Schaap, 2006) successfully managed to use the MATCH AGAINST function to integrate a real world demand such as 'less than 1000'. The implications of this for this project and the proposed system are significant, as it should allow the successful development of fuzzy queries to deal with user preferences such as 'the University must be ranked within the top 50 in the UK' or 'the number of University applicants each year must be less than 30,000'. The fuzzy MATCH AGAINST function for 'less than' and 'more than' queries was achieved by encoding the actual numbers into a word. In the case of the television search, the television's manufacturer would be encoded to a unique word such as manufacturesony and the television's price encoded to pricemaxonethousand. All of the desired television characteristics are then stored, within the same database row, in a new text-only column:
databasetextrow = 'widescreen manufacturesony pricemaxonethousand'

The match against function in MySQL also allows searches to be conducted IN BOOLEAN MODE. This adds a preference to each search demand using the following symbols,
+ = Obligated
> = Important
~ = More or less important
- = Without

These preference symbols then allow queries to be created using the following format,
if ($customerpricemax < 1000) $search = '>sony +widescreen ~pricemaxonethousand';

The overall match for each television can finally be returned in descending order of score using this MySQL query:
SELECT tv_manufacturer, MATCH (databasetextrow) AGAINST ('$search' IN BOOLEAN MODE) AS score
FROM televisions -- table name assumed; the source query omits the FROM clause
WHERE MATCH (databasetextrow) AGAINST ('$search' IN BOOLEAN MODE)
ORDER BY score DESC;
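As a rough illustration of the encoding step described above, the following sketch builds the text-only column value and the boolean-mode search string from a list of preferences. It is written in Python purely for brevity (the article and the proposed system use PHP), and the function names `encode_row` and `build_search` and the preference labels are hypothetical; only the four symbols and the example tokens come from the text.

```python
# Preference symbols as glossed in the text above.
PREFERENCE_SYMBOLS = {
    "obligated": "+",
    "important": ">",
    "more_or_less": "~",
    "without": "-",
}


def encode_row(description_words, manufacturer, price_band):
    """Build the text-only column value for one television (hypothetical helper)."""
    tokens = list(description_words)
    tokens.append("manufacture" + manufacturer)
    tokens.append("pricemax" + price_band)
    return " ".join(tokens)


def build_search(preferences):
    """Turn (token, preference) pairs into a boolean-mode search string."""
    return " ".join(PREFERENCE_SYMBOLS[pref] + token for token, pref in preferences)


if __name__ == "__main__":
    print(encode_row(["widescreen"], "sony", "onethousand"))
    print(build_search([
        ("sony", "important"),
        ("widescreen", "obligated"),
        ("pricemaxonethousand", "more_or_less"),
    ]))
```

The resulting search string would then be passed to the MATCH ... AGAINST query as the `$search` value.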

Although this approach appears to offer exactly what this proposed system would require, there are several other methods available which deliver similarly effective results, albeit, by different means. The technique researched above performs the fuzzy logic calculations on the actual database query itself before any results are returned. There is a proposed system in the following section of research which uses one of these alternative approaches.


2.2 Related Work


The system below demonstrates an alternative approach to achieving a fuzzy result set. Instead of integrating fuzzy logic into the MySQL query statements, the database is first filtered, using conventional MySQL statements, and all results deemed useless to the user are removed. Fuzzy logic techniques are then applied to the remaining results using PHP and the results displayed to the user in descending order using a calculated closeness of match score.

2.2.1 Choosing a House


C. D. Paice (2007) discussed a system aimed at assisting people in choosing a house specifically suited to them. There are many aspects of this system which could prove useful in the development of the prototype University search, but this research will focus on the methods and techniques which were used to generate a fuzzy result set from a list of user preferences. The main reason behind this fuzzy logic based house search system was that some features of a house are inherently imprecise; for example, 'near to shops' or 'large garden'. The system was designed in a way which allows vague/fuzzy user preferences to be handled in a natural, efficient way. The rationale behind this system's development is very similar to that of the proposed Which University? search application. The system operates by computing a closeness of match score between a user's created house profile and each of the houses currently stored in the database. Houses can then be ranked in descending order of score, which allows the user of the system to view the details of the most suitable houses first. The closeness of match scores range from 0.0, which denotes a completely unsuited house, to 1.0, which indicates a perfect match for that particular house.

Feature Importance
When choosing a house there will inevitably be certain features which are consistently more important than others. For example, price may be a consistently more important factor than, say, distance to shops. Because of this, the importance of each feature needed to be taken into consideration to prevent nice-to-have features having the same weighting as other, more significant requirements. There are several different ways in which this can be achieved. The Choosing a House application focused on three of these. The first was to invite users to indicate their opinion on the importance of the features. This would require calculating an average of all opinions and giving each feature an importance rating from 0 to 1. This approach could be best suited to the nice-to-have features, where opinions on their importance tend to vary quite dramatically from user to user. Another method was to assume that certain features are inherently very important, the system placing certain features above others by default. Although this is perhaps the most uncomplicated technique, there may be problems should a user's views differ greatly from those of the vast majority of the system's users. This approach could however be developed to allow the importance of features to be adjusted dynamically with system use in light of feedback from users. To be able to calculate scores for each feature, the importance must be represented by a number. For the Choosing a House application the developer chose to use the following values:
Essential features: importance = 1.0
Desirable features: importance = 0.6
Nice-to-have features: importance = 0.3

This is very relevant to the proposed University search system as values similar to these would need to be allocated to features such as University location, degree course availability and accessibility of sports facilities to represent their importance within a search.

Search Feature Matching Goodness of Match


Initially the system generated results based on the assumption that each feature will either match or else will not match:

User's Request: Morecambe (1.0); 200 - 250K (1.0); 3+ bedrooms (1.0); Garden (0.6); Garage (0.6); Quiet road (0.3); Near to school (0.3); Near to shops (0.3)

Potential House Match: Carnforth, 350K, 3 beds, Garden, Garage, Noisy road, Near shops and School: 55% (2.8 out of 5.1)

The matching features were underlined in the original example; the score of 2.8 corresponds to the bedrooms, garden, garage, school and shops features matching, out of a total importance of 5.1.
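The arithmetic behind the 55% figure can be checked with a short sketch. It is written in Python for illustration only; which features count as matched is inferred from the 2.8 total rather than stated explicitly in the example, and the names used are hypothetical.

```python
# (feature, importance, matched) for the Carnforth house in the example.
# The matched flags are an inference from the 2.8 out of 5.1 total.
FEATURES = [
    ("Morecambe", 1.0, False),
    ("200 - 250K", 1.0, False),
    ("3+ bedrooms", 1.0, True),
    ("Garden", 0.6, True),
    ("Garage", 0.6, True),
    ("Quiet road", 0.3, False),
    ("Near to school", 0.3, True),
    ("Near to shops", 0.3, True),
]


def binary_match_score(features):
    """Sum the importance of matched features and of all features."""
    total = sum(importance for _, importance, _ in features)
    matched = sum(importance for _, importance, is_match in features if is_match)
    return matched, total


if __name__ == "__main__":
    matched, total = binary_match_score(FEATURES)
    percent = round(100 * matched / total)
    print(f"{matched:.1f} out of {total:.1f} = {percent}%")
```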

The prototype was then revised because, in practice, there can be partial matches for a search. Partial matches were made possible by the introduction of the Goodness of Match concept. For a house search where a user specifies a desirable price in the range of 200 - 250K, a house costing anywhere between 200K and 250K is considered perfect and is awarded a goodness of 1.0, with houses costing greater than 350K being classed as completely unsuitable and gaining a goodness score of 0.0. Where the system begins to accommodate partial matches is with houses that are outside of the stated range, but not to the point where they would be labelled a complete non-starter and disregarded. A house costing 270K may well be outside of the user's specified range, but if that same house matches well with the user's other requirements then it is still possible that the house would be given some consideration. In cases such as this, the system allocates the price of the house an in-between goodness of match score. The range within which a feature such as price can be given a partial score is established using upper and lower limits. The system calculates this by adding 12% onto the upper limit and subtracting 20% from the lower limit. For the price range of 200 - 250K this would produce an extended range of 160 - 280K. With this in mind, for any price in the range 200 - 250K a goodness score of 1.0 (perfect) is awarded. Any house price outside of the extended range of 160 - 280K is given a goodness score of 0.0 (completely unsuitable). This leaves the ranges 160 - 200K and 250 - 280K, where a partial goodness of match score can be awarded. The system calculates this using arithmetic in the following form:

For house prices > the upper limit:
Goodness score = 1.0 x (extended upper - house price) / (extended upper - upper limit)
For a house price of 265K:
Goodness score = 1.0 x (280 - 265) / (280 - 250) = 0.50

For house prices < the lower limit:
Goodness score = 1.0 x (extended lower - house price) / (extended lower - lower limit)
For a house price of 165K:
Goodness score = 1.0 x (160 - 165) / (160 - 200) = 0.125
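The goodness of match arithmetic above can be sketched as a single function. This is a minimal Python illustration, not the Choosing a House implementation; the function and parameter names are hypothetical, and the 12% and 20% margins are the figures quoted in the text.

```python
def goodness_of_match(value, lower, upper, lower_margin_pct=20, upper_margin_pct=12):
    """Goodness score (0.0 to 1.0) for a numeric value against a preferred range."""
    # Extended limits: subtract 20% from the lower limit, add 12% to the upper.
    ext_lower = lower * (100 - lower_margin_pct) / 100
    ext_upper = upper * (100 + upper_margin_pct) / 100

    if lower <= value <= upper:
        return 1.0  # inside the requested range: perfect
    if value <= ext_lower or value >= ext_upper:
        return 0.0  # outside the extended range: completely unsuitable
    if value > upper:
        return (ext_upper - value) / (ext_upper - upper)
    return (ext_lower - value) / (ext_lower - lower)


if __name__ == "__main__":
    # Prices in thousands, for a requested range of 200 - 250K.
    for price in (150, 165, 225, 265, 300):
        print(price, goodness_of_match(price, 200, 250))
```

The same function could in principle be applied to any numeric feature with a preferred range, although the margins would need to be chosen to suit the feature.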

Fig.3: Graph showing partial goodness of match scores applied to house prices.

Although this approach of accommodating partial goodness of match was developed with house prices in mind, the general principles and theory behind the arithmetic could be used successfully to allow partial matches to features such as University league position within the proposed system. This may work well in the proposed application: if a University matched the other search criteria requested by a user but was marginally outside of the specified league position limit, the University would almost certainly still warrant consideration.

This approach may work well for features which have numerical values, but the system has to make use of a different method for allowing partial goodness of match scores when it deals with text based features such as location. The method employed in this case is simpler than for numerical values. The system itself recognises neighbouring areas through some form of storage mechanism. Suppose the specified location is Lancaster. The system will recognise neighbouring areas of Lancaster (e.g. Heysham, Galgate) and for a house in Lancaster itself will award a goodness of match score of 1.0 (perfect). For houses in neighbouring areas of Lancaster, a lower goodness of match score will be applied. The system uses a similar approach to this for all text based features. The final match score used to rank each of the houses in the search is calculated as the aggregate of importance x goodness of match over all the features. The houses can then be displayed to the user in order of rank.

As with the numerical approach to partial goodness of match, the method used when dealing with text based features could provide a useful foundation for dealing with certain text based features in the proposed system. However, the text based solution used in the Choosing a House system appears to offer far less scope for values than the numerical solution. The text based method would only permit 3 possible closeness of match values: 1.0 for a house in the specified location, another fixed value (0.6 for example) for a house in a neighbouring area and 0.0 for any house outside of the neighbouring area. The lack of scope available here may prove to be too limiting.
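The three-valued text match and the importance x goodness aggregate can be sketched as follows. This is a hypothetical Python illustration rather than the Choosing a House implementation: the neighbour lookup table, the importance weights and all names are assumptions, and 0.6 is simply the example neighbour score mentioned above.

```python
# Hypothetical neighbour lookup; the text only says neighbours are held in
# "some form of storage mechanism".
NEIGHBOURS = {"Lancaster": {"Heysham", "Galgate"}}
NEIGHBOUR_SCORE = 0.6  # example value quoted in the text


def location_goodness(wanted, actual):
    """Three-valued goodness for a text based feature such as location."""
    if actual == wanted:
        return 1.0
    if actual in NEIGHBOURS.get(wanted, set()):
        return NEIGHBOUR_SCORE
    return 0.0


def match_score(importance, goodness):
    """Aggregate of importance x goodness of match over all features."""
    return sum(weight * goodness.get(feature, 0.0)
               for feature, weight in importance.items())


def rank_houses(houses, importance):
    """Return the houses ordered from best to worst final match score."""
    return sorted(houses,
                  key=lambda h: match_score(importance, h["goodness"]),
                  reverse=True)


# Assumed importance weights and example houses, for illustration only.
importance = {"price": 0.8, "location": 0.5}
houses = [
    {"name": "C", "goodness": {"price": 0.0,
                               "location": location_goodness("Lancaster", "Preston")}},
    {"name": "A", "goodness": {"price": 1.0,
                               "location": location_goodness("Lancaster", "Galgate")}},
    {"name": "B", "goodness": {"price": 0.5,
                               "location": location_goodness("Lancaster", "Lancaster")}},
]
ranked_names = [h["name"] for h in rank_houses(houses, importance)]
```

In this example house A ranks first: its perfect price score outweighs the reduced location score it receives for being in a neighbouring area.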


CHAPTER 3. DESIGN

Design 3 ____________________________________
In order to create an in-depth set of requirements for the system, a complete understanding of the situation and manner in which the system will be used must first be obtained. In addition, the design will rely significantly on my own knowledge and insight into which University features will be beneficial to include and which would prove unnecessary and merely serve to distract the user's attention away from relevant data. As I have experience of shortlisting Universities myself (less than 3 years ago), it may be argued that this provides additional valuable research into which features a potential applicant would find meaningful in an application such as this.

Another key factor which needs to be considered is the user themselves. Although at first it might seem apparent that the system should be catered towards college and Sixth Form leavers, from personal experience it appears that parents may also play an important role when their son/daughter considers the prospect of University. In light of this, it must be presumed that parents may also find the proposed system of interest and make use of it themselves on behalf of their son/daughter. This highlights interesting design considerations, as users will evidently have varying levels of computer literacy; some users will be comfortable using computers and develop an instinctive sense of where icons and navigation buttons should be, while other users require key areas of the interface to be large and clearly labelled before they begin to feel comfortable using an application. Stereotypically, it is younger generations who tend to be more computer literate than older generations, perhaps due to differing levels of computer interaction during early years of learning, as highlighted by Bunz et al. (2007). Due to this, it must be ensured that an appropriate balance between intuitiveness and elegance is obtained whilst designing the user interface for the Which University search application, allowing all potential users to operate the system with ease.

3.1 Requirements
Based upon an understanding of the situation in which the proposed system will be used, a set of requirements has been devised. Each of these requirements should be met for the solution to be considered a success. A system design requires an overall aim which can be used as a starting point for deriving more precise requirements.

Develop a system which will aid potential applicants in selecting a University which suits their needs. As many features of a University are fuzzy in nature, the system should balance features off against one another and produce a list showing the best choices in rank order.


3.1.1 High Level Requirements


Using the system's overall aim above, a series of high level requirements have been developed which focus more on specific aspects and functionality which the system must provide.
A. Certify that the system's interface is suitable and intuitive for all major categories of proposed user.
B. Construct a University profile for each user from the specific requests which they provide.
C. Ensure that the system stores an appropriate amount of University data.
D. Generate result sets using a fuzzy logic approach to balance requested features off against one another.
E. Display all results to the user in a clear and consistent manner.
F. Provide additional features to the user allowing them to further investigate and evaluate Universities once the results have been displayed.
G. Supply an interface for a system administrator to perform maintenance on Universities without having to access low level database code.
H. The system must be made public via a website to increase its overall coverage.

3.1.2 Detailed Requirements List


Based upon the high level requirements specified above, a fully derived, detailed requirements list was created and categorised into tables relevant to the respective high level constraints which they satisfy. These requirements will be revisited and reviewed as part of the Testing & Evaluation section of this report.

Table 3.1: Derived Requirements List for High Level Requirement A.
A.  Certify that the system's interface is suitable and intuitive for all major categories of proposed user.
A1. Ensure that an aesthetically pleasing colour scheme and layout is utilised to provide the system with a professional appearance and also make the user feel welcome.
A2. Certify that the general layout of each navigational page is kept consistent.
A3. Provide, where possible, both step by step walkthroughs and shortcut alternatives to allow users with various levels of computer literacy to operate the system efficiently.
A4. Ensure that each navigational page provides links to every other navigational page in a clear and consistent manner.

Table 3.2: Derived Requirements List for High Level Requirement B.
B.  Construct a University profile for each user from the specific requests which they provide.
B1. Present the user with a data input screen which displays a variety of relevant University features.
B2. Allow the user to state the significance of each feature to them using fuzzy importance levels such as slightly important, very important etc.
B3. Provide secure data input methods such as drop-down boxes and radio buttons to prevent the introduction of lexical errors into the system.
B4. Choose appropriate default importance values for University characteristics to avoid results being adversely affected should the user choose to ignore certain features.

Table 3.3: Derived Requirements List for High Level Requirement C.
C.  Ensure that the system stores an appropriate amount of University data.
C1. Certify that the system stores enough data about each University to make the system worthwhile.
C2. Ensure that all sensitive data and that which may infringe the Data Protection Act (1998) is not stored within the system.

Table 3.4: Derived Requirements List for High Level Requirement D.
D.  The system must generate result sets using a fuzzy logic approach to balance requested features off against one another.
D1. Ensure that Universities are not excluded from the results set for not satisfying certain search criteria.
D2. Certify that each University feature is given an appropriate result weighting depending on its pre-decided level of importance.
D3. The system should calculate a goodness of match score for each University determined by how closely it matches the user created profile.

Table 3.5: Derived Requirements List for High Level Requirement E.
E.  Display all results to the user in a clear and consistent manner.
E1. All University results should be displayed in descending order of goodness of match score so that the user's most suitable Universities are displayed at the top.
E2. To prevent cluttering of data, only a subset of important University features should be initially displayed. A full record of University features should be easily accessible should the user wish to enquire further.

Table 3.6: Derived Requirements List for High Level Requirement F.
F.  Provide additional features to the user allowing them to further investigate and evaluate Universities once the results have been displayed.
F1. For every University result generated provide a hyperlink to the University's official website.
F2. Once all results have been displayed, a free-text search facility should be provided to allow users to filter the results by words/values of their choice. It should be possible for the user to search all fields or let them specify a certain field to filter by.
F3. Provide some style of rating system to allow system users to rate Universities and view, at a glance, the overall user opinion of each University.
F4. The system should provide a section where users can comment and exchange their views on Universities.

Table 3.7: Derived Requirements List for High Level Requirement G.
G.  Supply an interface for a system administrator to perform maintenance on Universities without having to access low level database code.
G1. Provide an interface method, either as part of the website or via a desktop application, to allow the system administrator to add, remove and edit Universities.
G2. The administrator interface must be securely protected, via password or other means, to prevent unauthorised alteration of database records.

Table 3.8: Derived Requirements List for High Level Requirement H.
H.  The system must be made public via a website to increase its overall coverage.
H1. The system should utilise an appropriate system architecture to facilitate smooth web integration.
H2. Security must be taken into consideration with the system being distributed globally. All passwords must be stored securely server side and preventative measures taken in the PHP code to thwart potential SQL injection attacks against the MySQL database.

3.2 Use Cases


The Use Case diagram below presents a graphical overview of the functionality provided by the system in terms of actors, their goals (represented as use cases) and any dependencies between those use cases. Following the diagram is a flow-of-events listing for each of the system's use cases which demonstrates the actor's input and the system's response, along with the order in which events are carried out.


Fig.4: Use Case diagram depicting the User's and Administrator's interaction with the system.

1. Login
2. Add University
3. Edit University
4. Delete University
5. Filter University Results
6. Perform University Search
7. Rate Universities
8. Comment On Universities

3.3 Design Decisions


This section summarises and justifies the platforms and tools being used within the Which University? system design.

3.3.1 MySQL
The database used as part of the Which University? application will be stored using MySQL 5.0. MySQL is an open source relational database management system (RDBMS) based on SQL (Structured Query Language). It is widely used in web based applications due to its flexibility and straightforward integration with PHP. Many people picture a database as something similar to Microsoft Access, a desktop product, whereas MySQL runs as a server: users connect to a specific database on the server and issue requests. MySQL has several features which make it suitable for this system. Its stability is perhaps its strongest feature, and this has been demonstrated over the last ten years. MySQL is also multithreaded, which allows multiple connections to be served at the same time. It largely follows the ANSI SQL-92 standard, and the open source edition of the MySQL database management system costs nothing. Therefore, when selecting which method of storage the system would use, it seemed appropriate to use MySQL as it satisfies all of the requirements as well as being free to implement.

3.3.2 PHP
The web based user interface for the system will be created using PHP. PHP is a widely used general-purpose scripting language that is especially suited to web development and can be embedded into HTML. In this respect PHP appears similar to JavaScript, which can also be embedded into traditional HTML code. However, PHP is executed on the server rather than in the browser, and moving much of the processing away from individual computers has some important benefits.

WORKLOAD
The user's computer is not required to do much of the processor-intensive work. This can speed up page load times and generally ease the browsing experience. PHP also places a comparatively light load on the server. This is particularly suited to the Which University? system, as users with lower performance machines should not notice a significant reduction in performance.

DYNAMIC CONTENT
One of PHP's strongest features is its contribution to the creation of dynamic websites that can react to user input. Scripts can store user input in variables which are then passed on to other code to execute. That code can then query databases to draw out data and perform comparisons. Through these means, PHP can take user input and change a web page in response, unlike hard coded HTML which is generally static. This ability is invaluable for the creation of the proposed system.

The PHP code is executed exclusively by the server and therefore requires no action from the end user. The server used to host the web based system must have PHP installed, but once the system has been uploaded it can make use of dynamic features without any effort by the user. The server parses and executes the code at its source and then returns properly formatted HTML which the user's browser can interpret.

3.4 System Architecture


Figure 5 below illustrates how the various sections interface with each other. It can be seen that the MySQL database storing the Universities provides the centre point for both user and administrator functionality. The remainder of this section will provide a more detailed description of the function of each individual system component.

Fig.5: Overview of the Proposed System Architecture.

3.4.1 User Interface - PHP Front End


Once the user has accessed the application, they are presented with a PHP page which displays a list of search criteria and University features. The user can then select the importance of each of these features from the list and submit the completed form to the server. Using these user preferences, the server then creates a University profile and calculates closeness-of-match scores between this profile and existing Universities by querying the MySQL database. These results are then ranked by the server and a summary of each University is displayed in a separate PHP page. From here, the user has the option to view the entire University record, give their own user rating to a University or leave/view comments about each of the Universities. In addition, the interface allows the user to change how the results are presented, such as ordering results by any of the summary fields or filtering the results set using free-text.

3.4.2 Administrator Interface - PHP Front End


Once the administrator has logged into the system, they are presented with a PHP page displaying each of the Universities currently stored within the database. The administrator has the option to add a new University to the database, or to edit or delete an existing University. As with the interface displayed to normal users of the system, the administrator can order the Universities by any of the summary fields or filter the University list using a free-text filter.

3.4.3 Logic Processor


The logic processor is the area of the system where all of the fuzzy logic calculations are carried out. Once the user has submitted their preferences, a University profile is created and all results from the MySQL database are passed into the logic processor phase. This phase is not a separate PHP file but a section of code located within the user interface which focuses on performing calculations and logic on the returned results.

3.4.4 MySQL Database


The MySQL database is used as the main storage mechanism for the system. Both the user and administrator interfaces rely on the data stored within this database to fulfil their intended functionality. The MySQL database is used to store a variety of tables, the most significant of these storing a complete record for every University. As well as this, there is a general table of courses and an intermediate table which links courses to the Universities at which they are taught. A table of areas must also be stored to allow the logic processor to distinguish between areas which are local to the location specified by the user, and those which are further away and should therefore carry no weight in the overall results.

These comprise the tables associated with University information, but there are also several tables within the database which focus on user control and enforcing certain limitations. One of these tables keeps a running average of each University's current user rating. It does this by maintaining a cumulative total of the score along with the total number of votes for that particular University. This is then used to calculate an average score out of 10. The final table is used primarily to enforce the rule that each user may have a maximum of one vote for each University. The reasoning behind this requirement is to prevent users from spamming votes and thereby adversely affecting the overall user score for a particular University. Below is an entity relationship diagram for the database schema:
Fig. 6: Entity relationship diagram for the Which University system.

As can be seen from the diagram above, a University teaches many courses, and a course can be taught at many Universities. This many-to-many relationship requires an intermediate table to be created, taking the primary keys of both the University and Courses tables.
University: Uni ID, Name, Town / City, Area, Campus Layout, Uni League Pos, Ensuite Accommodation, Sports Facilities, Student Parking, Research League Pos, Research % Mark, Best Degree League Pos, Best Degree % Mark, Grad Employ League Pos, Grad Employ % Mark, Teaching Excellence League Position, Teaching Excellence % Mark, Student Sat League Position, Student Sat % Mark, Starting Salary League Pos, Highest Starting Salary, Most Staff League Pos, Most Staff UG/Staff Mem, Cheapest To Live, Most From Overseas, Best For Sport, Most Applications, Website URL, Match

Courses Course ID Name

University Courses ID Results Uni ID IPs ID Area Area Surrounding Regions Uni ID IP Count Votes Points Uni ID Course ID

Comments ID Uni ID Name Email Message

These diagrams provide a precise description of the contents of the tables and the relationships between them, but it is also important to note how the data within the tables will be used by the other sections of the system.

First, and perhaps most important, is the University table, which lists all the relevant information about a particular University. The table contains a unique Uni ID as its primary key. Following this is a group of fields which describe general features of the University, such as its name, location, campus layout and its current league position. These were deemed the features likely to be of greatest interest to potential users, so they are also included in the feature shortlist which is shown first once results are returned. Next in the table are features which could be classed as available facilities at the University; these fields only contain yes (available) or no (not available) values and consist of student parking availability, accessible sports facilities and the availability of en-suite accommodation. These features are slightly more personal, as their importance will vary depending on factors such as whether or not the user wishes to take a car. The remaining fields in this table are statistical and are grouped into league position + % mark pairs. An example of this would be the University's overall research league position and the percentage of excellence mark it attained.

The courses table is somewhat simpler than the University table and consists of only the course's unique ID, which is the primary key, and the course title. As Universities and Courses share a many-to-many relationship, an intermediate table was needed to link these together; this intermediate table consists solely of the primary keys of both the University and Course tables (Uni ID and Course ID).

The results table is used to store the current score for each University, as voted for by the users. The first field in this table is the ID of the University to which the score relates. As well as this there is a votes field, which contains the total number of users who have cast their vote, and a points field, which holds the total number of points this specific University has scored.

The IPs table has the purpose of restricting users to only one vote on each University which, as previously stated, was deemed necessary to prevent spamming of votes from adversely affecting the overall user rating of a University. The first field in this table is simply an ID number, which is also the primary key. The University ID is the next field along and is used in combination with the IP field, ensuring that each user may only vote once on each University.

Finally there is the comments table. The purpose of this is to contain all of the data regarding comments which users have provided about each University. Again, the primary key of this table is just a unique ID number. The next field along is the ID of the University for which the comment was left; this is used to filter the table so that only the comments relating to a given University are displayed for it. The name, email and message fields in this table hold the data which the user has entered, including the body of the comment itself.
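The interaction between the results and IPs tables can be sketched as below. This is an illustrative Python model only: it holds the two tables in memory rather than in MySQL, and the class name, method names and the assumed 0-10 score range are hypothetical.

```python
class RatingStore:
    """In-memory stand-in for the results and IPs tables (illustration only)."""

    def __init__(self):
        self.results = {}  # uni_id -> {"votes": int, "points": int}
        self.ips = set()   # (uni_id, ip) pairs that have already voted

    def vote(self, uni_id, ip, score):
        """Record a vote; return False if this IP has already voted on uni_id."""
        if not 0 <= score <= 10:
            raise ValueError("score must be between 0 and 10")
        if (uni_id, ip) in self.ips:
            return False  # one vote per IP address per University
        self.ips.add((uni_id, ip))
        row = self.results.setdefault(uni_id, {"votes": 0, "points": 0})
        row["votes"] += 1        # total number of votes cast
        row["points"] += score   # cumulative total of the scores
        return True

    def average(self, uni_id):
        """Average user rating out of 10, or None if nobody has voted yet."""
        row = self.results.get(uni_id)
        if row is None or row["votes"] == 0:
            return None
        return row["points"] / row["votes"]


store = RatingStore()
first = store.vote(1, "192.0.2.10", 8)
repeat = store.vote(1, "192.0.2.10", 2)   # rejected: same IP, same University
second = store.vote(1, "192.0.2.11", 6)
```

Storing only the running totals means the average can be recalculated from two numbers whenever a result is displayed, without keeping every individual score.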

3.5 Which University Web Interface


The web interface used by the Which University system is perhaps the most important of all the design requirements, as it is what users will continuously rely on in order to interact with the application. As discussed in the project aims (1.3), it must be taken into consideration that the system will be used by both college/sixth form leavers and, potentially, their parents. This requires careful consideration of the design of the interface because, as generational differences appear to correlate with computer literacy levels (Bunz et al., 2007), both categories of user will require a layout which is tailored towards their needs. From the perspective of development, the system will focus on the aesthetics of the interface, as the system itself is intended as a general social tool rather than for heavy business use, where aesthetics would not hold as much importance.

With user interfaces it is important to adopt a consistent layout between pages, but this perhaps holds greater significance when designing an interface which may be used by users with lower computer literacy levels. More competent users may develop an almost innate understanding of where certain buttons should naturally be, even if the layout isn't always consistent. This instinct may not be present in users with lower computing competency, and they may tend to feel more comfortable with larger, text labelled buttons and layouts which are regular between pages, even if this is not always practical.

The Which University web interface is broken down into 3 major sections. The first of these is the set of general site navigation pages, which allow users to view a description of the project and/or send an email with questions or information. There is also a feedback form available to gather electronic user feedback about HCI issues and the application in general. This may prove to be a valuable asset in quickly and effectively assessing any HCI shortfalls which may not have been considered.

The second of the interface sections concerns the presentation of an input form to the user, allowing them to enter their desired importance ratings for several of the University features mentioned previously. The most appropriate input methods for specific data types must be taken into consideration. For example, combo boxes have the advantage of allowing a large number of values to be offered compactly without overloading the user's senses. However, the disadvantage of this is that only one of these values can be displayed at any one time, with the remainder hidden until the box is opened.

The final section of the interface relates to the layout of the results returned by the database queries and the processing of fuzzy logic calculations. A balance between displaying relevant information and not overloading the initial subset needs to be achieved when considering which fields should appear in the preview. Considering this, it was decided to avoid displaying a significant amount of each University's statistical information, in favour of fields which were deemed, generally, to be of importance to the user.


Fig. 7: Proposed User Data Entry Web Interface

As can be seen from the proposed web interface above, the fields which contain a large number of potential values were placed in drop-down boxes. Although only one of these values can be displayed at once, this approach does at least prevent the input form from being overpowered by a large number of values for one field and keeps the layout of the input form aesthetically pleasing and consistent. Radio buttons were preferred where fields only had a small number of potential values. This approach meant that all of the values could be displayed to the user at one time, whilst the relatively few possible values ensured that the input form remained aesthetically pleasing and practical to use.

Fig. 8: Proposed Search Results Interface

Perhaps the most prominent aspect of the above interface is the decision to alternate the colours of the result lines. This can be achieved using a fairly simple algorithm, but its effect should ensure that each of the University results remains clear and easily viewed, especially when result sets become large and (potentially) overwhelming.

The main focus of the entire project is the application of fuzzy logic result calculations and closeness of match scores. The % match column in the above interface demonstrates how the calculated match will look once a score has been computed for each University. As this is the most significant of all the columns displayed, the bright yellow colouration makes it easily distinguishable and should immediately draw a user's attention towards it.

The specific summary set of University features was selected due to their generally higher relevance to a wider audience of users. Some of the other University features are quite specialised and are unlikely to be of great interest to users unless they are especially interested in the specific University. The statistical information about each University may appear somewhat daunting to some users and may overload their memory should these statistics be visible in one table for every University on record.

The user star rating system for the Universities was integrated into this area as it allows the user to easily assess the current user rating at the same time as viewing the preview of the University's features. Their opinion on both of these factors should be enough for them to decide whether or not to view the University in more detail. Each initial result includes a link which allows the user to display all of a University's features in a full screen table, as well as the option to visit the University's official website.
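The alternating row colours mentioned above amount to choosing a colour from the parity of the row index. A minimal Python sketch follows, for illustration only; the actual interface is generated by PHP, and the function name and the two colour values are assumptions.

```python
from html import escape


def render_rows(university_names, colours=("#ffffff", "#eeeeee")):
    """Build HTML table rows whose background colour alternates by row index."""
    rows = []
    for index, name in enumerate(university_names):
        colour = colours[index % len(colours)]  # even rows: first colour, odd rows: second
        rows.append(
            f'<tr style="background-color: {colour}"><td>{escape(name)}</td></tr>'
        )
    return "\n".join(rows)


html_rows = render_rows(["University A", "University B", "University C"]).split("\n")
```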


CHAPTER 4. IMPLEMENTATION

Implementation 4 ____________________________________
4.1 Method of Implementation
The system was implemented over a number of weeks as individual modules. The first stage was to create the database, as this was always likely to be the centre point of the system, with each of the other modules relying on its existence and data population in order to function as intended. Once the database and its tables had been implemented successfully in a local environment they were uploaded onto the web server. With the uploaded database online and fully populated, the focus moved onto the implementation of the PHP pages which would be used primarily to communicate with the database.

To begin with, much of the fuzzy logic processing module was left out of the PHP pages, as I wanted to first ensure that basic communication and functionality was being carried out as intended. Having implemented both the database and the basic shell of the PHP communication pages, and confirmed that basic connection and data retrieval was a success, it was time to begin including the more complex fuzzy logic calculation modules. The fuzzy logic modules are an integrated part of the PHP communication pages, so adding these back into the code did not present any significant problems. Once the system was functioning as intended, with the PHP interfaces successfully querying the database and then applying fuzzy logic calculations to the returned results set, the majority of the implementation was complete.

As the system is aimed towards being a web based tool, where presentation plays a vital role, it was decided to implement a surrounding website for the application to run within. The implementation of each of these sections will be discussed in more detail under the headings which follow.

4.2 MySQL Database


The installation of MySQL and the implementation of the database locally progressed from start to finish with relatively few problems. However, several minor problems were encountered, and these will now be described. The first problem arose while uploading the database and implementing it online. The problem itself did not concern the database's functionality, but rather the constraints that were placed on its maintainability. Initially a GUI application called MySQL Front was being used to allow databases and tables to be created and data to be entered using a front end interface. This worked well whilst developing the system on a home PC, but on my return to Lancaster University it was necessary to enquire with the web host about the possible use of a web based alternative, phpMyAdmin, to maintain the online version of the database, because port 3306 (MySQL) is restricted on Lancaster University's ResNet.

The creation of the database itself was exactly as specified in the design (Chapter 3). Below is a more detailed UML diagram of the data types and their interactions between tables.

Fig. 9: Which University Database UML Diagram.

4.2.1 Database Insertion


Database insertion is performed in various sections of the Which University system via a web based user interface which is part of the integrated web site. Upon successfully logging in from the main site, administrators can add new Universities or edit/delete existing Universities in the database. The database is designed so that the primary key of each University is an automatically incrementing unique ID number. This ensures that no two Universities in the database can share the same Uni ID and also provides a unique value which can be used for various search and validation techniques. As the primary key is automatically incremented, there is no requirement for the administrator to enter this value when a new University is created.

When the MySQL query to insert a new University is created, a validation check is performed on both the Name and Website URL fields. Both of these fields are also unique, so no other University in the database should contain an identical name or website URL to the one being submitted. This proves to be a successful validation technique for preventing duplicate Universities from being added to the database.

The second area of the system where database insertion is permitted is where users are allowed to leave comments about specific Universities. The structure of a comment is exactly as described in the design section, with the date, user's name, email address and message being inserted into the database along with the Uni ID of the University which the comment relates to. As there is no real world stipulation that any of the fields that comprise a comment are required to be unique, and identical comments by users are perfectly acceptable, no validation is required for database insertion in this section.
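The insertion rules described in this section can be illustrated with a minimal sketch. It is written in Python with an in-memory SQLite database purely as a stand-in for the project's PHP and MySQL code, which is not reproduced in this report. The table name, column names and the `add_university` helper are hypothetical; only the auto-incrementing Uni ID and the unique Name and Website URL fields come from the description above.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical table and column names. The report only states that the
    # Uni ID auto-increments and that Name and Website URL are unique.
    conn.execute(
        """
        CREATE TABLE university (
            uni_id  INTEGER PRIMARY KEY AUTOINCREMENT,
            name    TEXT NOT NULL UNIQUE,
            website TEXT NOT NULL UNIQUE
        )
        """
    )


def add_university(conn, name, website):
    """Insert a University and return its new ID, or None for a duplicate."""
    try:
        with conn:  # commits on success, rolls back on error
            cur = conn.execute(
                "INSERT INTO university (name, website) VALUES (?, ?)",
                (name, website),
            )
        return cur.lastrowid
    except sqlite3.IntegrityError:
        # Raised when the name or website URL is already in the table.
        return None


conn = sqlite3.connect(":memory:")
create_schema(conn)
first = add_university(conn, "Example University A", "http://a.example")
second = add_university(conn, "Example University B", "http://b.example")
duplicate = add_university(conn, "Example University A", "http://other.example")
```

Here the administrator never supplies an ID, and the duplicate name is rejected by the uniqueness constraint rather than by a separate lookup query; the real system may perform its check differently.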

4.3 Fuzzy Logic Calculations


The implementation of the fuzzy logic calculations was the motive behind the initial design and development of this system. As previously noted, the code which carries out the fuzzy logic calculations is located within the PHP file used to display the results set to the user.

As the concept of fuzzy logic relies on results not being heavily filtered, and records not being removed unnecessarily, the initial phase of the calculation is simply to query the database for a complete list of all the Universities where the course selected by the user is available and being taught. This is the only filtering of the database which actually takes place, as it was presumed that prospective University students would have a firm idea of what course they wanted to take before beginning to decide on a suitable University, an assumption taken from my own experience of being in a similar position 3 years ago.

This filtering of the results has other benefits, such as speeding up the whole procedure by removing results deemed surplus to requirements before the closeness of match scores are calculated. This will become particularly useful as the database of Universities grows. Another benefit of this initial filtering is that the results finally returned will not be cluttered with Universities which initially look promising, but turn out not to offer the course which is essential for the user. A system which performs crucial filtering like this automatically will always be preferred by the user to one which requires them to perform unnecessary manual filtering.

Once the result set has been returned, and contains only the Universities offering the required course, the fuzzy logic calculations are performed to generate a closeness of match score between each University and the profile produced by the user. Fuzziness is applied to a field in different ways depending on whether it is numerical or text based. The different methods and the overall procedure for calculating the closeness of match score for a University are examined below, together with a listing of the importance weightings which were decided on for each of the University features:

Selected Importance Weighting For Each University Feature

Location (Area Of Country): 1.0
University League Position: 1.0
Campus Layout: 0.6
High Teaching Excellence Rating: 0.3
Good Student:Staff Ratio: 0.3
Cheap University Living Costs: 0.6
High Volume Of Applications: 0.3
Popular With Overseas Students: 0.3
High Graduate Starting Salary: 0.3
Best For Sport: 0.3
High Student Satisfaction Rating: 0.6
Accessible Sports Facilities: 0.6
Student Parking Available: 0.6
En-suite Accommodation: 0.6
High Research Rating: 0.3
Standard Of Final Degree: 0.6
Likelihood Of Graduate Employment: 0.3

Calculating The Overall Weighting Of A Feature

Overall Feature Weighting = Closeness of Match Score x Importance

This is carried out for each of the University features. Once this has been completed the overall match of each University can be calculated by taking the cumulative Overall Feature Weighting for all features, dividing this by the maximum possible cumulative Overall Feature Weighting and then multiplying this by 100. This generates a percentage of how well each University matches the criteria which the user has input.

Overall University Match = (Cumulative Feature Weighting Score / Maximum Possible Cumulative Feature Score) x 100

Fig. 10: Calculating Overall Feature Weighting.
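As an illustration only, the overall match calculation can be sketched in a few lines of Python (used here as a stand-in for the project's PHP, which is not reproduced in this report). The function name and the example closeness scores are hypothetical. The sketch also assumes that the maximum possible cumulative score is the sum of the importance weightings, which follows if every feature's closeness of match score can be at most 1.0.

```python
def overall_match(scores, weights):
    """Return the percentage match for one University.

    scores  -- closeness of match per feature, each between 0.0 and 1.0
    weights -- importance weighting per feature, each between 0.0 and 1.0
    """
    # Overall feature weighting = closeness of match score x importance.
    achieved = sum(scores[feature] * weights[feature] for feature in weights)
    # Assumption: the best closeness score for any feature is 1.0, so the
    # maximum cumulative score is simply the sum of the weightings.
    maximum = sum(weights.values())
    if maximum == 0:
        return 0.0
    return achieved / maximum * 100


# Weightings taken from the list of importance weightings in this section.
weights = {"location": 1.0, "league_position": 1.0, "campus_layout": 0.6}
# Hypothetical closeness of match scores for a single University.
scores = {"location": 0.6, "league_position": 0.4, "campus_layout": 1.0}
percentage = overall_match(scores, weights)
```

With these made-up scores the University achieves 1.6 out of a possible 2.6, which is a match of roughly 61.5%.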

Numerical Features (Statistical & league position based)


For the majority of the statistical data stored about a University, the fuzzy closeness of match score is calculated by taking advantage of the large range of values present across Universities and creating hypothetical boundaries within which the closeness of match can scale. This is achieved by putting an upper and lower limit on the value which the user specifies. If the value of a specific University feature (e.g. its league position) is within an acceptable range of that selected by the user then a goodness of match score of 1.0 is awarded. For values outside of this range the goodness of match score decreases the further outside the range the value falls, until it reaches a point where the goodness of match is 0 (i.e. not suitable at all). The upper boundary is usually set 20% above the selected value (an upper boundary of 60 for a selected value of 50). The lower boundary is usually 10% of the selected value. However, for fields similar to the University league position, anything better than the specified position is awarded a value of 1.0 (completely suitable) as, for a league placing request of Top 50, a University in position 45 is just as valid as one in position 12.

The goodness of match score calculated for each of the relevant University features is multiplied by its perceived importance level (0.0 to 1.0), which gives the overall weighting that the feature has in computing the overall suitability of the University for the user. Below is an example of a fuzzy calculation for one of the numerical (statistical) features.

For a specified University league position of Top 50, performed on a University in position 56 (worse than the requested level):
Goodness score = 1.0 x (extended upper - actual position) / (extended upper - 50)
Goodness score = 1.0 x (60 - 56) / (60 - 50) = 0.4

For a specified University league position of Top 50, performed on a University in position 41 (better than the requested level):
Goodness score = 1.0

Fig. 11: Numerical Fuzzy Logic Calculation.
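The league position example can be reproduced with a short sketch. It is written in Python rather than the project's PHP, and the function name and the `upper_margin` parameter are hypothetical. It covers only the one-sided case described for league positions, where anything better than the requested position scores 1.0 and the score falls linearly to 0 at the extended upper boundary.

```python
def league_goodness(actual, requested, upper_margin=0.20):
    """Goodness of match for a 'Top N' league position request."""
    # Any position at or better than the requested one is fully suitable.
    if actual <= requested:
        return 1.0
    # Extended upper boundary, e.g. 60 for a requested position of 50.
    upper = requested * (1 + upper_margin)
    # Beyond the extended boundary the University is not suitable at all.
    if actual >= upper:
        return 0.0
    # Linear fall-off between the requested value and the upper boundary.
    return (upper - actual) / (upper - requested)


worse = league_goodness(56, 50)   # (60 - 56) / (60 - 50)
better = league_goodness(41, 50)  # better than requested
```

The two calls correspond to the two worked cases in Fig. 11, giving scores of about 0.4 and exactly 1.0 respectively.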

Text Based Features (Areas & Locations)


For certain University fields it is not appropriate to use calculations similar to those listed above, as the field values contain text as opposed to numerical data. A different approach must therefore be taken, and a lookup operation is performed on the Area table in the database. This table contains each of the areas of the UK along with the areas which are in their vicinity. This approach to applying some degree of fuzzy logic to text fields offers less scope than those used for numerical fields, as the goodness of match can only take one of three values: 1.0 if the area specified is the exact area where a University resides, 0.6 if the University is in a neighbouring area of the user's specified location and 0.0 if the University is outside of all neighbouring areas to the location specified. See below for an example of how fuzzy logic is applied to certain text based features.


For a specified University area of North East and a University situated in the North West:
Goodness score = 0.6 (as North West is a neighbouring vicinity to North East)

For a specified University area of North East and a University situated in the North East:
Goodness score = 1.0

For a specified University area of North East and a University situated in London:
Goodness score = 0.0 (as London is NOT a neighbouring vicinity to North East)

Fig. 12: Textual Fuzzy Logic Calculation.

There are also occasional instances where the data stored about a particular field is very discrete and it is therefore impossible to calculate fuzzy results for it. An example of this sort of data is the En-suite accommodation available field. Here there is only really the possibility of Yes or No answers and as a result only crisp goodness of match scores of 1.0 and 0.0 are possible.
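A minimal sketch of the text based and crisp cases follows, again in Python rather than the project's PHP. The `NEIGHBOURS` dictionary is a hypothetical stand-in for the Area table and contains only the relationships used in the example above; the function names, and the way a Yes/No preference is compared, are likewise assumptions.

```python
# Hypothetical stand-in for the Area table: each area maps to the set of
# areas in its vicinity. Only the relationships from the worked example
# are included here; the real table covers every UK area.
NEIGHBOURS = {
    "North East": {"North West"},
    "North West": {"North East"},
    "London": set(),
}


def area_goodness(requested, actual):
    """Three-valued goodness of match for a location field."""
    if actual == requested:
        return 1.0  # exact area
    if actual in NEIGHBOURS.get(requested, set()):
        return 0.6  # neighbouring area
    return 0.0      # outside all neighbouring areas


def crisp_goodness(wanted, available):
    """Yes/No fields such as en-suite accommodation can only score 1.0 or 0.0."""
    return 1.0 if wanted == available else 0.0


neighbour = area_goodness("North East", "North West")
exact = area_goodness("North East", "North East")
distant = area_goodness("North East", "London")
```

The three calls mirror the three cases in Fig. 12, scoring 0.6, 1.0 and 0.0 respectively.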


4.4 Which University Web Interface


The web interface was also implemented as illustrated in the Design report. It is written in a combination of PHP and HTML and divided into 3 separate sections. Here is a brief overview of each page and how it was implemented.

4.4.1 Welcome Screen


The welcome screen is the first display which users encounter upon visiting the site (other than the Flash animation screen, should they have the appropriate software installed). As explained in Design (section 3), the aim was to implement a vibrant and aesthetically pleasing design for the site which displayed clear, visible navigation buttons and followed consistent layout patterns between screens. The welcome page is written completely in HTML as its main purpose is to provide a central navigation point from which other areas of the site can be reached. Linked from the welcome page is a section containing general information about myself and about the motivation behind the development of the application. The feedback form is also linked from here and provides a detailed series of questions used to gather vital user feedback regarding website HCI issues and, more importantly, the functionality of the implemented system and opinions on it. Considerable effort went into balancing an attractive, subtle layout against the need for older users to be comfortable, by making navigational buttons clear and unambiguous. This can be seen in the CLICK HERE TO START button which is used to run the Which University search application. The *Admin Options* button accesses a password protected area of the site where system administrators can log in and perform necessary maintenance on the University database.

4.4.2 Administrator Control Interface


The administrator control interface is linked to by the *Admin Options* navigation button and is the only area of the system which requires username and password authentication to gain access. Secure authentication is needed because any user gaining access to this area has the ability to create, edit or delete any of the University records. As the administrator control panel is based around PHP, the username and password used to access this area are stored server side and so are never seen by a user viewing the page source. The alternative which I considered was to store the username and password within the database and perform a database query each time a login attempt was made. Although this would work, the extra work of connecting to the database each time would be unnecessary when embedding the username and password in the PHP achieves the same end result.

The navigation of this area closely resembles that of the user interface once results have been returned. The administrator can order Universities by any of the field headers and can filter the list of Universities using the free text filter, which can be configured to filter by partial word match or by whole word match only.
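The difference between the two filter modes can be shown with a small sketch. It is written in Python as a stand-in for the project's PHP, and the function name, the use of regular expressions, the case-insensitive matching and the University names are all assumptions made for illustration.

```python
import re


def filter_universities(names, term, whole_word=False):
    """Return the names matching the free text filter term."""
    escaped = re.escape(term)
    if whole_word:
        # \b anchors the term to word boundaries on both sides.
        pattern = re.compile(r"\b" + escaped + r"\b", re.IGNORECASE)
    else:
        # Partial match: the term may appear anywhere inside a word.
        pattern = re.compile(escaped, re.IGNORECASE)
    return [name for name in names if pattern.search(name)]


# Hypothetical University names used only to demonstrate the two modes.
names = ["Northfield University", "North Vale University", "Southport Institute"]
partial = filter_universities(names, "North")
whole = filter_universities(names, "North", whole_word=True)
```

In partial mode both "Northfield University" and "North Vale University" are kept, whereas in whole word mode only "North Vale University" remains.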

4.4.3 User Data Entry Form


The implementation of the user data entry form followed its original design very closely. A balance between drop down boxes and radio buttons was struck depending on the number of possible values each field could take. The implementation of this page also maintained a consistent layout by displaying links to the other sections of the site in the left hand navigation column. Once a user has selected their chosen importance levels for each of the features, the web form is posted to University.php, which takes each of the values and stores them within a session so that the user's choices remain available for as long as the session is active.

4.4.4 Displaying of the Results


Having stored all of the user's importance values from the data entry form in the session, the PHP page initially queries the MySQL database for all of the Universities which teach the course that the user has specified. The fuzzy logic calculations are then carried out on those Universities that were not filtered out. A simple algorithm is applied as each row is selected from the results set to produce a pattern of alternating line colours, which makes the rows of University data easier to read. The implementation of the star user rating system within this interface required the addition of a JavaScript file to allow mouse over effects, which offers a greater degree of style and functionality.
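The flow of the results page, from course filtering through scoring to alternating row colours, can be sketched as follows. This is Python rather than the project's PHP; the function name, the record layout, the CSS class names and the sample data are hypothetical, and the scoring function is passed in so that any closeness of match calculation can be used.

```python
def rank_universities(universities, course, score_fn):
    """Filter by course, score each University and prepare display rows."""
    # The only filtering step: keep Universities that teach the course.
    candidates = [u for u in universities if course in u["courses"]]
    # Score every remaining University and rank best match first.
    scored = [(score_fn(u), u["name"]) for u in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    rows = []
    for index, (score, name) in enumerate(scored):
        # Alternate the row style so adjacent rows are easy to tell apart.
        css_class = "row-even" if index % 2 == 0 else "row-odd"
        rows.append((name, score, css_class))
    return rows


# Hypothetical records with a precomputed match percentage.
universities = [
    {"name": "Uni A", "courses": {"Computer Science"}, "match": 61.5},
    {"name": "Uni B", "courses": {"History"}, "match": 90.0},
    {"name": "Uni C", "courses": {"Computer Science", "History"}, "match": 84.0},
]
rows = rank_universities(universities, "Computer Science", lambda u: u["match"])
```

Uni B is dropped because it does not teach the chosen course, even though its score is the highest, which reflects the decision to treat the course as the one hard requirement.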


CHAPTER 5. SYSTEM OPERATION

System Operation 5 ____________________________________


To illustrate the system's operation this section contains a walkthrough of the workflow as experienced by a user and concludes with a comparison of this with the original design.

5.1 Usage Scenario


A typical usage scenario for the Which University system would consist of a user filling in the online web form and submitting their requested University features to the server. The server would then query the database and perform fuzzy logic calculations on the result set which is returned. Once these calculations have been performed the results are displayed to the user ranked in descending order of closeness of match score. In this scenario the user would:

1. Enter the website's Welcome Page and click the CLICK TO START button in order to access the application.
2. Input their personal choice of feature importance levels and submit the form to the server.
3. Be presented with the returned results set ranked in descending order of closeness of match score.
4. Once the user has reached this point there are options to branch off in different directions:
   - Click the View All Details link to be taken to a full screen table of all details stored about the University. This is where the majority of the statistical data is contained.
   - Visit the University's official website using the Website link provided.
   - Rate the University using the star rating system.
   - Leave or view comments about a chosen University.


5.1.1 Welcome Screen

Fig. 13: Which University Application Welcome Page.

This page is what greets the user once they access the site. From here they have the option to view a brief summary of the motivations behind the development of the project, leave feedback should they be testing the application, log in as an administrator to perform required maintenance or start the University search application.


5.1.2 User Data Entry Form

Fig. 14: Which University User Data Entry Form.

Once the user starts the application they are presented with a web form where they can enter how important each of the University features is to them. Once the user has completed the form they submit it to the server, where the values they have selected are stored in the session until it expires.


5.1.3 Displaying of the Results

Fig. 15: Which University Search Results.

Having submitted their choices via the web form, the user is presented with their results in the format shown in Fig. 15. As described in the usage scenario (5.1), the user has a variety of different options at this point. For example, the user may choose to narrow the results down further by using the custom filter above the results set; this matches free text and can be configured to search for partial word matches or whole words only.

5.1.4 Summary
A comparison between the usage scenario and the system's expected functionality from the original example in Design (section 3) shows that no significant changes have been made to the flow of the system. The preview set of data for each University has also stayed very consistent with that specified in the original design. The only real change was the inclusion of a small JavaScript file to enable mouse over functionality for the star rating facility. Figure 16 below shows a final sequence diagram depicting the system in operation during the preceding usage scenario.


Fig. 16: Sequence Diagram Depicting a User Usage Scenario


CHAPTER 6. TESTING AND EVALUATION

Testing And Evaluation 6 ____________________________________


6.1 Testing
To ensure that the entire system was thoroughly tested I decided to use a number of testing strategies; each of these is covered in detail in this section of the report. According to Ian Sommerville the distinct goal for the software testing process is "to discover faults or defects in the software where the behaviour of the software is incorrect, undesirable or does not conform to its specification" [2004: 538]. Defect testing is concerned with rooting out all kinds of undesirable system behaviour, such as system crashes, unwanted interactions with other systems, incorrect data computations and data corruption. Sommerville provides a graphical example of a general model of the software testing process [2004: 539], as shown in Fig. 17.

Fig. 17: A Model Of The Software Testing Process.

His suggestion for testing system usage and operational features is to meet the following criteria [2004: 539]:

1. All system functions that are accessed through menus should be tested.
2. Combinations of functions that are accessed through the same menu should be tested.
3. Where user input is provided, all functions must be tested with both correct and incorrect input.

Having read the opinions of Ian Sommerville I decided to identify test cases in order to meet the criteria mentioned above. I first isolated the database section of the system and created some tests to manipulate data to ensure that the database was reading and writing as intended.

MySQL Database Data Manipulation

1. Test: Add a new University to the existing database.
   Expected outcome: New University is added to the next new available row.
   Result: PASS

2. Test: Retrieve all of the Universities stored in the database.
   Expected outcome: Displays every University currently stored.
   Result: PASS

3. Test: Check that each University in the database has the correct data type for each field.
   Expected outcome: Every University record has the correct data type for each of its fields.
   Result: PASS

4. Test: Retrieve Universities based on what values they have for each field. This should be tried for each of the fields within the database, e.g. return all Universities which HAVE student parking available.
   Expected outcome: Database should return all Universities whose fields match the search criteria.
   Result: PASS

5. Test: Test that when a new University is added, its Uni ID field is automatically incremented.
   Expected outcome: The Uni ID field should increment by 1 each time a new University is added.
   Result: PASS

6. Test: Delete a University from the database.
   Expected outcome: The relevant University should be deleted from the database leaving nothing behind.
   Result: PASS

7. Test: Edit each of the fields for any University to ensure that the changes are maintained once they are submitted.
   Expected outcome: The relevant fields should have been updated and these changes permanently maintained.
   Result: PASS

8. Test: Attempt a query which will return a known result set of 0 Universities.
   Expected outcome: No Universities should be returned by the query.
   Result: PASS

As can be seen from the data manipulation tests performed on the database, this area of the system is functioning as intended. Querying the database for records already held was successful, and there were also no problems inserting, deleting or updating University records.

6.1.2 Black Box Testing


Now that I was confident the main backbone of the system was performing correctly I proceeded to introduce the user interface and carry out black box testing on the system. Black box testing is, according to Ian Sommerville, "an approach to testing where the tests are derived from the program or component specification, the system is a Black Box and its behaviour can only be determined by studying its inputs and the related outputs" [2004: 544].


Fig. 18: Black Box Testing Model

In relation to the Which University system, black box testing may also be considered a form of integration testing, as all of the individual components are required to work together in order to meet the goals set by the specification. As shown in Figure 18, Sommerville provides a graphical example of a black box testing model which illustrates effectively how to view the system when conducting black box testing. For my system, the majority of the black box testing is centred on the user interfaces and their interaction with the database. To carry out the black box testing, inputs from the user interfaces were chosen as test cases and the outputs recorded.

Data Entry Form Black Box Testing

1. Test: Clicking any one of the navigational buttons in the left hand column will direct you to a different section of the site.
   Expected outcome: Clicking the navigational buttons in the left column will direct you to the correct area of the site for the button that you clicked.
   Result: PASS

2. Test: Upon initially viewing the data entry form, all values are set to the default option.
   Expected outcome: Upon initially viewing the data entry form, all values are set to the default option.
   Result: PASS

3. Test: Clicking RESET will set the values to their default.
   Expected outcome: Clicking RESET sets the values to their default.
   Result: PASS

4. Test: Clicking the email link mike@whichuniversityfyp.co.uk opens a new message screen.
   Expected outcome: Clicking the email link mike@whichuniversityfyp.co.uk opens a new message screen.
   Result: PASS

5. Test: Location, Course and League Position can be selected by clicking on the drop down menu.
   Expected outcome: Clicking the drop down menus on Location, Course and League Position displays a variety of options relating to each feature.
   Result: PASS

6. Test: Clicking on a radio button selects the chosen value.
   Expected outcome: Clicking on a radio button selects the chosen value.
   Result: PASS

7. Test: Clicking SUBMIT will forward the user to the results screen and display appropriate results corresponding to the user's selections.
   Expected outcome: Clicking SUBMIT will forward the user to the results screen and display appropriate results corresponding to the user's selections.
   Result: PASS

Results Screen Black Box Testing

1. Test: The results page lists potentially suitable universities in order of closeness of match, the percentages displayed in descending order.
   Expected outcome: The results page lists potentially suitable universities in order of closeness of match, the percentages displayed in descending order.
   Result: PASS

2. Test: Clicking 'main menu' provides a link to the main menu.
   Expected outcome: Upon clicking 'main menu', the user is returned to the main menu.
   Result: PASS

3. Test: The filter can be selected by clicking on the drop down menu.
   Expected outcome: Clicking the drop down menu shows the potential filter options.
   Result: PASS

4. Test: Clicking on 'reset filter' resets to no filter.
   Expected outcome: Upon clicking on 'reset filter', the filter resets to no filter.
   Result: PASS

5. Test: Clicking 'view all details' provides a link to further details about the specific university.
   Expected outcome: Upon clicking 'view all details', the user is directed to a new screen displaying further details about the specific university.
   Result: PASS

6. Test: Clicking 'website' provides a link to the official website of the specific university.
   Expected outcome: Upon clicking 'website', the user is directed to a new screen displaying the official website of the specific university.
   Result: PASS

7. Test: Highlighting a particular star on a specific university allows the mouse over to take effect and displays a temporary box depicting 'click star to vote' and the relevant number of stars.
   Expected outcome: Highlighting a particular star on a specific university temporarily displays 'click star to vote' and the relevant number of stars.
   Result: PASS

8. Test: Clicking a specific star related to a specific university allows the user to vote their overall rating.
   Expected outcome: Clicking a specific star related to a specific university brings up a dialogue box showing 'Thank you for voting'.
   Result: PASS

9. Test: Clicking 'view/submit' provides a link to the comments screen.
   Expected outcome: Upon clicking 'view/submit', the user is directed to a new screen displaying comments and the option to leave comments.
   Result: PASS

10. Test: Clicking 'back to the top' returns the user to the top of the screen.
    Expected outcome: Clicking 'back to the top' returns the user to the top of the screen.
    Result: PASS

Comments Black Box Testing

1. Test: Clicking 'back to search results' returns the user to the results screen.
   Expected outcome: Clicking 'back to search results' returns the user to the results screen.
   Result: PASS

2. Test: Clicking 'back to the top' returns the user to the top of the screen.
   Expected outcome: Clicking 'back to the top' returns the user to the top of the screen.
   Result: PASS

3. Test: Clicking on the name, email and comment dialogue boxes selects them and allows the user to type within them.
   Expected outcome: Clicking on the name, email and comment dialogue boxes selects them and allows the user to type within them.
   Result: PASS

4. Test: Clicking 'leave comment' submits the comment.
   Expected outcome: Clicking 'leave comment' submits the comment.
   Result: PASS


6.2 User Interface Evaluation


The user interface used within the Which University system was one of the more important aspects of the system when considering its intended purpose and the audience it was aimed at. As the application was intended to be used socially as a tool, rather than for business purposes, an emphasis on a vibrant and attractive user interface was essential. With this in mind I was aware that the user interface design could almost become an advertising exercise, where plenty of user feedback would be essential in tweaking various parts of the interface to match it as closely as possible to all areas of the target audience. I was fully aware that my own opinions and judgements on what contributed to a successful interface would probably be drastically different to those of many of the system's actual users. I therefore could not simply rely on my own beliefs, and it was necessary to gather feedback from as many potential users as possible. The feedback form located on the site's welcome page was used to gather feedback from as many different types of potential users as possible; the broader the range of users who left feedback, the more the system's interface could be developed into a better all round product.

Results From User Feedback


The following tables show the feedback gained from users, and the percentage of users who selected each value, in relation to specific attributes.

How computer literate would you rate yourself?

Computer Literacy Rate
Novice: 0%
Occasional User: 16.67%
Frequent Social User: 50.00%
Competent Daily User: 33.33%
Advanced User: 0%

The table above shows that the users who provided feedback tended to class themselves as frequent social users or competent daily users, with some categorising themselves as occasional users. However, due to accessibility issues, all the feedback came from similar users, and further research would need to be carried out to assess the numbers of novice and advanced users who may potentially use the application.


How do you rate the site's aesthetics? (Does it look good and appeal to the eye)

Aesthetics
Poor: 0%
Average: 0%
Good: 83.33%
Excellent: 16.67%

The table above shows that the users who provided feedback considered the aesthetics to be good or excellent.

Can you tell where you are immediately within the site? (Clear title, description, captions, etc)

Signposting on site
Poor: 0%
Average: 66.67%
Good: 33.33%
Excellent: 0%

The table above shows that the majority of users who provided feedback rated the signposting as average and the remainder rated it as good. Users were therefore able to tell where they were within the site, although this is an area that could be improved upon in further work.

How well would you rate navigation around the site? (Is it easy to find your way around)

Easy to Navigate
Yes: 100%
No: 0%

The table above shows all the users could navigate around the site fairly easily. However, again, it should be noted that the sample of users who left feedback was relatively small, and not representative of all potential users.


Are the links to other pages within the site helpful and appropriate?

Links
Poor: 0%
Average: 16.67%
Good: 83.33%
Excellent: 0%

The table above shows that the majority of users who provided feedback considered the links to be good, although the table indicates scope for possible improvement through further work.

Does the site operate & look acceptable on your specific internet browser?

Successful Operation on Internet Browser
Yes: 100%
No: 0%

The table above shows that the website worked effectively on all the internet browsers used by those who left feedback. However, in hindsight, it would have been helpful to ask users to state which browser they were using, in order to cover all eventualities.

How would you rate the quantity of data stored about each University? (Sufficient data to make the application worthwhile?)

Sufficient data to make application worthwhile
Yes: 83.33%
No: 16.67%

The table above shows that the majority of users who provided feedback considered the quantity of data stored about each university sufficient to make the application worthwhile.

How well are the search results organised and displayed?

Search Results Organisation
Poor: 0%
Average: 33.33%
Good: 50.00%
Excellent: 16.67%

The table above shows that the majority of users who provided feedback considered the search results to be organised and well displayed, although again, scope for improvement may be possible through further work.

Do the search results appear to be error-free? (Spelling errors etc)

"Error-free" results
Yes: 83.33%
No: 16.67%

The table above shows that the majority of users who provided feedback considered the search results to be error-free.

Are there any extra University details which you feel would be of benefit to include?

Extra University Details Needed
Yes: 16.67%
No: 83.33%

The table above shows that the majority of users who provided feedback did not feel that any extra university details were needed.


Do you feel the University preview data returned by each search is sufficient?

Preview Data Sufficient
Yes: 100%
No: 0%

The table above shows that all the users who provided feedback found the preview data sufficient.

Were there any current features of the application which you found confusing/difficult to use?

Examples of current features found confusing/difficult to use
Yes: 100%
No: 0%

The table above shows that none of the users who provided feedback found examples of features that were confusing or difficult to use.

Are there ANY new features which you feel would be beneficial for the application to include?

New Features Necessary
Yes: 0%
No: 100%

The table above shows that none of the users who provided feedback considered it necessary to add any new features.


On the whole how would you rate the site & application? (1 Poor to 10 Excellent)

Overall
1: 0%
2: 0%
3: 0%
4: 0%
5: 0%
6: 0%
7: 16.67%
8: 66.67%
9: 16.67%
10: 0%

Mean: 8
Mode: 8

Overall, the table above shows that the users who left feedback considered the application to be worthwhile.
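The mean and mode of the overall ratings can be checked from the percentages with a short calculation, sketched here in Python. The number of respondents is not stated in this section; the sketch assumes six, because every percentage reported is a multiple of one sixth, and the variable names are hypothetical.

```python
from statistics import mean, mode

# Assumption: six respondents, inferred from percentages such as 16.67%.
RESPONDENTS = 6

# Share of respondents (in percent) choosing each non-zero overall rating.
shares = {7: 16.67, 8: 66.67, 9: 16.67}

# Expand the percentages back into one rating per respondent.
ratings = []
for rating, percent in shares.items():
    ratings.extend([rating] * round(percent / 100 * RESPONDENTS))

average = mean(ratings)
most_common = mode(ratings)
```

Under this assumption the individual ratings are 7, 8, 8, 8, 8 and 9, which gives a mean of 8 and a mode of 8, in agreement with the figures reported above.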

COMMENTS & IMPROVEMENTS

- Would recommend you make the bit that says 'feedback' and what not bigger cause my eye keeps going to which university final year project
- Bit of a random click on application rather than knowing i was being sent to the survey bit
- Just need more courses
- Maybe order them by % so the reader can just look down the list and not - just a bit easier to navigate?
- Did not understand the row of stars at end of each line of search results. When I advertently clicked on one it thanked me for voting! Why?
- Accommodation spelt wrong on view all details
- More info on grades to get in, and what they expect of you
- More info on POS would be helpful
- When I clicked on a link and then tried to go back I was told the page had expired so I had to put all my data in again!
- Did not understand the overall university user rating. Who are these users?
- Could you make the custom filter caption more user-friendly?

6.3 Overall Evaluation


In order to evaluate the success of the system I am going to go back over each of the derived requirements from the Design chapter and assess how well each requirement has been met.

A.1 Ensure that an aesthetically pleasing colour scheme and layout is utilised to provide the system with a professional appearance and also make the user feel welcome.

During the development of the system I was always conscious of the importance of its aesthetics and ensured that the colour scheme was vibrant without being overpowering. With 100% of user feedback stating that the aesthetics of the site are either good or excellent, I believe this requirement has been met successfully.

A.2 Certify that the general layout of each navigational page is kept consistent.

Consistency was maintained throughout the navigational screens, with links being located in the same position and order each time.

A.3 Provide, where possible, both step by step walkthroughs and shortcut alternatives to allow users with various levels of computer literacy to operate the system efficiently.

Although I did not make a special effort to offer both forms of navigation around the site, I do believe that I achieved a satisfactory balance between efficiency and intuitiveness.

A.4 Ensure that each navigational page provides links to every other navigational page in a clear and consistent manner.

This requirement was definitely met, as there is a link on every page to every other part of the site. These links are also consistently positioned to ensure users become familiar with the layout.

B.1 Present the user with a data input screen which displays a variety of relevant University features.

The data input screen asks the user for a wide variety of input data. This was confirmed by 83.33% of the feedback users, who stated that no more University features were needed.

B.2 Allow the user to state the significance of each feature to them using fuzzy importance levels such as slightly important, very important etc.

All of the importance levels requested for each of the features demonstrate some degree of fuzziness. Some exhibit greater amounts than others, but this is down to the nature of the feature in question.

B.3 Provide secure data input methods such as drop down boxes and radio buttons to prevent the introduction of lexical errors into the system.

There is no free text data entry option available for any of the University features in the data entry form, so a user cannot enter data which would cause lexical errors within the system's calculations.

B.4 Choose appropriate default importance values for University characteristics to avoid results being adversely affected should the user choose to ignore certain features.

I chose appropriate values during the implementation phase; these were frequently the Average Importance option. As no negative comments were made by any of the users during feedback, I can only presume this is satisfactory.

C.1 Certify that the system stores enough data about each University to make the system worthwhile.

The data that is stored covers all aspects of a University, from features such as parking to academic statistics and general social factors. I strongly believe that this requirement was met well.

C.2 Ensure that all sensitive data and that which may infringe the Data Protection Act (1998) is not stored within the system.

I can guarantee that there is no data which could be considered sensitive in any way; all data concerns solely the University and never any individual connected to it.

D.1 Ensure that Universities are not excluded from the results set for not satisfying certain search criteria.

The only filtering of the results set concerns whether or not the chosen course is actually taught at the University. Other than this, no Universities are filtered out.

D.2 Certify that each University feature is given an appropriate result weighting depending on its pre-decided level of importance.

I believe that the importance values assigned to each feature were fair in the context; if this changes in the future it would be very easy to amend these values.

D.3 The system should calculate a goodness of match score for each University determined by how closely it matches the user created profile.

The calculated goodness of match score is probably the motivating feature behind the system, so I am certain that this has been implemented successfully.

E.1 All University results should be displayed in descending order of goodness of match score so that the user's most suited Universities are displayed at the top.

Once results are returned they are always ordered in descending order of goodness of match. The user also has the option to reverse this and display them in ascending order.

E.2 To prevent cluttering of data, only a subset of important University features should be initially displayed. A full record of University features should be easily accessible should the user wish to enquire further.

Judging by the user feedback I received, people were very happy with the subset of University features, as 100% of them stated that it was sufficient. Only one click is required to view each University's full record.

F.1 For every University result generated provide a hyperlink to the University's official website.

I can confirm that for every University record in the results there is a direct link to that University's official website.

F.2 Once all results have been displayed, a free-text search facility should be provided to allow users to filter the results by words/values of their choice. It should be possible for the user to search all fields or let them specify a certain field to filter by.

A custom filter was included on the results page to allow users to filter their search results by any free text value they wish. They are also able to filter by specific search fields should they wish.

F.3 Provide some style of rating system to allow system users to rate Universities and view, at a glance, the overall user opinion of each University.

A fairly advanced star based user rating system was integrated into the results section of the system. It allows a user both to view a University's current rating at a glance and to cast their own vote.
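The custom filter described under F.2 can be pictured as a case-insensitive substring test over either one chosen field or every field of a result row. The following is a minimal sketch of that idea in Python, not the PHP code used in the system; the function name `filter_results`, the field names and the sample rows are all hypothetical.

```python
def filter_results(results, term, field=None):
    """Keep rows where `term` occurs (ignoring case) in `field`,
    or in any field of the row when `field` is None."""
    needle = term.lower()
    kept = []
    for row in results:
        # Search one named field, or every value in the row.
        values = [row[field]] if field is not None else row.values()
        if any(needle in str(value).lower() for value in values):
            kept.append(row)
    return kept


# Hypothetical result rows; names, cities and scores are placeholders.
results = [
    {"name": "University A", "city": "Lancaster", "match": 91},
    {"name": "University B", "city": "Leeds", "match": 84},
    {"name": "University C", "city": "Manchester", "match": 77},
]

all_fields = filter_results(results, "lan")              # searches every field
city_only = filter_results(results, "le", field="city")  # searches one field

print([row["name"] for row in all_fields])
print([row["name"] for row in city_only])
```

Filtering in this way only narrows what is displayed; the goodness of match score of each remaining row is unchanged.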
F.4 The system should provide a section where users can comment and exchange their views on Universities.

A comments section was included, but I believe this is one section that could be improved in the future.

H.1 The system should utilize appropriate system architecture to facilitate smooth web integration.

After talking to many people about different possibilities, it was decided to use PHP and MySQL as the system's architecture. This seemed to be a choice many people felt happy recommending.

H.2 Security must be taken into consideration with the system being distributed globally. All passwords must be stored securely server side and preventative measures taken to thwart potential MySQL injection attacks with PHP.

All passwords are embedded server side within the PHP code to ensure that no user can gain access to them through the HTML source code. Basic measures were implemented to prevent injection attacks, but this is again an area where improvement could be made.
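One common way to strengthen the injection protection mentioned under H.2 is to pass user input to the database as a bound parameter rather than concatenating it into the query string. The sketch below illustrates the principle only: it uses Python's built-in `sqlite3` module as a stand-in for the MySQL database, and the table layout, the function `universities_teaching` and the sample rows are assumptions made for the example. In PHP the same idea is normally expressed with prepared statements.

```python
import sqlite3

# In-memory stand-in database; the schema and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE university (id INTEGER PRIMARY KEY, name TEXT, course TEXT)")
conn.executemany(
    "INSERT INTO university (name, course) VALUES (?, ?)",
    [("University A", "Computer Science"), ("University B", "History")],
)


def universities_teaching(conn, course):
    """Return the names of universities teaching `course`."""
    # The ? placeholder binds `course` as data, so it is never parsed as SQL.
    cur = conn.execute(
        "SELECT name FROM university WHERE course = ? ORDER BY name", (course,)
    )
    return [row[0] for row in cur.fetchall()]


safe = universities_teaching(conn, "Computer Science")
# A classic injection string is treated as an (unmatched) course name.
attack = universities_teaching(conn, "x' OR '1'='1")

print(safe, attack)
```

Had the query been built by string concatenation, the second call would have matched every row; with a bound parameter it matches none.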


CHAPTER 7. CONCLUSION

Conclusion 7 ____________________________________
Looking back over the entire project period I am extremely happy with what I have achieved and with the outcome of the system as a whole.

Firstly, I have gained what I would consider to be an in-depth understanding of fuzzy logic and its benefits and drawbacks. I have had the experience of working on a project from start to finish, using my own initiative and decision making to take the design, development and implementation in any direction which I felt was right. The sense of satisfaction from knowing that it was my own influence and effort which produced the system is something to take great reward from, and it has been a very worthwhile learning experience on many levels.

Secondly, the system that has been developed works exactly as it was intended and has certainly changed my previous opinions, where I was perhaps guilty of neglecting background research and underestimating the value of a good design. I am now certain that the system would not have reached anywhere close to its current level of functionality if these two sections had received anything less than 100% of my efforts. The feedback which I received from many different users proved in the end to be vital, as it highlighted some key areas which I will come back to in the Further Work section.

There are areas where I still believe work needs to be done, especially concerning security, which the majority of users may not be aware of. There are, however, areas which I believe I got right first time: had I the opportunity to start the project afresh, I would still have picked the same architecture, and the database would not have differed much at all from its current state.

FURTHER WORK
Whilst using the system and gauging users' opinions through feedback, there was a recurring area where people felt improvement might be necessary. The commenting system which was integrated into the results interface was effective at what it did, but there were no security features to prevent completely random individuals from leaving potentially misleading comments when they had no real knowledge of the University at all.

This was also a common theme with the user rating system, as some users struggled to understand exactly who the rating system was aimed at. This is completely understandable, as I had originally intended it to be aimed at current students of the University, who would be in a position to cast a reliable vote. However, due to time constraints and other issues taking precedence, I was unable to implement a constraint which only allowed current students at the University to vote or leave their comments. Had such a feature been successfully implemented, I think a lot more users would understand and accept its validity as a useful review tool. I got the impression from users that they were a little sceptical of it because anyone could cast a vote, even if only one vote per person. This would definitely be a key area I would look into developing for further work.


References ____________________________________
Bunz, U., Curry, C., Voon, W. (2007) Perceived versus actual computer-email-web fluency. Computers in Human Behavior. Vol. 23 (5): p. 2321-2344

Kaehler, S. D. (2008) Fuzzy Logic An Introduction. Available at: http://www.seattlerobotics.org/encoder/mar98/fuz/fl_part1.html

Paice, C.D. (2007) Choosing a House. (PowerPoint presentation)

Schaap, Y. (2006) Easy Fuzzy Logic with MYSQL - The end of no results found. Available at: http://www.yvoschaap.com/index.php/weblog/easy_fuzzy_logic_with_mysql_the_end_of_no_results_found/

Sommerville, I. (2004) Software Engineering. 7th Edition. England: Addison Wesley

Zadeh, L. (1965) Fuzzy Sets. Information and Control. Vol. 8 (3): p. 338-353


Appendix ____________________________________
Appendix 1 - FYP Proposal


Michael James FYP Proposal Which University?


Abstract
The aim of this project is to develop and test an application which aids a potential University candidate in choosing a suitable University. The complete application will comprise a background database and a front end GUI. The project involves four main areas: research, development/programming, testing and the writing of the report. The whole system will be web integrated to allow nationwide use and aid testing.

1. Introduction
This project is designed to aid the user in the task of choosing their desired University. The project design will allow the user to select their preferred choice of University features and characteristics from a web based GUI. The GUI will then query a pre-constructed database with their selections. The database used will consist of a detailed set of characteristics for each University in the register. Once the query results are obtained they will be returned to the GUI and displayed to the user, with the option of saving the results to file. The database queries will each use a fuzzy logic approach to ensure that the results are returned on a percentage success match basis, rather than as discrete yes or no values.

Currently there are similar applications available for different purposes on the web. However, very few are based upon the fuzzy logic query approach, opting instead for a more discrete method which completely rules out search results that do not directly match the search criteria.

The proposed solution will need to be easy to use, relatively simple and intuitive, robust, and accessible worldwide via the internet using most common internet browsers. To achieve this, testing will need to be carried out to create an efficient system that is able to withstand possible high flows of traffic and deal appropriately with possible misuse.

The report will consist of background research into the area of similar existing web-based database querying applications and the purposes for which they are used. The proposed project section of the report will contain the details of the front end user interface, the background database and the fuzzy logic queries which will interact between the two. Testing strategies and any security requirements will also be discussed in the proposed project section. The programme of work section includes all of the main stages of the project's development. This is illustrated by a Gantt chart (page 8) to show the time plan for the completion of each development stage. The resources section of the report contains a list of all the required resources and any reasons given for their use. At the end of the report is the reference section, where all material used for reference purposes will be acknowledged.

2. Background
The web based graphical user interface is designed to allow users of the application to quickly and effectively select any combination of search criteria for their preferred University. It is these selections which database queries will be constructed around.

1) User selects their desired University features from the GUI.
2) Database queries are formed from the selections made via the GUI.
3) Queries are then applied to the database and relevant results are selected.
4) Query results are obtained and the relevant fields are then projected back to the GUI.

Figure 1 Process of querying the database.

Currently there are web applications which exhibit some of the characteristics related to this proposed project. An example of such an application is Auto Trader's online car search facility. This application presents its users with an intuitive interface which allows people to enter a selection of car attributes, from colour, mileage and price range down to the car's distance from a given postcode. The process of using these attributes to form database queries is very similar to the method proposed in this project.

However, one significant difference which sets the proposed project apart from existing applications is the way in which the queries are actually applied to the database and the format in which the results are returned. Having researched and used the Auto Trader system, it became apparent that the results returned were very much discrete: a result was dropped from the set completely if it did not match the query completely. Whilst in some applications this may be beneficial, in this case it would be possible for a car to match every other attribute selected by the user but be dropped completely from the result set because it is a few miles outside of the postcode range, or a few miles over or under the desired mileage.

Although the method of database querying that the Auto Trader application uses is particularly discrete in its nature, the majority of the web integration and the transition of data between graphical user interface and database are very similar in structure to the proposed system.

Figure 2 User interface for Auto Trader's online car search application.

Another current application which I researched quite closely concerned the comparison of cruising boats. This application was developed by a collector/designer of the boats whose goal was to construct an accurate template of the critical variables that go into a cruising boat and then search the database for boats fitting this template. The database proved to be an excellent tool for storing information and calculating the various ratios and performance parameters required, but problems were met when attempting to construct the Ideal Cruising Boat template.

The problem originated from using traditional logic statements to sort the database. The database program was effective at sorting and querying data within a discrete range, but these crisp logical terms totally exclude all boats outside of the range selected. In reality a value moderately less or greater than the crisp limits might be good enough for at least some consideration. Even the boats that pass through the filters are not easily comparable, since they are all ranked the same. Again, in reality, values closer to the midpoint of a range were often preferred by designers, at least as a starting point, and should be scored higher than those at the edges. This is where the idea of fuzzy logic querying came into prominence.

Fuzzy logic replaced the familiar crisp logical statements such as "greater than" and "less than" with linguistic statements such as "close" or "very close". Without rigid crisp logic boundaries, these "fuzzy logic variables" were used to blur the edges of a logical set and allow each member in the set to be ranked individually. This was how the developer of the Ideal Cruising Boat application solved the problem of totally excluding all boats outside of the range, and it is around this querying concept that the proposed Which University? application will be based.
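The difference between a crisp range and a blurred one can be made concrete with a membership function that scores a value 1 at the preferred midpoint and lets the score fall away gradually towards the edges. The sketch below uses a triangular function, which is one common choice and is an assumption here rather than the method used by the boat application; the function names and the example lengths are hypothetical.

```python
def crisp_match(value, low, high):
    """Traditional filter: a value is either inside the range or excluded."""
    return 1.0 if low <= value <= high else 0.0


def triangular_membership(value, low, ideal, high):
    """Degree of match in [0, 1]: 1 at `ideal`, falling linearly to 0 at `low` and `high`."""
    if value <= low or value >= high:
        return 0.0
    if value <= ideal:
        return (value - low) / (ideal - low)
    return (high - value) / (high - ideal)


# Hypothetical example: a designer prefers boats of about 35 ft.
# Crisp range 30-40 ft; fuzzy set blurred out to 25-45 ft.
for length in (30, 35, 41):
    print(length, crisp_match(length, 30, 40), triangular_membership(length, 25, 35, 45))
```

A 41 ft boat is rejected outright by the crisp range but still receives a partial score from the fuzzy set, and a boat at the midpoint scores higher than one at the edge, so every candidate can be ranked individually.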

3. The Proposed Project


The aim of the project is to create an application which can aid a user in selecting their desired University. In doing this, the application must also overcome the limitations and problems associated with conventional discrete query results. This will involve the implementation of a fuzzy logic querying approach.

3.1 The Graphical User Interface


The major point to consider with all web based applications is that the interface will need to be easy to use, clear and simple. The user should be able to view the entire interface on the screen at any one time, meaning that scrolling should not be necessary. The front end graphical user interface will be developed using a web based language such as PHP and will allow the user to select their desired combination of University features and attributes. These selections will then be used to develop appropriate database queries.

To aid the simplicity of the user interface and to prevent ambiguity and inconsistency occurring whilst formulating queries, certain measures need to be taken. It is preferable that all University features and attributes should be selected on the interface from drop down menus and/or bullet points rather than being typed into the interface by the user.

The GUI will also need to include a separate area for displaying the returned results once the database has been queried. Ideally there should also be options to sort the returned results not only by the default setting of percentage match but also by other factors which the user may view as being of some importance. For each University on the list the GUI will display a brief description of the University along with a hyperlink to the University's own website, should the user wish to locate further information.

Consideration must also be given to the design and layout of the interface and to the target age group at which the application is aimed. Although it is initially apparent that the majority of the application's users will be Sixth Form / College students, it is also probable that their parents will be interested in making use of the application (to see which Universities they believe to be most suitable for their son or daughter). Considering this, research must also be undertaken into an appropriate design, as the application will evidently be catering for two quite different groups of target user.

The graphical user interface may also provide the option to save a set of results for a particular query to file. This would be of use in the event that a user wishes to view the result set at a later time, perhaps to show someone else or to compare the results against another set of selection criteria made by another member of the family.

3.2 Database
The database constructed will consist of each UK University's characteristics, features and facilities. There will also be a separate GUI constructed to allow an administrator to add, delete and amend University entries in the database. It should be ensured that the database is kept consistent and that no duplicate records are stored. For data protection reasons it is essential that the database only stores information about each University that is relevant and required by the user.
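One way the "no duplicate records" rule could be enforced is to let the database itself reject a second entry with the same University name, rather than relying on the administrator GUI alone. The sketch below shows the idea with a UNIQUE constraint; it uses Python's built-in `sqlite3` module as a stand-in for the planned SQL database, and the table layout and the helper `add_university` are hypothetical.

```python
import sqlite3

# In-memory stand-in database; the schema is an assumption for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE university (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)"
)


def add_university(conn, name):
    """Insert a university; return False instead of storing a duplicate name."""
    try:
        with conn:  # commits on success, rolls back if the insert fails
            conn.execute("INSERT INTO university (name) VALUES (?)", (name,))
        return True
    except sqlite3.IntegrityError:
        # Raised when the UNIQUE constraint on `name` is violated.
        return False


first = add_university(conn, "University A")
second = add_university(conn, "University A")
count = conn.execute("SELECT COUNT(*) FROM university").fetchone()[0]

print(first, second, count)
```

Placing the rule in the schema means it holds regardless of which interface is used to add records.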

3.3 Fuzzy Logic Based Queries


The database queries will each use a fuzzy logic approach to ensure that the results are returned on a percentage success match basis, rather than as discrete yes or no values. This approach will be used to ensure that a particular University is not completely ruled out because one of the preferred features does not match exactly. Instead, each result will be assigned a percentage dependent on how many of its features match those in the query constructed from the user's preferences. Once the queries are completed and percentage matches assigned, an ordered list of Universities from 100% match down to 1% match will be formed and projected to the user via the original graphical user interface.

It must be ensured that only the necessary fields are returned to the GUI by the queries. If it is not necessary for a certain field to be viewed by the user, then it is better that they do not see it.

An area for possible further development of the fuzzy logic based queries could be to allow users to give feedback about the set of results returned to them. This feedback could then be monitored for any recurring trends, such as certain Universities being discounted or ignored despite meeting the search criteria. These trends could then be applied to each search, and future results could amend the University's percentage match to reflect this.

Another possibility, once web integration is established, is to allow users of the application to rate Universities and write reviews about them. This would allow users not only to use the software to see which Universities closely match their needs, but also to read first hand reviews of their selection.
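The percentage match described in this section can be sketched as the share of the user's preferred features that a University matches, followed by a descending sort. The example below is a minimal illustration under that reading, written in Python rather than as a database query; the function names, feature names and sample Universities are all hypothetical.

```python
def percentage_match(preferences, university):
    """Percentage of the user's preferred features that the university matches."""
    if not preferences:
        return 0
    matched = sum(
        1 for feature, wanted in preferences.items()
        if university.get(feature) == wanted
    )
    return round(100 * matched / len(preferences))


def rank(preferences, universities):
    """Score every university and order the list from best to worst match."""
    scored = [(percentage_match(preferences, u), u["name"]) for u in universities]
    # Highest percentage first; nothing is dropped for a partial match.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical preferences and records.
preferences = {"campus": "city", "parking": True, "accommodation": "on-site", "sports": True}
universities = [
    {"name": "University C", "campus": "rural", "parking": True, "accommodation": "off-site", "sports": False},
    {"name": "University A", "campus": "city", "parking": True, "accommodation": "on-site", "sports": True},
    {"name": "University B", "campus": "city", "parking": False, "accommodation": "on-site", "sports": True},
]

ranking = rank(preferences, universities)
print(ranking)
```

University B misses one preference but still appears in the list with a lower percentage, where a discrete query would have excluded it.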


3.4 Security
Security for this type of application is perhaps not as important as it is for other types of software. The main security aspects to consider will, as mentioned, concern the Data Protection Act. It is essential to ensure that only completely necessary data about each University and its staff is stored within the database. If for any reason data needs to be stored in the database for administration purposes (such as staff contact numbers), it must be ensured that users of the application can in no way gain access to this information.

3.5 Testing
Thorough testing will need to be carried out to ensure the system is reliable and functions correctly. This will include trying to break the software to test its error handling capabilities. Unexpected query results need to be eliminated, as errors could lead to problems within the system and sensitive information could be unintentionally projected to the user, which is obviously a very serious issue.

Testing the accuracy of the query results will be the main area of testing within the application. This is because the fuzzy logic based approach will be the most complicated aspect and the area where problems are most likely to arise. Problems encountered could vary from a query throwing an exception and refusing to run, to a query running successfully but for one reason or another returning a null data set. These types of errors will be easy to spot, and hopefully the cause of the problem will be easy to identify.

However, it may also be possible for less obvious errors to manifest within the application. The data set returned from a query may look valid and appear to have been returned successfully but, due to problems within the fuzzy logic mathematics and implementation, percentage matches may not have been applied accurately. This would result in a data set of Universities being returned which did not relate to the search criteria from which the database query was formed. These types of errors may be more difficult to pinpoint within the application.

Tests will also need to be carried out concerning the application's expected internet traffic flow once it has become fully web integrated. Not allowing for enough demand will lead to major slow down of the program and will probably, in some situations, cause the application to hang.

After this testing has been carried out, I will test the system with a variety of users, starting with advanced users who are more familiar with this type of system and then moving on to novice users who have little or no experience of it. The aim of the tests will be to assess the level of simplicity and intuitiveness of the application along with its complexity in delivering desirable results. These tests with neutral users will allow me to assess the balance between simplicity and complexity which will, in turn, allow me to judge whether or not the application allows the widest possible range of users to use the program efficiently to their benefit.

Similarly, it will be important to test the application using individuals from the two target age groups (Sixth Form/College students & parents). On average, the gap in computer competence between these two generations is usually quite large. It must therefore be ensured that tests are carried out to assess whether the interface used by the system is clear and understandable to both groups of target user. Failure to do so may mean that the application is unknowingly used in an incorrect fashion. This again has the potential to introduce errors and bugs into the software.

4. Programme of Work
The main parts of the programme of work can be broken into:

1) Familiarise self with tools and concepts
This will involve familiarising myself with database concepts and querying using MySQL, and also with the fuzzy logic querying approach planned for some of the less discrete search factors. A good knowledge of certain concepts from the Artificial Intelligence field would also be a benefit, along with the mathematical background behind fuzzy logic. Developing a good understanding of how database queries can be generated from information selected from a web based GUI will also be fundamental in the application's design. It will also be important that I familiarise myself with the process of uploading an SQL database into web space, along with any important requirements and drawbacks connected to this. This will be essential if the application is to be implemented globally on the internet.

2) Conduct studies into potential user interface designs and estimated web traffic for a web based version of the application
A study into a suitable user interface must be conducted to ensure that the final version is suitable for use by both target age groups. It is vitally important to develop a suitable front end interface at this stage, as this is what the users will be interacting with throughout their usage of the application. The best method of evaluating what would constitute a suitable interface would be to show members of each target age group a wide variety of similar existing interfaces and allow them to pick out areas of each layout which they find easy to use. Tables can then be constructed from the results, which should allow me to strike a good balance between any differences in layout preference that may be apparent between the two generations of target user. A study must also be conducted in order to estimate the amount of web traffic that will be accessing the application and how it would behave during varying levels of demand. A feasibility study will also need to be conducted to assess whether or not it will be possible to achieve the level of performance that is required.

3) Design Software
Design all code which needs to be developed and written for the front end graphical user interface, the background database and the fuzzy logic queries which connect the two.

4) Implement Software Locally
Implement the database and graphical user interface locally on a PC and ensure that the flow of data between the two, via the queries, is successful.

5) Testing with Further Development
Incremental tests are carried out during the development phase to remove noticeable bugs and potential problems. At this stage the software is becoming finalised, and tests with users should be carried out to ensure that the system is appropriate.

6) Detailed Testing
Thoroughly test the software to evaluate the application and give the chance to remove any final bugs.

7) Finish Software & Write User Manual
Create the final version of the software and write all user manuals needed for operation of the application.

8) Implement Software Globally Via The Internet
Upload the SQL database and user interface into appropriate web space and ensure that the database is still querying correctly and that the results returned match the search criteria. Also ensure that the operation of the application is still smooth and efficient now that it is no longer located locally.

9) Finish Report

Figure 3 Gantt chart to show the programme of work time plan.

5. Resources Required
Access to a PC in order to develop the application. Access to an appropriate web server in order to upload and run the web based version of the completed application.

6. References
CSC355 Artificial Intelligence Intranet page
o http://www.comp.lancs.ac.uk/~dixa/teaching/AI355/

Auto Trader Online Website
o http://www.autotrader.co.uk/

Fuzzy Logic Research Page & Example Implementation
o http://www.johnsboatstuff.com/Articles/fuzzy.htm
