CHAPTER 1. INTRODUCTION
I certify that the material contained in this dissertation is my own work and does not contain unreferenced or unacknowledged material. I also warrant that the above statement applies to the implementation of the project and all associated documentation. Regarding the electronically submitted version of this submitted work, I consent to this being stored electronically and copied for assessment purposes, including the Department's use of plagiarism detection systems in order to check the integrity of assessed work. I agree to my dissertation being placed in the public domain, with my name explicitly included as the author of the work.
Abstract
The aim of this project is to develop an application which aids a potential University applicant in choosing a suitable University. The complete system comprises a back-end MySQL database and a front-end PHP web application. The web application uses a fuzzy logic querying approach to return non-discrete results to the user, based on a closeness-of-match score calculated at runtime.
This project can be broken down into three main areas: research into current applications which use a querying approach based on fuzzy logic; eliciting the requirements and designing a system to meet the required specification; and a prototype system to demonstrate the completed application in use.
Contents ____________________________________
List of Figures
List of Tables
1 Introduction
  1.1 Overall Aim
  1.2 Motivation
  1.3 Project Aims
  1.4 Report Overview
2 Background Research & Related Work
  2.1 Background Research
    2.1.1 Fuzzy Logic
    2.1.2 Fuzzy Set Theory
    2.1.3 Fuzzy Logic Approaches With MySQL
  2.2 Related Work
    2.2.1 Choosing A House
3 Design
  3.1 Requirements
    3.1.1 High Level Requirements
    3.1.2 Detailed Requirements List
  3.2 Use Cases
  3.3 Design Decisions
    3.3.1 MySQL
    3.3.2 PHP
  3.4 System Architecture
    3.4.1 User Interface - PHP Front End
    3.4.2 Administrator Interface - PHP Front End
    3.4.3 Logic Processor
    3.4.4 MySQL Database
  3.5 Which University Web Interface
4 Implementation
  4.1 Method of Implementation
  4.2 MySQL Database
    4.2.1 Database Insertion
  4.3 Fuzzy Logic Calculations
  4.4 Which University Web Interface
    4.4.1 Welcome Screen
    4.4.2 Administrator Control Interface
    4.4.3 User Data Entry Form
    4.4.4 Displaying of the Results
5 System Operation
  5.1 Usage Scenario
    5.1.1 Welcome Screen
    5.1.2 User Data Entry Form
    5.1.3 Displaying of the Results
    5.1.4 Summary
6 Testing and Evaluation
  6.1 Testing
    6.1.2 Black Box Testing
  6.2 User Interface Evaluation
  6.3 Overall Evaluation
References
Appendix
  Appendix 1 - FYP Proposal
Working Documents
The Working Documents for this project are available at www.lancs.ac.uk/ug/jamesm2
List of Figures
Fig. 3: Graph showing partial goodness of match scores applied to house prices.
Fig. 4: Use Case diagram depicting the User's and Administrator's interaction with the system.
Fig. 14: Which University User Data Entry Form.
Introduction 1 ____________________________________
1.1 Overall Aim
The main aim of this project is to develop a system which would be useful to University applicants when initially considering which Universities to apply to. Such an application should present a user with a variety of University-based questions and allow them to input their choices into a web-based form. The application should make use of various fuzzy logic querying techniques when dealing with the data collected from the user. Using this approach should allow a University profile to be generated from the user's preferences. Non-discrete (fuzzy) results should then be calculated by comparing the University profile to each stored University. These results can then be ranked in descending order of closeness-of-match score, allowing the user to see details of their most suitable University first.
1.2 Motivation
Traditional search techniques, which feature in the majority of existing search applications, tend to follow a rigid feature matching approach. These systems produce binary (Yes/No) responses to questions and, as a consequence, may yield very few results or none at all. They can equally generate far too many results, and the output tends to be ordered in a way which is of no real assistance to the user. Search applications of this type can be limiting and unsophisticated to use, especially where Universities are concerned, because some features of a University are inherently imprecise, or fuzzy, by nature, such as 'near to shops' or 'large campus'. It is possible to convert these into numbers (distances, sizes) but, in reality, users do not view these characteristics in such a discrete, scientific manner. This provides the real motivation behind the development of a fuzzy logic based University search application: the system must be able to handle vague and fuzzy feature specifications in a natural way. If this is achieved, it should remove many of the limitations associated with conventional rigid feature matching systems.
The lack of a current application that allows potential applicants to state their desired characteristics in a University at an early stage in the selection process provides another strong case for a University search application implementing fuzzy logic queries as a worthwhile project that fills a possible niche. Many users may not have thought through exactly what they want, or may have difficulty expressing exactly what they are looking for. In addition, their wishes may change over time. Rigid feature matching (discrete) search applications require the user to know exactly what they are looking for before the search is carried out. If a University does not meet a feature requirement, it is immediately discarded from the search results. This further highlights the need for a fuzzy logic system, which would allow for scope within a search and would not severely penalise Universities that do not exactly meet the profile that the user was trying to create.
It is also necessary to take into account that the importance of University features will vary amongst users. For the majority of applicants, location, University reputation, degree course and available accommodation type may be considered the most important factors. Other features, such as proximity to sports facilities or shops, may only be considered important by some users. In addition, there are likely to be a number of nice-to-have features which will be highly individual to each user. Regarding the features deemed nice-to-have, another aim of the project is as follows:
The system must handle the importance of features in a natural and fair way so that features deemed nice-to-have will not adversely affect a University's closeness-of-match score.
In addition, there is an almost unlimited range of features which may affect a decision when an applicant is choosing potential Universities. Therefore, the following aim may also be added:
The system should incorporate a certain number of uncommon and indirect University features without discouraging the user with a long, drawn-out series of questions.
The stored University features may be quite diverse in nature. University league placing is numerical; en-suite accommodation or student parking availability is symbolic (Yes/No); names of cities or areas of the country are specific and textual; and features such as graduate job opportunities can be classed as indirect. Therefore, the following is necessary:
The system should ensure that each different feature category uses a different and appropriate type of logic for representation and profile matching.
In addition, there are several general features which the system must include in order to be successful. The GUI needs to be aesthetically pleasing and intuitive to all categories of potential user (age, computer literacy level etc.). The database should only ever store relevant details which are required for the system's functionality. Passwords should only ever be stored server side. The system should be web integrated to allow for the greatest possible coverage. This would also be of benefit when carrying out system testing and evaluation.
As can be seen from the diagram above, the most limiting feature of all bivalent sets is that they are mutually exclusive. It is clearly not accurate enough to define the transition from a quantity such as 'cool' to 'warm' by an increase of one degree Fahrenheit. In reality, it is unlikely that such a sharp transition would be noticed; instead, a smooth drift from cool to warm would be recognised as occurring. Fig. 2 below shows how Fuzzy Set Theory can be used to describe this natural effect more accurately,
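The difference between a bivalent set and a fuzzy set can be sketched in a few lines of Python. This is only an illustration: the thresholds (60, 65 and 70 degrees Fahrenheit), the linear ramp and the function names are assumptions made for the example and are not taken from the figures.

```python
def crisp_warm(temp_f, threshold=65.0):
    """Bivalent membership: a temperature is either warm (1.0) or not (0.0)."""
    return 1.0 if temp_f >= threshold else 0.0


def fuzzy_warm(temp_f, cool_limit=60.0, warm_limit=70.0):
    """Fuzzy membership: the degree of 'warm' rises linearly between the limits."""
    if temp_f <= cool_limit:
        return 0.0
    if temp_f >= warm_limit:
        return 1.0
    return (temp_f - cool_limit) / (warm_limit - cool_limit)


def fuzzy_cool(temp_f, cool_limit=60.0, warm_limit=70.0):
    """Complement of 'warm', so a temperature can be partly cool and partly warm."""
    return 1.0 - fuzzy_warm(temp_f, cool_limit, warm_limit)


# One degree either side of the crisp threshold flips the bivalent answer,
# while the fuzzy memberships change only slightly.
for temp in (64, 65, 66):
    print(temp, crisp_warm(temp), round(fuzzy_warm(temp), 2), round(fuzzy_cool(temp), 2))
```

With the bivalent function a single degree moves a temperature from 'not warm' to 'warm'; with the fuzzy functions the same change only shifts the two degrees of membership by a small amount.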
The results generated from this initial search may be very accurate, but this approach severely limits opportunities when an (otherwise matching) television costs 1025. In reality, the customer would probably be satisfied paying the 25 over their initial budget, yet the query used will not allow for this real-life eventuality. It is possible to rewrite this initial query by replacing the AND with OR, but the OR statement produces a different set of inaccurate results: now all televisions below 1000 will be shown, as will all Sony televisions and all widescreen televisions. This approach is the opposite of the previous query and generates an excess of often unrelated results. A solution to this problem was achieved using the built-in MATCH AGAINST function provided by MySQL. This uses text matching, which allows an individual preference to be added to each search term. The query then allocates points to indicate the matching score.
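The weakness of both rigid queries can be reproduced with a small experiment. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table name, column names and four catalogue rows are all hypothetical; only the AND/OR behaviour is the point.

```python
import sqlite3

# Hypothetical catalogue; none of these rows come from the cited article.
TELEVISIONS = [
    ("Sony", 1, 1025),     # matches everything except the price limit
    ("Sony", 0, 700),      # right make and price, but not widescreen
    ("Philips", 1, 900),   # widescreen and cheap enough, but not a Sony
    ("Samsung", 0, 1500),  # matches nothing
]


def run_queries():
    """Run the same three conditions joined first by AND, then by OR."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tv (manufacturer TEXT, widescreen INTEGER, price INTEGER)")
    conn.executemany("INSERT INTO tv VALUES (?, ?, ?)", TELEVISIONS)

    where = "manufacturer = 'Sony' {op} widescreen = 1 {op} price < 1000"
    strict = conn.execute("SELECT * FROM tv WHERE " + where.format(op="AND")).fetchall()
    loose = conn.execute("SELECT * FROM tv WHERE " + where.format(op="OR")).fetchall()
    conn.close()
    return strict, loose


strict, loose = run_queries()
print(len(strict), "rows with AND;", len(loose), "rows with OR")
```

With this data the AND query rejects the near-perfect 1025 television along with everything else, while the OR query returns three of the four rows with nothing to indicate which is the closest match.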
CHAPTER 2. BACKGROUND RESEARCH & RELATED WORK
Although this is only a text matching system, the web developer from the research article (Schaap, 2006) successfully managed to use the MATCH AGAINST function to integrate a real world demand such as 'less than 1000'. The implications of this for this project and the proposed system are significant, as it should allow the successful development of fuzzy queries to deal with user preferences such as 'the University must be ranked within the top 50 in the UK' or 'the number of University applicants each year must be less than 30,000'. The fuzzy MATCH AGAINST function for 'less than' and 'more than' queries was achieved by encoding the actual numbers into a word. In the case of the television search, the television's manufacturer would be encoded to a unique word such as manufacturesony and the television's price encoded to pricemaxonethousand. All of the desired television's characteristics are then stored, within the same database row, in a new text-only column,
databasetextrow = 'widescreen manufacturesony pricemaxonethousand'
The MATCH AGAINST function in MySQL also allows searches to be conducted IN BOOLEAN MODE. This adds a preference to each search demand using the following symbols,
+ = Obligated
> = Important
~ = More or less important
- = Without
These preference symbols then allow queries to be created using the following format,
if ($customerpricemax < 1000) $search = '>manufacturesony +widescreen ~pricemaxonethousand';
The overall match for each television can finally be returned in descending order of score using the following MySQL query (the table name televisions is illustrative, as the table is not named here),
SELECT tv_manufacturer, MATCH (databasetextrow) AGAINST ('$search' IN BOOLEAN MODE) AS score
FROM televisions
WHERE MATCH (databasetextrow) AGAINST ('$search' IN BOOLEAN MODE)
ORDER BY score DESC
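The way a search string is assembled from encoded words and preference symbols can be sketched as follows. The helper name build_search and the preference labels are hypothetical; the four symbols are simply those listed in the text, and the sketch only builds the string without running it against MySQL.

```python
# Preference symbols as listed in the text for IN BOOLEAN MODE searches.
PREFERENCE_SYMBOLS = {
    "obligated": "+",
    "important": ">",
    "more_or_less": "~",
    "without": "-",
}


def build_search(preferences):
    """Join (encoded_word, preference) pairs into a boolean-mode search string."""
    return " ".join(PREFERENCE_SYMBOLS[level] + word for word, level in preferences)


search = build_search([
    ("manufacturesony", "important"),
    ("widescreen", "obligated"),
    ("pricemaxonethousand", "more_or_less"),
])
print(search)
```

In a real application the resulting string would be better passed to the query as a bound parameter than interpolated directly into the SQL text.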
Although this approach appears to offer exactly what the proposed system requires, there are several other methods available which deliver similarly effective results by different means. The technique researched above performs the fuzzy logic calculations within the database query itself, before any results are returned. The following section of research describes a system which uses one of these alternative approaches.
Feature Importance
When choosing a house there will inevitably be certain features which are consistently more important than others. For example, price may be a consistently more important factor than, say, distance to shops. Because of this, the importance of each feature needed to be taken into consideration to prevent nice-to-have features having the same weighting as other, more significant requirements. There are several different ways in which this can be achieved. The Choosing a House application focused on three of these. The first was to invite users to indicate their opinion on the importance of the features. This would require calculating an average of all opinions and giving each feature an importance rating from 0 to 1. This approach may be best suited to the nice-to-have features, where opinions on their importance tend to vary quite dramatically from user to user. Another method was to assume that certain features are inherently very important, with the system placing these features above others by default. Although this is perhaps the simplest technique, there may be problems should a user's views differ greatly from those of the vast majority of the system's users. This approach could, however, be developed to allow the importance of features to be adjusted dynamically with system use in light of feedback from users. To be able to calculate scores for each feature, the importance must be represented by a number. For the Choosing a House application the developer chose to use the following values,
Essential features: importance = 1.0
Desirable features: importance = 0.6
Nice-to-have features: importance = 0.3
This is very relevant to the proposed University search system as values similar to these would need to be allocated to features such as University location, degree course availability and accessibility of sports facilities to represent their importance within a search.
Potential House Match Carnforth, 350K, 3 beds, Garden, Garage, Noisy road, Near shops and School: 55% (2.8 out of 5.1)
The prototype was then revised because, in practice, there can be partial matches for a search. Partial matches were made possible by the introduction of the Goodness of Match concept. For a house search where a user specifies a desirable price in the range of 200-250K, a house costing anywhere between 200K and 250K is considered perfect and is awarded a goodness of 1.0, with houses costing greater than 350K being classed as completely unsuitable and gaining a goodness score of 0.0. Where the system begins to accommodate partial matches is with houses that are outside of the stated range, but not to the point where they could be labelled a complete non-starter and disregarded. A house costing 270K may well be outside of the user's specified range, but if that same house matches well with the user's other requirements then it is still possible that the house would be given some consideration. In cases such as this, the system allocates the price of the house an in-between goodness of match score. The range in which a feature such as price can be given a partial score is established using upper and lower limits. The system calculates this by adding 12% onto the upper limit and subtracting 20% from the lower limit. For the price range of 200-250K this produces an extended range of 160-280K. With this in mind, any price in the range 200-250K is awarded a goodness score of 1.0 (perfect). Any house price outside of the extended range of 160-280K is given a goodness score of 0.0 (completely unsuitable). This leaves a range between 160-200K and 250-280K where a partial goodness of match score can be awarded. The system calculates this using arithmetic in the following form,
For house prices > the upper limit
Goodness score = 1.0 x (extended upper - house price) / (extended upper - upper limit)
For a house price of 265K
Goodness score = 1.0 x (280 - 265) / (280 - 250) = 0.50
For house prices < the lower limit
Goodness score = 1.0 x (extended lower - house price) / (extended lower - lower limit)
For a house price of 165K
Goodness score = 1.0 x (160 - 165) / (160 - 200) = 0.125
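The arithmetic above can be written as a single function. The following is a minimal Python sketch, assuming the 12% and 20% margins described in the text; the function and parameter names are hypothetical and this is not the Choosing a House implementation itself.

```python
def goodness_of_match(value, lower, upper, lower_margin=0.20, upper_margin=0.12):
    """Return a goodness score between 0.0 and 1.0 for a numeric feature.

    Values inside [lower, upper] score 1.0, values outside the extended
    range score 0.0, and values in between are interpolated linearly.
    """
    extended_lower = lower * (1.0 - lower_margin)
    extended_upper = upper * (1.0 + upper_margin)

    if lower <= value <= upper:
        return 1.0
    if value <= extended_lower or value >= extended_upper:
        return 0.0
    if value > upper:
        return (extended_upper - value) / (extended_upper - upper)
    return (extended_lower - value) / (extended_lower - lower)


# The two worked examples from the text, with prices in thousands.
print(round(goodness_of_match(265, 200, 250), 3))
print(round(goodness_of_match(165, 200, 250), 3))
```

Keeping the margins as parameters would allow the same function to be reused for other numeric features, where a different tolerance may be appropriate.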
Fig. 3: Graph showing partial goodness of match scores applied to house prices.
Although this approach of accommodating partial goodness of match was developed with house prices in mind, the general principles and theory behind the arithmetic could be used to allow partial matches on features such as University league position within the proposed system. This may work well in the proposed application: if a University matched the other search criteria requested by a user but was marginally outside of the specified league position limit, the University would almost certainly still warrant consideration. This approach may work well for features which have numerical values, but the system has to make use of a different method for allowing partial goodness of match scores when it deals with text based features such as location. The method employed in this case is simpler than for numerical values. The system itself recognises neighbouring areas through some form of storage mechanism. Suppose the specified location is Lancaster. The system will recognise neighbouring areas of Lancaster (e.g. Heysham, Galgate) and for a house in Lancaster itself will award a goodness of match score of 1.0 (perfect). For houses in neighbouring areas of Lancaster, a lower goodness of match score will be applied. The system uses a similar approach for all text based features. The final match score used to rank each of the houses in the search is calculated using the aggregate of importance x goodness of match for all the features. The houses can then be displayed to the user in order of rank.
As with the numerical approach to partial goodness of match, the method used when dealing with text based features could provide a useful foundation for dealing with certain text based features in the proposed system. While this may be true, the text based solution used in the Choosing a House application appears to offer far less scope for values when compared to the numerical solution. The text based method would only permit three possible closeness of match values: 1.0 for a house in the specified location, another specified value (0.6 for example) for a house in a neighbouring area, and 0.0 for any house outside of the neighbouring areas. The lack of scope available here may prove to be too limiting.
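The three-valued text match and the aggregate score can be sketched together in Python. The neighbour table, the function names and the example features are hypothetical. Dividing by the total importance is also an assumption, made so that the result reads as a fraction in the style of the '2.8 out of 5.1' example given earlier.

```python
# Importance values quoted for the Choosing a House application.
IMPORTANCE = {"essential": 1.0, "desirable": 0.6, "nice_to_have": 0.3}

# Hypothetical neighbour table; the source does not describe how it is stored.
NEIGHBOURS = {"Lancaster": {"Heysham", "Galgate"}}


def location_goodness(wanted, actual, neighbour_score=0.6):
    """Three-valued goodness for a text feature such as location."""
    if actual == wanted:
        return 1.0
    if actual in NEIGHBOURS.get(wanted, set()):
        return neighbour_score
    return 0.0


def match_score(features):
    """Aggregate importance x goodness, divided by the total importance.

    `features` is a list of (importance_level, goodness) pairs.
    """
    total = sum(IMPORTANCE[level] for level, _ in features)
    achieved = sum(IMPORTANCE[level] * goodness for level, goodness in features)
    return achieved / total if total else 0.0


# A house in Galgate when Lancaster was requested, priced slightly over
# budget (goodness 0.5), with a nice-to-have garden that is present.
house = [
    ("essential", location_goodness("Lancaster", "Galgate")),
    ("essential", 0.5),
    ("nice_to_have", 1.0),
]
print(round(match_score(house), 2))
```

Because each goodness value is multiplied by its importance, a missing nice-to-have feature lowers the score far less than a poor match on an essential one.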
CHAPTER 3. DESIGN
Design 3 ____________________________________
In order to create an in-depth set of requirements for the system, a complete understanding of the situation and manner in which the system will be used must first be obtained. In addition, the design will rely significantly on my own knowledge and insight into which University features will be beneficial to include and which would prove unnecessary and merely serve to distract the user's attention from relevant data. As I have experience of shortlisting Universities myself (less than 3 years ago), it may be argued that this provides additional valuable insight into which features a potential applicant would find meaningful in an application such as this.
Another key factor which needs to be considered is the users themselves. Although at first it might seem apparent that the system should cater for college and Sixth Form leavers, from personal experience it appears that parents may also play an important role when their son or daughter considers the prospect of University. In light of this, it must be presumed that parents may also find the proposed system of interest and make use of it on behalf of their son or daughter. This raises interesting design considerations, as users will evidently have varying levels of computer literacy; some users will be comfortable using computers and have an instinctive sense of where icons and navigation buttons should be, while other users require key areas of the interface to be large and clearly labelled before they begin to feel comfortable using an application. Stereotypically, younger generations tend to be more computer literate than older generations, perhaps due to differing levels of computer interaction during early years of learning, as highlighted by Bunz et al. (2007). Because of this, an appropriate balance between intuitiveness and elegance must be achieved when designing the user interface for the Which University search application, allowing all potential users to operate the system with ease.
3.1 Requirements
Based upon an understanding of the situation in which the proposed system will be used, a set of requirements has been devised. Each of these requirements should be met for the solution to be considered a success. A system design requires an overall outcome which can be used as a starting point for deriving more precise requirements:
Develop a system which will aid potential applicants in selecting a University which suits their needs. As many features of a University are fuzzy in nature, the system should balance features off against one another and produce a list showing the best choices in rank order.
Table 3.2: Derived Requirements List for High Level Requirement B.
B. Construct a University profile for each user from the specific requests which they provide.
B1. Present the user with a data input screen which displays a variety of relevant University features.
B2. Allow the user to state the significance of each feature to them using fuzzy importance levels such as 'slightly important', 'very important' etc.
B3. Provide secure data input methods such as drop-down boxes and radio buttons to prevent the introduction of lexical errors into the system.
B4. Choose appropriate default importance values for University characteristics to avoid results being adversely affected should the user choose to ignore certain features.
Table 3.3: Derived Requirements List for High Level Requirement C.
C. Ensure that the system stores an appropriate amount of University data.
C1. Ensure that the system stores enough data about each University to make the system worthwhile.
C2. Ensure that sensitive data, and any data which may infringe the Data Protection Act (1998), is not stored within the system.

Table 3.4: Derived Requirements List for High Level Requirement D.
D. The system must generate result sets using a fuzzy logic approach to balance requested features off against one another.
D1. Ensure that Universities are not excluded from the results set for not satisfying certain search criteria.
D2. Ensure that each University feature is given an appropriate result weighting depending on its pre-decided level of importance.
D3. The system should calculate a goodness of match score for each University, determined by how closely it matches the user created profile.

Table 3.5: Derived Requirements List for High Level Requirement E.
E. Display all results to the user in a clear and consistent manner.
E1. All University results should be displayed in descending order of goodness of match score so that the user's most suitable Universities are displayed at the top.
E2. To prevent cluttering of data, only a subset of important University features should be initially displayed. A full record of University features should be easily accessible should the user wish to enquire further.
Table 3.6: Derived Requirements List for High Level Requirement F.
F. Provide additional features to the user allowing them to further investigate and evaluate Universities once the results have been displayed.
F1. For every University result generated provide a hyperlink to the University's official website.
F2. Once all results have been displayed, a free-text search facility should be provided to allow users to filter the results by words/values of their choice. It should be possible for the user to search all fields or let them specify a certain field to filter by.
F3. Provide some style of rating system to allow system users to rate Universities and view, at a glance, the overall user opinion of each University.
F4. The system should provide a section where users can comment and exchange their views on Universities.
Table 3.7: Derived Requirements List for High Level Requirement G.
G. Supply an interface for a system administrator to perform maintenance on Universities without having to access low level database code.
G1. Provide an interface method, as either part of the website or via a desktop application, to allow the system administrator to add, remove and edit Universities.
G2. The administrator interface must be securely protected, via password or other means, to prevent unauthorised alteration of database records.

Table 3.8: Derived Requirements List for High Level Requirement H.
H. The system must be made public via a website to increase its overall coverage.
H1. The system should utilize appropriate system architecture to facilitate smooth web integration.
H2. Security must be taken into consideration with the system being distributed globally. All passwords must be stored securely server side and preventative measures taken in the PHP code to thwart potential SQL injection attacks against the MySQL database.
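Requirement H2 can be illustrated with a short sketch. It is written in Python with the built-in sqlite3, hashlib and hmac modules rather than PHP and MySQL, so it shows the general technique only; the table name, column name and function names are hypothetical. In PHP, prepared statements play the same role as the bound parameter used here.

```python
import hashlib
import hmac
import os
import sqlite3


def hash_password(password, salt=None, iterations=100_000):
    """Derive a salted PBKDF2 hash so the plain password is never stored."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest


def verify_password(password, salt, expected, iterations=100_000):
    """Recompute the hash and compare it in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)


def find_university(conn, name):
    """Look up a University by name using a bound parameter, not string concatenation."""
    return conn.execute("SELECT name FROM university WHERE name = ?", (name,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE university (name TEXT)")
conn.execute("INSERT INTO university VALUES ('Lancaster')")

# The injection attempt is treated as an ordinary (non-matching) name.
print(find_university(conn, "Lancaster"))
print(find_university(conn, "x' OR '1'='1"))
```

Because the user's input is bound as a value rather than spliced into the SQL text, a crafted string cannot change the structure of the query.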
Fig. 4: Use Case diagram depicting the User's and Administrator's interaction with the system.
1. Login
3. Edit University
4. Delete University
8. Comment On Universities
3.3.1 MySQL
The database used as part of the Which University? application will be stored using MySQL 5.0. MySQL is an open source relational database management system (RDBMS) based on SQL (Structured Query Language). It is widely used in web based applications due to its flexibility and straightforward integration with PHP. Many people associate databases with desktop products such as Microsoft Access, whereas MySQL is a client-server system: users connect to a specific database on the server and issue requests. MySQL has many features which make it suitable for this system. Its stability is perhaps its strongest feature, and this has been proven over the last ten years. MySQL is also multithreaded, which allows multiple connections to be served at the same time. It supports a broad subset of the ANSI SQL-92 standard, and the open source edition of the MySQL database management system costs nothing. Therefore, when selecting which method of storage the system would use, it seemed appropriate to use MySQL, as it satisfies all of the requirements as well as being free to implement.
3.3.2 PHP
The web based user interface for the system will be created using PHP. PHP is a widely-used general-purpose scripting language that is especially suited to web development and can be embedded into HTML. The overall concept of PHP appears similar to that of JavaScript in the respect that it can be embedded into traditional HTML code. However, unlike JavaScript running in the browser, PHP is executed on the server, and moving much of the processing away from individual computers has some important benefits.
WORKLOAD
The user's computer is not required to do much of the processor-intensive work. This can speed page load times and generally ease the browsing experience. PHP is also relatively lightweight for the server to execute. This is particularly suited to the Which University? system, as it means users with lower performance computers should not notice significantly reduced performance.
DYNAMIC CONTENT
One of PHP's strongest features is its contribution to the creation of dynamic websites that can react to user input. Scripts can store user inputs in variables that are passed on to other code to execute. The code can then query databases to draw out data and perform comparisons. Through these means, PHP can take user input and change a website in response, unlike hard coded HTML, which is generally static. The ability of PHP to offer this is invaluable for the creation of the proposed system. The PHP code is executed exclusively by the server and therefore requires no action from the end-user. The server used to host the web based system must have PHP installed, but once the system is uploaded it can make use of dynamic features without effort by the user. The server parses and executes the code, and then returns properly formatted HTML that the user's browser can interpret.
Logic Processor
all results from the MySQL database are passed into the logic processor phase. This phase is not a separate PHP file but a section of code within the user interface that performs calculations and logic on the returned results.
[Figure: entity relationship diagram linking the University, Courses, Results, IPs and Comments entities through "Has" and "Teach" relationships.]
Fig. 6: Entity relationship diagram for the Which University system.

As can be seen from the diagram above, a University teaches many courses, and a course can be taught at many Universities. This many-to-many relationship requires an intermediate table that takes the primary keys of both the University and Courses tables.
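The intermediate table described above can be illustrated with a minimal sketch. It is written in Python with an in-memory SQLite database purely so that it is self-contained; the project itself uses PHP and MySQL. The table names, column names and sample rows are assumptions made for illustration and are not taken from the actual schema.

```python
import sqlite3

# In-memory SQLite stands in for the project's MySQL database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE university (uni_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
-- Intermediate table: one row per (University, Course) pairing.
CREATE TABLE university_course (
    uni_id INTEGER NOT NULL REFERENCES university(uni_id),
    course_id INTEGER NOT NULL REFERENCES course(course_id),
    PRIMARY KEY (uni_id, course_id)
);
""")

# Hypothetical sample data.
conn.executemany("INSERT INTO university VALUES (?, ?)",
                 [(1, "Example University A"), (2, "Example University B")])
conn.executemany("INSERT INTO course VALUES (?, ?)",
                 [(1, "Computer Science"), (2, "History")])
conn.executemany("INSERT INTO university_course VALUES (?, ?)",
                 [(1, 1), (1, 2), (2, 1)])
conn.commit()


def courses_taught_at(conn, uni_id):
    """Return the titles of all courses taught at one University."""
    rows = conn.execute(
        "SELECT c.title FROM course c "
        "JOIN university_course uc ON uc.course_id = c.course_id "
        "WHERE uc.uni_id = ? ORDER BY c.title", (uni_id,))
    return [title for (title,) in rows]


def universities_teaching(conn, course_id):
    """Return the names of all Universities that teach one course."""
    rows = conn.execute(
        "SELECT u.name FROM university u "
        "JOIN university_course uc ON uc.uni_id = u.uni_id "
        "WHERE uc.course_id = ? ORDER BY u.name", (course_id,))
    return [name for (name,) in rows]
```

The composite primary key on the intermediate table prevents the same University and course pairing from being stored twice, and the same join can be followed in either direction.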
[Figure: table diagrams for the University, Courses, Results, IPs and intermediate tables. Legible field names include ID, Uni ID, Course ID, IP, Area, Surrounding Regions, Count, Votes, Points and Match, together with statistical fields such as Research % Mark, Grad Employ League Position, Grad Employ % Mark, Teaching Excellence League Pos, Starting Salary League Pos, Highest Starting Salary, Best For Sport and Most Applications.]
These diagrams provide a precise description of the contents of the tables and the relationships between them, but it is also important to note how the data within the tables will be used by the other sections of the system. First, and perhaps most important, is the University table, which lists all the relevant information about a particular University. The table contains a unique Uni ID as its primary key. Second to this is a group of fields which describe general features of the University,
such as its name, location, campus layout and its current league position. These were deemed the features likely to be of greatest interest to potential users, so they are also included in the feature shortlist which is viewed first once results are returned. Next in the table are features which could be classed as available facilities at the University; these fields contain only yes (available) or no (not available) values and consist of student parking availability, accessible sports facilities and the availability of en-suite accommodation. These features are slightly more personal, as their importance will vary depending on factors such as whether or not the user wishes to take a car. The remaining fields in this table are statistical and are grouped into league position and % mark pairs. An example of this would be the University's overall research league position and the percentage of excellence mark it attained.

The Courses table is somewhat simpler than the University table above and consists of only the course's unique ID, which is the primary key, and the course title. As Universities and Courses share a many-to-many relationship, an intermediate table was needed to link them together; this intermediate table consists solely of the primary keys of both the University and Courses tables (Uni ID and Course ID).

The Results table is used to store the current score for each University, as voted for by the users. The first field in this table is the ID of the University to which the score relates. There is also a votes field, which contains the total number of users who have cast their vote, and a points field, which holds the total number of points this specific University has scored.

The IPs table has the purpose of restricting users to only one vote on each University, which, as previously stated, was deemed necessary to prevent spamming of votes from adversely affecting the overall user rating of a University.
The first field in this table is simply an ID number, which is also the primary key. The University ID number is the next field along and is used in combination with the IP field, ensuring that each user may only vote once on each University.

Finally there is the Comments table. Its purpose is to hold all of the data regarding comments which users have provided about each University. Again, the primary key of this table is just a unique ID number. The next field along is the University ID of the University for which the comment was left; this is used to filter the table so that only the comments relating to a given University are displayed for that University. The name, email and message fields in this table contain the main body of the comment and consist of the data which the user has entered.
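The one-vote-per-University rule described above could be enforced along the following lines. This is only a sketch in Python with SQLite, not the project's PHP/MySQL code; the table and column names, the uniqueness constraint and the cast_vote helper are all assumptions made for illustration. It also treats an IP address as a proxy for a user, as the text does.

```python
import sqlite3

# In-memory SQLite stands in for the project's MySQL database (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE results (
    uni_id INTEGER PRIMARY KEY,
    votes  INTEGER NOT NULL DEFAULT 0,
    points INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE ips (
    id     INTEGER PRIMARY KEY AUTOINCREMENT,
    uni_id INTEGER NOT NULL,
    ip     TEXT NOT NULL,
    UNIQUE (uni_id, ip)  -- one vote per IP address per University
);
""")


def cast_vote(conn, uni_id, ip, stars):
    """Record a vote; return False if this IP already voted on this University."""
    try:
        with conn:  # commits on success, rolls back if an error is raised
            conn.execute("INSERT INTO ips (uni_id, ip) VALUES (?, ?)", (uni_id, ip))
            conn.execute("INSERT OR IGNORE INTO results (uni_id) VALUES (?)", (uni_id,))
            conn.execute(
                "UPDATE results SET votes = votes + 1, points = points + ? "
                "WHERE uni_id = ?", (stars, uni_id))
        return True
    except sqlite3.IntegrityError:
        # The UNIQUE constraint rejected a repeat vote; nothing was changed.
        return False
```

Letting the database reject the duplicate keeps the check and the insert in one step, so two simultaneous requests from the same address cannot both succeed. The average rating for display would then be points divided by votes.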
perspective of development, the system will focus on the aesthetics of the interface, as the system itself is intended for use as a general social tool rather than for heavy business purposes, where aesthetics would not hold as much importance. With user interfaces it is important to adopt a consistent layout between pages, and this perhaps holds greater significance when designing an interface which may be used by users with lower computer literacy levels. More competent users may develop an almost innate understanding of where certain buttons should naturally be, even if the layout isn't always consistent. This instinct may not be present in users with lower computing competency, who may feel more comfortable with larger, text-labelled buttons and layouts which are regular between pages, even if this is not always practical.

The Which University web interface is broken down into three major sections. The first consists of the general site navigation pages, which allow users to view a description of the project and/or send an email with questions or information. There is also a feedback form to gather electronic user feedback about HCI issues and the application in general. This may prove to be a valuable asset in quickly and effectively identifying any HCI shortfalls which may not have been considered.

The second interface section presents an input form that allows the user to enter the importance ratings they attach to several of the University features mentioned previously. The most appropriate input method for each data type must be taken into consideration. For example, combo boxes have the advantage of allowing a large number of values to be offered discreetly without overloading the user's senses. The disadvantage is that only one of these values can be displayed at any one time, with the rest hidden until the box is opened.
The final section of the interface relates to the layout of the results returned by the database queries and the fuzzy logic calculations. A balance between displaying relevant information and not overloading the initial subset needs to be achieved when considering which fields should appear in the preview. With this in mind, it was decided to avoid displaying a significant amount of the Universities' statistical information, in favour of fields which were deemed, generally, to be of importance to the user.
As can be seen from the proposed web interface above, the fields which contain a large number of potential values were placed in drop-down boxes. Although only one of these values can be displayed at once, this approach does at least prevent the input form from being overpowered by a large number of values for one field and keeps the layout of the input form aesthetically pleasing and consistent. Radio buttons were preferred where fields had only a small number of potential values. This approach meant that all of the values could be displayed to the user at one time, whilst the relatively few possible values ensured that the input form remained aesthetically pleasing and practical to use.
Perhaps the most prominent aspect of the above interface is the decision to alternate the colours of the result lines. This can be achieved using a fairly simple algorithm, but its effect should ensure that each University result remains clear and easily viewed, especially when result sets become large and (potentially) overwhelming.

The main focus of the entire project is the application of fuzzy logic result calculations and closeness of match scores. The % match column in the above interface demonstrates how the calculated match will look once a score has been computed for each University. As this is the most significant of all the columns displayed, the bright yellow colouration makes it easily distinguishable and should immediately draw a user's attention towards it.

The specific summary set of University features was selected because of its generally higher relevance to a wider audience of users. Some of the other University features are quite specialised and almost certainly will not be of great interest to users unless they are especially interested in the specific University. The statistical information about each University may appear somewhat daunting to some users and may overload their memory should these statistics be visible in one table for every University on record.

The user star rating system for the Universities was integrated into this area, as it allows the user to assess the current user rating at the same time as viewing the preview of the University's features. Their opinion on both of these factors should be enough for them to decide whether or not to view the University in more detail. From the initial results there is a link which allows the user to display all of a University's features in a full-screen table, as well as the option to visit the University's official website.
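One way the alternating row colours could be produced is to pick the colour from the row's position in the result list. The sketch below is in Python rather than the project's PHP, and the function name, colour values and row layout are illustrative assumptions only.

```python
from html import escape


def render_rows(universities, colours=("#ffffff", "#e8eef7")):
    """Build one HTML table row per (name, match) pair, alternating colours."""
    rows = []
    for index, (name, match) in enumerate(universities):
        # Even-numbered rows take the first colour, odd-numbered rows the second.
        colour = colours[index % len(colours)]
        rows.append(
            f'<tr style="background-color: {colour}">'
            f"<td>{escape(name)}</td><td>{match:.0f}%</td></tr>"
        )
    return "\n".join(rows)
```

Because the colour depends only on the row index, the striping stays correct however the results are sorted or filtered before rendering.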
CHAPTER 4. IMPLEMENTATION
4 Implementation
4.1 Method of Implementation
The system was implemented over a number of weeks as individual modules. The first stage was to create the database, as this was always likely to be the centre point of the system, with each of the other modules relying on its existence and data population in order to function as intended. Once the database and its tables had been implemented successfully in a local environment, they were uploaded onto the web server. With the database online and fully deployed, the focus moved on to the implementation of the PHP pages, which would be used primarily to communicate with the database.

To begin with, much of the fuzzy logic processing module was left out of the PHP pages, as I wanted first to ensure that basic communication and functionality were being carried out as intended. Once the database and the basic shell of the PHP communication pages were in place and basic connection and data retrieval had succeeded, it was time to begin including the more complex fuzzy logic calculation modules. The fuzzy logic modules are an integrated part of the PHP communication pages, so adding them back into the code did not cause any significant problems.

Once the PHP interfaces were successfully querying the database and applying fuzzy logic calculations to the returned result set, the majority of the implementation was complete. As the system is aimed at being a web-based tool, where presentation plays a vital role, it was decided to implement a surrounding website for the application to run within. The implementation of each of these sections will be discussed in more detail under the headings which follow.
The creation of the database itself was exactly as specified in Design (section 3). Below is a more detailed UML diagram of the data types and the interactions between tables.
comment are required to be unique, and identical comments by users are perfectly acceptable, no validation is required for database insertion in this section.
Accessible Sports Facilities: 0.6 Student Parking Available: En-suite Accommodation: High Research Rating: Standard Of Final Degree: Likelihood Of Graduate Employment: 0.6 0.6 0.3 0.6
0.3
Calculating The Overall Weighting Of A Feature

Overall Feature Weighting = Closeness of Match Score x Importance
This is carried out for each of the University features. Once this has been completed the overall match of each University can be calculated by taking the cumulative Overall Feature Weighting for all features, dividing this by the maximum possible cumulative Overall Feature Weighting and then multiplying this by 100. This will generate a percentage of how well each University matches the criteria which the user has input.
Overall University Match = (Cumulative Feature Weighting Score / Maximum Possible Cumulative Feature Score) x 100
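The two formulas above can be combined into a short worked sketch. It is written in Python for illustration rather than in the project's PHP, and the function name and sample values are hypothetical. It assumes that the maximum possible score for a feature is reached when its closeness of match is 1.0, so the maximum cumulative score is simply the sum of the importance values.

```python
def overall_match(features):
    """Return the percentage match for one University.

    features is a list of (closeness_of_match, importance) pairs,
    each value lying between 0.0 and 1.0.
    """
    # Overall Feature Weighting = Closeness of Match Score x Importance
    achieved = sum(closeness * importance for closeness, importance in features)
    # Assumption: a perfect match scores 1.0 on every feature.
    maximum = sum(importance for _, importance in features)
    if maximum == 0:
        return 0.0  # no feature carries any importance, so nothing to match on
    return achieved / maximum * 100


# Hypothetical profile: three features with importance 0.6, 0.6 and 0.3.
example = [(1.0, 0.6), (0.4, 0.6), (0.0, 0.3)]
print(round(overall_match(example), 2))  # (0.6 + 0.24 + 0.0) / 1.5 x 100 = 56.0
```

Dividing by the maximum attainable score keeps the result on a 0 to 100 scale regardless of how many features the user rates as important.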
lower boundary is usually 10% of the selected value. However, for fields similar to the University league position, anything better than the specified position is awarded a value of 1.0 (completely suitable): for a league placing request of Top 50, a University in position 45 is just as valid as one in position 12. The goodness of match score calculated for each relevant University feature is multiplied by its perceived importance level (0.0 to 1.0), which gives the overall weighting that the feature carries when computing the overall suitability of the University for the user. Below is an example of a fuzzy calculation for one of the numerical (statistical) features.

For a specified University league position of Top 50, performed on a University in position 56 (worse than the requested level):
Goodness score = 1.0 x (extended upper - actual position) / (extended upper - 50)
Goodness score = 1.0 x (60 - 56) / (60 - 50) = 0.4

For a specified University league position of Top 50, performed on a University in position 41 (better than the requested level):
Goodness score = 1.0

Fig. 11: Numerical Fuzzy Logic Calculation.
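The numerical calculation in Fig. 11 can be expressed as a small function. This is a sketch in Python rather than the project's PHP, with hypothetical names. Treating any position at or beyond the extended upper boundary as scoring 0.0 is an assumption; the source only shows cases inside the boundary.

```python
def league_goodness(actual_position, requested_top, extended_upper):
    """Goodness of match for a league position request such as Top 50.

    requested_top is the requested position (for example 50) and
    extended_upper is the fuzzy outer boundary (60 in Fig. 11).
    """
    if extended_upper <= requested_top:
        raise ValueError("extended_upper must be greater than requested_top")
    if actual_position <= requested_top:
        return 1.0  # at or better than the requested position: completely suitable
    if actual_position >= extended_upper:
        return 0.0  # assumption: outside the extended boundary scores nothing
    # Linear fall-off between the requested position and the extended boundary.
    return (extended_upper - actual_position) / (extended_upper - requested_top)


print(league_goodness(56, 50, 60))  # (60 - 56) / (60 - 50) = 0.4
print(league_goodness(41, 50, 60))  # 1.0
```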
For a specified University area of North East and a University situated in the North West:
Goodness score = 0.6 (as North West is a neighbouring vicinity to North East)

For a specified University area of North East and a University situated in the North East:
Goodness score = 1.0

For a specified University area of North East and a University situated in London:
Goodness score = 0.0 (as London is NOT a neighbouring vicinity to North East)
Fig. 12: Textual Fuzzy Logic Calculation.

There are also occasional instances where the data stored in a particular field is strictly discrete, so fuzzy results cannot be calculated for it. An example of this sort of data is the En-suite accommodation available field. Here the only possible answers are Yes and No, and as a result only crisp goodness of match scores of 1.0 and 0.0 are possible.
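The textual and crisp cases above might be sketched as follows, in Python for illustration rather than the project's PHP. The NEIGHBOURS mapping is a hypothetical stand-in for the stored surrounding regions and contains only the regions named in Fig. 12; the function names are also assumptions.

```python
# Hypothetical neighbouring-region data, limited to the regions in Fig. 12.
NEIGHBOURS = {
    "North East": {"North West"},
    "North West": {"North East"},
    "London": set(),
}


def area_goodness(requested, actual, neighbours=NEIGHBOURS, neighbour_score=0.6):
    """Goodness of match for a textual field with a notion of nearness."""
    if actual == requested:
        return 1.0
    if actual in neighbours.get(requested, set()):
        return neighbour_score  # neighbouring vicinity: partial match
    return 0.0


def crisp_goodness(available):
    """Goodness of match for a Yes/No facility the user has asked for."""
    return 1.0 if available else 0.0


print(area_goodness("North East", "North West"))  # 0.6
print(area_goodness("North East", "North East"))  # 1.0
print(area_goodness("North East", "London"))      # 0.0
```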
The navigation of this area closely resembles that of the user interface once results have been returned. The administrator can order Universities by any of the field headers and is able to filter the list of Universities using the free-text filter, which can be configured to filter by partial word match or by whole word match only.
This page is what greets users once they access the site. From here they have the option to view a brief summary of the motivations behind the development of the project, leave feedback should they be testing the application, log in as an administrator to perform required maintenance, or start the University search application.
Fig. 14: Which University User Data Entry Form.

Once the user starts the application they are presented with a web form where they can enter their preferences for how important each of the University features is to them. Once the user has completed the form they are required to submit it to the server, where the values they have selected are stored in session variables until the session expires.
Fig. 15: Which University Search Results.

Having submitted their choices via the web form, the user is presented with their results in the format shown in Fig. 15. As described in the usage scenario (5.1), the user has a variety of options at this point. For example, the user may choose to narrow the results down further by using the custom filter above the results set; this matches free text and can be configured to search for partial word matches or whole words only.
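The difference between the partial word and whole word modes of the custom filter can be illustrated with a short sketch. It is written in Python rather than the project's PHP; the function name, the row layout and the case-insensitive matching are assumptions, not details confirmed by the source.

```python
import re


def filter_results(results, term, whole_word=False):
    """Keep the rows in which any field matches the free-text term.

    results is a list of dicts, one per University (hypothetical layout).
    """
    pattern_text = re.escape(term)  # treat the user's text literally
    if whole_word:
        pattern_text = r"\b" + pattern_text + r"\b"
    pattern = re.compile(pattern_text, re.IGNORECASE)
    return [row for row in results
            if any(pattern.search(str(value)) for value in row.values())]


# Hypothetical result rows.
rows = [
    {"name": "Northfield University", "area": "Midlands"},
    {"name": "Example College", "area": "North East"},
    {"name": "Sample University", "area": "London"},
]

partial = filter_results(rows, "north")                 # first two rows
whole = filter_results(rows, "north", whole_word=True)  # second row only
```

In partial mode "north" matches inside "Northfield", whereas in whole word mode it only matches where "North" stands as a word of its own.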
5.1.4 Summary
A comparison between the usage scenario and the system's original functionality expectations from the example in Design (section 3) shows that no significant changes have been made to the flow of the system. The preview set of data for each University has also stayed very consistent with that specified in the original design. The only real change was the inclusion of a small piece of JavaScript to enable mouse-over functionality for the star rating facility. Figure 16 below shows a final sequence diagram depicting the system in operation during the preceding usage scenario.
Fig. 17: A Model Of The Software Testing Process.

Sommerville's suggestion for testing system usage and operational features is to meet the following criteria [2004: 539]:
1. All system functions that are accessed through menus should be tested.
2. Combinations of functions that are accessed through the same menu should be tested.
3. Where user input is provided, all functions must be tested with both correct and incorrect input.

Having read the opinions of Ian Sommerville, I decided to identify test cases that meet the criteria above. I first isolated the database section of the system and created some tests to manipulate data, to ensure that the database was reading and writing as intended.
MySQL Database Data Manipulation

No. 1
Test Description: Add a new University to the existing database.
Expected Outcome: New University is added to the next new available row.
Result: PASS

No. 2
Test Description: Retrieve all of the Universities stored in the database.
Expected Outcome: Displays every University currently stored.
Result: PASS

No. 3
Test Description: Check that each University in the database has the correct data type for each field.
Expected Outcome: Every University record has the correct data type for each of its fields.
Result: PASS

No. 4
Test Description: Retrieve Universities based on the values they have for each field. This should be tried for each of the fields within the database, e.g. return all Universities which HAVE student parking available.
Expected Outcome: Database should return all Universities whose fields match the search criteria.
Result: PASS

No. 5
Test Description: Test that when a new University is added, its Uni ID field is automatically incremented.
Expected Outcome: The Uni ID field should increment by 1 each time a new University is added.
Result: PASS

No. 6
Test Description: Delete a University from the database.
Expected Outcome: The relevant University should be deleted from the database, leaving nothing behind.
Result: PASS

No. 7
Test Description: Edit each of the fields for any University to ensure that the changes are maintained once they are submitted.
Expected Outcome: The relevant fields should have been updated and these changes permanently maintained.
Result: PASS

No. 8
Test Description: Attempt a query which will return a known result set of 0 Universities.
Expected Outcome: No Universities should be returned by the query.
Result: PASS
As can be seen from the data manipulation tests performed on the database, this area of the system is functioning as intended. Querying the database for records already held was successful, and there were also no problems inserting, deleting or updating stored University records.
In relation to the Which University system, black box testing may also be considered a form of integration testing, as all of the individual components are required to work together in order to meet the goals set by the specification. As shown in Figure 18, Sommerville provides a graphical example of a black box testing model which illustrates effectively how to view the system when conducting black box testing. For my system, the majority of the black box testing is centred on the user interfaces and their interaction with the database. To carry out black box testing on the system, inputs from the user interfaces are chosen as test cases and the outputs recorded.

Data Entry Form Black Box Testing

No. 1
Test Description: Clicking any one of the navigational buttons in the left hand column will direct you to a different section of the site.
Expected Outcome: Clicking the navigational buttons in the left column will direct you to the correct area of the site for the button that you clicked.
Result: PASS

No. 2
Test Description: Upon initially viewing the data entry form, all values are set to the default option.
Expected Outcome: Upon initially viewing the data entry form, all values are set to the default option.
Result: PASS

No. 3
Test Description: Clicking RESET will set the values to their default.
Expected Outcome: Clicking RESET sets the values to their default.
Result: PASS

No. 4
Test Description: Clicking the email link mike@whichuniversityfyp.co.uk opens a new message screen.
Expected Outcome: Clicking the email link mike@whichuniversityfyp.co.uk opens a new message screen.
Result: PASS

No. 5
Test Description: Location, Course and League Position can be selected by clicking on the drop down menu.
Expected Outcome: Clicking the drop down menus on Location, Course and League Position displays a variety of options relating to each feature.
Result: PASS

No. 6
Test Description: Clicking on a radio button selects the chosen value.
Expected Outcome: Clicking on a radio button selects the chosen value.
Result: PASS

No. 7
Test Description: Clicking SUBMIT will forward the user to the results screen and display appropriate results corresponding to the user's selections.
Expected Outcome: Clicking SUBMIT will forward the user to the results screen and display appropriate results corresponding to the user's selections.
Result: PASS
Results Screen Black Box Testing

No. 1
Test Description: The results page lists potentially suitable universities in order of closeness of match, the percentages displayed in descending order.
Expected Outcome: The results page lists potentially suitable universities in order of closeness of match, the percentages displayed in descending order.

No. 2
Test Description: Clicking main menu provides a link to the main menu.
Expected Outcome: Upon clicking main menu, the user is returned to the main menu.

No. 3
Test Description: The filter can be selected by clicking on the drop down menu.
Expected Outcome: Clicking the drop down menu shows the potential filter options.

No. 4
Test Description: Clicking on reset filter resets to no filter.
Expected Outcome: Upon clicking on reset filter, resets to no filter.

No. 5
Test Description: Clicking view all details provides a link to further details about the specific university.
Expected Outcome: Upon clicking view all details, the user is directed to a new screen displaying further details about the specific university.

No. 6
Test Description: Clicking website provides a link to the official website of the specific university.
Expected Outcome: Upon clicking website, the user is directed to a new screen displaying the official website of the specific university.

No. 7
Test Description: Highlighting a particular star on a specific university allows the mouse over to take effect and displays a temporary box depicting click star to vote and the relevant number of stars.
Expected Outcome: Highlighting a particular star on a specific university temporarily displays click star to vote and the relevant number of stars.

No. 8
Test Description: Clicking a specific star related to a specific university allows the user to vote their overall rating.
Expected Outcome: Clicking a specific star related to a specific university brings up a dialogue box showing Thank you for voting.

No. 9
Test Description: Clicking view/submit provides a link to the comments screen.
Expected Outcome: Upon clicking view/submit, the user is directed to a new screen displaying comments and the option to leave comments.

No. 10
Test Description: Clicking back to the top returns the user to the top of the screen.
Expected Outcome: Clicking back to the top returns the user to the top of the screen.

Result: every result entry that survives in the source reads PASS (six of the ten entries are legible).
Comments Black Box Testing

No. 1
Test Description: Clicking back to search results returns the user to the results screen.
Expected Outcome: Clicking back to search results returns the user to the results screen.
Result: PASS

No. 2
Test Description: Clicking back to the top returns the user to the top of the screen.
Expected Outcome: Clicking back to the top returns the user to the top of the screen.
Result: PASS

No. 3
Test Description: Clicking on the name, email and comment dialogue boxes selects them and allows the user to type within them.
Expected Outcome: Clicking on the name, email and comment dialogue boxes selects them and allows the user to type within them.
Result: PASS

No. 4
Test Description: Clicking leave comment submits the comment.
Expected Outcome: Clicking leave comment submits the comment.
Result: PASS
The table above shows that the users who provided feedback tended to class themselves as frequent social users or competent daily users, with some categorising themselves as occasional users. However, due to accessibility issues, all the feedback came from similar users, and further research would need to be carried out to assess the numbers of novice and advanced users who may potentially use the application.
How do you rate the site's aesthetics? (Does it look good and appeal to the eye?)
Aesthetics: Poor 0%, Average 0%, Good 83.33%, Excellent 16.67%
The table above shows that the users who provided feedback considered the aesthetics to be good or excellent.
Can you tell where you are immediately within the site? (Clear title, description, captions, etc.)
Signposting on site: Poor 0%, Average 66.67%, Good 33.33%, Excellent 0%
The table above shows that the users who provided feedback were able to tell where they were within the site, with two thirds rating the signposting as average and one third as good. This is perhaps an area that could be improved upon in further work.
How well would you rate navigation around the site? (Is it easy to find your way around?)
Easy to Navigate: Yes 100%, No 0%
The table above shows all the users could navigate around the site fairly easily. However, again, it should be noted that the sample of users who left feedback was relatively small, and not representative of all potential users.
Are the links to other pages within the site helpful and appropriate?
Links: Poor 0%, Average 16.67%, Good 83.33%, Excellent 0%
The table above shows that the majority of users who provided feedback considered the links to be good, although the table indicates scope for possible improvement through further work.
Does the site operate & look acceptable on your specific internet browser?
Successful Operation on Internet Browser: Yes 100%, No 0%
The table above shows that the website worked effectively on all the internet browsers used by those who left feedback. However, in hindsight, it would perhaps have been helpful to ask users which browser they were using, in order to cover all eventualities.
How would you rate the quantity of data stored about each University? (Sufficient data to make the application worth while?)
The table above shows that the majority of users who provided feedback considered the quantity of data stored about each university sufficient to make the application worthwhile.

How well are the search results organised and displayed?
Search Results Organisation: Poor 0%, Average 33.33%, Good 50.00%, Excellent 16.67%
The table above shows that the majority of users who provided feedback considered the search results to be organised and well displayed, although again, scope for improvement may be possible through further work.
Do the search results appear to be error-free? (Spelling errors etc.)
"Error-free" results: Yes 83.33%, No 16.67%
The table above shows that the majority of users who provided feedback (83.33%) considered the search results to be error-free.
Are there any extra University details which you feel would be of benefit to include?:
The table above shows that the majority of users who provided feedback did not feel that any extra university details were needed.
Do you feel the University preview data returned by each search is sufficient?
Preview Data Sufficient: Yes 100%, No 0%
The table above shows that all the users who provided feedback found the preview data sufficient.
Were there any current features of the application which you found confusing/difficult to use?
Examples of current features found confusing/difficult to use: Yes 100%, No 0%
The table above shows that none of the users who provided feedback found examples of features that were confusing or difficult to use.
Are there ANY new features which you feel would be beneficial for the application to include?
The table above shows that none of the users who provided feedback considered that there were any new features necessary to be added.
On the whole how would you rate the site & application? (1 Poor to 10 Excellent)
Overall: ratings 1, 2, 3, 4, 5 and 6 each received 0%
Mean: 8
Mode: 8
Overall, the table above shows that the users who left feedback considered the application to be worthwhile.
COMMENTS & IMPROVEMENTS

Comments/Improvements:
- Would recommend you make the bit that says 'feedback' and what not bigger cause my eye keeps going to which university final year project
- Bit of a random click on application rather than knowing i was being sent to the survey bit
- Just need more courses
- Maybe order them by % so the reader can just look down the list and not - just a bit easier to navigate?
- Did not understand the row of stars at end of each line of search results. When I advertently clicked on one it thanked me for voting! Why?
- Accommodation spelt wrong on view all details
- More info on grades to get in, and what they expect of you
- More info on POS would be helpful
- When I clicked on a link and then tried to go back I was told the page had expired so I had to put all my data in again!
- Did not understand the overall university user rating. Who are these users?
- Could you make the custom filter caption more user-friendly?
B.1 Present the user with a data input screen which displays a variety of relevant University features.
There is a great variety of input data requested from the user on the data input screen. This was confirmed by the 83.33% of feedback users who stated that no more University features were needed.

B.2 Allow the user to state the significance of each feature to them using fuzzy importance levels such as slightly important, very important etc.
All of the importance levels requested for each of the features demonstrate some degree of fuzziness. Some exhibit more than others, but this is down to the nature of the feature in question.

B.3 Provide secure data input methods such as drop down boxes and radio buttons to prevent the introduction of lexical errors into the system.
There is no free-text data entry option for any of the University features in the data entry form, which completely prevents a user from entering data that would cause lexical errors within the system's calculations.

B.4 Choose appropriate default importance values for University characteristics to avoid results being adversely affected should the user choose to ignore certain features.
I chose appropriate values during the implementation phase; these were frequently the Average Importance option. As no negative comments were made by any of the users during feedback, I can only presume this is satisfactory.

C.1 Certify that the system stores enough data about each University to make the system worthwhile.
The data that is stored covers all aspects of a University, from features such as parking to academic statistics and general social factors. I strongly believe that this requirement was met well.

C.2 Ensure that all sensitive data and that which may infringe the Data Protection Act (1998) is not stored within the system.
I can guarantee that there is no data which could be considered sensitive in any way; all data concerns solely the University and never any individual connected to it.
D.1 Ensure that Universities are not excluded from the results set for not satisfying certain search criteria.
The only filtering of the results set concerns whether or not the chosen course is actually taught at the University; beyond this, no Universities are filtered out.
D.2 Certify that each University feature is given an appropriate result weighting depending on its pre-decided level of importance.
I believe that the importance values assigned to each feature were fair in the context; if this changes in the future it would be very easy to amend these values.

D.3 The system should calculate a goodness of match score for each University determined by how closely it matches the user created profile.
The calculated goodness of match score is probably the motivating feature behind the system, so I am certain that this has been implemented successfully.

E.1 All University results should be displayed in descending order of goodness of match score so that the user's most suited Universities are displayed at the top.
Once results are returned they are always ordered in descending order of goodness of match. The user also has the option to reverse this and display them in ascending order.

E.2 To prevent cluttering of data, only a subset of important University features should be initially displayed. A full record of University features should be easily accessible should the user wish to enquire further.
Judging by the user feedback I received, people were very happy with the subset of University features, as 100% of them said it was satisfactory. Only one click is required to view each University's full record.

F.1 For every University result generated provide a hyperlink to the University's official website.
I can confirm that for every University record in the results there is a direct link to that University's official website.

F.2 Once all results have been displayed, a free-text search facility should be provided to allow users to filter the results by words/values of their choice. It should be possible for the user to search all fields or let them specify a certain field to filter by.
A custom filter was included on the results page to allow users to filter their search results by any free text value they wish. They are also able to filter by specific search fields should they wish.
F.3 Provide some style of rating system to allow system users to rate Universities and view, at a glance, the overall user opinion of each University. A fairly advanced star-based user rating system was integrated into the results section of the system, which allows a user both to view a University's current rating at a glance and to cast their own vote.
F.4 The system should provide a section where users can comment and exchange their views on Universities. A comments section was included, but I believe this is one section that could be improved in the future.
H.1 The system should utilize appropriate system architecture to facilitate smooth web integration. After talking to many people about different possibilities, it was decided to use PHP and MySQL as the system's architecture; this was a choice many people felt happy recommending.
H.2 Security must be taken into consideration with the system being distributed globally. All passwords must be stored securely server side and preventative measures taken to thwart potential MySQL injection attacks with PHP. All passwords are embedded server side within the PHP code to ensure that no user can gain access to them through the HTML source code. Basic measures were implemented to prevent injection attacks, but this is again an area where improvement could be made.
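The behaviour evaluated under D.1 to D.3 and E.1 above can be illustrated with a minimal sketch. It is written in Python for brevity rather than the PHP used in the project, and the feature names, weights and per-feature closeness values are all hypothetical; it is not the system's actual scoring code, only one plausible way to combine weighted closeness values into a percentage and rank by it.

```python
# Hypothetical importance weights per feature (D.2); not the project's real values.
WEIGHTS = {"location": 3.0, "entry_grades": 2.0, "nightlife": 1.0}


def match_score(closeness, weights=WEIGHTS):
    """Weighted average of per-feature closeness values (each 0..1), as a percentage (D.3)."""
    total = sum(weights.values())
    return 100.0 * sum(weights[f] * closeness.get(f, 0.0) for f in weights) / total


def rank(universities, course):
    """Filter only on course availability (D.1), then sort best match first (E.1)."""
    offered = [u for u in universities if course in u["courses"]]
    return sorted(offered, key=lambda u: match_score(u["closeness"]), reverse=True)


# Made-up records purely for illustration.
universities = [
    {"name": "A", "courses": {"Computer Science"},
     "closeness": {"location": 1.0, "entry_grades": 0.5, "nightlife": 0.0}},
    {"name": "B", "courses": {"Computer Science"},
     "closeness": {"location": 0.5, "entry_grades": 1.0, "nightlife": 1.0}},
    {"name": "C", "courses": {"History"},
     "closeness": {"location": 1.0, "entry_grades": 1.0, "nightlife": 1.0}},
]

ranked = rank(universities, "Computer Science")
names = [u["name"] for u in ranked]
```

University A scores zero on one feature but still appears in the results, below B; only C is removed, because it does not teach the chosen course.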
Conclusion 7 ____________________________________
Looking back over the entire project period, I am extremely happy with what I have achieved and with the outcome of the system as a whole.
Firstly, I have gained what I would consider to be an in-depth understanding of fuzzy logic and its benefits and drawbacks. I've had the experience of working on a project from start to finish, using my own initiative and decision making to take the design, development and implementation in any direction which I felt was right. The sense of satisfaction from knowing that it was my own influence and effort which produced the system is very rewarding, and the project has been a worthwhile learning experience on many levels.
Secondly, the system that has been developed works exactly as it was intended and has certainly changed my previous opinions, where I was perhaps guilty of neglecting background research and underestimating the value of a good design. I'm now certain that the system would not have reached anywhere close to its current level of functionality if these two stages had received anything less than 100% of my efforts. The feedback which I received from many different users proved in the end to be vital, as it highlighted some key areas which I will come back to in the Further Work section.
There are areas where I still believe work needs to be done, especially concerning security, which the majority of users may not be aware of. There are, however, areas which I believe I got right first time: had I the opportunity to start the project afresh, I would still have picked the same architecture, and the database would not have differed much at all from its current state.
FURTHER WORK
Whilst using the system and gauging users' opinions through feedback, there was a recurring area where people felt improvement might be necessary. The commenting system which was integrated into the results interface was effective at what it did, but no security features were implemented to prevent completely random individuals from leaving potentially misleading comments when they had no real knowledge of the University at all. This was also a common theme with the user rating system, as some users struggled to understand exactly who the rating system was aimed at. This is completely understandable, as I had originally intended it to be aimed at current students of the University, who would be in a position to cast a reliable vote. However, due to time constraints and other issues taking precedence, I was unable to implement a constraint which only allowed current students at the University to vote or leave comments. Had such a feature been successfully implemented, I think a lot more users would understand and accept its validity as a useful review tool. I got the impression from users that they were a little sceptical of it because anyone could cast a vote, even if only one vote per person. This would definitely be a key area to develop in further work.
References ____________________________________
Bunz, U., Curry, C., Voon, W. (2007) Perceived versus actual computer-email-web fluency. Computers in Human Behavior. Vol. 23 (5): p. 2321-2344
Schaap, Y. (2006) Easy Fuzzy Logic with MYSQL - The end of no results found. Available at: http://www.yvoschaap.com/index.php/weblog/easy_fuzzy_logic_with_mysql_the_end_of_no_results_found/
Sommerville, I. (2004) Software Engineering. 7th Edition. England: Addison Wesley
Zadeh, L. (1965) Fuzzy Sets. Information and Control. Vol. 8 (3): p. 338-353
Appendix ____________________________________
Appendix 1 - FYP Proposal
1. Introduction
This project is designed to aid the user in the task of choosing their desired University. The project design will allow the user to select their preferred choice of University features and characteristics from a web-based GUI. The GUI will then query a pre-constructed database with their selections. The database used will consist of a detailed set of characteristics for each University in the register. Once the query results are obtained they will be returned to the GUI and displayed to the user, with the option of saving the results to file. The database queries will each use a fuzzy logic approach to ensure that the results are returned on a percentage match basis, rather than as discrete yes or no values.
Currently there are similar applications available for different purposes on the web. However, very few are based upon the fuzzy logic query approach, opting instead for a more discrete method which completely rules out search results that do not directly match the search criteria. The proposed solution will need to be easy to use, relatively simple and intuitive, robust and accessible worldwide via the internet using most common internet browsers. To achieve this, testing will need to be carried out to create an efficient system that is able to withstand possible high flows of traffic and deal appropriately with possible misuse.
The report will consist of background research into the area of similar existing web-based database querying applications and the purposes for which they are used. The proposed project section of the report will contain the details of the front-end user interface, the background database and the fuzzy logic queries which will interact between the two. Testing strategies and any security requirements will also be discussed in the proposed project section. The programme of work section includes all of the main stages of the project's development.
This is illustrated by a Gantt chart (page 8) to show the time plan for the completion of each development stage. The resources section of the report contains a list of all the required resources and the reasons for their use. At the end of the report is the reference section, where all material used for reference purposes will be acknowledged.
2. Background
The web-based graphical user interface is designed to allow users of the application to quickly and effectively select any combination of search criteria for their preferred University. It is around these selections that database queries will be constructed.
Database queries are formed from the selections made via the GUI.
Queries are then applied to the database and relevant results are selected.
Query results are obtained and the relevant fields are then projected back to the GUI.
Figure 1 Process of querying the database.
Currently there are web applications which exhibit some of the characteristics related to this proposed project. An example of such an application is Auto Trader's online car search facility. This application presents its users with an intuitive interface which allows people to enter a selection of car attributes, from colour, mileage and price range down to distance from a given postcode. The process of using these attributes to form database queries is very similar to the method proposed in this project. However, one significant difference which sets the proposed project apart from existing applications is the way in which the queries are actually applied to the database and the format in which the results are returned. Having researched and used the Auto Trader system, it became apparent that the results returned were very much discrete: a result was dropped from the set entirely if it did not match the query completely. Whilst in some applications this may be beneficial, in this case it would be possible for a car to match every other attribute selected by the user but be dropped completely from the result set due to it being a few miles outside of the postcode range, or a few miles over or under the desired mileage. Although the method of database querying that the Auto Trader application uses is particularly discrete in its nature, the majority of the web integration and the transition of data between graphical user interface and database are very similar in structure to the proposed system.
Figure 2 User interface for Auto Trader's online car search application.
Another current application which I researched quite closely concerned the comparison of cruising boats. This application was developed by a collector/designer of the boats whose goal was to construct an accurate template of the critical variables that go into a cruising boat and then search the database for boats fitting this template. The database proved to be an excellent tool for storing information and calculating the various ratios and performance parameters required, but problems were met when attempting to construct the Ideal Cruising Boat template. The problem originated from using traditional logic statements to sort the database. The database program was effective at sorting and at being queried for data within a discrete range, but these crisp logical terms totally exclude all boats outside of the range selected. In reality a value moderately less or greater than the crisp limits might be good enough for at least some consideration. Even the boats that pass through the filters are not easily comparable, since they are all ranked the same. Again, in reality, values closer to the midpoint of a range were often preferred by designers, at least as a starting point, and should be scored higher than those at the edges. This is where the idea of fuzzy logic querying came into prominence.
Fuzzy logic replaced the familiar crisp logical statements such as "greater than" and "less than" with linguistic statements such as "close" or "very close". Without rigid crisp logic boundaries, these "fuzzy logic variables" were used to blur the edges of a logical set and allow each member in the set to be ranked individually. This was how the developer of the Ideal Cruising Boat application solved the problem of totally excluding all boats outside of the range, and it is around this querying concept that the proposed Which University? application will be based.
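The difference between a crisp range test and a graded "closeness" value can be sketched in a few lines. The example below is an illustration only, written in Python: the triangular shape, the `tolerance` parameter and the boat lengths are assumptions chosen for the sketch, not details taken from the Ideal Cruising Boat application. It scores the midpoint of the preferred range highest and lets values just outside the range keep a reduced, non-zero score.

```python
def crisp_match(value, low, high):
    """Traditional range test: a value is either in or out."""
    return low <= value <= high


def closeness(value, low, high, tolerance):
    """Triangular membership: 1.0 at the midpoint of [low, high], falling
    linearly to 0.0 at low - tolerance and high + tolerance."""
    mid = (low + high) / 2.0
    half_width = (high - low) / 2.0 + tolerance
    if half_width <= 0:
        return 1.0 if value == mid else 0.0
    return max(0.0, 1.0 - abs(value - mid) / half_width)


# Hypothetical preference: boat length between 30 and 40 ft, with 5 ft of tolerance.
lengths = [35, 40, 42, 50]
crisp = [crisp_match(v, 30, 40) for v in lengths]
fuzzy = [round(closeness(v, 30, 40, 5), 2) for v in lengths]
```

Here the 42 ft boat fails the crisp test outright but still receives a closeness of 0.3, so it can be ranked below the 35 ft and 40 ft boats instead of disappearing from the results.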
The graphical user interface may also provide the option to save a set of results for a particular query to file. This would be of use in the event that a user wishes to view the result set at a later time, perhaps to show someone else or to compare the results against another set of selection criteria made by another member of the family.
3.2 Database
The database constructed will consist of each UK University's characteristics, features and facilities. There will also be a separate GUI constructed to allow an administrator to add, delete and amend University entries in the database. It should be ensured that the database is kept consistent and that no duplicate records are stored. For data protection reasons it is essential that the database only stores information about each University that is relevant and required by the user.
3.4 Security
Security for this type of application is perhaps not as important as it is for other types of software. The main security aspects to consider will, as mentioned, concern the Data Protection Act. It is essential to ensure that only completely necessary data about each University and its staff is stored within the database. If for any reason data needs to be stored in the database for administration purposes (such as staff contact numbers), it must be ensured that users of the application can in no way gain access to this information.
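One way to keep administrative data away from ordinary users is for every user-facing query to select from an explicit list of public columns and to bind user input as a parameter rather than concatenating it into the SQL. The sketch below is a minimal illustration under stated assumptions: it uses Python's built-in `sqlite3` module in place of the PHP and MySQL proposed for the project, and the table name, column names and sample row are hypothetical.

```python
import sqlite3

# Hypothetical schema: staff_phone stands in for administrative data
# that must never reach ordinary users.
PUBLIC_COLUMNS = ("name", "city", "website")

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE university (name TEXT, city TEXT, website TEXT, staff_phone TEXT)"
)
conn.execute(
    "INSERT INTO university VALUES (?, ?, ?, ?)",
    ("Example University", "Example City", "http://example.org", "01234 567890"),
)


def public_search(conn, city):
    """Return only whitelisted columns for Universities in the given city."""
    # The column list comes from a constant, never from user input;
    # the user-supplied value is bound through the ? placeholder.
    sql = f"SELECT {', '.join(PUBLIC_COLUMNS)} FROM university WHERE city = ?"
    return conn.execute(sql, (city,)).fetchall()


rows = public_search(conn, "Example City")
injected = public_search(conn, "x' OR '1'='1")
```

Because the input is bound as a value, the second call is treated as a search for a city with that literal name and returns no rows, and neither call can return the `staff_phone` column.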
3.5 Testing
Thorough testing will need to be carried out to ensure the system is reliable and functions correctly. This will include trying to break the software to test its error handling capabilities. Unexpected query results need to be eliminated, as errors could lead to problems within the system and sensitive information could be unintentionally projected to the user, which is obviously a very serious issue.
Testing the accuracy of the query results will be the main area of testing within the application. This is because the fuzzy logic based approach will be the most complicated aspect and the area where problems are most likely to arise. Problems encountered could range from a query throwing an exception and refusing to run to a query running successfully but, for one reason or another, returning a null data set. These types of errors will be easy to spot and, hopefully, their cause easy to identify. However, it may also be possible for less obvious errors to manifest within the application. The data set returned from a query could look valid and appear to have returned successfully while, due to problems within the fuzzy logic mathematics and implementation, the percentage matches have not been applied accurately. This would result in a data set of Universities being returned which did not relate to the search criteria from which the database query was formed. These types of errors may be more difficult to pinpoint within the application.
Consideration will also need to be given, and possibly tests carried out, concerning the application's expected internet traffic once it has become fully web integrated. Not allowing for enough demand will lead to major slowdown of the program and will probably, in some situations, cause the application to hang. After this testing has been carried out, I will test the system with a range of different users.
Testing will start with advanced users who are more familiar with this type of system and then move on to novice users who have little or no experience of using such systems. The aim of the tests will be to assess the simplicity and intuitiveness of the application along with its complexity in delivering desirable results. Tests with neutral users will allow me to assess the balance between simplicity and complexity which will, in turn, allow me to judge whether the application can be used efficiently by the widest possible range of users. Similarly, it will be important to test the application using individuals from the two target age groups (Sixth Form/College students and parents). On average, the gap in computer competence between these two generations is usually quite large. It must therefore be ensured that tests are carried out to assess whether the interface used by the system is clear and understandable to both groups of target user.
Failure to do so may mean that the application is unknowingly used in an incorrect fashion. This again has the potential to introduce errors and bugs into the software.
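The less obvious errors described in this section, where a result set looks valid but the percentage matches are wrong, lend themselves to simple invariant checks. The sketch below, in Python, is one possible approach rather than a planned part of the project: `percentage_match` is a hypothetical stand-in for whatever scoring function the application ends up using, and the three checks (a perfect profile scores 100%, a complete mismatch scores 0%, and improving one feature never lowers the score) are assumptions about how a correct score should behave.

```python
def percentage_match(closeness_values, weights):
    """Hypothetical scoring function: weighted average of closeness values (0..1) as a percentage."""
    total = sum(weights)
    return 100.0 * sum(w * c for w, c in zip(weights, closeness_values)) / total


def check_invariants(score_fn, weights):
    """Return a list of descriptions of any violated invariants (empty if none)."""
    n = len(weights)
    problems = []
    if abs(score_fn([1.0] * n, weights) - 100.0) > 1e-9:
        problems.append("perfect profile does not score 100%")
    if abs(score_fn([0.0] * n, weights)) > 1e-9:
        problems.append("complete mismatch does not score 0%")
    base = [0.5] * n
    for i in range(n):
        better = list(base)
        better[i] = 1.0  # improve a single feature
        if score_fn(better, weights) < score_fn(base, weights):
            problems.append(f"improving feature {i} lowered the score")
    return problems


problems = check_invariants(percentage_match, [3.0, 2.0, 1.0])
```

An empty `problems` list does not prove the scoring is correct, but a non-empty one points directly at the kind of silent miscalculation that would otherwise be hard to pinpoint.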
4. Programme of Work
The main parts of the programme of work can be broken into:
1) Familiarise self with tools and concepts. This will involve familiarising myself with database concepts and querying using both MySQL and the fuzzy logic querying approach planned for some of the less discrete search factors. A good knowledge of certain concepts from the Artificial Intelligence field would also be a benefit, along with the mathematical background behind fuzzy logic. Developing a good understanding of how database queries can be generated from information selected from a web-based GUI will also be fundamental to the application's design. It will also be important that I familiarise myself with the process of uploading an SQL database into web space, along with any important requirements and drawbacks connected to this. This will be essential if the application is to be implemented globally on the internet.
2) Conduct studies into potential user interface designs and estimated web traffic for a web-based version of the application. A study into a suitable user interface must be conducted to ensure that the final version is suitable for use by both target age groups. It is vitally important to develop a suitable front-end interface at this stage, as this is what the users will be interacting with throughout their usage of the application. The best method of evaluating what would constitute a suitable interface would be to show members of each target age group a wide variety of similar existing interfaces and allow them to pick out areas of each layout which they find easy to use. Tables can then be constructed from the results, which should allow me to strike a good balance between any differences in layout preferences that may be apparent between the two generations of target user. A study must also be conducted to estimate the amount of web traffic that will be accessing the application and how it would behave under varying levels of demand. A feasibility study will also need to be conducted to assess whether or not it will be possible to achieve the level of performance that is required.
3) Design Software. This covers any code which needs to be developed and written for the front-end graphical user interface, the background database and the fuzzy logic queries which connect the two.
5) Testing with Further Development. Incremental tests are carried out during the development phase to remove noticeable bugs and potential problems. As the software approaches its final form, tests with users should be carried out to ensure that the system is appropriate.
6) Detailed Testing. Thoroughly test the software to evaluate the application and give the chance to remove any final bugs.
7) Finish Software & Write User Manual. Create the final version of the software and write all user manuals needed for operation of the application.
8) Implement Software Globally Via The Internet. Upload the SQL database and user interface into appropriate web space and ensure that the database is still querying correctly and that the results returned match the search criteria. Also ensure that the operation of the application is still smooth and efficient now that it is no longer located locally.
9) Finish Report
5. Resources Required
Access to a PC in order to develop the application. Access to an appropriate web server in order to upload and run the web based version of the completed application.
6. References
CSC355 Artificial Intelligence Intranet page: http://www.comp.lancs.ac.uk/~dixa/teaching/AI355/
Auto Trader Online Website