
COURSE OUTLINE GIFT SCHOOL OF COMPUTING SCIENCE

CS-402

Data Mining
BSc. Computer Science Fall Semester, 2012
Faculty: GIFT School of Computing Science
Credit hours: 3 (3 Cr. Hrs for lecture sessions)
Course level: Undergraduate
Campus/Location/Instruction Mode: GIFT University/On Campus/In Person
Course Convenor: Mr. Nadeem Qaisar Mehmood
UAN +92 (55)-111-900-100 (Ext. 179) TEL +92 (55)-389298 (Ext. 179)
nadeemqaisar@gift.edu.pk
Pre-requisite: CS-210 Database Systems, MATH-102 Probability & Statistics
Timing: Monday 13:30-to-14:45, ROOM: F4; Friday 12:00-to-13:15, ROOM: F3
Laboratory hours: No lab sessions
Consultation hours: Monday 09:20-to-10:35, ROOM: My Office
Core/Elective: Elective
This document was last updated: 15th October, 2012

BRIEF COURSE DESCRIPTION
The course introduces the process of extracting or uncovering hidden information and identifying patterns within data. Data mining discovers hidden knowledge in large datasets and has recently attracted considerable interest in the database and data engineering communities. Data mining is an essential field of the 21st century. This course will provide a comprehensive introduction to the data mining process; build theoretical and conceptual foundations of key data mining tasks such as classification, association rule mining and clustering; and discuss the analysis and implementation of algorithms. Students will get hands-on experience with the data mining process using different data mining tools. Selected research papers will also be discussed in class to supplement the textbook material.

SECTION A - TEACHING, LEARNING AND ASSESSMENT


COURSE AIMS
The course is planned to achieve the following goals:
- To build a comprehensive foundation for data mining
- To develop conceptual and theoretical understanding of the data mining process
- To provide hands-on experience in the implementation and evaluation of data mining algorithms
- To provide hands-on experience with an available data mining tool (such as Weka) to analyze different data sets
- To develop interest in data mining research

LEARNING OUTCOME
By the end of this course, we aim to impart the following skill set to the students:
- A thorough understanding of data mining processes and techniques and their role in current information development
- A sound foundation for analysing a data set holding different data types and characteristics, and then suggesting and applying the most suitable mining techniques
- The ability to understand available data mining tools and their usage for analysing and extracting hidden information and data patterns
- Knowledge of the main fields and methodologies of data mining
- Acquaintance with the new research trends in data mining

CONTENT ORGANIZATION AND TEACHING STRATEGIES
The course will be presented in a minimum of 30 sessions of 1.25 hours (75 minutes) each during the semester. Each lecture session will involve the presentation of material by the lecturer, throughout which student participation and involvement is encouraged. There will be at least four quizzes and four assignments throughout the course.

CONTENT SUMMARY

WEEK
1; 3,4; 4,5; 5,6,7; 9,10; 11,12; 13,14; 15 (Depends on time)

TOPICS
- Introduction: Need, Origin, Motivation, Processes, tasks and functionalities, Fields & Applications
- Data (What is data? Attributes (types and properties), Datasets)
- Data Preprocessing: Aggregation, Data Cleaning, Sampling, Dimension Reduction, Feature Selection, Discretization, Attribute Transformation
- Similarity and Dissimilarity: Proximity of simple attributes, Distance measures, Binary Vector Similarity, Cosine Similarity, Correlation
- Data Exploration: Summary Statistics, Measures of Location and Spread, Visualization Techniques, OLAP
- Weka introduction and assignments in parallel
- Classification: Basic Methods, Decision Tree: ID3, C4.5, CART, Learning as Search, Bias, Instance Based, Nearest Neighbor, Bayes Classifier, Issues of Classification
- MIDTERM
- Association Rules: Market Basket Analysis, Frequent Itemset, Association Rule, Confidence and Support, Apriori Algorithm, FP-Growth Algorithm
- Clustering: Cluster Analysis, K-Means, Hierarchical Clustering, Density Based Clustering, Grid Based Clustering, Graph Based Methods
- Recommender Systems: Collaborative Filtering (Memory Based, Model Based), Cross Selling, Profiling
- Mining Text and Web: Text and Web Data Mining
- Selected Research Papers & Presentations

Please note: This is a proposed schedule only and may be varied at the discretion of the Course Convenor to give a greater or lesser degree of emphasis to particular topics.

TEXTS AND SUPPORTING MATERIALS
Prescribed Texts and Readings:
- Introduction to Data Mining, Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Addison-Wesley, 2005. ISBN: 0321321367
Students shall also be provided with the slides prepared by the instructor.

Reference Texts:
- Data Mining: Concepts and Techniques, Jiawei Han and Micheline Kamber, Second Edition, Morgan Kaufmann Publishers, 2006
- Machine Learning, Tom Mitchell, McGraw Hill, 1997
- Selected papers or handouts

ASSESSMENT

Item  Assessment Task     Length    Weighting  Due Day and Time
1.    Assignments (4-6)   -         13%        Various Weeks in Class
2.    Quizzes (3-5)       -         14%        Various Weeks in Class
3.    Mid term exam       -         20%        8th or 9th week
4.    Project             -         18%        -
5.    Final Examination   3 hours   35%        During Formal Examination Period

Students must complete each component of the assessment to the satisfaction of the course instructor. All components of the above assessment are compulsory and must be completed in order to obtain a pass grade. Students are expected to perform satisfactorily in each item.

Assessment Item No. 1 (Assignments)
Four to six home assignments will be given to test how well you can implement the concepts learned in class. All students are encouraged to participate fully in class discussions.

Assessment Item No. 2 (Project/Lab)
Students will receive lab manual handouts in the labs and must work through each manual during the lab sessions. There may be a small quiz at the end of each lab session. In total there could be a minimum of 5 and a maximum of 8 quizzes.

Assessment Item No. 3 (Quizzes)
Four to seven quizzes will be given to students in class during various weeks; they may or may not be announced. The objective is to test whether students are keeping up with the work and grasping the concepts. All students are encouraged not to miss class.

Assessment Item No. 4 (Midterm)
The mid term exam will be held in the 8th or 9th week. The exact date will be announced later.

Assessment Item No. 5 (Final Examination)
It will be a 3-hour examination. Further details will be given towards the end of the semester. It will be held in the formal examination period at the end of the semester as scheduled by GIFT University.

Late Submissions / Missed Quizzes:
There will be no retake of any quiz. A total of two late days will be provided for graded homework assignments. These can be used together on one graded homework assignment or as one day each on two graded homework assignments.

Coming Late to Lectures and Labs:
The instructors reserve the right not to allow latecomers to attend the lecture.

All students should check their email accounts daily for the assignments.
PLAGIARISM
Unfortunately, some students have resorted to cheating and plagiarism in the past. We want to make it clear to you that we have a zero-tolerance policy for such cases. You have made it to GIFT after passing through significant competition. Do not squander a promising opportunity that may have a significant impact on your future. After all, it would be too bad to be expelled from GIFT or to end up with an F. Some typical cases that we have encountered in the past include submitting identical homework, copying a paragraph from the internet for your assignment without referencing the source, and taking someone else's code, changing variable names in it, and then submitting it under your own name. The instructors also reserve the right to use automated tools to check for plagiarism.

CLASS PARTICIPATION
Participation in, and contribution to, class discussions will affect your final grade positively. Raise your hand if you have any question. Causing any kind of disruption in class (e.g. side talk, continually coming to class late, continually leaving class early, use of cell phones, etc.) will affect your participation grade negatively. Part of your participation grade will be based on:
- Your active participation in various assigned class content support roles, and
- Checking your email at least twice a day for any course announcements

ATTENDANCE
Excellent attendance is expected. University policy automatically assigns a WF grade if a student has unexcused absences from 15% of the classes.

CHEATING
"The act of plagiarism (unlawful copying of any form of intellectual ideas, thoughts, written scripts, etc.) will be prosecuted. Cheating or copying from your peers on an exam, quiz, or homework is illegal and unethical and will consequently result in the student receiving a ZERO grade on that particular exam/assignment; resubmission WILL NOT BE ALLOWED."

MOBILE PHONE USAGE NOT ALLOWED

Use of mobile phones is strictly prohibited in the classroom. Any mobile phone use will be dealt with according to university policies.
