A Data Mart on

“Public Educational Sector”

Prepared & Presented by
Prashanth Prabhakar, Yogesh Navaneethan & Santhosh Sivaraman

CONTENTS
• OLAP (On-line Analytical Processing)
• The Current Scenario of Public Education Sector
• Why Data Mart?
• The Business Process
• Tables Identified
• The Schema
• Data Mining?
• Conclusion

Current Scenario: Organization Chart
Organisational Chart of Schools in a State
[Figure: the State is divided into Districts 1 to 3; each district is divided into Blocks 1 and 2; each block contains Schools 1 and 2; each school has Teachers, Students and Infrastructure.]
Data Mart
• A data warehouse combines databases across an entire enterprise, whereas “data marts are usually smaller and focus on a particular subject or department.” Some data marts, called dependent data marts, are subsets of a larger data warehouse.

Typical Data Warehousing Environment

Dependent Data Marts
• Physical database that receives all its information from the data warehouse.

Logical Data Marts
• Filtered view of the main data warehouse, with no physical existence
• No disk space required
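The idea of a logical data mart as a filtered view can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table `warehouse_records`, its columns, the view name `education_mart` and the sample rows are all hypothetical and do not come from the actual warehouse.

```python
import sqlite3

# In-memory database standing in for the main data warehouse.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE warehouse_records (      -- hypothetical warehouse table
        department TEXT,
        item       TEXT,
        amount     REAL
    );
    INSERT INTO warehouse_records VALUES  -- made-up sample rows
        ('education', 'schools',   120),
        ('education', 'teachers',  950),
        ('health',    'hospitals',  14);

    -- The logical data mart: a filtered view, not a second copy of the rows.
    CREATE VIEW education_mart AS
        SELECT item, amount
        FROM warehouse_records
        WHERE department = 'education';
""")

# Queries against the view read the warehouse rows directly.
mart_rows = conn.execute(
    "SELECT item, amount FROM education_mart ORDER BY item"
).fetchall()
print(mart_rows)
```

A dependent data mart would instead copy the selected rows into its own physical table, which costs disk space but keeps mart queries off the warehouse.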

The Business process
To analyze:
• The caste-wise literate child population.
• The infrastructure facilities in each school.
• The Pupil-Teacher Ratio (PTR).
• The qualification-wise percentage of teachers.
• The year-wise retirement of teachers.
• The percentage of physically challenged literate pupils.
• The schools that have the largest pupil strength.
• Details of out-of-school children.
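As one example of these analyses, the Pupil-Teacher Ratio for a school is the number of pupils divided by the number of teachers. A minimal sketch follows, assuming each student and teacher record carries a school identifier; the function name and the sample values are made up for illustration.

```python
from collections import Counter

# Made-up records: each entry is the school a student or teacher belongs to.
student_school_ids = [1, 1, 1, 1, 1, 1, 2, 2, 2, 2]
teacher_school_ids = [1, 1, 2]


def pupil_teacher_ratio(student_ids, teacher_ids):
    """Return {school_id: pupils per teacher}; None where a school has no teachers."""
    pupils = Counter(student_ids)
    teachers = Counter(teacher_ids)
    return {
        school: (pupils[school] / teachers[school]) if teachers[school] else None
        for school in pupils
    }


ptr = pupil_teacher_ratio(student_school_ids, teacher_school_ids)
print(ptr)
```

Returning None for a school without teachers avoids a division by zero and makes such schools easy to flag in a report.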

Tables Identified
• Dist_master
• Block_master
• Sch_mgmt_master
• Sch_type_master
• Caste_master
• Religion_Master
• Infrastructure_Master
• Class_Master
• Subject_Master
• Special_Need_Master
• Qualification_Master

• School_infrastructure
• School_Master
• Teacher_Table
• Student_Table
• Infrastructure_sub_table
• Teacher_Quali_sub_table

Logical Data Modeling
• Logical data modeling involves tasks such as defining entities with their attributes, keys, constraints and properties.
• ER diagrams.


Schema
• Database organization
– Must look like the business
– Must be recognizable by the business user
– Must be approachable by the business user
– Must be simple

• Schema Types
– Star schema
– Snowflake schema

Why Snowflake Schema?
• Represents the dimensional hierarchy directly by normalizing the dimension tables.
• Easy to maintain and saves storage.
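The normalized hierarchy can be sketched with Python's built-in sqlite3 module. The table names Dist_master, Block_master, School_Master and Student_Table are taken from the tables identified for this data mart; every column name and every sample row is an assumption made for illustration, since the actual table layouts are not given here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized dimension hierarchy: district -> block -> school.
    -- Column names are hypothetical.
    CREATE TABLE Dist_master   (dist_id INTEGER PRIMARY KEY, dist_name TEXT);
    CREATE TABLE Block_master  (block_id INTEGER PRIMARY KEY, block_name TEXT,
                                dist_id INTEGER REFERENCES Dist_master(dist_id));
    CREATE TABLE School_Master (school_id INTEGER PRIMARY KEY, school_name TEXT,
                                block_id INTEGER REFERENCES Block_master(block_id));
    -- Fact table: one row per student, keyed to the lowest hierarchy level.
    CREATE TABLE Student_Table (student_id INTEGER PRIMARY KEY,
                                school_id INTEGER REFERENCES School_Master(school_id));

    -- Made-up sample rows.
    INSERT INTO Dist_master   VALUES (1, 'District 1'), (2, 'District 2');
    INSERT INTO Block_master  VALUES (10, 'Block 1', 1), (20, 'Block 1', 2);
    INSERT INTO School_Master VALUES (100, 'School 1', 10), (101, 'School 2', 10),
                                     (200, 'School 1', 20);
    INSERT INTO Student_Table VALUES (1, 100), (2, 100), (3, 101), (4, 200);
""")

# Roll the fact rows up to district level by walking the hierarchy.
students_per_district = conn.execute("""
    SELECT d.dist_name, COUNT(*) AS students
    FROM Student_Table s
    JOIN School_Master sm ON sm.school_id = s.school_id
    JOIN Block_master  b  ON b.block_id   = sm.block_id
    JOIN Dist_master   d  ON d.dist_id    = b.dist_id
    GROUP BY d.dist_name
    ORDER BY d.dist_name
""").fetchall()
print(students_per_district)
```

Each district name is stored once in Dist_master instead of being repeated on every school row, which is where the storage saving comes from. The trade-off is the extra joins needed to roll facts up to district level.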

Fact Tables
• Central table
– Mostly raw numeric items
– Narrow rows, a few columns at most
– Large number of rows (millions to a billion)
– Accessed via dimensions

Fact tables in this data mart are:
1. Teacher_Table
2. Student_Table

Dimension Tables
– Define business in terms already familiar to users
– Wide rows with lots of descriptive text
– Small tables (about a million rows)
– Joined to fact table by a foreign key
– Heavily indexed
– Typical dimensions

Total number of dimension tables in this data mart: 15

Snowflake Schema for the Data Mart

Fact Tables

Data Mining
• Data mining (sometimes called data or knowledge discovery) is the process of analyzing data from different perspectives and summarizing it into useful information: information that can be used to increase revenue, cut costs, or both.
• It allows users to analyze data from many different dimensions or angles, categorize it, and summarize the relationships identified.
• Technically, data mining is the process of finding correlations or patterns among dozens of fields in large relational databases.
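As a very small illustration of "finding correlations" between two fields, the sketch below computes the Pearson correlation coefficient in plain Python. The two per-school fields and all of their values are made up for the example and are not results from this data mart. Real data mining tools apply this kind of measure, and many others, across many fields at once.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up per-school values, for illustration only.
ptr_by_school = [20, 25, 30, 35, 40]
attendance_percent = [95, 92, 90, 85, 80]

r = pearson(ptr_by_school, attendance_percent)
print(round(r, 3))
```

A value of r close to -1 or +1 indicates a strong linear relationship between the two fields, while a value near 0 indicates little or none. A correlation alone does not show that one field causes the other.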