Building Predictive Models for NYC Public High Schools
Alec Hubel | Introduction to Data Science - Fall
Abstract
The New York City public school system (responsible for the education of over 1 million students) is the largest in the country. Unfortunately, its size only makes it more susceptible to impeding issues. The fact that school budgets are consistently tightening is only worsened by the fact that American students are falling behind their international competition. As a way to monitor the success of a school, the Department of Education tracks two key statistics: high school graduation rates and aspirational performance measures. This study looks to uncover the key drivers of those measures in an attempt to isolate the factors that are most responsible for a successful education in New York City public schools.
New York City public schools employ 75,000 teachers across over 1,700 schools. These teachers are responsible for the education of 1.1 million students and represent an overwhelming portion of the $24 billion annual budget. A system of this magnitude requires consistent monitoring in order to determine its efficacy. Unfortunately, an evaluation of each and every school, teacher, and student would be a huge draw on already limited resources. Because of this, the Department of Education must rely on certain performance metrics to decide if a school's performance is up to snuff. For high schools, the primary metrics used for this purpose are a school's graduation rate (what percentage of a senior class will successfully graduate in a given year) and its aspirational performance measure (APM). The New York State Department of Education uses the below definition for aspirational performance measures:
The percent of students in the cohort who earned a Regents diploma with Advanced Designation (i.e., earned 22 units of course credit; passed 7-9 Regents exams at a score of 65 or above; and took advanced course sequences in Career and Technical Education, the arts, or a language other than English); and
The percent of students in the cohort who graduated with a local, Regents, or Regents with Advanced Designation diploma and earned a score of 75 or greater on their English Regents examination and an 80 or better on a mathematics Regents exam (note: this aspirational measure is referred to as the "ELA/Math APM")
This data point is meant to measure what percent of a graduating class is prepared for college or a post-high school career. For this analysis, I attempted to build two predictive models: one for a school's graduation rate, and one for a school's aspirational performance measure.
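As a rough illustration of the two-part definition quoted above, both percentages could be computed from per-student records along the following lines. This is only a sketch: the record layout, the field names, and the diploma labels are assumptions made for illustration and do not reflect the Department of Education's actual data format. The thresholds (75 on the English Regents exam, 80 on a mathematics Regents exam) are taken from the state definition of the ELA/Math APM.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Student:
    # Hypothetical record layout, for illustration only.
    diploma: Optional[str]          # "local", "regents", "advanced", or None if no diploma
    ela_score: Optional[int]        # English Regents exam score
    best_math_score: Optional[int]  # best score on any mathematics Regents exam


def _percent(count: int, total: int) -> float:
    if total == 0:
        raise ValueError("cohort is empty")
    return 100.0 * count / total


def advanced_designation_rate(cohort: List[Student]) -> float:
    """Percent of the cohort earning a Regents diploma with Advanced Designation."""
    earned = sum(1 for s in cohort if s.diploma == "advanced")
    return _percent(earned, len(cohort))


def ela_math_apm(cohort: List[Student]) -> float:
    """Percent of the cohort that graduated with any diploma type and scored
    at least 75 on English and at least 80 on a mathematics Regents exam."""
    def qualifies(s: Student) -> bool:
        return (
            s.diploma in ("local", "regents", "advanced")
            and s.ela_score is not None and s.ela_score >= 75
            and s.best_math_score is not None and s.best_math_score >= 80
        )
    return _percent(sum(1 for s in cohort if qualifies(s)), len(cohort))


# A made-up four-student cohort, purely to exercise the functions.
example_cohort = [
    Student("advanced", 90, 85),
    Student("regents", 75, 80),
    Student("local", 80, 70),
    Student(None, 90, 90),
]
```

Note that both measures use the whole cohort as the denominator, not just the graduates, so a student who scores well but does not earn a diploma still lowers the rate.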
The New York City Department of Education makes an enormous amount of data available for public use and review. Thanks to this, collecting all the data required for my analysis was substantially easier than anticipated. To start, I decided to focus on the 2011-2012 school year. I wanted to keep the data as recent as possible, so that my results would reflect the current status of the school system as closely as possible. The data that I required was held primarily across 3 separate data-sets. The first data-set contained demographic data, presenting values for the racial composition of schools, what percentage of the student body qualified for free or subsidized school lunches (a common proxy for the income levels of a student population), student-teacher ratios, and the graduation rates and APMs of individual schools. The second data-set contained budgetary information for each of the schools. From this, I was able to derive the dollars allocated per student. This would be a more useful measure for the funding of a school than the absolute budget, because a larger school would naturally have a larger budget, but may