Predicting the Chances of Coronary Heart Disease

Logistic Regression

Multivariate Solutions

What is ‘Logistic Regression’?

• Logistic regression in a nutshell:
  – Logistic regression is used for prediction of the probability of occurrence of an event by fitting data to a logistic curve.
  – Logistic regression makes use of several predictor variables that may be either numerical or categorical.
    • For example, the probability that a person has a heart attack within a specified time period might be predicted from knowledge of the person's age, sex and body mass index.
  – Logistic regression is used extensively in the medical and social sciences, as well as in marketing applications such as predicting a customer's propensity to purchase a product or cease a subscription.
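The "logistic curve" mentioned above can be written out explicitly. The notation below is an assumption for illustration (the slides do not define symbols): x_1 ... x_k are the predictor variables, beta_0 ... beta_k the fitted coefficients, z the resulting score, and p the predicted probability of the event.

```latex
z = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k,
\qquad
p = \frac{1}{1 + e^{-z}},
\qquad
\ln\!\left(\frac{p}{1 - p}\right) = z
```

The last identity says the model is linear in the log-odds, which is why each coefficient can later be read as an odds ratio after exponentiation.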

Example: Calculating the Risk of Coronary Heart Disease

• In this example, what are the risk factors associated with Coronary Heart Disease? How do they contribute to the chances of contracting the disease?
• Let us define a variable ‘Outcome – Death from Coronary Heart Disease’:
  – Outcome = 1 if ‘The individual will contract a form of Coronary Heart Disease’
  – Outcome = 0 if ‘The individual will not contract a form of Coronary Heart Disease’
• The outcome takes only two possible values.

Hypothesis: To Develop a Model to Determine the Risk of Contracting Coronary Heart Disease

• Risk Factors Contained in the Model
  – Smoking
  – Total Cholesterol Level (TCL - 200)
  – Body Mass Index (BMI - 25)
  – Gender (1=male, 0=female)
  – Age (in years, less 50)
  – Hours of physical activity (weekly)

Logistic Regression Output
Risk of Coronary Heart Disease - Ten Years

Regression Output                      Regression Beta   Odds Ratio (Exp(Beta))
Smoking                                      0.898               2.455
Total Cholesterol Level (TCL - 200)          0.166               1.181
Body Mass Index (BMI - 25)                   0.058               1.060
Gender (1=male, 0=female)                    0.028               1.028
Age (in years, less 50)                      0.024               1.024
Hours of Physical Activity (weekly)         -1.013               0.363
Constant                                    -4.123

The output also reports a significance (Sig.) value for each coefficient.

• This slide is descriptive, and shows which risk factors are most influential when considering Coronary Heart Disease.
• When examining the results, the Odds-Ratio is often used to interpret them.
• For example, smoking and a total cholesterol level above 200 are the highest risk factors. Smokers' odds of developing coronary heart disease are about 2.4 times those of nonsmokers.
• High cholesterol is also a risk factor, as is age.
• Men are slightly more likely to get Coronary Heart Disease than women, and physical activity sharply reduces the chances of Coronary Heart Disease (negative coefficient).
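The odds ratios in the regression output are the exponentials of the betas. A minimal sketch of that conversion, using three of the betas from the output; the function name `odds_ratio` and the dictionary layout are assumptions made for illustration, not part of the original analysis.

```python
import math

# Betas taken from the regression output (a subset, for illustration).
betas = {
    "Smoking": 0.898,
    "Age (in years, less 50)": 0.024,
    "Hours of physical activity (weekly)": -1.013,
}


def odds_ratio(beta: float) -> float:
    """Odds ratio for a one-unit increase in a predictor: exp(beta)."""
    return math.exp(beta)


for name, beta in betas.items():
    print(f"{name}: beta = {beta:+.3f}, odds ratio = {odds_ratio(beta):.3f}")
```

A beta above zero gives an odds ratio above 1 (higher odds per unit), and a negative beta gives an odds ratio below 1, which is how the negative coefficient for physical activity translates into reduced odds.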

Odds-Ratio

• When a respondent's choices are set within the regression model, an ‘odds-ratio’ for each respondent is created using the formula 1/(1 + e^(-z)). Strictly speaking, this value is the respondent's predicted probability of the outcome.
  – ‘z’ is the outcome of the regression equation once all the answers are input.
• A simulator can be used to classify individuals based on demographic data or a survey screen.
• Two examples follow:
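The formula and the idea of a classifying simulator can be sketched as follows. This is only an illustration: the function names and the 0.10 cutoff are hypothetical, since the slides do not say how a simulator would turn a predicted risk into a class.

```python
import math


def risk_from_score(z: float) -> float:
    """Convert the regression score z into a probability via 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def classify(z: float, cutoff: float = 0.10) -> str:
    """Label a respondent by comparing the predicted risk with a cutoff.

    The default cutoff of 0.10 is a hypothetical value for illustration only.
    """
    return "higher risk" if risk_from_score(z) >= cutoff else "lower risk"


for z in (-3.0, 0.0, 1.5):
    print(f"z = {z:+.1f}: risk = {risk_from_score(z):.1%}, class = {classify(z)}")
```

In practice the cutoff would be chosen to suit the application, for example to balance missed cases against false alarms.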

Example One: Inactive, Smoking, 55-Year-Old Woman
Risk of Coronary Heart Disease - Ten Years

Regression Output                      Answer   Regression Beta   Product (b*d)
Smoking                                    1         0.098            0.098
Total Cholesterol Level (TCL - 200)      230         0.066            1.980
Body Mass Index (BMI - 25)                32         0.058            0.406
Gender (1=male, 0=female)                  0         0.028            0.000
Age (in years, less 50)                   55         0.024            0.119
Hours of Physical Activity (weekly)        0        -1.013            0.000
Equation Constant                                                    -4.123

Sum                                  -1.520
Odds Ratio (1/(1 + e^(-z)))           0.18
Risk of Coronary Heart Disease        18%

A slightly obese, 55-year-old woman who smokes, has somewhat high total cholesterol and is physically inactive has an 18% chance of contracting Coronary Heart Disease within the next ten years.
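The arithmetic behind Example One can be reproduced with a short script. This is a sketch under stated assumptions: the dictionary keys and function names are hypothetical, and the betas are the ones printed in the example table (note that the example lists the smoking beta as 0.098, whereas the regression output lists 0.898; the sketch follows the example so that it reproduces the slide's result). Each answer is reduced by its offset (TCL - 200, BMI - 25, age - 50) before being multiplied by its beta.

```python
import math

# Betas and constant as printed in the worked example table.
BETAS = {
    "smoking": 0.098,
    "tcl": 0.066,
    "bmi": 0.058,
    "gender": 0.028,
    "age": 0.024,
    "activity_hours": -1.013,
}
CONSTANT = -4.123

# Amounts subtracted from the raw answers before multiplying by beta.
OFFSETS = {"tcl": 200, "bmi": 25, "age": 50}


def score(answers: dict) -> float:
    """Regression score z: constant plus the sum of beta * (answer - offset)."""
    return CONSTANT + sum(
        BETAS[name] * (answers[name] - OFFSETS.get(name, 0)) for name in BETAS
    )


def risk(answers: dict) -> float:
    """Predicted ten-year risk, 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-score(answers)))


# Example One: inactive, smoking, 55-year-old woman.
example_one = {
    "smoking": 1, "tcl": 230, "bmi": 32,
    "gender": 0, "age": 55, "activity_hours": 0,
}

z = score(example_one)
p = risk(example_one)
print(f"z = {z:.3f}, risk = {p:.0%}")
```

With the rounded age beta of 0.024 the age product comes to 0.120 rather than the 0.119 shown on the slide, so the score is about -1.519 instead of -1.520; presumably the slide used an unrounded coefficient. The risk still rounds to 18%.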

Example Two: Health-Conscious 65-Year-Old Man
Risk of Coronary Heart Disease - Ten Years

Regression Output                      Answer   Regression Beta   Product (b*d)
Smoking                                    0         0.098            0.000
Total Cholesterol Level (TCL - 200)      180         0.066           -1.320
Body Mass Index (BMI - 25)                25         0.058            0.000
Gender (1=male, 0=female)                  1         0.028            0.028
Age (in years, less 50)                   65         0.024            0.358
Hours of Physical Activity (weekly)        4        -1.013           -4.052
Equation Constant                                                    -4.123

Sum                                  -9.109
Odds Ratio (1/(1 + e^(-z)))           0.00
Risk of Coronary Heart Disease        0%

Using the logistic output, a non-smoking, physically active 65-year-old man with a good cholesterol level has practically no chance of contracting Coronary Heart Disease in the next ten years.
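Written out as a worked equation, Example Two sums the products printed in the table and applies the logistic formula. The symbols z and p are notation assumed here for the regression score and the resulting risk.

```latex
\begin{aligned}
z &= \underbrace{0.000}_{\text{smoking}}
   + \underbrace{(-1.320)}_{\text{TCL}}
   + \underbrace{0.000}_{\text{BMI}}
   + \underbrace{0.028}_{\text{gender}}
   + \underbrace{0.358}_{\text{age}}
   + \underbrace{(-4.052)}_{\text{activity}}
   + \underbrace{(-4.123)}_{\text{constant}}
   = -9.109 \\[4pt]
p &= \frac{1}{1 + e^{-z}} = \frac{1}{1 + e^{9.109}} \approx 0.0001
\end{aligned}
```

A risk of roughly one in ten thousand rounds to the 0% shown on the slide; most of the reduction comes from the four weekly hours of physical activity, which alone contribute -4.052 to the score.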