
STAT 378

Group Project Guidelines

Introduction
The project will consist of a multiple linear regression analysis on a dataset of your
choosing. In a group of three (or four), you will be expected to find a dataset,
develop questions of interest, run a complete multiple linear regression analysis,
and summarize your results in a written report as well as a presentation to the class.

Finding a Data Set


Once you are assigned to a group, you must find a dataset related to your topic of interest. You
may select a data set from any textbook, the Internet, another class, or use data that
you have collected on your own. I will provide you with some ideas of different
places to look for data for your specific topic. Your data set should have at least
three potential explanatory variables and at least twenty observations.

Once you have found a data set of interest, you will need to save it. The easiest way
to do this is to save the data file as a .csv file (which can be opened in both Excel and
Minitab). If you have questions about saving a data set, please contact me and I will
be happy to help.
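
If you would like to confirm that a saved .csv file meets the minimum size stated above (at least three potential explanatory variables and at least twenty observations), a short script can count the columns and rows. The sketch below is only an illustration in Python, which is not required for this course; the function name `check_dataset`, the column names, and the sample values are all hypothetical, and the data are generated in memory rather than read from a real file.

```python
import csv
import io


def check_dataset(csv_file, response):
    """Count observations and explanatory variables in an open CSV file.

    `response` is the name of the response-variable column; every other
    column is treated as a potential explanatory variable (an assumption).
    Returns (number of observations, explanatory names, meets_minimum).
    """
    reader = csv.DictReader(csv_file)
    rows = list(reader)
    explanatory = [name for name in (reader.fieldnames or []) if name != response]
    meets_minimum = len(explanatory) >= 3 and len(rows) >= 20
    return len(rows), explanatory, meets_minimum


# Hypothetical data: one response (price) and three explanatory variables.
lines = ["price,sqft,beds,age"]
lines += [f"{100 + i},{1000 + 10 * i},{2 + i % 3},{i}" for i in range(20)]
sample = "\n".join(lines)

n_obs, explanatory, ok = check_dataset(io.StringIO(sample), "price")
print(n_obs, explanatory, ok)
```

To run the same check on a file on disk, pass an open file object (for example, one returned by `open(path, newline="")`) in place of the in-memory text.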

Analyzing the Data


You have been/will be exposed to many procedures and analyses in multiple linear
regression. You are not expected to apply every technique we discuss in class to
your data set. However, your analysis should use the methods that are appropriate
for your chosen data set. I would suggest the following steps as a general approach:
• Make scatterplots of each explanatory variable versus the response
variable. These will suggest whether any transformations are needed
for your data.
• Use model selection procedures to narrow your search down to two or three
candidate models.
• Think about adjusted R², MSE, hypothesis tests (partial F-test, t-test),
multicollinearity, and assumption checking when choosing your final model.
• Once you have settled on a final model, comment on the usefulness of that
model and possibly use it to make predictions and/or interpret the
regression coefficients within the context of the problem.
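
Several of the quantities named in these steps can be computed directly from each candidate model's sums of squares, which Minitab reports in its analysis of variance table. The sketch below is a minimal illustration in Python, not part of the course's Minitab workflow; the function names and every number in the usage lines are hypothetical. Here n is the number of observations, p is the number of explanatory variables (the intercept is counted separately), SSE is the error sum of squares, and SST is the total sum of squares.

```python
def mse(sse, n, p):
    """Mean squared error: SSE divided by its n - p - 1 degrees of freedom."""
    return sse / (n - p - 1)


def adjusted_r2(sse, sst, n, p):
    """Adjusted R^2 = 1 - [SSE / (n - p - 1)] / [SST / (n - 1)]."""
    return 1 - (sse / (n - p - 1)) / (sst / (n - 1))


def partial_f(sse_reduced, sse_full, p_reduced, p_full, n):
    """Partial F statistic for the extra terms in the full model.

    F = [(SSE_reduced - SSE_full) / (p_full - p_reduced)] / MSE_full
    """
    extra_terms = p_full - p_reduced
    return ((sse_reduced - sse_full) / extra_terms) / mse(sse_full, n, p_full)


# Hypothetical numbers for illustration only: n = 25 observations,
# a 2-variable reduced model nested inside a 4-variable full model.
n, sst = 25, 500.0
print(adjusted_r2(120.0, sst, n, 2))     # reduced model
print(adjusted_r2(100.0, sst, n, 4))     # full model
print(partial_f(120.0, 100.0, 2, 4, n))  # test of the two extra variables
```

The partial F statistic is compared with an F distribution on (p_full - p_reduced) and (n - p_full - 1) degrees of freedom. Unlike R², adjusted R² can decrease when a variable is added, which is why it is the more useful of the two for comparing candidate models of different sizes.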

Written Report
General guidelines for the final report:
• It should be a typed, formal write-up (full sentences, no bullet points).
• There is no specific length requirement, but I expect most of you will have
5-15 pages.
• Include relevant graphs, tables, and Minitab output.

Oral Presentation
• You will give a 13-15 minute presentation. Every group member is expected
to present; any exception must be arranged in advance.
• Include relevant graphs, tables, and Minitab output.

General outline for both report and presentation:


• Introduction: Your write-up should include an introduction that describes
your data and variables, where the data came from, the question(s) of
interest, etc.
• Model Selection: You should describe the steps involved in your analysis and
how you arrived at your final model. Make certain to include the purpose
behind each step in the analysis (e.g. “we transformed the response variable
because there was evidence of non-constant variance”). Include relevant
Minitab output and its interpretation. In the written report, give more detail on
the challenges you encountered in the data analysis and how you resolved them.
• Final Model Results: You should comment on the usefulness of your final
model, interpreting coefficient estimates and making predictions if that
applies to your question(s) of interest. You may wish to address whether
model assumptions are met and whether there are unusual observations
influencing the fit of your final model.
• Conclusion: This should be a discussion of what you learned about your data
and the relationship with the response. You may comment on difficulties or
issues encountered with the analysis, things you might have done differently
or that could be improved.

Grade Breakdown and Due Dates

Portion of Project            Due Date                                   Points
Final report                  Tuesday, April 27th at 9:00am              60
Oral presentation (on Zoom)   Groups 1-5: Zoom during the scheduled      40
                              class time on Tuesday, April 27th
                              Groups 6-9: Zoom during the scheduled
                              class time on Thursday, April 29th
Total                                                                    100

Things to look for

Introduction (15 points possible)
• Description of dataset, variables, where data came from
• Outline questions of interest/main purpose of paper

Model Selection (50 points possible)
• Steps taken to determine final model fully described
• Description of any new variables created
• Description of why (if any) variables were transformed

Final Model Results (25 points possible)
• Final model stated
Potentially include:
• Assumptions checked
• Coefficients interpreted
• Predictions made

Conclusions (10 points possible)
• Summary/concluding remarks related to purpose of paper
• Discussion of problems encountered, future ideas, things to do
differently (optional)
