
# Project 3 – Functions (Simulation, Random Numbers & Linear Regression)

## Due: Start of Lecture, 10/18/10

Simulation, statistics and data visualization are all commonly used in biomedical and biomolecular engineering.
In this project, you will create: 1) a function that creates a simulated data set (a linear relationship with noise);
2) another function to perform a linear regression on this data to obtain the slope and intercept of the best-fit
line representing it; and 3) a third function that plots the data and its regression line. You will also create a
driver program to test and demonstrate the operation of each of these functions.

## Data Generation (Simulation) – linGen.m

Write a function that, given an independent data vector x, slope m, intercept b, noise parameter epsilon and
random stream rs, returns a dependent data vector y. The pseudocode relationship for the dependent values is
y = m*x + b + epsilon*z, where z is drawn from the standard normal distribution (using the MATLAB function
randn on stream rs).
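
The relationship above can be sketched as a short function file. This is a minimal sketch, not a reference solution: the argument order follows the order in which the parameters are listed above, and drawing one independent noise value per element of x is an assumption.

```matlab
function y = linGen(x, m, b, epsilon, rs)
% LINGEN  Simulate noisy linear data: y = m*x + b + epsilon*z.
%   x        independent data vector
%   m, b     slope and intercept of the underlying line
%   epsilon  noise scale factor
%   rs       RandStream object used for the normal draws
%
% Assumption: one independent standard normal draw per element of x.

z = randn(rs, size(x));        % standard normal values drawn from stream rs
y = m * x + b + epsilon * z;   % noisy linear relationship
end
```

One possible call creates a seeded stream with `rs = RandStream('mt19937ar', 'Seed', 0);` and then runs `y = linGen(x, 0.5, 8.0, 1.5, rs);`. The seed value here is arbitrary; fixing it only makes the simulated data repeatable.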

## Data Analysis (Statistical Analysis) – linReg.m

Write a function that, given independent and dependent data vectors x and y (respectively) returns the slope
and intercept of the line that best fits (in terms of minimizing the sum square error) this data. See Example 8.5
in Chapman for a full explanation of the method. Important quantities and equations follow:

- $\sum x$ = sum of the x (independent) values.
- $\sum y$ = sum of the y (dependent) values.
- $\sum x^2$ = sum of the squares of the x values (note: square before summing).
- $\sum xy$ = sum of the products of corresponding x, y pairs.
- $\bar{x}$ = mean of the x values = $\frac{\sum x}{n}$
- $\bar{y}$ = mean of the y values = $\frac{\sum y}{n}$

The slope:

$$m = \frac{\sum xy - \left(\sum x\right)\bar{y}}{\sum x^2 - \left(\sum x\right)\bar{x}}$$

And the intercept:

$$b = \bar{y} - m\bar{x}$$
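
The quantities listed above map almost one-to-one onto code. The following is a minimal sketch under stated assumptions: the variable names are illustrative, and x and y are assumed to be real vectors of equal length with at least two distinct x values (otherwise the slope denominator is zero).

```matlab
function [m, b] = linReg(x, y)
% LINREG  Least-squares slope and intercept for paired data x, y.
% Assumption: x and y are real vectors of equal length.

x = x(:);                 % force column vectors so x .* y is element-wise
y = y(:);
n = numel(x);             % number of data points

sumX  = sum(x);           % sum of x values
sumY  = sum(y);           % sum of y values
sumX2 = sum(x .^ 2);      % square first, then sum
sumXY = sum(x .* y);      % sum of products of corresponding pairs

xBar = sumX / n;          % mean of x
yBar = sumY / n;          % mean of y

m = (sumXY - sumX * yBar) / (sumX2 - sumX * xBar);   % slope
b = yBar - m * xBar;                                  % intercept
end
```

A quick sanity check is to compare the result against the built-in `polyfit(x, y, 1)`, which returns the slope and intercept of the same least-squares line as a two-element vector.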

## Plot Results (Data Visualization) – linPlot.m

Write a function that, given data vectors x and y and regression parameters m and b, plots the data and the best
fit line through it (as shown in the example below). The horizontal (x) limits on the regression line should be the
minimum and maximum values in the independent data vector x.

    Enter desired slope (arbitrary units): 0.5
    Enter desired intercept (arbitrary units): 8.0
    Enter desired noise parameter (arbitrary units): 1.5
    Regression results: slope = 0.495589, intercept = 7.982158
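
The plotting requirement for linPlot, in particular limiting the regression line to the range of x, can be sketched as follows. The marker style, axis labels and legend text are assumptions chosen for illustration; only the line limits come from the description above.

```matlab
function linPlot(x, y, m, b)
% LINPLOT  Plot data points and the regression line y = m*x + b.
% Assumption: markers, labels and legend text are illustrative choices.

xLine = [min(x), max(x)];    % line spans only the range of the x data
yLine = m * xLine + b;       % regression line evaluated at the two endpoints

plot(x, y, 'o', xLine, yLine, '-');   % data as markers, fit as a solid line
xlabel('x');
ylabel('y');
legend('Data', 'Regression line', 'Location', 'best');
end
```

Two endpoints are enough to draw a straight line, so the regression line does not need to be evaluated at every x value.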

## Driver Program & Deliverables (Function Testing & Documentation) – linProject.m

The driver program should create the independent data vector, x. This data should range from -10 to 10 and
include 101 values. It may be a simple sequence of values or a collection of uniformly distributed random values.
Be sure to state which in your documentation. It should then prompt the user for slope, intercept and noise
parameters. Finally, it should call the functions linGen, linReg and linPlot in sequence and correctly manage the
returned values, including displaying the calculated slope and intercept in the command window.
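
The two permitted ways of building x, and one way of formatting the result line, can be sketched as below. The variable names, the stream type and the seed are assumptions made for this sketch, and the slope and intercept are placeholder numbers copied from the sample output rather than computed values.

```matlab
% Option 1: evenly spaced sequence of 101 values from -10 to 10 (step 0.2).
xSeq = linspace(-10, 10, 101);

% Option 2: 101 uniformly distributed random values between -10 and 10.
rs = RandStream('mt19937ar', 'Seed', 1);   % seed chosen arbitrarily for this sketch
xRand = -10 + 20 * rand(rs, 1, 101);       % rescale uniform (0,1) draws to (-10,10)

% Format a result line in the style of the sample output.
mHat = 0.495589;   % placeholder values copied from the sample output
bHat = 7.982158;
msg = sprintf('Regression results: slope = %f, intercept = %f', mHat, bHat);
disp(msg);
```

In the driver itself, the prompts could be issued with the built-in `input` function, for example `m = input('Enter desired slope (arbitrary units): ');`.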

Submit your internally and externally (i.e. with report(s)) documented source code as a single zip file.