BIOE 340: Modeling and Simulation in Bioengineering

Review for Midterm

1. A fitted model with more predictors will generally have a lower training set error than a
model with fewer predictors. True or False.
2. In a linear regression with several variables, if the F statistic is significant, all of the predictors
have statistically significant effects. True or False.
3. The test set error estimated from 5-fold cross-validation can be used to select the best
model. True or False.
4. The R² statistic measures the proportion of variability in the response that can be explained
using the predictors. True or False.
5. Training set RSS and training set R² cannot be used to select from a set of models with
different numbers of predictors. True or False.
6. Name one algorithm used to perform simple linear regression. What is the quantity optimized?
7. What are the two main goals of data-driven modeling?
8. Name one approach for estimating test set error. Explain the approach in 2-3 sentences.
9. Answer the following questions based on the MATLAB output shown in the box below. Use a
statistical significance level of 0.05 for this problem.

>> myData = readtable('peruvian.txt','Delimiter',' ');
>> mdl = fitlm(myData,'Systol~Age+Weight+Height+Pulse+Diastol')

mdl =

Linear regression model:
    Systol ~ 1 + Age + Weight + Height + Pulse + Diastol

Estimated Coefficients:
                   Estimate       SE        tStat      pValue
                   ________    ________    _______    ________
    (Intercept)      33.394      64.276    0.51954     0.60814
    Age            -0.31555     0.31289    -1.0085     0.32328
    Weight          0.94567     0.40098     2.3584    0.026838
    Height         0.012416    0.042303    0.29351     0.77165
    Pulse           0.11594     0.23816    0.48682      0.6308
    Diastol          0.2436     0.25733    0.94666     0.35325

Number of observations: 30, Error degrees of freedom: 24
Root Mean Squared Error: 11
R-squared: 0.423, Adjusted R-Squared: 0.303
F-statistic vs. constant model: 3.52, p-value = 0.0158

a) State the null hypothesis for the F-test.


b) Is there a relationship between the predictors and the response? Justify your answer.

c) State the null hypothesis to which the p-value for Weight corresponds.
d) Which predictors appear to have a statistically significant relationship to the response?
Justify your answer.
10. Hasan works for a pharmaceutical company and conducted a study to assess the efficacy and
potency of a group of drugs before allowing the agents to proceed to clinical trials. Taking the
Hill coefficient as 1, he fitted a Hill model. The figure below shows the normalized
dose-response curves he fitted for four different drugs from the same class of medications
(Drugs A, B, C, and D).

Efficacy: the maximum effect that a drug can produce regardless of dose.
Potency: the amount of a drug needed to produce a given effect (e.g., EC50).

Which of the following is true regarding these medications?
A. Drug A has greater efficacy than Drug D.
B. Drug D has greater efficacy than Drug A.
C. Drug A has greater potency than Drug D.
D. Drug D has greater potency than Drug A.
E. All four drugs (A, B, C, D) have equal potencies.
11. Barış works for a pharmaceutical company and must assess the efficacy and potency of two
drugs (Agent A and Agent B) before allowing the agents to proceed to clinical trials. He
performed a preliminary experiment to get an overall idea and obtained some data. He decided
to model the dose-response relationship using a Hill model in MATLAB (see below).

% The data
>> dose=[1;10;60;100;200;540;600;850;1000]; % (mg/kg)
>> response_A=[8;43;70;74;77;78;79;79;85]; % (% response)
>> response_B=[9;49;84;89;93;95;96;96;98]; % (% response)
% Define the Hill function
>> fitHillModel=@(b,x) b(1).*x./(b(2)+x);

% Define initial values

>> init_A=[max(response_A);median(dose)];
>> init_B=[max(response_B);median(dose)];

% Fit the model using defined function
% and the MATLAB lsqcurvefit() function.
>> mdl_1=lsqcurvefit(fitHillModel,init_A,dose,response_A);
>> mdl_1

mdl_1 =

81.0451
9.0467
>> mdl_2=lsqcurvefit(fitHillModel,init_B,dose,response_B);
>> mdl_2

mdl_2 =

97.6532
9.8931

a) Write the form of the Hill equation he used. State the variables and the parameters on which
the equation is based.
b) What are the values of the Hill slope (α), E_max, and EC50 for each drug?
c) Compare the drugs based on their efficacy and potency. Which one would you prefer? Explain.
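
As a study aid for question 11, the following is a minimal sketch of how the two fitted parameters behave in the model function. It is a Python translation of the worksheet's MATLAB anonymous function fitHillModel, which is an assumption of this sketch (the worksheet itself uses MATLAB only), and the helper name hill is hypothetical. It reuses the parameter vectors printed for mdl_1 and mdl_2 and evaluates the model at a dose equal to the second parameter and at a very large dose.

```python
# Python translation (an assumption; the worksheet itself uses MATLAB) of
#   fitHillModel = @(b,x) b(1).*x./(b(2)+x);
# The function name `hill` is hypothetical and used only for illustration.

def hill(b, x):
    """Hill model with the Hill coefficient fixed at 1: b[0]*x / (b[1] + x)."""
    return b[0] * x / (b[1] + x)


# Fitted parameter vectors exactly as printed in the worksheet.
mdl_1 = [81.0451, 9.0467]   # from response_A
mdl_2 = [97.6532, 9.8931]   # from response_B

for name, b in (("mdl_1", mdl_1), ("mdl_2", mdl_2)):
    at_b2 = hill(b, b[1])      # response when the dose equals the second parameter
    at_large = hill(b, 1e9)    # response at a very large dose
    print(name, "response at dose = b(2):", at_b2)
    print(name, "response at very large dose:", at_large)
```

Comparing the two printed values for each fit against the first parameter may help when answering parts b) and c).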
