Research on Hollywood movies of 2012

Published by Sayed Anwar

This is an Excel-based study of movies released during the year 2012 in four categories (Action, Comedy, Drama and Sci-Fi). The research examines the probability of hits and flops in relation to viewer rating and other factors.


More info:

Categories:Types, Research
Published by: Sayed Anwar on Feb 25, 2013
Copyright:Attribution Non-commercial

Genre Title Budget ($) in Millions

Action Ghost Rider: Spirit of Vengeance 57
Action The Cold Light of Day 20
Action Stolen 35
Action Resident Evil: Retribution 65
Action Red Dawn 65
Action The Man with the Iron Fists 20
Action Wrath of the Titans 150
Action Hit and Run 2
Action Haywire 23
Action Battleship 209
Action Lockout 20
Action This Means War 65
Action Snow White and the Huntsman 170
Action Act of Valor 12
Action Contraband 25
Action Taken 2 45
Action Safe 33
Action Premium Rush 35
Action The Bourne Legacy 125
Action John Carter 250
Action Safe House 85
Action The Expendables 2 100
Action Get the Gringo 20
Action The Amazing Spider-Man 230
Action Jack Reacher* 60
Action The Hunger Games 78
Action Dredd 45
Action The Raid: Redemption 1.1
Action End of Watch 7
Action Looper 30
Action Skyfall 200
Action The Avengers 220
Action The Dark Knight Rises 250
Action Django Unchained* 83
Comedy Madea's Witness Protection 20
Comedy Fun Size 14
Comedy The Three Stooges 30
Comedy One For The Money 40
Comedy That's My Boy 70
Comedy Mirror Mirror 85
Comedy Parental Guidance 6.5
Comedy Wanderlust 35
Comedy A Thousand Words 40
Comedy For a Good Time, Call... 5.7
Comedy Damsels in Distress 3
Comedy Think Like a Man 12
Comedy Diary of a Wimpy Kid: Dog Days 22
Comedy Iron Sky 7
Comedy Friends with Kids 10
Comedy Magic Mike 7
Comedy The Campaign 56
Comedy The Five-Year Engagement 30
Comedy Dark Shadows 150
Comedy To Rome with Love 24.8
Comedy This Is 40* 35
Comedy The Dictator 65
Comedy Celeste and Jesse Forever 8
Comedy Jeff, Who Lives at Home 10
Comedy Project X 12
Comedy Your Sister's Sister 0.125
Comedy Seeking a Friend for the End of the World 10
Comedy American Reunion 50
Comedy Men in Black 3 215
Comedy Safety Not Guaranteed 0.75
Comedy The Best Exotic Marigold Hotel 10
Comedy 21 Jump Street 42
Comedy Ted 65
Comedy Seven Psychopaths 15
Comedy Moonrise Kingdom 16
Drama Good Deeds 14
Drama Darling Companion 12
Drama Won't Back Down 19
Drama Cosmopolis 20
Drama W.E. 29
Drama Big Miracle 30
Drama Deadfall 12
Drama The Odd Life of Timothy Green 25
Drama Compliance 10
Drama Arbitrage 13
Drama The Words 6
Drama Salmon Fishing in the Yemen 14.5
Drama Smashed 5
Drama People Like Us 16
Drama Anna Karenina 50
Drama Hitchcock* 15
Drama Beasts of the Southern Wild 1.8
Drama We Need to Talk About Kevin 7
Drama Flight 31
Drama The Impossible 45
Drama The Master 35
Drama Silver Linings Playbook 21
Drama Argo 44.5
Drama The Perks of Being a Wallflower 13
Drama Lincoln 60
Drama Life of Pi 120
Sci-Fi Total Recall 125
Sci-Fi Chronicle 15
Sci-Fi Prometheus 130
Sci-Fi Cloud Atlas 102
Q1(i) Is it a good idea to make a bigger budget movie for profit?
Ans Not necessarily. The correlation, hypothesis test and regression model indicate that the budget of the movie is dependent on the weekend collection, gross collection and viewer rating.
(ii) Why does the length of the movie affect the budget?
Ans The length of the movie affects the budget because, for example, Sci-Fi and Action movies require special effects, which create excitement for the viewer.
(iii) What is the correlation between viewer rating and gross collection?
Ans
Q2)(a) Ans
(b)
(c) Mean 51.49267677
Variance 3653.788039
Mode 20
Standard Deviation 60.44657177
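The four summary figures above can be reproduced with Python's standard `statistics` module. The sketch below is only an illustration: it uses the first five budgets from the table rather than the full data set, and it assumes the spreadsheet reports the sample variance (n - 1 denominator), which is an assumption about how the figures were produced. It also checks that the reported standard deviation is the square root of the reported variance.

```python
import math
import statistics

# First five budgets ($ millions) from the table above; an illustrative subset only.
sample_budgets = [57, 20, 35, 65, 65]

mean = statistics.mean(sample_budgets)
variance = statistics.variance(sample_budgets)  # sample variance (n - 1), assumed
mode = statistics.mode(sample_budgets)
std_dev = statistics.stdev(sample_budgets)      # square root of the sample variance

# Consistency check on the reported figures: the standard deviation
# should equal the square root of the variance.
reported_variance = 3653.788039
reported_std_dev = 60.44657177
reported_consistent = math.isclose(
    math.sqrt(reported_variance), reported_std_dev, rel_tol=1e-6
)

print(mean, variance, mode, std_dev, reported_consistent)
```

Running the same four calls on the full budget column would be the way to compare against the figures reported in (c).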
Q3) Correlation and Equality of Means
Budget ($)in Millions
Budget ($)in Millions 1
First week collection($) in Mi 0.669285261
Gross ($) in Millions 0.790045357
Length in minutes 0.566813015
Viewer Rating 0.233063936
*Movies released in recent week.
The data for movies released during the year 2012 show only a weak correlation between viewer rating and gross collection (coefficient of about 0.34).
Sources: www.Imdb.com, www.wikipedia.com, www.boxofficemojo.com
Units of measurement: budget, weekend collection and gross collection are measured in $ millions.
Length is measured in minutes and viewer rating is scored out of 10.
H0: The first weekend collection has no impact on gross collection for all the movies released during the year 2012.
Mathematically, H0: µ(first weekend collection) - µ(gross collection) = 0

First weekend collection($) in Mi
Mean 19.86907785
Variance 1048.60121
Observations 99
Hypothesized Mean Difference 0
df 101
t Stat -5.000790229
P(T<=t) one-tail 1.20597E-06
t Critical one-tail 1.66008063
P(T<=t) two-tail 2.41194E-06
t Critical two-tail 1.983731003
Regression Model
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.937311029
R Square 0.878551965
Adjusted R Square 0.877299923
Standard Error 85.13958791
Observations 99
ANOVA
df
Regression 1
Residual 97
Total 98
Coefficients
Intercept 3.322237343
First weekend collection($) in Mi 7.035381893
RESIDUAL OUTPUT
Observation Predicted Gross ($) in Millions
t-Test: Two-Sample Assuming Unequal Variances
Mathematically, H0: µ(first weekend collection) - µ(gross collection) = 0

H1: Gross collection of all the movies released during the year 2012 is directly proportional to the weekend collection.
Mathematically, H1: µ(first weekend collection) - µ(gross collection) ≠ 0

Let us take α = 0.05 for this hypothesis test. We use a two-tailed test with the t statistic from a two-sample t-test assuming unequal variances, with a hypothesized mean difference of zero.
The absolute value of the t stat (5.00) is greater than the two-tail t critical value (1.98), and the two-tail P-value (2.41194E-06) is below α. Hence we can reject H0 and conclude in favour of H1, which states that the gross collection of all the movies released during the year 2012 is directly proportional to the weekend collection.
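The t statistic in the output above can be recomputed from the summary figures alone. The following is a minimal sketch of the unequal-variance (Welch) two-sample t-test, using the means, variances and observation counts reported in this workbook; the function name `welch_t` is made up for the illustration.

```python
import math

def welch_t(mean1, var1, n1, mean2, var2, n2):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    se1 = var1 / n1   # squared standard error of the first sample mean
    se2 = var2 / n2   # squared standard error of the second sample mean
    t = (mean1 - mean2) / math.sqrt(se1 + se2)
    df = (se1 + se2) ** 2 / (se1 ** 2 / (n1 - 1) + se2 ** 2 / (n2 - 1))
    return t, df

# Summary statistics as reported in this workbook ($ millions).
t_stat, df = welch_t(19.86907785, 1048.60121, 99,    # first weekend collection
                     143.1087879, 59076.97557, 99)   # gross collection

t_critical_two_tail = 1.983731003  # value reported in the t-test output
reject_h0 = abs(t_stat) > t_critical_two_tail

print(round(t_stat, 4), round(df, 2), reject_h0)
```

The decision rule compares the absolute value of the t statistic with the two-tail critical value, which is why a large negative t stat still leads to rejecting H0.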

1 158.9120577
2 8.951893652
3 4.610591653
4 151.432694
5 103.7640489
6 58.9790028
7 238.706332
8 44.83099051
9 62.59793289
10 182.9694828
11 47.1655835
12 125.7796021
13 398.835226
14 175.5246909
15 174.6324849
16 351.6775466
17 58.84926332
18 47.64514327
19 271.6715777
20 215.6513855
21 285.9526642
22 204.4734442
23 5.643040981
24 439.5488966
25 113.0741949
26 1076.46947
27 47.49381924
28 4.826296462
29 95.85638517
30 149.6690996
31 625.0017462
32 1462.732768
33 1135.225799
34 219.2240369
35 181.954629
36 32.17445809
37 122.9949628
38 84.3402178
39 97.97425322
40 130.8883798
41 107.4458894
42 49.23971258
43 46.77472582
44 4.334875036
45 3.734433333
46 239.9664744
47 106.204841
48 3.5332988
49 17.51588111
50 278.5968207
51 190.3822074
52 77.96806136
53 212.1694765
54 5.864535909
55 84.78615548
56 125.9847679
57 4.080545981
58 9.342476948
59 151.4266154
60 84.78615548
61 30.21711635
62 154.6820062
63 387.4032862
64 4.010030348
65 8.507672603
66 258.7249765
67 386.1539853
68 32.69435874
69 7.001713932
70 112.9610941
71 3.603385275
72 21.6379395
73 3.81709907
74 3.653420911
75 57.91824309
76 3.458660434
77 79.46549314
78 3.437807562
79 17.40823273
80 36.74659097
81 4.911487901
82 3.511791638
83 33.26076327
84 5.578413963
85 5.346422245
86 4.516155722
87 3.495216278
88 178.5072285
89 6.136390101
90 8.502466421
91 6.438932628
92 140.2174651
93 4.928830117
94 9.965804749
95 161.2772124
96 183.2715329
97 158.12947
98 362.4791936
99 70.94806584
CONCLUSION
[Figure: First weekend collection ($ millions) residual plot. X axis: first weekend collection ($ millions); Y axis: residuals.]
The analysis of the data above for movies released during the year 2012 indicates that gross collections were dependent on first weekend collections.
In action movies, viewer rating and length were dependent on the gross collection, but the regression test showed that viewer rating and length are only partially related (R square of about 45%).
Comedy movies data showed a direct relation with gross collection, which is explained by the regression test: about 70.8% of the variance of gross collection can be explained by the variance of first weekend collection, thus the regression model fits well and supports our test.
Drama movies were dependent on the budget and have a relation with weekend collection; 73.4% of the variance of gross collection can be explained by the variance of budget, thus the regression model fits well and supports our test.
In our data the Sci-Fi movies released during the year 2012 make up a small percentage compared to the other genres. In Sci-Fi movies budget and length were correlated, as budgets were higher for the sci-fi movies, and only 34% of the variance of budget can be explained by the variance of the length of the movie; thus the regression model is not very accurate and only partially supports our test.

MIB 2012 (SEPT)
PROJECT BY: SAYED ANWAR & HETAL KHATRI
First weekend collection($) in Mi Gross ($) in Millions Length in minutes Viewer Rating
22.12 132.5 95 4.4
0.80 16.8 93 4.8
0.18 2.5 96 5.3
21.05 221.6 95 5.3
14.28 39 114 5.5
7.91 18.4 96 5.8
33.46 301.9 99 5.8
5.90 14.4 100 5.9
8.43 33.3 93 5.9
25.53 302 131 6
6.23 28 95 6.1
17.41 156.3 97 6.3
56.22 396.3 127 6.3
24.48 80.4 110 6.4
24.35 96.2 110 6.4
49.51 365 91 6.4
7.89 40.3 95 6.5
6.30 30.6 91 6.6
38.14 276 135 6.7
30.18 282.7 132 6.7
40.17 207.8 115 6.8
28.59 312.5 103 7
0.33 7.5 96 7.1
62.00 752 136 7.2
15.60 110 130 7.3
152.54 686.6 142 7.3
6.28 36.2 95 7.4
0.21 4.1 101 7.6
13.15 40 109 7.7
20.80 166 118 7.8
88.36 978 143 8
207.44 1550.1 143 8.4
160.89 1081 165 8.7
30.69 150 165 8.8
25.39 65.6 114 3.9
4.10 9.2 90 5
17.01 53 92 5.1
11.52 36.8 91 5.1
13.45 57.7 114 5.5
18.13 162.8 106 5.5
14.80 29.3 105 5.6
6.53 21.4 98 5.6
6.18 20.5 91 5.6
0.14 1.2 85 5.7
0.06 1.3 99 6
33.64 99.19 123 6
14.62 76.5 94 6
0.03 8 93 6.1
2.02 12 100 6.1
39.13 165 110 6.2
26.59 103 85 6.2
10.61 53.7 124 6.3
29.69 238.7 150 6.3
0.36 73 112 6.4
11.58 20.7 133 6.5
17.44 177.5 83 6.5
0.11 26 92 6.6
0.86 4.5 83 6.6
21.05 101 88 6.6
11.58 1.1 90 6.7
3.82 9.6 101 6.7
21.51 234.7 113 6.9
54.59 624 106 6.9
0.10 4 86 7.1
0.74 134 124 7.2
36.30 202 109 7.2
54.42 501.7 106 7.3
4.17 15.1 110 7.8
0.52 65 94 7.9
15.58 35 111 4.3
0.04 7.9 103 4.6
2.60 5.2 121 4.9
0.07 6.5 109 5.3
0.05 0.89 119 5.4
7.76 24 107 6.3
0.02 0.45 95 6.4
10.82 51.6 104 6.5
0.02 31 90 6.7
2.00 23 100 6.7
4.75 11.4 96 6.8
0.23 34.5 107 6.8
0.03 2.9 81 7
4.26 12.4 114 7.1
0.32 27 130 7.1
0.29 4.5 98 7.3
0.17 11 93 7.5
0.02 6 112 7.5
24.90 95.5 139 7.5
0.40 60.3 113 7.7
0.74 18.8 143 7.8
0.44 32 122 8.2
19.46 159.6 120 8.2
0.23 28 102 8.3
0.94 122.2 150 8.3
22.45 240 127 8.3
25.58 198 118 6.3
22.00 126 83 7.1
51.05 402.52 124 7.2
9.61 65.6 171 8.1
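Statistic First weekend collection($) in Mi Gross ($) in Millions Length in minutes Viewer Rating
Mean 19.86907785 143.1087879 109.6161616 6.586868687
Variance 1048.60121 59076.97557 383.1981035 1.045029891
Mode 11.579175 28 95 6.3
Standard Deviation 32.3821125 243.0575561 19.57544644 1.022267035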
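Correlation matrix (columns: First weekend collection($) in Mi, Gross ($) in Millions, Length in minutes, Viewer Rating)
First weekend collection($) in Mi 1
Gross ($) in Millions 0.937311029 1
Length in minutes 0.468210392 0.473707897 1
Viewer Rating 0.277329405 0.344214701 0.440210241 1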
From the correlation matrix above we can see that, for the movies released during the year 2012, the weekend collection has a big impact on the gross collection of the movies.
The coefficient of correlation with gross collection is highest for weekend collection (0.937311029): the higher the weekend collection, the higher the gross collection.
t-Test second sample: Gross ($) in Millions
Mean 143.1087879
Variance 59076.97557
Observations 99
ANOVA: SS MS F Significance F
Regression 5086414.911 5086414.911 701.6955077 3.37199E-46
Residual 703128.6946 7248.749429
Total 5789543.605
Coefficients: Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 10.05320477 0.3304655 0.741760972 -16.63059126 23.27507
First weekend collection($) in Mi 0.265590984 26.48953582 3.37199E-46 6.508257309 7.562506
Residuals
-26.41205774
7.848106348
-2.110591653
70.167306
-64.76404889
-40.5790028
63.19366799
-30.43099051
-29.29793289
119.0305172
-19.1655835
30.5203979
-2.535226016
-95.12469093
-78.4324849
13.32245337
-18.54926332
-17.04514327
4.328422285
67.04861446
-78.15266424
108.0265558
1.856959019
312.4511034
-3.074194882
-389.8694699
-11.29381924
-0.726296462
-55.85638517
16.33090036
352.9982538
87.36723239
-54.22579948
-69.22403689
-116.354629
-22.97445809
-69.99496277
-47.5402178
-40.27425322
31.91162016
-78.14588937
-27.83971258
-26.27472582
-3.134875036
-2.434433333
-140.7764744
-29.70484097
4.4667012
-5.515881111
-113.5968207
-87.3822074
-24.26806136
26.53052345
67.13546409
-64.08615548
51.51523209
21.91945402
-4.842476948
-50.42661543
-83.68615548
-20.61711635
80.01799377
236.5967138
-0.010030348
125.4923274
-57.14497649
115.5460147
-17.59435874
57.99828607
-77.96109408
4.296614725
-16.4379395
2.68290093
-2.763420911
-33.91824309
-3.008660434
-27.86549314
27.56219244
5.591767268
-25.34659097
29.5885121
-0.611791638
-20.86076327
21.06158604
-0.846422245
6.483844278
2.504783722
-83.00722852
54.1636099
10.29753358
25.56106737
19.38253492
23.07116988
112.2341953
78.72278758
14.72846715
-32.12946999
40.04080642
-5.348065843
To establish the relation mentioned above we use regression analysis, taking gross collection as the dependent variable and first weekend collection as the causal variable; hence we plot gross collection on the Y axis and first weekend collection on the X axis. The regression table shows that R square is about 87.9%, which indicates that 87.9% of the variance of gross collection can be explained by the variance of first weekend collection; thus the regression model fits the data well and supports our hypothesis test.
The Significance F is 3.37199E-46, far below 0.05, which indicates that the probability of obtaining such a regression by chance is negligible; hence this model can be considered significant, again supporting our hypothesis.
Another observation can be drawn from the P-values of the coefficients. The P-value of the Y intercept is 0.74, which is high, so the intercept is not significantly different from zero. The P-value of the first weekend collection coefficient (3.37199E-46) is very low, so the probability that the slope arose by chance is very low.
The fourth and most important inference comes from the residual plot, which does not follow a specific pattern; hence the regression can be considered robust, and the hypothesis that gross collection is a dependent variable of first weekend collection holds true.
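For a regression with a single explanatory variable, the slope, intercept and R square follow directly from the correlation coefficient and the summary statistics. The sketch below recomputes them from the figures reported in this workbook; it is an illustration of that relationship, not a re-run of the original Excel regression.

```python
import math

# Summary figures reported in the workbook ($ millions).
r = 0.937311029                            # correlation: first weekend vs gross
mean_x, var_x = 19.86907785, 1048.60121    # first weekend collection
mean_y, var_y = 143.1087879, 59076.97557   # gross collection

# Simple linear regression identities:
#   slope     = r * (std dev of y / std dev of x)
#   intercept = mean of y - slope * mean of x
#   R square  = r squared
slope = r * math.sqrt(var_y / var_x)
intercept = mean_y - slope * mean_x
r_squared = r ** 2

print(round(slope, 4), round(intercept, 4), round(r_squared, 4))
```

These values can be compared with the Coefficients and R Square rows of the summary output.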
Title Budget ($)in Millions First weekend collection($) in Mi
Ghost Rider: Spirit of Vengeance 57.00 22.12
The Cold Light of Day 20.00 0.80
Stolen 35.00 0.18
Resident Evil: Retribution 65.00 21.05
Red Dawn 65.00 14.28
The Man with the Iron Fists 20.00 7.91
Wrath of the Titans 150.00 33.46
Hit and Run 2.00 5.90
Haywire 23.00 8.43
Battleship 209.00 25.53
Lockout 20.00 6.23
This Means War 65.00 17.41
Snow White and the Huntsman 170.00 56.22
Act of Valor 12.00 24.48
Contraband 25.00 24.35
Taken 2 45.00 49.51
Safe 33.00 7.89
Premium Rush 35.00 6.30
The Bourne Legacy 125.00 38.14
John Carter 250.00 30.18
Safe House 85.00 40.17
The Expendables 2 100.00 28.59
Get the Gringo 20.00 0.33
The Amazing Spider-Man 230.00 62.00
Jack Reacher* 60.00 15.60
The Hunger Games 78.00 152.54
Dredd 45.00 6.28
The Raid: Redemption 1.10 0.21
End of Watch 7.00 13.15
Looper 30.00 20.80
Skyfall 200.00 88.36
The Avengers 220.00 207.44
The Dark Knight Rises 250.00 160.89
Django Unchained* 83.00 30.69
Correlation and Equality of Means
Budget ($)in Millions
Budget ($)in Millions 1
First week collection($) in Mi 0.63709825
Gross ($) in Millions 0.767998296
Length in minutes 0.702548668
Viewer Rating 0.336751297
Viewer rating correlates most strongly with length (coefficient of about 0.68), so the length of action movies is related to the viewer rating.
The more action scenes in the movie, the higher the viewer rating.
Viewer Rating
Mean 6.652941176
Variance 1.100142602
Observations 34
Hypothesized Mean Difference 0
df 33
t Stat -28.77296391
P(T<=t) one-tail 2.98714E-25
t Critical one-tail 1.692360309
P(T<=t) two-tail 5.97428E-25
t Critical two-tail 2.034515297
t-Test: Two-Sample Assuming Unequal Variances
H0: The length of the movie has no impact on viewer rating for all action movies released during the year 2012.
Mathematically, H0: µ(length of movie) - µ(viewer rating) = 0

H1: Viewer rating of all the action movies released during the year 2012 is directly proportional to the length of the movie.
Mathematically, H1: µ(length of movie) - µ(viewer rating) ≠ 0

Let us take α = 0.05 for this hypothesis test. We use a two-tailed test with the t statistic from a two-sample t-test assuming unequal variances, with a hypothesized mean difference of zero.
The absolute value of the t stat (28.77) is greater than the two-tail t critical value (2.03), hence we can reject H0 and conclude in favour of H1: viewer rating of all the action movies released during the year 2012 is directly proportional to the length of the movie. Action movies rely on special effects, and a longer movie has room for more of them, hence the good viewer rating.

Regression Model
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.676707073
R Square 0.457932462
Adjusted R Square 0.440992852
Standard Error 16.15683708
Observations 34
ANOVA
df
Regression 1
Residual 32
Total 33
Coefficients
Intercept 20.65671279
Viewer Rating 13.94196183
RESIDUAL OUTPUT
Observation Predicted Length in minutes
1 82.00134483
2 87.57812956
3 94.54911047
4 94.54911047
5 97.33750284
6 101.5200914
7 101.5200914
8 102.9142876
9 102.9142876
10 104.3084837
11 105.7026799
12 108.4910723
13 108.4910723
14 109.8852685
15 109.8852685
16 109.8852685
17 111.2794647
18 112.6736608
19 114.067857
20 114.067857
21 115.4620532
22 118.2504456
23 119.6446418
24 121.0388379
25 122.4330341
26 122.4330341
27 123.8272303
28 126.6156227
29 128.0098189
30 129.404015
31 132.1924074
32 137.7691921
33 141.9517807
34 143.3459769
Gross ($) in Millions Length in minutes Viewer Rating
132.50 95.00 4.40
16.80 93.00 4.80
2.50 96.00 5.30
221.60 95.00 5.30
39.00 114.00 5.50
18.40 96.00 5.80
301.90 99.00 5.80
14.40 100.00 5.90
33.30 93.00 5.90
302.00 131.00 6.00
28.00 95.00 6.10
156.30 97.00 6.30
396.30 127.00 6.30
80.40 110.00 6.40
96.20 110.00 6.40
365.00 91.00 6.40
40.30 95.00 6.50
30.60 91.00 6.60
276.00 135.00 6.70
282.70 132.00 6.70
207.80 115.00 6.80
312.50 103.00 7.00
7.50 96.00 7.10
752.00 136.00 7.20
110.00 130.00 7.30
686.60 142.00 7.30
36.20 95.00 7.40
4.10 101.00 7.60
40.00 109.00 7.70
166.00 118.00 7.80
978.00 143.00 8.00
1550.10 143.00 8.40
1081.00 165.00 8.70
150.00 165.00 8.80
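Correlation matrix (columns: First weekend collection($) in Mi, Gross ($) in Millions, Length in minutes, Viewer Rating)
First weekend collection($) in Mi 1
Gross ($) in Millions 0.942919066 1
Length in minutes 0.678489345 0.669238903 1
Viewer Rating 0.519736102 0.51819843 0.676707073 1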
t-Test second sample: Length in minutes
Mean 113.4117647
Variance 466.9768271
Observations 34
ANOVA: SS MS F Significance F
Regression 7056.846996 7056.846996 27.0332344 1.1127E-05
Residual 8353.388298 261.0433843
Total 15410.23529
Coefficients: Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0%
Intercept 18.05364614 1.144185093 0.261033735 -16.11736101 57.43078659 -16.11736101
Viewer Rating 2.681481989 5.19934942 1.1127E-05 8.479961753 19.4039619 8.479961753
Residuals
12.99865517
5.421870443
1.45088953
0.45088953
16.66249716
-5.520091383
-2.520091383
-2.914287566
-9.914287566
26.69151625
-10.70267993
-11.4910723
18.5089277
0.114731521
0.114731521
-18.88526848
-16.27946466
-21.67366084
20.93214297
17.93214297
-0.46205321
-15.25044558
-23.64464176
14.96116206
7.566965877
19.56696588
-28.82723031
-25.61562267
-19.00981885
-11.40401504
10.8075926
5.230807868
23.04821932
21.65402314
To establish the relation mentioned above we use regression analysis, taking viewer rating as the dependent variable and length of the movie as the causal variable; hence we plot viewer rating on the Y axis and length of the movie on the X axis. (The summary output above was produced with the variables the other way round, predicting length in minutes from viewer rating; for a single-variable regression R square is the same either way.) R square is only about 45.8%, which indicates that only 45.8% of the variance of viewer rating can be explained by the variance of the length of the movie; thus the regression model is not very accurate and only partially supports our hypothesis test.
The Significance F is 1.1127E-05, well below 0.05, which indicates that the probability of obtaining such a regression by chance is very small; hence the relationship can be considered significant, again supporting our hypothesis.
Another observation can be drawn from the P-values of the coefficients. The P-value of the Y intercept is 0.26, which is not low, so the intercept is not significantly different from zero. The P-value of the slope (1.1127E-05) is very low, so the probability that the slope arose by chance is very low.
The fourth and most important inference comes from the residual plot, which does not follow a specific pattern; hence the regression can be considered robust, and the hypothesis that viewer rating is a dependent variable of the length of the movie partially holds true.
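The F statistic and the slope's t statistic in a single-variable regression are both determined by R square and the number of observations. The sketch below recomputes them from the R Square and Observations figures reported for the action movies; it is a consistency check on the output, under the assumption that the regression has exactly one explanatory variable.

```python
import math

r_squared = 0.457932462   # R Square reported for the action-movie regression
n = 34                    # observations

# With one predictor there is 1 regression df and n - 2 residual df.
f_stat = (r_squared / 1) / ((1 - r_squared) / (n - 2))

# For a single predictor, F equals the square of the slope's t statistic.
t_slope = math.sqrt(f_stat)

print(round(f_stat, 4), round(t_slope, 4))
```

A modest R square can therefore still give a small Significance F: the fit explains under half of the variance, yet with 34 observations that is enough to rule out a chance relationship at α = 0.05.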
Action
Title Budget ($)in Millions First weekend collection($) in Mi
Madea's Witness Protection 20 25.390575
Fun Size 14 4.101017
The Three Stooges 30 17.010125
One For The Money 40 11.51579
That's My Boy 70 13.453714
Mirror Mirror 85 18.132085
Parental Guidance 6.5 14.8
Wanderlust 35 6.52665
A Thousand Words 40 6.17628
For a Good Time, Call... 5.7 0.143935
Damsels in Distress 3 0.058589
Think Like a Man 12 33.636303
Diary of a Wimpy Kid: Dog Days 22 14.623599
Iron Sky 7 0.03
Friends with Kids 10 2.017466
Magic Mike 7 39.12717
The Campaign 56 26.58846
The Five-Year Engagement 30 10.61006
Dark Shadows 150 29.685274
To Rome with Love 24.8 0.361359
This Is 40* 35 11.579175
The Dictator 65 17.435092
Celeste and Jesse Forever 8 0.107785
Jeff, Who Lives at Home 10 0.855709
Project X 12 21.051363
Your Sister's Sister 0.125 11.579175
Seeking a Friend for the End of the World 10 3.822803
American Reunion 50 21.51408
Men in Black 3 215 54.592779
Safety Not Guaranteed 0.75 0.097762
The Best Exotic Marigold Hotel 10 0.737051
21 Jump Street 42 36.302612
Ted 65 54.415205
Seven Psychopaths 15 4.174915
Correlation and Equality of Means
Budget ($)in Millions
Budget ($)in Millions 1
First week collection($) in Million 0.613835983
Gross ($) in Millions 0.782226864
Length in minutes 0.334702147
Viewer Rating 0.082634184
First weekend collection($) in Mi
Mean 15.06629285
Variance 228.0955534
Observations 34
Hypothesized Mean Difference 0
df 34
t Stat -3.498155532
P(T<=t) one-tail 0.000663816
t Critical one-tail 1.690924255
P(T<=t) two-tail 0.001327631
t Critical two-tail 2.032244509
t-Test: Two-Sample Assuming Unequal Variances
In comedy movies the weekend collection matters. The weekend collection has a strong impact on the gross collection, as their coefficient of correlation is the highest. The budget has little relation with the length of the movie. The viewer rating does not matter, because every person has their own taste in comedy.
H0: The first weekend collection has no impact on gross collection for all the comedy movies released during the year 2012.
Mathematically, H0: µ(weekend collection) - µ(gross collection) = 0

H1: Gross collection of all the comedy movies released during the year 2012 is directly proportional to the weekend collection.
Mathematically, H1: µ(weekend collection) - µ(gross collection) ≠ 0

Let us take α = 0.05 for this hypothesis test. We use a two-tailed test with the t statistic from a two-sample t-test assuming unequal variances, with a hypothesized mean difference of zero.
The absolute value of the t stat (3.50) is greater than the two-tail t critical value (2.03), hence we can reject H0 and conclude in favour of H1, which states that the gross collection of all the comedy movies released during the year 2012 is directly proportional to the weekend collection.

Regression Model
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.841320894
R Square 0.707820847
Adjusted R Square 0.698690248
Standard Error 75.73546297
Observations 34
ANOVA
df
Regression 1
Residual 32
Total 33
Coefficients
Intercept -17.46393553
First weekend collection($) in Mi 7.685921708
RESIDUAL OUTPUT
Observation Predicted Gross ($) in Millions
1 177.686036
2 14.05616006
3 113.2745535
4 71.04552481
5 85.94025695
6 121.8978502
7 96.28770575
8 32.69938539
9 30.006469
10 -16.35766239
11 -17.01362506
12 241.0620559
13 94.93190147
14 -17.23335788
15 -1.957849803
16 283.2644297
17 186.8928864
18 64.08415495
19 210.6947563
20 -14.68655854
21 71.53269696
22 116.5408166
23 -16.63550846
24 -10.88702315
25 144.3351923
26 71.53269696
27 11.91782903
28 147.891599
29 402.1318897
30 -16.71254445
31 -11.79901925
32 261.5550981
33 400.7670698
34 14.6241343
[Figure: First weekend collection ($ millions) residual plot. X axis: first weekend collection ($ millions); Y axis: residuals.]
Gross ($) in Millions Length in minutes Viewer Rating
65.6 114 3.9
9.2 90 5
53 92 5.1
36.8 91 5.1
57.7 114 5.5
162.8 106 5.5
29.3 105 5.6
21.4 98 5.6
20.5 91 5.6
1.2 85 5.7
1.3 99 6
99.19 123 6
76.5 94 6
8 93 6.1
12 100 6.1
165 110 6.2
103 85 6.2
53.7 124 6.3
238.7 150 6.3
73 112 6.4
20.7 133 6.5
177.5 83 6.5
26 92 6.6
4.5 83 6.6
101 88 6.6
1.1 90 6.7
9.6 101 6.7
234.7 113 6.9
624 106 6.9
4 86 7.1
134 124 7.2
201.58 109 7.2
501.7 106 7.3
15.1 110 7.8
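Correlation matrix (columns: First weekend collection($) in Mi, Gross ($) in Millions, Length in minutes, Viewer Rating)
First weekend collection($) in Mi 1
Gross ($) in Millions 0.841320894 1
Length in minutes 0.29738332 0.267165364 1
Viewer Rating 0.109791556 0.330282543 0.110400891 1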
t-Test second sample: Gross ($) in Millions
Mean 98.33441176
Variance 19036.42455
Observations 34
ANOVA: SS MS F Significance F
Regression 444654.479 444654.479 77.52184532 4.63733E-10
Residual 183547.5313 5735.860352
Total 628202.0102
Coefficients: Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0%
Intercept 18.48447165 -0.944789543 0.351845885 -55.11557216 20.1877011 -55.11557216
First weekend collection($) in Mi 0.872939017 8.804649074 4.63733E-10 5.907803117 9.464040298 5.907803117
Residuals
-112.086036
-4.856160057
-60.27455346
-34.24552481
-28.24025695
40.90214982
-66.98770575
-11.29938539
-9.506468997
17.55766239
18.31362506
-141.8720559
-18.43190147
25.23335788
13.9578498
-118.2644297
-83.89288636
-10.38415495
28.00524369
87.68655854
-50.83269696
60.95918345
42.63550846
15.38702315
-43.33519233
-70.43269696
-2.317829035
86.80840104
221.8681103
20.71254445
145.7990192
-59.97509809
100.9329302
0.475865701
To establish the relation mentioned above we use regression analysis, taking Gross
collection as the dependent variable and first weekend collection as the explanatory
variable; Gross collection is therefore plotted on the Y axis and first weekend collection
on the X axis. The regression table shows an R square of about 70.8% (regression SS
444654.479 out of total SS 628202.0102), which means that about 70.8% of the variance in
gross collection is explained by the variance in first weekend collection. The model
therefore fits well and supports our hypothesis.
Significance F is 4.64E-10, that is, the probability of obtaining such a regression by
chance is far below 1%, so the model can be considered significant, again supporting our
hypothesis.
Another observation concerns the p-values of the coefficients. The p-value of the first
weekend collection coefficient is 4.64E-10, so the slope is highly significant. The p-value
of the Y intercept is 0.35, which is above 0.05, so the intercept is not significantly
different from zero; this does not weaken the relationship itself.
The fourth and most important inference comes from the residual plot, which does not
follow a specific pattern. The regression can therefore be considered robust, and the
hypothesis that Gross collection is a dependent variable of first weekend collection
holds true.
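As a rough cross-check, R square and F can be recomputed from the ANOVA sums of squares. The sketch below uses Python rather than the Excel tooling of the original analysis; the variable names are illustrative, and the count of 34 observations is an assumption inferred from the number of residuals listed, not a value read from the table.

```python
# Sums of squares copied from the ANOVA table above.
ss_regression = 444654.479
ss_residual = 183547.5313
ss_total = ss_regression + ss_residual

# Degrees of freedom: one predictor; 34 observations is an assumption
# inferred from the length of the residual list.
df_regression = 1
df_residual = 34 - 2

# R square is the share of total variation explained by the regression.
r_square = ss_regression / ss_total

# F is the ratio of the regression mean square to the residual mean square.
f_stat = (ss_regression / df_regression) / (ss_residual / df_residual)

print(f"R square = {r_square:.4f}")
print(f"F = {f_stat:.2f}")
```

The F value obtained this way can be compared directly with the F column of the ANOVA table.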
Upper 95.0%
20.1877011
9.464040298
Comedy
Title Budget ($) in Millions First weekend collection ($) in Millions
Good Deeds 14 15.583924
Darling Companion 12 0.039962
Won't Back Down 19 2.60337
Cosmopolis 20 0.070339
W.E. 29 0.047074
Big Miracle 30 7.760205
Deadfall 12 0.019391
The Odd Life of Timothy Green 25 10.822903
Compliance 10 0.016427
Arbitrage 13 2.002165
The Words 6 4.750894
Salmon Fishing in the Yemen 14.5 0.225894
Smashed 5 0.026943
People Like Us 16 4.255423
Anna Karenina 50 0.32069
Hitchcock* 15 0.287715
Beasts of the Southern Wild 1.8 0.169702
We Need to Talk About Kevin 7 0.024587
Flight 31 24.900566
The Impossible 45 0.4
The Master 35 0.736311
Silver Linings Playbook 21 0.443003
Argo 44.5 19.458109
The Perks of Being a Wallflower 13 0.228359
Lincoln 60 0.944308
Life of Pi 120 22.451514
Correlation and Equality of Means
Budget ($)in Millions
Budget ($)in Millions 1
First week collection ($) in Millions 0.502686058
Gross ($) in Millions 0.85712907
Length in minutes 0.616787474
Viewer Rating 0.384348525
In Drama movies, the Gross collection depends most strongly on the budget of the movie: the coefficient of
correlation between budget and gross collection (0.857) is the highest in the correlation matrix.
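The budget and gross correlation can be reproduced outside Excel. The following is a minimal Python sketch, assuming the 26 budget values and the 26 gross values from the Gross ($) in Millions column of this sheet are paired in table order; the function name `pearson` is illustrative.

```python
from math import sqrt

# Budget and gross ($ millions) for the 26 drama titles, in table order.
budget = [14, 12, 19, 20, 29, 30, 12, 25, 10, 13, 6, 14.5, 5,
          16, 50, 15, 1.8, 7, 31, 45, 35, 21, 44.5, 13, 60, 120]
gross = [35, 7.9, 5.2, 6.5, 0.89, 24, 0.45, 51.6, 31, 23, 11.4, 34.5, 2.9,
         12.4, 26.64, 4.5, 11, 6, 95.5, 60.3, 18.8, 32, 159.6, 28, 122.2, 240]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


r = pearson(budget, gross)
print(round(r, 4))
```

The same function can be applied to any other pair of columns to rebuild the rest of the correlation matrix.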
Budget ($)in Millions
Mean 25.72307692
Variance 591.2858462
Observations 26
Hypothesized Mean Difference 0
df 34
t Stat -1.225549156
P(T<=t) one-tail 0.114395125
t Critical one-tail 1.690924255
P(T<=t) two-tail 0.22879025
t Critical two-tail 2.032244509
Regression Model
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.85712907
R Square 0.734670243
Adjusted R Square 0.723614836
Standard Error 29.52882724
Observations 26
ANOVA
df
Regression 1
Residual 24
Total 25
t-Test: Two-Sample Assuming Unequal Variances
H0: The gross collection for all the drama movies released during the year 2012 is independent of
the budget of the movie.
Mathematically H0 : µ budget - µ gross collection = 0

H1: Gross collection of all the drama movies released during the year 2012 is directly proportional
to the budget of the movie.
Mathematically H1 : µ budget - µ gross collection ≠ 0

Let us take α = .05 for this hypothesis test. We use a two-tail test with the t-statistic from a
two-sample t test assuming unequal variances, with the hypothesized mean difference set to zero.
The absolute value of t Stat (1.23) is less than t Critical two-tail (2.03), and the two-tail
p-value (0.229) is greater than α, so this test does not allow us to reject H0: the mean budget and
the mean gross collection are not significantly different. This test compares means only; the claim
that gross collection rises with budget rests on the correlation coefficient (0.857) and on the
regression model.
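The t statistic and degrees of freedom of this test can be recomputed from the summary statistics alone. The sketch below is an illustrative Python version of the unequal-variance (Welch) two-sample test, not the Excel routine itself; the function name `welch_t` is hypothetical, and the gross mean and variance are taken from the Gross ($) in Millions column of the same t-test output.

```python
from math import sqrt


def welch_t(mean1, var1, n1, mean2, var2, n2):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    se1, se2 = var1 / n1, var2 / n2          # squared standard errors
    t = (mean1 - mean2) / sqrt(se1 + se2)
    df = (se1 + se2) ** 2 / (se1 ** 2 / (n1 - 1) + se2 ** 2 / (n2 - 1))
    return t, df


# Budget versus gross for the 26 drama titles.
t_stat, df = welch_t(25.72307692, 591.2858462, 26,
                     40.43384615, 3154.842417, 26)

t_critical = 2.032244509  # t Critical two-tail, from the table above

# Two-tail decision rule: reject H0 only when |t| exceeds the critical value.
reject_h0 = abs(t_stat) > t_critical
print(round(t_stat, 4), round(df), reject_h0)
```

Comparing the absolute value of t with the critical value, rather than the signed value, is what makes this a two-tail decision.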
Coefficients
Intercept -10.49446037
Budget ($)in Millions 1.979868376
RESIDUAL OUTPUT
Observation Predicted Gross ($) in Millions
1 17.22369689
2 13.26396014
3 27.12303877
4 29.10290714
5 46.92172252
6 48.9015909
7 13.26396014
8 39.00224902
9 9.304223388
10 15.24382851
11 1.384749886
12 18.21363108
13 -0.59511849
14 21.18343364
15 88.49895841
16 19.20356527
17 -6.930697291
18 3.364618261
19 50.88145927
20 78.59961653
21 58.80093278
22 31.08277552
23 77.60968234
24 15.24382851
25 108.2976422
26 227.0897447
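Each predicted value in the residual output follows from the fitted line, predicted gross = intercept + slope × budget. A minimal Python sketch, assuming the first observation is the first title of the sheet (budget 14, gross 35); the function name is illustrative.

```python
# Coefficients copied from the regression output above.
INTERCEPT = -10.49446037
SLOPE = 1.979868376


def predicted_gross(budget_millions):
    """Predicted gross ($ millions) for a given budget ($ millions)."""
    return INTERCEPT + SLOPE * budget_millions


# Observation 1: budget 14, actual gross 35 (assumed pairing, table order).
first = predicted_gross(14)
residual = 35 - first  # residual = actual - predicted

print(round(first, 4), round(residual, 4))
```

A negative prediction, such as the one for the smallest budgets, is a reminder that the line is only a linear approximation over the observed budget range.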
[Figure: Budget ($) in Millions Residual Plot. X axis: Budget ($) in Millions; Y axis: Residuals]
Gross ($) in Millions Length in minutes Viewer Rating
35 111 4.3
7.9 103 4.6
5.2 121 4.9
6.5 109 5.3
0.89 119 5.4
24 107 6.3
0.45 95 6.4
51.6 104 6.5
31 90 6.7
23 100 6.7
11.4 96 6.8
34.5 107 6.8
2.9 81 7
12.4 114 7.1
26.64 130 7.1
4.5 98 7.3
11 93 7.5
6 112 7.5
95.5 139 7.5
60.3 113 7.7
18.8 143 7.8
32 122 8.2
159.6 120 8.2
28 102 8.3
122.2 150 8.3
240 127 8.3
First weekend collection ($) in Millions Gross ($) in Millions Length in minutes Viewer Rating
1
0.726299275 1
0.328596164 0.479212016 1
0.110079505 0.485727023 0.279065717 1
Gross ($) in Millions
Mean 40.43384615
Variance 3154.842417
Observations 26
SS MS F Significance F
57944.2211 57944.2211 66.45348039 2.25863E-08
20926.83932 871.9516383
78871.06042
Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0%
8.518614798 -1.231944467 0.229903996 -28.0760172 7.087096462 -28.0760172
0.242872002 8.151900416 2.25863E-08 1.4786052 2.481131551 1.4786052
Residuals
17.77630311
-5.363960139
-21.92303877
-22.60290714
-46.03172252
-24.9015909
-12.81396014
12.59775098
21.69577661
7.756171485
10.01525011
16.28636892
3.49511849
-8.783433641
-61.85895841
-14.70356527
17.93069729
2.635381739
44.61854073
-18.29961653
-40.00093278
0.917224481
81.99031766
12.75617149
13.90235784
12.9102553
To establish the relation mentioned above we use regression analysis, taking Gross
collection as the dependent variable and Budget as the explanatory variable; Gross
collection is plotted on the Y axis and Budget on the X axis. The regression table shows
an R square of 73.5%, which means that 73.5% of the variance in gross collection is
explained by the variance in Budget. The model therefore fits well and supports our
hypothesis.
Significance F is 2.26E-08, that is, the probability of obtaining such a regression by
chance is far below 1%, so the model can be considered significant, again supporting our
hypothesis.
Another observation concerns the p-values of the coefficients. The p-value of the Budget
coefficient is 2.26E-08, so the slope is highly significant. The p-value of the Y intercept
is 0.23, above 0.05, so the intercept is not significantly different from zero; this does
not weaken the relationship itself.
The fourth and most important inference comes from the residual plot, which does not
follow a specific pattern. The regression can therefore be considered robust, and the
hypothesis that Gross collection is a dependent variable of Budget holds true.
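The fit statistics in the summary output are linked to the ANOVA table by simple formulas. The Python sketch below, with illustrative variable names, derives R square, adjusted R square, and the standard error of the regression from the sums of squares, assuming one predictor and 26 observations as reported.

```python
from math import sqrt

# Sums of squares copied from the ANOVA table of the drama regression.
ss_regression = 57944.2211
ss_residual = 20926.83932
n = 26  # observations
k = 1   # predictors (Budget only)

ss_total = ss_regression + ss_residual
r_square = ss_regression / ss_total

# Adjusted R square penalises R square for the number of predictors.
adjusted_r_square = 1 - (1 - r_square) * (n - 1) / (n - k - 1)

# Standard error of the regression is the square root of the residual mean square.
standard_error = sqrt(ss_residual / (n - k - 1))

print(round(r_square, 4), round(adjusted_r_square, 4), round(standard_error, 2))
```

With a single predictor and 26 observations the adjustment is small, which is why R square and adjusted R square are close here.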
Upper 95.0%
7.087096462
2.481131551
Drama
Title Budget ($) in Millions First weekend collection ($) in Millions
Total Recall 125 25.577758
Chronicle 15 22.004098
Prometheus 130 51.050101
Cloud Atlas 102 9.612247
Correlation and Equality of Means
Budget ($)in Millions
Budget ($)in Millions 1
First week collection ($) in Millions 0.386608264
Gross ($) in Millions 0.510235419
Length in minutes 0.591570074
Viewer Rating -0.109305766
Budget ($)in Millions
Mean 93
Variance 2852.666667
Observations 4
Hypothesized Mean Difference 0
df 5
t Stat -0.961115181
P(T<=t) one-tail 0.190317704
t Critical one-tail 2.015048373
P(T<=t) two-tail 0.380635408
t Critical two-tail 2.570581836
t-Test: Two-Sample Assuming Unequal Variances
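In Sci-Fi movies, the length and the budget of the movie are correlated. The public tends to expect Sci-Fi movies to be longer than
movies of other genres, because of the special effects in sci-fi movies, which create excitement for the public. Sci-fi movies are made
on high budgets, and more money invested in sci-fi effects leads to a higher budget.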
H0: Budget of all Sci-Fi movies released during the year 2012 is independent of the length of the movie.
Mathematically H0 : µ length - µ budget = 0

H1: Budget of all Sci-Fi movies released during the year 2012 is directly proportional to the length of the movie.
Mathematically H1 : µ length - µ budget ≠ 0

Let us take α = .05 for this hypothesis test. We use a two-tail test with the t-statistic from a two-sample t test
assuming unequal variances, with the hypothesized mean difference set to zero.
The absolute value of t Stat (0.96) is less than t Critical two-tail (2.57), and the two-tail p-value (0.381) is
greater than α, so this test does not allow us to reject H0: the mean length and the mean budget are not
significantly different. This test compares means only; whether budget rises with length is examined with the
correlation coefficient (0.592) and the regression model.
Regression Model
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.591570074
R Square 0.349955153
Adjusted R Square 0.02493273
Standard Error 52.74032518
Observations 4
ANOVA
df
Regression 1
Residual 2
Total 3
Coefficients
Intercept -15.30259806
Length in minutes 0.873408049
RESIDUAL OUTPUT
Observation Predicted Budget ($)in Millions
1 87.75955171
2 57.19026999
3 93
4 134.0501783
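With only four titles, the fitted line can be reproduced by hand with ordinary least squares. A minimal Python sketch, assuming the lengths from the Length in minutes column of this sheet are paired with the budgets in table order; `fit_line` is an illustrative helper, not part of the original workbook.

```python
# Length (minutes) and budget ($ millions) for the four Sci-Fi titles.
length = [118, 83, 124, 171]
budget = [125, 15, 130, 102]


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(length, budget)

# Residual = actual budget - predicted budget.
residuals = [b - (intercept + slope * l) for l, b in zip(length, budget)]

print(round(intercept, 4), round(slope, 4))
print([round(r, 2) for r in residuals])
```

Residuals of this size relative to the budgets themselves show how loosely four points constrain the line.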
[Figure: Length in minutes Residual Plot. X axis: Length in minutes; Y axis: Residuals]
Gross ($) in Millions Length in minutes Viewer Rating
198 118 6.3
126 83 7.1
402.52 124 7.2
65.6 171 8.1
First weekend collection ($) in Millions Gross ($) in Millions Length in minutes Viewer Rating
1
0.990390527 1
-0.319880107 -0.205474061 1
-0.360688156 -0.34543705 0.648028511 1
Length in minutes
Mean 124
Variance 1308.666667
Observations 4
SS MS F Significance F
2994.9162 2994.9162 1.076710798 0.408429926
5563.0838 2781.5419
8558
Standard Error t Stat P-value Lower 95% Upper 95%
107.6529958 -0.142147443 0.899990505 -478.4960544 447.8908582
0.841720018 1.03764676 0.408429926 -2.748220882 4.49503698
Residuals
37.24044829
-42.19026999
37
-32.0501783
To establish the relation mentioned above we use regression analysis, taking Budget
as the dependent variable and length of the movie as the explanatory variable; Budget is
plotted on the Y axis and length of the movie on the X axis. The regression table shows an
R square of only about 35% (adjusted R square about 2.5%), which means that only about 35%
of the variance in Budget is explained by the variance in length of the movie. The
regression model is therefore not very accurate and supports our hypothesis only weakly.
Significance F is 0.408, that is, there is about a 40.8% probability of obtaining such a
regression by chance. This is well above α = .05, so the model is not statistically
significant.
Another observation concerns the p-values of the coefficients. The p-value of the Y
intercept is 0.90 and the p-value of the length coefficient is 0.41; both are far above
0.05, and the 95% interval for the length coefficient (-2.75 to 4.50) includes zero.
The residual plot does not follow a specific pattern, but with only four observations it
cannot establish robustness. The hypothesis that Budget is a dependent variable of length
of the movie is therefore at best partially supported.

*** The small number of observations in this dataset has produced different results, and
any interpretation drawn from such a small pool of data is unreliable. ***
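The effect of the small sample shows up clearly in the confidence interval for the length coefficient. The Python sketch below rebuilds that interval as coefficient ± t critical × standard error; the critical value 4.302652729 (two-tail 95%, 2 degrees of freedom) is a standard table value supplied here as an assumption, since it does not appear in the regression output.

```python
# Length coefficient and its standard error, from the regression output.
slope = 0.873408049
std_error = 0.841720018

# Assumption: two-tail 95% critical value of Student's t with 2 degrees
# of freedom (4 observations minus 2 estimated coefficients).
t_critical = 4.302652729

lower = slope - t_critical * std_error
upper = slope + t_critical * std_error

# If the interval contains zero, a slope of zero cannot be ruled out.
includes_zero = lower <= 0 <= upper

print(round(lower, 4), round(upper, 4), includes_zero)
```

With only 2 residual degrees of freedom the critical value is large, so the interval is wide enough to contain both zero and strongly positive slopes.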
Lower 95.0% Upper 95.0%
-478.4960544 447.8908582
-2.748220882 4.49503698
Sci-Fi
