Statistics Project - A Matter of Change

A Matter of Change
AP Statistics
Period 3
Nadim Imam Brian Shi

Table of Contents
Preface
Experiment: Part 1
  Randomization
  Visual Representation
  More Questions
Experiment: Part 2
  Contingency Tables
  Bar Graphs
Experiment: Part 3
  χ2 (Chi-Squared) Test of Independence
  Two Sample Z Hypothesis Test for Proportions
  Power and Error
  Two Sample Z Confidence Interval for Proportions
More Inferences
  χ2 (Chi-Squared) Test of Homogeneity
Response Bias
  χ2 (Chi-Squared) Test of Homogeneity
Experiment: Part 4
  Conclusion and Error Report
Appendix
Afterword
Preface
We live in a world of bias.
From the beginning of mankind to the present day, the concept of gender equality has been just that: a
concept. Although we must credit the progress we have achieved, a question arises: have we really
progressed as far as we may believe?
From zero to sixty in 3.5 seconds, we drive our cars while others zoom by, but eventually we face the red
light. We look to one side and see a Mercedes Benz, but we look to the other side and see a male
solicitor. Then we wonder: why do they panhandle? How do they survive? How much do they make?
Green light; time to coast down the street. Red light already? We look to one side and see a Bentley,
but we look to the other side and see a homeless woman. Tough life. Hmm… Is she more likely to
receive money?
In our society, based on democratic values of equality, it would be easy to assume that they receive
change at a similar frequency. But do they? Generalizing to a broader population beyond the homeless,
a similar question arises: do males and females receive money at a similar frequency if they choose to
ask, or is the gender of the solicitor a factor that determines how often he/she receives money from a
random person?
This question is a very interesting one, and we have decided to conduct an experiment to explore any
potential relation between the gender of a solicitor and the number of people who agree to give some
money. The experimental study[1] will be conducted in the cafeteria[2] from 1:53 pm to 2:28 pm for two
days, so any results that we collect can only be generalized to people in the cafeteria at this time.
However, we hope that even in this small setting we may be able to test whether there is any slight
hint of gender inequality.
In short, we will design an experiment to answer the following question:

Does the gender of a solicitor affect the frequency with which a random person would give him/her a
requested amount of money in the school cafeteria[3]?

While our experiment will be designed to answer the above question, other potential questions may
arise. Such questions will be discussed later in the report.
The following is a detailed procedure of our experimental study and the individual stages that were
executed. Before beginning this experiment we hypothesize that the female may receive more money….
[1] The basics will be detailed later.
[2] We will discuss why later.
[3] Our question remains restricted to the school cafeteria because the population from which we draw our data resides only in the cafeteria. Our decision to choose the cafeteria will be explained as we go on.
Experiment: PART 1
Planning
Outline and Basics
The first step of our experiment was to decide on its basic mechanics. We planned to have two
solicitors ask a random person for some money. One of the solicitors would be a male while the other
would be a female. We would rotate the solicitors and record the responses.
Based on these results we would investigate whether there is any relationship between the gender of
the solicitor and the number of people who would give money to him/her. We would conduct these
investigations using multiple statistical inference tests.
Another important decision was that we chose to ask for a quarter each time. It seemed more common
for a person to ask for a quarter than for a dime or nickel. We did not plan to ask for any amount
higher than a quarter because a larger amount would influence an individual’s willingness to give any
money, whereas a quarter is small in value compared to a dollar.
Location and Time

The next step of the experiment was to designate a time and location. We considered several locations:
The Library
The Gym
The Main Mall[4]
The Mall
The Cafeteria
Given that we required a large enough sample to conduct our experiment, we ruled out the library. We
considered the gym because a lot of people would usually hang around there. However, we realized that
it was AP testing week and many students were cleared out. Furthermore, most students in the gym were
dressed out in their uniforms and were very unlikely to carry any bags, purses, or wallets. Next we
considered the main mall. At first we felt that the main mall offered a large sample of students, so we
would have been able to generalize any results to the school. However, the time during which a large
number of students hang out in the main mall is very brief, about five to seven minutes. We felt that
this would not be enough time to conduct an experiment with a decent-sized sample, so we ruled the main
mall out. We also considered the shopping mall. The mall provided a very large sample of people, and we
would have been able to generalize any results to the population of people who go to the mall. Not only
is this a larger population than the population of students at school, it also includes adults of all
ages, which would have allowed us to generalize our results to a broad range of ages rather than to
students between the ages of 14 and 18. Unfortunately, like the preceding options, the mall also had
its cons. We realized that the large and broad range of people could be dangerous[5]; something bad
could happen unexpectedly.
[4] The large hallway that interconnects the A wing and the B wing, starting from the cafeteria and ending at the C wing hallway.
After all these considerations we were left with the school cafeteria. The cafeteria offered a large
sample of students and, thanks to the hall monitors, they were forced to remain there. By conducting
our experiment in the cafeteria we were able to sample the students for about 40 minutes, which we
believed was an adequate amount of time to collect sufficient data.
Since we settled on collecting data in the school cafeteria, the time to conduct the experiment became
relatively easy to decide: it would be the time that we had lunch, from 1:53 to 2:28. However, this
meant that we would have to sacrifice our lunch to conduct the experiment.
Randomization
After deciding on a location we needed a way to randomize every step of the experiment in order to
reduce bias. We decided that our main source of “randomness” would be a random number generator. We
assigned the male the number zero and the female the number one; the male and female solicitors are
the two different treatments. We used the random number generator of the Texas Instrument 89
Titanium[6] to choose an integer at random, either zero or one. When the randint( ) function output a
zero we would ask the male solicitor to ask a person for a quarter. When the randint( ) function output
a one we would ask the female solicitor to ask a person for a quarter. Both solicitors would ask the
exact same question: “Do you have a quarter I can have?” The question was standardized in order to
reduce the effect of any response bias.
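The assignment rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the calculator procedure we used: the function name `assign_solicitor` and the seed are hypothetical, and Python's `random.randint` stands in for the TI-89 randint( ) function.

```python
import random

# Treatment codes as described in the text: 0 = male solicitor, 1 = female solicitor.
TREATMENTS = {0: "male solicitor", 1: "female solicitor"}

def assign_solicitor(rng: random.Random) -> str:
    """Pick the solicitor for the next respondent with a fair 0/1 draw."""
    draw = rng.randint(0, 1)  # inclusive on both ends, so the result is 0 or 1
    return TREATMENTS[draw]

# Hypothetical seed, used only so that the run is repeatable.
rng = random.Random(2024)
for respondent in range(1, 6):
    print(respondent, assign_solicitor(rng))
```

Because each draw is independent of who the next respondent is, neither we nor the solicitors can steer a particular person toward a particular treatment.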
[5] We were always told “never talk to strangers.”
[6] TI-89 Titanium Operating System v3.10 with the Statistics with List Editor App for the TI-89 Titanium is needed.
Some Considerations
When designing this experiment we took several considerations into account. First, we decided that any
volunteer solicitor would have to be an average person, neither well liked nor despised. We also
considered conducting a blinded experiment by not informing the solicitors why they were asking for a
quarter. Realistically speaking, however, if we had done that, no one would have volunteered to be a
solicitor for us. Therefore we decided to tell both solicitors that they were helping us in a
statistical experiment.
Another extremely important consideration was what we should do with the money. We decided that if the
random person agreed to give a quarter, we would tell them it was a statistical experiment and return
the quarter. Furthermore, when the solicitors went to ask for the quarter we would stay at a distance
in order not to influence a respondent’s decision.
Each time a person was chosen at random we would ask the solicitors if they knew the person at all. If
they did, we would skip that person and ask the next 12th person.
After considering these factors we proceeded to one of the most crucial parts of the experiment:
finding volunteers to be the “average” solicitor.
Volunteers
Finding the volunteers was a bit hard because not many people were willing to be a solicitor. Some
responses included: “Yeah I’m no hobo”, “Go ask yourself”, “I have work to do” (don’t we all…), and
“Maybe later”. After a while we found a sophomore male student in the library who would volunteer as
a solicitor. We were also able to find a female junior who agreed to help us. The male was African
American while the female was Caucasian. With our volunteers found, we were able to execute the
experiment.
Visual Representation of our experiment

[Diagram: female and male respondents are each assigned at RANDOM* to Treatment one (Male Solicitor) or Treatment two (Female Solicitor). The solicitor asks “Do you have a quarter I can have?” and the responses are used to make inferences.]
*To understand the randomization process please refer to the above section “Randomization”.
More Questions…
After planning out our experiment we were met with two other questions that we decided to explore as
well:
Is there a relation between solicitors asking across genders, such as male to male and male to
female, or female to female and female to male?
Is there a relation between how the solicitor asks the question and the response that they get?
To explore the first question we decided to further categorize the data that we had planned to
take in our initial experiment into male-yes, male-no, female-yes and female-no. Doing it this way, we
can explore the first question, and we can combine male-yes with female-yes and male-no with female-no
for a total yes and no to explore our initial question regarding gender equality.
To explore our second question, we decided to have one of the solicitors ask two different
questions: one that is biased and one that is not. In this case we will not consider the gender of the
random person and will focus our attention on the response of the person based on the question that
he/she is asked. We will use the data that we collect for the male solicitor as the non-biased
data[7]. Then we will ask the male solicitor to ask in a biased form afterwards and add that to our
data table for a non-biased vs. biased comparison.
The biased question will be: “Do you have a quarter I can have? I’m not paying you back.”
This question carries a negative bias, therefore we hypothesize that there may be a difference in
responses.
Summing up our proposed procedure, we have several relations:
Initial Question-Variables: See the relationship between gender overall to see who is more likely to
get the quarter. The independent variable is the treatment of a male or female solicitor. The dependent
variable is the response that is received.
Further Questions 1-Variables: See relationships across genders to see how each gender responds to
the question. The independent variable is the treatment of a male or female solicitor. The dependent
variable is the response that is received.
Further Questions 2-Variables: See if a biased question is less effective in soliciting a quarter. The
independent variable is the treatment of a biased or non-biased question. The dependent variable is the
response that is received.
[7] We do not have time to conduct a separate experiment; therefore we decided to reuse data that are independent of one another.
Experiment: PART 2
Execution
Day One
On our first day we gathered data on the question: “Do you have a quarter I can have?” The two
solicitors we had chosen would go up to a respondent and ask that question, many, many times. As the
experiment progressed we observed distinct patterns in the responses of the respondents.
As our solicitors cordially asked their question, it appeared that women were more likely to give a
definite answer. Unlike their male counterparts, women distinctly knew whether they had the change or
not, so they were quick to respond either no or yes. Men, on the contrary, often fumbled through
their pockets, checked their wallets, or patted their backpacks before giving a definite answer.
On rare occasions we also came upon people who offered lesser amounts of money, i.e. dimes and nickels.
Other behavioral characteristics included response methods. Of the men who agreed to give change,
some quickly found it, handed it over, and walked away. Others stopped to find it, handed it over, and
waited for a “thanks”. Interestingly enough, women who responded “yes” almost always stopped
to look through their purse, unlike the “drive by” giving of their male counterparts.
There were also distinct patterns among those who said no. Most commonly men would check their pockets
and give a solemn look of “sorry” before they walked away. Others blatantly walked away as if they
had not heard the question.
Some gender to gender interactions were also noticeable. When our female solicitor approached a
male respondent, rarely did he walk away without any response. Our male solicitor, by contrast,
had fairly equal response rates across genders. When our female solicitor approached a female
respondent we saw similar results; the female respondent was rarely unresponsive.
During day one, one of us worked the number generator while the other recorded the data. The
solicitors remained the same.
Day Two
On day two, we gathered data on the question: “Do you have a quarter I can have? I can’t pay you
back.” In contrast to the original question, we wondered how people would respond if they were
blatantly told this wasn’t a loan. Initially our solicitor asked the question awkwardly, as if it were
scripted, but eventually he grew comfortable with it. Both questions solicit the same item, change, so
in principle they should gather similar amounts; nevertheless we predicted that the “biased” question
would gather far less change than the original question.
Many respondents initially did not understand the question, or took time to think about it. Most seemed
to look at our solicitor awkwardly and slowly said, in an almost questioning manner, “No?” Some stalled
for time with a casual “uh” and, as we had predicted, there were fewer “Yes” responses. The behavioral
characteristics noted above applied to our biased question as they did before. At the conclusion of our
experiment, our solicitor seemed to feel exploited and demanded to be relieved of the job.
During day two, we switched roles; one of us worked the number generator while the other recorded the
data. The solicitors remained the same.
Day Two Continued – Organizing the Data

In organizing our data we have considered several options. However since our data is categorical rather
than quantitative we cannot use a histogram nor can we use a five number summary to accurately
describe our data.
Table 1
Question: “Do you have a quarter I can have?”
Rows: solicitor. Columns: gender and response of the person asked.
          Male Yes   Male No   Female Yes   Female No   Total
Male          8         28          7           20        63
Female        9         23         12           26        70
Total        17         51         19           46       133

Table 2
Question: “Do you have a quarter I can have?”
Rows: solicitor. Columns: response.
          Yes   No   Total
Male       15   48      63
Female     21   49      70
Total      36   97     133

Table 3
Question 1 Unbiased: “Do you have a quarter I can have?”
Question 2 Biased: “Do you have a quarter I can have? I’m not paying you back.”
              Yes   No   Total
Non-Biased     15   48      63
Biased          3   34      37
Total          18   82     100
[Bar graph, Table 1: Males vs. Females across gender. Categories: Male Yes, Male No, Female Yes, Female No. Series: Male, Female.]
[Bar graph, Table 2: Males vs. Females. Categories: Yes Response, No Response. Series: Males, Females.]
[Bar graph, Table 3: Biased vs. Non-Biased. Categories: Yes, No. Series: Non-Biased, Biased.]
Initially we expressed our data in bar graphs, but we decided to focus on the contingency tables
because they could further be used for the χ2 (chi-square) test. Furthermore, the contingency tables
allow us to look at the categorical data in numeric terms, while the bar graphs give a visual
representation without specific numbers.
Since our data are categorical, we had no way to find a mean of our data, and a mean would not provide
any benefit in any case. As stated above, a five number summary would not provide any insight either.
Lastly, as with the mean and five number summary, we decided that there was no need for a standard
deviation[8].
[8] However, we did use the Standard Error for our Z-tests.
Experiment: PART 3
Making Inferences
χ2 (Chi-Squared) Test of Independence
Population of interest: Students in the Mc Neil High School cafeteria.
Ho: The response to the question “Do you have a quarter I can have?” is independent of the gender of
the solicitor.
Ha: The response to the question “Do you have a quarter I can have?” is NOT independent of the gender
of the solicitor.
α = 0.05
Conditions:
Counted data condition:
The data must be in counts for the categories of a categorical variable.
Independence Assumption:
Randomization Condition: The individuals who have been counted and whose counts are
available for analysis should have been randomly selected.
Sample Size Assumption:
Expected Cell Frequency: We should expect to see at least 5 individuals in each cell.
Since:
The gathered data are in counts, as we counted the number of yes and no responses in our data table.
Each individual received a treatment that was randomized by the Texas Instrument 89 Titanium
randint( ) function, so it is reasonable to assume that the randomization condition is met. All
expected cell counts are at least 5 (we have verified this in the calculations), so the sample size is
large enough and the assumptions are met.
Then:
It is reasonable to proceed with the hypothesis test: χ2 (Chi-Squared) Test of Independence, with
degrees of freedom (Rows – 1) x (Columns – 1) = 1
Calculations:
[Figure: χ2 distribution curve with 1 degree of freedom]
General Inference for Independence

Since the P-value of 0.4224 is greater than α = 0.05, we fail to reject Ho. There
is not sufficient evidence to claim that the response to the question “Do
you have a quarter I can have?” is not independent of the gender of the solicitor.
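As a check on the calculation, the test can be reproduced from the Table 2 counts. The sketch below is not the calculator output; it is a minimal Python version that assumes no continuity correction and uses the fact that, with 1 degree of freedom, the upper-tail probability of χ2 equals erfc(sqrt(x / 2)). The function name `chi_square_2x2` is our own.

```python
import math

# Observed counts from Table 2: rows are the solicitor (male, female),
# columns are the response (yes, no).
observed = [[15, 48],
            [21, 49]]

def chi_square_2x2(table):
    """Return (statistic, expected counts, P-value) for a 2x2 table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, expected, p_value

statistic, expected, p_value = chi_square_2x2(observed)
print(f"chi-square = {statistic:.4f}, P-value = {p_value:.4f}")
```

The `expected` table also gives a direct way to confirm the expected cell frequency condition, since every entry can be compared against 5.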
Two Sample Z Hypothesis Test for Proportions
Population of Interest: Students in the Mc Neil High School cafeteria.
P1: The true proportion of Mc Neil High School students who answer “Yes” to the question “Do you
have a quarter I can have?” when asked by a female.
P2: The true proportion of Mc Neil High School students who answer “Yes” to the question “Do you
have a quarter I can have?” when asked by a male.
Ho: P1 – P2 = 0
Ha: P1 – P2 ≠ 0
α = 0.05
Conditions:
Randomization Condition: Participants must be randomly assigned to experimental treatment
groups.
Sample Size Condition: Each sample must be less than 10% of its respective population.
Success Failure Assumption:
There must be at least 10 successes and 10 failures in each sample for the sample size to be large
enough.
Since:
The individuals asked were randomly assigned to either the male or the female solicitor by the Texas
Instrument 89 Titanium randint( ) function, so it is reasonable to assume that the randomization
condition is met. It is reasonable to assume that the people asked make up less than 10% of
all students in the cafeteria. There are at least 10 successes and 10 failures for both proportions:
the number of successes for the male is 15 and the number of failures for the male is 48; the number
of successes for the female is 21 and the number of failures for the female is 49.
Then:
It is reasonable to proceed and use the normal model to conduct a Two Sample Z Hypothesis Test for
Proportions.
Calculations:
[Figure: standard normal curve]
General Inference for Difference in Gender
We fail to reject Ho. There is not sufficient evidence to claim that the true proportion of Mc Neil
High School students who answer “Yes” to the question “Do you have a quarter I can have?” when asked
by a female is different from the true proportion of Mc Neil High School students who answer “Yes”
when asked by a male.
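The test statistic can be recomputed from the counts stated in the conditions. The following is a minimal sketch, assuming the usual pooled standard error under Ho and a two-sided alternative; the function name `two_proportion_z_test` is our own. For a 2x2 table the square of this z statistic equals the χ2 statistic, so the P-value should agree with the 0.4224 found in the test of independence.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(yes1, n1, yes2, n2):
    """Two-sided z test of Ho: P1 - P2 = 0 using the pooled proportion."""
    p1_hat = yes1 / n1
    p2_hat = yes2 / n2
    pooled = (yes1 + yes2) / (n1 + n2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Counts from Table 2: female solicitor 21 yes of 70, male solicitor 15 yes of 63.
z, p_value = two_proportion_z_test(21, 70, 15, 63)
print(f"z = {z:.4f}, P-value = {p_value:.4f}")
```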
Type I Error:
We reject the null hypothesis when it is in fact true.
In context: We conclude that the proportion of students who say “Yes” differs between the male and
the female solicitor, suggesting inequality due to gender, when in fact there is no difference.
Type II Error:
We fail to reject the null hypothesis when it is in fact false.
In context: We conclude that there is no evidence of a difference between the proportions of students
who say “Yes” to the male and to the female solicitor, suggesting equality between genders, when in
fact there is a difference.
Power: (1 – β)
The power of a test is the probability that it correctly rejects a false null hypothesis. The distance
between the null hypothesis value, Po, and the truth, P, is the effect size. For a fixed sample size,
reducing the probability of a Type I error (α) increases the probability of a Type II error (β), and
vice versa. Increasing α decreases β, which increases power, but at the cost of more Type I errors.
The best way to increase power is to increase the sample size: a larger sample decreases the standard
error, which lowers β while α stays at the level we chose.
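The trade-off between α, β and sample size can be illustrated numerically. The sketch below uses a normal approximation for the power of the two-sided two-proportion z test. The true proportions 0.30 and 0.24 are hypothetical values chosen only for illustration, not estimates from our data, and the function name `approx_power` is our own.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(p1, p2, n1, n2, alpha=0.05):
    """Normal-approximation power of the two-sided two-proportion z test."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    # Standard error used to decide rejection (computed under Ho).
    se_null = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    # Actual spread of the observed difference when p1 and p2 are the truth.
    se_true = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    # Probability that the observed difference lands in either rejection tail.
    upper = 1 - std_normal.cdf((z_crit * se_null - diff) / se_true)
    lower = std_normal.cdf((-z_crit * se_null - diff) / se_true)
    return upper + lower

# Hypothetical true proportions (NOT estimates from the report): 0.30 vs. 0.24.
for n in (70, 300, 1000):
    print(n, round(approx_power(0.30, 0.24, n, n), 3))
```

When the two true proportions are set equal, the function returns α itself, which matches the definition of a Type I error rate; raising n with a fixed α raises the power.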
Two Sample Z Confidence Interval for Proportions

Population of Interest: Students in the Mc Neil High School cafeteria.
P1: The true proportion of Mc Neil High School students who answer “Yes” to the question “Do you
have a quarter I can have?” when asked by a female.
P2: The true proportion of Mc Neil High School students who answer “Yes” to the question “Do you
have a quarter I can have?” when asked by a male.
We will use a 95% confidence interval.
Conditions:
Randomization Condition: Participants must be randomly assigned to experimental treatment
groups.
Sample Size Condition: Each sample must be less than 10% of its respective population.
Success Failure Assumption:
There must be at least 10 successes and 10 failures in each sample for the sample size to be large enough.
Since:
The individuals asked were randomly assigned to either the male or the female solicitor by the Texas
Instrument 89 Titanium randint( ) function, so it is reasonable to assume that the randomization
condition is met. It is reasonable to assume that the people asked make up less than 10% of
all students in the cafeteria. There are at least 10 successes and 10 failures for both proportions:
the number of successes for the male is 15 and the number of failures for the male is 48; the number
of successes for the female is 21 and the number of failures for the female is 49.
Then:
It is reasonable to proceed with the Two Proportion Z Confidence Interval using the normal model
with 95% confidence.
Calculations:
[Figure: standard normal curve]
General Inference for Male vs. Female Soliciting

Based on these samples, we are 95% confident that the true difference between the proportion of Mc Neil
High School students who answer “Yes” to the question “Do you have a quarter I can have?” when asked by
a male and the proportion who answer “Yes” when asked by a female (male minus female) is between
-0.2121 and 0.0883.
If we were to sample randomly and independently from the two populations many, many times, the true
difference between the proportions of Mc Neil High School students who answer “Yes” to the question
“Do you have a quarter I can have?” when asked by a male and when asked by a female would be captured
by about 95 out of every 100 intervals constructed this way.
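The interval can be recomputed from the Table 2 counts. This is a minimal sketch, assuming the standard unpooled standard error for a difference of two proportions; the function name `two_proportion_interval` is our own. The reported endpoints of -0.2121 and 0.0883 correspond to taking the difference as male minus female, so the counts are passed in that order.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_interval(yes_a, n_a, yes_b, n_b, confidence=0.95):
    """Confidence interval for P_a - P_b using the unpooled standard error."""
    p_a = yes_a / n_a
    p_b = yes_b / n_b
    standard_error = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * standard_error
    return (p_a - p_b) - margin, (p_a - p_b) + margin

# Table 2 counts, taken as male minus female: 15 yes of 63 vs. 21 yes of 70.
low, high = two_proportion_interval(15, 63, 21, 70)
print(f"95% interval: ({low:.4f}, {high:.4f})")
```

Swapping the two groups in the call only flips the signs of the endpoints; either way the interval contains zero.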
What Does it ALL Mean?

From the above inferences we have observed several trends. First of all, the test of independence found
no evidence that the response to the question “Do you have a quarter I can have?” depends on the gender
of the solicitor. This suggests that the solicitor’s gender does not noticeably affect whether a
respondent agrees to give change. From the hypothesis test for proportions we found no significant
difference in the proportion of respondents who said “Yes” to our male and female solicitors. In
context, this means that we have no evidence that being a certain gender factors into receiving the
change; according to our data, males and females appear to have a similar opportunity to receive the
favorable answer: “Yes.” The confidence interval further supports this because it includes zero, which
means it is plausible that there is no difference between the responses to our male and female
solicitor. Based on these data, it is plausible that there is in fact gender equality, at least in the
cafeteria.
More Inferences
χ2 (Chi-Squared) Test of Homogeneity
Ho: The results of male and female solicitors asking the question “Do you have a quarter I can have?”
have the same distribution across the gender of the respondent.
Ha: The results of male and female solicitors asking the question “Do you have a quarter I can have?”
do NOT have the same distribution across the gender of the respondent.
α = 0.05
Conditions:
Counted data condition: The data must be in counts for the categories of a categorical variable.
Randomization Condition: The individuals who have been counted and whose counts are
available for analysis should have been randomly selected. The samples must be independent.
Expected Cell Frequency: We should expect to see at least 5 individuals in each cell.
Since:
The gathered data are in counts (we counted the number of yes and no responses from the random
person). The individuals asked were randomized by the Texas Instrument 89 Titanium randint( )
function, so it is reasonable to assume that the randomization condition is met. All expected
cell counts are at least 5, so the sample size is large enough and the assumption is met.
Then:
It is reasonable to proceed with the hypothesis test: χ2 (Chi-Squared) Test of Homogeneity, with
degrees of freedom (Rows – 1) x (Columns – 1) = 3
Calculations:
[Figure: χ2 distribution curve with 3 degrees of freedom]
General Inference of Gender To Gender Soliciting

We fail to reject Ho. There is not sufficient evidence to suggest that the results of male and female
solicitors asking the question “Do you have a quarter I can have?” across gender have different
distributions.
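The statistic for this test can be recomputed from the Table 1 counts. The sketch below is not the calculator output; the function names are our own, and the tail probability uses a closed form that holds only for exactly 3 degrees of freedom, which is the case here.

```python
import math

# Observed counts from Table 1: rows are the solicitor (male, female); columns
# are the respondent's gender and answer (male yes, male no, female yes, female no).
observed = [[8, 28, 7, 20],
            [9, 23, 12, 26]]

def chi_square_statistic(table):
    """Return the chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n  # expected count for this cell
            statistic += (obs - exp) ** 2 / exp
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof

def upper_tail_3_dof(x):
    """Upper-tail probability of a chi-square variable with exactly 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

statistic, dof = chi_square_statistic(observed)
p_value = upper_tail_3_dof(statistic)
print(f"chi-square = {statistic:.4f}, df = {dof}, P-value = {p_value:.4f}")
```

The resulting P-value is well above 0.05, which is consistent with the decision to fail to reject Ho.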
What Does it ALL Mean?

From the above test we have concluded that across genders (a male asking men and women, and a
female asking men and women) there is not enough evidence to suggest that the response distributions
among the different categories are different. In context this means that our male and female
solicitors received similar proportions of responses in each of the categories, from men and women
respectively. So based on these results, it does not appear to matter whom a solicitor asks; he/she
would probably get similar responses.
Response Bias
χ2 (Chi-Squared) Test of Homogeneity
Ho: The results of a solicitor asking the question: “Do you have a quarter I can have?” or asking: “Do
you have a quarter I can have? I can’t pay you back.” have the same distribution.
Ha: The results of a solicitor asking the question: “Do you have a quarter I can have?” or asking: “Do
you have a quarter I can have? I can’t pay you back.” do not have the same distribution.
α = 0.05
Conditions:
Counted data condition: The data must be in counts for the categories of a categorical variable.
Randomization Condition: The individuals who have been counted and whose counts are
available for analysis should have been randomly selected. The samples must be independent.
Expected Cell Frequency: We should expect to see at least 5 individuals in each cell.
Since:
The gathered data are in counts (we counted the number of yes and no responses from the random
person). The individuals asked were randomized by the Texas Instrument 89 Titanium randint( )
function, so it is reasonable to assume that the randomization condition is met. All expected
cell counts are at least 5, so the sample size is large enough and the assumption is met.
Then:
It is reasonable to proceed with the hypothesis test: χ2 (Chi-Squared) Test of Homogeneity, with
degrees of freedom (Rows – 1) x (Columns – 1) = 1
Calculations:
[Figure: χ2 distribution curve with 1 degree of freedom]
General Inference for Response Bias

Since the P-value of 0.0484 is less than α = 0.05, we reject Ho. There is sufficient evidence to claim
that the results of a solicitor asking the question “Do you have a quarter I can have?” and asking “Do
you have a quarter I can have? I can’t pay you back.” do not have the same distribution.
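The arithmetic behind this test can be worked by hand from the Table 3 counts. The following is a reconstruction under the usual formulas (expected count = row total times column total divided by the grand total, no continuity correction), written here as a LaTeX math block:

```latex
\begin{align*}
E_{\text{non-biased, yes}} &= \frac{63 \cdot 18}{100} = 11.34, &
E_{\text{non-biased, no}}  &= \frac{63 \cdot 82}{100} = 51.66, \\
E_{\text{biased, yes}}     &= \frac{37 \cdot 18}{100} = 6.66, &
E_{\text{biased, no}}      &= \frac{37 \cdot 82}{100} = 30.34,
\end{align*}
\begin{align*}
\chi^2 = \sum \frac{(O - E)^2}{E}
  &= \frac{(15 - 11.34)^2}{11.34} + \frac{(48 - 51.66)^2}{51.66}
   + \frac{(3 - 6.66)^2}{6.66} + \frac{(34 - 30.34)^2}{30.34} \\
  &\approx 1.181 + 0.259 + 2.011 + 0.442 \approx 3.89
\end{align*}
```

All four expected counts are at least 5, and with 1 degree of freedom a statistic of about 3.89 lies just above the 0.05 critical value of 3.841, which is consistent with a P-value slightly below 0.05.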
Experiment: PART 4
Conclusion and Error Report
Based on our experiment we found that there is considerable gender equality within the cafeteria. For
the most part the male and female solicitors would have received the same number of quarters if they
had decided to keep them during the experiment. Contrary to our initial belief that the female would
receive more money, we discovered that the gender difference had no profound effect on the money
received. In general we believe that it is not worth it to ask random people for money, as it does not
yield a substantial profit [9]. The P-values for our tests were generally higher than any reasonable
alpha, with the exception of the response bias test. Therefore, apart from that test, our results were
not statistically significant.
Going back to our initial question (in case it has been forgotten, it is: Does the gender of a solicitor
affect the frequency with which a random person would give him/her a requested amount of money in the
school cafeteria?), we feel that there is no difference in how often a solicitor receives money
depending on gender. In short, we believe that our inference tests, based on the data that we
have collected, suggest that it does not matter what gender an individual is; he/she would probably be
faced with the same proportion of yes and no responses from any random person he/she should happen to
choose.
When conducting this experiment, several factors contributed to a list of potential errors that probably
occurred. First of all, we felt that we were too restricted in the cafeteria. We initially wanted to find
out how equal males and females were; however, we had to settle for the males and females in the
cafeteria and how equal they were.
There were also a number of discrepancies that occurred:

• Towards the end of the lunch period our solicitor began to ask for quarters half-heartedly, which
pretty much translated as “I don’t need a quarter” to the random person.
• Some of the people we asked were sitting in groups and were influenced by their peers. At first,
when the solicitor asked for a quarter, the random person responded no. However, sometimes
their peers commented on how they should give (rarely though!) and they ended up giving.
• At times some of the random people were too preoccupied with eating or talking and totally ignored
the solicitor.
• Some of the people that we told our solicitors to ask actually spotted us from a distance (they
actually saw us writing stuff down). Though we cannot give direct proof of the bias that this
caused, we can assume it had an effect because almost immediately upon seeing us
they shook their heads.
• Our solicitors tried to cheat us at times and asked their friends for a quarter. We realized
something was wrong and quickly asked the person if he/she knew the solicitor. If they said yes,
we would cross out the data collected from that trial.
• Our solicitors also tried to cheat us by pretending to ask someone. At times, they felt
embarrassed to ask some people, whether due to awkwardness or intimidation.
Also, they would pass off their own quarters as quarters they had received.
• We cannot guarantee that the random number generator on the TI-89 Titanium is perfectly
random, as it follows an algorithm to generate its random numbers. Such
algorithms contradict the definition of randomness, since algorithms follow a fixed procedure.
For our purposes [10] we believe that the TI-89 Titanium is sufficient.
• Due to time constraints we were forced to reuse data. Although we believe that the data we
used are independent of one another across the different experiments, we would need to
formally test that in order to make sure. So there is a possibility that the data we reused are
related across the experiments regarding gender bias and response bias.
• The week in which we conducted the experiment was AP testing week, so some people were
gone during the lunch period. Therefore some of the population was missing.

[9] Do we see another experiment coming up!?
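The concern about algorithmic randomness raised in the list above can be made concrete. The sketch below is a toy linear congruential generator in Python. It is an assumption-laden illustration of how a pseudo-random generator works in general, not the algorithm used by the TI-89 Titanium, which we do not know. The class and method names are hypothetical.

```python
class LinearCongruentialGenerator:
    """Toy pseudo-random generator: state -> (a * state + c) mod m.

    The constants are one commonly published choice, used only as an
    illustration. This is NOT the TI-89 Titanium's algorithm.
    """

    MODULUS = 2**32

    def __init__(self, seed):
        self.state = seed % self.MODULUS

    def next_int(self, low, high):
        """Return an integer in [low, high], similar in spirit to randint()."""
        self.state = (1664525 * self.state + 1013904223) % self.MODULUS
        span = high - low + 1
        # Scale by the high-order part of the state, which behaves better
        # than the low-order bits for this kind of generator.
        return low + (self.state * span) // self.MODULUS


first = LinearCongruentialGenerator(seed=2024)
second = LinearCongruentialGenerator(seed=2024)
draws_a = [first.next_int(0, 1) for _ in range(10)]
draws_b = [second.next_int(0, 1) for _ in range(10)]
print(draws_a)
print(draws_a == draws_b)  # True: the same seed always gives the same sequence
```

Because the sequence is fully determined by the seed, it is not random in the strict sense, yet it can still be adequate for picking people to ask.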
However, the BIGGEST problem for us was that the experiment did not go as we had anticipated.
We expected more random people to respond yes to our solicitor; however, we discovered that the
data were very one-sided. There just were not enough people saying yes, so we had to compensate
by taking a larger sample until we felt that there was an adequate number of successes and failures.
Overall, we felt that this experiment went relatively smoothly. We were able to collect our data, and the
solicitors cooperated somewhat. However, like most experiments, the reality of the process was far
different from the theory of our planning. All in all we were able to conduct an investigation and, as a
result, were able to find a direction for our question. Though we have not fully shown that the
genders are equal, our original hypothesis that the female would receive a more favorable response was
discredited.
[10] Pseudo-random; this is probably good enough for us.
APPENDIX
Normal Distribution (Gaussian Distribution)
The Normal Distribution, also known as the Gaussian Distribution, is a probability distribution that
describes quantitative data clustered around an average value with deviations. The function is bell
shaped, with its peak at the mean, and is known as the bell curve. The distribution is named after Carl
Friedrich Gauss, who used it for the analysis of astronomical data. Its formula is defined by the
probability density function. In order to use the normal model, a distribution must be unimodal and
roughly symmetric. The empirical rule gives the area under the curve within intervals of 1, 2, and 3
standard deviations from the mean. The first interval includes approximately 68% of the distribution,
the second interval includes 95% of the distribution, and the final interval includes 99.7% of the
distribution. The PDF function for the Normal Distribution, with mean μ and standard deviation σ, is:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
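The empirical-rule percentages can be checked numerically. The following is a minimal Python sketch using only the standard library; the function names are our own, and the area formula relies on the standard identity between the normal model and the error function.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal model with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def area_within(k):
    """Area under any normal curve within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(k, round(100 * area_within(k), 1))  # 68.3, 95.4, 99.7 (percent)
```

The rounded values 68, 95 and 99.7 are the ones usually quoted for the rule.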
Chi-Square Distribution
Given that the assumptions are met, the Chi-Square Distribution is used in statistical significance tests.
Through this method the test statistics can be shown to have distributions that are approximated by a
“heavy tailed” Chi-Square distribution, given the null hypothesis is true. Common Chi-Square tests
include the Chi-Square goodness of fit, homogeneity, and independence. Chi-Square tests are
conducted to make inferences on counts across several different categories. The Chi-Square test will
always be a one-sided test. There will not be a two-sided test as with the Normal Distribution and the T
Distribution. The PDF function for the Chi-Square Distribution with k degrees of freedom is:
$$f(x; k) = \frac{x^{k/2 - 1}\, e^{-x/2}}{2^{k/2}\, \Gamma(k/2)}, \quad x > 0$$
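As a hedged illustration, the standard Chi-Square density with k degrees of freedom, x^(k/2 - 1) e^(-x/2) / (2^(k/2) Γ(k/2)) for x > 0, can be written in a few lines of Python with the standard library's `math.gamma`. The function name `chi_square_pdf` is our own.

```python
import math


def chi_square_pdf(x, df):
    """Chi-Square density with `df` degrees of freedom; zero for x <= 0."""
    if x <= 0:
        return 0.0
    half = df / 2
    return x ** (half - 1) * math.exp(-x / 2) / (2 ** half * math.gamma(half))


# The curve is skewed to the right: the density falls off slowly for large x.
for x in (0.5, 1, 2, 4, 8):
    print(x, round(chi_square_pdf(x, df=1), 4))
```

Since all of the density sits on positive x and the test statistic only grows as the data depart from Ho, the P-value is always the area in the right tail.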
General Conditions:
Gamma Function
Though not explicitly used in our calculations, the Gamma Function plays an important role in many
distributions such as the χ² Distribution and the Student’s t Distribution. The gamma function is denoted
by the capital Greek letter Γ. The formal definition of the gamma function (for z > 0) is:
$$\Gamma(z) = \int_0^{\infty} t^{z-1} e^{-t}\, dt$$
The Gamma function extends the notion of the factorial to all real numbers excluding the negative
integers. Basically
$$x! = \Gamma(x + 1)$$
for all values of x excluding the negative integers. Below is a plot of the gamma function.
[Figure: plot of the gamma function]
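The link between the gamma function and the factorial can be tried out with Python's standard library, as in the minimal sketch below. The helper name `factorial_via_gamma` is our own; `math.gamma` and `math.factorial` are standard functions.

```python
import math


def factorial_via_gamma(x):
    """x! computed as Gamma(x + 1); also works for non-integer x."""
    return math.gamma(x + 1)


# For whole numbers the two agree.
for n in range(6):
    print(n, math.factorial(n), factorial_via_gamma(n))

# A non-integer "factorial": (1/2)! = Gamma(3/2) = sqrt(pi) / 2.
print(factorial_via_gamma(0.5), math.sqrt(math.pi) / 2)

# Negative integers are excluded: Gamma has no value there.
try:
    factorial_via_gamma(-1)
except ValueError as error:
    print("undefined:", error)
```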
Error Function
The error function is defined as:
$$\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^{x} e^{-t^2}\, dt$$
The error function has many uses in Statistics and is also instrumental in calculations involving the
normal model. The derivative of the error function is a Gaussian curve, and a rescaled error function
gives the cumulative area under the normal model, whose derivative is the normal probability density
function. Below is a plot of the error function.
[Figure: plot of the error function]
Degrees of Freedom
By definition, the number of degrees of freedom is the number of values in the final calculation of a
statistic that are “free to vary.” The degrees of freedom for an estimate equal the number of
independent pieces of data that go into the estimate minus the number of parameters estimated.
Several statistical distributions, such as the Student’s t and the Chi-Squared Distributions, use
degrees of freedom as a parameter. The degrees of freedom (df) emerge from the residual sum of squares.
Although the term is used across the different distributions, the degrees of freedom are often
calculated by different methods and may have no relation to one another.
PDFs
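The probability density function is a function that gives the probability density at a given x-value; for
a continuous random variable, probabilities correspond to areas under this curve rather than to single
x-values. To find the probability that a random variable would fall in a given interval, one would take
the integral of the probability density function over that interval. By definition the probability of a
random variable falling within a given interval (take the interval [a, b] for instance) is equal to the
sum of the probabilities of all the small slices of width Δx within the interval [a, b], in the limit as
the slices become infinitely thin:
$$P(a \le X \le b) = \lim_{\Delta x \to 0} \big[ f(a)\,\Delta x + f(a+\Delta x)\,\Delta x + f(a+2\Delta x)\,\Delta x + \dots + f(b-\Delta x)\,\Delta x \big] = \int_a^b f(x)\, dx$$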
Afterword
We live in a world of bias.
From the beginning of mankind to the present day, the concept of gender equality has been just that, a
concept. Although we must credit the progress we have achieved, a question arises: have we really
progressed as far as we may believe?
From zero to sixty in 3.5 seconds, we drive our cars while others zoom by, but eventually we face the red
light. We look to one side and we see a Mercedes Benz, but we look to the other side and we see a male
solicitor. Then we wonder, why do they panhandle? How do they survive? How much do they make?
Green light; time to coast down the street; red light already? We look to one side and we see a Bentley,
but we look to the other side and we see a homeless woman. Tough life. Hmm… it seems reasonable to
assume that she makes the same as her male counterpart. Doesn’t it?
Although our experiment suggested that high school students may give money evenly, without regard
to gender, can we make a judgment on the nature of gender influence in other settings? In short, no,
because we have to consider the confounding factors that may exist in the decision-making process.
First, our experiment was done on high school students, who fall within a narrow age range. The age of
a person could be a significant factor in how they respond to gender-influenced questions, partly
because the brain develops during the teenage years and matures during adulthood, causing us to
think in different ways. Furthermore, our data are based on students cordially asking for money, not
asking out of a need to survive as the panhandlers on the road may.
Were we to do this experiment again, we could better investigate gender influence and biased
questioning. During this experiment we were short on time for gathering data. Furthermore,
we were restricted to the cafeteria of McNeil High School. Since our data were collected within a
narrow margin, we cannot generalize past what we sampled. Our generalizations are constrained to the
population of McNeil High School students rather than students in general or around the country. Better
places to conduct our experiment may have been public parks or the mall, because they include
a better representation of the population at large. Due to time constraints and potential liability issues
we stayed away from those places.
Now that we have summarized our results from this experiment, we look to a broader question. We
wonder how people would react if they were faced with a person with a minor injury. How would men
react compared to women? Does gender have a place in this question? It’s too bad we have run out of
time in this edition, but maybe we will investigate this new issue in a future edition.
