Math workings

- Answers for First Assignment

ASSIGNMENT 1 - ANSWERS

BRUNA KHARYN DE ASSUNCAO BARBOSA

STUDENT NUMBER: 43282812

1.

a) Raw mean:

$$m = \frac{1}{n}\sum_{i=1}^{n} x_i$$

Raw standard deviation:

$$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - m\right)^2} = 0.49$$
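As a quick numerical check of the raw statistics, here is a minimal sketch. The sample values are an assumption, since they are not listed in the text: three samples of value 1 and two of value 2 were chosen because they reproduce the reported standard deviation of 0.49 when the sum is divided by n (the population form).

```python
import math

# Assumed sample values (not given in the text): three samples of 1, two of 2.
values = [1.0, 1.0, 1.0, 2.0, 2.0]

n = len(values)
raw_mean = sum(values) / n

# Population form of the variance: divide by n, not n - 1.
raw_var = sum((x - raw_mean) ** 2 for x in values) / n
raw_std = math.sqrt(raw_var)

print(round(raw_mean, 2), round(raw_std, 2))
```

Dividing by n - 1 instead would give about 0.55 for the same values, so the reported 0.49 suggests the population form was used.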

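b) Given that 3 points are in cluster 1 and 2 are in cluster 2, we have:

$$w_1 = \frac{1}{3} \quad \text{and} \quad w_2 = \frac{1}{2}$$

So for the declustered mean:

$$m_w = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$$

For the declustered standard deviation:

$$\sigma_w = \sqrt{\frac{\sum_{i=1}^{n} w_i \left(x_i - m_w\right)^2}{\sum_{i=1}^{n} w_i}}$$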

c)

For the raw mean:

$$m = \frac{1 + 2}{2} = 1.5$$

And for the standard deviation:

$$\sigma = \sqrt{\frac{(1 - 1.5)^2 + (2 - 1.5)^2}{2}} = 0.5$$

We can see that the raw mean and raw standard deviation of the samples in part c are the same as the declustered mean and declustered standard deviation in part b. That happens because declustering the five samples in part b amounts to weighting each sample so that the clustering in their spatial distribution no longer biases the statistics. In other words, the five initial samples are mathematically redistributed in space, and the new spatial distribution is equivalent to a regular, non-clustered distribution of two samples with values X=1 and X=2.
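The equivalence described above can be checked numerically. The sketch below is illustrative only: the sample values (three samples of 1 in cluster 1, two samples of 2 in cluster 2) and the helper name `weighted_stats` are assumptions, and each sample is weighted by the inverse of the number of points in its cluster.

```python
import math

def weighted_stats(values, weights):
    """Return the weighted mean and weighted (population) standard deviation."""
    total = sum(weights)
    mean = sum(w * x for w, x in zip(weights, values)) / total
    var = sum(w * (x - mean) ** 2 for w, x in zip(weights, values)) / total
    return mean, math.sqrt(var)

# Assumed data: cluster 1 holds three samples of value 1, cluster 2 two of value 2.
values = [1.0, 1.0, 1.0, 2.0, 2.0]
# Inverse-count weights: 1/3 for each sample in cluster 1, 1/2 in cluster 2.
weights = [1 / 3, 1 / 3, 1 / 3, 1 / 2, 1 / 2]

declustered = weighted_stats(values, weights)

# Two equally weighted samples with values 1 and 2, as in part c.
two_samples = weighted_stats([1.0, 2.0], [1.0, 1.0])

print(declustered, two_samples)
```

Each cluster ends up with a total weight of 1, which is why the five weighted samples behave like two equally weighted ones. Swapping which cluster holds which value gives the same declustered statistics.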

2.

a)

Checking that [equation] with [equation]:

Since [equation]

We have that [equation]

Knowing that: [equation]

And [equation]

Which gives [equation]

So, we have: [equation]

Checking that [equation], given that [equation]

Given that the variables are uncorrelated, that means that: [equation], then:

b)

[equation]

Since [equation] and [equation], the equation above results in:

[equation]

Now, the correlation coefficient is given by:

[equation]
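For reference, the standard definition of the correlation coefficient, the covariance divided by the product of the two standard deviations, can be sketched numerically. The function name `correlation` and the data below are hypothetical and are not taken from the assignment.

```python
import math

def correlation(xs, ys):
    """Correlation coefficient: Cov(X, Y) / (sigma_X * sigma_Y), population form."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    return cov / (sx * sy)

# Hypothetical data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]    # exactly linear in xs
zs = [1.0, -1.0, -1.0, 1.0]  # zero covariance with xs

rho_linear = correlation(xs, ys)
rho_uncorrelated = correlation(xs, zs)
print(rho_linear, rho_uncorrelated)
```

An exactly linear pair gives a coefficient of 1, while a pair with zero covariance gives 0, which is the uncorrelated case used in part a.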

3. a)

The basemap only shows the spatial distribution of the samples in plan view, so we need to check the scatterplots to answer the question.

The scatterplots show that the clustered regions, in both the northing and easting directions, tend to have higher values than most of the other samples.

b)

The histogram label shows mean value = 436.35011 and standard deviation = 299.91942.

c) Just by eyeballing the basemap, we could say that the typical spacing between samples in the non-clustered regions is between 20 and 25. The tool's graphical parameters give the maximum and minimum coordinates, so we take the range (maximum - minimum) and divide it by the number of samples along an arbitrary line in the corresponding axis direction, obtaining 19.9. The distance chosen to generate the weights is therefore 20.

The histogram for the declustered samples is shown below; the weighted mean and standard deviation are, respectively, 279.68004 and 251.43704.
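To illustrate how a cell size of 20 can turn into weights, here is a minimal sketch of cell declustering. It assumes a fixed grid origin at (0, 0) and inverse-count weights per cell. The coordinates, values and function names are hypothetical, and the software used for this assignment may implement the method differently (for example by averaging over several grid origins).

```python
import math
from collections import Counter

def cell_decluster_weights(xs, ys, cell_size):
    """Weight each sample by 1 / (samples in its cell), normalized to sum to 1."""
    cells = [(math.floor(x / cell_size), math.floor(y / cell_size))
             for x, y in zip(xs, ys)]
    counts = Counter(cells)
    raw = [1.0 / counts[c] for c in cells]
    total = sum(raw)
    return [w / total for w in raw]

def weighted_mean_std(values, weights):
    """Weighted mean and weighted (population) standard deviation."""
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    var = sum(w * (v - mean) ** 2 for w, v in zip(weights, values)) / total
    return mean, math.sqrt(var)

# Hypothetical samples: three clustered high values and two isolated low values.
xs = [5.0, 6.0, 7.0, 30.0, 50.0]
ys = [5.0, 6.0, 4.0, 30.0, 10.0]
vals = [900.0, 800.0, 1000.0, 200.0, 300.0]

weights = cell_decluster_weights(xs, ys, cell_size=20.0)
raw_mean = sum(vals) / len(vals)
decl_mean, decl_std = weighted_mean_std(vals, weights)
print(raw_mean, decl_mean, decl_std)
```

In this toy example the three clustered samples share one cell, so their high values count less and the declustered mean falls below the raw mean, the same direction of change seen between the raw and weighted histograms.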

d)

Given that the histogram is an estimate of the probability distribution of a continuous variable, and clustered samples do not have a continuous distribution in space, it is expected that the histogram of the clustered samples does not show a proper bell curve. With the weighted samples, however, we have the correct perspective: the samples now have a continuous spatial distribution and can be treated as continuous random variables, generating a more realistic and correct bell curve.

