
Statistical Analysis of Salary Data

In one of the earlier "recipes" I briefly described the statistical functions in Excel. My knowledge of statistics is quite modest (one of these days I will probably
take some additional lessons to improve it). However, I would like to dedicate one article to this area. Using a practical case, the analysis of wages
in a small company, I'll show you how you can use these functions to reach some interesting conclusions.

For starters, you need to create a table with a list of employees and the amount of each salary in euros (€). How do we calculate the average wage? Simply by using the
AVERAGE function. For the given data, enter the formula:
=AVERAGE(C2:C10)

As a result, we get the amount of €622.22. Similarly, by writing a formula in which we replace AVERAGE with MIN or MAX, we calculate the minimum
and maximum salary.
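For readers who want to check the same figures outside Excel, here is a minimal Python sketch. The salary values are hypothetical (they are not the article's table), so the printed results will not match the €622.22 quoted above.

```python
from statistics import mean

# Hypothetical monthly salaries in euros for nine employees
# (illustrative values only, not the article's actual table).
salaries = [300, 350, 400, 450, 500, 600, 800, 1000, 1500]

average_salary = mean(salaries)  # counterpart of =AVERAGE(C2:C10)
lowest_salary = min(salaries)    # counterpart of =MIN(C2:C10)
highest_salary = max(salaries)   # counterpart of =MAX(C2:C10)

print(f"average: {average_salary:.2f}")
print(f"minimum: {lowest_salary}")
print(f"maximum: {highest_salary}")
```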

The average salary is useful information, but since there is a large gap between the maximum and minimum salary, it is a bit like the joke in which "some eat
cabbage, some eat meat, but on average we all eat cabbage rolls." We also need the middle salary, which we get by using the median: the value that divides
a statistical data set into two equal parts. Let's create a formula.

=MEDIAN(C2:C10)

The result is €450. For further analysis, we can use the function that calculates a percentile. A percentile is a statistical measure that shows the value below which
a given percentage of observations in a group of observations falls. For example, if you enter the formula:

=PERCENTILE(C2:C10,0.75)

the result is the salary below which 75% of employees fall. That is, 25% of employees have that salary or a higher one. Next, we can "play" with
percentiles to answer all the questions we're interested in, or we can use a box-and-whisker chart for further analysis. Newer versions of Excel have two
functions with the same syntax, PERCENTILE.INC and PERCENTILE.EXC, which differ in whether the percentile range includes or excludes 0 and 1.
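The median and the 75th percentile can be sketched in Python as follows. The salaries are hypothetical, and I am assuming that the "inclusive" and "exclusive" methods of `statistics.quantiles` play the same roles as PERCENTILE.INC and PERCENTILE.EXC; treat that mapping as an assumption rather than a guarantee.

```python
from statistics import median, quantiles

# Hypothetical monthly salaries in euros (not the article's table).
salaries = [300, 350, 400, 450, 500, 600, 800, 1000, 1500]

# Middle value of the sorted data, like =MEDIAN(C2:C10).
middle_salary = median(salaries)

# quantiles(..., n=4) returns the three quartile cut points;
# the last one is the 75th percentile.
p75_inclusive = quantiles(salaries, n=4, method="inclusive")[2]
p75_exclusive = quantiles(salaries, n=4, method="exclusive")[2]

print(f"median: {middle_salary}")
print(f"75th percentile (inclusive): {p75_inclusive}")
print(f"75th percentile (exclusive): {p75_exclusive}")
```

On a small data set the two methods can give noticeably different values, which is worth keeping in mind when comparing results between tools.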

Furthermore, it is convenient to measure how much wages vary in relation to the average. For this purpose, we calculate the standard deviation and
variance with the help of the functions STDEV and VAR. Both of these functions have two versions.
The first ends with P (STDEV.P, VAR.P) and is used when we know all the elements of a data set, as in our case. If we had data about nine randomly selected
workers, a sample, we would use the functions ending with S (STDEV.S, VAR.S).
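The difference between the population (P) and sample (S) versions can be illustrated with Python's `statistics` module, which offers both forms. The salaries below are hypothetical; the point is only that the sample versions divide by n - 1 instead of n and therefore come out slightly larger.

```python
from statistics import pstdev, pvariance, stdev, variance

# Hypothetical monthly salaries in euros (not the article's table).
salaries = [300, 350, 400, 450, 500, 600, 800, 1000, 1500]

# Population versions: we know every employee (like STDEV.P and VAR.P).
population_sd = pstdev(salaries)
population_var = pvariance(salaries)

# Sample versions: the nine values are a sample (like STDEV.S and VAR.S).
sample_sd = stdev(salaries)
sample_var = variance(salaries)

print(f"population: sd={population_sd:.2f}, var={population_var:.2f}")
print(f"sample:     sd={sample_sd:.2f}, var={sample_var:.2f}")
```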
Data Analysis and the Analysis ToolPak
Use the Analysis ToolPak to perform complex data analysis
If you need to develop complex statistical or engineering analyses, you can save steps and time by using the Analysis ToolPak. You provide the data and parameters
for each analysis, and the tool uses the appropriate statistical or engineering macro functions to calculate and display the results in an output table. Some tools
generate charts in addition to output tables.

The data analysis functions can be used on only one worksheet at a time. When you perform data analysis on grouped worksheets, results will appear on the first
worksheet and empty formatted tables will appear on the remaining worksheets. To perform data analysis on the remainder of the worksheets, recalculate the
analysis tool for each worksheet.

The Analysis ToolPak includes the tools described in the following sections. To access these tools, click Data Analysis in the Analysis group on the Data tab. If the
Data Analysis command is not available, you need to load the Analysis ToolPak add-in program.
Anova

The Anova analysis tools provide different types of variance analysis. The tool that you should use depends on the number of factors and
the number of samples that you have from the populations that you want to test.

Anova: Single Factor

This tool performs a simple analysis of variance on data for two or more samples. The analysis provides a test of the hypothesis that each
sample is drawn from the same underlying probability distribution against the alternative hypothesis that the underlying probability
distributions are not the same for all the samples. If there are only two samples, you can
use the worksheet function T.TEST. With more than two samples, there is no convenient generalisation of T.TEST, and the Single Factor
Anova model can be called upon instead.
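To make the idea concrete, here is a minimal sketch of the F statistic behind a single-factor analysis of variance, written in plain Python. The function name `one_way_anova_f` and the three samples are hypothetical. The sketch stops at the F statistic and its degrees of freedom; turning F into a p-value needs the F distribution, which is left out here.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical samples, for illustration only.
samples = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_between, df_within = one_way_anova_f(samples)
print(f"F = {f_stat:.2f} with ({df_between}, {df_within}) degrees of freedom")
```

A large F suggests that the group means differ by more than the scatter within the groups would explain.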

Anova: Two-Factor with Replication

This analysis tool is useful when data can be classified along two different dimensions. For example, in an experiment to measure the
height of plants, the plants may be given different brands of fertiliser (for example, A, B, C) and might also be kept at different temperatures
(for example, low, high). For each of the six possible pairs of {fertiliser, temperature}, we have an equal number of observations of plant
height. Using this Anova tool, we can test:

> Whether the heights of plants for the different fertiliser brands are drawn from the same underlying population. Temperatures are
ignored for this analysis.

> Whether the heights of plants for the different temperature levels are drawn from the same underlying population. Fertiliser brands are
ignored for this analysis.

> Whether, having accounted for the effects of differences between fertiliser brands found in the first bulleted point and differences in
temperatures found in the second bulleted point, the six samples representing all pairs of {fertiliser, temperature} values are drawn
from the same population. The alternative hypothesis is that there are effects due to specific {fertiliser, temperature} pairs over and
above the differences that are based on fertiliser alone or on temperature alone.

Anova: Two-Factor Without Replication

This analysis tool is useful when data is classified on two different dimensions, as in the Two-Factor case with Replication. However,
for this tool it is assumed that there is only a single observation for each pair (for example, each {fertiliser, temperature} pair in the
preceding example).
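The calculation for the case without replication can be sketched in a few lines of Python. The function name and the plant heights are hypothetical (rows stand for fertiliser brands A, B, C and columns for low and high temperature). This follows the usual textbook decomposition of sums of squares and is not necessarily how the ToolPak computes its table.

```python
from statistics import mean


def two_factor_anova_without_replication(table):
    """Return F statistics for the row factor and the column factor.

    `table` is a list of rows holding one observation per (row, column) pair.
    """
    n_rows, n_cols = len(table), len(table[0])
    grand_mean = mean([x for row in table for x in row])
    row_means = [mean(row) for row in table]
    col_means = [mean(col) for col in zip(*table)]

    ss_rows = n_cols * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n_rows * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in table for x in row)
    ss_error = ss_total - ss_rows - ss_cols  # what the two factors leave unexplained

    df_rows, df_cols = n_rows - 1, n_cols - 1
    ms_error = ss_error / (df_rows * df_cols)
    return (ss_rows / df_rows) / ms_error, (ss_cols / df_cols) / ms_error


# Hypothetical plant heights: rows = fertiliser A, B, C; columns = low, high.
heights = [
    [10, 13],
    [12, 14],
    [17, 18],
]
f_fertiliser, f_temperature = two_factor_anova_without_replication(heights)
print(f"F (fertiliser)  = {f_fertiliser:.1f}")
print(f"F (temperature) = {f_temperature:.1f}")
```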
Correlation
The CORREL and PEARSON worksheet functions both calculate the correlation coefficient between two measurement variables when
measurements on each variable are observed for each of N subjects. (Any missing observation for any subject causes that subject to be ignored
in the analysis.) The Correlation analysis tool is particularly useful when there are more than two measurement variables for each of N
subjects. It provides an output table, a correlation matrix, that shows the value of CORREL (or PEARSON) applied to each possible pair
of measurement variables.
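A correlation matrix of this kind can be sketched in plain Python as follows. The helper names `pearson` and `correlation_matrix`, and the three variables with their values, are hypothetical and serve only to show the shape of the output.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def correlation_matrix(variables):
    """Apply pearson() to every pair of named variables."""
    names = list(variables)
    return {a: {b: pearson(variables[a], variables[b]) for b in names} for a in names}


# Hypothetical measurements for five employees.
variables = {
    "salary": [300, 400, 500, 600, 700],
    "years_of_service": [1, 3, 2, 5, 4],
    "absence_days": [10, 8, 6, 4, 2],
}
matrix = correlation_matrix(variables)
for name, row in matrix.items():
    print(name, [round(value, 2) for value in row.values()])
```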

Covariance

The Correlation and Covariance tools can both be used in the same setting, when you have N different measurement variables observed
on a set of individuals. The Correlation and Covariance tools each give an output table, a matrix, that shows the correlation coefficient
or covariance, respectively, between each pair of measurement variables. The difference is that correlation coefficients are scaled to lie
between -1 and +1 inclusive. Corresponding covariances are not scaled. Both the correlation coefficient and the covariance are measures
of the extent to which two variables "vary together."
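The effect of scaling can be seen in a small Python sketch. It assumes the population form of covariance (dividing by N); the data and the helper name `population_covariance` are hypothetical. Expressing the same salaries in cents changes the covariance by a factor of 100, while the correlation coefficient stays the same.

```python
from statistics import mean, pstdev


def population_covariance(xs, ys):
    """Covariance that divides by N (population form, assumed here)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data for five employees.
years = [1, 2, 3, 4, 5]
salary_eur = [300, 400, 450, 600, 750]
salary_cents = [s * 100 for s in salary_eur]  # same salaries, different unit

cov_eur = population_covariance(years, salary_eur)
cov_cents = population_covariance(years, salary_cents)

# Dividing by both standard deviations removes the units.
corr_eur = cov_eur / (pstdev(years) * pstdev(salary_eur))
corr_cents = cov_cents / (pstdev(years) * pstdev(salary_cents))

print(f"covariance:  {cov_eur:.1f} (euros) vs {cov_cents:.1f} (cents)")
print(f"correlation: {corr_eur:.4f} (euros) vs {corr_cents:.4f} (cents)")
```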
Exponential Smoothing

The Exponential Smoothing analysis tool predicts a value that is based on the forecast for the prior period, adjusted for the error in that prior
forecast. The tool uses the smoothing constant a, the magnitude of which determines how strongly the forecasts respond to errors in the prior
forecast.

> Note: Values of 0.2 to 0.3 are reasonable smoothing constants. These values indicate that the current forecast should be adjusted 20%
to 30% for error in the prior forecast. Larger constants yield a faster response but can produce erratic projections. Smaller constants
can result in long lags for forecast values.
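The update rule described above (new forecast = prior forecast + a × prior error) can be sketched in Python. The function name, the choice of the first actual value as the starting forecast, and the demand figures are all assumptions made for illustration; the ToolPak may initialise and parameterise its forecasts differently.

```python
def exponential_smoothing(actuals, alpha):
    """Return forecasts for each period plus one forecast for the next period.

    alpha is the smoothing constant (the "a" in the text). The first forecast
    is set to the first actual value, which is an assumption of this sketch.
    """
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must be between 0 and 1")
    forecast = actuals[0]
    forecasts = [forecast]
    for actual in actuals:
        # Adjust the prior forecast by a share alpha of its error.
        forecast = forecast + alpha * (actual - forecast)
        forecasts.append(forecast)
    return forecasts


# Hypothetical series that keeps rising.
actuals = [100, 110, 120]
balanced = exponential_smoothing(actuals, alpha=0.5)
slow = exponential_smoothing(actuals, alpha=0.2)
fast = exponential_smoothing(actuals, alpha=0.8)

print("alpha 0.5:", balanced)
print("alpha 0.2:", [round(f, 1) for f in slow])
print("alpha 0.8:", [round(f, 1) for f in fast])
```

With the smaller constant the forecast trails the rising series more, which is the lag the note warns about; the larger constant follows the data closely and would also follow any noise.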

Rank and Percentile

The Rank and Percentile analysis tool produces a table that contains the ordinal and percentage rank of each value in a data set. You
can analyse the relative standing of the values in a data set. This tool uses the worksheet functions RANK.EQ and PERCENTRANK.INC.
If you want to account for tied values, use the RANK.EQ function, which treats tied values as having the same rank, or use the
RANK.AVG function, which returns the average rank for tied values.
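The two ways of ranking ties can be sketched in Python. The functions below are hypothetical stand-ins written for this illustration, ranking in descending order (largest value = rank 1). The percentage rank uses a simple definition, the share of the other values that are strictly smaller; I assume this is close to what PERCENTRANK.INC reports for values present in the data, but Excel's handling of ties may differ.

```python
def rank_eq(values, x):
    """Rank of x in descending order; tied values share the same rank."""
    return 1 + sum(1 for v in values if v > x)


def rank_avg(values, x):
    """Rank of x in descending order; tied values get the average of their ranks."""
    ties = sum(1 for v in values if v == x)
    return rank_eq(values, x) + (ties - 1) / 2


def percent_rank(values, x):
    """Share of the other values that are strictly below x (assumed definition)."""
    below = sum(1 for v in values if v < x)
    return below / (len(values) - 1)


# Hypothetical salaries with one tie at 450.
salaries = [300, 450, 450, 600, 1000]
for salary in sorted(set(salaries), reverse=True):
    print(
        salary,
        rank_eq(salaries, salary),
        rank_avg(salaries, salary),
        f"{percent_rank(salaries, salary):.0%}",
    )
```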
