This is a compilation of a quantitative analysis carried out to discover the correlation between certain parameters of Human Resource (HR) data available.

UNIVERSITY OF WALES UK

Student Name: Yasaruwan Yuwanmini Landersz
Student I.D: 201207044
Report Title: Statistical Analysis of Organisational Human Resource Figures
Module: PPQM 100 Quantitative Methods
Submission Date:
Time:

Table of Contents

1. Introduction
2. Methodology
3. Data Description
4. Analysis and Interpretation
   Descriptive Analysis
   Multiple Correlation and Regression Analysis
   Selective Correlation and Regression Analysis
   Residual Analysis of Observed Correlation
5. Conclusion
6. References

Table of Figures

Figure 4.1: Histograms of Age and Basic Salary (with Normal Curve)
Figure 4.2: Box Plots Age and Basic Salary (with Normal Curve)
Figure 4.3: Gender Distribution Pie Chart
Figure 4.4: Qualification Distribution of Frequencies
Figure 4.5: Sector wise Distribution of Frequencies
Figure 4.6: Partial Regression Scatter Plots
Figure 4.7: Scatter Plot of Age vs. Basic Salary
Figure 4.8: Scatter Plot of Age vs. Residual

Table of Tables

Table 1: Descriptive Statistics of Numerical Variable
Table 2: Correlations of Multiple Variables against Basic Salary
Table 3: Model Summary
Table 4: Coefficients
Table 5: Correlation of Age Vs Basic Salary
Table 6: Model Summary
Table 7: Descriptive Statistics of Residuals
Table 8: Correlations
Table 9: Model Summary
Table 10: Coefficients

1. Introduction

This report presents a statistical analysis of organisational Human Resource (HR) figures, prepared to meet the criteria set for this subject and assignment. The objective of the assignment was to identify a valid data set, collect the raw data and analyse it by means of a descriptive statistical analysis together with either a correlation and regression analysis or a time series analysis. As the title indicates, this report covers the main highlights of that analysis for the HR parameters that were collected.

First, a brief introduction to the organisation from which the data was gathered. ABC Group of Companies began as a pharmaceuticals and trading enterprise and is now a diversified conglomerate with over 20 active subsidiaries organised into five key sectors: FMCG, Healthcare, Transportation, Leisure and Strategic Investments. It is also listed on the Colombo Stock Exchange; its revenues in 2010-2011 were $164 million (Rs. 18,067 million) and its profits amounted to $12 million (ABC Group of Companies, 2012).

The data obtained from this company relates to a few parameters of employee information. These figures (or variables) are mostly in line with the actual composition of staff currently employed in the organisation, and this basic employee data is the focus of the analysis. The data may be regarded as secondary, since it was extracted from a group HR system database which is maintained purely for testing purposes and which covers various aspects of HR within the organisation. The data in this database may also be outdated by a few years.

For this report it was decided to identify the variables that contribute to changes in the Basic Salary of an employee and to analyse them using the principles taught in this subject. It was also of interest to know how the organisation rates its salaries with respect to the market value of a job, as well as the basis for employee increments, which should also be reflected in the information selected (more information about the data is available in the Data Description section).

Please note that the organization name will have to remain anonymous, due to sensitive information such as salaries. However throughout this report it will be referred to as ABC Group of Companies.

2. Methodology

Keeping in line with the requirement of this assignment, a step-by-step approach was adopted to analyse the data described above. As required, a descriptive analysis and a correlation and regression analysis were carried out, with the correlation and regression analysis being the main focus of the assignment. The following sequence of steps was carried out during the execution of this assignment and the preparation of this report:

1. Variable Selection
The first step was to decide which parameters to select for this analysis. Since the database accessed for ABC Group of Companies included a vast and comprehensive scope of data related to HR metrics, it was a tough choice to select the relevant information. However, since employee salaries are of main concern to many HR and managerial professionals, Basic Salary was selected along with some other determinants assumed, on the basis of general know-how, to have an effect on it.

2. Sample Selection
The 30 employee records were selected randomly across the ABC Group of Companies, with no particular focus or criteria, since the sectors and designations of employees were also to be observed. The idea was to maximise the spread and minimise the inconsistencies, which are examined at a later stage. The number of samples was based on the general assumption that, in statistical analysis, a sample of about 30 or more is considered sufficiently large.

3. Data Entry into SPSS Software
The SPSS software was used to statistically analyse the above information, and the collected data was therefore keyed in along with the relevant variable definitions. Microsoft Excel was also used to arrive at certain simple and intermediate charts and outputs during this process. Numerical data were analysed directly, while categorical data were analysed based on their frequencies. Since at this point the main purpose was just to validate the selected data, this variation in the method used to describe the data was not considered to be significant.

4. Descriptive Analysis
A descriptive analysis was carried out for a few variables which were assumed to be critical with respect to their contribution to the variance of Basic Salary, and for Basic Salary itself. This was done in order to determine the validity of the selected sample for the correlation and regression analysis. The Mean, Median and Mode were observed to determine the central tendency of the data, and the Standard Deviation to determine its spread. The Coefficient of Variance (the standard deviation divided by the mean) was also derived to assess the consistency of each critical variable.

5. Multiple Correlation and Regression Analysis
As mentioned above, several variables were identified as possible key factors in determining an employee's Basic Salary in ABC Group of Companies. Initially, therefore, all of these critical variables were put against Basic Salary for a multiple correlation analysis (how strongly these variables were related to the change in Basic Salary) and a regression analysis (to what extent Basic Salary changed when each of these variables changed). Based on this, the significance of each correlation with salary was observed, and the most significant determinant was identified from the inclination of the trend line and the correlation coefficient.

6. Selective Correlation and Regression Analysis
In the previous step the correlation of all critical variables against Basic Salary was assessed blindly. Once the degree of the effect of each variable on salary was known, the most significant variable was selected and a one-to-one correlation and regression analysis was carried out, with the significant variable as the independent variable and salary as the dependent variable. The purpose was to confirm the results of the previous steps, to justify the degree of correlation with salary, and to come up with a model to predict salary from the selected variable.

7. Residual Analysis of Observed Correlation
The correlation of the residual (the difference between the observed actual salary and the salary derived from the model) against the selected variable was observed to determine the independence and the constant variance for all values of this variable (homoscedasticity), which is important for validating the correlation of the variable with salary.

8. Observations and Conclusions

3. Data Description

The data sample considered for this analysis was picked randomly with no special criteria, and information on the following variables was extracted for 30 employees:

1. Employee wise Sector: the sector each employee belongs to (Categorical data in the nominal scale)
2. Employee Designation: the designation held by each employee (Categorical data in the nominal scale)
3. Age: the ages of all 30 employees (Numerical data)
4. Gender: the gender of each employee (Categorical data in the nominal scale)
5. Qualification Type: whether the qualification is academic or professional (Categorical data in the nominal scale)
6. Level of Qualification: a set of information derived from the original Qualification Type data. The purpose of deriving such a value was to compare the level of education of an employee against their expected salary. Qualification Type is merely a set of categories with no direct relationship between the separate categories, so for this analysis ranks were assumed for each Qualification Type, e.g. Primary Education = 1, Secondary = 2, Bachelors = 3, Masters = 4, etc. (Numerical data in the ordinal scale)
7. Basic Salary: the basic salary of each employee (Numerical data)
8. Related Work Experience: the number of years of related working experience of each employee (Numerical data)
9. Total Work Experience: the total years of experience an employee has, whether or not the work is relevant to the current job (Numerical data)
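The derivation of Level of Qualification from Qualification Type can be sketched as a simple lookup. This is a minimal illustration only: the dictionary below contains just the four example ranks quoted above, and the names `QUALIFICATION_RANK` and `qualification_level` are hypothetical. The full rank table used in the analysis is not given in this report.

```python
# Example ranks quoted in the Data Description; the complete mapping
# used in the analysis is not listed in the report (assumption).
QUALIFICATION_RANK = {
    "Primary Education": 1,
    "Secondary": 2,
    "Bachelors": 3,
    "Masters": 4,
}


def qualification_level(qualification_type: str) -> int:
    """Map a nominal Qualification Type to its assumed ordinal rank."""
    try:
        return QUALIFICATION_RANK[qualification_type]
    except KeyError:
        raise ValueError(f"no rank assumed for {qualification_type!r}") from None


print(qualification_level("Masters"))  # 4
```

Raising an error for an unknown category, rather than returning a default rank, avoids silently assigning a level that was never assumed.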

4. Analysis and Interpretation

As per the methodology, the Variable Selection, Sample Selection and Data Entry into SPSS Software steps are assumed to be complete. This section highlights the methods used during the analysis and how each was executed in line with the proposed methodology.

Descriptive Analysis

The main focus of this exercise was to determine the factors that affect the Salary of an employee. Therefore, rather than going into detailed descriptive statistics, this descriptive analysis focuses on assessing the quality of the data obtained, so that the correlation in question can be analysed properly. First let us take a look at the numerical variables:

Table 1: Descriptive Statistics of Numerical Variable

Statistic                  Age      Basic Salary
N Valid                    30
N Missing                  0
Mean                       30.10
Median                     29.50
Mode                       28
Std. Deviation             4.544    55310.03437
Range                      20       235625.00
Minimum                    24       14375.00
Maximum                    44       250000.00
Percentile 25              27.00    16825.0000
Percentile 50              29.50    30000.0000
Percentile 75              32.25    47500.0000
Coefficient of Variance    0.151    1.0419

As seen above, of the numerical data gathered, the best-behaved data set is the Age distribution. All the other data sets have widely differing Means, Medians and Modes, whereas for Age these values are very close together (30.10, 29.50 and 28). This suggests that Age is approximately normally distributed (refer to Figure 4.1). The Standard Deviations of the other variables are also comparatively large, which implies that the variation around the mean is very high for all variables other than Age. The Coefficient of Variance (the standard deviation divided by the mean), which was derived manually rather than from SPSS, is also lowest for Age, indicating high consistency of data. The other variables appear highly unpredictable, and one might even conclude that the data set is not valid. However, this is often the case with real data, and the analysis proceeds in spite of these facts. Due to the consistency of the data and the focus of the exercise, Age and Basic Salary were selected for further descriptive analysis.
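Since the Coefficient of Variance was derived by hand, the calculation can be reproduced with a few lines of code. The sketch below assumes the usual definition (standard deviation divided by mean) and uses the Age figures from Table 1; the function name is illustrative only.

```python
def coefficient_of_variation(std_dev: float, mean: float) -> float:
    """Return the standard deviation expressed as a fraction of the mean."""
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return std_dev / mean


# Age figures as reported in Table 1.
age_cv = coefficient_of_variation(std_dev=4.544, mean=30.10)
print(round(age_cv, 3))  # 0.151
```

The result agrees with the 0.151 listed for Age in Table 1.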

Figure 4.1: Histograms of Age and Basic Salary (with Normal Curve)

As interpreted earlier from the Mean, Median and Mode, Age has a roughly normal distribution. Basic Salary, on the other hand, is right skewed (positively skewed), tailing off to the right, which indicates that most salaries sit in the lower levels; this is also confirmed by the box plot below. Roughly 75% or more of the salaries (everything up to the third quartile of Rs. 47,500) are below Rs. 50,000.
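The direction of the skew can be cross-checked from Table 1 with a common rule of thumb: a mean above the median points to a tail on the right. In the sketch below the mean of Basic Salary is not a reported figure; it is derived here from the reported standard deviation and Coefficient of Variance, which is an assumption of this example. The rule of thumb is also only a heuristic, not a formal skewness test.

```python
def skew_direction(mean: float, median: float) -> str:
    """Rule-of-thumb skew direction from the mean/median relationship."""
    if mean > median:
        return "right"
    if mean < median:
        return "left"
    return "symmetric"


# Reported in Table 1 for Basic Salary.
salary_std_dev = 55310.03437
salary_cv = 1.0419
salary_median = 30000.0  # 50th percentile

# Derived, not reported: mean = standard deviation / coefficient of variance.
salary_mean = salary_std_dev / salary_cv

print(skew_direction(salary_mean, salary_median))  # right
```

The derived mean is well above the median of Rs. 30,000, which is consistent with a long right-hand tail in the histogram.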

Figure 4.2: Box Plots Age and Basic Salary (with Normal Curve)

Considering the categorical data that was observed, the following descriptive outputs were obtained using SPSS:

Figure 4.3: Gender Distribution Pie Chart

As you can see above the Gender distribution of the sample has 53.3% Males (16 Employees) and 46.7% females (14 Employees).

Figure 4.4: Qualification Distribution of Frequencies

Considering the Qualification Types held by employees at the organisation, we can observe that most of them have Professional or Chartered Qualifications (63.3%). Given that Basic Salary is concentrated at the lower levels (the right skew observed above), this is an initial indication that the level of qualification may not actually impact the Salary.

Figure 4.5: Sector wise Distribution of Frequencies

From the distribution of employees across sectors we see that more employees are based at the corporate office than in the specialised sectors. From our knowledge of the business we can presume that these are staff engaged in clerical work rather than specialised jobs, which would also explain why the Basic Salary histogram is concentrated at the lower levels with a tail to the right.

Multiple Correlation and Regression Analysis

As an initial step we will blindly compare the aforementioned variables (assumed to be the independent variables) against the variation of Basic Salary (the dependent variable) to determine the best correlated variable, i.e. the variable that affects the change of Basic Salary most significantly. For this, let us observe the correlations obtained from SPSS.

Table 2: Correlations of Multiple Variables against Basic Salary

Variable              Pearson Correlation with Basic Salary   Sig. (1-tailed)   N
Age                   .742                                    .000              30
Qualification Level   .153                                    .210              30
Related Experience    .156                                    .206              30
Total Experience      .103                                    .294              30

As you can see, the highest level of correlation is observed with the variable Age. Considering significance, only Age has a p-value below 0.05 (significant at the 5% level); all the others are above 0.2 and are therefore not statistically significant. This leads us to conclude that Age is more strongly correlated with Basic Salary than all the other variables. However, to decide what impact the variation of Age (along with the others) has on the variation of Basic Salary, we should carry out a multiple regression analysis. The outputs are given below:

Table 3: Model Summary (b)

Model 1
R                 .776 (a)
R Square          .603
R Square Change   .603
F Change          9.479
df2               25
Sig. F Change     .000

a. Predictors: (Constant), Total Experience, Age, Qualification Level, Related Experience
b. Dependent Variable: Basic Salary


Considering the above model, we can see that our set of variables is strongly and positively correlated with Basic Salary, since the multiple correlation coefficient (R) of 0.776 is close to +1. This means that as our variables increase, Basic Salary tends to increase as well, which also holds from a practical perspective. Considering the coefficient of determination (R²), we can also observe that about 60% of the variation in Basic Salary can be explained by the variation of the independent variables taken together. Let us now observe the collective prediction model, or trend line, as well as the scatter plots for this data set.
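The link between R² and the F statistic in the model summary can be checked with the standard formula for a regression with an intercept, F = (R²/k) / ((1 - R²)/(n - k - 1)). The sketch below assumes n = 30 observations and k = 4 predictors, as described in this report; the function name is illustrative.

```python
def f_statistic(r_squared: float, n_obs: int, n_predictors: int) -> float:
    """Overall F statistic of a linear regression with an intercept."""
    explained = r_squared / n_predictors
    unexplained = (1.0 - r_squared) / (n_obs - n_predictors - 1)
    return explained / unexplained


# R Square from Table 3, with 30 employees and 4 predictors.
f_change = f_statistic(r_squared=0.603, n_obs=30, n_predictors=4)
print(round(f_change, 2))  # 9.49
```

The small gap to the 9.479 reported by SPSS comes from using R² rounded to three decimals.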

Table 4: Coefficients (a)

Model 1               B              Std. Error   Beta    t        Sig.   95.0% CI Lower Bound   95.0% CI Upper Bound
(Constant)            -245774.936    51103.892            -4.809   .000   -351025.373            -140524.500
Age                   9171.076       1570.647     .753    5.839    .000   5936.269               12405.884
Qualification Level   4147.330       4276.188     .125    .970     .341   -4659.644              12954.305
Related Experience    14943.454      9600.032     .209    1.557    .132   -4828.183              34715.091
Total Experience      -1861.356      3262.580     -.079   -.571    .573   -8580.764              4858.052

B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient.
a. Dependent Variable: Basic Salary

By observing the above table we can derive the following equation to predict the variation of Basic Salary with respect to our independent variables:

Y = 9171.076 X1 + 4147.330 X2 + 14943.454 X3 - 1861.356 X4 - 245774.936

where
Y = Basic Salary
X1 = Age
X2 = Qualification Level
X3 = Related Experience
X4 = Total Experience
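The prediction equation can be written as a short function and checked against the appendix. This is a sketch only: the coefficients are those of Table 4, while the function and constant names are illustrative. The check uses EMP01 (Age 36, Qualification Level 7, no related or total experience), whose predicted basic salary is listed in the appendix as 113,415.11.

```python
INTERCEPT = -245774.936
COEFFICIENTS = {
    "age": 9171.076,
    "qualification_level": 4147.330,
    "related_experience": 14943.454,
    "total_experience": -1861.356,
}


def predict_basic_salary(age: float, qualification_level: float,
                         related_experience: float,
                         total_experience: float) -> float:
    """Predicted Basic Salary from the multiple regression model."""
    return (INTERCEPT
            + COEFFICIENTS["age"] * age
            + COEFFICIENTS["qualification_level"] * qualification_level
            + COEFFICIENTS["related_experience"] * related_experience
            + COEFFICIENTS["total_experience"] * total_experience)


emp01 = predict_basic_salary(age=36, qualification_level=7,
                             related_experience=0, total_experience=0)
print(round(emp01, 2))  # 113415.11
```

Note that the intercept enters with a negative sign, so the model can return negative salaries for young employees with low qualification levels.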

Also the following individual partial regression scatter plots were derived to decide the key independent variable out of the given variables.

Figure 4.6: Partial Regression Scatter Plots

As can be observed above, only Age has a strong positive effect on the change in Basic Salary. Since the collective correlation was also positive, we can reasonably assume that it was largely due to the effect of Age. Let us look at the relationship between Age and Basic Salary in detail.

Selective Correlation and Regression Analysis

We will now take only Age into consideration, since its impact on Basic Salary was observed to be relatively high.

Table 5: Correlation of Age Vs Basic Salary

                      Age vs. Basic Salary
Pearson Correlation   .742**
Sig. (2-tailed)       .000
N                     30

**. Correlation is significant at the 0.01 level (2-tailed).

When comparing Age vs. Basic Salary we can observe that the correlation coefficient (R) has a value of 0.742, which is close to +1 and indicates that Age is positively correlated with Basic Salary. This implies that as an employee grows older their basic salary tends to increase (the reality of this is discussed in the Conclusion). The significance value of the correlation is also well below 0.05, i.e. the correlation is statistically significant at the 5% level, which is a further indication of correlation.
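The significance of a Pearson correlation is commonly assessed by converting r into a t statistic with n - 2 degrees of freedom, t = r * sqrt((n - 2) / (1 - r²)). The sketch below applies this standard formula to the reported r = 0.742 and n = 30; the function name is illustrative.

```python
import math


def t_from_correlation(r: float, n_obs: int) -> float:
    """t statistic (n_obs - 2 degrees of freedom) for a Pearson correlation."""
    return r * math.sqrt((n_obs - 2) / (1.0 - r * r))


t_stat = t_from_correlation(r=0.742, n_obs=30)
print(round(t_stat, 2))  # 5.86
```

In a regression with a single predictor, this is the same t value that SPSS reports for the slope of that predictor.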

Table 6: Model Summary

Model 1
R                 .742
R Square          .551
R Square Change   .551
F Change          34.299
df2               28
Sig. F Change     .000

Considering the model derived above, we can observe that the coefficient of determination (R²), with a value of 0.551, is also substantial. This indicates that 55.1% of the variation in Basic Salary can be explained by the variation in Age.

Coefficients (a)

Model 1      B              Std. Error   Beta   t        Sig.   95.0% CI Lower Bound   95.0% CI Upper Bound
(Constant)   -218783.745    46930.578           -4.662   .000   -314916.676            -122650.815
Age          9032.295       1542.262     .742   5.857    .000   5873.114               12191.476

B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient.
a. Dependent Variable: Basic Salary

By observing the above table we can derive the following equation to predict the variation of Basic Salary with respect to our independent variable:

Y = 9032.295 X - 218783.745

where
Y = Basic Salary
X = Age
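The single-variable model can be sketched in the same way and compared with the "Basic On Age" column of the appendix. The slope and intercept are those of the coefficient table above; the function name is illustrative.

```python
SLOPE = 9032.295
INTERCEPT = -218783.745


def predict_basic_salary_from_age(age: float) -> float:
    """Predicted Basic Salary from the Age-only regression line."""
    return SLOPE * age + INTERCEPT


at_25 = predict_basic_salary_from_age(25)
at_24 = predict_basic_salary_from_age(24)
print(round(at_25, 2))  # 7023.63
print(at_24 < 0)        # True
```

The value for age 25 matches the appendix. The negative prediction at age 24, the lowest age in the sample, is a reminder that the line is only a linear approximation over the observed range and should not be read literally at its edges.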

Using the above information we can derive a scatter plot with the above given trend line as follows:

Figure 4.7: Scatter Plot of Age vs. Basic Salary

As observed, the correlation and regression are clearly visible for the selected sample data. However, this alone is not enough to say that the two variables are correlated in the way the model assumes. Since only around 55% of the variation is explained by this model, the remaining 45% may also have a dependency on Age. To check this we will carry out a residual analysis.

Residual Analysis of Observed Correlation

As explained above, the purpose of this analysis is to establish the independence of the variable Age as well as the constant variance for all ages (homoscedasticity). The following steps were carried out:

1. The prediction model (the trend line equation above) was used to derive the estimated Basic Salary.
2. The difference between the actual Basic Salary and the predicted Basic Salary was obtained as the Residual (treated as another variable).
3. Finally, a correlation and regression analysis was carried out between the Residual and Age.

The results were as follows:
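The three steps above can be sketched end to end in plain Python. This is an illustration under stated assumptions, not the SPSS procedure used in the report: it uses only the ages and basic salaries of EMP01 to EMP13 from the appendix, and it fits its own least-squares line to that subset, so its slope and intercept differ from the coefficients reported for all 30 employees. The helper names `fit_line` and `pearson_r` are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Illustrative subset only: EMP01 to EMP13 from the appendix.
ages = [36, 32, 29, 28, 27, 25, 31, 31, 33, 33, 28, 24, 30]
salaries = [30000.0, 40000.0, 28000.0, 16300.0, 15000.0, 15000.0, 30000.0,
            45000.0, 150000.0, 150000.0, 40000.0, 14375.0, 43600.0]

# Step 1: estimate Basic Salary from the fitted trend line.
slope, intercept = fit_line(ages, salaries)
predicted = [slope * age + intercept for age in ages]

# Step 2: residual = actual Basic Salary - predicted Basic Salary.
residuals = [actual - est for actual, est in zip(salaries, predicted)]

# Step 3: correlate the residual with Age.
residual_mean = sum(residuals) / len(residuals)
residual_r = pearson_r(ages, residuals)
print(f"mean residual: {residual_mean:.4f}, r(Age, residual): {residual_r:.4f}")
```

For a least-squares line fitted to the same data, the residual mean and its correlation with the predictor are zero up to rounding error. The more informative diagnostic is therefore the shape of the residual scatter plot, where curvature or a changing spread would show up.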


Table 7: Descriptive Statistics of Residuals

                  Mean     Std. Deviation   N
Residual On Age   -.0012   37080.26836      30
Age               30.10    4.544            30

As shown above, the mean of the residual is roughly 0, and its values are spread across a wide range relative to that mean, since the standard deviation is 37080.26836. The coefficient of variance calculated for the residual is likewise reported as close to zero (3.236222533099272e-7).

Table 8: Correlations

                      Age vs. Residual On Age
Pearson Correlation   .000
Sig. (1-tailed)       .500
N                     30

Table 9: Model Summary

Model 1
R                 .000
R Square          .000
R Square Change   .000
F Change          .000
df2               28
Sig. F Change     1.000

The correlation between the residual and Age is negligible, since the correlation coefficient (R) is 0 and the coefficient of determination (R²) is also 0. This indicates that no linear relationship with Age remains in the residuals, i.e. the part of Basic Salary left unexplained by the model does not vary linearly with Age. The following coefficient details will also help us construct a model for this relationship.


Table 10: Coefficients

Model 1      B       Std. Error   Beta   t      Sig.    95.0% CI Lower Bound   95.0% CI Upper Bound
(Constant)   -.001   46930.578           .000   1.000   -96132.931             96132.929
Age          .000    1542.262     .000   .000   1.000   -3159.181              3159.181

B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient.

As observed above the slope and the constant of Age against the Residuals is almost 0. This implies that values are spread almost equally on both sides of the trend line when we look at the scatter plot. This further confirms the independence of Age with respect to the residual variable which determines basic salary.

Figure 4.8: Scatter Plot of Age vs. Residual

When we observe the scatter plot it is evident that residuals are spread equally alongside the trend line and there is no visible pattern of the Residuals with respect to Age. Also the plot shows the linearity and homoscedasticity of the variable Age.


5. Conclusion

As seen during the statistical analysis, several variables were first analysed descriptively to check whether they were valid. Once the validity of the sample data was confirmed, several variables were put through a multiple regression analysis against Basic Salary. During this process the variable Age stood out, and Age alone was therefore analysed against Basic Salary for correlation. Once the high correlation was identified, it was analysed further and linearity, independence and homoscedasticity were established. Statistically, this indicates that as an employee's age in ABC Group of Companies increases, their Basic Salary should also increase.

In reality, however, this is not necessarily true. There are many determinants of an employee's basic salary, such as qualifications, experience and performance. Employees with more work experience will generally be older, and experience will also affect their performance. The correlation between Age and Basic Salary may therefore be only a coincidental correlation. Statistically, however, we can arrive at the conclusion that age does have an impact on an employee's basic salary.

Another fact is that only 55% of the variation in Basic Salary is explained by Age. There may be other factors, such as work experience (related or overall). However, during the multiple regression we observed that this was not the case for ABC Group of Companies. The qualification levels that were compared with Basic Salary also did not explain the variation in Basic Salary to any great extent for this organisation. This brings us to the final conclusion that the organisation's salary scheme may be driven by another determinant. From experience and the background of this organisation, we can presume this factor to be its performance-oriented evaluation and culture. Since performance data was not considered in this analysis (due to unavailability), we have to assume that a factor of this kind may explain the remaining 45% of the variation in Basic Salary. We can also conclude that any other parameter that might affect Basic Salary could be analysed by the same method.


6. References

ABC Group of Companies*, 2012. ABC Group of Companies Web Site. [Online] Available at: * [Accessed 29 August 2012].

hSenid Business Solutions, 2012. Sample HR Database for Indian Operations, Chennai, India: s.n.


Employ ee No 006105 Employ ee Name EMP01 Sector Healthcar e Sector Healthcar e Sector FMCG Corporat e Office Corporat e Office Corporat e Office Healthcar e Sector FMCG Corporat e Office Corporat e Office Corporat e Office Corporat e Office Corporat e Office Age 36 Gender Female Qualification Type Higher Education Qualification Degree Higher Education Qualification Degree Professional Qualification Professional Qualification Professional Qualification Professional Qualification Professional Qualification Professional Qualification Professional Qualification Professional Qualification Professional Qualification Professional Qualification Professional Qualification Qualifi cation Level 7 Basic Salary 30,000.00 Related Experie nce 0 Total Expe rienc e 0 Predicted Basic 113,415.11 Residual -83,415.11 Basic On Age 106,378.88 Residual On Age 76,378.88 30,249.70 15,152.81 17,820.52 10,088.22 7,976.37 31,217.40 16,217.40 70,718.01 70,718.01 5,879.48 16,383.67 -8,585.10

006672

EMP02

32

Male

40,000.00

67,424.03

-27,424.03

70,249.70

006674 004650 007025 007059 007062 007095 007209 007329 007349 007420 007439

EMP03 EMP04 EMP05 EMP06 EMP07 EMP08 EMP09 EMP10 EMP11 EMP12 EMP13

29 28 27 25 31 31 33 33 28 24 30

Female Male Male Male Female Male Male Female Female Female Female

7 6 6 6 6 6 6 6 6 6 6

28,000.00 16,300.00 15,000.00 15,000.00 30,000.00 45,000.00 150,000.00 150,000.00 40,000.00 14,375.00 43,600.00

0 0 0 0 0 0 0 3 0 0 0

0 1 0 0 3 0 0 6 4 0 2

49,217.58 34,037.82 26,728.10 8,385.94 57,828.33 63,412.40 81,754.55 115,416.78 28,453.75 -785.13 50,518.61

-21,217.58 -17,737.82 -11,728.10 6,614.06 -27,828.33 -18,412.40 68,245.45 34,583.22 11,546.25 15,160.13 -6,918.61

43,152.81 34,120.52 25,088.22 7,023.63 61,217.40 61,217.40 79,281.99 79,281.99 34,120.52 -2,008.66 52,185.10

Corporat e Office Corporat e Office FMCG Transport ation Sector FMCG FMCG

27 26 44 32

Professional Qualification Professional Qualification Professional Qualification Professional Qualification Higher Diploma Other Educational Qualification (Diploma) Other Educational Qualification (Diploma) Other Educational Qualification (Diploma) Primary & Secondary Education (School Level) Primary & Secondary Education (School Level) Professional Qualification Professional Qualification Other Educational Qualification (Diploma)

6 6 6 6

0 1 0 0

0 1 0 8

008708 006890

EMP18 EMP19

39 31

Female Female

6 4

100,000.00 55,000.00

0 0

0 3

136,781.01 49,533.67

-36,781.01 5,466.33

133,475.76 61,217.40

33,475.76 -6,217.40

006689

EMP20

FMCG

30

Male

40,000.00

41,799.33

-1,799.33

52,185.10

12,185.10 -9,120.52

007019

EMP21

Leisure

28

Male

25,000.00

62,703.48

-37,703.48

34,120.52

007415

EMP22

24

Female

16,000.00

-16,949.83

32,949.83

-2,008.66

18,008.67

007694

EMP23

36

Female

30,000.00

75,501.64

-45,501.64

106,378.88

76,378.88

29 25 24

1 6 6

0 0 0

0 0 0

004892 004893

EMP27 EMP28

28 31

Male Male

004894 004895

EMP29 EMP30

29 33

Male 1

3 6

25,000.00 30,000.00

0 0

0 0

23,457.18 63,412.40

1,542.82 -33,412.40

34,120.52 61,217.40

7 6

28,000.00 150,000.00

0 0

0 0

49,217.58 81,754.55

-21,217.58 68,245.45

43,152.81 79,281.99

