Introduction
The sample data provided is based on a company that has multiple departments and
consists of information on employees, such as their ethnicity, their start and end dates, their
Annual Salary, etc.
In this report, we analyze the company's employee data in detail to obtain meaningful
insights into the performance of the workforce, the tenure of employees in the company, the
average age of the employees, and the relationships between the different variables in the
data. The purpose of this analysis is to gain a deeper understanding of the workforce
composition, performance distribution, and relationships between key variables.
Analysis
Task 1:
Use pivot tables to calculate and summarize the average salary and tenure across
In Task 1, we calculate each employee's tenure from the given data set as the Exit date
minus the Hire date. Once we have tenure in decimal years, we use a pivot table to calculate
and summarize the average salary and tenure for the different departments and job titles. We
get the desired result by placing the ‘job title’ and ‘department’ variables of the employee
sample data in the rows of the table and Average tenure and Average salary in
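The tenure calculation and the grouped averages described above can also be sketched outside a spreadsheet. The following is a minimal illustration only: the sample rows, the field names, and the 365.25-day year used to convert days into decimal years are all assumptions, not values taken from the report's data set.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical sample rows; the field names are assumptions for illustration.
employees = [
    {"department": "IT", "job_title": "Analyst", "salary": 60000,
     "hire_date": date(2015, 1, 1), "exit_date": date(2019, 1, 1)},
    {"department": "IT", "job_title": "Analyst", "salary": 70000,
     "hire_date": date(2016, 1, 1), "exit_date": date(2018, 1, 1)},
    {"department": "Sales", "job_title": "Manager", "salary": 90000,
     "hire_date": date(2010, 6, 1), "exit_date": date(2020, 6, 1)},
]


def tenure_years(hire, exit_):
    """Tenure in decimal years: (exit date - hire date) in days / 365.25 (assumed)."""
    return (exit_ - hire).days / 365.25


def pivot_averages(rows):
    """Group rows by (department, job title) and average salary and tenure."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["department"], row["job_title"])].append(row)
    return {
        key: {
            "avg_salary": mean(r["salary"] for r in members),
            "avg_tenure": mean(
                tenure_years(r["hire_date"], r["exit_date"]) for r in members
            ),
        }
        for key, members in groups.items()
    }


summary = pivot_averages(employees)
for (dept, title), stats in sorted(summary.items()):
    print(f"{dept:6} {title:8} {stats['avg_salary']:>10,.0f} {stats['avg_tenure']:6.2f}")
```

Each dictionary key plays the role of one row label pair in the pivot table, and the two averages correspond to its value fields.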
Task 2:
Create a pivot table to showcase the distribution of employees based on their performance
In this task, we first create a Performance score indicator for the employees. To create the
Performance score, we take each employee's bonus and multiply it by 10 to obtain a
performance indicator out of 10. We use Full name, Department, and Performance score in the
pivot table: we place the Performance score in the values area of the pivot table and sort the
data by department and by employee performance score within each department. Below is the pivot table.
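As a rough sketch of the scoring and sorting step, the snippet below multiplies the bonus by 10 and lists employees by score within each department. It assumes the bonus is stored as a fraction (0.15 for 15%) and sorts from highest to lowest; both choices, along with the sample names, are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical rows; bonus is assumed to be a fraction (0.15 means 15%).
employees = [
    {"full_name": "A. Rivera", "department": "IT", "bonus": 0.15},
    {"full_name": "B. Chen", "department": "IT", "bonus": 0.30},
    {"full_name": "C. Patel", "department": "Sales", "bonus": 0.00},
]


def performance_score(bonus):
    """Performance indicator as described in the text: bonus multiplied by 10."""
    return round(bonus * 10, 2)


def scores_by_department(rows):
    """Return {department: [(full_name, score), ...]}, highest score first (assumed order)."""
    table = defaultdict(list)
    for row in rows:
        table[row["department"]].append(
            (row["full_name"], performance_score(row["bonus"]))
        )
    return {
        dept: sorted(members, key=lambda m: m[1], reverse=True)
        for dept, members in sorted(table.items())
    }


for dept, members in scores_by_department(employees).items():
    for name, score in members:
        print(f"{dept:6} {name:10} {score}")
```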
Task 3:
Investigate which Business Unit has the highest Female-to-Male employee ratio.
In this task, we investigated the gender distribution across business units to determine which one
has the highest Female-to-Male employee ratio, as an indicator of diversity patterns within the
organization. According to the analysis, women make up 51.8% of the company's employees,
compared with 48.2% for men.
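The per-unit ratio itself is a simple count of female employees divided by the count of male employees in each business unit. A minimal sketch follows, using placeholder unit names and made-up records rather than the report's data.

```python
from collections import Counter

# Hypothetical (business unit, gender) pairs; the unit names are placeholders.
records = [
    ("Unit A", "Female"), ("Unit A", "Female"), ("Unit A", "Male"),
    ("Unit B", "Female"), ("Unit B", "Male"), ("Unit B", "Male"),
]


def female_to_male_ratios(rows):
    """Return {business_unit: female count / male count}."""
    counts = Counter(rows)
    ratios = {}
    for unit in sorted({unit for unit, _ in rows}):
        females = counts[(unit, "Female")]
        males = counts[(unit, "Male")]
        # A unit with no male employees has no finite ratio.
        ratios[unit] = females / males if males else float("inf")
    return ratios


ratios = female_to_male_ratios(records)
highest_unit = max(ratios, key=ratios.get)
print(highest_unit, ratios[highest_unit])
```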
Task 4:
Compute the mean, median, mode, standard deviation, and range for age, salary, and
tenure.
Using the descriptive statistics tool in Data Analysis, we find summary statistics, which include
the mean, mode, median, standard deviation, and range, for variables such as age, salary, and tenure.
These statistics give a thorough overview of the central tendencies and distribution within these
key variables. Note that in the case of tenure, we use only those records whose exit
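For reference, the same five summary statistics can be computed with Python's standard `statistics` module. This is only a sketch on made-up ages; it uses the sample standard deviation (`statistics.stdev`), which is an assumption about which variant the spreadsheet tool reports.

```python
import statistics


def describe(values):
    """Mean, median, mode, sample standard deviation, and range of a list."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "std_dev": statistics.stdev(values),  # sample standard deviation (assumed)
        "range": max(values) - min(values),
    }


ages = [30, 40, 40, 50, 60]  # hypothetical values, not from the data set
age_summary = describe(ages)
for name, value in age_summary.items():
    print(f"{name:8} {value:.2f}")
```

The same `describe` function can be applied to the salary and tenure columns in turn.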
Task 5:
Analyze and compare the variance in age, salary, and tenure. Discuss what these variance
measurements reveal about each variable in conjunction with the results from the previous
question.
data set. More specifically, variance measures how far each number in the set is from the mean
(average), and thus from every other number in the set. After applying the data analysis tool
using covariance to age, salary, and tenure, we find that age has a moderate spread with a relatively
normal distribution. Salary has a wide spread with a positively skewed distribution, indicating the
presence of higher earners. Tenure has a moderate spread, suggesting that some employees have
longer tenures.
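To make the definition concrete, the sketch below computes the sample variance directly as the sum of squared deviations from the mean divided by n - 1, and compares it with `statistics.variance`. The three lists are made-up values, and the use of the sample (rather than population) variance is an assumption.

```python
import statistics


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    mean_value = sum(values) / len(values)
    return sum((v - mean_value) ** 2 for v in values) / (len(values) - 1)


# Hypothetical values for the three variables, not taken from the data set.
data = {
    "age": [30, 40, 40, 50, 60],
    "salary": [50000, 60000, 70000, 150000],
    "tenure": [1.5, 3.0, 4.5, 12.0],
}

for name, values in data.items():
    manual = sample_variance(values)
    library = statistics.variance(values)
    print(f"{name:7} variance={manual:,.2f} (statistics.variance: {library:,.2f})")
```

Because variance is expressed in squared units, the values for age, salary, and tenure are not directly comparable in magnitude; the square root (the standard deviation) returns each to its original unit.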
Task 6:
Create a histogram to illustrate the age distribution. Compare the insights from this chart
In this task, we created a histogram for the age variable of the sample data. The x-axis of
the chart shows the age ranges of the employees, whereas the y-axis shows the number of
employees in each age range. The mean age calculated earlier is 44.38, and according to the
histogram almost 88 employees fall near it. An interesting observation is that the largest
number of employees falls in the 44.5 – 48.4 range.
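A histogram is just a count of values per equal-width bin. The following sketch shows that counting step; the ages, the starting point, and the bin width are hypothetical and do not reproduce the report's chart.

```python
def histogram(values, bin_start, bin_width, bin_count):
    """Count values in equal-width bins [start, start + width), and so on."""
    counts = [0] * bin_count
    for value in values:
        index = int((value - bin_start) // bin_width)
        if 0 <= index < bin_count:  # values outside the bins are ignored
            counts[index] += 1
    return counts


ages = [25, 27, 33, 45, 46, 47, 52, 64]  # hypothetical ages
counts = histogram(ages, bin_start=25, bin_width=10, bin_count=4)

for i, count in enumerate(counts):
    low = 25 + i * 10
    print(f"{low}-{low + 10}: {'#' * count} ({count})")
```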
Task 7 & 8:
Examine the correlation between salary and tenure, as well as between age and
performance score, using scatter plots for visualization. Adjust the scatter plot from the
previous question. On the Annual Salary axis, use bins of $35,000 each and display the
units in thousands.
A correlation tells us how two interdependent variables move together, and its value falls
between -1 and 1. After computing the correlation between salary and tenure, we find that the two
variables move inversely but not perfectly: the relationship is weak and negative in direction. In
the case of performance and age, the correlation is very weak and also negative in
direction. When we plot these variables on scatter plots, the tenure vs. annual salary plot
shows many outliers, and after adding the trend line the relationship is
still somewhat weak, even after making the bin adjustment of $35,000. The X-axis indicates the Annual
salary and the Y-axis indicates the Tenure of an employee. In the Age vs.
performance scatter plot, on the other hand, the data is scattered and has many outliers relative
to the trend line, indicating that the relationship is weak and negative in direction. On the
X-axis we have the age variable, whereas in
Correlation between Annual Salary and Tenure:

                 Annual Salary    Tenure
Annual Salary    1
Tenure           -0.129928067     1

Correlation between Performance Score and Age:

                     Performance Score    Age
Performance Score    1
Age                  -0.015555127         1
[Figure: scatter plot of Tenure (Years) against Annual Salary]
[Figure: scatter plot of Performance Score against Age]
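The correlation coefficients discussed in this task are Pearson coefficients, which can be sketched in a few lines. The helper `salary_bin_label` is a hypothetical illustration of grouping salaries into $35,000 bins labelled in thousands; it assumes bins start at multiples of $35,000, which may differ from the axis used in the report's chart.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


def salary_bin_label(salary, width=35000):
    """Label the bin containing `salary`, in thousands (hypothetical helper)."""
    low = (salary // width) * width
    return f"${low // 1000}K-${(low + width) // 1000}K"


# Hypothetical values chosen to give a perfect negative relationship.
salaries = [40000, 80000, 120000]
tenures = [6.0, 4.0, 2.0]

print(pearson(salaries, tenures))
print([salary_bin_label(s) for s in salaries])
```

A coefficient near 0, such as the two values reported in the tables, indicates that neither variable is a useful linear predictor of the other.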
Task 9:
Conclusion
The analysis of employee data shows some interesting trends. The company has a nearly equal gender
balance, with women constituting 51.8% of its employees. The average age of 44.38 sits at the
center, with the largest group of employees in the age range between 44.5 and 48.4 years.
The varying nature of salary and tenure is indicative of differences in compensation
plans and employee retention approaches. The relationship between salary and tenure is a
weakly negative correlation, which implies that salary is slightly lower with more
years of service. Performance scores show a very weak correlation with age, so age
has practically no influence on individual performance. The scatter plots show the presence of
outliers in these relationships, which implies individual differences. In general, these patterns
may be shaped or affected by organizational policies and selection methods, as well as dynamics within