
MBA431 Quantitative Business Analysis
Dr. Hong Chen


LAB ONE

Introduction to Microsoft Excel, Graphical Presentation of Data & Descriptive
Statistics
Ref: pp13-14, 78-79, Chapter 4, Selvanathan et al. (2011)


Part 1. Introduction to Microsoft Excel

Basic statistical analysis can be done easily in Microsoft Excel with the help of add-ins, one of
which is the Data Analysis ToolPak.

1.1 Opening Excel
If you don't find Excel on your computer desktop, go to Start, All Programs,
Microsoft Office and click Microsoft Excel 2010 (2007 will also do).
From the Excel screen, click on File in the main menu, select New from the drop-down menu,
and then click on Create.

1.2 The Excel workbook and worksheet
Excel files are called workbooks.
A workbook contains worksheets (by default 3, named Sheet1 to Sheet3; you can add more
worksheets as needed). You can work on any of these sheets and on any others that you
create. To change the worksheet, click the tab of the sheet you wish to move to.
A worksheet consists of rows and columns. The rows are numbered, and the columns are
identified by letters. Each cell in the worksheet is identified by the combination of
one letter and one number, e.g. A1 refers to the first cell in the worksheet, and D3 is the
cell in the fourth column and the third row.
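
The letter-and-number convention can be made concrete with a short sketch. The Python function below is an illustration only; the name parse_cell is hypothetical and not part of Excel. It splits a reference such as D3 into a column number and a row number, and assumes that columns beyond Z continue as AA, AB and so on.

```python
import re

def parse_cell(ref):
    """Split an A1-style reference into (column_number, row_number), both 1-based."""
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", ref)
    if match is None:
        raise ValueError(f"not a valid cell reference: {ref!r}")
    letters, digits = match.groups()
    column = 0
    for letter in letters.upper():
        # Read the letters as a base-26 number in which A=1 ... Z=26.
        column = column * 26 + (ord(letter) - ord("A") + 1)
    return column, int(digits)

print(parse_cell("A1"))  # (1, 1): first column, first row
print(parse_cell("D3"))  # (4, 3): fourth column, third row
```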
A cell becomes active when you move the mouse pointer (which appears as a large plus sign)
and click, e.g. cell D5 in Figure 1. In the active cell, you can type in a number, word or
formula.

You can also move to another cell using any of the four Up, Down, Left or Right arrow keys,
which appear on your keyboard as arrows pointing up, down, left and right respectively.
At the bottom left-hand corner of the screen you will see the word Ready. As you begin to
type something into the active cell, the word Ready changes to Enter.


1.3 Inputting data
To input data, open a new workbook by clicking the File tab from the menubar and then
selecting New.
Data are usually stored in columns. Activate the cell in the first row of the column in which
you plan to type the data.
You may type the name of the variable if you wish. E.g. if you plan to type your assignment
marks in column A you may type Assignment Marks in cell A1. Hit the Enter key on your
keyboard and cell A2 becomes active.
Begin typing the marks, following each one by Enter. Use the arrow key or mouse pointer to
move to a new column if you wish to enter another set of numbers.

1.4 Importing data files
Data files for this course's computer practices can be downloaded from Moodle.
To import a file, click the File tab and select Open on the drop-down menu.
Browse the directories to find the required file. Double-click each of the directories along the
path until you reach the file you wish to open.
The file will appear in the form in which it was saved.

1.5 Data Analysis ToolPak
The Data Analysis ToolPak is a group of statistical functions that comes with Excel. You can find
the ToolPak by clicking the Data tab from the menubar and then Data Analysis from the Analysis
sub-menu. If the ToolPak does not appear in the menu, follow these steps to add it:
Click on the File tab and select Excel Options.
From the options list, click on Add-Ins, which will display another menu. Make sure that
under Manage, you select Excel Add-Ins and then click Go.
Select Analysis ToolPak and then click OK.
To access the Analysis ToolPak, simply click the Data tab and then Data Analysis from the
Analysis sub-menu.
There are 19 menu items in Data Analysis. Click the one you wish to use, and follow the
instructions described in the textbook or lab session notes.

1.6 Formula bar and Insert function fx
On the Formulas tab of the menubar you will find the fx Insert Function button. Clicking this
button produces other menus that allow you to specify functions that perform various calculations.

1.7 Saving workbooks
To save a new file, click the File tab from the menubar and select the option Save as on the
drop-down menu. Enter the new file name and click Save.
To save an already saved file with the same name, choose Save on the drop-down menu, and
the original file will be overwritten.






Part 2. Graphical Presentation of Data

2.1 Line plot of time series data

Use the Insert tab from the Excel menubar to draw a line plot. We use the dataset in the Excel
file XR02-81 to draw a line plot and describe the trend of job vacancies in New South Wales
(NSW) over 1991-2008.

1) Highlight cells from B1 to B19.
2) Click the Insert tab from menubar, from Charts submenu choose Line, and then click
the first type of chart in the second row (when you place mouse pointer on the chart it
shows the name Line with Markers).
3) Note that the horizontal axis does not show the points of years. To let the horizontal axis
represent points of years, from the Design tab, click Select data from Data submenu.
4) In the pop-up dialogue named Select Data Source, click Edit under the Horizontal
(Category) Axis Labels.
5) Move your mouse pointer to the Input Range box and click. Back to Sheet1, highlight
cells A2 to A19, and then click OK in the pop-up box.
6) Click OK in the Select Data Source dialogue, which will produce a line plot as follows.
(You can continue working on the format of the plot and choose options from the
right-click menu).

[Figure: line plot of the NSW series, 1991-2008]

7) To put all three time series in one graph, simply highlight three series by covering cells
from B1 to D19. Then follow Steps 2~6. The multiple series line plot will be as follows:

[Figure: line plot of the NSW, Victoria and Australia series, 1991-2008]


2.2 Histogram to present frequency distribution

For the rest of this practice we use the M-Status (marriage status) series in Column C in the
Excel file XR02-82.

1) We use the Analysis ToolPak for this practice. To plot a histogram we need to define the
classes (categories) for the variable. To do that, in cells G1 to G5 type class, 1, 2, 3, 4
respectively (see below).

2) From the Data tab, go to the last submenu, Analysis, and click Data Analysis (refer to
Part 1.5 to add it into Excel if you cannot find it in the submenu).
3) Choose Histogram from the list.
4) In the pop-up Histogram dialogue, move mouse pointer to Input Range box and click.
Go back to worksheet Sheet1, highlight cells from C1 to C501 (in total 500 observations
in this sample). Switch back to the pop-up dialogue.
5) Move mouse pointer to Bin Range box and click. Go back to worksheet Sheet1,
highlight cells from G1 to G5. Switch back to the pop-up dialogue.
6) Check (i.e. tick) the Labels option to indicate that the first row is series names.
7) Choose Output Range from the Output Options. Move mouse pointer to Output
Range box and click. Go back to worksheet, highlight cell H1 (this will decide where the
chart will be displayed). Switch back to the pop-up box.
8) Check (i.e. tick) the box of Chart Output. Then click OK. A histogram will be produced
accordingly. Note that the chart has a fifth category named Other. Simply delete the last
row in the newly generated frequency table, so that the histogram chart is without the
fifth category (see below).
9) Note that in our practice today classes are denoted by single values rather than intervals.
These values can be regarded as mid-points of classes.
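
The counting that the Histogram tool performs can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the ToolPak's own code: the function name histogram_counts and the ten sample codes are hypothetical (the real series has 500 observations), and the sketch assumes each observation is counted in the first class value that is greater than or equal to it, with anything above the last class value falling into the extra fifth category.

```python
def histogram_counts(values, bins):
    """Count values per class; the final slot holds values above the largest class."""
    counts = [0] * (len(bins) + 1)
    for value in values:
        for i, upper in enumerate(bins):
            if value <= upper:
                counts[i] += 1
                break
        else:
            # No class value was large enough, so use the overflow slot.
            counts[-1] += 1
    return counts

# Hypothetical marital-status codes, for illustration only.
m_status = [1, 2, 2, 3, 1, 4, 2, 2, 3, 1]
classes = [1, 2, 3, 4]  # the values typed into G2 to G5

counts = histogram_counts(m_status, classes)
for label, frequency in zip(classes, counts):
    print(label, frequency)
print("above last class:", counts[-1])
```

Because every code in this sample equals one of the class values, the overflow slot stays at zero, which is why the fifth row can be deleted from the frequency table without losing any observations.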


2.3 Bar charts to present frequency distribution
1) Click the Insert tab from menubar, from Charts submenu choose Bar, and then click the
first type of chart in the first row (when you place mouse pointer on the chart it shows
the name Clustered Bar).
2) Place the mouse pointer on the empty chart and click. From the Design tab, click Select data
from the Data submenu.
3) Move the mouse pointer to the Chart data range box and click. Return to Sheet1, highlight
cells I1 to I5, and then click OK. In the current example, the Horizontal (Category) Axis
Labels are set automatically.
4) You can try the other types of bar charts to see different views.


2.4 Pie charts to present relative frequency distribution
1) We can use Pie charts to present frequency distribution and relative frequency
distribution. For the latter, we need to firstly calculate relative frequency. To do that, in
cell J1 type Relative Frequency, then press Enter on your keyboard which makes cell J2
active. In J2 type =I2/500 (i.e. divide the value in cell I2 by the total sample size 500),
and press Enter to finish the command. To duplicate the command in other cells, move the
mouse pointer to the bottom-right corner of cell J2 (the mouse pointer becomes a +).
Click the bottom-right corner of J2 and drag the mouse pointer over cells
J3 to J5 (caution: keep the left mouse button pressed and don't release it until J3 to
J5 are covered). Then you will see that J3 to J5 are filled with values of relative
frequency.
2) Click the Insert tab from menubar, from Charts submenu choose Pie, and then click the
first type of chart in the first row.
3) Place mouse pointer to the empty chart and click. From the Design tab, click Select data
from Data submenu.
4) Move the mouse pointer to the Chart data range box and click. Return to Sheet1, highlight
cells J1 to J5, and then click OK. In the current example, the Horizontal (Category) Axis
Labels are set automatically.
5) To add data labels to the Pie chart, move mouse pointer to the Pie chart, right-click on the
Pie area and choose Add Data Labels.
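
The relative frequency calculation in Step 1 can be checked outside Excel with a minimal Python sketch. The four frequencies below are hypothetical stand-ins for the values in I2 to I5, and the sketch divides by their sum rather than by a typed-in sample size, which gives the same result whenever the frequencies add up to the sample size.

```python
# Hypothetical class frequencies standing in for cells I2 to I5.
frequencies = [3, 4, 2, 1]

total = sum(frequencies)  # plays the role of the sample size (500 in the lab data)
relative = [f / total for f in frequencies]  # same idea as =I2/500 filled down

for f, r in zip(frequencies, relative):
    print(f, round(r, 4))
```

The relative frequencies should always add up to 1, which is a quick check that the fill-down in column J worked.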




2.5 Ogive to present cumulative frequency distribution
1) We use Ogive plots to present the cumulative relative frequency distribution. Thus we need
to calculate cumulative relative frequency. To do that:
In cell K1 type Cumulative Relative Frequency, then press Enter on your keyboard,
which makes cell K2 active. In K2 type =J2 (i.e. the first class's cumulative relative
frequency is the relative frequency of the first class), and press Enter to finish the
command.
In K3 type =J3+K2 (i.e. the second class's cumulative relative frequency is the sum of
the second class's relative frequency and the cumulative relative frequency of the
preceding class).
To duplicate the command in other cells, move the mouse pointer to the bottom-right
corner of cell K3 (the mouse pointer becomes a +). Click the
bottom-right corner of K3 and drag the mouse pointer over cells K4 to K5 (caution:
keep the left mouse button pressed and don't release it until K4 and K5 are
covered). Then you will see that K4 and K5 are filled with values of cumulative
relative frequency.
2) Click the Insert tab from menubar, from Charts submenu choose Line, and then click
the first type of chart in the second row.
3) Place mouse pointer to the empty chart and click. From the Design tab, click Select data
from Data submenu.
4) Move mouse pointer to the Chart data range box and click. Return to Sheet1, highlight
cells K1 to K5, and then click OK.
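
The running total built in column K can be sketched in Python as follows. The relative frequencies are hypothetical values standing in for J2 to J5; accumulate from the standard library adds each value to the total of the preceding classes, which is the same rule as =J3+K2 filled down.

```python
from itertools import accumulate

# Hypothetical relative frequencies standing in for cells J2 to J5.
relative = [0.3, 0.4, 0.2, 0.1]

# Each entry is the current relative frequency plus the previous cumulative value.
cumulative = list(accumulate(relative))

for r, c in zip(relative, cumulative):
    print(r, round(c, 4))
```

The last cumulative value should equal 1 (up to rounding), so the Ogive always ends at the top of the vertical axis.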




Part 3. Descriptive statistics

We use the dataset in the Excel file XM04-06 to summarize sample statistics for 100 students'
marks.

1. Maximum value can be found by typing =max(A2:A101). Note that A2:A101 is the
input range.
2. Minimum value can be found by typing =min(A2:A101).
3. Range is the difference between the above two, found by typing =B2-B3, where B2 is the
cell of maximum and B3 is the cell of minimum.
4. To calculate mean of the marks series, in any blank cell type =average(A2:A101).
5. To calculate median of the series, in any blank cell type =median(A2:A101).
6. To calculate mode of the series, in any blank cell type =mode(A2:A101).
7. To calculate variance of the series, in any blank cell type =var(A2:A101).
8. To calculate standard deviation of the series, in any blank cell type =stdev(A2:A101).
Check that the s.d. is the square root of the variance. You can do this by typing =sqrt(.),
where . is the cell containing the variance.
9. Coefficient of variation can be calculated by typing =B9/B5 where B5 is the cell of
mean value and B9 is the cell of standard deviation in my practice.
10. The above statistics can also be obtained by using the Summary Statistics function of the
Data Analysis ToolPak. Follow the steps below to obtain the Summary:
a) Click the Data tab from the menubar, choose Data Analysis from the last submenu, and
then choose Descriptive Statistics.
b) In the Input range, type in $A$2:$A$101 (or $A$1:$A$101 if the cell containing the
variable name is included, in which case tick the box Labels in first row).
c) Under Output options click the check box for Output Range, and type the starting
cell reference for the output, e.g. $D$2 (again the cell should be blank to avoid
overwriting existing data).
d) Tick the check box for Summary Statistics.
e) Click OK.

Check that the values obtained from the formulas above are consistent with those in the
Summary table.
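
As a cross-check outside Excel, the same statistics can be sketched with Python's standard statistics module. The six marks below are hypothetical and only illustrate the calculations; they are not taken from XM04-06. The sketch assumes that =var and =stdev use the sample (n-1) formulas, which is what statistics.variance and statistics.stdev compute.

```python
import math
import statistics

# Hypothetical marks for illustration; the lab file contains 100 observations.
marks = [55, 62, 62, 70, 81, 90]

maximum = max(marks)
minimum = min(marks)
value_range = maximum - minimum           # same as =B2-B3
mean = statistics.mean(marks)
median = statistics.median(marks)
mode = statistics.mode(marks)
variance = statistics.variance(marks)     # sample variance, divisor n-1
std_dev = statistics.stdev(marks)         # sample standard deviation
coefficient_of_variation = std_dev / mean

# The standard deviation should be the square root of the variance.
assert math.isclose(std_dev, math.sqrt(variance))

print(maximum, minimum, value_range, mean, median, mode)
print(round(variance, 2), round(std_dev, 2), round(coefficient_of_variation, 4))
```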

Percentiles

To find the kth percentile value, first arrange the data in either ascending or
descending order. Here we arrange the data in ascending order.
1. Move the mouse pointer over the column reference (e.g. A if the series is in column A)
and click. Note that by doing this the whole column should be selected.
2. Right-click the mouse and choose Copy.
3. Move the mouse pointer to cell G1 and click. Right-click the mouse and choose Paste.
The whole series should be copied to column G.
4. From the Data tab, click the Sort A to Z button in the Sort & Filter submenu. The data
are arranged in ascending order.
5. From the Formulas tab, click Insert Function. From Select a Category drop-down list,
choose Statistical, and from Select a function, choose PERCENTILE.INC.
6. In the Array box, type in G2:G101 (the input range). Type 0.1 in the K box, which gives
the 10th percentile value, 38.9; type 0.25 in the K box, which gives the 25th percentile
(i.e. Q1) value, 64.75; type 0.5 in the K box, which gives the 50th percentile (i.e. median
or Q2) value, 81; type 0.75 in the K box, which gives the 75th percentile (i.e. Q3) value,
90; and type 1 in the K box, which gives the 100th percentile value (i.e. the last
observation's value in the sample), 100.
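
Values such as 38.9 and 64.75 are not whole marks because the percentile falls between two observations and is interpolated. A minimal Python sketch of that idea follows, assuming the inclusive rule in which the kth percentile sits at position k*(n-1) in the sorted data (counting from zero) and is linearly interpolated between its neighbours. The function name percentile_inc and the five sample marks are hypothetical.

```python
import math

def percentile_inc(values, k):
    """Inclusive percentile with linear interpolation, for k between 0 and 1."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= k <= 1:
        raise ValueError("k must lie between 0 and 1")
    ordered = sorted(values)
    position = k * (len(ordered) - 1)  # zero-based, possibly fractional
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    # Move the fractional part of the way from the lower to the upper neighbour.
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])

# Hypothetical marks for illustration only.
marks = [40, 50, 60, 70, 80]
for k in (0.1, 0.25, 0.5, 0.75, 1):
    print(k, percentile_inc(marks, k))
```

The function sorts its input itself, so in this sketch the order in which the marks are supplied does not affect the result.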
