
Intermediate Excel for Data Analysis

Muhammad Harunurashid bin Thelatha


16 & 17 June 2022
Workshop Content
01 Excel Overview

02 Basic Statistics : Descriptive

03 Formula and Functions

04 Data Cleaning and Preparing Tools

05 Tables

06 Charts and Visualization Techniques

07 Pivot Tables, Charts And Slicers

08 Power Query

09 Google Data Studio (Introduction)


Excel Overview
Workbook and Worksheet
• A worksheet or sheet is a single page in a file created with an electronic spreadsheet
program such as Microsoft Excel or Google Sheets.
• A workbook is the name given to an Excel file and contains one or more worksheets.
• When we open an electronic spreadsheet program, it loads an empty workbook file
consisting of one or more blank worksheets for us to use.
Worksheet Details
• We use worksheets to store, manipulate, and display data.
• The primary storage unit for data in a worksheet is a rectangular-shaped cell arranged
in a grid pattern in every sheet.
• Individual cells of data are identified and organized using the vertical column letters
and horizontal row numbers of a worksheet, which create a cell reference, such as
A1, D15, or Z467.

• Worksheet specifications for current versions of Excel include:


• 1,048,576 rows per worksheet
• 16,384 columns per worksheet
• 17,179,869,184 cells per worksheet
• A limited number of sheets per file based on the amount of memory available on the
computer (in future: cloud based services)

• For Google Sheets:


• 18,278 columns per sheet (up to column ZZZ)
• 10 million cells for all worksheets in a file
• 200 worksheets per spreadsheet file
Official references
Question!
Just imagine you’re an office administrator who’s
been sent a ginormous 3.57GB CSV file
containing 30 million records of individual
consumption data. You’ve been asked to inspect it
and provide a summary.
Usually, you’d open the CSV file directly in Excel and immediately crack on with the rest.
The problem is that when you try this with a file of that size, an error message pops up warning you that the data is too large for the grid.
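The arithmetic behind that error can be checked with a short sketch. It is written in Python rather than Excel purely for illustration; the function name and the column count of 10 are assumptions, while the row and column limits are the ones listed earlier for current versions of Excel.

```python
# Worksheet limits quoted above for current versions of Excel.
EXCEL_MAX_ROWS = 1_048_576
EXCEL_MAX_COLUMNS = 16_384


def fits_in_worksheet(n_rows: int, n_columns: int) -> bool:
    """Return True if a table of this size fits on a single worksheet."""
    return n_rows <= EXCEL_MAX_ROWS and n_columns <= EXCEL_MAX_COLUMNS


total_cells = EXCEL_MAX_ROWS * EXCEL_MAX_COLUMNS
print(total_cells)                          # 17179869184

# 30 million records (the column count here is a made-up example)
print(fits_in_worksheet(30_000_000, 10))    # False
```

The file in the question fails on the row limit alone, regardless of how many columns it has.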
Worksheet Names
• In both Microsoft Excel and Google Sheets, each worksheet has a name.
• By default, the worksheets are named Sheet1, Sheet2, Sheet3, and so on, but we can
change these names.
Workbook Details
• Add worksheets to a workbook using the context menu or the New Sheet/Add Sheet
icon (+) next to the current sheet tabs.
• Delete or hide individual worksheets in a workbook.
• Rename individual worksheets and change worksheet tab colours to make it easier to
identify single sheets in a workbook using the context menu.
• Select the sheet tab at the bottom of the screen to change to another worksheet.
Cells and Range
• A cell range in an Excel file is a collection of selected cells.
• This range is usually rectangular, but it can also consist of separate cells.
• A cell range can be referred to in a formula as well.

• In a spreadsheet, a cell range is defined by the reference of the upper left cell (minimum
value) of the range and the reference of the lower right cell (maximum value) of the range.
• Separate cells can also be added to this selection; the range is then called an
irregular cell range.
• In Excel, the cells at both ends of the range (the minimum and maximum references) are included.
• A mathematical range, by contrast, is the collection of values between a
minimum and a maximum value.
Appearance of a Cell Range
• A symmetrical cell range can appear as below.
• The notation for this range is (A1:C6); from upper left cell A1 to bottom right cell C6.
Irregular Cell Range
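• Irregular cell ranges, like in the image below, also occur. The notation for this range is
(A1:C6,E2,E6,C7,C9). In regional settings that use a semicolon as the list separator, it is written (A1:C6;E2;E6;C7;C9).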
A Cell Range within a Formula
• A cell range can be used inside a formula, for example to calculate the sum of the values
within the selected cells.
• The notation for the sum of all values in cell range (A1:C6) is =SUM(A1:C6).
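To make the range notation concrete, here is a small Python sketch (an illustration only, not something Excel provides) that works out how many rows and columns a reference such as A1:C6 covers. The helper names `column_number` and `range_size` are hypothetical.

```python
import re


def column_number(letters: str) -> int:
    """Convert column letters to a 1-based number (A -> 1, Z -> 26, AA -> 27)."""
    number = 0
    for ch in letters.upper():
        number = number * 26 + (ord(ch) - ord("A") + 1)
    return number


def range_size(reference: str) -> tuple[int, int]:
    """Return (rows, columns) covered by a simple range such as 'A1:C6'."""
    match = re.fullmatch(r"([A-Za-z]+)(\d+):([A-Za-z]+)(\d+)", reference)
    if match is None:
        raise ValueError(f"not a simple range reference: {reference!r}")
    col1, row1, col2, row2 = match.groups()
    rows = abs(int(row2) - int(row1)) + 1
    columns = abs(column_number(col2) - column_number(col1)) + 1
    return rows, columns


rows, columns = range_size("A1:C6")
print(rows, columns, rows * columns)   # 6 3 18
```

Both corner cells are counted, which is why A1:C6 holds 18 cells rather than 10.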
Formatting numbers
• Number formats not only make your spreadsheet easier to read, but they also make it easier to use.
• When we apply a number format, we're telling the spreadsheet exactly what types of values are stored
in a cell.
• For example, the date format tells the spreadsheet that we're entering specific calendar dates.
• This allows the spreadsheet to better understand the data, which can help ensure that data remains
consistent and that the formulas are calculated correctly.
Applying number format
• Just like other types of formatting, like changing the font colour, we will apply number formats by
selecting cells and choosing the desired formatting option.
• There are two main ways to choose a number format:

• Go to the Home tab, click the Number Format drop-down menu in the Number group, and select the
desired format.
Applying number format
• Click one of the quick number-formatting commands below the drop-down menu.
Applying number format
• In this example, we've applied the Currency number format, which adds currency symbols ($) and
displays two decimal places for any numerical values.
Applying number format
• If you select any cells with number formatting, you can see the actual value of the cell in
the formula bar.
• The spreadsheet will use this value for formulas and other calculations.
Percentage formats
• One of the most helpful number formats is the percentage (%) format. It displays values
as percentages, like 20% or 55%.
• This is especially helpful when calculating things like the cost of sales tax or ROI.
• When we type a percent sign (%) after a number, the percentage number format will be
applied to that cell automatically.
Percentage formats
• As we may remember from Mathematics class (20 years before?), a percentage can also be written as a
decimal.
• So 15% is the same thing as 0.15, 7.5% is 0.075, 20% is 0.20 and 55% is 0.55.
• There are many times when percentage formatting will be useful.
• For example, in the images below notice how the sales tax rate is formatted differently for each
spreadsheet (5, 5%, and 0.05):
• The calculation in the spreadsheet on the left didn't work correctly. Without the percentage number format,
our spreadsheet thinks we want to multiply $22.50 by 5, not 5%. And while the spreadsheet on the right
still works without percentage formatting, the spreadsheet in the middle is easier to read.
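The difference between the three tax-rate entries comes down to simple arithmetic, sketched here in Python for illustration. The price of $22.50 is taken from the example above; the variable names are assumptions.

```python
from decimal import Decimal

price = Decimal("22.50")

wrong_tax = price * Decimal("5")      # rate entered as 5: treated as 500%
right_tax = price * Decimal("0.05")   # rate entered as 5% or as 0.05

print(wrong_tax)   # 112.50
print(right_tax)   # 1.1250
```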
Date formats
• Whenever we are working with dates, we will want to use a date format to tell the spreadsheet that we are
referring to specific calendar dates, like 1 April 2014.
• Date formats also allow us to work with a powerful set of date functions that use time and date information
to calculate an answer.

• Spreadsheets don't understand information the same way a person would.
• For instance, if we type October into a cell, the spreadsheet won't
know we're entering a date, so it will treat it like any other text.
• Instead, when we enter a date, we'll need to use a specific format
which our spreadsheet understands, like day/month/year.
• In this example, we'll type 10/12/2014 for 10 December 2014.
• Our spreadsheet will then automatically apply the date number
format for the cell.
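The order of day and month matters. The following Python sketch (an illustration outside Excel) reads the same text under two conventions; which one a spreadsheet applies typically depends on its regional settings, so it is worth checking before typing dates.

```python
from datetime import datetime

text = "10/12/2014"

day_first = datetime.strptime(text, "%d/%m/%Y").date()     # day/month/year
month_first = datetime.strptime(text, "%m/%d/%Y").date()   # month/day/year

print(day_first.isoformat())     # 2014-12-10
print(month_first.isoformat())   # 2014-10-12
```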
Cell formats
• When we format cells in Excel, we change the appearance of a number without changing the number
itself. We can apply a number format (0.8, $0.80, 80%, etc) or other formatting (e.g. alignment, font,
border).

• Enter the value 0.8 into cell B2.

• By default, Excel uses the General format (no specific number format) for numbers. To apply a number
format, use the 'Format Cells' dialog box.
• Select cell B2.
• Right click, and then click Format Cells (or press CTRL + 1).
Cell formats
• The 'Format Cells' dialog box appears.
• For example, select Currency.
Cell formats
• Note: Excel gives you a live preview of how the
number will be formatted (under Sample).
• Click OK.
• Cell B2 still contains the number 0.8. We only
changed the appearance of this number. The
most frequently used formatting commands are
available on the Home tab.

• On the Home tab, in the Number group, click
the percentage symbol to apply a Percentage
format.
• On the Home tab, in the Alignment group,
centre the number.
• On the Home tab, in the Font group, add
outside borders and change the font colour to
blue.
Freeze panes and split boxes
• Select the row right below the row or rows you want to freeze. If you want to freeze columns, select the cell
immediately to the right of the column you want to freeze.
• Go to the View tab.
• Select the Freeze Panes option and click "Freeze Panes." This selection can be found in the same place
where "New Window" and "Arrange All" are located.
• The frozen rows will stay visible when you scroll down.
• If you want to unfreeze the rows, go back to the Freeze Panes command and choose "Unfreeze Panes".
• Note that under the Freeze Panes command, you can also choose "Freeze Top Row," which will freeze the
top row that's visible (and any others above it) or "Freeze First Column," which will keep the leftmost
column visible when you scroll horizontally.
Freeze panes and split boxes
• Besides allowing you to compare different rows in a long spreadsheet, the freeze panes feature lets you
keep important information, such as table headings, always in view.
Freeze panes and split boxes
• To split a cell in Excel, add a new column, change the column widths and merge cells. To split the contents
of a cell into multiple cells, use the Text to Columns wizard, flash fill or formulas.
• Task B starts at 13:00 and requires 2 hours to
complete.
• Suppose task B starts at 13:30. We would like to
split cell B3 and colour the right half.
• Select column C.
• Right click, and then click Insert.
• The default width of a column is 64 pixels. Change
the width of column B and C to 32 pixels.
• Select cell B1 and cell C1.
• On the Home tab, in the Alignment group, click the
down arrow next to Merge & Center and click
Merge Cells.
Freeze panes and split boxes
Printing | a worksheet
• On the File tab, click Print.
• To preview the other pages that will
be printed, click 'Next Page' or
'Previous Page' at the bottom of the
window.
• To print the worksheet, click the big
Print button.
Printing | what to print
• Instead of printing the entire
worksheet, you can also print the
current selection.
• First, select the range of cells you
want to print.
• Next, under Settings, select Print
Selection.
• To print the selection, click the big
Print button.

Note: you can also print the active sheets (first select the
sheets by holding down CTRL and clicking the sheet
tabs) or print the entire workbook.
Use the boxes next to Pages (see previous screenshot)
to only print a few pages of your document.
For example, 2 to 2 only prints the second page.
Printing | multiple copies
• Use the arrows next to the Copies
box.
• If one copy contains multiple pages,
you can switch between Collated
and Uncollated. For example, if you
print 6 copies, Collated prints the
entire first copy, then the entire
second copy, etc. Uncollated prints
6 copies of page 1, 6 copies of
page 2, etc.
Printing | orientation
• You can switch between Portrait
Orientation (more rows but fewer
columns) and Landscape
Orientation (more columns but
fewer rows).
Printing | page margins
• Select one of the predefined
margins (Normal, Wide or Narrow)
from the Margins drop-down list.
• Or click the 'Show Margins' icon at
the bottom right of the window. Now
you can drag the lines to manually
change the page margins.
Printing | scaling
• Select 'Fit Sheet on One Page' from
the Scaling drop-down list.

Note: you can also shrink the printout to one page wide or
one page high.
Click Custom Scaling Options to manually enter a scaling
percentage or to fit the printout to a specific number of
pages wide and tall.
Be careful, Excel doesn't warn you when your printout
becomes unreadable.
Basic Statistics : Descriptive
Count | Minimum | Maximum | Average | Median | Mode
• The COUNT function returns the count of numeric values in the list of supplied arguments.
• COUNT takes multiple arguments in the form value1, value2, value3, etc.
• Arguments can be individual hardcoded values, cell references, or ranges up to a total of 255 arguments.
• All numbers are counted, including negative numbers, percentages, dates, times, fractions, and formulas
that return numbers. Empty cells and text values are ignored.
Count | Minimum | Maximum | Average | Median | Mode
• The COUNT function counts numeric values and ignores text values:

= COUNT(1,2,3) // returns 3
= COUNT(1,"a","b") // returns 1
= COUNT("apple",100,125,150,"orange") // returns 3

• Typically, the COUNT function is used on a range. For example, to count numeric values in the range
A1:A100:

= COUNT(A1:A100) // count numbers in A1:A100

• In the example shown, COUNT is set up to count numbers in the range B5:B15:

= COUNT(B5:B15) // returns 6

Count | Minimum | Maximum | Average | Median | Mode
• COUNT returns 6, since there are 6 numeric values in the range B5:B15.
• Text values and blank cells are ignored. Note that dates and times are numbers, and are therefore included in
the count.
• The COUNTA function works like the COUNT function, but COUNTA includes both numbers and text in the
count.
Count | Minimum | Maximum | Average | Median | Mode
• To count numbers only, use the COUNT function.
• To count numbers and text, use the COUNTA function.
• To count with one condition, use the COUNTIF function
• To count with multiple conditions, use the COUNTIFS function.
• To count empty cells, use the COUNTBLANK function.

Notes
• COUNT can handle up to 255 arguments.
• COUNT ignores the logical values TRUE and FALSE.
• COUNT ignores text values and empty cells.
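The difference between COUNT and COUNTA over a range can be imitated with a short Python sketch. This is only a rough analogue: the list stands in for a column of cells, `None` stands in for an empty cell, and the function names are hypothetical.

```python
from datetime import date


def count_numbers(cells):
    """Rough analogue of COUNT over a range: numeric and date cells only."""
    return sum(
        1
        for value in cells
        if isinstance(value, (int, float, date)) and not isinstance(value, bool)
    )


def count_non_empty(cells):
    """Rough analogue of COUNTA: every cell that is not empty."""
    return sum(1 for value in cells if value is not None)


column = [100, "apple", None, 125.5, date(2022, 6, 16), True, "orange", None]
print(count_numbers(column))     # 3
print(count_non_empty(column))   # 6
```

The logical value and the two text values are skipped by the first function but counted by the second; the two empty cells are skipped by both.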
Count | Minimum | Maximum | Average | Median | Mode
• The MIN function returns the smallest numeric value in the data provided. The MIN function can be used to
return the smallest value from any type of numeric data.
• For example, MIN can return the fastest time in a race, the earliest date, the smallest percentage, the
lowest temperature, or the bottom sales number.

• The MIN function takes multiple arguments in the form number1, number2, number3, etc. up to 255 total.
Arguments can be a hardcoded constant, a cell reference, or a range, in any combination. MIN ignores
empty cells, text values, and the logical values TRUE and FALSE.
Count | Minimum | Maximum | Average | Median | Mode
• The MIN function returns the smallest numeric value in supplied data:
= MIN(12,17,25,11,23) // returns 11

• The MIN function can accept values as separate arguments or in ranges or arrays:
= MIN(5,10)
= MIN(A1,A2,A3)
= MIN(A1:A10)
= MIN(A1:A10,C1:C10)

• MIN ignores logical values and numbers entered as text when they appear in cell references, but evaluates them when they are typed directly as arguments:
= MIN(-1,TRUE) // returns -1 (TRUE is evaluated as 1)
= MIN(-1,TRUE,"3") // returns -1 ("3" is evaluated as 3)

• To return the minimum value with criteria, use the MINIFS function.
• To retrieve the nth smallest value in a data set, use the SMALL function.
• To determine the rank of a number in a set of data, use the RANK function.

Notes
Arguments can be provided as numbers, names, arrays, or references.
MIN accepts up to 255 arguments. If arguments contain no numbers, MIN returns 0.
MIN ignores empty cells, text values, and TRUE and FALSE in references.
MIN will evaluate numbers as text and logical values supplied directly as arguments.
To include logical values in a reference, see the MINA function.
Count | Minimum | Maximum | Average | Median | Mode
• The MAX function returns the largest numeric value in the data provided. The MAX function can be used to
return the largest value from any type of numeric data.
• For example, MAX can return the slowest time in a race, the latest date, the largest percentage, the
highest temperature, or the top sales number.

• The MAX function takes multiple arguments in the form number1, number2, number3, etc. up to 255 total.
Arguments can be a hardcoded constant, a cell reference, or a range, in any combination. MAX ignores
empty cells, text values, and the logical values TRUE and FALSE.
Count | Minimum | Maximum | Average | Median | Mode
• The MAX function returns the largest numeric value in supplied data:
= MAX(12,17,25,11,23) // returns 25

• The MAX function can accept values as separate arguments or in ranges or arrays:
= MAX(5,10)
= MAX(A1,A2,A3)
= MAX(A1:A10)
= MAX(A1:A10,C1:C10)

• MAX ignores logical values and numbers entered as text, unless they are provided as arguments:
= MAX(-1,TRUE) // returns 1
= MAX(-1,TRUE,"3") // returns 3

• To return the maximum value with criteria, use the MAXIFS function.
• To retrieve the nth largest value in a data set, use the LARGE function.
• To determine the rank of a number in a set of data, use the RANK function.

Notes
Arguments can be provided as numbers, names, arrays, or references.
MAX accepts up to 255 arguments. If arguments contain no numbers, MAX returns 0.
MAX ignores empty cells, text values, and TRUE and FALSE in references.
MAX will evaluate numbers as text and logical values supplied directly as arguments.
To include logical values in a reference, see the MAXA function.
Count | Minimum | Maximum | Average | Median | Mode
• The AVERAGE function calculates the average of numbers provided as arguments. To calculate the
average, Excel sums all numeric values and divides by the count of numeric values.

• AVERAGE takes multiple arguments in the form number1, number2, number3, etc. up to 255 total.
Arguments can include numbers, cell references, ranges, arrays, and constants. Empty cells, and cells that
contain text or logical values are ignored. However, zero (0) values are included. You can ignore zero (0)
values with the AVERAGEIFS function, as explained below.

• The AVERAGE function will ignore logical values and numbers entered as text. If you need to include
these values in the average, see the AVERAGEA function.

• If the values given to AVERAGE contain errors, AVERAGE returns an error. You can use the AGGREGATE
function to ignore errors.
Count | Minimum | Maximum | Average | Median | Mode
• A typical way to use the AVERAGE function is to provide a range, as seen below. The formula in F3,
copied down, is:
= AVERAGE(C3:E3)

• At each new row, AVERAGE calculates an average of the quiz scores for each person.

• The AVERAGE function automatically ignores blank cells. In the screen below, notice cell C4 is empty, and
AVERAGE simply ignores it and computes an average with B4 and D4 only:

Basic usage
Count | Minimum | Maximum | Average | Median | Mode
• However, note the zero (0) value in C5 is included in the average, since it is a valid numeric value. To
exclude zero values, use AVERAGEIF or AVERAGEIFS instead.
• In the example below, AVERAGEIF is used to exclude zero values. Like the AVERAGE function,
AVERAGEIF automatically excludes empty cells.

= AVERAGEIF(B3:D3,"<>0") // exclude zero

Blank cells
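The effect of blanks and zeros on an average can be sketched in Python for illustration. The scores are made up, `None` stands in for an empty cell, and the function names are hypothetical; the logic mirrors the behaviour described above rather than Excel's actual implementation.

```python
def average(cells):
    """Average the numeric cells, skipping blanks and text (zeros count)."""
    numbers = [
        v for v in cells
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    if not numbers:
        return None   # nothing numeric to average
    return sum(numbers) / len(numbers)


def average_nonzero(cells):
    """Average the numeric cells that are not zero."""
    return average([v for v in cells if v != 0])


row_with_blank = [90, None, 80]   # hypothetical quiz scores
row_with_zero = [90, 0, 80]

print(average(row_with_blank))          # 85.0
print(round(average(row_with_zero), 2)) # 56.67
print(average_nonzero(row_with_zero))   # 85.0
```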
Count | Minimum | Maximum | Average | Median | Mode
• The numbers provided to AVERAGE can be a mix of references and constants:

= AVERAGE(A1,A2,4) // returns 3 if, for example, A1 contains 2 and A2 contains 3

Mixed arguments
Count | Minimum | Maximum | Average | Median | Mode
• To calculate an average with criteria, use AVERAGEIF or AVERAGEIFS. In the example below,
AVERAGEIFS is used to calculate the average score for Red and Blue groups:

= AVERAGEIFS(C5:C14,D5:D14,"red") // red average


= AVERAGEIFS(C5:C14,D5:D14,"blue") // blue average

Average with criteria


Count | Minimum | Maximum | Average | Median | Mode
• By combining the AVERAGE function with the LARGE function, you can calculate an average of top n
values. In the example below, the formula in column I computes an average of the top 3 quiz scores in
each row:

Average top n
Count | Minimum | Maximum | Average | Median | Mode
• To calculate a weighted average, you'll want to use the SUMPRODUCT function, as shown below:

Weighted average
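A weighted average is the sum of each value multiplied by its weight, divided by the sum of the weights. In Excel this is commonly written as SUMPRODUCT(values, weights)/SUM(weights). The Python sketch below shows the same arithmetic; the scores and weights are hypothetical.

```python
def weighted_average(values, weights):
    """Sum of value * weight, divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


scores = [90, 80, 60]          # hypothetical quiz, midterm, final
weights = [0.25, 0.25, 0.5]    # the final counts for half

print(weighted_average(scores, weights))   # 72.5
print(sum(scores) / len(scores))           # plain average, for comparison
```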
Count | Minimum | Maximum | Average | Median | Mode
• The average function automatically ignores empty cells in a set of data. However, if the range contains no
numeric values, AVERAGE will return a #DIV/0! error. To avoid this problem, you can check the count of
values with the COUNT function and the IF function like this:

= IF(COUNT(range)>0,AVERAGE(range),"") // check count first

• When the count of numeric values is zero, IF returns an empty string (""). When the count is greater than
zero, AVERAGE returns the average.

Notes
AVERAGE automatically ignores empty cells and cells with text values.
AVERAGE includes zero values. Use AVERAGEIF or AVERAGEIFS to ignore zero
values.
Arguments can be supplied as constants, ranges, named ranges, or cell references.
AVERAGE can handle up to 255 total arguments.
To see a quick average without a formula, you can use the status bar.

Average without #DIV/0!


Count | Minimum | Maximum | Average | Median | Mode
• The MEDIAN function returns the median (middle number) in a set of data.
• The calculation performed by MEDIAN varies according to the number of numeric values provided.
• When the number is odd, MEDIAN returns the middle number in the group. When the number is even,
MEDIAN returns the average of the two numbers in the middle.

= MEDIAN(B5:B16) // returns 83.5


= MEDIAN(D5:D16) // returns 80
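The odd and even cases can be checked with Python's standard `statistics.median`, used here only to illustrate the rule; the score lists are made up.

```python
from statistics import median

odd_scores = [70, 85, 90]         # three values: the middle one is returned
even_scores = [70, 80, 87, 95]    # four values: the two middle ones are averaged

print(median(odd_scores))    # 85
print(median(even_scores))   # 83.5
```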
Count | Minimum | Maximum | Average | Median | Mode
• The Excel MODE.MULT function returns a vertical array of the most frequently occurring number(s) in a
numeric data set.
• The mode is the most frequently occurring number in a set of data. When there is just one mode in a set of
data, MODE.MULT will return a single result.
• If there is more than one mode in supplied data, MODE.MULT will return more than one result. If there are
no modes, MODE.MULT will return #N/A.
Count | Minimum | Maximum | Average | Median | Mode
• If there are no duplicate numbers, the MODE.MULT function returns the #N/A error:
= MODE.MULT(7,9,6,5,3,1,0) // returns #N/A

• If there is more than one mode in a set of data, MODE.MULT will return more than one result:
= MODE.MULT(1,3,3,5,5,7,7,8) // returns {3;5;7}

• The MODE.MULT function returns results in a vertical array. To return a horizontal array, add the
TRANSPOSE function:
= TRANSPOSE(MODE.MULT(range))
Count | Minimum | Maximum | Average | Median | Mode
• MODE.SNGL differs from MODE.MULT: MODE.SNGL returns only one mode even when several values tie,
whereas MODE.MULT returns an array of all the modes.
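The idea of "all modes, or nothing when no value repeats" can be sketched in Python. The function name `modes` is hypothetical, and an empty list stands in for the #N/A error.

```python
from collections import Counter


def modes(values):
    """Return every value tied for the highest count, or [] if nothing repeats."""
    counts = Counter(values)
    highest = max(counts.values(), default=0)
    if highest < 2:
        return []   # no duplicates, so there is no mode
    return [value for value, count in counts.items() if count == highest]


print(modes([1, 3, 3, 5, 5, 7, 7, 8]))   # [3, 5, 7]
print(modes([7, 9, 6, 5, 3, 1, 0]))      # []
```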
Range | Interquartile Range | Standard Deviation | Variance | Coefficient of Variation |
Min/Max Ratio | Correlation

• Range = maximum value – minimum value

• So if you have a set of data such as 4, 2, 5, 8, 12, 15, the range is the highest number (15) minus the
lowest number (2). In this case:

• Range = 15 – 2 = 13

another example
Range | Interquartile Range | Standard Deviation | Variance | Coefficient of Variation |
Min/Max Ratio | Correlation

• There is no direct function to calculate the IQR in Excel, but it is relatively straightforward to do. The
easiest approach is to first calculate Q1 and Q3 and then use these to determine the IQR.

• To calculate Q1 in Excel, click on an empty cell and type '=QUARTILE(array, 1)'. Replace the 'array' part
with the data of interest by clicking and dragging over the cells containing all of the data. The '1' in the
formula tells Excel to return Q1 of the data.
Range | Interquartile Range | Standard Deviation | Variance | Coefficient of Variation |
Min/Max Ratio | Correlation

• Next, we need to calculate Q3. To calculate Q3 in Excel, simply find an empty cell and enter the formula
'=QUARTILE(array, 3)', again replacing the 'array' part with the cells that contain the data of interest.
Range | Interquartile Range | Standard Deviation | Variance | Coefficient of Variation |
Min/Max Ratio | Correlation

• Finally, to calculate the IQR, simply subtract the Q1 value away from the Q3 value. In the example above,
the formula used would be ‘=D3-D2‘.
Range | Interquartile Range | Standard Deviation | Variance | Coefficient of Variation |
Min/Max Ratio | Correlation

QUARTILE.EXC
• Use the median to separate the dataset into two halves.
• Calculate Q1 as the median value in the lower half and Q3 as the median value in the upper half. Be sure
to exclude the median of the dataset when calculating Q1 and Q3.

QUARTILE.INC
• Use the median to separate the dataset into two halves.
• Calculate Q1 as the median value in the lower half and Q3 as the median value in the upper half. Be sure
to include the median of the dataset when calculating Q1 and Q3.

QUARTILE
• This function calculates the quartiles of a dataset as well. It returns exactly the same value as the
QUARTILE.INC function.

QUARTILE.EXC vs. QUARTILE.INC in Excel: What’s the Difference?
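To see that the two conventions give different quartiles, and therefore different IQRs, here is a Python sketch using `statistics.quantiles`. Its "inclusive" and "exclusive" methods are assumed here to correspond to the QUARTILE.INC and QUARTILE.EXC conventions respectively; the data set is the one used in the range example above.

```python
from statistics import quantiles

data = [4, 2, 5, 8, 12, 15]

# n=4 splits the data into quartiles and returns [Q1, Q2, Q3]
q1_inc, _, q3_inc = quantiles(data, n=4, method="inclusive")
q1_exc, _, q3_exc = quantiles(data, n=4, method="exclusive")

print(q1_inc, q3_inc, q3_inc - q1_inc)   # 4.25 11.0 6.75
print(q1_exc, q3_exc, q3_exc - q1_exc)   # 3.5 12.75 9.25
```

Whichever convention is chosen, it should be used consistently for both Q1 and Q3.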


Range | Interquartile Range | Standard Deviation | Variance | Coefficient of Variation |
Min/Max Ratio | Correlation

• The STDEV function calculates the standard deviation for a sample set of data.
• Standard deviation measures how much variance there is in a set of numbers compared to the average
(mean) of the numbers.
• The STDEV function is meant to estimate standard deviation in a sample. If data represents an entire
population, use the STDEVP function.
Range | Interquartile Range | Standard Deviation | Variance | Coefficient of Variation |
Min/Max Ratio | Correlation

Name       Data set      Text and logicals
STDEV      Sample        Ignored
STDEVP     Population    Ignored
STDEV.S    Sample        Ignored
STDEV.P    Population    Ignored
STDEVA     Sample        Evaluated
STDEVPA    Population    Evaluated

Notes
STDEV calculates standard deviation using the "n-1" method.
STDEV assumes data is a sample only. When data represents an entire population, use
STDEVP or STDEV.P.
Numbers are supplied as arguments. They can be supplied as actual numbers, ranges, arrays,
or references that contain numbers.
STDEV ignores text and logical values that occur in references, but evaluates text and logicals
hardcoded as function arguments.
To evaluate logical values and/or text in the calculation, use the STDEVA function.
Range | Interquartile Range | Standard Deviation | Variance | Coefficient of Variation |
Min/Max Ratio | Correlation

• Variance is the square of standard deviation.
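The sample and population versions, and the link between variance and standard deviation, can be illustrated with Python's `statistics` module. The data values are hypothetical; the pairing of each Python function with an Excel function follows the "n-1" versus "n" distinction described above.

```python
from statistics import pstdev, pvariance, stdev, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical values, mean = 5

sample_sd = stdev(data)        # divides by n - 1, like STDEV / STDEV.S
population_sd = pstdev(data)   # divides by n, like STDEVP / STDEV.P

print(round(sample_sd, 3))     # 2.138
print(population_sd)           # 2.0

# Variance is the square of the matching standard deviation
print(abs(sample_sd ** 2 - variance(data)) < 1e-9)        # True
print(abs(population_sd ** 2 - pvariance(data)) < 1e-9)   # True
```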


Range | Interquartile Range | Standard Deviation | Variance | Coefficient of Variation |
Min/Max Ratio | Correlation

• Coefficient of variation is a measure of relative variability of data with respect to the mean.
• It represents a ratio of the standard deviation to the mean, and can be a useful way to compare data series
when means are different. It is sometimes called relative standard deviation (RSD).

= STDEV.P(B5:F5)/AVERAGE(B5:F5)

• The calculated CV values show variability with respect to the mean more clearly. In the first data series, the CV is
nearly 50%. In the last data series, the CV is only 0.12%.
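The same ratio can be sketched in Python, using the population standard deviation to match STDEV.P in the formula above. The data and the function name are hypothetical.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    average = mean(values)
    if average == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return pstdev(values) / average


data = [2, 4, 4, 4, 5, 5, 7, 9]   # mean 5, population SD 2
cv = coefficient_of_variation(data)
print(f"{cv:.0%}")                # 40%
```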
Range | Interquartile Range | Standard Deviation | Variance | Coefficient of Variation |
Min/Max Ratio | Correlation

• The GCD function returns the greatest common divisor of two or more integers.
• The greatest common divisor is the largest positive integer that divides the numbers without a remainder.
In other words, the largest number that goes into all numbers evenly.
• To return the greatest common divisor of the numbers 60 and 36:
= GCD(60,36) // returns 12

• GCD returns the number 12, since 12 is the largest factor that goes into both numbers evenly. To get the
greatest common divisor of 12, 16, and 48:
= GCD(12,16,48) // returns 4

Notes
GCD evaluates empty cells as zero.
GCD works with integers; decimal values are removed before calculation.
If arguments contain a non-numeric value, GCD returns the #VALUE! error.
To calculate the least common multiple, see the LCM function.
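The two results above can be reproduced with Python's `math.gcd`, shown here only as a cross-check. The wrapper `gcd_many` is hypothetical; it truncates decimals first, following the note that decimal values are removed before calculation.

```python
from functools import reduce
from math import gcd


def gcd_many(*numbers):
    """Greatest common divisor of several numbers, truncating any decimals."""
    return reduce(gcd, (int(n) for n in numbers))


print(gcd_many(60, 36))        # 12
print(gcd_many(12, 16, 48))    # 4
print(gcd_many(12.9, 16, 48))  # 4, because 12.9 is truncated to 12
```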
Range | Interquartile Range | Standard Deviation | Variance | Coefficient of Variation |
Min/Max Ratio | Correlation

• The Excel CORREL function calculates the Pearson Product-Moment Correlation
Coefficient for two sets of values.
• The Pearson Product-Moment Correlation Coefficient of the values in columns A and B
of the spreadsheet can be calculated using the CORREL function, as follows:

= CORREL( A2:A21, B2:B21 )

• This gives the result 0.870035104, which indicates a strong positive correlation
between the two sets of values.

Notes
The CORREL function ignores text values and logical values that are supplied as
part of an array.
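For readers who want to see what the Pearson coefficient computes, here is a minimal Python sketch of the textbook formula (covariance divided by the product of the spreads). The function name and the sample values are assumptions, not the data from the spreadsheet above.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson product-moment correlation of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


hours = [1, 2, 3, 4, 5]
scores_up = [2, 4, 6, 8, 10]      # rises in step with hours
scores_down = [10, 8, 6, 4, 2]    # falls in step with hours

print(pearson(hours, scores_up))     # 1.0
print(pearson(hours, scores_down))   # -1.0
```

Values near 1 or -1 indicate a strong linear relationship; values near 0 indicate a weak one.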
Frequency tables
• The FREQUENCY function counts how often numeric values occur in a set of data and returns a frequency
distribution – a list that shows the frequency (count) of each value in a range at given intervals (bins).
• FREQUENCY returns the distribution as a vertical array of numbers that represent a "count per bin".

• The FREQUENCY function always returns an array with one more item than bins in the bins_array. This is
by design, to catch any values greater than the largest value in the bins_array. The general pattern for
FREQUENCY is:

= FREQUENCY(data,bins)
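The "one more item than bins" behaviour is easier to see with a worked example. The Python sketch below imitates the counting rule under two assumptions: the bins are in ascending order, and each bin is an upper limit that includes its own value. The scores and bins are made up.

```python
from bisect import bisect_left


def frequency(data, bins):
    """Count values per bin; the final count holds values above the last bin."""
    counts = [0] * (len(bins) + 1)
    for value in data:
        # index of the first bin whose upper limit is >= value
        counts[bisect_left(bins, value)] += 1
    return counts


scores = [55, 62, 70, 70, 81, 89, 90, 97]
bins = [60, 70, 80, 90]   # upper limits, ascending

print(frequency(scores, bins))   # [1, 3, 0, 3, 1]
```

Four bins produce five counts: up to 60, 61 to 70, 71 to 80, 81 to 90, and everything above 90.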
Data visualisation: types of graphs
• As a best practice, divide your Excel workbook into three parts.

1) Data – This could be one or more than one worksheet that contain the raw data.
2) Calculations – This is where you do all the calculations. Again, you may have one or more than one sheet
for calculations.
3) Dashboard/ Data visualisation – This is the sheet that has the dashboard. In most of the cases, it is a
single page view that shows analysis/insights backed by data.
Formula and Functions
Mathematical and statistical functions
• Excel provides a number of rounding functions, each with a different behaviour:

▪ To round with standard rules, use the ROUND function.
▪ To round to the nearest multiple, use the MROUND function.
▪ To round down to the nearest specified place, use the ROUNDDOWN function.
▪ To round down to the nearest specified multiple, use the FLOOR function.
▪ To round up to the nearest specified place, use the ROUNDUP function.
▪ To round up to the nearest specified multiple, use the CEILING function.
▪ To round down and return an integer only, use the INT function.
▪ To truncate decimal places, use the TRUNC function.

ROUND
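The differences between these rounding behaviours are easiest to see on concrete numbers. The Python sketch below uses the `decimal` module to imitate rounding half away from zero (as ROUND does), rounding toward zero (as ROUNDDOWN and TRUNC do), and rounding away from zero (as ROUNDUP does). The helper name is hypothetical and the mapping to Excel functions is an approximation for illustration.

```python
import math
from decimal import ROUND_DOWN, ROUND_HALF_UP, ROUND_UP, Decimal


def round_decimal(value, digits, rounding):
    """Round a number given as text to `digits` decimal places."""
    quantum = Decimal(1).scaleb(-digits)   # digits=2 gives 0.01
    return Decimal(str(value)).quantize(quantum, rounding=rounding)


print(round_decimal("2.675", 2, ROUND_HALF_UP))   # 2.68  standard rounding
print(round_decimal("2.671", 2, ROUND_UP))        # 2.68  always away from zero
print(round_decimal("2.679", 2, ROUND_DOWN))      # 2.67  always toward zero
print(math.floor(-2.5))                           # -3    down to an integer
print(math.trunc(-2.5))                           # -2    decimals dropped
```

The last two lines show why rounding down to an integer and truncating differ for negative numbers.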
Mathematical and statistical functions
• The Excel SUM function returns the sum of values supplied. These values can be numbers, cell references,
ranges, arrays, and constants, in any combination. SUM can handle up to 255 individual arguments.

▪ SUM automatically ignores empty cells and cells with text values.
▪ If arguments contain errors, SUM will return an error.
▪ The AGGREGATE function can sum while ignoring errors.
▪ SUM can handle up to 255 total arguments.
▪ Arguments can be supplied as constants, ranges, named ranges, or cell references.

SUM
Mathematical and statistical functions
• The Excel SUMIF function returns the sum of cells that meet a single condition. Criteria can be applied to
dates, numbers, and text. The SUMIF function supports logical operators (>,<,<>,=) and wildcards (*,?) for
partial matching.

 SUMIF only supports one condition. Use the SUMIFS
function for multiple criteria.
 When sum_range is omitted, the cells in range will be
summed.
 Text strings in criteria must be enclosed in double
quotes (""), i.e. "apple", ">32", "ja*"
 Cell references in criteria are not enclosed in quotes,
i.e. "<"&A1
 The wildcard characters ? and * can be used in criteria.
A question mark matches any one character and an
asterisk matches any sequence of characters (zero or
more).
 To find a literal question mark or asterisk, use a tilde (~)
in front of the question mark or asterisk (i.e. ~?, ~*).
 SUMIF requires a range; you can't substitute an array.

SUMIF
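• To make the criteria rules more concrete, here is a rough Python sketch of how SUMIF-style criteria (operators, wildcards, and the tilde escape) could be evaluated. The functions wildcard_to_regex, matches, and sumif are hypothetical helpers written for this illustration, the sample data is made up, and the sketch simplifies Excel's behaviour (for example, text comparisons with > or < are not handled).

```python
import re

def wildcard_to_regex(pattern):
    """Translate Excel wildcards: ? = one character, * = any run, ~ escapes ? and *."""
    out, i = [], 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "~" and i + 1 < len(pattern) and pattern[i + 1] in "?*~":
            out.append(re.escape(pattern[i + 1]))  # literal ?, * or ~
            i += 2
            continue
        out.append("." if ch == "?" else ".*" if ch == "*" else re.escape(ch))
        i += 1
    return re.compile("".join(out), re.IGNORECASE | re.DOTALL)

def matches(value, criteria):
    """Return True if value meets a criteria such as ">32", "apple" or "ja*"."""
    m = re.match(r"(<>|>=|<=|>|<|=)?(.*)", str(criteria), re.DOTALL)
    op, operand = m.group(1) or "=", m.group(2)
    try:
        target = float(operand)
    except ValueError:
        target = None
    if target is not None:  # numeric comparison
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return op == "<>"
        return {"=": value == target, "<>": value != target, ">": value > target,
                "<": value < target, ">=": value >= target, "<=": value <= target}[op]
    # Text comparison with wildcards (only = and <> are handled in this sketch).
    hit = isinstance(value, str) and wildcard_to_regex(operand).fullmatch(value) is not None
    return hit if op == "=" else not hit if op == "<>" else False

def sumif(rng, criteria, sum_range=None):
    # When sum_range is omitted, the cells in rng themselves are summed.
    sum_range = rng if sum_range is None else sum_range
    return sum(s for v, s in zip(rng, sum_range)
               if matches(v, criteria) and isinstance(s, (int, float)))

fruits = ["apple", "jam", "Jackfruit", "pear"]   # made-up sample data
amounts = [10, 20, 30, 40]
print(sumif(fruits, "ja*", amounts))   # 50 (matching is not case-sensitive)
print(sumif(amounts, ">15"))           # 90
```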
Mathematical and statistical functions
• The COUNTIF function in Excel counts the number of cells in a range that match one supplied condition.
Criteria can include logical operators (>,<,<>,=) and wildcards (*,?) for partial matching. Criteria can also be
based on a value from another cell.
 COUNTIF is not case-sensitive. Use the EXACT
function for case-sensitive counts.
 COUNTIF only supports one condition. Use the
COUNTIFS function for multiple criteria.
 Text strings in criteria must be enclosed in double
quotes (""), i.e. "apple", ">32", "ja*"
 Cell references in criteria are not enclosed in quotes,
i.e. "<"&A1
 The wildcard characters ? and * can be used in criteria.
A question mark matches any one character and an
asterisk matches any sequence of characters (zero or
more).
 To match a literal question mark or asterisk, use a tilde
(~) in front of the question mark or asterisk (i.e. ~?, ~*).
 COUNTIF requires a range; you can't substitute an
array.
 COUNTIF returns incorrect results when used to match
strings longer than 255 characters.
 COUNTIF will return a #VALUE! error when referencing
another workbook that is closed.

COUNTIF
Mathematical and statistical functions
• The COUNTIFS function in Excel counts the number of cells in a range that match one or more supplied criteria.
Unlike the older COUNTIF function, COUNTIFS can apply more than one condition at the same time.
Conditions are supplied with range/criteria pairs, and only the first pair is required. For each additional
condition, you must supply another range/criteria pair. Up to 127 range/criteria pairs are allowed.

 Multiple conditions are applied with AND logic,
i.e. condition 1 AND condition 2, etc.
 Each additional range must have the same
number of rows and columns as range1, but
ranges do not need to be adjacent. If you
supply ranges that don't match, you'll get a
#VALUE error.
 Non-numeric criteria needs to be enclosed in
double quotes but numeric criteria does not.
For example: 100, "100", ">32", "jim", or A1
(where A1 contains a number).
 The wildcard characters ? and * can be used
in criteria. A question mark matches any one
character and an asterisk matches any
sequence of characters.
 To find a literal question mark or asterisk, use
a tilde (~) in front of the question mark or asterisk
(i.e. ~?, ~*).

COUNTIFS
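• The AND logic of COUNTIFS can be sketched in Python as below. This is an illustration only: countifs is a hypothetical helper, each criterion is assumed to be written as a small Python function rather than an Excel criteria string, and the sample data is made up.

```python
def countifs(*pairs):
    """Count rows where every (range, test) pair is satisfied (AND logic)."""
    if not pairs or len(pairs) % 2:
        raise ValueError("arguments must come in range/criteria pairs")
    ranges, tests = pairs[0::2], pairs[1::2]
    # Like Excel, every range must be the same size as the first one.
    if any(len(r) != len(ranges[0]) for r in ranges):
        raise ValueError("#VALUE! - ranges differ in size")
    return sum(all(test(value) for test, value in zip(tests, row))
               for row in zip(*ranges))

colours = ["red", "blue", "red", "red"]   # made-up sample data
quantities = [5, 12, 20, 8]

# Roughly =COUNTIFS(colours,"red",quantities,">6")
print(countifs(colours, lambda c: c == "red", quantities, lambda q: q > 6))  # 2
```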
Mathematical and statistical functions
• AVERAGEIF calculates the average of the numbers in a range that meet supplied criteria. Criteria can be
supplied as numbers, strings, or references. For example, valid criteria could be 10, ">10", A1, or "<"&A1.

 Cells in range that contain TRUE or FALSE
are ignored.
 Empty cells are ignored in range and
average_range when calculating averages.
 AVERAGEIF returns #DIV/0! if no cells in
range meet criteria.
 Average_range does not have to be the same
size as range. The top left cell in
average_range is used as the starting point,
and cells that correspond to cells in range are
averaged.
 AVERAGEIF allows the wildcard characters
question mark (?) and asterisk (*), in criteria.
The ? matches any single character and the *
matches any sequence of characters. To find
a literal ? or *, use a tilde (~) before the
character, i.e. ~* and ~?.

AVERAGEIF
Mathematical and statistical functions
• AVERAGEIFS calculates the average of the numbers in a range that meet one or more supplied criteria. Criteria can be
supplied as numbers, strings, or references. For example, valid criteria could be 10, ">10", A1, or "<"&A1.

 If no data matches criteria, AVERAGEIFS
returns the #DIV/0! error.
 Each additional range must have the same
number of rows and columns as the
average_range.
 Non-numeric criteria needs to be enclosed in
double quotes but numeric criteria does not.
For example: 100, "100", ">32", "jim", or A1
(where A1 contains a number).
 The wildcard characters ? and * can be used
in criteria. A question mark matches any one
character and an asterisk matches zero or
more characters of any kind.
 To find a literal question mark or asterisk, use
a tilde (~) in front of the question mark or asterisk
(i.e. ~?, ~*).

AVERAGEIFS
Mathematical and statistical functions
• The Excel SMALL function returns a numeric value based on its position in a list when sorted by value in
ascending order. In other words, SMALL can return the "nth smallest" value (1st smallest value, 2nd
smallest value, 3rd smallest value, etc.) from a set of numeric data.

 SMALL ignores empty cells, text values, and
TRUE and FALSE values.
 If array contains no numeric values, SMALL
returns a #NUM! error.
 To determine the rank of a number in a data
set, use the RANK function.

SMALL
Mathematical and statistical functions
• The Excel LARGE function returns a numeric value based on its position in a list when sorted by value in
descending order. In other words, LARGE can retrieve the "nth largest" value – 1st largest value, 2nd
largest value, 3rd largest value, etc.

 LARGE ignores empty cells, text values, and
TRUE and FALSE values.
 If array contains no numeric values, LARGE
returns a #NUM! error.
 To determine the rank of a number in a data
set, use the RANK function.

LARGE
Mathematical and statistical functions
• The Excel RANK function returns the rank of a numeric value when compared to a list of other numeric
values. RANK can rank values from largest to smallest (i.e. top sales) as well as smallest to largest (i.e.
fastest time).

 The default for order is zero (0). If order is 0
or omitted, number is ranked against the
numbers sorted in descending order: smaller
numbers receive a higher rank value, and the
largest value in a list will be ranked #1.
 If order is 1, number is ranked against the
numbers sorted in ascending order: smaller
numbers receive a lower rank value, and the
smallest value in a list will be ranked #1.
 It is not necessary to sort the values in the list
before using the RANK function.
 In the event of a tie (i.e. the list contains
duplicates) RANK will assign the same rank
value to each set of duplicates.
 Some documentation suggests ref can be a
range or array, but it appears ref must be a
range.

RANK
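• The behaviour of SMALL, LARGE, and RANK described above can be sketched in Python. The helper names and the sample scores are hypothetical, and empty cells are assumed to be represented by None.

```python
def numeric_only(values):
    # Mirror SMALL/LARGE: skip empty cells (None), text, and TRUE/FALSE.
    return [v for v in values
            if isinstance(v, (int, float)) and not isinstance(v, bool)]

def small(values, k):
    nums = sorted(numeric_only(values))
    if not 1 <= k <= len(nums):
        raise ValueError("#NUM!")
    return nums[k - 1]

def large(values, k):
    nums = sorted(numeric_only(values), reverse=True)
    if not 1 <= k <= len(nums):
        raise ValueError("#NUM!")
    return nums[k - 1]

def rank(number, ref, order=0):
    nums = numeric_only(ref)
    if number not in nums:
        raise ValueError("#N/A")
    if order == 0:   # descending: the largest value is ranked 1
        return 1 + sum(v > number for v in nums)
    return 1 + sum(v < number for v in nums)   # ascending: the smallest value is ranked 1

scores = [70, None, 85, "n/a", 85, True, 60]   # made-up sample data
print(small(scores, 1), large(scores, 2))      # 60 85
print(rank(85, scores), rank(70, scores))      # 1 3 (the tie at 85 shares rank 1)
```

The list does not need to be sorted first, and tied values receive the same rank, so the rank after a two-way tie for 1st place is 3.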
Mathematical and statistical functions
• The Excel SORT function sorts the contents of a range or array in ascending or descending order. Values
can be sorted by one or more columns. SORT returns a dynamic array of results.

SORT
Mathematical and statistical functions
• The Excel SORTBY function sorts the contents of a range or array based on the values from another range
or array. The range or array used to sort does not need to appear in results.

SORTBY
Mathematical and statistical functions
• Unfortunately, the SORTBY function is only available in newer versions: Excel for Microsoft 365 (Windows and Mac),
Excel for the web, Excel 2021 (Windows and Mac), and the Excel apps for iPad, iPhone, and Android tablets
and phones.

SORTBY
Understanding functions from logical reasoning
KSSM Mathematics Form 4
Chapter 3: Logical Reasoning
Lookup and reference function
• VLOOKUP is an Excel function to look up data in a table organized vertically. VLOOKUP supports
approximate and exact matching, and wildcards (* ?) for partial matches. Lookup values must appear in the
first column of the table passed into VLOOKUP.

= VLOOKUP (lookup_value, table_array, column_index_num, [range_lookup])

lookup_value - The value to look for in the first column of a table.
table_array - The table from which to retrieve a value.
column_index_num - The column in the table from which to retrieve a value.
range_lookup - [optional] TRUE = approximate match (default). FALSE = exact match.

VLOOKUP
Lookup and reference function
• The purpose of VLOOKUP is to look up information in a table like this:

• With the Order number in column B as the lookup_value, VLOOKUP can get the Cust. ID, Amount, Name,
and State for any order. For example, to get the name for order 1004, the formula is:

= VLOOKUP(1004,B5:F9,4,FALSE) // returns "Sue Martin"

VLOOKUP
Lookup and reference function
• When you use VLOOKUP, imagine that every column in the table_array is numbered, starting from the left.
To get a value from a given column, provide the number for column_index_num. For example, the column
index to retrieve the first name below is 2:

VLOOKUP
Lookup and reference function
• VLOOKUP can only look to the right. In other words, you can only retrieve data to the right of the column
that holds lookup values:

VLOOKUP
Lookup and reference function
• EXACT match vs. APPROXIMATE match

VLOOKUP
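• The difference between exact and approximate matching can be sketched in Python. This is a simplified illustration: vlookup is a hypothetical helper, both tables are made-up sample data, and approximate matching assumes the first column is sorted in ascending order.

```python
from bisect import bisect_right

def vlookup(lookup_value, table, col_index, range_lookup=True):
    """Look up a value in the first column and return the value col_index columns across."""
    keys = [row[0] for row in table]
    if range_lookup:
        # Approximate match: last row whose key is <= lookup_value.
        pos = bisect_right(keys, lookup_value) - 1
        if pos < 0:
            raise LookupError("#N/A")
        return table[pos][col_index - 1]
    for row in table:  # exact match: first row whose key equals lookup_value
        if row[0] == lookup_value:
            return row[col_index - 1]
    raise LookupError("#N/A")

# Made-up order table: order number, customer ID, amount, name, state
orders = [
    (1001, "C-11", 120.0, "Ali Hassan", "KL"),
    (1002, "C-27", 75.5, "Mei Ling", "PG"),
    (1004, "C-03", 210.0, "Sue Martin", "JH"),
]
print(vlookup(1004, orders, 4, False))   # Sue Martin

# Made-up grade bands for an approximate match
bands = [(0, "F"), (50, "C"), (70, "B"), (85, "A")]
print(vlookup(72, bands, 2))             # B
```

An exact match for 1003 would raise the #N/A error, while an approximate match would quietly fall back to the row for 1002. This is why approximate matching suits banded tables but not ID lookups.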
Lookup and reference function
• The Excel HLOOKUP function finds and retrieves a value from data in a horizontal table. The "H" in
HLOOKUP stands for "horizontal", and lookup values must appear in the first row of the table, moving
horizontally to the right. HLOOKUP supports approximate and exact matching, and wildcards (* ?) for
finding partial matches.

= HLOOKUP (lookup_value, table_array, row_index, [range_lookup])

HLOOKUP
Lookup and reference function
• In the example shown, the goal is to look
up the correct Level and Bonus for the
sales amounts in C5:C13. The lookup
table is in H4:J6, which is the named
range "table". Note this is an approximate
match scenario. For each amount in
C5:C13, the goal is to find the best match,
not an exact match.

HLOOKUP
Lookup and reference function
• In the screen below, the goal is to look up the correct level for a numeric rating 1-4. In cell D5, the
HLOOKUP formula is:

= HLOOKUP(C5,table,2,FALSE) // exact match

Range_lookup controls whether the lookup value needs to match
exactly or not. The default is TRUE = allow non-exact match.
Set range_lookup to FALSE to require an exact match.
If range_lookup is omitted or TRUE, and no exact match is found,
HLOOKUP will match the nearest value in the table that is still less
than the lookup value. However, HLOOKUP will still match an exact
value if one exists.
If range_lookup is TRUE , lookup values in the first row of the table
must be sorted in ascending order. Otherwise, HLOOKUP may return
an incorrect or unexpected value.
If range_lookup is FALSE (exact match), values in the first row of the
lookup table do not need to be sorted.

HLOOKUP
Lookup and reference function
• The Excel INDEX function returns the value at a given location in a range or array. You can use INDEX to
retrieve individual values, or entire rows and columns. The MATCH function is often used together with
INDEX to provide row and column numbers.

• In the example shown, the goal is to get the diameter of the planet Jupiter. Because Jupiter is the fifth
planet in the list, and Diameter is the third column, the formula in G7 is:

= INDEX(B5:E13,5,3) // diameter of Jupiter

INDEX
Lookup and reference function
• MATCH is an Excel function used to locate the position of a lookup value in a row, column, or table.
MATCH supports approximate and exact matching, and wildcards (* ?) for partial matches. Often, MATCH
is combined with the INDEX function to retrieve a value at a matched position.

• The MATCH function is used to determine the position of a value in a range or array. For example, in the
screenshot above, the formula in cell E6 is configured to get the position of the value in cell D6.
• The MATCH function returns 5 because the lookup value ("peach") is in the 5th position in the range
B6:B14:

= MATCH(D6,B6:B14,0) // returns 5

MATCH
Lookup and reference function
• The MATCH function is commonly used together with the INDEX function. The resulting formula is called
"INDEX and MATCH". For example, in the screen below, INDEX and MATCH are used to return the cost of
a code entered in cell F4. The formula in F5 is:

= INDEX(C5:C12,MATCH(F4,B5:B12,0)) // returns 150


= INDEX(C5:C12,7)
= 150

INDEX and MATCH
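• The two-step logic of INDEX and MATCH can be sketched in Python. The helpers match and index are hypothetical, the codes and costs are made-up sample data, and only exact matching (match_type 0) is covered. Unlike Excel, this sketch compares text case-sensitively.

```python
def match(lookup_value, lookup_array):
    """MATCH with match_type 0: 1-based position of the first exact match."""
    for position, value in enumerate(lookup_array, start=1):
        if value == lookup_value:
            return position
    raise LookupError("#N/A")

def index(array, row_num):
    """INDEX on a single column: value at a 1-based position."""
    return array[row_num - 1]

codes = ["A10", "B20", "C30", "D40", "E50", "F60", "G70", "H80"]  # made-up data
costs = [90, 100, 110, 120, 130, 140, 150, 160]

position = match("G70", codes)   # step 1: find the position
print(position)                  # 7
print(index(costs, position))    # 150 (step 2: retrieve the value at that position)
```

Because the lookup column and the result column are passed separately, INDEX and MATCH can also retrieve values to the left of the lookup column, which VLOOKUP cannot do.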


Logical functions
• The IF function runs a logical test and returns one value for a TRUE result, and another for a FALSE result.
For example, to "pass" scores above 70: =IF(C6>70,"Pass","Fail"). More than one condition can be tested
by nesting IF functions. The IF function can be combined with logical functions like AND and OR to extend
the logical test.

IF
Logical functions
• The Excel IFERROR function returns a custom result when a formula generates an error, and a standard
result when no error is detected. IFERROR is an elegant way to trap and manage errors without using
more complicated nested IF statements.

• The IFERROR function is a useful function, but it is a blunt instrument since it will trap many kinds of errors.
For example, if there's a typo in a formula, Excel may return the #NAME? error, but IFERROR will suppress
the error and return the alternative result. This can obscure an important problem. In many cases, it makes
more sense to use the IFNA function, which only traps the #N/A error.

Excel provides a number of error-related functions, each with a different behaviour:

 The ISERR function returns TRUE for any error type except the #N/A error.
 The ISERROR function returns TRUE for any error.
 The ISNA function returns TRUE for #N/A errors only.
 The ERROR.TYPE function returns the numeric code for a given error.
 The IFERROR function traps errors and provides an alternative result.
 The IFNA function traps #N/A errors and provides an alternative result.

IFERROR
Logical functions
• The Excel AND function is a logical function used to require more than one condition at the same time. AND
returns either TRUE or FALSE. To test if a number in A1 is greater than zero and less than 10, use
=AND(A1>0,A1<10). The AND function can be used as the logical test inside the IF function to avoid extra
nested IFs, and can be combined with the OR function.

AND
Logical functions
• To test if the value in A1 is greater than 0 and less than 5, you can use AND like this:

= AND(A1>0,A1<5)

• You can embed the AND function inside the IF function. Using the above example, you can supply AND as
the logical_test for the IF function like so:

= IF(AND(A1>0,A1<5), "Approved", "Denied")

• This formula will return "Approved" only if the value in A1 is greater than 0 and less than 5.

• You can combine the AND function with the OR function. The formula below returns TRUE when A1 > 100
and B1 is "complete" or "pending":

= AND(A1>100,OR(B1="complete",B1="pending"))

AND
Logical functions
• The Excel OR function returns TRUE if any given argument evaluates to TRUE, and returns FALSE if all
supplied arguments evaluate to FALSE. For example, to test A1 for either "x" or "y", use
=OR(A1="x",A1="y"). The OR function can be used as the logical test inside the IF function to avoid nested
IFs, and can be combined with the AND function.
• For example, to test if the value in A1 OR the value in B1 is greater than 75, use the following formula:

= OR(A1>75,B1>75)

• OR can be used to extend the functionality of
functions like the IF function. Using the above
example, you can supply OR as the logical_test
for an IF function like so:

= IF(OR(A1>75,B1>75), "Pass", "Fail")

• This formula will return "Pass" if the value in A1
is greater than 75 OR the value in B1 is greater
than 75.
OR
Date functions
• The YEAR function extracts the year from a given date as a 4-digit number. For example:

= YEAR("23-Aug-2012") // returns 2012


= YEAR("11-May-2019") // returns 2019

• You can use the YEAR function to extract a year number from a date into a cell, or to feed a year
number into another function like the DATE function:

= DATE(YEAR(A1),1,1) // first of same year

YEAR
Date functions
• The Excel MONTH function extracts the month from a given date as a number between 1 and 12. You can use
the MONTH function to extract a month number from a date into a cell, or to feed a month number into
another function like the DATE function.

• To use the MONTH function, supply a date:

= MONTH("23-Aug-2012") // returns 8
= MONTH("11-May-2019") // returns 5

• With the date "3 October 1975" in cell B5, MONTH returns 10:

• The formula below extracts the month from the date in cell B5
and uses the TODAY and DATE functions to create a
date on the first day of the same month in the current
year.

= DATE(YEAR(TODAY()),MONTH(B5),1) // same month current year

MONTH


Date functions
• The Excel DAY function returns the day of the month as a number between 1 and 31 from a given date. You
can use the DAY function to extract a day number from a date into a cell. You can also use the DAY
function to extract and feed a day value into another function, like the DATE function.

• For example, with the date January 15, 2019 in cell A1:

= DAY(A1) // returns 15

• You can use the DAY function to extract a day
number from a date into a cell. You can also use
the DAY function to extract and feed a day value
into another function, like the DATE function. For
example, to change the year of a date in cell A1 to
2020, but leave the month and day as-is, you can
use a formula like this:

= DATE(2020,MONTH(A1),DAY(A1))

DAY
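• For comparison, the YEAR, MONTH, DAY, DATE, and DAYS ideas map closely onto Python's datetime module. The sketch below assumes 15 January 2019 as the example date; the variable names are hypothetical.

```python
from datetime import date

a1 = date(2019, 1, 15)                  # stands in for the date in a cell

print(a1.year, a1.month, a1.day)        # 2019 1 15

first_of_year = date(a1.year, 1, 1)     # like =DATE(YEAR(A1),1,1)
same_day_2020 = a1.replace(year=2020)   # like =DATE(2020,MONTH(A1),DAY(A1))

today = date.today()
same_month_this_year = date(today.year, a1.month, 1)  # like =DATE(YEAR(TODAY()),MONTH(A1),1)

days_between = (date(2019, 3, 1) - a1).days           # like =DAYS(end_date,start_date)
print(days_between)                     # 45
```

One difference worth knowing: replace() raises an error for impossible dates (for example 29 February in a non-leap year), whereas Excel's DATE function rolls over to the next valid date.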
Date functions
• The Excel DAYS function returns the number of days between two dates. With a start date in A1 and end
date in B1, =DAYS(B1,A1) will return the days between the two dates.
= DAYS (end_date, start_date)

DAYS
Date functions
• The Excel DATE function creates a valid date from individual year, month, and day components. The DATE
function is useful for assembling dates that need to change dynamically based on other values in a
worksheet.
= DATE (year, month, day)

DATE
Date functions
• The Excel WEEKDAY function takes a date and returns a number between 1-7 representing the day of
week. By default, WEEKDAY returns 1 for Sunday and 7 for Saturday, but this is configurable. You can use
the WEEKDAY function inside other formulas to check the day of week.
= WEEKDAY (serial_number, [return_type])

WEEKDAY
Date functions
• Serial_number should be a valid Excel date in serial number format. Return_type is an optional numeric
code that controls which day of the week is considered the first day. By default, WEEKDAY returns 1 for
Sunday and 7 for Saturday, as seen in the table below:

Result Meaning
1 Sunday
2 Monday
3 Tuesday
4 Wednesday
5 Thursday
6 Friday
7 Saturday

WEEKDAY
Date functions
• WEEKDAY supports several numbering schemes, controlled by the return_type argument. Return_type is
optional and defaults to 1. The table below shows available return_type codes, the numeric result of each
code, and which day is the first day in the mapping scheme.

Return_type   Numerical result   Day mapping
None          1–7                Sunday – Saturday
1             1–7                Sunday – Saturday
2             1–7                Monday – Sunday
3             0–6                Monday – Sunday
11            1–7                Monday – Sunday
12            1–7                Tuesday – Monday
13            1–7                Wednesday – Tuesday
14            1–7                Thursday – Wednesday
15            1–7                Friday – Thursday
16            1–7                Saturday – Friday
17            1–7                Sunday – Saturday
WEEKDAY
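• The return_type mapping in the table can be sketched in Python. The function weekday below is a hypothetical re-implementation for illustration, built on Python's own date.weekday(), which counts Monday as 0 and Sunday as 6.

```python
from datetime import date

def weekday(d, return_type=1):
    """Mimic WEEKDAY's numbering schemes for a Python date."""
    wd = d.weekday()                 # Python: Monday=0 ... Sunday=6
    if return_type == 3:             # 0-6, Monday first
        return wd
    # First day of the week as a Python weekday number:
    # 1 -> Sunday (6), 2 -> Monday (0), 11..17 -> Monday..Sunday (0..6)
    first_day = {1: 6, 2: 0}.get(return_type, return_type - 11)
    if not 0 <= first_day <= 6:
        raise ValueError("#NUM!")
    return (wd - first_day) % 7 + 1  # 1-7, counted from first_day

d = date(2022, 6, 16)                # a Thursday
print(weekday(d))                    # 5 (Sunday=1, so Thursday=5)
print(weekday(d, 2))                 # 4 (Monday=1)
print(weekday(d, 3))                 # 3 (Monday=0)
print(weekday(d, 14))                # 1 (Thursday is the first day)
```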
Text functions
• The Excel TRIM function strips extra spaces from text, leaving only a single space between words and no
space characters at the start or end of the text.
= TRIM(" A stitch in time. ") // returns "A stitch in time."

• The TRIM function can be used together with the CLEAN function to remove extra space and strip out other
non-printing characters:
= TRIM(CLEAN(A1)) // trim and clean

• TRIM often appears in other more
advanced text formulas. For example,
the formula below will count the
number of words in cell A1:
= LEN(TRIM(A1))-LEN(SUBSTITUTE(A1," ",""))+1

TRIM
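• The word-count formula works because, after trimming, the number of remaining spaces is always one less than the number of words. A small Python sketch of the same arithmetic (excel_trim and word_count are hypothetical helper names):

```python
def excel_trim(text):
    # Drop leading/trailing spaces and collapse runs of spaces to one.
    return " ".join(part for part in text.split(" ") if part)

def word_count(text):
    # LEN(TRIM(A1)) - LEN(SUBSTITUTE(A1," ","")) + 1
    return len(excel_trim(text)) - len(text.replace(" ", "")) + 1

a1 = "  A stitch   in time.  "
print(repr(excel_trim(a1)))   # 'A stitch in time.'
print(word_count(a1))         # 4
```

As with the Excel formula, an empty cell would still be counted as 1 word, so a real worksheet usually needs an extra check for blanks.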
Text functions
• The Excel CONCATENATE function joins values together and returns the result as text.

• For example, to concatenate the value of A1 and B1, separated by a space, you can use CONCATENATE
like this:
= CONCATENATE(A1," ",B1)

• The result of this formula is the same as using the concatenation operator (&) manually like this:
= A1&" "&B1 // manual concatenation

Notes
The ampersand character (&) is an alternative to
CONCATENATE. The result is the same, but the ampersand is
more flexible, and creates formulas that are shorter and
(arguably) easier to read.

CONCATENATE
Text functions
• When concatenating numeric values like dates, times, percentages, etc., number formatting will be lost. For
example, with the date 1-Jul-2021 in cell A1, the date reverts to a serial number during concatenation:
= CONCATENATE("Date: ",A1) // returns "Date: 44378"

• To apply formatting during concatenation, use the TEXT function:

= CONCATENATE("The date is ",TEXT(A1,"mmmm d")) // returns "The date is July 1"

• The CONCATENATE function will not handle ranges:

= CONCATENATE(A1:D1) // does not work

Notes
To concatenate values in ranges, see the CONCAT function. To concatenate many
values with a common delimiter, see the TEXTJOIN function. TEXTJOIN can do
everything CONCAT can do, but can also accept a delimiter and optionally ignore
empty values.

CONCATENATE
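• The serial number 44378 is simply a count of days. The Python sketch below illustrates this, assuming the commonly used epoch of 30 December 1899 (which matches Excel's default date system for dates from 1 March 1900 onward). The helper names are hypothetical.

```python
from datetime import date, timedelta

EXCEL_EPOCH = date(1899, 12, 30)   # assumption: default (1900) date system

def to_serial(d):
    return (d - EXCEL_EPOCH).days

def from_serial(n):
    return EXCEL_EPOCH + timedelta(days=n)

a1 = date(2021, 7, 1)

plain = "Date: " + str(to_serial(a1))         # formatting is lost, like CONCATENATE
formatted = f"The date is {a1:%B} {a1.day}"   # formatting applied, like TEXT(A1,"mmmm d")

print(plain)              # Date: 44378
print(formatted)          # The date is July 1
print(from_serial(44373)) # 2021-06-26
```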
Text functions
• The Excel UPPER function converts a text string to all uppercase letters. Numbers, punctuation, and
spaces are not affected.

• If a numeric value is given to UPPER, number formatting is removed. For example, if cell A1 contains the
date 26 June 2021, date formatting will be lost and UPPER will return a date serial number as text:
= UPPER(A1) // returns "44373"

UPPER
Text functions
• The Excel LOWER function converts a text string to all lowercase letters. Numbers, punctuation, and
spaces are not affected.

• If a numeric value is given to LOWER, number formatting is removed. For example, if cell A1 contains the
date 26 June 2021, date formatting will be lost and LOWER will return a date serial number as text:
= LOWER(A1) // returns "44373"

LOWER
Text functions
• The Excel PROPER function capitalizes each word in a given text string. Numbers, punctuation, and
spaces are not affected.
= PROPER("apple") // returns "Apple"
= PROPER("APPLE") // returns "Apple"

• Numbers or punctuation characters inside a text string are unaffected:


= PROPER("XYY-020-kwp") // returns "Xyy-020-Kwp"

PROPER
Text functions
• The Excel LEN function returns the length of a given text string as the number of characters. LEN will also
count characters in numbers, but number formatting is not included.

• LEN returns the count of characters in a text string:


= LEN("apple") // returns 5

• Space characters are included in the count:


= LEN("apple ") // returns 6

LEN
Text functions
• The Excel LEFT function extracts a given number of characters from the left side of a supplied text string.
For example, LEFT("apple",3) returns "app".

• If num_chars exceeds the string length, LEFT returns the entire string:
= LEFT("apple",100) // returns "apple"

LEFT
Text functions
• The Excel RIGHT function extracts a given number of characters from the right side of a supplied text
string. For example, RIGHT("apple",3) returns "ple".

• If the optional argument num_chars is not provided, it defaults to 1:


= RIGHT("ABC") // returns "C"

RIGHT
Text functions
• The Excel TEXT function returns a number in a given number format, as text. You can use the TEXT
function to embed formatted numbers inside text.

• The TEXT function is especially useful when concatenating a number to a text string with formatting. For
example, with the date 1 July 2021 in cell A1, concatenation causes date formatting to be removed, since
dates are numeric values:
= "The date is "&A1 // returns "The date is 44378"

TEXT
Array Formula
• An array formula is a type of formula that performs an operation on multiple values instead of a single
value. The final result of an array formula can be either one item or an array of items, depending on how
the formula is constructed. For example, the following formula is an array formula that returns the total number of
characters in a range:
{=SUM(LEN(range))}

Notes
An array is a collection of more than one item.
Arrays in Excel appear inside curly brackets. For
example, {1;2;3} or {"red","blue","green"}.

Arrays are also a fundamental data structure in Python-based data analysis.
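• For comparison, a rough Python equivalent of {=SUM(LEN(range))} is shown below. The list of cell values is made-up sample data standing in for the range.

```python
cells = ["apple", "kiwi", "fig"]          # stands in for the range

lengths = [len(text) for text in cells]   # LEN applied to every item: an array of results
total = sum(lengths)                      # SUM collapses the array to a single value

print(lengths)   # [5, 4, 3]
print(total)     # 12
```

The intermediate list plays the role of the array that Excel builds internally before SUM reduces it to one result.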


Data Cleaning and Preparing Tools
Data Cleaning
• The Excel CLEAN function takes a text string and returns text that has been "cleaned" of line breaks and
other non-printable characters.

• CLEAN will not remove extra space characters. To remove extra space, use the TRIM function. You can
use CLEAN and TRIM together in one formula like this:
= TRIM(CLEAN(A1)) // clean and remove extra space

• Overall, the top 8 Excel data cleaning techniques to know are as follows:

1) Remove Duplicates
2) Data Parsing from Text to Column
3) Delete All Formatting
4) Spell Check
5) Change Case - Lower/Upper/Proper
6) Highlight Errors
7) TRIM Function
8) Find and Replace
Handling Data Outlier
• An outlier is a value that is significantly higher or lower than most of the values in your data. When using
Excel to analyse data, outliers can skew the results. For example, the mean of a data set might not truly
reflect your values.

• In the image below, the outliers are reasonably easy to spot: the value of 2 assigned to Eric and the value
of 173 assigned to Ryan. In a data set like this, it’s easy enough to spot and deal with those outliers
manually.
Handling Data Outlier
• In a larger set of data, that will not be the case. Being able to identify the outliers and remove them from
statistical calculations is important.

• To find the outliers in a data set, we use the following steps:

1) Calculate the 1st and 3rd quartiles.
2) Evaluate the interquartile range.
3) Return the upper and lower bounds of our data range.
4) Use these bounds to identify the outlying data points.

• A cell range to the left, right, or bottom of the data set will be used to store these values.
Handling Data Outlier
1) To calculate the 1st and 3rd quartiles, we can use the following formulas in cells F2 and F3.

= QUARTILE(B2:B14,1)
= QUARTILE(B2:B14,3)

2) The interquartile range (or IQR) is the middle 50% of values in your data. It is calculated as the
difference between the 1st quartile value and the 3rd quartile value. We’re going to use a simple
formula into cell F4 that subtracts the 1st quartile from the 3rd quartile.

= F3-F2

3) The lower and upper bounds are the smallest and largest values of the data range that we want to
use. Any values smaller or larger than these bound values are the outliers. We’ll calculate the lower
bound limit in cell F5 by multiplying the IQR value by 1.5 and then subtracting it from the Q1 data
point.

= F2-(1.5*F4)
Handling Data Outlier
4) To calculate the upper bound in cell F6, we’ll multiply the IQR by 1.5 again, but this time add it to the
Q3 data point.
= F3+(1.5*F4)

5) Now that we’ve got all our underlying data set up, it’s time to identify our outlying data points, the ones
that are lower than the lower bound value or higher than the upper bound value. We’ll use the OR
function to perform this logical test and show the values that meet these criteria by entering the
following formula into cell C2:
= OR(B2<$F$5,B2>$F$6)

6) We’ll then copy that formula into cells C3:C14.

7) A TRUE value indicates an outlier.
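• The whole outlier procedure can be sketched in a few lines of Python. The 13 values below are made-up sample data (only the extremes 2 and 173 echo the example), and the sketch assumes that statistics.quantiles with method="inclusive" is an acceptable stand-in for Excel's QUARTILE function.

```python
from statistics import quantiles

values = [2, 40, 42, 45, 47, 48, 50, 52, 53, 55, 57, 60, 173]  # stands in for B2:B14

q1, _median, q3 = quantiles(values, n=4, method="inclusive")   # quartiles (F2 and F3)
iqr = q3 - q1                      # interquartile range (F4)
lower = q1 - 1.5 * iqr             # lower bound (F5)
upper = q3 + 1.5 * iqr             # upper bound (F6)

# Same test as =OR(B2<$F$5,B2>$F$6), applied to every value
flags = [v < lower or v > upper for v in values]
outliers = [v for v, is_outlier in zip(values, flags) if is_outlier]

print(q1, q3, iqr)     # 45.0 55.0 10.0
print(lower, upper)    # 30.0 70.0
print(outliers)        # [2, 173]
```

The factor 1.5 is the conventional choice; a larger factor flags fewer points as outliers.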
Auto fill and Flash fill
• Auto fill works a little like Flash Fill, although it’s better suited for tasks that involve a lot of cells. It’s also
better for cells that have an even more obvious pattern, such as numbers.

• Click and drag to select both cells.


Auto fill and Flash fill
• Find the square in the bottom right of the cell and drag it down. You can drag it as far as you’d like.

• Excel recognised the pattern and filled all of the cells below that you told it to.
Auto fill and Flash fill
• But this doesn’t just work for numbers. Auto Fill is great for all sorts of patterns, like days and months.

• Just like in the previous example, if we click the box at the bottom right and drag it down, Excel will fill all of
the cells below using the Auto Fill feature.
Auto fill and Flash fill
• Flash Fill can automatically detect patterns in data and help you quickly fill cells. For example, if we start
with a list of full names (first and last), but then decide that we should have split them into separate
columns, Flash Fill can automate a lot of the work.

• To start, let’s assume that we have a list of names. In the column where you want the first names to go,
type just the first name from the first cell.
Auto fill and Flash fill
• Click the “Data” tab on the ribbon at the top of the Excel window.

• Then, click the “Flash Fill” button in the Data Tools section.
Auto fill and Flash fill
• As you can see, Excel detected the pattern, and Flash Fill filled the rest of our cells in this column with only
the first name.

• From here, now that Excel knows our pattern, it should show you a preview as you type. Try this: In the
next cell over from where you typed in the first name, type in the corresponding last name.
Auto fill and Flash fill
• If we press “Enter” on the keyboard, which moves us to the cell below, Excel now shows all of the last
names in their proper places.

• Press “Enter” to accept, and Flash Fill will automatically complete the rest of the cells in this column.
Find and replace
• To find something, press CTRL+F, or go to Home > Editing > Find & Select > Find (+ Options >>)
Find and replace
• To replace text or numbers, press CTRL+H, or go to Home > Editing > Find & Select > Replace (+ Options >>)
Remove duplicate
• Sometimes duplicate data is useful; sometimes it just makes it harder to understand your data. Use conditional
formatting to find and highlight duplicate data. That way you can review the duplicates and decide if you want
to remove them.

1) Select the cells you want to check for duplicates (Note:
Excel can’t highlight duplicates in the Values area of a
PivotTable report).

2) Click Home > Conditional Formatting > Highlight Cells
Rules > Duplicate Values.
Remove duplicate
3) In the box next to values with, pick the formatting you
want to apply to the duplicate values, and then click
OK.

4) When you use the Remove Duplicates feature, the
duplicate data will be permanently deleted. Before you
delete the duplicates, it’s a good idea to copy the
original data to another worksheet so you don’t
accidentally lose any information.

5) Select the range of cells that has duplicate values you
want to remove (Note: Remove any outlines or
subtotals from your data before trying to remove
duplicates).
Remove duplicate
6) Click Data > Remove Duplicates, and then, under
Columns, check or uncheck the columns where you
want to remove the duplicates.

7) For example, in this worksheet, the January column
has price information we want to keep.
Remove duplicate
8) So, we unchecked January in the Remove Duplicates
box.

9) Click OK.
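The effect of steps 6 to 9 can be sketched in Python. This is only an illustration, not how Excel works internally: the sample rows, the column names other than January, and the helper `remove_duplicates` are all hypothetical, and values are compared exactly. Leaving "January" out of the key columns mirrors unchecking it in the Remove Duplicates box, and the first row of each duplicate group is the one kept.

```python
def remove_duplicates(rows, key_columns):
    """Keep the first row for each distinct combination of key_columns."""
    seen = set()
    kept = []
    for row in rows:
        key = tuple(row[col] for col in key_columns)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


# Hypothetical sample data; "January" holds the price we want to keep.
rows = [
    {"Product": "Pen", "Region": "North", "January": 1.20},
    {"Product": "Pen", "Region": "North", "January": 1.35},
    {"Product": "Ink", "Region": "South", "January": 4.00},
]

# "January" is not a key column, so it is ignored when comparing rows.
deduped = remove_duplicates(rows, ["Product", "Region"])
for row in deduped:
    print(row)
```

Because later rows in a duplicate group are discarded, the January price of the second "Pen" row is lost, which is why copying the original data first is a good idea.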
Text to columns
• Text to Columns lets us split the text entries in one column across two or more separate
columns.

1) Add entries to the first column and select them all.


Text to columns
2) Choose the Data tab atop the ribbon.

3) Select Text to Columns.


Text to columns
4) Ensure Delimited is selected and click Next.

5) Clear each box in the Delimiters section and instead choose Comma and Space.

6) Click Finish.
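As a rough sketch of what the Comma and Space delimiters do, the Python below splits each entry wherever a comma or space occurs. The sample names are hypothetical, and the sketch assumes that a run of consecutive delimiters (such as ", ") is treated as a single break.

```python
import re

# Hypothetical entries in the first column, in "Last, First" form.
entries = ["Hassan, Aminah", "Tan, Mei"]

# Split on any run of commas and/or spaces (assumption: consecutive
# delimiters count as one).
columns = [re.split(r"[, ]+", entry.strip()) for entry in entries]

for parts in columns:
    print(parts)
```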
Sorting and filtering
• For a quick sort or filter, click the arrow below the Sort & Filter icon in the Editing group of the Home ribbon,
or use the Sort A to Z / Sort Z to A / Filter icons in the Sort & Filter group of the Data ribbon.
Custom sorting
• For a more complex sort, go to the Home ribbon, click the arrow below the Sort & Filter icon in the Editing
group and choose Custom Sort. This takes you to the same Sort dialog box you get with the Sort icon in the
Sort & Filter group of the Data ribbon.

1) Under Column, choose the first column that you
would like to sort. If you want to sort multiple
columns, click the Add Level button.
2) Under Sort On, choose how you would like to sort.
Note that Excel can sort by cell or font colour in
addition to values.
3) Under Order, choose A to Z (ascending), Z to A
(descending), or Custom List.
4) Click OK to perform the sort.
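A multi-level sort like the one built with Add Level can be sketched in Python with a tuple sort key. The rows and column names here are hypothetical: level 1 sorts Region from A to Z, and level 2 sorts Sales from largest to smallest within each region.

```python
# Hypothetical sample rows.
rows = [
    {"Region": "West", "Sales": 120},
    {"Region": "East", "Sales": 300},
    {"Region": "West", "Sales": 450},
    {"Region": "East", "Sales": 90},
]

# Level 1: Region ascending. Level 2: Sales descending (negated value).
sorted_rows = sorted(rows, key=lambda r: (r["Region"], -r["Sales"]))

for row in sorted_rows:
    print(row["Region"], row["Sales"])
```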
Advanced filtering
• In addition to basic filtering, you may find that adding a specific filter allows you to better analyse your data.
• When data is filtered, only rows that meet the filter criteria will display and other rows will be hidden. With
filtered data, you can then copy, format, or print it without having to sort or move it first. To apply a filter:

1) Go to the Home ribbon, click the arrow below the Sort & Filter icon in the Editing group and choose
Filter.

2) You will notice that all of your column headings now have an arrow next to the heading name.

3) Click on the arrow next to the heading with which you want to filter, and you will see a list
of all the unique values in that column. Check the box next to the criteria you wish to
match and click OK. Click on the arrow next to another heading to further filter the data.
Advanced filtering
4) To clear the filter, choose one of these options:
• Click on the Filter icon next to the heading and choose Clear Filter from “Name of Heading”.
• Go to the Data ribbon and click the Clear icon in the Sort & Filter group.
• Go to the Home ribbon, click the arrow below the Sort & Filter icon in the Editing group and choose Clear.

5) In the Sort & Filter group of the Data ribbon, there is an Advanced icon, which opens the Advanced Filter
dialog box. This dialog box allows you to set specific criteria, copy results to another location, and
extract unique values.
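The filtering ideas above can be sketched in Python. The sample rows and column names are hypothetical. Ticking values in one heading keeps only matching rows, filtering on a second heading narrows the result further, and the last line mimics extracting the unique values of a column.

```python
# Hypothetical sample rows.
rows = [
    {"Region": "East", "Colour": "Blue", "Sales": 300},
    {"Region": "West", "Colour": "Red", "Sales": 120},
    {"Region": "East", "Colour": "Red", "Sales": 90},
    {"Region": "West", "Colour": "Blue", "Sales": 450},
]

# Filter on the first heading: only rows whose Region is ticked stay visible.
east = [r for r in rows if r["Region"] in {"East"}]

# Filter on a second heading to narrow the visible rows further.
east_blue = [r for r in east if r["Colour"] in {"Blue"}]

# Unique values of a column, in order of first appearance.
unique_regions = list(dict.fromkeys(r["Region"] for r in rows))

print(east_blue)
print(unique_regions)
```

The original `rows` list is untouched, just as filtering in Excel hides rows rather than deleting them.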
Tables
Introduction to Excel tables
• To make managing and analysing a group of related data easier, you can turn a range of cells into an Excel
table (previously known as an Excel list).

Note: Excel tables should not be confused with the data tables that are part of a suite
of what-if analysis commands
Introduction to Excel tables
• Header row By default, a table has a header row. Every table column has filtering enabled in the header
row so that you can filter or sort your table data quickly.
Introduction to Excel tables
• Banded rows Alternate shading or banding in rows helps to better distinguish the data.
Introduction to Excel tables
• Calculated columns By entering a formula in one cell in a table column, you can create a calculated
column in which that formula is instantly applied to all other cells in that table column.
Introduction to Excel tables
• Total Row Once you add a total row to a table, Excel gives you an AutoSum drop-down list to select from
functions, such as SUM and AVERAGE.
• When you select one of these options, the table will automatically convert them to a SUBTOTAL function,
which will ignore rows that have been hidden with a filter by default. If you want to include hidden rows in
your calculations, you can change the SUBTOTAL function arguments.
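The difference between a total that skips filtered-out rows and one that includes every row can be illustrated with a small Python sketch. The values and the `visible` flags are hypothetical, and the helper `visible_total` only imitates the effect described above; it is not Excel's SUBTOTAL function.

```python
# Hypothetical column values and a flag per row: False = hidden by a filter.
sales = [120, 300, 450, 90]
visible = [True, False, True, True]


def visible_total(values, shown):
    """Sum only the values whose row is currently visible."""
    return sum(v for v, is_shown in zip(values, shown) if is_shown)


filtered_total = visible_total(sales, visible)  # what the Total Row shows
overall_total = sum(sales)                      # total including hidden rows

print(filtered_total, overall_total)
```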
Introduction to Excel tables
• Sizing handle A sizing handle in the lower-right corner of the table allows you to drag the table to the size
that you want.
Introduction to Excel tables
• Create a table To quickly create a table in Excel, do the following:

1) Select the cell or the range in the data.

2) Select Home > Format as Table.

3) Pick a table style.

4) In the Format as Table dialog box, select the
checkbox next to My table has headers if you
want the first row of the range to be the
header row, and then click OK.
Tables vs Normal range
• What is an Excel range? Any group of selected cells can be considered as an Excel range.
• A range of cells is defined by the reference of the cell that is at the upper left corner and the one at the
lower right corner.
• For example, the range selected in the image below consists of cells A1 to C7, denoted as A1:C7.
Tables vs Normal range
• What is an Excel range? The main purpose of using named ranges is to make references to a group of
cells more intuitive.
• For example, if the name of the following selected range is “Sales”, then you can simply refer to this range
by name in formulas (rather than using cell references like B2:B7):

• To convert a range of cells to a named range, all you need to do is select the range, type the name into the
Name Box and press the return key.
Tables vs Normal range
• Differences Not only do they look different, they are also quite different in the amount of functionality they
offer.

1) Cells in an Excel table need to exist as a contiguous collection of cells. Cells in a range, however, don’t necessarily
need to be contiguous.
2) Every column in an Excel table must have a heading (even if you choose to turn the heading row of the table off).
Named ranges, on the other hand, have no such compulsion.
3) Each column header (if displayed) includes filter arrows by default. These let you filter or sort the table as required. To
filter or sort a range, you need to explicitly turn the filter on.
4) New rows added to the table remain a part of the table. However, new rows added to a range are not implicitly part of
the original range.
5) In tables, you can easily add aggregation functions (like sum, average, etc.) for each column without the need to write
any formulas. With ranges, you need to explicitly add whatever formulas you need to apply.
Tables vs Normal range
• How to convert a table to a range Let’s say you have the following table and you want to convert it to
a range.

1) Select any cell in your table.


2) You should see a new ribbon titled ‘Table Tools’ in the main menu. Select the Design tab under this menu.

3) In the Tools group, select the ‘Convert to Range’ button.


Adding and deleting rows or columns
Columns
• Select any cell within the column, then go to Home > Insert > Insert Sheet Columns or Delete Sheet
Columns.
• Alternatively, right-click the top of the column, and then select Insert or Delete.

Rows
• Select any cell within the row, then go to Home > Insert > Insert Sheet Rows or Delete Sheet Rows.
• Alternatively, right-click the row number, and then select Insert or Delete.
Adding and deleting rows or columns
Formatting options
• When you select a row or column that has formatting applied, that formatting will be transferred to a new
row or column that you insert.
• If you don't want the formatting to be applied, you can select the Insert Options button after you insert, and
choose from one of the options as follows:

• If the Insert Options button isn't visible, then go to File > Options > Advanced > in the Cut, copy and paste
group, check the Show Insert Options buttons option.
Charts and Visualisation Techniques
Conditional formatting
General rule
• Conditional formatting makes it easy to highlight certain values or make particular cells easy to identify.
This changes the appearance of a cell range based on a condition (or criteria).
• You can use conditional formatting to highlight cells that contain values which meet a certain condition. Or
you can format a whole cell range and vary the exact format as the value of each cell varies.
Conditional formatting
Apply conditional formatting to text
• Select the range of cells, the table, or the whole sheet that you want to apply conditional formatting to.
• On the Home tab, click Conditional Formatting.
• Point to Highlight Cells Rules, and then click Text that Contains.
• Type the text that you want to highlight, and then click OK.
Conditional formatting
Create a custom conditional formatting rule
• Select the range of cells, the table, or the whole sheet that you want to apply conditional formatting to.
• On the Home tab, click Conditional Formatting.
• Click New Rule.
• Select a style, for example, 3-Color Scale, select the conditions that you want, and then click OK.
Conditional formatting
Format only unique or duplicate cells
• Select the range of cells, the table, or the whole sheet that you want to apply conditional formatting to.
• On the Home tab, click Conditional Formatting.
• Point to Highlight Cells Rules, and then click Duplicate Values.
• Next to values in the selected range, click unique or duplicate.
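The rule behind the Duplicate Values option can be sketched in Python: a value counts as a duplicate when it appears more than once in the selected range, and as unique otherwise. The sample values are hypothetical, and the sketch compares values exactly.

```python
from collections import Counter

# Hypothetical values in the selected range.
values = ["A-101", "B-202", "A-101", "C-303"]

# Count how often each value occurs, then label every cell.
counts = Counter(values)
flags = ["duplicate" if counts[v] > 1 else "unique" for v in values]

for value, flag in zip(values, flags):
    print(value, flag)
```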
Conditional formatting
Copy conditional formatting to additional cells
• Select the cell that has the conditional formatting that you want to copy.
• On the Home tab, click Format Painter, and then select the cells where you want to copy the conditional
formatting.
Conditional formatting
Clear conditional formatting from a selection
• Select the cells that have the conditional formatting that you want to remove.
• On the Home tab, click Conditional Formatting.
• Point to Clear Rules, and then click the option that you want.
Conditional formatting
Change a conditional formatting rule
• Click in the range that contains the conditional formatting rule that you want to change.
• On the Home tab, click Conditional Formatting.
• Click Manage Rules.
• Select the rule, and then click Edit Rule.
• Make the changes that you want.
• Click OK.
Excel charts
Create a chart
• Select the range A1:D7.
• On the Insert tab, in the Charts group, click the Line symbol.
Excel charts
Change chart type
• Select the chart.
• On the Design tab, in the Type group, click Change Chart Type.
• On the left side, click Column.
• Click OK.
Excel charts
Switch row/ column
• Select the chart.
• On the Design tab, in the Data group, click Switch Row/Column.
Excel charts
Legend position
• Select the chart.
• Click the + button on the right side of the chart, click the arrow next to Legend and click Right.
Excel charts
Data labels
• Select the chart.
• Click a green bar to select the June data series.
• Hold down CTRL and use your arrow keys to select the population of Dolphins in June (tiny green bar).
• Click the + button on the right side of the chart and click the check box next to Data Labels.
Pivot Tables, Charts and Slicers
Introduction
• Power Pivot is an Excel add-in you can use to perform powerful data analysis and create sophisticated data
models. With Power Pivot, you can mash up large volumes of data from various sources, perform
information analysis rapidly, and share insights easily.

• In both Excel and in Power Pivot, you can create a Data Model, a collection of tables with relationships.
The data model you see in a workbook in Excel is the same data model you see in the Power Pivot window.
Any data you import into Excel is available in Power Pivot, and vice versa.

• You can think of a pivot table as a report. However, unlike a static report, a pivot table provides an
interactive view of your data. With very little effort (and no formulas) you can look at the same data from
many different perspectives. You can group data into categories, break down data into years and months,
filter data to include or exclude categories, and even build charts.
Introduction
Task | In Excel | In Power Pivot
Import data from different sources, such as large corporate databases, public data feeds, spreadsheets, and text files on your computer. | Import all data from a data source. | Filter data and rename columns and tables while importing.
Create tables | Tables can be on any worksheet in the workbook. Worksheets can have more than one table. | Tables are organized into individual tabbed pages in the Power Pivot window.
Edit data in a table | Can edit values in individual cells in a table. | Can’t edit individual cells.
Create relationships between tables | In the Relationships dialog box. | In Diagram view or the Create Relationships dialog box.
Create calculations | Use Excel formulas. | Write advanced formulas with the Data Analysis Expressions (DAX) expression language.
Introduction
Task | In Excel | In Power Pivot
Create hierarchies | Not available | Define Hierarchies to use everywhere in a workbook, including Power View.
Create key performance indicators (KPIs) | Not available | Create KPIs to use in PivotTables and Power View reports.
Create perspectives | Not available | Create Perspectives to limit the number of columns and tables your workbook consumers see.
Create PivotTables and PivotCharts | Create PivotTable reports in Excel. Create a PivotChart. | Click the PivotTable button in the Power Pivot window.
Enhance a model for Power View | Create a basic data model. | Make enhancements such as identifying default fields, images, and unique values.
Introduction
Task | In Excel | In Power Pivot
Use Visual Basic for Applications (VBA) | Use VBA in Excel. | VBA is not supported in the Power Pivot window.
Group data | Group in an Excel PivotTable. | Use DAX in calculated columns and calculated fields.

Creating and modifying Pivot Table
• The sample data contains 452 records with 5 fields of information: Date, Colour, Units, Sales, and Region.
This data is perfect for a Power Pivot / Power BI.
Creating and modifying Pivot Table
1) To start off, select any cell in the data and click Pivot Table on the Insert tab of the ribbon. Excel will
display the Create Pivot Table window. Notice the data range is already filled in. The default location for a
new pivot table is New Worksheet.
Creating and modifying Pivot Table
2) Override the default location and enter H4 to place the pivot table on the current worksheet.
Creating and modifying Pivot Table
3) Click OK, and Excel builds an empty pivot table starting in cell H4.
Creating and modifying Pivot Table
3) Excel also displays the PivotTable Fields pane, which is empty at this point. Note all five fields are listed,
but unused.
4) To build a pivot table, drag fields into one of the Columns, Rows, or Values areas. The Filters area is used to
apply global filters to a pivot table.
Creating and modifying Pivot Table
Add fields
1) Drag the Sales field to the Values area.
2) Excel calculates a grand total, 26356. This is the sum of all sales values in the entire data set:
Creating and modifying Pivot Table
Add fields
3) Drag the Colour field to the Rows area.
4) Excel breaks out sales by Colour. You can see Blue is the top seller, while Red comes in last.
Creating and modifying Pivot Table
Add fields
5) Notice the Grand Total remains 26356. This makes sense, because we are still reporting on the full set of
data.
6) Let's take a look at the fields pane at this point. You can see Colour is a Row field, and Sales is a Value
field.
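What the pivot table computes at this point can be sketched in Python: Sales is summed for each Colour, and the grand total is the sum over all records. The six records below are made-up stand-ins for the workshop's 452-record sample, so the totals differ from the 26356 shown on the slide.

```python
from collections import defaultdict

# Hypothetical records with the same five fields as the sample data.
records = [
    {"Date": "2016-03-01", "Colour": "Blue", "Units": 3, "Sales": 33, "Region": "West"},
    {"Date": "2016-07-15", "Colour": "Red", "Units": 1, "Sales": 11, "Region": "East"},
    {"Date": "2017-02-10", "Colour": "Blue", "Units": 2, "Sales": 22, "Region": "East"},
    {"Date": "2017-09-05", "Colour": "Green", "Units": 4, "Sales": 44, "Region": "West"},
    {"Date": "2018-01-20", "Colour": "Silver", "Units": 5, "Sales": 60, "Region": "East"},
    {"Date": "2018-06-30", "Colour": "Red", "Units": 2, "Sales": 22, "Region": "West"},
]

# Rows area = Colour, Values area = Sum of Sales.
sales_by_colour = defaultdict(int)
for rec in records:
    sales_by_colour[rec["Colour"]] += rec["Sales"]

# The grand total does not depend on how the rows are grouped.
grand_total = sum(sales_by_colour.values())

for colour, total in sales_by_colour.items():
    print(colour, total)
print("Grand Total", grand_total)
```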
Creating and modifying Pivot Table
Second value field
You can add more than one field as a Value field.

1) Drag Units to the Value area to see Sales and Units together.
Creating and modifying Pivot Table
Percent of total
There are different ways to display values. One option is to show values as a percent of total. If you want to
display the same field in different ways, add the field twice.

1) Remove the Units from the Values area


2) Add the Sales field (again) to the Values area.
3) Right-click the second instance and choose "% of grand total".
Creating and modifying Pivot Table
Percent of total
4) The result is a breakdown by colour along with a percent of total
Creating and modifying Pivot Table
Percent of total
5) Here is the Fields pane at this point:
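The "% of grand total" display is each colour's sales divided by the grand total. A minimal Python sketch, using hypothetical per-colour totals rather than the workshop data:

```python
# Hypothetical Sum of Sales per colour.
sales_by_colour = {"Blue": 55, "Red": 33, "Green": 44, "Silver": 60}

grand_total = sum(sales_by_colour.values())

# Second instance of the Sales field, shown as a percent of the grand total.
percent_of_total = {
    colour: 100 * sales / grand_total
    for colour, sales in sales_by_colour.items()
}

for colour, sales in sales_by_colour.items():
    print(f"{colour}: {sales} ({percent_of_total[colour]:.1f}%)")
```

The percentages always add up to 100, whatever the underlying values.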
Creating and modifying Pivot Table
Two-way pivot
Pivot tables can plot data in various two-dimensional arrangements.
1) Drag the Date field out of the columns area
2) Drag Region into the Columns area.
3) Excel builds a two-way pivot table that breaks down sales by colour and region.
Creating and modifying Pivot Table
Two-way pivot
4) Swap Region and Colour (i.e. drag Region to the Rows area and Colour to the Columns area).
5) Excel builds another two-dimensional pivot table:

Notes
• Again notice total sales ($26,356) is the same in all pivot tables above. Each table
presents a different view of the same data, so they all sum to the same total.
• The above example shows how quickly you can build different pivot tables from the
same data. You can create many other kinds of pivot tables, using all kinds of data.
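A two-way pivot can be sketched in Python as a nested dictionary keyed first by the Rows field and then by the Columns field. The records are hypothetical. Swapping the two key arguments gives the second arrangement, and the total stays the same in both.

```python
from collections import defaultdict

# Hypothetical (Colour, Region, Sales) records.
records = [
    ("Blue", "West", 33),
    ("Red", "East", 11),
    ("Blue", "East", 22),
    ("Green", "West", 44),
    ("Silver", "East", 60),
    ("Red", "West", 22),
]


def two_way_pivot(rows, row_index, col_index, value_index=2):
    """Sum the value field for every (row field, column field) pair."""
    table = defaultdict(lambda: defaultdict(int))
    for rec in rows:
        table[rec[row_index]][rec[col_index]] += rec[value_index]
    return table


by_colour_region = two_way_pivot(records, 0, 1)  # Colour in Rows
by_region_colour = two_way_pivot(records, 1, 0)  # Region in Rows (swapped)

total_a = sum(sum(cols.values()) for cols in by_colour_region.values())
total_b = sum(sum(cols.values()) for cols in by_region_colour.values())
print(total_a, total_b)
```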
Formatting data with Pivot Table
Number formatting
Pivot Tables can apply and maintain number formatting automatically to numeric fields. This is a big time-saver
when data changes frequently.

1) Right-click any Sales number and choose Number Format.


Formatting data with Pivot Table
Number formatting
2) Apply Currency formatting with zero decimal places, then click OK:
Formatting data with Pivot Table
Number formatting
3) In the resulting pivot table, all sales values have Currency format applied. Currency format will continue to
be applied to Sales values, even when the pivot table is reconfigured, or new data is added.
Refreshing Pivot Table
Refresh data
Pivot table data needs to be "refreshed" in order to bring in updates. To reinforce how this works, we'll make a
big change to the source data and watch it flow into the pivot table.

1) Select cell F5 and change $11.00 to $2000.


2) Right-click anywhere in the pivot table and select "Refresh".
Refreshing Pivot Table
Refresh data
3) Notice "Red" is now the top selling colour, and automatically moves to the top.
4) Change F5 back to $11.00 and refresh the pivot again.
Sorting and filtering Pivot Table
Sorting by value
1) Right-click any Sales value and choose Sort > Sort Largest to Smallest. Excel then lists the top-selling
colours first.
Sorting and filtering Pivot Table
Filtering data
1) PivotTables are great for taking large datasets and creating detailed summaries. Sometimes, you
want the added flexibility of filtering your data on the fly to show a smaller portion of your
PivotTable.
Grouping Data
Group by date
Pivot tables have a special feature to group dates into units like years, months, and quarters. This grouping
can be customized.
1) Remove the second Sales field (Sales2).
2) Drag the Date field to the Columns area.
3) Right-click a date in the header area and choose "Group".
Grouping Data
Group by date
4) When the Group window appears, group by Years only (deselect Months and Quarters).
Grouping Data
Group by date
5) We now have a pivot table that groups sales by colour and year.
Grouping Data
Group by date
6) Notice there are no sales of Silver in 2016 and 2017. We can guess that Silver was introduced as a new
colour in 2018. Pivot tables often reveal patterns in data that are difficult to see otherwise.
7) Here is the Fields pane at this point.
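Grouping dates by year can be sketched in Python by extracting the year from each date and summing Sales per colour and year. The records are hypothetical and were chosen so that, as on the slide, Silver has no sales before 2018.

```python
from collections import defaultdict
from datetime import date

# Hypothetical (Date, Colour, Sales) records.
records = [
    ("2016-03-01", "Blue", 33),
    ("2016-07-15", "Red", 11),
    ("2017-02-10", "Blue", 22),
    ("2017-09-05", "Green", 44),
    ("2018-01-20", "Silver", 60),
    ("2018-06-30", "Red", 22),
]

# Rows = Colour, Columns = Date grouped by Years only.
sales_by_colour_year = defaultdict(lambda: defaultdict(int))
for iso_date, colour, sales in records:
    year = date.fromisoformat(iso_date).year
    sales_by_colour_year[colour][year] += sales

for colour, by_year in sales_by_colour_year.items():
    print(colour, [by_year.get(y, 0) for y in (2016, 2017, 2018)])
```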
Visualising data with Pivot charts
• Sometimes it's hard to see the big picture when your raw data hasn’t been summarized. Your first instinct
may be to create a PivotTable, but not everyone can look at numbers in a table and quickly see what's
going on. PivotCharts are a great way to add data visualizations to your data.
Visualising data with Pivot charts
Create a PivotChart
1) Select a cell in your table.
2) Select Insert > PivotChart.
3) Select OK.

Create a chart from a PivotTable


4) Select a cell in your table.
5) Select PivotTable Tools > Analyse > PivotChart.
6) Select a chart.
7) Select OK.
Pivoting slicers
• Sometimes using more than one pivot table is a mess. But it’s not a mess if you connect all the pivot tables
with a single slicer.
• Let’s say you are working on a dashboard where you are using multiple pivot tables. If you are able to
connect a slicer to all the pivot tables you can control the entire dashboard with a single slicer.
Pivoting slicers
1) First of all, take two or more pivot tables to connect a slicer.
Pivoting slicers
2) After that, select a cell in any of the pivot tables.
3) From here, go to Analyse → Filter → Insert Slicer.
4) Now from the "Insert Slicer" dialog box, select the column to use as a filter in the slicer and click OK.
Pivoting slicers
5) At this point, you have a slicer in your worksheet which can filter the pivot table in which you insert it.
6) Next, you need to connect it to the second pivot table.
7) From here, select the slicer and go to Analyse → Slicer → Report Connections.
Pivoting slicers
8) You will get a new dialog box with the list of pivot tables that are in your workbook.
9) Finally, tick all the pivot tables and click OK.
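The idea of one slicer driving several pivot tables can be sketched in Python: a single selected value is passed to every summary, so all of them are filtered together. The records, field names, and the helper `summarise` are hypothetical.

```python
from collections import defaultdict

# Hypothetical records.
records = [
    {"Year": 2016, "Colour": "Blue", "Region": "West", "Sales": 33},
    {"Year": 2016, "Colour": "Red", "Region": "East", "Sales": 11},
    {"Year": 2017, "Colour": "Green", "Region": "West", "Sales": 44},
    {"Year": 2018, "Colour": "Silver", "Region": "East", "Sales": 60},
    {"Year": 2018, "Colour": "Red", "Region": "West", "Sales": 22},
]


def summarise(rows, field, region=None):
    """Sum Sales by `field`, keeping only rows that match the slicer value."""
    totals = defaultdict(int)
    for rec in rows:
        if region is None or rec["Region"] == region:
            totals[rec[field]] += rec["Sales"]
    return dict(totals)


# One slicer selection is shared by both "pivot tables".
slicer_region = "West"
by_colour = summarise(records, "Colour", slicer_region)
by_year = summarise(records, "Year", slicer_region)

print(by_colour)
print(by_year)
```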
Power Query
Introduction to Power Query
• Power Query is a tool in Excel that allows you to import data from a wide variety of sources, and
manipulate that data to meet your needs. For example, you can:
1) Import a CSV file on your computer
2) Import a table from a web page
3) Import data from an online database

• In addition to importing data to Excel, Power Query is designed to "transform" data. You can easily do
things like remove columns or rows, rename and reorder columns, split columns, add new columns, fix
date problems, join tables, and much more.

• The beauty of Power Query is that each step is defined separately in a query. When you "refresh" the
data, all steps will be automatically repeated in exactly the same order.

• You can find Power Query tools on the Data tab of the ribbon.
Introduction to Power Query
• Power Query has a vast set of features that are updated frequently. In a nutshell, here are a few key
benefits:

1) Import data of all kinds directly into Excel with a modern and robust tool.
2) Refresh data directly in the Excel workbook. No need to navigate back to a website and download
data manually.
3) Define specific steps to retrieve, clean, and reshape data. These steps will be repeated, in order, each
time data is refreshed.
4) Drop data into an Excel Table to analyse with formulas, pivot tables, and charts.
Connecting and transforming data with Power Query in Excel
1) Click Data > Get Data > From web

2) Enter the URL https://raw.githubusercontent.com/MoH-Malaysia/covid19-public/main/epidemic/cases_state.csv and click OK


Connecting and transforming data with Power Query in Excel
3) Click the Transform Data button to launch Power Query
Connecting and transforming data with Power Query in Excel
4) Power Query will automatically add three steps: Source, Promoted Headers and Changed Type. If you select
a step, you can see what it does.

5) Remove the automatic "Changed Type" step. Hover over the step and click the "X" on its left. We will set
the data types manually below to make the query more resilient.
Connecting and transforming data with Power Query in Excel
6) Control-click to select five columns: date; state; cases_new; cases_import; cases_recovered. Then,
right-click on a selected column and choose "Remove Other Columns".
Connecting and transforming data with Power Query in Excel
7) Check that each column has the correct data type (for example, Date for date and Whole Number for the case counts).
Connecting and transforming data with Power Query in Excel
8) Drag to reorder columns: state; date; cases_new; cases_recovered; cases_import
Connecting and transforming data with Power Query in Excel
9) Rename columns to: state, date, new, recovered, import. Double-click header to rename columns.
Connecting and transforming data with Power Query in Excel
10) Sort data by the "new" column in descending order.
Connecting and transforming data with Power Query in Excel
11) Rename Query to "states".
Connecting and transforming data with Power Query in Excel
12) Click the Close & Load button on the Home tab of the Power Query Editor. The data will end up in an Excel Table called "states".
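The transformation steps above (keep five columns, set data types, reorder, rename, sort) can be sketched as a small Python pipeline. This is an illustration of the same sequence of steps, not Power Query itself: it reads a tiny in-memory sample instead of the real URL, the values are made up, and `extra_column` is a hypothetical stand-in for the columns that get removed.

```python
import csv
import io
from datetime import date

# Hypothetical sample in the same shape as the source file (values made up).
SAMPLE_CSV = """date,state,cases_import,cases_new,cases_recovered,extra_column
2022-06-01,Johor,0,120,95,x
2022-06-01,Selangor,2,480,510,y
2022-06-02,Johor,1,101,130,z
"""


def transform(csv_text):
    """Apply the query steps in order and return the resulting rows."""
    reader = csv.DictReader(io.StringIO(csv_text))  # Source + promoted headers
    rows = []
    for rec in reader:
        # Keep five columns, set types, reorder and rename in one pass.
        rows.append({
            "state": rec["state"],
            "date": date.fromisoformat(rec["date"]),
            "new": int(rec["cases_new"]),
            "recovered": int(rec["cases_recovered"]),
            "import": int(rec["cases_import"]),
        })
    # Sort by the "new" column in descending order.
    rows.sort(key=lambda r: r["new"], reverse=True)
    return rows


states = transform(SAMPLE_CSV)
for row in states:
    print(row)
```

Calling `transform` again on fresh input repeats every step in the same order, which is the behaviour a refresh gives in Excel.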
Refreshing data
1) To fetch the latest data, right-click in the table and select "Refresh".
2) Power Query will pull down a fresh set of source data, run through the steps defined above, and deliver the
result back to Excel.
Descriptive statistics
• Selected formulas.
How to edit the query
1) Click Queries and Connections on the Data tab of the ribbon.
2) Double click the "states" query to edit.
Introduction to data model
• A Data Model allows you to integrate data from multiple tables, effectively building a relational data source
inside an Excel workbook. Within Excel, Data Models are used transparently, providing tabular data used
in PivotTables and PivotCharts.
• A Data Model is visualised as a collection of tables in a Field List, and most of the time, you’ll never even
know it's there.
Building table relationship with data model
1) We will load data from another database into Excel (stored on our local computer).
2) Use Data > Get & Transform Data > Get Data to import data from any number of external data sources,
such as a text file, Excel workbook, website, Microsoft Access, SQL Server, or another relational database
that contains multiple related tables.
Building table relationship with data model
3) Select one item, then click Load. Repeat for each table you need.
Building table relationship with data model
4) Go to Power Pivot > Manage.
5) On the Home tab, select Diagram View.
6) Choose Use First Row as Header.
Building table relationship with data model
7) All of your imported tables will be displayed, and you might want to take some time to resize them
depending on how many fields each one has.
8) Next, drag the primary key field from one table to the next. The following example is the Diagram View of
my table.

Orders | Region > People > Region
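A relationship between two tables can be sketched in Python as a lookup on the shared key field. The sketch assumes, based on the diagram note above, that Orders and People are related through their Region fields. The rows, the Person and Sales fields, and the names are hypothetical.

```python
from collections import defaultdict

# Hypothetical Orders table (many rows per region).
orders = [
    {"Order ID": "A-1", "Region": "West", "Sales": 100},
    {"Order ID": "A-2", "Region": "East", "Sales": 250},
    {"Order ID": "A-3", "Region": "West", "Sales": 75},
]

# Hypothetical People table (one row per region).
people = [
    {"Region": "West", "Person": "Aina"},
    {"Region": "East", "Person": "Ben"},
]

# The relationship: Orders.Region -> People.Region.
person_by_region = {p["Region"]: p["Person"] for p in people}

# Summarise Orders using a field that only exists in People.
sales_by_person = defaultdict(int)
for order in orders:
    sales_by_person[person_by_region[order["Region"]]] += order["Sales"]

print(dict(sales_by_person))
```

This is what lets a PivotTable built on the Data Model combine fields from both tables without a lookup formula.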


Advanced Excel analytics
Google Data Studio (Introduction)
Why learn Google Data Studio?
• Get more value out of digital analytics data.
• Cut down time on reports & streamline reporting.
• Improve your data presentation skills & influence decision-makers.
• Increase public service productivity.
• You’ve spent time visualising data in other systems, e.g. Excel; Tableau; Power BI; R
Shiny; Python Dash.
Why learn Google Data Studio?
• Google Data Studio is a powerful visual reporting tool that allows you to transform a
sea of raw data into engaging and interactive dashboards and charts.
• The user-friendly pre-built data connectors mean you don’t need any programming
knowledge to pull data from multiple different sources. And combined with Google’s
sharing functions, teams can easily combine, filter and present their data together in a
really professional format, with great graphics.

• As you’d expect from a piece of Google software, Google Data Studio is designed to
integrate seamlessly with data sources such as Google Ads, Google Analytics,
BigQuery.
• On top of that, there are over 150 third-party connectors to fetch data from sources
like Facebook, eBay, LinkedIn and Mailchimp. If you want to use spreadsheets as
your data source, you need to use Google Sheets.
Why learn Google Data Studio?
• In addition, Google Workspace (MAMPU initiative).
How to connect Excel to Google Data Studio?
Use Sheetgo to connect your files
• Sheetgo is a no-code automation tool for spreadsheets and other office apps. When
you create a Sheetgo connection, you watch your data move from one spreadsheet to
another automatically.
How to connect Excel to Google Data Studio?
Use Sheetgo to connect your files
• Please ensure you have synced Sheetgo with Google Sheets.
How to connect Excel to Google Data Studio?
Use Sheetgo to connect your files
• Please ensure you have synced Sheetgo with Google Sheets and Google Drive.
Automate the workflow
Use Sheetgo to connect your files
• Click on Automate to schedule the frequency with which you want the automatic
updates between Excel and Google Sheets to run. This can happen on an hourly,
daily, weekly, or monthly basis.
Connect your Excel files to Google Data Studio
• Ok, so you’ve set up an automated system to pull data from your Excel file(s) to
Google Sheets. Now you’re ready to connect to Google Data Studio.

1) Open Google Data Studio.


2) Click on Select Data Source, in the top right-hand corner of the screen.
3) Select Google Sheets.
Connect your Excel files to Google Data Studio
4) Select the spreadsheet, then the worksheet (tab).
5) Click on the blue Connect button in the top right-hand corner.
Connect your Excel files to Google Data Studio
6) You will now see an overview of all columns, fields and items. Click Add to report.
7) A popup appears, to verify that you have selected the correct data. Check it and click
Add to report.
Connect your Excel files to Google Data Studio
• Your connection is complete!
• You have created a link from the Excel file on your hard drive to Google Data Studio,
using Backup and Sync and Sheetgo.
• Any changes to the data in the original Excel file(s) on your computer, or the Google
Sheets file in your Drive, will be reflected in your Google Data Studio reports.
Presentation using Google Data Studio
• View the report you want to present.
• In the upper right, click More options.
• Click Present.
Two other Excel-related courses

EViews | Python | MATLAB | R | STATA | Julia | SPSS | AMOS | Java | micro:bit | Azure | Tableau | Power BI | Excel VBA

Thank You

Harunurashid 0107916627 / 0177540486 / harunurashid.thelatha@mof.gov.my
Muhammad Harunurashid
Speaker