
Unit 2 Chapter 2

Excel Basics
Filters - Filters in Excel are tools that allow you to display only the data that meets specific criteria.
By applying filters to your data, you can easily focus on the information that is relevant to your
analysis.
PivotTables - PivotTables in Excel are powerful tools that allow you to summarize and analyze large
datasets quickly and easily. With PivotTables, you can reorganize and summarize your data to gain
valuable insights and identify patterns and trends.
PivotCharts - PivotCharts in Excel are visual representations of PivotTable data that provide a
graphical way to analyze and present your information. By creating PivotCharts, you can easily
visualize trends, patterns, and comparisons within your PivotTable data.

Types of Graphs
Line Graphs - Line graphs are a common type of chart used to display data trends over a continuous
period or progression. In Excel, you can create line graphs by plotting data points on a grid with
horizontal and vertical axes. The x-axis typically represents the independent variable, such as time or
categories, while the y-axis represents the dependent variable, such as values or quantities.
Bar Charts - Bar charts are graphical representations of data using rectangular bars to compare
values across different categories or groups. In Excel, you can create bar charts by plotting data points
on a horizontal or vertical axis, with the length or height of each bar proportional to the value it
represents. Bar charts are effective for visualizing comparisons between individual data points or
showing trends over time. They are commonly used to display categorical data, such as sales by
region, survey responses by category, or market share by product.
Pie Charts - Pie charts are circular graphs divided into slices to represent the proportion of different
categories within a dataset. In Excel, you can create pie charts by assigning each category a slice of
the pie, with the size of each slice corresponding to the percentage of the whole it represents. Pie
charts are ideal for illustrating the distribution of a single data series and comparing the relative sizes
of different categories at a glance. They are commonly used to show market share, budget allocations,
survey responses, or any data that can be divided into distinct parts.
Histogram - Histograms are graphical representations of the distribution of numerical data, showing
the frequency or count of data points within predefined intervals or "bins." In Excel, you can create
histograms by organizing your data into bins and plotting the frequency of data points falling within
each bin. Histograms consist of contiguous bars, where the area of each bar represents the frequency
of data points in a particular interval. Histograms are useful for visualizing the shape, center, and
spread of a dataset, as well as identifying patterns and outliers. They are commonly used in statistics
to analyze the distribution of data and detect any underlying patterns or trends.
Categorical Data
Categorical Frequency distribution table
Categorical data, also known as qualitative data, consists of variables that represent categories or
groups with distinct characteristics. In statistical analysis, categorical data is non-numeric and often
expressed in terms of labels or names. Examples of categorical data include gender, color, type of car,
or job title. This type of data is typically divided into two subtypes: nominal and ordinal. Nominal
categorical data represents categories with no inherent order or ranking, such as eye color or city
names. Ordinal categorical data, on the other hand, has a specific order or ranking, like educational
attainment levels (e.g., high school, college, postgraduate). Analyzing categorical data involves
techniques such as frequency distribution, cross-tabulation, and chi-square tests to explore
relationships and patterns among different categories. Visualizing categorical data is often done using
bar charts, pie charts, or stacked bar charts to represent the distribution and relationships between
different categories effectively.
A frequency distribution table is a tabular representation of the number of occurrences or frequency
of values within a dataset. It organizes data into different categories or intervals along with the
corresponding counts or frequencies of each category. Frequency distribution tables are commonly
used in statistics to summarize and present the distribution of data in a clear and structured format.
Each row in the table typically represents a category or interval, and the corresponding column shows
the frequency or count of data points falling within that category.
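As a rough illustration of the table described above, the following Python sketch counts how often each category occurs and adds a percentage column. The function name `frequency_table` and the survey responses are hypothetical and exist only for this example.

```python
from collections import Counter

def frequency_table(values):
    """Return (category, count, percentage) rows, most frequent category first."""
    counts = Counter(values)
    n = len(values)
    return [(category, count, 100 * count / n)
            for category, count in counts.most_common()]

# Hypothetical survey responses (illustrative data only)
responses = ["Car", "Bus", "Car", "Walk", "Bus", "Car", "Car", "Walk"]
table = frequency_table(responses)

for category, count, pct in table:
    print(f"{category:<6}{count:>3}{pct:>8.1f}%")
```

Each row of `table` corresponds to one row of the frequency distribution table, and the counts are the values a bar chart or pie chart of the same variable would plot.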

Cross-tabulation Table
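A cross-tabulation table summarises two categorical variables at once: each cell holds the number of
observations that fall into one category of the first variable and one category of the second.
A component bar chart (2 variables), also known as a stacked bar chart, is a type of bar chart in which
each bar is divided into segments, with each segment representing a different component or
sub-category of that bar's total. The length of each segment is proportional to the value it represents,
so the total length of a bar equals the sum of its components. Bar lengths are constant only in the
percentage version of the chart, where every bar is scaled to 100%.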
A multiple bar chart, also known as a clustered bar chart, is a type of bar chart that displays multiple
bars side by side for each category or group being compared. In this chart, each group of bars
represents a different category, and the bars within each group represent sub-categories or
components.
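The counts behind a stacked or clustered bar chart come from a cross-tabulation of two categorical variables. A minimal sketch in Python, assuming the two variables are given as equal-length lists; the function name `cross_tabulate` and the region/product data are hypothetical:

```python
from collections import Counter

def cross_tabulate(row_values, col_values):
    """Count how often each (row category, column category) pair occurs."""
    if len(row_values) != len(col_values):
        raise ValueError("both variables need one value per observation")
    pair_counts = Counter(zip(row_values, col_values))
    row_labels = sorted(set(row_values))
    col_labels = sorted(set(col_values))
    # Nested dict: table[row category][column category] -> count
    return {r: {c: pair_counts[(r, c)] for c in col_labels} for r in row_labels}

# Hypothetical observations (illustrative data only)
region = ["North", "South", "North", "South", "North", "South"]
product = ["A", "A", "B", "B", "A", "B"]
crosstab = cross_tabulate(region, product)

for row_label, cells in crosstab.items():
    print(row_label, cells)
```

In a stacked bar chart the cells of one row become the segments of one bar; in a clustered bar chart the same cells are drawn as separate bars placed side by side.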
Summarising Numeric Data (quantitative data)
Single Numeric Variable
Numeric Frequency distribution (quantitative data)
Construct a numeric frequency distribution.
1. Determine the data range
Organize Your Data: Arrange your data in ascending order from smallest to largest.

Identify the Smallest and Largest Values: Determine the smallest value (minimum) and the
largest value (maximum) in your data set.

Calculate the Data Range: Subtract the smallest value from the largest value to find the data
range. The data range gives you an idea of how spread out the data is.
2. Choose the number of intervals
Consider the Size of Your Data Set: The number of intervals you choose should be
appropriate for the size of your data set. For smaller data sets, fewer intervals may be
sufficient, while larger data sets may require more intervals for a detailed distribution.
Use a Rule of Thumb: A common rule of thumb is to use between 5 and 20 intervals.
However, the optimal number of intervals can also depend on the distribution of your data and
the level of detail you want to capture.
Consider Data Patterns: If your data has clear patterns or if you are looking for specific
insights, you may need to adjust the number of intervals accordingly. For example, if you
want to identify outliers or specific ranges of values, you might need more intervals.
Balance Detail and Interpretability: Choose a number of intervals that strikes a balance
between providing detailed information about the distribution of your data and ensuring that
the frequency distribution is easily interpretable.
3. Determine the interval width

Interval width = data range / number of intervals


The interval width represents the range covered by each interval and helps in organizing the
data effectively.
4. Set up interval limits
5. Tabulate the data values
When constructing a numeric frequency distribution, ensure that:
• the interval widths are equal in size
• the interval limits do not overlap (i.e. intervals must be mutually exclusive)
• each data value is assigned to only one interval
• the intervals are fully inclusive (i.e. cover the data range)
• the frequency counts sum to the sample size, n, and the percentage frequencies sum to 100%
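The five steps above can be sketched in Python as follows. This is one possible implementation under stated assumptions, not a prescribed method: each interval includes its lower limit and excludes its upper limit, except the last interval, which also includes the largest value. The function name `numeric_frequency_distribution` and the ages are hypothetical.

```python
def numeric_frequency_distribution(data, num_intervals):
    """Return (lower_limit, upper_limit, count) for equal-width intervals."""
    if num_intervals < 1:
        raise ValueError("num_intervals must be at least 1")
    smallest, largest = min(data), max(data)   # step 1: smallest and largest values
    data_range = largest - smallest            # step 1: data range
    if data_range == 0:
        raise ValueError("all values are identical, so no width can be derived")
    width = data_range / num_intervals         # step 3: interval width

    counts = [0] * num_intervals
    for value in data:                         # step 5: tabulate the data values
        index = int((value - smallest) / width)
        # Assumption: the largest value belongs to the last interval
        index = min(index, num_intervals - 1)
        counts[index] += 1

    # Step 4: interval limits, paired with the tabulated counts
    return [(smallest + i * width, smallest + (i + 1) * width, counts[i])
            for i in range(num_intervals)]

# Hypothetical ages (illustrative data only); step 2: four intervals chosen
ages = [12, 15, 21, 24, 27, 30, 31, 33, 35, 38, 45, 52]
distribution = numeric_frequency_distribution(ages, 4)

for lower, upper, count in distribution:
    print(f"{lower:>5.1f} to {upper:>5.1f}: {count}")
```

Because every value lands in exactly one interval, the counts add up to the sample size, which is the final check in the list above. In practice the computed width is often rounded up to a convenient number before the limits are set.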
Histogram
A histogram is a graphical representation of the distribution of numerical data. It consists of a series of
bars that represent the frequency or relative frequency of data within certain intervals, also known as
bins or classes. The x-axis of a histogram displays the intervals of the data, while the y-axis
represents the frequency of occurrences within each interval.
Histograms are particularly useful for visualizing the shape, center, and spread of a dataset. They
provide a quick and effective way to understand the underlying patterns and characteristics of the
data, such as identifying peaks, gaps, clusters, or outliers.
Frequency polygon - A frequency polygon is a type of graphical representation used to display the shape of a distribution. It
is created by joining the midpoints of the tops of the bars in a histogram with straight lines. This
method provides a visual representation of the distribution of the data and can be useful in identifying
patterns or trends within the dataset.
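The points of a frequency polygon can be computed directly from the interval limits, since the midpoint of the top of each bar sits at the interval midpoint. A small Python sketch, with a hypothetical function name and hypothetical intervals and frequencies:

```python
def frequency_polygon_points(intervals, frequencies):
    """Pair the midpoint of each interval with that interval's frequency."""
    return [((lower + upper) / 2, frequency)
            for (lower, upper), frequency in zip(intervals, frequencies)]

# Hypothetical frequency distribution (illustrative data only)
intervals = [(10, 20), (20, 30), (30, 40), (40, 50)]
frequencies = [3, 4, 3, 2]
polygon_points = frequency_polygon_points(intervals, frequencies)

print(polygon_points)
```

Joining these points in order with straight lines gives the polygon.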
Cumulative frequency distribution (ogive) - A cumulative frequency distribution shows, for each class,
the total frequency of all values up to and including that class. Its graph, known as an ogive,
is created by plotting the cumulative frequency against the upper boundary of each class interval.
The resulting curve can help visualize the total frequency of values that are less than or equal to a
certain point in the dataset. This graphical representation is useful for understanding the overall
distribution and identifying key points within the data.
Cumulative frequency polygon - A cumulative frequency polygon is a graphical representation that
displays the cumulative frequencies of a dataset. It is constructed by plotting points representing the
cumulative frequencies at the upper boundaries of the corresponding class intervals and then
connecting these points with straight line segments. This type of graph provides a visual depiction of
how the cumulative frequencies increase as you move through the dataset, offering insights into the
overall distribution and patterns within the data.
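The plotted points are running totals of the class frequencies, paired with the upper class boundaries. A minimal Python sketch, assuming the frequencies are listed in class order; the function name `cumulative_frequency_points` and the data are hypothetical:

```python
from itertools import accumulate

def cumulative_frequency_points(upper_boundaries, frequencies):
    """Pair each upper class boundary with the running total of frequencies."""
    cumulative = list(accumulate(frequencies))
    return list(zip(upper_boundaries, cumulative))

# Hypothetical frequency distribution (illustrative data only)
upper_boundaries = [20, 30, 40, 50]
frequencies = [3, 4, 3, 2]
ogive_points = cumulative_frequency_points(upper_boundaries, frequencies)

print(ogive_points)
```

The last cumulative frequency always equals the sample size, which is a quick check that no class was missed.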
