
Revision Paper

Grade 8 Unit Test 1 – 2nd Semester

Data Science is the process of extracting meaning from large data sets in order to provide
insights that support decision making.
Presenting Analyzed Data
Data scientists employ different methods of presenting data. Here are some common methods used
to present analyzed data.
1. Data visualisations are visual representations of data (such as charts and graphs) intended
to help an audience process the information more easily and get a clear idea about the
data at a glance.

2. Infographics are visual representations of data, often involving pictures that reflect
patterns and help tell a story. Infographics can include visualisations.

Figure: Example of an infographic
Correlations
When there is a relationship between two variables, we call that a correlation.
A correlation can be positive or negative, or there can be no correlation at all.

An example of correlation can be established by comparing students' exam scores with how much of their
homework they do. There could be a positive correlation between these two variables: students who do
their homework and revise tend to have higher scores, and students who do not tend to have lower scores.
So, the correlation between the amount of homework done and the exam result is a positive correlation.
The less homework students do, the lower their scores tend to be. (If we counted missed homework instead,
the same pattern would be a negative correlation: the more homework missed, the lower the scores.)
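
The homework example can be checked with a short calculation. The Python sketch below is only an illustration: the numbers are made up, the function names pearson_r and describe are hypothetical, and the 0.3 cut-off for calling a relationship "no correlation" is an arbitrary assumption, not a rule from this paper. It measures homework tasks completed against exam score and reports the direction of the correlation.

```python
from statistics import mean

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def describe(r, threshold=0.3):
    """Label a coefficient; the 0.3 cut-off is an assumption for this sketch."""
    if r >= threshold:
        return "positive"
    if r <= -threshold:
        return "negative"
    return "no correlation"

# Made-up data: homework tasks completed and exam score for six students.
homework_done = [0, 2, 4, 5, 8, 10]
exam_scores = [35, 42, 55, 60, 78, 90]

r = pearson_r(homework_done, exam_scores)
label = describe(r)
print(round(r, 2), label)
```

A coefficient close to +1 means the two variables rise together, close to -1 means one falls as the other rises, and close to 0 means there is no clear relationship.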

Outliers

Data that sits outside a trend is known as an outlier.

Outliers can cause problems when working out statistics such as the mean, but they shouldn’t be removed
from the data set without investigating the reason for them.
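
To see how an outlier can affect the mean, here is a minimal Python sketch. The scores are made up for illustration, and treating the score of 5 as the outlier is an assumption of this example.

```python
from statistics import mean

# Made-up exam scores; 5 sits far below the rest, so we treat it as an outlier.
scores = [62, 65, 68, 70, 71, 5]
outlier = 5

mean_with = mean(scores)
mean_without = mean([s for s in scores if s != outlier])

print("Mean with the outlier:", round(mean_with, 1))
print("Mean without the outlier:", round(mean_without, 1))
```

One unusual value pulls the mean well below the typical score. Before removing it, we should still find out why it is there: it could be a typing error, or it could be a real result that matters.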

Figure labels:
The trend (positive correlation)
OUTLIERS – Data that are not included in the trend.

The PPDAC cycle (Problem, Plan, Data, Analysis, Conclusion) is a framework for us to follow when
asking questions about real-world problems and answering them using data.

The Investigative Cycle

1. Problem
2. Plan
3. Data
4. Analysis
5. Conclusion

** This is the order of the steps in the investigative cycle.

Problem

Pose a question that you think the data will help you to answer.

Plan
The plan involves working out: Where will we get the data from? How will we collect the data?
Data
In this step, we gather the data.

Once we have the data we need to help us answer the question, we should look through
it to see if it needs cleansing.

Cleansing is detecting and correcting, or removing, corrupt or inaccurate data. Cleansing is
part of the Data stage.
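
Cleansing can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the records are made up, the cleanse function is hypothetical, and the valid range of 0 to 100 for a Math result is assumed. The sketch corrects a value stored as text and sets aside values that are missing or impossible.

```python
# Made-up records; each problem is noted beside the record that has it.
raw_records = [
    {"student": "A", "math": 72},
    {"student": "B", "math": None},    # missing value
    {"student": "C", "math": 140},     # impossible: above the assumed maximum
    {"student": "D", "math": " 65 "},  # stored as text with stray spaces
]

def cleanse(records, low=0, high=100):
    """Split records into cleaned ones and ones that need investigating."""
    clean, rejected = [], []
    for rec in records:
        value = rec["math"]
        if isinstance(value, str):
            # Correct text values by stripping spaces and converting to a number.
            value = value.strip()
            value = int(value) if value.isdigit() else None
        if value is None or not (low <= value <= high):
            rejected.append(rec)
        else:
            clean.append({**rec, "math": value})
    return clean, rejected

clean, rejected = cleanse(raw_records)
print("Clean:", clean)
print("Needs investigating:", rejected)
```

Keeping the rejected records in their own list, instead of deleting them, makes it possible to investigate why the data was wrong.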

Analysis
This step is all about making sense of the data.
Conclusions and recommendations
Can we use this data to make a case for action, or has it led to further questions that need to be
answered?
FORMULATING QUESTIONS, CRITERIA (VARIABLES) AND CONCLUSION BASED ON A GIVEN
SCENARIO:
Example Scenario: You are given a data set on students' performance that includes their
gender, Math results, Science results, History results and English results.
To formulate a question, you have to choose two variables that you want to compare. For
example, I will choose Gender and Math Result.
Is there a correlation between gender and the math results of students?
The next question is which criteria are used for the question. Here, criteria means variables. Since
we started with the variables gender and math result, the 2 criteria I used are GENDER and
MATH RESULT.
Making Predictions.
If a graph is available, you can base your predictions on it. However, if no graph is available, you
can still make your own prediction of whether there will be a POSITIVE, NEGATIVE or NO
CORRELATION at all. For this example, I have no graph, so I will make a guess that there is a
POSITIVE CORRELATION between gender and math result.
