a) Teamwork
b) Advanced statistics
c) High-level math
1) Structured data
2) Unstructured data
3) Natural language
7) Streaming data
ESSU-ACAD-501 | Version 4 | Page 1 of 6
Effectivity Date: June 10, 2021
3. Identify the year when the significant events in the evolution of data science took place.

Event                                              Year
... Science)                                       1997
The term "data science" came to prominence in ...
Assessment 2. Introduction to Data Science (2)
Learning.
Information overload can lead to many disadvantages: it can make our brains less
productive and leave us easily tired and distracted. There are
Module Assessment:
Identifying the structure of data
Cleaning, filtering, reorganizing, augmenting, and aggregating data
Visualizing data
Data analysis, statistics, and modeling
Machine Learning
Assembling data processing pipelines to link these steps
Leveraging high-end computational resources for large-scale problems
Often, different tools address different parts of this process.
Therefore, interoperability among tools, based on common data structures and
interfaces, is an important element in enabling the construction of complex, multifaceted
data analysis pipelines. It is in this sense that we can talk about an ecosystem for data
science. For any particular application, you might only be interested in a subset of these
operations.
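As a minimal sketch of how the steps above can be chained into a pipeline, here is a plain-Python example; the sample records, field names, and values are invented for illustration and are not data from this module:

```python
# Minimal sketch of a data-processing pipeline: clean -> filter -> aggregate.
# The sample records and field names are hypothetical.

raw_records = [
    {"city": " Manila ", "sales": "120"},
    {"city": "Cebu", "sales": "95"},
    {"city": "Manila", "sales": None},  # missing value, dropped during cleaning
    {"city": "Cebu", "sales": "80"},
]

def clean(records):
    """Trim whitespace and convert sales figures to numbers, skipping gaps."""
    for r in records:
        if r["sales"] is not None:
            yield {"city": r["city"].strip(), "sales": float(r["sales"])}

def aggregate(records):
    """Total sales per city."""
    totals = {}
    for r in records:
        totals[r["city"]] = totals.get(r["city"], 0.0) + r["sales"]
    return totals

result = aggregate(clean(raw_records))
print(result)  # {'Manila': 120.0, 'Cebu': 175.0}
```

In a real project each stage would likely be a separate, interoperable tool (for example a dataframe library feeding a modeling library), which is exactly why the shared data structures and interfaces mentioned above matter.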
2. Among data scientists, who do you think has made the greatest contribution to the
emergence of data science? Support your answer with a brief explanation.
3. What is data science, and what skills are needed to become a data scientist?
Explain how these results can be used to solve business problems
4. Enumerate the four V's of big data, and expound on why data science is essential.
We need data science because it gives us the ability to process and interpret data.
This enables companies to make informed decisions about growth, optimization,
and performance.
For example, machine learning is now being used to make sense of every kind of
data – big or small.
Volume
The first V of big data is all about the amount of data: the volume. Today, every single
minute we create the same amount of data that was created from the beginning of time
until the year 2000. We now use terms such as terabytes and petabytes to discuss the
sizes of data that need to be processed. The sheer quantity of data is an important part
of what classifies it as big data. Because of the amount of data we deal with daily, new
technologies and strategies, such as multitiered storage media, have been developed to
collect, analyze, and store it securely.
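To make the terabyte and petabyte scales above concrete, here is a small arithmetic sketch using decimal (SI) prefixes; the archive size is a made-up figure for illustration:

```python
# Unit arithmetic for the data sizes mentioned above (decimal SI prefixes).
TB = 10**12  # one terabyte, in bytes
PB = 10**15  # one petabyte, in bytes

archive_bytes = 3 * PB  # a hypothetical 3-petabyte archive
archive_in_tb = archive_bytes // TB
print(archive_in_tb)  # 3000 terabytes
```

Note that storage vendors typically use these decimal prefixes, while operating systems sometimes report binary units (1 TiB = 2**40 bytes), which is a common source of confusion at these scales.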
Velocity
Velocity, the second V of big data, is all about the speed at which new data is generated
and moves around. When you send a text, check your social media feed and react to
posts on Facebook, Instagram, or Twitter, or make a credit card purchase, these acts
create data that needs to be processed instantaneously. Compound these activities by all
the people in the world doing the same and more and you can start to see how velocity
is a key attribute of big data.
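A very small sketch of what handling high-velocity data can look like: grouping a stream of events into one-second windows and counting them. The event stream, timestamps, and event types below are all invented for illustration:

```python
# Sketch of a velocity measurement: count events per one-second window.
# The (timestamp, event_type) pairs are a hypothetical stand-in for a live feed.
from collections import Counter

events = [
    (0.1, "text"), (0.4, "like"), (0.9, "purchase"),
    (1.2, "text"), (1.7, "like"),
    (2.3, "purchase"),
]

# Bucket each event by the whole second it arrived in.
per_second = Counter(int(ts) for ts, _ in events)
print(dict(per_second))  # {0: 3, 1: 2, 2: 1}
```

Real streaming systems apply the same windowing idea continuously, at millions of events per second, rather than over a fixed list.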
Variety
Variety, the third V of big data, refers to the many different forms data can take, from
structured tables to unstructured text, natural language, images, and streaming feeds.
Veracity
The veracity of big data denotes the trustworthiness of the data. Is the data accurate
and high-quality? When talking about big data that comes from a variety of sources, it’s
important to understand the chain of custody, metadata and the context when the data
was collected in order to glean accurate insights. The higher the data's veracity, the
more valuable it is to analyze and the more it can contribute to meaningful results for an
organization.
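As a minimal sketch of a veracity check, here is a filter that flags records failing basic quality rules; the field names, records, and rules are all hypothetical examples:

```python
# Sketch of a simple veracity check: keep only records whose values are
# plausible and whose provenance (source) is known. All data is invented.

records = [
    {"id": 1, "age": 34, "source": "crm"},
    {"id": 2, "age": -5, "source": "crm"},    # implausible value
    {"id": 3, "age": 29, "source": None},     # missing provenance
]

def is_trustworthy(rec):
    """A record passes if its age is plausible and its source is recorded."""
    return rec["source"] is not None and 0 <= rec["age"] <= 120

trusted = [r for r in records if is_trustworthy(r)]
print([r["id"] for r in trusted])  # [1]
```

Production data-quality tooling is far richer than this, but the principle is the same: rules about plausibility and provenance decide which records are trusted enough to feed an analysis.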