
1. DATA SCIENCE AND THE ROLE OF DATA SCIENTIST


Data science is the process of deriving knowledge and insights from a huge and diverse set of
data through organizing, processing and analyzing the data. It involves many different
disciplines, such as mathematical and statistical modeling, extracting data from its source and
applying data visualization techniques. Often it also involves handling big data technologies to
gather both structured and unstructured data. Below are some example scenarios
where data science is used.

Recommendation systems: - As online shopping becomes more prevalent, e-commerce
platforms are able to capture users' shopping preferences as well as the performance of various
products in the market. This leads to the creation of recommendation systems, which build models
that predict shoppers' needs and show the products a shopper is most likely to buy.
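
As a minimal sketch of the idea, the snippet below recommends products that were most often bought together with items already in a shopper's basket. The purchase data and the function name `recommend` are hypothetical, and counting co-occurrences is only one simple technique; the text does not say which models real platforms use.

```python
from collections import Counter
from itertools import permutations

# Hypothetical purchase histories: one set of products per past order.
orders = [
    {"laptop", "mouse", "laptop bag"},
    {"laptop", "mouse"},
    {"phone", "phone case"},
    {"laptop", "laptop bag"},
    {"phone", "charger", "phone case"},
]

# Count how often each ordered pair of products appears in the same order.
co_counts = Counter()
for order in orders:
    for a, b in permutations(sorted(order), 2):
        co_counts[(a, b)] += 1


def recommend(basket, top_n=2):
    """Rank products not in the basket by how often they co-occur with it."""
    scores = Counter()
    for (a, b), count in co_counts.items():
        if a in basket and b not in basket:
            scores[b] += count
    # Sort by score (descending), then by name so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [product for product, _ in ranked[:top_n]]


print(recommend({"laptop"}))
```

A production system would typically learn from far more signals (ratings, browsing history, product attributes), but the ranking step has the same shape: score candidate products, then show the highest-scoring ones.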

Financial Risk management: - The financial risk involved in loans and credit is better analyzed
by using the customer's past spending habits, past defaults, other financial commitments and many
socio-economic indicators. This data is gathered from various sources in different formats.
Organizing it and gaining insight into the customer's profile requires the help of data
science. The outcome is minimized loss for the financial organization through the avoidance of bad debt.

Improvement in Health Care services: - The health care industry deals with a variety of data,
which can be classified into technical data, financial data, patient information, drug information
and legal rules. All this data needs to be analyzed in a coordinated manner to produce insights
that will save costs for both the health care provider and the care receiver while remaining legally
compliant.

Computer Vision: - Advances in image recognition by computers involve
processing large sets of image data from multiple objects of the same category, for example in face
recognition. These data sets are modeled, and algorithms are created to apply the model to
newer images to get a satisfactory result. Processing these huge data sets and creating the
models requires various tools used in data science.

Efficient Management of Energy: - As the demand for energy soars, energy-producing
companies need to manage the various phases of energy production and
distribution more efficiently. This involves optimizing the production methods, the storage and
distribution mechanisms, as well as studying customers' consumption patterns. Linking the
data from all these sources and deriving insight seems a daunting task. This is made easier by
using the tools of data science.

Python in Data Science: - The programming requirements of data science demand a versatile
language that is simple to write yet can handle highly complex mathematical processing.
Python is well suited to such requirements, as it has already established itself as a language
for both general-purpose and scientific computing. Moreover, it is continuously extended
through new additions to its plethora of libraries aimed at different programming requirements.
These features make it a preferred language for data science.

The role of a data scientist is normally associated with tasks such as predictive modeling,
developing segmentation algorithms, recommender systems, A/B testing frameworks and
often working with raw unstructured data.

The nature of their work demands a deep understanding of mathematics, applied statistics and
programming. There are a few skills common between a data analyst and a data scientist, for
example, the ability to query databases. Both analyze data, but the decisions of a data scientist
can have a greater impact in an organization.

Here is a set of skills a data scientist normally needs to have −

• Programming in a statistical package or language such as R, Python, SAS, SPSS, or Julia
• Able to clean, extract, and explore data from different sources
• Research, design, and implementation of statistical models
• Deep statistical, mathematical, and computer science knowledge

2. Data in computer programming and data analytics perspective

Data analytics is a field of technology that takes raw data and retrieves information that can
resolve problems or answer questions with the proper data.

• Data analytics is the science of analyzing raw data to draw conclusions about
that information.
• The techniques and processes of data analytics have been automated into
mechanical processes and algorithms that work over raw data for human
consumption.
• Data analytics helps a business optimize its performance.

Data analytics is a broad term that encompasses many diverse types of data analysis. Any type of
information can be subjected to data analytics techniques to get insight that can be used to improve
things. Data analytics techniques can reveal trends and metrics that would otherwise be lost in the
mass of information. This information can then be used to optimize processes to increase the overall
efficiency of a business or system.

For example, manufacturing companies often record the runtime, downtime, and work queue for
various machines and then analyze the data to better plan the workloads so the machines operate
closer to peak capacity.
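
The manufacturing example can be sketched in a few lines of Python. The machine names and hours below are made up for illustration, and `utilization` is a hypothetical helper that treats runtime plus downtime as the total recorded time.

```python
# Hypothetical log: hours each machine spent running and idle during one week.
machine_log = {
    "press_a": {"runtime": 120.0, "downtime": 48.0},
    "press_b": {"runtime": 84.0, "downtime": 84.0},
    "lathe_c": {"runtime": 150.0, "downtime": 18.0},
}


def utilization(runtime, downtime):
    """Fraction of the recorded time that the machine was actually running."""
    total = runtime + downtime
    return runtime / total if total else 0.0


# Compute utilization per machine and list the least-used machines first.
report = {name: utilization(**hours) for name, hours in machine_log.items()}
for name, share in sorted(report.items(), key=lambda item: item[1]):
    print(f"{name}: {share:.0%} utilized")
```

Sorting by utilization puts the machines with the most spare capacity at the top, which is where additional work could be scheduled first.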

Data analytics can do much more than point out bottlenecks in production. Gaming companies use
data analytics to set reward schedules for players that keep the majority of players active in the game.
Content companies use many of the same data analytics to keep you clicking, watching, or
re-organizing content to get another view or another click.

Data analytics is important because it helps businesses optimize their performance.
Implementing it into the business model means companies can help reduce costs by
identifying more efficient ways of doing business and by storing large amounts of data. A
company can also use data analytics to make better business decisions and help analyze
customer trends and satisfaction, which can lead to new and better products and services.

Data types in computer programming perspective: - As its name indicates, a data type
represents the type of data that you can process using your computer program. It can be
numeric, alphanumeric, decimal, etc.

In our day-to-day life, we deal with different types of data such as strings, characters, whole
numbers (integers), and decimal numbers (floating point numbers).
Similarly, when we write a computer program to process different types of data, we need to
specify their types clearly; otherwise the computer does not understand how different operations
can be performed on the given data. Different programming languages use different
keywords to specify different data types. For example, the C and Java programming languages
use int to specify integer data, whereas char specifies a character data type.

C and Java support almost the same set of data types, though Java supports additional data
types. For now, we are taking a few common data types supported by both the programming
languages −

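Type             Keyword   Value range which can be represented by this data type
Character        char      -128 to 127 or 0 to 255
Number           int       -32,768 to 32,767 or -2,147,483,648 to 2,147,483,647
Small Number     short     -32,768 to 32,767
Long Number      long      -2,147,483,648 to 2,147,483,647
Decimal Number   float     1.2E-38 to 3.4E+38, up to 6 decimal places

Note: these are typical C ranges and depend on the platform. Java fixes the sizes: its int is
always 32 bits, its long is 64 bits, and its char is an unsigned 16-bit value.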

These data types are called primitive data types, and you can use them to build
more complex data types, which are called user-defined data types. For example, a string is
a sequence of characters.
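
The integer ranges listed for char, short, and long follow directly from the number of bits used to store the value. The Python sketch below reproduces them, assuming two's-complement storage and widths of 8, 16, and 32 bits (the widths implied by those ranges); `signed_range` and `unsigned_range` are hypothetical helper names.

```python
def signed_range(bits):
    """Smallest and largest value of a two's-complement integer with `bits` bits."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def unsigned_range(bits):
    """Smallest and largest value of an unsigned integer with `bits` bits."""
    return 0, 2 ** bits - 1


for label, bits in [("char", 8), ("short", 16), ("long (32-bit)", 32)]:
    low, high = signed_range(bits)
    print(f"{label:<14} {low:,} to {high:,}")

# A string behaves as a sequence of characters, as described above.
word = "data"
print(list(word))
```

The two alternatives given for char (-128 to 127 or 0 to 255) correspond to `signed_range(8)` and `unsigned_range(8)`.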

3. SERIES OF STEPS NEEDED TO GENERATE VALUE AND USEFUL INSIGHTS FROM DATA

In the data value chain, information flow is described as a series of steps needed to generate
value and useful insights from data.
The data value chain describes the process of data creation and use from first identifying a need
for data to its final use and possible reuse. The data value chain has four major stages:
collection, publication, uptake, and impact. These four stages are further separated into twelve
steps: identify, collect, process, analyze, release, disseminate, connect, incentivize, influence,
use, change, and reuse.
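
One way to picture the stages and steps is as a simple mapping. The sketch below assumes that each of the four stages covers three consecutive steps in the order listed above; the text gives the stages and the steps but does not spell out this grouping, so treat it as an assumption. `stage_of` is a hypothetical helper.

```python
# The twelve steps in the order the text lists them.
STEPS = [
    "identify", "collect", "process",
    "analyze", "release", "disseminate",
    "connect", "incentivize", "influence",
    "use", "change", "reuse",
]
STAGES = ["collection", "publication", "uptake", "impact"]

# Assumption: each stage covers three consecutive steps.
STEPS_PER_STAGE = len(STEPS) // len(STAGES)
value_chain = {
    stage: STEPS[i * STEPS_PER_STAGE:(i + 1) * STEPS_PER_STAGE]
    for i, stage in enumerate(STAGES)
}


def stage_of(step):
    """Return the stage that a given step belongs to under this grouping."""
    for stage, steps in value_chain.items():
        if step in steps:
            return stage
    raise ValueError(f"unknown step: {step}")


for stage, steps in value_chain.items():
    print(f"{stage}: {', '.join(steps)}")
```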

Throughout the process, from one end of the value chain to another and back again, there
should be constant feedback between producers and stakeholders. The data value chain can be
used as a teaching tool to show the complex set of steps from data creation to use and impact
or as a management tool to monitor and evaluate the data production process.
 
While the data value chain was motivated by research on collecting and documenting gender
data impact stories, the concept applies to development data more broadly. This research led
to mapping the path from data production to tangible impacts. The importance of gender data
has received significant attention since the launch of the Sustainable Development Goals
(SDGs). To motivate progress on closing gender data gaps, it is essential to document the
impact of data on policies and outcomes, and, conversely, to demonstrate the opportunities
lost when we lack good data.
The motivating questions included: What enabling conditions were present that led to the
success? What roadblocks impeded progress? Were data disseminated in a way that
encouraged their use? Building on previous research, the data value chain was constructed to
capture and visualize the steps data go through from production to impact.

Many people think of data and insights as synonymous, but there are subtle, yet important
distinctions between these two terms. Data are information, generally sets of numbers or text.
Insights are the knowledge gained through analyzing data, generating conclusions from the
data that can benefit your business. Data are the input. Insights are the output.

Data insights are knowledge that a company gains from analyzing sets of information pertaining
to a given topic or situation. Analysis of this information provides insights that help businesses
make informed decisions and reduce the risk that comes with trial-and-error testing methods.

In the digital world we live in, there are copious amounts of data at our fingertips. But though
anyone can access raw data, the ability to extract valuable and actionable information from the
numbers is what will determine whether you can generate a competitive advantage for your
business.
74% of companies say they want to be “data-driven,” according to a Forrester report, but only
29% are actually successful at connecting analytics to insight. Data will be the largest area of
spending for companies as they attempt to become more data-driven.

If your business is going to survive, you need a strategy for the future. You have marketing,
social media, web analytics, sales, and support. Staying up to date with reliable information on
your growth is not an easy task. Data insights can give you a clear overview of what’s happening
across your business, and a data visualization tool such as Cyfe lets you see everything in one place.

4. Principal goals of data science
The goal of data science is to construct the means for extracting business-focused insights
from data. This requires an understanding of how value and information flow in a business,
and the ability to use that understanding to identify business opportunities.

Nowadays, data science is often practiced by highly skilled computing professionals. Data scientists
must have a combination of analytical, machine learning, data mining, and statistical skills, as
well as experience with algorithms. Their main task is managing and interpreting large amounts
of data. Along with this, many data scientists are also tasked with creating data visualization
models that help illustrate the business value of digital information.

This field has a broad career path that is still developing and thus offers promising
opportunities in the future. Data science jobs have become more specific, which has made this a
specialized field. Exponentially increasing data requires experts who understand and can apply
the latest computational methods to analyze it. Data science professionals can be
called data jugglers, as they need to dive into the big data world and derive valuable insights for
their company by utilizing their training and curiosity.

After completing this course, one can apply for the role of Data Science Engineer, Data Analysis
Engineer, Machine Learning Engineer, or Data Scientist. Some principles of data science are:

• Countering the data-analytics complication:- It is important to understand the various
specifications, requirements, and priorities. One must have the required resources in
terms of people, technology, time, and data to support the project. In this stage, all the
business problems and complications are framed. A very common mistake in
data science projects is rushing directly into data collection and analysis without
properly understanding the requirements and complications.

• Determining the accurate data sets and variables:- In this stage, one requires an analytical
sandbox in which to perform analytics for the entire duration of the project. One needs to
preprocess and condition the data prior to modeling, cleaning and validating it to ensure
accuracy. In this phase, one also delivers final reports, briefings, code, and technical documents.
Sometimes the project is also implemented in a real-time production scenario on a small scale,
which provides a clear picture of the performance and other related constraints before full
deployment.

• Interpreting the data for obtaining solutions:- Finally, it is important to evaluate whether the
objectives planned in the first stage have been accomplished. Evaluate all the
key findings, communicate them to the stakeholders, and establish whether the results of the project
are a success or a failure.
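
The cleaning and validation work mentioned in the second principle above might look like the following minimal sketch. The records, field names, and plausibility limits are all hypothetical; real projects would define their own validation rules.

```python
# Hypothetical raw records, as they might arrive from different sources.
raw_records = [
    {"customer": " Alice ", "age": "34", "spend": "120.50"},
    {"customer": "Bob", "age": "", "spend": "80"},
    {"customer": "alice", "age": "34", "spend": "120.50"},
    {"customer": "Carol", "age": "-5", "spend": "42.00"},
]


def clean(record):
    """Normalize one record, or return None if it fails validation."""
    name = record["customer"].strip().title()
    try:
        age = int(record["age"])
        spend = float(record["spend"])
    except ValueError:
        return None  # missing or non-numeric field
    if not name or not 0 < age < 120 or spend < 0:
        return None  # value outside the (assumed) plausible range
    return {"customer": name, "age": age, "spend": spend}


cleaned, seen = [], set()
for record in raw_records:
    row = clean(record)
    # Drop invalid rows and duplicates of a customer already kept.
    if row is not None and row["customer"] not in seen:
        seen.add(row["customer"])
        cleaned.append(row)

print(cleaned)
```

Only records that pass every check reach the modeling stage; rejected rows could instead be logged for review rather than silently dropped.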

With the continuing popularity of the “data scientist” position, it is possible that the future of
data science will be home to both modelers and data engineers, and that they will be paid on the same
corporate scale. In the future, data scientists and analytics professionals may play almost the
same roles, but that might take many years.
