1. Business Understanding
Before analyzing data for an industry, we should have a clear overview and
understanding of that industry: what it does, what kinds of decisions it is going to
make, and the purpose for which the data is being analyzed. Every data analysis process
starts with a question. Many people think that analysis starts from a data set, and that
having a data set available is sufficient to find any kind of pattern. In practice the
reverse holds: the questions themselves define which data sets are needed. The only
challenge is that answering one question can raise another, but that is fine; it is
actually a part of
source such as a data warehouse, logs, or an existing data set to answer those questions.
Raw data is queried to answer the questions, but the result is not yet the data set we
analyze; we call it raw data because it is not yet in the form we need for analysis.
leads the further analysis process; this is a clean data set. SQL is used to extract the
data from the database. The database being queried can hold more than 1 million rows,
and a database query language like SQL enables an analyst to analyze and transform
such data easily. SQL is the first thing you should learn as it enables
structure to another state or structure. It is the fundamental stage of data integration,
where data collected from different sources is integrated into a particular structured
form so that it can be used at a destination for analysis. This process is known as ETL
(Extract, Transform, Load). The data transformation process begins with detecting and
understanding the data in its original or source format. This is usually achieved with
the help of algorithms implemented in data analysis and profiling tools. This step helps
you decide what needs to happen to the data to get it into the desired or requested
format. Generally, the R or Python language enables you to perform data transformation
on large or complex data coming from the source.
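The extract, transform, and load steps described above can be sketched in a few lines of
Python. This is a minimal illustration under stated assumptions, not a production
pipeline: the raw_orders and clean_orders tables, their columns, and the sample rows are
all hypothetical, and an in-memory SQLite database stands in for a real warehouse.

```python
import sqlite3

# Hypothetical raw rows, standing in for data pulled from logs or a warehouse.
RAW_ORDERS = [
    ("2024-01-05", " north ", "120.50"),
    ("2024-01-05", "South", "80"),
    ("2024-01-06", "NORTH", "not available"),  # amount cannot be parsed
    ("2024-01-06", "south", "42.25"),
]


def run_etl(conn):
    """Extract raw rows with SQL, clean them in Python, load them back."""
    cur = conn.cursor()
    cur.execute(
        "CREATE TABLE raw_orders (order_date TEXT, region TEXT, amount TEXT)"
    )
    cur.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", RAW_ORDERS)

    # Extract: SQL selects the columns needed to answer the question.
    rows = cur.execute(
        "SELECT order_date, region, amount FROM raw_orders"
    ).fetchall()

    # Transform: normalize region names and drop rows with unusable amounts.
    clean = []
    for order_date, region, amount in rows:
        try:
            value = float(amount)
        except ValueError:
            continue  # skip rows whose amount is not a number
        clean.append((order_date, region.strip().lower(), value))

    # Load: store the cleaned rows in a table that is ready for analysis.
    cur.execute(
        "CREATE TABLE clean_orders (order_date TEXT, region TEXT, amount REAL)"
    )
    cur.executemany("INSERT INTO clean_orders VALUES (?, ?, ?)", clean)
    conn.commit()

    # The clean table can now be queried to answer the business question.
    return cur.execute(
        "SELECT region, SUM(amount) FROM clean_orders "
        "GROUP BY region ORDER BY region"
    ).fetchall()


conn = sqlite3.connect(":memory:")
totals = run_etl(conn)
print(totals)
```

Keeping the raw table untouched and writing the cleaned rows to a separate table means
the transformation can be rerun or corrected later without querying the source again.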
5. Data Visualization
After building the datasets, we need to visualize the data in order to explore and
evaluate it and to develop hypotheses or insights. Tableau or SAS (data visualization
applications) allow us to visualize large numbers of rows and columns of data from both
structured and unstructured databases and to easily bring insights and meaningful
patterns out of the dataset.
6. Statistical Analysis
Statistical analysis is an important aspect of data analysis. It summarizes the data and
our understanding of it in terms of models and graphs, and it also explains how the
data relates to the underlying real world. Statistical analysis is also used to
identify patterns or trends for predictive analytics, which helps in making business
decisions, and to determine the statistical significance of results from the data set.
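One way to check statistical significance is a permutation test on the difference
between two group means. The sketch below is only an illustration, assuming two small
hypothetical samples of weekly sales (before and after some change); the function name
permutation_test and all numbers are made up for the example, and a real analysis would
choose the test to suit the data.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Returns an estimated p-value: the share of random relabelings whose
    mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign values to the two groups
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1

    # Add one to both counts so the estimate is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical weekly sales before and after a change.
before = [102, 98, 105, 110, 99, 101, 97, 104]
after = [118, 121, 115, 125, 119, 122, 117, 120]

p_value = permutation_test(before, after)
print(f"estimated p-value: {p_value:.4f}")
```

A small p-value suggests that the difference between the groups is unlikely to be due
to chance alone; it does not by itself say that the difference matters to the business.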
Data model development consists of the definition of model goals, the concept of the
R/Python enables you to create a statistical model to reject any invalid or null
mathematical complexity. Vendors are developing software as a service, such as Tableau and
SAS, to make the analysis process easier and easier by building models using
and the results of the analysis process are presented as a story, report,
recommendations, or PPT. The Tableau and SAS applications play an important role in
summarizing the analysis process via a report or story. This report includes:
Conclusion
For most businesses, enterprises, industries, and government agencies, lack of data isn’t
a problem. There is a huge amount of information available for making clear, data-driven,
business-oriented decisions. With so much data to use in an analytics-oriented process, we
need to draw more appropriate knowledge and information from the available data: the
business needs to know it has the right data for making data-driven decisions. The
business needs to
https://online.hbs.edu/blog/post/data-visualization-techniques
What is Data?
Data is a collection of raw, unorganized facts that need to be processed
to become meaningful. Data can be simple, yet it remains unorganized
until it is organized. Generally, data comprises facts, observations,
perceptions, numbers, characters, symbols, images, etc.
What is Information?
Information is a set of data that has been processed in a meaningful
way according to a given requirement. Information is processed,
structured, or presented in a given context to make it meaningful and
useful.
https://paldhous.github.io/ucb/2018/dataviz/week2.html
Data graphics to address different types of audiences
Each visualization you create will have an implied reader; you might
not have named them out loud, but somewhere in the back of your
mind you are making something for someone. Maybe you’re preparing
a big presentation for the department director. Perhaps you’d like to
explain a nuanced issue to a collaborator. Your mother might have
called, reminding you that she doesn’t know how you feed yourself. No
matter the audience, the way you construct a visualization should take
the specific reader into account.
The core visualization you create might stay roughly the same, at least
insofar as the data you use and the message you relay. Still, there are
meaningful adjustments that ensure your work is understood by the
people you want to reach. Here are a few of the personalities I’ve
encountered in my work, along with my suggestions for giving each
what they uniquely need.
The Big Boss will know the context already — that’s their job — though
a quick reminder will be appreciated. What they don’t know is why you
are standing at their desk. Explain yourself directly, in simple
language, via the title of your graph. Be obvious and say something
like, “No Significant Effect Observed from Study Drug”.
Use large, horizontal font that they can see without glasses or tilting
their head. Employ arrows and annotations to highlight your talking
points. Rename axis and legend labels to something other than
variable names. Write a caption — or even a few drill-down graphs —
that answer what you know they’ll ask. Your goal is to provide an
efficient, omniscient experience.
You can relax a bit on the formatting for these readers — within reason,
of course — especially if you are iterating quickly and everyone is on
the same page. Unlike for other audiences, you might want to keep the
variable names on your axes. Installing reference points like threshold
lines, confidence intervals, and annotations will allow your peers to
sanity check the work and notice specific areas of interest.
Because you’re using visualization to pick out patterns and problems,
try to stick with similar aesthetics across team members; this will make
it easier to detect substantive changes and reduce interpretation errors.
Consider sticking to more common formats like bars, lines, points, and
(gasp) pies. Use color schemes that have strong associations. Try to use
fewer than four or five categories to keep complexity low, and highlight
colors if there are particularly important areas. The overarching goal is
to provide a straightforward, low-stress experience for individuals who
might already feel out of their depth.
On a slightly different note, the brand aesthetic you develop for your
visualizations can have a big impact on the way your work is perceived.
Take a look at this article that describes how you can implement a
unique visualization brand for yourself.