6/14/2014 What Is Data Science?

http://www.datascientists.net/what-is-data-science
Theories Behind Data Science
If you'd like to perform data science, there are several theories and principles that you need to understand.
Once you understand them, you will be able to learn the practices and step-by-step skills that data scientists
use. Without them, those practices and skills will not make sense. So first let me teach you a few of the
theories and principles involved, and then I can teach you a simple step-by-step method for doing data
science.
Database Theory
Firstly, let's talk about database theory. Database theory is about organising data in a way that makes
storing and retrieving it efficient. Data can be categorised into objects, objects can be put into
collections, and objects and collections can have relationships with each other and with themselves. The one
thing you need to know about this theory is that the way you organise your data will impact the effort
required to get answers from it.
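As a minimal sketch of that last point (not taken from the original article), the Python below stores the same hypothetical customer records two ways: as a plain list that has to be scanned, and as a dictionary keyed by customer ID. The records, field names and the functions find_by_scan, find_by_index and customer_for_order are all illustrative assumptions.

```python
# Hypothetical records; the field names are illustrative only.
customers = [
    {"id": 1, "name": "Ann", "city": "Sydney"},
    {"id": 2, "name": "Bob", "city": "Brisbane"},
    {"id": 3, "name": "Cat", "city": "Sydney"},
]


def find_by_scan(records, customer_id):
    """Unorganised lookup: inspect records one by one until a match is found."""
    for record in records:
        if record["id"] == customer_id:
            return record
    return None


# Organised lookup: build an index once, keyed by customer id.
customers_by_id = {record["id"]: record for record in customers}


def find_by_index(index, customer_id):
    """Organised lookup: a single dictionary access, however large the collection."""
    return index.get(customer_id)


# A relationship between two collections: each order refers to a customer by id.
orders = [
    {"order_id": 100, "customer_id": 3, "total": 25.0},
    {"order_id": 101, "customer_id": 1, "total": 40.0},
]


def customer_for_order(order, index):
    """Follow the relationship from an order to the customer who placed it."""
    return find_by_index(index, order["customer_id"])
```

Both lookups return the same record, but the scan does more work as the list grows, while the indexed version stays a single step. That difference in effort is what the choice of organisation buys.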
Agile Manifesto
Now let's talk about the Agile Manifesto. The Agile Manifesto is a set of values and principles for software
development that aims to produce high-quality outputs in environments subject to high levels of change and
ambiguity. Agile methods cope with rapid change and ambiguity by adopting an iterative development process.
They rely on self-managed teams, and people who are passionate about technological advancement are drawn to
them like scientists to the Big Bang theory. The Agile Manifesto looks to remove cultural barriers between
developer, client and end user, and in this reading it focuses on using the latest technology to make things
simple but not simpler. The one thing you need to know about this set of principles is that all things change
and the longer you take to test your solution in the live environment the higher the risk of failure.
Spiral Dynamics
The last theory I'd like to touch on is Spiral Dynamics. Spiral Dynamics is a theory of human development
and behaviour that seeks to explain why we do what we do: why we get out of bed in the morning, why we feel
compelled to create things, and why we seek to better ourselves and better serve our loved ones. The theory
talks about two mental states, one of facts and one of values. Facts are what we believe. Our beliefs are
based on the knowledge we currently have and the environment we are currently in. Values are what we desire.
Our desires are driven by our intentions and/or concerns, which are also based on the knowledge we currently
have and the environment we are currently in. The one thing you need to know about this theory is that our
facts and our desires come from the data that is presented to us.
Data Scientists
Data scientists perform data science. They use technology and skills to increase awareness, clarity and
direction for those working with data. The data scientist role exists to accommodate the rapid changes that
occur in our modern-day environment, and data scientists are given the task of minimising the disruption that
technology and data are having on the way we work, play and learn. Data scientists don't just present data;
they present data with an intelligent awareness of the consequences of presenting that data.
How To Do Data Science
The three components involved in data science are organising, packaging and delivering data (the OPD of
data). Organising is where the physical location and structure of the data are planned and executed. Packaging
is where the prototypes are built, the statistics are performed and the visualisation is created. Delivering is
where the story gets told and the value is obtained. However, what separates data scientists from all other
existing roles is that they also need a continual awareness of What, How, Who and Why. A data scientist
needs to know what the output of the data science process will be and have a clear vision of this output. A
data scientist needs a clearly defined plan for how this output will be achieved within the constraints of
available resources and time. A data scientist needs to deeply understand who the people are that will be
involved in creating the output. And most of all, the data scientist must know why the output is being
created: the motivation behind attempting to bring the envisioned result into being.
The 3-Step OPD Data Science Process
Step 1. Organise Data.
Organising data involves the physical storage and format of data and incorporates best practices in data
management.
Step 2. Package Data.
Packaging data involves logically manipulating and joining the underlying raw data into a new representation
and package.
Step 3. Deliver Data.
Delivering data involves ensuring that the message in the data reaches those who need to hear it.
Plus, at every step, have answers to these questions:
What is being created?
How will it be created?
Who will be involved in creating it?
Why is it to be created?
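The three steps and four questions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's implementation: the inlined CSV data, the function names organise, package and deliver, and the placeholder answers in brief are all hypothetical.

```python
import csv
import io

# Hypothetical raw inputs, inlined so the sketch is self-contained.
RAW_SALES = "region_id,amount\n1,120.5\n2,80.0\n1,30.0\n"
RAW_REGIONS = "region_id,region\n1,North\n2,South\n"


def organise(raw_text):
    """Step 1: parse raw text into a consistent structure (a list of dicts)."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def package(sales, regions):
    """Step 2: join sales to regions and total the amounts per region."""
    names = {row["region_id"]: row["region"] for row in regions}
    totals = {}
    for row in sales:
        name = names[row["region_id"]]
        totals[name] = totals.get(name, 0.0) + float(row["amount"])
    return totals


def deliver(totals):
    """Step 3: turn the packaged data into a message a reader can act on."""
    lines = [f"{region}: {total:.2f}" for region, total in sorted(totals.items())]
    return "\n".join(lines)


# The four questions, recorded alongside the work (answers are placeholders).
brief = {
    "what": "Sales totals per region",
    "how": "Join sales to regions, then sum the amounts",
    "who": "An analyst and the regional managers",
    "why": "To decide where to focus next quarter",
}

report = deliver(package(organise(RAW_SALES), organise(RAW_REGIONS)))
print(report)
```

Keeping each step as its own function mirrors the separation in the text: storage and format can change inside organise without touching how the result is packaged or delivered.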
The Data Science Model
Data Science in action
Data science in action is simply about moving people and/or systems between current and new
technologies and between beginner and expert skills.
Step 1. Organising Data.
Organising data involves moving people and systems from current to new (left to right) and from beginner to
expert (top to bottom). Advancing technologies and skills is the essence of innovation.
Step 2. Packaging Data.
Packaging data is the reverse of organising data and involves moving people and systems from new to current
(right to left) and from expert to beginner (bottom to top). This is the art of making things simple but not
simpler.
Step 3. Delivering Data.
Delivering data is enabling the movement from one view to another: enabling a beginner to become an expert,
enabling current technology to seem new, enabling expert data to be understood by beginners and enabling
new technology to seem like it has been a part of your life since you were born. This is transformational
education.
10 comments
Anthony-Emily Ryley-Flynn Nolan Sydney, Australia
Nice to see it put in more modern terms, but it is still a very limited view of
the Information Science (not computer science) material taught to me at
uni in the late 80s and early 90s. This at least brings data science back
towards Brenda Dervin's writings, that the information scientist has a
need to understand the context. What I think this article is missing is the
way that data, or rather its perception and context, is linked to time and
space, and that data and its usage are multi-faceted to different perceivers.
I would also like to suggest that there is a refinement feedback cycle to
understanding, combining and delivering data to the customer, which
contributes to the whole process and also needs to be included.
June 23, 2013 at 1:08pm
Sivaramakrishna Reddy Asst Manager - Spend Analysis at Genpact
More or less, everyone in analytics is following these principles.
However, this is an awesome article to represent the philosophy behind
analytics.
I like this presentation.
May 8, 2013 at 11:27am
Gail La Grouw
A great account. Thank you for sharing your perspective with us all.
September 20, 2012 at 12:50am
Joel Lockbaum Senior Systems Engineer at Pure Storage
I understood it (or perhaps I don't) to be the live environment is actually
more of the testbed and in fact more actual production, due to the iterative
way to quickly resolve issues....we take more risk after release than we
used to in monolithic design. Or more clear, you give me more buggy but
quick code out here, and I ask for (collaborate to you) more items in a
smaller release.
October 26, 2012 at 1:14pm
DataScientists.Net
Hi Joel, you are correct. The faster you can deliver, gain
feedback and repeat, the quicker you can adapt to the ever-changing
environment.
November 7, 2012 at 10:44pm
Joel Lockbaum Senior Systems Engineer at Pure Storage
and now add a cooperative userbase that expects fast change
and fast fix to their issues and you can almost develop in
production. Neartime with (if needed) roll back to now -1 . I can
modify my running environment in such small increments, that
I have better stability too! Neat!! indeed!
November 9, 2012 at 11:01pm
Joel Lockbaum Senior Systems Engineer at Pure Storage
I followed you all the way to this sentence... Can you explain?
The one thing you need to know about this set of principles is that all
things change and the longer you take to test your solution in the live
environment the higher the risk of failure.
October 26, 2012 at 1:06pm
Praveen Gagan
A great representation for experts and beginners. Thanks for great
thoughts.
October 15, 2012 at 8:41am
Tom Kafa Brisbane, Queensland, Australia
Truly inspiring, in the fields of information technology we are aware that
we need to be continually learning and growing. The way of the Data
Scientist highlights the need to grow not just mentally and logically but
spiritually and with total fulfillment. Getting to the core of what and why that
makes us the brilliant creators and developers we are. Thanks for
making this amazing report and making me aware of the direction I want
to travel in work and in life.
March 25, 2012 at 6:17pm
Getulio Amorim The University of Nottingham
Demand for statisticians.
August 20, 2012 at 12:03pm
Copyright 2011. Data Scientists Pty Ltd ABN 45134581089 ACN 134581089