Assignment 2
to your initiative. In other words, how well will the existing tools and technology in place
The benefits of data science and its applications are countless. In the current business environment,
where data science has been firmly embraced by various corporations and businesses, there is a need
to develop and apply tools that help business leaders make informed decisions. Currently there are
thousands of data tools available for data analysis. As a data scientist, the goal is to use data
analysis to inspect, clean, transform, and model the data at your disposal in order to discover
useful information, suggest conclusions, and support decision making. Hence the need to use
appropriate analytic tools to analyze the survey output collated, in this case, from the community
in the city of Concordia, with the goal of finding the connections that may exist among the various
community organizations and drawing out common collaborations or linkages to meet some of the needs
in the community.
This study source was downloaded by 100000831318803 from CourseHero.com on 09-05-2021 10:39:18 GMT -05:00
https://www.coursehero.com/file/35716588/Assignment-2docx/
In assessing and choosing an appropriate tool to analyze the data at hand, one has to bear in mind
that the tool must be able to explore the data to find patterns and relationships, and also be able
to apply appropriate statistical techniques to determine whether hypotheses about the data set are
true or false (Hoaglin, 2003). Data analytic tools are separated into quantitative data analysis and
qualitative data analysis. My initiative will employ quantitative analytic tools, which involve the
analysis of numerical data with quantifiable variables that can be compared or measured
statistically. The goal is to use data analytics, which includes data mining, that is, sorting
through large data sets to identify trends, patterns and relationships (such as in association
rules); predictive analytics, which seeks to predict customer behavior and other future events; and
machine learning, an artificial intelligence technique that uses automated algorithms to churn
through data sets more quickly (such as in clustering techniques). With an array of such tools at
my disposal, I can apply data mining, predictive analytics and machine learning to the sets of data
collated from my initiative.
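To make the association rule idea concrete, here is a minimal sketch of how the support and
confidence of a single rule could be computed. It is written in Python rather than R purely for
illustration. The `responses` list and the interest labels in it are hypothetical and do not come
from the Concordia survey.

```python
# Hypothetical survey responses: each set lists the interests one participant ticked.
# These values are invented for illustration only.
responses = [
    {"youth_sports", "park_cleanup"},
    {"youth_sports", "park_cleanup", "food_bank"},
    {"food_bank"},
    {"youth_sports", "food_bank"},
    {"park_cleanup"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent) from the observed transactions."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0  # the antecedent never occurs, so the rule cannot be evaluated
    return support(antecedent | consequent, transactions) / base


# Rule under test: participants interested in youth_sports are also interested in park_cleanup.
rule_support = support({"youth_sports", "park_cleanup"}, responses)
rule_confidence = confidence({"youth_sports"}, {"park_cleanup"}, responses)
print(f"support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

A rule with high support and high confidence would suggest two interests that tend to occur
together, which is one possible way to surface linkages between community organizations.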
Applicability to initiative
At this point, to test the applicability of the tools to the initiative, there is a need to build an
analytical model using predictive modeling tools and analytics software or programming languages
such as R. Using R, the model will initially be run against a partial data set to test its accuracy
and output; it will then be revised and tested again until it functions as intended. Once this is
achieved, the model is run in production mode against the full data set.
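The workflow described here (fit on a partial data set, check the output, then run on the full data
set) could look roughly like the sketch below. It uses Python rather than R for illustration only.
The data are synthetic, and the variable meanings, the every-other-record split, and the error
threshold of 1.0 are all assumptions rather than anything taken from the initiative.

```python
import random

# Hypothetical data: (years lived in the community, organizations joined).
# Generated synthetically so the sketch is self-contained; not real survey data.
random.seed(0)
data = [(x, 0.5 * x + 1 + random.uniform(-0.2, 0.2)) for x in range(1, 41)]


def fit_line(points):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_abs_error(model, points):
    """Average absolute gap between predicted and observed y."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in points) / len(points)


# Step 1: build the model on a partial data set and check it on the records left out.
partial, held_out = data[::2], data[1::2]
trial_model = fit_line(partial)
trial_error = mean_abs_error(trial_model, held_out)

# Step 2: once the error is acceptable (the threshold is an assumption), use the full data set.
if trial_error < 1.0:
    final_model = fit_line(data)
else:
    final_model = None  # revise the model and test again
print(trial_error, final_model)
```

In practice the revise-and-retest loop would repeat until the held-out error is acceptable; the
sketch shows a single pass.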
F. Tool Applicability to Data: Assess the applicability of the existing tools for the data you
have or will have, based on your analysis of the characteristics of that data. In other words,
how fitting are the existing tools for the data, considering the various forms the data may
take?
With the countless new analytics tools at our disposal, it is sometimes difficult to select the
best choice for the task at hand. The overall goal is to select an application that is:
2. Intuitive and possesses an easy-to-use interface. The analytics tool should be capable of
patterns in data
5. The ability to define test parameters for analytics models
The existing tool, R, is applicable to the data available. The R language has all the standard data
analysis tools for accessing data in different formats. R provides tools for traditional and modern
statistical models such as ANOVA and regression, making it easy to extract and merge the required
information from the data available. With the data from the initiative at hand, basic statistics
can easily be applied in R to assess the mean, maxima, minima and standard deviation of the
responses of the survey participants. Using R’s statistical functions, we can draw out the averages
of people in a particular social group who are interested in a particular project and how this links
or connects to other interests expressed by participants in other groups. R runs on both Windows
and macOS, making it suitable for the data at hand. Using R, which is widely used for predictive
analytics, the data available can be used to predict the commonalities or linkages that exist
between the various social organizations in the city and so draw out common collaborations or
linkages.
Also, with its ability to create pleasing graphics, R is a strong visualization and graphics tool.
Relationships such as “time spent in community and whether one is employed or not” or “gender and
whether one belongs to a professional grouping or not” can all be graphically represented using R,
thus making it well suited to the data currently collated.
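As a rough illustration of the basic statistics mentioned above (mean, minimum, maximum, standard
deviation, and per-group averages), the sketch below uses Python's standard library rather than R.
The group labels echo those named in this assignment, but the rows, field names, values, and the
five-year cutoff are invented for the example.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical survey rows; the field names and values are invented for illustration.
rows = [
    {"group": "Social_Club", "employed": True, "years_in_community": 4},
    {"group": "Social_Club", "employed": False, "years_in_community": 10},
    {"group": "Hobbies", "employed": True, "years_in_community": 6},
    {"group": "Hobbies", "employed": True, "years_in_community": 8},
    {"group": "Family", "employed": False, "years_in_community": 12},
]

# Overall summary statistics for one numeric response.
years = [r["years_in_community"] for r in rows]
summary = {
    "mean": mean(years),
    "min": min(years),
    "max": max(years),
    "stdev": stdev(years),  # sample standard deviation
}

# Average years in the community for each social group.
by_group = defaultdict(list)
for r in rows:
    by_group[r["group"]].append(r["years_in_community"])
group_means = {g: mean(v) for g, v in by_group.items()}

# Cross-tabulate employment status against long-term residency
# (more than 5 years, an arbitrary cutoff chosen for this sketch).
crosstab = defaultdict(int)
for r in rows:
    crosstab[(r["employed"], r["years_in_community"] > 5)] += 1

print(summary)
print(group_means)
print(dict(crosstab))
```

The cross-tabulation at the end is the kind of table that a chart of “time spent in community and
whether one is employed or not” would be drawn from.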
G. Tool Recommendations: This course covers many analytic tools and technologies,
including their benefits and limitations for various uses and data. Recommend two tools that
are not already used and could reasonably be applied to your initiative. Assess the
applicability and value of these tools as they relate to your available and planned data and the
Two data analytic tools that could be used with the available data are:
1. Tableau
2. Python
As a data analytic tool, Tableau is a data visualization software package that enables one to
explore data and make all kinds of analyses and observations. It automatically detects the type of
data and suggests suitable ways to present it. With the data at hand, various scenarios can be
explored and their outcomes analyzed. With its ability to access files in different formats,
Tableau can be applied to my current data to analyze and visualize it effectively. Tableau makes it
easier to create powerful visual information that communicates what is important better than a
spreadsheet. Hence, by using Tableau, commonalities that exist between the various social groupings
in the community can easily be graphed and visualized. Thus, I will be able to come up with
graphical representations of variables such as Family, Hobbies, Social_Club, etc. and how they
influence the choice of common projects by the survey participants.
Python as an analytic tool is powerful and easy to learn. Over time, analytics features have been
added, making it increasingly popular with developers who want to build analytics applications but
need a more general-purpose language than R. R is built primarily for statistical analysis, whereas
Python can do analytics plus many other things, including machine learning. Hence, as a more
general-purpose tool than R, Python can be used to analyse the available data and draw out
connections and linkages in terms of common projects that the various groups in the community will
be interested in working on, to further enhance the mission of the city of Concordia.
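One way Python might be used to surface such linkages is sketched below: for every pair of groups,
it lists the projects both are interested in. The group names follow those used in this assignment,
but the `group_projects` mapping and the project labels are hypothetical.

```python
from itertools import combinations

# Hypothetical mapping of community groups to the projects their members favour.
group_projects = {
    "Family": {"playground", "food_drive"},
    "Hobbies": {"art_fair", "playground"},
    "Social_Club": {"food_drive", "art_fair", "playground"},
}


def shared_projects(groups):
    """Return {(group_a, group_b): projects both groups are interested in}."""
    links = {}
    for a, b in combinations(sorted(groups), 2):
        common = groups[a] & groups[b]
        if common:  # keep only pairs that actually share a project
            links[(a, b)] = common
    return links


links = shared_projects(group_projects)
for (a, b), common in links.items():
    print(f"{a} <-> {b}: {sorted(common)}")

# Projects that every group shares are natural candidates for collaboration.
common_to_all = set.intersection(*group_projects.values())
print(sorted(common_to_all))
```

With real survey data, the sets would be built from the participants' responses rather than typed
in by hand.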
Reference
Hoaglin, D. (2003). John W. Tukey and Data Analysis. Statistical Science, 18(3), 311-318.