
Student Name : Mridul Shaik Mukthar

Student ID : 48790983

DATA7001 : Introduction to Data Science 1 2024


Case Study

For this case study I decided to consider Reports 3 and 4.

Report 3

A. Why was the analysis carried out? Who may be the
stakeholders? What value can it bring to stakeholders?

The analysis was carried out to identify the main reasons why
unemployment is high in Queensland, including the impact of Covid-19
and any other factors that could affect the employment rate. The
factors considered are national policy and the personal circumstances
of job seekers. The key stakeholders considered in this report are
mainly job seekers, students and the government. The value these
stakeholders can bring to employability is substantially higher than
that of other groups, as they are the ones who affect the society and
economy on which the nation's employment depends.
B. What was the format of the data used? What can you say about
the data size and specific format? Is missing data involved?

The report draws on data from three sources, and the datasets are
analyzed together. There are over 300 data entries, all of them
numerical. The data is structured, and because all the sources come
directly from the Australian government website, which is accurate
and well organized, the data quality is high. The file size is 195 MB
and the file is in PDF format. Some attributes had missing data: some
of the missing values were replaced with "0" and others were removed.
The month format was changed from month names to month numbers.
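
The preprocessing steps described above can be sketched roughly as follows. This is only a minimal illustration using the Python standard library: the report does not show its code, so the function name `clean_rows`, the field names and the sample records are all hypothetical.

```python
# Explicit month list so the mapping does not depend on the system locale.
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]
MONTH_NUMBER = {name: i for i, name in enumerate(MONTHS, start=1)}


def clean_rows(rows, fill_fields, required_fields):
    """Drop rows missing a required field, fill other gaps with 0,
    and convert month names to month numbers."""
    cleaned = []
    for row in rows:
        # Remove the record if a required value is missing.
        if any(row.get(field) is None for field in required_fields):
            continue
        new_row = dict(row)
        # Replace missing values in the chosen fields with 0.
        for field in fill_fields:
            if new_row.get(field) is None:
                new_row[field] = 0
        # "March" -> 3
        new_row["month"] = MONTH_NUMBER[new_row["month"]]
        cleaned.append(new_row)
    return cleaned


# Hypothetical sample records (not taken from the report).
sample = [
    {"month": "January", "unemployed": 120, "job_ads": None},
    {"month": "February", "unemployed": None, "job_ads": 40},
    {"month": "March", "unemployed": 110, "job_ads": 35},
]
result = clean_rows(sample, fill_fields=["job_ads"],
                    required_fields=["month", "unemployed"])
```

Here the February record is removed because a required value is missing, while the missing `job_ads` value in January is replaced with 0. Which attributes the report filled and which it removed is not stated, so that split is an assumption.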

C. Based on what you can understand - what
tools/algorithms/methods are used in the analysis?

First, data preprocessing was done, in which the authors trimmed the
data down to what they required. Data exploration was then carried
out for factors such as Covid-19, education and age, and linear
analysis was used to find the relationships between these factors.
Further analytic methods used here are regression lines, logistic
regression and graphs.
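
To illustrate what a logistic regression on one of these factors might look like, here is a minimal sketch in plain Python. It is not the report's model: the data (years of education against an employed/unemployed flag), the function names and the learning-rate settings are all assumptions made for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit p(y=1 | x) = sigmoid(w * x + b) by batch gradient descent
    on the average log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical data: years of education and whether the person is employed (1) or not (0).
years_of_education = [8, 9, 10, 11, 12, 13, 14, 15, 16, 17]
employed = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]

# Center the feature so gradient descent stays stable with this step size.
mean_years = sum(years_of_education) / len(years_of_education)
centered = [x - mean_years for x in years_of_education]

w, b = fit_logistic(centered, employed)


def predict(years):
    """Estimated probability of being employed for a given number of years."""
    return sigmoid(w * (years - mean_years) + b)
```

A positive `w` means that, in this made-up sample, more years of education go with a higher estimated probability of employment. In practice a library such as scikit-learn would normally be used instead of hand-written gradient descent.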

D. How can such analysis affect decisions? For application problem,
what are the important questions that need to be solved?

This kind of analysis can shed light on why unemployment is currently
present. Once the data has been analyzed and the relationships found,
better decisions and policies can be made. The important questions
that need to be solved concern how unemployment can be overcome by
addressing its underlying causes, which skills are in demand in the
labor market, the role of government policies, and regional
disparities in the unemployment rate.

E. Imagine you were speaking to stakeholders associated with the
dataset. Are there any questions you want to ask them?

The questions I would like to ask are: what initiatives has the
government planned to address unemployment; in what ways could
government support help reduce unemployment; what challenges do job
seekers anticipate in finding a job in their own field; and how long
they expect it to take, and how confident they are, of getting a job
after their education or any additional skills required for
employment.

Report 4

A. Why was the analysis carried out? Who may be the
stakeholders? What value can it bring to stakeholders?

According to the report, the analysis was done to understand which
types of crime occur at a high rate, to review the past situation in
Queensland, to identify which districts are more dangerous than the
rest, and to find relationships between crimes that reveal trends and
patterns over time. The initial crime statistics are collected by law
enforcement and show the frequency of crimes, including their
geographic distribution. The report identifies the potential
stakeholders as citizens, law enforcement, the state government and
businesses. The value gained varies by stakeholder: improved safety,
deploying more manpower into dangerous zones to decrease the crime
rate, and arranging operating hours accordingly so that businesses
remain profitable.

B. What was the format of the data used? What can you say about
the data size and specific format? Is missing data involved?

In this report, two datasets from the same source are used: a
month-by-month crime record and a district-wise crime record. Both
are structured data in tabular format. The district-wise data allows
comparison not only between months but also between districts, which
is useful for identifying patterns or differences in crime rates
across areas. The authors loaded the CSV data into Python and cleaned
up the formatting. The report file is in PDF format, and there are
about 100 data entries, most of them numeric. The file size is
99.5 MB. Because the data is acquired from the government, there
should not be any missing data, and the data quality is high.

C. Based on what you can understand - what
tools/algorithms/methods are used in the analysis?

The report shows that data cleaning was first done on the dataset
using Python, and the month data was then merged with the years.
Using different types of graphs, plots and map visualizations, the
authors checked for relationships by making the data confess.
Finally, a linear regression model was fitted to find the trend in
how the number of assault cases changes over time.
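
The trend fit mentioned above can be sketched as an ordinary least-squares line. The example below is a minimal illustration with made-up yearly counts; the numbers and the function name `fit_line` are hypothetical and do not come from the report.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: years since the first record and assault cases per year.
years_since_start = [0, 1, 2, 3, 4]
assault_cases = [100, 108, 113, 121, 128]

slope, intercept = fit_line(years_since_start, assault_cases)
# A positive slope means the number of cases is trending upward over time.
```

For this invented sample the fitted slope is about 6.9 additional cases per year. With real data, the same line could be drawn over a scatter plot to check visually how well a straight trend fits.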
D. How can such analysis affect decisions? For application problem,
what are the important questions that need to be solved?

This kind of analysis can affect the initial decisions made by
others. After analyzing the data in depth and finding the
relationships, better decisions can be made. The most pressing
problem that needs to be solved is how to control crime rates in the
state.

E. Imagine you were speaking to stakeholders associated with the
dataset. Are there any questions you want to ask them?

If I had to ask anything, it would relate to current initiatives on
how crime rates can be reduced, for example by keeping people out of
dangerous zones and building a better community. I would also ask
whether there will be more safety protocols for the police, and I
would suggest running programs to prevent crime.
