
Class – X Subject: Artificial Intelligence

Chapter 2: AI Project Cycle

1. What are the stages of the AI Project Cycle?

The AI Project Cycle has the following main stages:

1. Problem Scoping
2. Data Acquisition
3. Data Exploration
4. Data Modelling
5. Evaluation

2. What is meant by Problem Scoping?

Problem scoping refers to identifying a problem and having a vision to solve it.

3. Explain the 4Ws canvas used in Problem Scoping.

The 4Ws are very helpful in problem scoping. They are:

1. Who? – Identifies who is facing the problem and who the stakeholders of the problem are
2. What? – Identifies what the problem is and how we know about it
3. Where? – Relates to the context, situation or location of the problem
4. Why? – Explains why we need to solve the problem and what benefits the stakeholders will get once it is solved

4. What are the important aspects of problem scoping?

The following are a few key points:

 When you start an AI project or model, you need to do problem scoping first.
 It is the process of figuring out the problem and its possible solutions.
 The AI project must have a problem statement with the required clarity.

5. Define problem statement template.

Once the 4Ws canvas is completely filled, we prepare a summary of the 4Ws. This summary is
known as the problem statement template. The template explains all the key points in a single
place, so if the same problem arises in the future, this statement helps to resolve it easily.
Our ______________ [Stakeholder(s)] (Who) – write about the stakeholders

has/have a problem that ______________ [issue, problem, need] (What) – describe the problem
and the need for a solution

when/while ______________ [context, situation] (Where) – describe the context, situation or
location of the problem

an ideal solution would be ______________ [benefit of the solution for them] (Why) – describe
the solution and its benefit
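The filled template can be sketched in code as well. This is a minimal illustration; the stakeholder, problem, context and benefit strings below are hypothetical example values, not part of the chapter.

```python
# The problem statement template as a single sentence, with slots for the 4Ws.
TEMPLATE = ("Our {who} has/have a problem that {what} "
            "when/while {where}; an ideal solution would be {why}.")

# Hypothetical example values for each slot of the 4Ws canvas.
statement = TEMPLATE.format(
    who="school students",                          # [Stakeholder(s)] - Who
    what="they cannot find study material easily",  # [issue, need]    - What
    where="preparing for exams",                    # [context]        - Where
    why="a single searchable notes portal",         # [benefit]        - Why
)
print(statement)
```

Filling each slot forces you to answer every one of the 4Ws before moving to the next stage.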

6. Define Data Acquisition


Data Acquisition consists of two words:

1. Data: Data refers to raw facts, figures or statistics collected for reference or analysis.
2. Acquisition: Acquisition refers to acquiring data for the project.

The stage of acquiring data from the relevant sources is known as data acquisition.

7. Write short notes about data features.

Data features refer to the type of data that we want to collect. Two terms are associated with
this:
1. Training Data: The part of the collected data that is fed to the model so that it can learn. The
model is built on this data.
2. Testing Data: The part of the collected data kept aside to check the model. It is used to
evaluate how well the trained model performs.

8. What are the various methods of data acquisition?

The most common methods of data acquisition are:

1. Surveys: Through Google Forms, Microsoft Forms or any other interface
2. Web Scraping: Some tools are Scrapy, ScrapeHero Cloud, ParseHub, OutWit Hub,
Visual Web Ripper, Import.io
3. Sensors: To convert physical parameters into electrical signals, condition those signals,
and convert the conditioned signals into digital values
4. Cameras: To capture images
5. Observations: A way of gathering data by watching behaviour or events, or noting physical
characteristics, in their natural setting
6. API (Application Programming Interface)

9. Mention some of the open-source datasets for data acquisition.

Here is a list of open-source dataset sources:

●Lionbridge AI
●Amazon Mechanical Turk
●LabelBox
●Figure Eight
●Kaggle
●http://mospi.nic.in/data

Big Data for AI Project Cycle


 Big data is a collection of data that is huge in volume, yet growing exponentially with time.
 It is data of such large size and complexity that no traditional data management tool can
store or process it efficiently.

Examples of Big Data


 Stock Exchange
 Social Media Websites
 YouTube and web series platforms
Types of Big Data
There are three types of big data:

1. Structured: Has a pattern, is usually stored in tabular form and is accessed by applications
like MS Excel or a DBMS. Examples: employees' data of a company, result dataset of a board.
2. Semi-structured: No well-defined structure, but the data is categorized using some meta
tags. Examples: HTML pages, CSV files.
3. Unstructured: Without any structure, not defined in any framework. Examples: audio/video
files, social media posts.

Training, Testing and Validation of Data


 Training set: The data the model is trained on.
 Validation set: Data the model has not been trained on, used to tune hyperparameters.
 Test set: In principle the same as the validation set, but used only once at the very end,
after the model has been tuned.
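The three-way split above can be sketched in a few lines of plain Python. The fractions and the function name here are our own choices for illustration; real projects often use library helpers instead.

```python
import random

def train_val_test_split(data, val_frac=0.15, test_frac=0.15, seed=42):
    """Shuffle the data and split it into training, validation and test sets."""
    items = list(data)
    random.Random(seed).shuffle(items)   # fixed seed so the split is repeatable
    n = len(items)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    test = items[:n_test]                 # held out, used once at the end
    val = items[n_test:n_test + n_val]    # used to tune hyperparameters
    train = items[n_test + n_val:]        # used to fit the model
    return train, val, test

train, val, test = train_val_test_split(range(100))
print(len(train), len(val), len(test))  # 70 15 15
```

Note that every record lands in exactly one of the three sets, which is what keeps the evaluation honest.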

Data Exploration
Data Exploration refers to the techniques and tools used to visualize data and uncover the
patterns in it, often with the help of statistical methods.

Advantages of Data Visualization


 A better understanding of data
 Provides insights into data
 Allows user interaction
 Provide real-time analysis
 Help to make decisions
 Reduces complexity of data
 Provides the relationships and patterns contained within data
 Define a strategy for your data model
 Provides an effective way of communication among users

So far you have learned about problem scoping and data acquisition: you have set the goal for
your AI project and found ways to acquire data. The main problem with the acquired data is that
it is complex, because it mostly consists of raw numbers. To make use of these numbers, the user
needs a pattern or picture to understand the data.

For example, suppose you are going to read a book. You go to the library, turn the pages of a
few books to get an overview, and then select the book of your choice. Similarly, when you are
going to analyse data, you first need to use data visualization to get an overview of it.

Data Visualization Tools


There are many data visualization tools available.

Here is a list of 20 data visualization tools. There are many more tools available, and their
number is increasing day by day.

1. Microsoft Excel
2. Tableau
3. Qlikview
4. FusionCharts
5. DataWrapper
6. MS Power BI
7. Google Data Studio
8. Sisense
9. Highcharts
10. Xplenty
11. HubSpot
12. Whatagraph
13. Adaptive Discovery
14. Teammate Analytics
15. Jupyter
16. Dundas BI
17. Infogram
18. Google Charts
19. Visme
20. Domo

Do some research and learn how to visualize your data with the above tools.

How to select a proper graph for data visualization

Now you are familiar with various chart types. The next step is to select an appropriate chart
for data visualization. The choice of chart depends on the data and the goal you want to achieve
through your model. Some basic purposes of charts that help you select an appropriate one are as
follows:
1. Comparison of values – show values side by side, e.g. a Bar Chart
2. Comparison of trends – show changes over a period of time, e.g. a Line Chart
3. Distribution of data according to categories – show how data is spread across categories or
ranges, e.g. a Histogram
4. Highlight a portion of a whole – show each part's share of the total, e.g. a Pie Chart
5. Show the relationship between data – multiple charts can be used
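The guidelines above can be captured as a small lookup table. The goal names and the fallback suggestion below are our own illustrative choices, not a fixed standard.

```python
# Map a visualization goal to the chart type suggested in the notes above.
CHART_FOR_GOAL = {
    "compare values": "Bar Chart",
    "compare trends over time": "Line Chart",
    "show distribution": "Histogram",
    "highlight part of a whole": "Pie Chart",
    "show relationships": "Multiple charts",
}

def suggest_chart(goal):
    """Return a suggested chart type, with a safe default for unknown goals."""
    return CHART_FOR_GOAL.get(goal, "Start with a bar chart, then refine")

print(suggest_chart("compare trends over time"))  # Line Chart
```

Treat this only as a starting point: the final choice always depends on the actual data and the audience.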

Need for data visualization

 Quickly get a sense of the trends, relationships and patterns contained within the data.
 Define a strategy for which model to use at a later stage.
 Communicate the findings to others effectively.
 To visualize data, we can use various types of visual representations.



Modelling
Now you are entering the modelling stage. So let’s explore the terms for it:

 Artificial Intelligence, or AI, refers to any technique that enables computers to mimic
human intelligence.
 Machine Learning, or ML, enables machines to improve at tasks with experience. The
machine learns from its mistakes and takes them into consideration in the next execution.
 Deep Learning, or DL, enables software to train itself to perform tasks with vast amounts
of data. In deep learning, the machine is trained with huge amounts of data, which helps it
train itself around the data.
 AI Modelling refers to developing algorithms, also called models, which can be trained to
give intelligent output. That is, writing code to make a machine artificially intelligent.
Types of AI models

A Rule-Based model refers to setting up rules and training the model accordingly. It follows a
fixed algorithm or code to train, test and validate data.

A Learning-based model refers to identifying the data by its attributes and behaviour and
training the model accordingly. There is no fixed code or algorithm to train, test and validate
the data; the model learns from past behaviour and the attributes received from the data.

Decision Tree
 A decision tree builds classification or regression models in the form of a tree structure.
 It breaks down a dataset into smaller and smaller subsets while, at the same time, an
associated decision tree is incrementally developed.
 The final result is a tree with decision nodes and leaf nodes.
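A decision tree can be pictured as nested if/else checks: each `if` is a decision node and each returned value is a leaf node. The marks-to-grade thresholds below are illustrative, not an official grading scheme.

```python
# A hand-written decision tree for a marks-to-grade example.
def grade(marks):
    # Each 'if' is a decision node; each returned letter is a leaf node.
    if marks >= 80:
        return "A"
    elif marks >= 60:
        return "B"
    elif marks >= 40:
        return "C"
    else:
        return "D"

print([grade(m) for m in (95, 72, 45, 20)])  # ['A', 'B', 'C', 'D']
```

A learning algorithm builds the same kind of structure automatically by repeatedly splitting the dataset on the most informative attribute.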

Types of learning
There are three types of learning:

1. Supervised
2. Unsupervised
3. Reinforcement

Supervised Learning

 The dataset which is fed to the machine is labelled.


 A label is some information which can be used as a tag for data.
 For example, students get grades according to the marks they secure in examinations.
 These grades are labels which categorize the students according to their marks.
 Classification
o Where the data is classified according to the labels.
o The entries are normally divided into two classes.
o A boundary condition is defined to classify the data.
 Regression
o Regression deals with continuous data.
o For example, if we know the growth rate, we can predict the salary of someone after
a certain number of years.
o Regression can be linear as well as non-linear.
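The salary example above can be worked out with a simple least-squares line fit. The salary figures below are made up for illustration; they grow by a fixed amount each year so the fit is exact.

```python
# Fit a straight line y = m*x + c through (x, y) points by least squares.
def fit_line(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by variance of x.
    m = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
    c = mean_y - m * mean_x
    return m, c

years = [1, 2, 3, 4]
salary = [30, 35, 40, 45]   # hypothetical salary in thousands per year
m, c = fit_line(years, salary)
print(m, c)           # 5.0 25.0  (growth rate and starting point)
print(m * 6 + c)      # predicted salary after 6 years: 55.0
```

This is linear regression in its simplest form; non-linear regression fits curves instead of a straight line.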

Unsupervised Learning

 An unsupervised learning model works on an unlabelled dataset.


 This means that the data which is fed to the machine is random and there is a possibility
that the person who is training the model does not have any information regarding it.
 The unsupervised learning models are used to identify relationships, patterns and trends
out of the data which is fed into it.
 It helps the user in understanding what the data is about and what are the major features
identified by the machine in it.
 Clustering
o Refers to the unsupervised learning algorithm which can cluster the unknown data
according to the patterns or trends identified out of it.
o The patterns observed might be ones already known to the developer, or the model
might even come up with some unique patterns of its own.
 Dimensionality reduction
o We humans are able to visualize only up to 3 dimensions.
o If we have a ball in our hand, it is in 3 dimensions right now.
o But if we click its picture, the data transforms to 2-D.
o Hence, to reduce the dimensions while still being able to make sense of the data, we
use Dimensionality Reduction.
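Clustering can be sketched with a bare-bones 1-D k-means: points are repeatedly assigned to their nearest centre, and each centre moves to the average of its points. The data values and the naive initialisation are our own illustrative choices.

```python
# A minimal 1-D k-means sketch: group numbers around k centres.
def kmeans_1d(points, k=2, steps=10):
    centres = sorted(points)[:k]          # naive initialisation
    clusters = [[] for _ in range(k)]
    for _ in range(steps):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest centre.
            i = min(range(k), key=lambda j: abs(p - centres[j]))
            clusters[i].append(p)
        # Move each centre to the mean of its cluster (keep it if empty).
        centres = [sum(c) / len(c) if c else centres[i]
                   for i, c in enumerate(clusters)]
    return centres, clusters

centres, clusters = kmeans_1d([1, 2, 3, 10, 11, 12])
print(sorted(centres))  # [2.0, 11.0]
```

No labels were given, yet the algorithm separates the two groups purely from the pattern in the data, which is the essence of unsupervised learning.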
Reinforcement Learning

 Reinforcement Learning is defined as a Machine Learning method that is concerned with
how software agents should take actions in an environment.
 Reinforcement Learning helps you to maximize some notion of cumulative reward.

Evaluation
Once a model has been made and trained, it needs to go through proper testing so that one can
calculate its efficiency and performance. Hence, the model is tested with the help of the Testing
Data (which was separated out of the acquired dataset at the Data Acquisition stage), and the
efficiency of the model is calculated on the basis of the following parameters:

1. Accuracy

2. Precision

3. Recall

4. F1 score
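All four parameters can be computed by comparing predicted labels with the actual labels from the testing data. The labels below are a made-up example, and the formulas follow the standard definitions for a binary (positive/negative) problem.

```python
# Compute accuracy, precision, recall and F1 score for binary labels.
def evaluate(actual, predicted, positive=1):
    tp = sum(1 for a, p in zip(actual, predicted) if a == positive and p == positive)
    fp = sum(1 for a, p in zip(actual, predicted) if a != positive and p == positive)
    fn = sum(1 for a, p in zip(actual, predicted) if a == positive and p != positive)
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    accuracy = correct / len(actual)                      # all correct / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0      # of predicted positives
    recall = tp / (tp + fn) if (tp + fn) else 0.0         # of actual positives
    f1 = (2 * precision * recall / (precision + recall)   # harmonic mean
          if (precision + recall) else 0.0)
    return accuracy, precision, recall, f1

actual    = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 0, 1, 1, 0]
print(evaluate(actual, predicted))  # (0.75, 0.75, 0.75, 0.75)
```

F1 balances precision and recall in a single number, which is why it is often preferred when the two classes are imbalanced.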
