
Volume 7, Issue 6, June – 2022 International Journal of Innovative Science and Research Technology

ISSN No:-2456-2165

Machine Learning Algorithms and its Applications: A Survey
RAJA IRFAN AHMAD MIR

Abstract:- In today's world, large volumes of electronic data are generated in each and every field. The data that we obtain contains valuable information that can be used to forecast the future. Owing to its enormous volume, manual forecasting is an intricate task for humans. To overcome this problem, a data model is built in such a way that it can predict the future with the aid of training and test datasets. To train the machine or the data model, numerous types of machine learning (ML) algorithms and tools are available. This paper focuses on a review of a few machine learning (ML) algorithms and methods used in numerous applications and domains in a detailed manner.

Keywords:- Artificial Intelligence (AI), Machine Learning (ML), Algorithms, Neural Networks (NN), Least Square Method.

I. INTRODUCTION

Machine Learning (ML) is a subset of Artificial Intelligence (AI). Using Machine Learning (ML), we can make applications learn from experience in the same way as humans do. When data is fed into these applications, they learn, grow and change according to experience. This is done by using algorithms that learn from data in an iterative process. Applications that use ML use pattern recognition to respond to the various data fed to them as input. Machine learning is the ability of an application to react to new input data through iteration. Machine learning algorithms help the system learn how to predict outputs based on previous examples given to the system and on the relationship between the input data and the output data, which together are known as the training data set. The relationship between the inputs and outputs of a model is gradually improved by testing its predictions and correcting them when a wrong output is obtained. Machine learning (ML) is a set of automated methods for recognizing different patterns in data.

Machine Learning (ML) is a way of building a method similar to the line of best fit, also called the Least Square Method. It is beneficial to automate this method when the data has numerous features and is very complex.

II. DIFFERENT TYPES OF MACHINE LEARNING

A. Supervised Learning
Supervised learning (SL) is a type of machine learning in which the data selected as training data is a set of labelled data, that is, data with a previously known output value. The problems solved through supervised learning are classification and regression.

B. Unsupervised Learning
Unsupervised learning is a type of technique that does not use a training set but instead finds patterns or structure in the already fed data by itself. Clustering problems are solved by an unsupervised learning approach.

C. Semi-Supervised Learning
Semi-supervised learning is a type of learning approach that uses unlabelled data together with a small quantity of labelled data as input. Using a small quantity of labelled data can significantly increase the effectiveness of unsupervised learning tasks. The model or machine essentially learns the structure needed to organize the data and is then ready to make predictions.

D. Reinforcement Learning
Reinforcement learning is a type of learning approach that uses the data received as input from the environment as a stimulus and checks how the model or machine should respond to a given input. Feedback is not produced through a training process as in supervised learning, but as rewards or penalties in the environment. Reinforcement learning is used in robot control.

III. WORKING OF MACHINE LEARNING ALGORITHM

The machine learning process starts with feeding data to an algorithm. This process of entering data into an algorithm is called training the algorithm. To check whether an algorithm is working, data is fed into the algorithm and the results are checked. If the desired results are not achieved, the algorithm is re-trained with the given data and tested again. This process is repeated until the desired results are achieved. This helps the machine learning algorithm learn and give the desired output, and it increases the accuracy of the result.
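The train, test and re-train cycle described in Section III can be illustrated with a minimal sketch. It fits a line of best fit by least squares using gradient descent, checks the error on held-out test data after every training round, and stops once a target error is reached. The synthetic data (points around y = 2x + 1), the learning rate, the target error and all function names are assumptions made for illustration only; they are not taken from the paper.

```python
import random


def make_data(n, rng):
    """Synthetic points around the line y = 2x + 1 (an illustrative assumption)."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = 2.0 * x + 1.0 + rng.gauss(0.0, 0.1)
        data.append((x, y))
    return data


def mean_squared_error(params, data):
    """Average squared distance between predictions and known outputs."""
    slope, intercept = params
    return sum((slope * x + intercept - y) ** 2 for x, y in data) / len(data)


def train(params, data, learning_rate=0.1, steps=50):
    """One training round: a few gradient-descent steps on the squared error."""
    slope, intercept = params
    n = len(data)
    for _ in range(steps):
        grad_slope = sum(2 * (slope * x + intercept - y) * x for x, y in data) / n
        grad_intercept = sum(2 * (slope * x + intercept - y) for x, y in data) / n
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


def train_until_good(target_mse=0.02, max_rounds=100, seed=0):
    """Train, test, and re-train until the test error is low enough."""
    rng = random.Random(seed)
    train_set = make_data(200, rng)
    test_set = make_data(100, rng)
    params = (0.0, 0.0)
    history = []  # test error after each training round
    for _ in range(max_rounds):
        params = train(params, train_set)
        history.append(mean_squared_error(params, test_set))
        if history[-1] <= target_mse:
            break  # desired result achieved, stop re-training
    return params, history


params, history = train_until_good()
print(f"rounds={len(history)}, test MSE={history[-1]:.4f}, "
      f"slope={params[0]:.2f}, intercept={params[1]:.2f}")
```

Keeping the test set separate from the training set is the design choice that matters here: the stopping decision is based on data the model has not been fitted to.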

IV. HOW DOES MACHINE LEARNING WORK?

In the machine learning (ML) process, the quality of the information that the external environment provides to the system is the primary factor. The external environment feeds this information to the system in some defined form, and the information represents the sources of outside data. Learning can be defined as

IJISRT22JUN901 www.ijisrt.com 778


the process that converts external facts into knowledge: first it obtains information from the outside environment, then it processes that information into knowledge, and finally it places this knowledge into the repository. The repository stores many general principles that guide part of the execution. Because the environment supplies all kinds of facts to the learning system, the quality of those facts directly determines whether learning is easy or messy.

The repository is the second factor that influences the design of a machine learning system. Knowledge can be represented in diverse forms, such as eigenvectors, first-order logic statements, production rules, semantic networks, frames and many more, and each form of representation has its strong points. Four aspects are taken into consideration when choosing a representation: (a) it is strong in expression, (b) it is easy to reason with, (c) the repository is easy to modify, and (d) the knowledge is easy to extend. Execution can be defined as the process that uses the knowledge in the repository to complete a certain task and feeds the facts obtained while completing the task back to the learning process, which helps to guide the system in further study.

V. DIFFERENT TYPES OF MACHINE LEARNING ALGORITHMS

Algorithms can be grouped into different types according to their similarity in function, that is, how the algorithm works. Examples are tree-based methods and neural network inspired methods. This is one of the most useful ways to group algorithms, and it is the approach used here. It is a suitable grouping technique, but it is not perfect. Some algorithms could just as easily fit into several categories, such as Learning Vector Quantization (LVQ), which is both a neural network inspired (NN) method and an instance-based method. There are also categories whose name describes both the problem and the class of algorithm, such as Regression and Clustering.

A. Decision Tree Based Classification
The decision tree algorithm is a type of classification technique that builds a model in the form of a tree-like structure (root, branch and leaf). The model is inferred from earlier data and is used to classify or predict the class or target variable of new data with the help of decision rules. Decision trees are mainly used with numerical as well as categorical data. The algorithm follows a greedy search approach that proceeds from top to bottom.

a) Advantages:
• Easy to implement
• Can classify and predict categorical as well as numerical data
• Less data pre-processing
• Statistical tests can be done to validate the tree model
• Resembles human decision-making
• Tree structure is easily understandable through visualization

b) Limitations:
• Low prediction accuracy
• Complexity in calculations if the number of class labels is huge
• Need to redraw the tree for each addition to the data set
• Probability of over-fitting in the decision tree is high

c) Applications:
• Agriculture
• Medicine
• Financial analysis

B. Support Vector Machines
The key objective of a support vector machine (SVM) is to find a hyperplane that divides the data into two classes. Depending on the values obtained relative to the hyperplane, a data point is placed in the class that it most resembles. Two rules must be taken into consideration when drawing a hyperplane: first, the hyperplane must be chosen so that it separates the two classes; second, the best separating hyperplane is the maximum-margin hyperplane.
SVM can be classified into two different types: a) Linear SVM b) Non-Linear SVM

a) Advantages:
• Robust classifier for prediction problems
• Efficient if n(D) > n(S), that is, when the number of dimensions n(D) is greater than the number of samples n(S)
• Suitable for high dimensional data spaces
• Memory efficient

b) Disadvantages:
• Very slow in the test phase
• Not suitable for large and noisy data sets
• Classification error percentage increases if the wrong kernel is selected
• Memory consumption is high

c) Applications:
• Image classification
• Bioinformatics
• Face detection

C. K-Nearest Neighbours
This algorithm assigns a new data point to one of the predefined classes by measuring its distance to the already labelled data points and taking the k nearest of them into consideration. The data point is placed in the class to which its distance is smallest. The methods used to calculate the distance between two points normally use variable values between 0 and 1. The most commonly used distance measures are the Euclidean distance and the Hamming distance. This method is used in problems where a data set has to be classified. For continuous variables the Euclidean distance formula is used, and for categorical variables the Hamming distance is used.
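The K-Nearest Neighbours procedure described above can be sketched in a few lines. The example below classifies a query point by majority vote among its k nearest labelled points, using the Euclidean distance for continuous features and the Hamming distance for categorical features. The toy data sets, the function names and the choice k = 3 are assumptions made for illustration; they do not come from the paper.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance, suited to continuous features."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hamming(a, b):
    """Hamming distance: number of positions where categorical values differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train, query, k=3, distance=euclidean):
    """Return the majority label among the k training points closest to query."""
    neighbours = sorted(train, key=lambda item: distance(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical continuous data, already scaled to the range 0 to 1.
numeric_train = [
    ((0.10, 0.20), "A"), ((0.20, 0.10), "A"), ((0.15, 0.15), "A"),
    ((0.90, 0.80), "B"), ((0.80, 0.90), "B"), ((0.85, 0.85), "B"),
]

# Hypothetical categorical data (colour, shape).
categorical_train = [
    (("red", "round"), "apple"), (("red", "round"), "apple"),
    (("green", "round"), "apple"), (("yellow", "long"), "banana"),
    (("yellow", "long"), "banana"), (("green", "long"), "banana"),
]

print(knn_predict(numeric_train, (0.2, 0.2)))
print(knn_predict(categorical_train, ("yellow", "long"), distance=hamming))
```

Note that the whole training set is kept and scanned for every query, which is why the method needs large storage space.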

a) Advantages:
• Flexible with features and distance functions
• Supports multi-class data

b) Disadvantages:
• Finding an appropriate K value is difficult
• Needs a large sample for high accuracy
• Requires large storage space

D. Naïve Bayes Algorithm
This algorithm is used to classify data sets that are large and contain many records, for both multiclass and binary classification problems. It can be used for classification tasks in machine learning, mainly in text analysis and natural language processing. To understand the naïve Bayes algorithm, one must be familiar with Bayes' theorem, which is based on conditional probability. Conditional probability is the probability that an event will happen given that another event has already occurred. Naïve Bayes brings together different algorithms that share this common principle.

Types of Naïve Bayes:
• Gaussian Naïve Bayes
• Multinomial Naïve Bayes
• Bernoulli Naïve Bayes

a) Advantages:
• Fast
• Scalable
• Efficient for multinomially distributed and also for binary attribute values

b) Disadvantages:
• Not suitable for regression problems
• Cannot find relationships among attributes

c) Applications:
• Text, e-mail, and symbol analysis
• Recommendation systems

E. Linear Regression
This type of algorithm is used to find the relationship between an independent variable, also known as the predictor (X), and a dependent variable, also known as the criterion (Y), which can then be used to predict future values of the dependent variable.
In simple regression one independent variable is used, and in multiple regression two or more independent variables can be used, depending on the data set, to predict the future. The dependent variable takes continuous values, while the independent variables may take continuous or discrete values. Regression models are of two kinds: linear and non-linear. A linear regression model uses a straight line, whereas a non-linear regression model uses a curved relationship between the dependent and independent variables.

a) Advantages:
• Shows the relationship among dependent and independent variables
• Simple and easy to understand

b) Disadvantages:
• Not applicable for non-linear data
• Only predicts numerical output
• Data must be independent

c) Applications:
• Observational astronomy
• Finance

VI. COMPARISONS OF DIFFERENT ALGORITHMS

• PARAMETER ESTIMATION ALGO: the algorithm used to estimate the parameters that are used by the system or by the model.

• MODEL COMPLEXITY REDUCTION: the model used should not be complex, or we should reduce the complexity of our model.

• GENERATIVE OR DISCRIMINATIVE: The discriminative model is mainly used in machine learning, particularly for supervised learning. It is also known as a conditional model, and it learns the boundaries between classes or labels in a dataset.
Generative models fall in the class of statistical models that generate new data examples. These models are mostly used in unsupervised machine learning for tasks related to probability and likelihood estimation, modelling data points, and distinguishing between classes using these probabilities.

• LOSS FUNCTION: The loss function calculates the distance between the current output of the algorithm and the expected output. It is a method to evaluate how well your algorithm models the data. Loss functions can be categorized into two groups: one for classification (discrete values, 0, 1, 2, …) and the other for regression (continuous values). Some commonly used loss functions are Cross-entropy (CE), Log loss, Mean Square Error (MSE, L2), Mean Absolute Error (MAE, L1) and Huber Loss.
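The loss functions named under the LOSS FUNCTION criterion can be sketched as follows. The snippet gives plain-Python versions of Mean Square Error, Mean Absolute Error and Huber loss for regression, and of cross-entropy (log loss) for classification. The function names, the Huber threshold `delta` and the small clipping constant `eps` are assumptions chosen for illustration.

```python
import math


def mean_squared_error(y_true, y_pred):
    """MSE (L2): average of squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """MAE (L1): average of absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def huber_loss(y_true, y_pred, delta=1.0):
    """Huber loss: quadratic for small residuals, linear for large ones."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        residual = abs(t - p)
        if residual <= delta:
            total += 0.5 * residual ** 2
        else:
            total += delta * (residual - 0.5 * delta)
    return total / len(y_true)


def cross_entropy(y_true, probabilities, eps=1e-12):
    """Cross-entropy (log loss) for class indices and predicted class probabilities."""
    total = 0.0
    for label, probs in zip(y_true, probabilities):
        total += -math.log(max(probs[label], eps))  # clip to avoid log(0)
    return total / len(y_true)


# Regression example: one prediction is off by 2.
print(mean_squared_error([1, 2, 3], [1, 2, 5]))
print(mean_absolute_error([1, 2, 3], [1, 2, 5]))
print(huber_loss([1, 2, 3], [1, 2, 5]))

# Classification example: two classes, an undecided model.
print(cross_entropy([0, 1], [[0.5, 0.5], [0.5, 0.5]]))
```

MSE penalises the single large error more heavily than MAE does, while the Huber loss sits between the two, which is why it is often preferred when the data contains outliers.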

| S.NO | LEARNING METHOD | PARAMETER ESTIMATION ALGO | MODEL COMPLEXITY REDUCTION | GENERATIVE OR DISCRIMINATIVE | LOSS FUNCTION | DECISION FUNCTION |
|---|---|---|---|---|---|---|
| 1 | Gaussian Naïve Bayes | Estimate $\hat{\mu}$, $\hat{\sigma}^2$, and $P(Y)$ using maximum likelihood | Place a prior on the parameters and use MAP estimator | Generative | $-\log P(X, Y)$ | Equal variance: linear boundary. Unequal variance: quadratic boundary |
| 2 | Logistic Regression | No closed form estimates. Optimize objective function using gradient descent | L2 regularization | Discriminative | $-\log P(Y \mid X)$ | Linear |
| 3 | Decision Trees | Many algorithms: ID3, CART, C4.5 | Prune tree or limit tree depth | Discriminative | Either $-\log P(Y \mid X)$ or zero-one loss | Axis-aligned partition of feature space |
| 4 | K-Nearest Neighbours | Store all training data to classify new points. Choose K using cross validation | Increase K | Discriminative | Zero-one loss | Arbitrarily complicated |
| 5 | Support Vector Machines (with slack variables, no kernel) | Solve quadratic program to find boundary that maximizes margin | Reduce C | Discriminative | Hinge loss: $[1 - y(w^T x)]_+$ | Linear (depends on kernel) |

Table 1
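Row 1 of Table 1 can be made concrete with a minimal sketch of Gaussian Naïve Bayes. The parameters (class prior P(Y), per-feature mean and variance) are estimated by maximum likelihood, and a new point is assigned to the class with the smallest negative log joint probability, -log P(X, Y). The toy data, the function names and the small variance floor of 1e-9 (added only to avoid division by zero) are assumptions for illustration and not part of the paper.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(samples, labels):
    """Maximum-likelihood estimates of P(Y), feature means and feature variances."""
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    model = {}
    total = len(samples)
    for y, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))  # one tuple of values per feature
        means = [sum(col) / n for col in columns]
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9  # assumed variance floor
            for col, m in zip(columns, means)
        ]
        model[y] = (n / total, means, variances)
    return model


def negative_log_joint(model, x, y):
    """-log P(X, Y) under the naive independence assumption."""
    prior, means, variances = model[y]
    loss = -math.log(prior)
    for value, mean, var in zip(x, means, variances):
        loss += 0.5 * math.log(2 * math.pi * var) + (value - mean) ** 2 / (2 * var)
    return loss


def predict(model, x):
    """Pick the class with the smallest negative log joint probability."""
    return min(model, key=lambda y: negative_log_joint(model, x, y))


# Hypothetical two-feature data with two well separated classes.
samples = [(1.0, 2.0), (1.2, 1.8), (0.8, 2.2), (5.0, 6.0), (5.2, 5.8), (4.8, 6.2)]
labels = ["low", "low", "low", "high", "high", "high"]

model = fit_gaussian_nb(samples, labels)
print(predict(model, (1.1, 2.1)))
print(predict(model, (5.1, 5.9)))
```

Because each class gets its own variance estimate here, the resulting decision boundary is in general quadratic, in line with the "unequal variance" entry of the table.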

VII. RISKS IN MACHINE LEARNING

A. Data Poisoning
Data plays a vital role in the security of any ML system, because the ML system learns to do what it does directly from the data given to it. If an attacker can deliberately manipulate the data used by an ML system in a coordinated fashion, the whole ML system can be compromised. Data poisoning attacks need special attention. ML engineers must consider what fraction of the training data an attacker can control, and to what extent.

B. Data Privacy/Secrecy
Data protection is difficult enough without throwing ML into the mix. One unique challenge in ML is protecting sensitive or private data that, through training, is built right into a model. Subtle but effective extraction attacks against an ML system's data are an important category of risk.

C. Online System Manipulation
When an ML system continues to learn during operational use, modifying its behaviour over time, it is said to be an online system. In this situation, a cunning attacker can use system input to deliberately nudge the still-learning system in the wrong direction, gradually retraining the ML system to do the wrong thing. It is worth noting that such an attack might be both subtle and simple to carry out. To handle this risk adequately, ML engineers must take into account data provenance, algorithm selection, and system operations.

D. Over-fitting
When a model learns the detail and noise in the training data to the point where it degrades the model's performance on fresh data, this is known as overfitting. This means that the model picks up on noise or random fluctuations in the training data and learns them as concepts. The difficulty is that these concepts do not apply to fresh data, which harms the model's ability to generalise. Nonparametric and nonlinear models, which have more flexibility when learning a target function, are more prone to overfitting. As a result, many nonparametric machine learning algorithms incorporate parameters or approaches that limit and constrain the amount of detail learned by the model.

VIII. BENEFITS OF MACHINE LEARNING

• Trends and patterns are easily discernible. Machine Learning can analyse enormous amounts of data and identify specific trends and patterns that people might miss. For example, an ecommerce website like Amazon uses it to better understand its users' browsing habits and purchase histories in order to provide them with the most relevant products, promotions, and reminders. It uses the information to show them ads that are relevant to them.
• Improvements are ongoing. Machine Learning algorithms improve in accuracy and efficiency as they gather experience. This enables people to make more informed decisions. Assume you are working on a weather forecasting model. Your algorithms learn to generate more accurate predictions faster as the amount of data you have grows.
• No human intervention needed (automation). With ML, you do not need to make the adjustments manually. Because ML means giving machines the ability to learn, it

allows them to make predictions and also improve the algorithms on their own. A common example of this is anti-virus software, which learns to filter out new threats as they are recognized. ML is also good at recognizing spam.

IX. APPLICATIONS OF MACHINE LEARNING

• Autonomous Vehicles: Machine learning algorithms are an essential component of self-driving automobiles, and they will play an increasingly crucial role in their capacity to operate. These learning systems are widely employed for tasks such as image identification and scheduling, but they are challenging to train in chaotic real-world contexts. It is difficult to collect, analyse, and combine many different sorts of data. So far, the solution has been to do extensive road testing in order to catch as many of these unusual instances as possible. Algorithms are employed to create responses based on this data, which are subsequently tested in simulations.
• Medicine, Healthcare: When it comes to diagnosis or decision making, machine learning algorithms are not a good replacement for clinicians, at least in most situations. A good diagnosis must take into account structured data (e.g. diagnosis codes, medications), unstructured data (clinical notes), image data (X-rays) and even subtle visual cues from the patient (do they look ill, how did they answer family history questions) in a very short time frame.
• Governmental machine learning: Machine learning is now being used by a small number of government agencies, such as the Government Digital Service (GDS), which is using it to anticipate page views in order to find anomalies [11], and HMRC, which is using clustering algorithms to segment VAT clients. The adoption rate is still low, and there is a lot of untapped potential. GDS has thus far concentrated on proving machine learning algorithms' capabilities on a variety of products and prototype services. The development of a 'data first' mindset at a much earlier level in the policy process is one of the first steps toward increased exploitation of these learning systems.

X. CONCLUSION

This paper introduces machine learning, its basic model, and its applications in numerous fields, as well as its benefits and drawbacks.

It also examines numerous machine learning approaches and tools, such as classification and prediction techniques, as well as their objectives, working procedures, benefits, drawbacks, real-time applications, and implementation tools.

The solid underpinnings of the above-mentioned methodologies are required by emerging developments in Artificial Intelligence and machine learning, and they will be valuable in transdisciplinary domains as well.

REFERENCES

[1] Jafar Tanha, “Semi-supervised self-training for decision tree classifiers”, International Journal of Machine Learning and Cybernetics, Volume 8, Issue 1, Page No’s: 355-370, January, 2015.
[2] Khadim D, Fleur M and Gayo D, “Large scale biomedical texts classification: a k-NN and an ESA based approaches”, Journal of Biomedical Semantics, 7:40, June, 2016.
[3] Hong R, H. M. Wang and Jian L, “Privacy Preserving nearest Neighbour Computation in Multiple Cloud Environments”, IEEE Access, ISSN: 2169-3536, Volume 4, Page No’s: 9589-9603, December, 2016.
[4] L. Jiang, C. Li, “Deep feature weighting for naive Bayes and its application to text classification”, Journal of Engineering Applications of Artificial Intelligence, Volume 52, Page No’s: 26-39, June, 2016.
[5] Ahmed M, Alison H, “Modeling built-up expansion and densification with multinomial logistic regression, cellular automata and genetic algorithm”, Volume 67, Page No’s: 147-156, January, 2018.
[6] T. Razzaghi, Oleg R, “Multilevel Weighted Support Vector Machine for Classification on Healthcare Data with Missing Values”, PLOS ONE, Page No’s: 1-18, May 2016.
[7] Hui L, D. Pi, “Integrative Method Based on Linear Regression for the Prediction of Zinc binding Sites in Proteins”, IEEE Access, Volume 5, Page No’s: 14647-14656, August, 2017.
[8] L. Wang, D. Wang, “Intelligent CFAR Detector Based on Support Vector Machine”, IEEE Access, Volume 5, Page No’s: 26965-26972, December, 2017.
[9] Enrico R, Michel L, “The Ω Counter, a Frequency Counter Based on the Linear Regression”, IEEE Transactions on Ultrasonics, Ferroelectrics, Volume 63, Issue 7, Page No’s: 961-969, July, 2016.
[10] J. Han, M. Kamber and J. Pei, “Data Mining: Concepts and Techniques”, 3rd Edition, MK Series, 2012.
[11] http://www.maxlittle.net/publications/index.php
[12] https://gdsdata.blog.gov.uk/2014/08/15/anomalydetection-a-machine-learning-approach/
     http://www.ons.gov.uk/ons/.../mwp1-onsinnovationlaboratories.pdf
[13] https://berryvilleiml.com/results/ara.pdf
[14] X. Yu, P. He, Q. Zhu, and X. Li, “Adversarial examples: Attacks and defenses for deep learning,” IEEE Trans. Neural Netw. Learn. Syst., vol. 30, no. 9, pp. 2805–2824, 2019. doi: 10.1109/TNNLS.2018.2886017
[15] https://data-flair.training/blogs/advantagesanddisadvantages-of-machine-learning/
[16] https://www.burtchworks.com/2018/06/12/2018machine-learning-flash-survey-results/

