Introduction to Machine Learning
What is Machine Learning?
Machine learning is a field of computer science that enables computers to
learn from data without being explicitly programmed for each task. It uses
algorithms to analyze data, identify patterns, and make predictions.
Presented By:
Tannu Shree
Himanshu Mandal
Hiya Chatterjee
Why is Machine Learning Important?
1 Machine Learning is important because it has the ability to process
and analyze vast amounts of data, which would be impossible for
humans to handle manually.
2 By learning from data, machines can automate tasks, predict outcomes,
and make decisions faster and more accurately than humans in many
cases.
3 This is why industries like healthcare, finance, and e-commerce
are increasingly relying on Machine Learning to improve efficiency,
reduce costs, and enhance customer experience.
Machine Learning Terminology
Algorithm
A set of rules or steps used to solve a problem or make a prediction.
Dataset
A collection of data points used to train and evaluate machine learning
models.
Feature
A measurable characteristic or attribute of a data point.
Model
A mathematical representation of a relationship between features and
outputs.
Types of Machine Learning
Supervised Learning
Models learn from labeled data, where both inputs and desired outputs are
provided. Ex: Regression, classification
Unsupervised Learning
Models discover patterns and structures in unlabeled data, without specific
outputs. Ex: Clustering
Semi-Supervised Learning
Models learn from a combination of labeled and unlabeled data, leveraging
both types of information.
Reinforcement Learning
Models learn through trial and error, receiving rewards for desired actions
and penalties for undesired ones.
Supervised Learning
1 Training
The model is trained on labeled data, learning to associate inputs with
their corresponding outputs.
2 Evaluation
The model's performance is evaluated based on its accuracy in predicting
outputs for a separate test dataset.
3 Prediction
The trained model uses the learned patterns to predict the output for new,
unseen inputs.
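The three steps above can be sketched in plain Python. This is only an illustration under stated assumptions: the classifier is a simple nearest-centroid rule chosen for brevity, and the function names (`fit`, `predict`, `accuracy`) and all data values are invented for this example.

```python
# Toy labeled data: (hours_studied, hours_slept) -> "pass" / "fail".
# All values are invented for illustration.
train = [((1.0, 4.0), "fail"), ((2.0, 5.0), "fail"), ((1.5, 6.0), "fail"),
         ((7.0, 7.0), "pass"), ((8.0, 6.0), "pass"), ((9.0, 8.0), "pass")]
test = [((2.5, 5.5), "fail"), ((8.5, 7.5), "pass")]


def fit(samples):
    """Step 1, training: compute the mean point (centroid) of each class."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(v / counts[label] for v in acc)
            for label, acc in sums.items()}


def predict(model, features):
    """Step 3, prediction: return the label whose centroid is closest."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))


def accuracy(model, samples):
    """Step 2, evaluation: share of held-out samples predicted correctly."""
    correct = sum(predict(model, x) == y for x, y in samples)
    return correct / len(samples)


model = fit(train)
test_accuracy = accuracy(model, test)
new_prediction = predict(model, (6.5, 7.0))  # a new, unseen input
print(test_accuracy, new_prediction)
```

The test set is kept separate from the training set so that the accuracy reflects performance on data the model has not seen.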
Unsupervised Learning
Clustering
Grouping similar data points together.
Association Rule Mining
Discovering relationships and patterns between data points.
Dimensionality Reduction
Simplifying data by reducing the number of features.
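As one example of clustering, here is a minimal k-means sketch on one-dimensional data. The function name `kmeans_1d`, the starting centers, and the data values are assumptions made for illustration; real projects would normally use a library implementation.

```python
def kmeans_1d(values, centers, iterations=10):
    """Alternate between assigning points to the nearest center and
    moving each center to the mean of its assigned points."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous center.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters


# No labels are given: the algorithm only sees the raw values.
values = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7]
centers, clusters = kmeans_1d(values, centers=[0.0, 6.0])
print(centers, clusters)
```

The two groups emerge from the data alone, which is what distinguishes this from the supervised setting.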
Semi-Supervised Learning
Leverages both types of data
Combines labeled and unlabeled data to improve learning efficiency.
Reduces labeling effort
Requires less labeled data compared to supervised learning.
Improves accuracy
Can improve model performance by also learning from the unlabeled data.
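One common way to combine the two kinds of data is self-training (pseudo-labeling). The sketch below is a simplified illustration, not a reference implementation: it assumes one-dimensional data, a nearest-centroid model, and invented names (`centroids`, `self_train`) and values.

```python
labeled = [(1.0, "low"), (9.0, "high")]   # only two labeled examples
unlabeled = [2.0, 3.0, 7.0, 8.0]          # cheaper, unlabeled examples


def centroids(samples):
    """Mean value of each class among the currently labeled samples."""
    groups = {}
    for x, label in samples:
        groups.setdefault(label, []).append(x)
    return {label: sum(xs) / len(xs) for label, xs in groups.items()}


def nearest_label(model, x):
    return min(model, key=lambda label: abs(x - model[label]))


def self_train(labeled, unlabeled):
    """Repeatedly pseudo-label the point the model is most sure about
    (the one closest to any centroid), then refit on the enlarged set."""
    labeled, pool = list(labeled), list(unlabeled)
    while pool:
        model = centroids(labeled)
        x = min(pool, key=lambda p: min(abs(p - c) for c in model.values()))
        labeled.append((x, nearest_label(model, x)))
        pool.remove(x)
    return centroids(labeled), labeled


final_model, final_labeled = self_train(labeled, unlabeled)
print(final_model)
```

Pseudo-labels can be wrong, so in practice only high-confidence predictions are usually added.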
Reinforcement Learning
1 Agent-Environment Interaction
The agent interacts with its environment and receives
rewards or penalties for its actions.
2 Learning through Experience
The agent learns to maximize rewards by adjusting its
behaviour based on past experiences.
3 Goal-Oriented
The agent aims to achieve a specific goal, such as winning a
game or optimizing a process.
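The agent, environment, and reward loop described above can be sketched with tabular Q-learning on a tiny corridor. Everything here is an assumption for illustration: the five-cell environment, the reward of 1.0 at the goal, the constants `ALPHA` and `GAMMA`, and the choice to explore purely at random.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)    # step left or step right
ALPHA, GAMMA = 0.5, 0.9   # learning rate and discount factor


def step(state, action):
    """Environment: move, stay inside the corridor, reward 1.0 at the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def train(episodes=300, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = rng.choice(ACTIONS)  # trial and error: act at random
            next_state, reward = step(state, action)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            # Q-learning update: move the estimate toward reward + discounted future value.
            q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
            state = next_state
    return q


q_table = train()
# Greedy policy: in each non-goal cell, pick the action with the highest learned value.
policy = [max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in range(N_STATES - 1)]
print(policy)
```

Although the agent acts randomly while learning, the learned values favor stepping right in every cell, because that is the shortest route to the reward.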
STEPS OF PREPARING A MACHINE LEARNING MODEL:
Evaluation Metrics
1 Accuracy
The overall proportion of correct predictions made by the model.
2 Precision
The fraction of positive predictions that are actually correct.
3 Recall
The fraction of actual positives that are correctly identified by the model.
4 F1-Score
The harmonic mean of precision and recall, providing a balanced evaluation.
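The four metrics can be computed directly from the counts of true and false positives and negatives. The sketch below assumes binary labels (1 = positive, 0 = negative); the function name `classification_metrics` and the example labels are invented for illustration.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 0, 0, 1, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In this example the model finds only half of the actual positives (recall 0.5) even though two of its three positive predictions are right, which is why precision and recall are reported separately.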
Popular Machine Learning Algorithms
Linear Regression
Models the linear relationship between input features and a continuous
target variable.
Decision Trees
Builds a tree-like model of decisions and their possible consequences.
k-Nearest Neighbors
Classifies data based on the closest training examples in the feature space.
Support Vector Machines
Finds the optimal hyperplane that separates different classes of data.
Linear Regression
DECISION TREE
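A decision tree is built by repeatedly choosing the question that best separates the classes. The sketch below shows a single such split (a one-level tree, often called a stump) scored with Gini impurity, which is one common criterion. The names `gini` and `best_split` and the age/purchase data are assumptions for illustration.

```python
def gini(labels):
    """Gini impurity: 0.0 when all labels in the group are the same."""
    if not labels:
        return 0.0
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(xs, ys):
    """Try a threshold midway between each pair of neighboring values and
    return (threshold, weighted impurity) for the purest split."""
    pairs = sorted(zip(xs, ys))
    best = None
    for (x1, _), (x2, _) in zip(pairs, pairs[1:]):
        if x1 == x2:
            continue
        threshold = (x1 + x2) / 2
        left = [y for x, y in pairs if x <= threshold]
        right = [y for x, y in pairs if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if best is None or score < best[1]:
            best = (threshold, score)
    return best


ages = [22, 25, 30, 35, 40, 45]
bought = ["no", "no", "no", "yes", "yes", "yes"]
threshold, impurity = best_split(ages, bought)
print(threshold, impurity)
```

A full tree would apply the same search again inside each of the two resulting groups until the groups are pure or a depth limit is reached.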
KNN
KNN ALGORITHM
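A minimal sketch of k-nearest neighbors classification: find the k training points closest to the query and take a majority vote. The function name `knn_predict`, the choice k = 3, and the two-class toy data are assumptions for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Invented two-class data: class "A" near (1, 1), class "B" near (6, 6).
train = [((1, 1), "A"), ((1, 2), "A"), ((2, 1), "A"),
         ((6, 6), "B"), ((7, 6), "B"), ((6, 7), "B")]

label_near_a = knn_predict(train, (2, 2))
label_near_b = knn_predict(train, (6, 5))
print(label_near_a, label_near_b)
```

There is no separate training step: the algorithm simply stores the data and does all its work at prediction time.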
LOGISTIC REGRESSION CURVE:
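The curve in question is the S-shaped sigmoid, which turns a weighted sum of the inputs into a probability between 0 and 1. In the sketch below the weight and bias are hypothetical values picked by hand, not fitted to any data.

```python
import math


def sigmoid(z):
    """Map any real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, weight, bias):
    """Probability of the positive class for a single feature x."""
    return sigmoid(weight * x + bias)


# Hypothetical parameters: the decision boundary sits where weight * x + bias = 0, i.e. x = 4.
WEIGHT, BIAS = 1.5, -6.0

for x in (0.0, 4.0, 8.0):
    print(x, round(predict_proba(x, WEIGHT, BIAS), 4))
```

Inputs far below the boundary get a probability near 0, inputs far above it get a probability near 1, and the boundary itself maps to exactly 0.5.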
LOGISTIC REGRESSION VS SVM: An SVM chooses the separating line (hyperplane) that maximizes the margin, that is, the
distance from the line to the closest points of each class (the support vectors), so only those points determine the boundary.
Logistic regression instead fits an S-shaped (sigmoid) curve that maps every point to a class probability, so all points
influence the fit. SVMs often work well when the classes are separated by a clear margin, but neither method is better in every case.
SUPPORT VECTOR MACHINE CLASSIFIER
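The quantity an SVM maximizes is the margin: the distance from the separating line to the closest point of either class. The sketch below does not train an SVM; it only compares two hand-picked candidate lines on invented data to show what "larger margin" means. The function name `margin` and all values are assumptions.

```python
import math


def margin(w, b, points):
    """Smallest signed distance from any labeled point to the line w.x + b = 0.
    Labels are +1 / -1; a positive result means every point is on its correct side."""
    norm = math.hypot(*w)
    return min(y * (w[0] * x[0] + w[1] * x[1] + b) / norm for x, y in points)


points = [((1, 1), -1), ((2, 1), -1), ((4, 4), +1), ((5, 4), +1)]

margin_a = margin((1, 0), -3.0, points)   # candidate A: vertical line x = 3
margin_b = margin((1, 1), -5.5, points)   # candidate B: diagonal line x + y = 5.5
print(round(margin_a, 3), round(margin_b, 3))
```

Both lines separate the classes, but candidate B leaves more room on each side, so a maximum-margin classifier would prefer it over candidate A.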
Machine Learning Applications
Healthcare
ML assists in disease diagnosis, drug discovery, and personalized treatment.
Finance
ML algorithms detect fraud, predict stock market trends, and optimize
investment portfolios.
Marketing
ML enables customer segmentation, personalized recommendations, and
targeted advertising.
Challenges of Machine Learning
• Data Issues
Data collection
Sufficient data
Representative data
Quality data
Relevant features
Collecting Quality Data
Relevant Data
Ensure that the data you collect is truly relevant to the problem you're trying to solve.
Sufficient Quantity
Gather enough data to train your models effectively; too little data makes overfitting more likely.
Accurate Labeling
Carefully label your data to improve the quality of your model's predictions.
Representative Samples
Collect data that accurately reflects the real-world distribution of your problem
domain.
Algorithm Issues
Overfitting
A model that is too complex can learn the noise in the training data, leading
to poor performance on new, unseen data.
Underfitting
A model that is too simple may fail to capture the underlying patterns in the
data, resulting in poor performance on both training and test data.
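The difference shows up when training error and test error are compared side by side. The sketch below fits three models to the same invented data: a constant (too simple), a straight line (about right), and a model that memorizes every training point (too complex). All names and values are assumptions for illustration.

```python
train_x = [0, 2, 4, 6, 8, 10]
train_y = [1, 3, 9, 11, 17, 19]     # y = 2x with alternating +1 / -1 noise (invented)
test_x = [1.5, 3.5, 5.5, 7.5, 9.5]
test_y = [3, 7, 11, 15, 19]         # held-out points on the trend y = 2x


def fit_mean(xs, ys):
    """Underfits: always predicts the average target."""
    mean = sum(ys) / len(ys)
    return lambda x: mean


def fit_line(xs, ys):
    """Least-squares straight line."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return lambda x: slope * x + (my - slope * mx)


def fit_memorizer(xs, ys):
    """Overfits: returns the target of the nearest training point, noise included."""
    return lambda x: ys[min(range(len(xs)), key=lambda i: abs(xs[i] - x))]


def mse(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


results = {}
for name, fit in [("mean", fit_mean), ("line", fit_line), ("memorizer", fit_memorizer)]:
    model = fit(train_x, train_y)
    results[name] = (mse(model, train_x, train_y), mse(model, test_x, test_y))
    print(name, results[name])
```

The memorizer reaches zero training error yet does worse than the line on the held-out points, while the constant model does poorly on both sets.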
Software Integration Challenges
Data Integration
Seamlessly connect your ML models to diverse data sources and databases.
Scalable Infrastructure
Leverage cloud-based services and platforms to handle the computational demands of your ML projects.
API Integration
Ensure your ML models can be easily integrated into existing software systems and workflows.
Cost Management
Optimize your ML infrastructure and resource utilization to manage the costs associated with model development and
deployment.
Tools and Frameworks for
Machine Learning
Programming languages
1 Python
Most popular language for machine learning.
Extensive libraries and frameworks (e.g., TensorFlow, PyTorch,
Scikit-learn).
Easy syntax, strong community support, and ideal for rapid
prototyping.
2 R Language
Specializes in statistical analysis and data visualization.
Offers strong tools for data manipulation and modeling (e.g., caret,
randomForest).
Commonly used in academia and research settings.
Libraries
Scikit-Learn
Python-based library for classical machine learning
algorithms. Great for tasks like regression,
classification, clustering, and preprocessing.
Simple and efficient tools for data mining and data
analysis.
TensorFlow
Developed by Google. Open-source framework for
building and deploying deep learning models.
Supports both CPU and GPU, widely used for
production-level AI applications.
PyTorch
Developed by Facebook (now Meta). Flexible and easy-to-use
deep learning framework.
Preferred for research, dynamic computational
graphs, and quick prototyping.
Platforms
Implementation
Today, we will write Python code to predict home prices using a machine learning
technique called simple linear regression. We will use a dataset that lists the
prices of homes in Itanagar together with their area.
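A minimal sketch of what such code might look like. It is not the presenters' code and does not use their dataset: the areas and prices below are placeholder values invented for illustration, the units are unspecified, and the line is fitted with the closed-form least-squares formulas in plain Python rather than with a library.

```python
# Placeholder data, NOT the Itanagar dataset: areas and prices are invented.
areas = [1000, 1500, 2000, 2500, 3000]
prices = [150000, 200000, 250000, 300000, 350000]


def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_simple_linear_regression(areas, prices)

# Predict the price of a home whose area was not in the training data.
new_area = 1800
predicted_price = slope * new_area + intercept
print(f"price = {slope:.1f} * area + {intercept:.1f}")
print(f"predicted price for area {new_area}: {predicted_price:.0f}")
```

"Simple" here means there is a single input feature (area); the slope is the estimated price change per unit of area.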
Conclusion
Machine Learning's Potential:
ML is transforming industries by enabling data-driven decision-making.
From healthcare to finance, ML applications are vast and impactful.
Future Outlook:
As technology advances, ML models will become more efficient, accessible,
and integrated into everyday life.
The future of ML lies in ethical AI, automation, and innovation.
Challenges to Overcome:
While promising, challenges like data quality, model overfitting, and
deployment complexities require ongoing efforts.
Addressing these challenges is key to successful ML implementation.
Continuous Learning:
Machine learning is a rapidly evolving field that demands constant learning
and adaptation.
Staying updated on new tools, frameworks, and algorithms is essential for
success.
---
"Machine learning is not just about algorithms; it's about
transforming data into insights that drive impactful change."
Questions?