
Catching The Machine Learning Terminology And Notations

Monday, December 9, 2019 Machine Learning By Sathish Yellanki Slide No : 1


NLP - Natural Language Processing



Natural Language Processing (NLP) is A Field of Artificial
Intelligence That Makes it Possible For The Computer To
Understand And Perform Operations Using Human (i.e. Natural)
Language As it is Spoken OR Written.



Text Classification And Ranking
• The Goal is To Predict A Class OR Label of A Document, OR Rank
Documents Within A List Based on Their Relevance
Sentiment Analysis
• Sentiment Analysis Aims To Determine The Attitude OR
Emotional Reaction of A Person With Respect To Some Topic
Document Summarization
• Document Summarization is A Set of Methods For Creating Short,
Meaningful Descriptions of Long Texts
Named Entity Recognition (NER)
• Named Entity Extraction Algorithms Process A Stream of
Unstructured Text And Recognize Predefined Categories of
Objects Called Entities
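
As a rough illustration of the sentiment analysis task listed above, the sketch below labels a sentence by counting words from two small word lists. The word lists and the function name `classify_sentiment` are hypothetical and chosen only for this example; practical systems learn such weights from labeled data rather than using a fixed lexicon.

```python
# Hypothetical toy lexicon (an assumption for illustration only).
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "poor", "hate", "terrible"}

def classify_sentiment(text):
    """Return 'positive', 'negative', or 'neutral' from simple word counts."""
    # Lowercase each token and strip common trailing punctuation.
    words = [w.strip(".,!?").lower() for w in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(classify_sentiment("I love this great phone!"))
print(classify_sentiment("Terrible battery, bad screen."))
```

The same shape (text in, label out) applies to text classification in general; only the set of labels and the way the score is computed change.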



Speech Recognition
• Speech Recognition Techniques Are Used For Determining A
Textual Representation of An Audio Signal of People Speaking
Natural Language Understanding And Generation (NLU And NLG)
• NLU is Used For Transforming A Human-Generated Text into
More Formal Representations Interpretable By A Computer.
• Conversely, NLG Techniques Support Transformation of A Formal
Logical Representation into A Human-Like Generated Text.
Machine Translation
• Machine Translation is A Task of Automatically Translating Text
OR Speech From One Human Language into Another.



Understanding Dataset



All The Data That is Used For Either Building OR Testing The
Machine Learning Model is Called A Dataset.
Data Scientists Divide Their Datasets into Separate Groups
For Making The Analysis Easier



Training Data
• Training Data is Used To Train A Model.
• The Machine Learning Model Sees That Data And Learns To Detect
Patterns OR Determine Which Features Are Most Important
During Prediction.
Validation Data
• Validation Data is Used For Tuning Model Hyperparameters And
Comparing Different Models To Determine The Best Ones.
Test Data
• Test Data is Used Once The Final Model is Chosen To Simulate
The Model’s Behavior on Completely Unseen Data.
• Unseen Data Refers To Data Points That Weren’t Used in Building
Models OR Even in Deciding Which Model To Choose.
Image
• The Visualization of The MNIST (Modified National Institute of
Standards And Technology Database) Dataset Using A Mixture of T-SNE
(T-Distributed Stochastic Neighbor Embedding) And Jonker-Volgenant
Algorithms.

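The division into training, validation, and test data described above can be sketched as follows. This is a minimal example using only the standard library; the function name `split_dataset`, the 70/15/15 proportions, and the fixed seed are assumptions made for illustration, not values taken from the slides.

```python
import random

def split_dataset(rows, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle rows and cut them into train, validation, and test lists."""
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains is held out
    return train, val, test

train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))
```

The test portion is set aside first in spirit: it should not influence training or the choice between models, which is what makes it a fair stand-in for unseen data.
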
Computer Vision



Computer Vision (CV) is A Field of Artificial Intelligence
Concerned With Providing Tools For Analysis And High-
Level Understanding of Image And Video Data.



Image Classification
• Image Classification is A CV Task of Teaching A Model To
Recognize What is on A Given Image.
Object Detection
• Object Detection is A CV Task of Teaching The Model To Detect
An Instance of An Object From A Set of Predefined Categories By
Providing A Bounding Box Around Each Instance of A Given Class.
Image Segmentation
• Image Segmentation is A CV Task Where One Trains A Model To
Annotate Each Pixel With A Class From A Predefined Set To
Which A Given Pixel Most Probably Belongs.
Saliency Detection
• Saliency Detection is A CV Task of Training A Model To Be Able To
Provide A Region Which Would Most Likely Attract The Attention
of A Viewer.
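
To make the image segmentation description above concrete, the sketch below assumes a model has already produced, for every pixel, one probability per class, and then assigns each pixel the class it most probably belongs to. The class names, the tiny 2x2 probability map, and the function name `segment` are hypothetical.

```python
CLASSES = ["background", "cat", "dog"]  # hypothetical predefined label set

def segment(prob_map):
    """prob_map[row][col] is a list of per-class probabilities for that pixel.

    Returns a grid of the same shape holding the most probable class name.
    """
    return [
        [CLASSES[max(range(len(p)), key=p.__getitem__)] for p in row]
        for row in prob_map
    ]

# A made-up 2x2 "image" with three class probabilities per pixel.
probs = [
    [[0.8, 0.1, 0.1], [0.2, 0.7, 0.1]],
    [[0.3, 0.3, 0.4], [0.6, 0.2, 0.2]],
]
mask = segment(probs)
print(mask)
```

Image classification would apply the same "most probable class" step once per image instead of once per pixel.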



Neural Networks



Neural Networks Are A Very Wide Family of Machine
Learning Models.
The Main Idea Behind Them is To Loosely Mimic The Way A
Human Brain Processes Data Through Networks of
Connected Neurons.
Artificial Neural Networks Are Composed of Layers
Inspired By, Rather Than Exactly Simulating, The Human Brain.
Each Layer is Considered As A Set of Neurons, All of Which
Are Responsible For Detecting Different Things.
A Neural Network Processes Data Sequentially, Which
Means That Only The First Layer is Directly Connected To
The Input. All Subsequent Layers Detect Features Based on
The Output of A Previous Layer, Which Enables The Model
To Learn More And More Complex Patterns in Data As The
Number of Layers Increases.
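
The layer-by-layer processing described above can be sketched as a small forward pass. This is an illustrative example only: the weights, the ReLU activation, and the names `dense`, `relu`, and `forward` are assumptions, and no training is shown.

```python
def relu(values):
    """Activation: keep positive values, replace negatives with zero."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    """One layer: each neuron outputs a weighted sum of its inputs plus a bias."""
    return [
        sum(w * x for w, x in zip(neuron_weights, inputs)) + b
        for neuron_weights, b in zip(weights, biases)
    ]

def forward(inputs, layers):
    """Feed the input through the layers in order.

    Only the first layer sees the raw input; every later layer
    receives the output of the layer before it.
    """
    activation = inputs
    for weights, biases in layers:
        activation = relu(dense(activation, weights, biases))
    return activation

layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),  # layer 1: 2 inputs -> 2 neurons
    ([[1.0, 1.0]], [-1.0]),                   # layer 2: 2 inputs -> 1 neuron
]
output = forward([2.0, 1.0], layers)
print(output)
```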
Convolutional Neural Networks
• Convolutional Neural Networks Were A Huge Breakthrough in
Computer Vision Tasks And Have Also Proved To Be Very Useful in NLP
Problems.
Recurrent Neural Networks
• Recurrent Neural Networks Are Designed To Process Data of A
Sequential Nature, Such As Texts OR Stock Prices.
Fully Connected Neural Networks
• Fully Connected Neural Networks Are The Simplest Models Used
on Static/Tabular Data.
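
As a hedged illustration of how a recurrent network handles sequential data, the sketch below uses a single scalar hidden state that is updated once per element, so each step depends on both the current input and everything seen before it. The weights `w_x`, `w_h`, the bias, and the function name `rnn_states` are made up for the example.

```python
import math

def rnn_states(sequence, w_x=0.5, w_h=0.8, bias=0.0):
    """Return the hidden state after each element of the sequence."""
    h = 0.0  # initial hidden state
    states = []
    for x in sequence:
        # The new state mixes the current input with the previous state.
        h = math.tanh(w_x * x + w_h * h + bias)
        states.append(h)
    return states

# The same values in a different order lead to a different final state.
print(rnn_states([1.0, 0.0])[-1])
print(rnn_states([0.0, 1.0])[-1])
```

A fully connected network applied to the same values would have no notion of order unless the order were encoded in the input features.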



Overfitting



Overfitting is A Negative Effect That Occurs When The Model Builds An
Assumption OR Bias From An Insufficient Amount of Data.
When Overfitting Happens, it Usually Means That The
Model is Treating Random Noise in The Data As A
Significant Signal And Adjusts To it.
An Overfitted Model is Biased Towards What it Has Seen in
The Training Set And Hence Can Fail To Make Correct
Predictions on New Data.
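
The effect can be reproduced on a toy problem. In the sketch below, which is an illustration under made-up data rather than a real experiment, two training labels are deliberately flipped to act as noise. A model that memorizes the training points (nearest neighbour) scores perfectly on the training data but worse on clean test data, while a simpler rule does the opposite.

```python
# The true rule is "label 1 when x >= 10"; the labels at x = 4 and x = 14
# are flipped on purpose to simulate random noise in the training data.
train = [(x, int(x >= 10)) for x in range(0, 20, 2)]
train = [(x, 1 - y) if x in (4, 14) else (x, y) for x, y in train]
# Clean test points that neither model has seen.
test = [(x + 0.5, int(x + 0.5 >= 10)) for x in range(0, 20, 2)]

def memorizer(x):
    """1-nearest-neighbour: copy the label of the closest training point, noise included."""
    return min(train, key=lambda point: abs(point[0] - x))[1]

def simple_rule(x):
    """A deliberately simple model that ignores the noisy points."""
    return int(x >= 10)

def accuracy(model, data):
    return sum(model(x) == y for x, y in data) / len(data)

for name, model in [("memorizer", memorizer), ("simple rule", simple_rule)]:
    print(name, "train:", accuracy(model, train), "test:", accuracy(model, test))
```

The gap between training and test accuracy for the memorizing model is the usual practical symptom of overfitting.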



Roadmap For Building Machine Learning Systems



[Figure: Roadmap Diagram With Four Stages: Pre Processing, Learning,
Evaluation And Prediction. Diagram Nodes: Raw Data, Labels, Training
Dataset, Test Dataset, Learning Algorithm, Final Model, New Data]
Pre Processing
• Feature Extraction And Scaling
• Feature Selection
• Dimensionality Reduction
• Sampling
Learning
• Model Selection
• Cross Validation
• Performance Metrics
• Hyper Parameter Optimization
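
Of the steps listed for the roadmap, cross validation is easy to sketch without any library. The example below is a minimal, unshuffled k-fold split: the data is cut into k parts, and each part takes one turn as the validation set while the rest is used for training. The function name `k_fold_indices` and the sizes used are assumptions for illustration.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size

folds = list(k_fold_indices(10, 3))
for train_idx, val_idx in folds:
    print("train:", train_idx, "validation:", val_idx)
```

Averaging a performance metric over the folds gives a steadier basis for model selection and hyperparameter optimization than a single validation split.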

