
Mod 3

Part 1 - TensorFlow: introduction, tensors, tensor properties, basic tensor methods

TensorFlow Intro

In deep learning, data is represented using tensors, and as a result, neural network programming makes heavy use of tensor operations.

Applications

NLP, object detection, image processing


Tensors

Tensor Properties

Shape: The length (number of elements) of each of the axes of a tensor.

Rank: The number of tensor axes. A scalar has rank 0, a vector has rank 1, and a matrix has rank 2.

Axis or Dimension: A particular dimension of a tensor.

Size: The total number of items in the tensor, i.e., the product of the elements of the shape vector.

Homogeneous: All elements of a tensor share a single data type.

Immutable: Tensors cannot be modified in place; operations always create new tensors. These properties are illustrated in the sketch below.
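A minimal sketch, assuming TensorFlow 2.x is installed, showing how these properties can be inspected on a concrete tensor:

import tensorflow as tf

t = tf.constant([[1, 2, 3], [4, 5, 6]])  # a rank-2 tensor (matrix)

print(t.shape)     # Shape: (2, 3) -- length of each axis
print(t.ndim)      # Rank: 2 -- number of axes
print(t.dtype)     # Homogeneous: one data type for all elements (int32)
print(tf.size(t))  # Size: 6 -- product of the shape vector (2 * 3)

# Immutable: an in-place assignment such as t[0, 0] = 9 raises an error;
# operations like t + 1 return a new tensor instead.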
TensorFlow Working (Training and Development)
How neural networks work

● A neural network is made up of neurons connected to each other; each connection is associated with a weight that dictates the importance of that relationship when the weight is multiplied by the input value.


● Each neuron has an activation function that defines its output. The activation function is used to introduce non-linearity into the modeling capabilities of the network. There are several options for activation functions (for example, ReLU, sigmoid, and softmax).

Part 2 - Training a deep learning model using TensorFlow
● Next, we use a loss function to estimate the loss (or error), i.e., to compare and measure how good or bad our prediction was relative to the correct result.
● After this, using backpropagation, each neuron receives an error signal; the neurons of the hidden layers, however, only receive a fraction of the total loss signal, based on the relative contribution each neuron made to the original output. This process is repeated, layer by layer, until all the neurons in the network have received a loss signal that describes their relative contribution to the total loss.
● Now that we have propagated this information backwards, we can adjust the weights of the connections between neurons. The goal is to make the loss as close to zero as possible the next time we use the network for a prediction. For this, we use a technique called gradient descent.
● For model parameterization we use epochs, batch size, and learning rate, as in the sketch below.
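A minimal Keras training sketch (assuming TensorFlow 2.x; the layer sizes, dummy data, and hyperparameter values are illustrative assumptions, not values from these notes) tying the steps above together:

import numpy as np
import tensorflow as tf

# Dummy data standing in for a prepared dataset (assumed for illustration).
x_train = np.random.rand(100, 20).astype("float32")
y_train = np.random.randint(0, 10, size=(100,))

model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(64, activation="relu"),     # activation adds non-linearity
    tf.keras.layers.Dense(10, activation="softmax"),  # output layer
])

model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),  # gradient descent with a learning rate
    loss="sparse_categorical_crossentropy",                 # loss function measuring prediction error
    metrics=["accuracy"],
)

# Epochs and batch size parameterize the training loop; backpropagation
# and the weight updates happen inside fit().
model.fit(x_train, y_train, epochs=10, batch_size=32)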

CNN in TensorFlow

Applying convolution in TensorFlow
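A minimal sketch of applying a 2D convolution (assuming TensorFlow 2.x; the filter count, kernel size, and input shape are illustrative assumptions):

import tensorflow as tf

# A batch of one 28x28 single-channel image: (batch, height, width, channels).
images = tf.random.normal([1, 28, 28, 1])

conv = tf.keras.layers.Conv2D(filters=32, kernel_size=3, activation="relu")
features = conv(images)

print(features.shape)  # (1, 26, 26, 32) with the default 'valid' padding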

Steps involved in building an ML model


https://analyticsindiamag.com/the-7-key-steps-to-build-your-machine-learning-model/
1. Collect Data
2. Prepare the data
3. Choose the model
4. Train your deep learning model
5. Evaluation of metrics
6. Parameter Tuning
7. Prediction or Inference
Mod 4
Out-of-vocabulary (OOV) terms are terms that are not part of the normal lexicon found in a natural language processing environment. In speech recognition, it is the audio signal that contains these terms. Word vectors are the mathematical equivalent of word meaning.
Question paper

Mod 3

Q1.

Q2.
Q3.

https://towardsdatascience.com/learning-process-of-a-deep-neural-network-5a9768d7a651

Deep learning is a type of machine learning and artificial intelligence (AI) that imitates
the way humans gain certain types of knowledge. Deep learning is an important
element of data science, which includes statistics and predictive modeling. It is
extremely beneficial to data scientists who are tasked with collecting, analyzing and
interpreting large amounts of data; deep learning makes this process faster and easier.

At its simplest, deep learning can be thought of as a way to automate predictive analytics. While traditional machine learning algorithms are linear, deep learning algorithms are stacked in a hierarchy of increasing complexity and abstraction.
https://analyticsindiamag.com/the-7-key-steps-to-build-your-machine-learning-model/
1. Collect Data
2. Prepare the data
3. Choose the model
4. Train your deep learning model
5. Evaluation of metrics
6. Parameter Tuning
7. Prediction or Inference
Q4.

Long short-term memory (LSTM) is a modified RNN architecture that addresses the problem of training over long sequences and retaining memory.

LSTM is best suited for sequence data. LSTM can predict, classify, and generate
sequence data.

An example of a sequence is a video, which can be considered as a sequence of images or a sequence of audio clips.

Prediction based on a sequence of data is called sequence prediction. Sequence prediction is said to have four types (a minimal LSTM sketch follows the list):

● Sequence numeric prediction

● Sequence classification


● Sequence generation


● Sequence-to-sequence prediction
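A minimal sketch of the sequence classification case (assuming TensorFlow 2.x; the sequence length, feature count, unit counts, and dummy data are illustrative assumptions):

import numpy as np
import tensorflow as tf

# Dummy sequences: 64 samples, 50 timesteps, 8 features per timestep.
x = np.random.rand(64, 50, 8).astype("float32")
y = np.random.randint(0, 2, size=(64,))  # one binary label per sequence

model = tf.keras.Sequential([
    tf.keras.Input(shape=(50, 8)),
    tf.keras.layers.LSTM(32),                        # retains memory across the sequence
    tf.keras.layers.Dense(1, activation="sigmoid"),  # sequence-level classification
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x, y, epochs=2, batch_size=16)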


Mod 4

Q1.

Word embeddings have the capability of capturing semantic and syntactic relationships between words, as well as the context of words in a document. Word2vec is a technique to implement word embeddings.

Every word in a sentence depends on another word or other words. If we want to find similarities and relations between words, we have to capture these word dependencies.
Q2.

When there are a small number of training examples, the model sometimes learns from noise or unwanted details in the training examples, to an extent that negatively impacts the performance of the model on new examples. This phenomenon is known as overfitting. It means that the model will have a difficult time generalizing to a new dataset.

Overfitting refers to a model that models the training data too well.

Overfitting happens when a model learns the detail and noise in the training data to the extent that it negatively impacts the performance of the model on new data. This means that the noise or random fluctuations in the training data are picked up and learned as concepts by the model. The problem is that these concepts do not apply to new data and negatively impact the model's ability to generalize.

Overfitting is more likely with nonparametric and nonlinear models that have
more flexibility when learning a target function. As such, many nonparametric
machine learning algorithms also include parameters or techniques to limit and
constrain how much detail the model learns.

For example, decision trees are a nonparametric machine learning algorithm that
is very flexible and is subject to overfitting training data. This problem can be
addressed by pruning a tree after it has learned in order to remove some of the
detail it has picked up.

There are multiple ways to fight overfitting in the training process; for example, we can use data augmentation and add Dropout to the model, as in the sketch below.
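A minimal sketch of those two remedies in Keras (assuming TensorFlow 2.x with the built-in preprocessing layers; the architecture, input shape, and rates are illustrative assumptions):

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(32, 32, 3)),
    # Data augmentation: random transforms applied during training only.
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    # Dropout: randomly zeroes half the activations during training.
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(10, activation="softmax"),
])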
Q3.
https://dataaspirant.com/word-embedding-techniques-nlp/#t-1597685144204

“Term frequency–inverse document frequency is a numerical statistic that is intended to reflect how important a word is to a document in a collection or corpus.”
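A minimal sketch of computing TF-IDF with scikit-learn (the Resources list below points to an sklearn implementation; the toy documents are illustrative):

from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
]

vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(docs)  # sparse matrix: one row per document

print(vectorizer.get_feature_names_out())  # the learned vocabulary
print(tfidf.toarray())                     # TF-IDF weight of each term per document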
Q4.

The Word2vec method learns all those types of relationships between words while building a model. For this purpose, word2vec uses two types of methods. They are:

1. Skip-gram
2. CBOW (Continuous Bag of Words)

1. Skip-gram

In this method, we take the center word from the window of words as the input and the context words (neighboring words) as the outputs. Word2vec models predict the context words of a center word using the skip-gram method. Skip-gram works well with small datasets and identifies rare words really well.

2. Continuous bag of words

CBOW is just the reverse of the skip-gram method. Here we take the context words as input and predict the center word within the window. Another difference from the skip-gram method is that CBOW trains faster and gives better representations for more frequent words.
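A minimal sketch of both methods using gensim (an assumption: gensim is not named in these notes, but its Word2Vec class exposes both methods via the sg flag; the toy corpus and sizes are illustrative):

from gensim.models import Word2Vec

sentences = [
    ["deep", "learning", "uses", "neural", "networks"],
    ["word2vec", "learns", "vectors", "for", "words"],
    ["neural", "networks", "learn", "word", "vectors"],
]

# sg=1 selects skip-gram; sg=0 (the default) selects CBOW.
skipgram = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)
cbow = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=0)

print(skipgram.wv.most_similar("vectors", topn=2))  # nearest words by cosine similarity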
Resources

TF-IDF
TF
IDF
Implementation of TF-IDF by using Sklearn
Word2vec
Skip-Gram
Continuous Bag-of-words
Word2vec implementation
Word embedding model using Pre-trained models
Google word2vec
Stanford Glove Embeddings
