
Microsoft.com/Learn
Introduction to natural language
processing with TensorFlow
Speaker Name
Title

Join the chat at https://aka.ms/LearnLiveTV
Prerequisites
• Basic Python knowledge
• Basic understanding of machine learning
Learning objectives
• Understand how text is processed for natural language processing tasks
• Get introduced to Recurrent Neural Networks (RNNs) and Generative Neural Networks (GNNs)
• Learn about Attention Mechanisms
• Learn how to build text classification models
Agenda
• Introduction to natural language processing with TensorFlow
• Representing text as Tensors
• Represent words with embeddings
• Capture patterns with recurrent neural networks
• Generate text with recurrent networks
Introduction to natural language processing with
TensorFlow
In this presentation, we will explore different neural network
architectures for dealing with natural language text.
Natural Language Tasks
There are several NLP tasks that we can solve using neural networks:
• Text Classification is used when we need to classify a text fragment into one of several predefined classes. Examples include email spam detection, news categorization, assigning a support request to a category, and more (a minimal model sketch follows this list).
• Intent Classification is one specific case of text classification, where we want to map an input utterance in the conversational AI
system into one of the intents that represent the actual meaning of the phrase, or intent of the user.
• Sentiment Analysis is a regression task, where we want to understand the degree of positivity of a given piece of text. We may
want to label text in a dataset from most negative (-1) to most positive (+1), and train a model that will output a number
representing the positivity of the input text.
• Named Entity Recognition (NER) is the task of extracting entities from text, such as dates, addresses, names of people, etc. Together with intent classification, NER is often used in dialog systems to extract parameters from the user's utterance.
• A similar task of Keyword Extraction can be used to find the most meaningful words inside a text, which can then be used as
tags.
• Text Summarization extracts the most meaningful pieces of text, giving the user a compressed version of the original text.
• Question Answering is the task of extracting an answer from a piece of text. This model takes a text fragment and a question as input, and finds the exact place within the text that contains the answer. For example, given the text "John is a 22 year old student who loves to use Microsoft Learn" and the question "How old is John?", the model should provide us with the answer "22".
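As a concrete illustration of the text classification task, here is a minimal Keras sketch of a spam classifier. The toy corpus, the 10,000-token vocabulary limit, and the 16-dimensional embedding are illustrative assumptions, not values from this module.

```python
import tensorflow as tf

# Toy spam-detection corpus (hypothetical data, for illustration only).
texts = ["free money, click now", "meeting moved to 3pm", "win a prize today"]
labels = [1, 0, 1]  # 1 = spam, 0 = not spam

# Map raw strings to integer token ids; the vocabulary size is an assumption.
vectorize = tf.keras.layers.TextVectorization(
    max_tokens=10_000, output_sequence_length=20)
vectorize.adapt(tf.constant(texts))

model = tf.keras.Sequential([
    vectorize,
    # The embedding layer turns each 10,000-way one-hot token into a dense
    # 16-dimensional vector before the classifier sees it.
    tf.keras.layers.Embedding(input_dim=10_000, output_dim=16),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # spam probability
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(tf.constant(texts), tf.constant(labels), epochs=5)
```

The same skeleton extends to intent classification or sentiment analysis by changing the labels and the output layer.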
Essential Facts
• Use an embedding layer before the fully-connected classifier layer: it reduces the dimensionality of the input vector fed to the neural classifier.
• Use a character-level LSTM to generate new words with a particular flavor, like funny ones (a generation sketch follows this list).
• The LSTM network architecture allows explicit state management with forgetting and state triggering.
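To make the second and third facts concrete, here is a minimal sketch of a character-level LSTM trained on next-character prediction, the usual setup for generating new made-up words. The toy training string, sequence length, and layer sizes are assumptions for illustration.

```python
import numpy as np
import tensorflow as tf

# Hypothetical training text; a real run would use a much larger corpus.
text = "silly wombat noodle giggle wobble"
chars = sorted(set(text))
char2idx = {c: i for i, c in enumerate(chars)}

# Each training example is a window of characters; the target is the same
# window shifted by one, so the network learns next-character prediction.
seq_len = 8
ids = np.array([char2idx[c] for c in text])
inputs = np.stack([ids[i:i + seq_len] for i in range(len(ids) - seq_len)])
targets = np.stack([ids[i + 1:i + seq_len + 1] for i in range(len(ids) - seq_len)])

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(len(chars), 32),
    # The LSTM layer carries an explicit cell state, with gates that decide
    # what to forget and what new information to store at each step.
    tf.keras.layers.LSTM(128, return_sequences=True),
    tf.keras.layers.Dense(len(chars)),  # logits over the next character
])
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
model.fit(inputs, targets, epochs=10)
```

Sampling one character at a time from the trained model, feeding each prediction back in as input, yields new words with the flavor of the training text.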
Question 1
Suppose your text corpus contains 80,000 different words. Which of the options below would you use to reduce the dimensionality of the input vector to the neural classifier?

A. Randomly select 10% of the words and ignore the rest
B. Use a convolutional layer before the fully-connected classifier layer
C. Use an embedding layer before the fully-connected classifier layer
D. Select the 10% most frequently used words and ignore the rest
Question 1: Answer
Correct answer: C. Use an embedding layer before the fully-connected classifier layer.
Question 2
We want to train a neural network to generate new funny words for a
children's book. Which architecture can we use?

A. Word-level LSTM
B. Character-level LSTM
C. Word-level RNN
D. Character-level perceptron
Question 2: Answer
Correct answer: B. Character-level LSTM.
Question 3
A recurrent neural network is called recurrent because:

A. The network is applied to each input element, and the output from the previous application is passed to the next one
B. It is trained by a recurrent process
C. It consists of layers which include other subnetworks
Question 3: Answer
Correct answer: A. The network is applied to each input element, and the output from the previous application is passed to the next one.
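A minimal sketch of this recurrence, using a single Keras SimpleRNNCell applied step by step (the step count, batch size, and feature sizes are arbitrary assumptions):

```python
import tensorflow as tf

# The same cell is applied to every input element; the state produced at
# one step is passed in as the input state for the next step.
cell = tf.keras.layers.SimpleRNNCell(units=4)
state = [tf.zeros((1, 4))]            # initial hidden state

inputs = tf.random.normal((5, 1, 3))  # 5 time steps, batch of 1, 3 features
for t in range(5):
    output, state = cell(inputs[t], state)  # previous state feeds this step

print(output.shape)  # (1, 4): hidden state after the final step
```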
Question 4
What is the main idea behind the LSTM network architecture?

A. A fixed number of LSTM blocks for the whole dataset
B. It contains many layers of recurrent neural networks
C. Explicit state management with forgetting and state triggering
Question 4: Answer
Correct answer: C. Explicit state management with forgetting and state triggering.
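This explicit state management is visible directly in Keras: an LSTMCell carries both a hidden state h and a cell state c, and its gates update them at every step. The sizes below are arbitrary assumptions:

```python
import tensorflow as tf

cell = tf.keras.layers.LSTMCell(units=8)
h = tf.zeros((1, 8))  # hidden state (short-term output)
c = tf.zeros((1, 8))  # cell state (the explicitly managed long-term memory)

x = tf.random.normal((1, 3))      # one input element
output, (h, c) = cell(x, [h, c])  # forget/input/output gates update both states
```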
Question 5
What is the main idea of attention?

A. Attention assigns a weight coefficient to each word in the vocabulary to show how important it is
B. Attention is a network layer that uses an attention matrix to see how much input states from each step affect the final result
C. Attention builds a global correlation matrix between all words in the vocabulary, showing their co-occurrence
Question 5: Answer
Correct answer: B. Attention is a network layer that uses an attention matrix to see how much input states from each step affect the final result.
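As a concrete illustration, the built-in tf.keras.layers.Attention layer exposes exactly this attention matrix. The shapes below are arbitrary assumptions standing in for decoder and encoder states:

```python
import tensorflow as tf

query = tf.random.normal((1, 4, 16))  # e.g. 4 decoder steps, 16-dim states
value = tf.random.normal((1, 6, 16))  # e.g. 6 encoder steps, 16-dim states

attention = tf.keras.layers.Attention()
context, scores = attention([query, value], return_attention_scores=True)

print(context.shape)  # (1, 4, 16): weighted sum of input states per query step
print(scores.shape)   # (1, 4, 6): the attention matrix over input steps
```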
Summary
We have covered the basics of Natural Language Processing: text representation, traditional recurrent network models, and near state-of-the-art models with attention.
Next Steps
Practice your knowledge by trying these Learn modules: there are other Learn Modules for TensorFlow that are grouped in the TensorFlow fundamentals Learning Path.
Please tell us how you liked this workshop by filling out this survey: https://aka.ms/workshopomatic-feedback
© Copyright Microsoft Corporation. All rights reserved.
