
Natural Language Processing
Build cutting-edge NLP systems
Overview
The demand for natural language processing (NLP), or the ability to render
human language through computational systems, is exploding. From
applications such as customer-service chatbots to online language translators
to AI-enabled virtual assistants, NLP has become increasingly sophisticated.
Technology professionals who are highly proficient can earn six-figure salaries
in a market that Statista predicts will grow to $43 billion by 2025—an increase
of more than 1,400% in less than a decade.

Natural Language Processing, a 10-week online program available through the
Executive Education program from Carnegie Mellon University School of
Computer Science, provides both a fundamental understanding of NLP and an
overview of its applications. Working at the intersection of language and
mathematics, you will learn how to use a variety of tools and resources,
including deep learning, to develop innovative NLP solutions for everyday
challenges.

Through its world-renowned Language Technologies Institute, CMU School of
Computer Science is a recognized leader in the NLP field. Our expert faculty
will guide you through the challenges and complexities of NLP in a thorough
and logical way. By the end of the 10-week program, you will have hands-on
experience with NLP along with a foothold in this burgeoning field.

#1 in Artificial Intelligence Specialty and Graduate Programs
for Computer Science.
(SOURCE: U.S. NEWS & WORLD REPORT)



Key Takeaways
In this program, you will:
Learn to build cutting-edge NLP systems for any domain
Use finite-state transducers (FSTs) for lemmatization—a key step in NLP
training tasks
Synthesize n-gram language models and make qualitative and quantitative
comparisons of simple and complex n-gram models
Model machine learning features from documents efficiently and
accurately using convolutional neural networks (CNNs) and long
short-term memory (LSTM) (e.g., word/sentence/document embedding)
Utilize neural networks to label parts of speech (POS) and named entities
(NER) in English and other languages
Use the Continuous Bag of Words (CBOW) model to identify word meaning
and calculate informativeness metrics (e.g., PMI, TF-IDF)
Automatically detect phrase structure grammar from tree banks by
generating parse trees
Apply FrameNet and PropBank to the task of semantic role labeling (SRL),
translate meaning representation languages (MRLs) into English
vocabulary, and use lambda calculus to apply semantic parsing of English
sentences to compositionality based on meaning

Who Should Attend?


You will have an opportunity to practice NLP applications via hands-on
programming activities. The program is particularly suitable for the
following technology professionals:

Software developers and technology professionals


Data science, data analytics, and machine learning professionals

Program Prerequisites
Participants should have strong programming abilities in Python and
knowledge of data structures and algorithms. If you have not learned
Python yet, consider CMU’s Programming with Python course.



Program Curriculum
This program consists of 10 modules, each designed to explore a specific
aspect of NLP. Together the modules outline practical applications of NLP,
which you will demonstrate by completing a short capstone project.

Module 1 Introduction to NLP

Examine the what, why, and how of NLP, its key applications, and
associated challenges. You will:

Synthesize foundational components and boundaries of NLP

Identify real-world applications of NLP, including ethical considerations
and current and future uses

Distinguish levels of linguistic structure representation independently
and as related to specific languages and language in general

Module 2 Linguistic Morphology

Explore the basics of linguistics and morphology and the importance of
morphology as both a problem and resource in NLP. Plus, learn to
distinguish prefixes, suffixes, and infixes and how to construct a simple
finite state transducer (FST) for lemmatization. You will:

Identify and describe the structure of words

Construct a finite-state machine capable of recognizing morphologically
valid words

Apply understanding of an FST via implementation of allomorphic rules
and lemmatization

Evaluate tools and resources related to morphological analysis for use in
specific industries



Module 3 Language Models and Smoothing

Synthesize models that provide the probability/frequency of a sentence
structure based on a domain and identify word boundaries. You will:

Analyze language models to determine relevance of probability theory,
smoothing, methods, and application

Identify word boundaries and how to represent words as symbols
and/or word embeddings, including rare and unknown words

Evaluate tools and resources for language model construction
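
As a rough illustration of the kind of language model and smoothing this module refers to (not taken from the program materials), a bigram model with add-k smoothing might be sketched as follows. The toy corpus and the names `train_bigram` and `bigram_prob` are assumptions made for this example.

```python
from collections import Counter


def train_bigram(sentences):
    """Count unigrams and bigrams over tokenized sentences.

    Each sentence is padded with <s> and </s> boundary markers.
    """
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams


def bigram_prob(prev, word, unigrams, bigrams, k=1.0):
    """Add-k smoothed estimate of P(word | prev)."""
    vocab_size = len(unigrams)  # number of distinct token types, markers included
    return (bigrams[(prev, word)] + k) / (unigrams[prev] + k * vocab_size)


# Toy corpus, assumed for illustration only.
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
unigrams, bigrams = train_bigram(corpus)

seen = bigram_prob("the", "cat", unigrams, bigrams)    # bigram observed once
unseen = bigram_prob("the", "sat", unigrams, bigrams)  # bigram never observed
print(seen, unseen)  # 0.25 0.125
```

Smoothing is what gives the unseen bigram a nonzero probability; without it, any sentence containing an unobserved word pair would be scored as impossible.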

Module 4 Classifiers

Define mathematical classifiers for documents and how they relate to
machine learning. You will:

Describe document topic classification and notation

Identify simple and complex classifiers

Define precision, recall, and f-scores for evaluating classifiers

Build a classifier and select features with sentiment data

Discuss advanced aspects of building classifiers

Evaluate tools and resources related to classifiers
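
For readers unfamiliar with the evaluation metrics named above, here is a minimal sketch of precision, recall, and the F1 score for a binary sentiment classifier. The labels, the example data, and the function name `precision_recall_f1` are assumptions made for illustration and do not come from the program.

```python
def precision_recall_f1(gold, predicted, positive="pos"):
    """Compute precision, recall, and F1 for one target class."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical gold labels and classifier predictions.
gold = ["pos", "pos", "neg", "neg", "pos"]
pred = ["pos", "neg", "pos", "neg", "pos"]

precision, recall, f1 = precision_recall_f1(gold, pred)
print(round(precision, 3), round(recall, 3), round(f1, 3))  # 0.667 0.667 0.667
```

Precision asks how many predicted positives were correct, recall asks how many true positives were found, and F1 is their harmonic mean.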

Module 5 Deep Learning for NLP

Learn the history and significance of neural networks in the context of NLP
and implement sentiment classification. You will:
Compare and contrast the components of simple, deep, and sequence
based neural networks

Define neural network training and optimization of trained models

Implement sentiment classification in a feed-forward neural network

Explore tools and resources for neural networks and deep learning



Module 6 Sequence Labeling: Part-of-Speech Tagging and Named-Entity Recognition (NER)
Identify computational approaches for labeling sequences of text.
You will:

Tag parts of speech

Utilize statistical and neural tools for labeling parts of speech

Perform NER tagging in English and other languages

Evaluate tools and resources for part of speech and NER tagging

Module 7 Lexical Semantics

Implement computational approaches to word meanings. You will:

Employ symbolic approaches to model word meaning

Quantify word meaning using patterns of distribution

Weight words for informativeness using PMI or TF-IDF
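
The two informativeness measures named in this module can be sketched in a few lines. The version below is one common formulation, assumed for illustration: TF-IDF with a natural-log inverse document frequency, and pointwise mutual information (PMI) estimated from document-level co-occurrence. The toy documents and the function names `tf_idf` and `pmi` are hypothetical.

```python
import math

# Toy tokenized documents, assumed for illustration only.
docs = [["the", "cat", "sat"], ["the", "dog", "sat"], ["the", "cat", "ran"]]


def tf_idf(term, doc, docs):
    """Term frequency in `doc` times log inverse document frequency."""
    tf = doc.count(term) / len(doc)
    df = sum(1 for d in docs if term in d)
    idf = math.log(len(docs) / df) if df else 0.0
    return tf * idf


def pmi(x, y, docs):
    """PMI in bits: log2 of P(x, y) / (P(x) * P(y)), counted per document."""
    n = len(docs)
    p_x = sum(1 for d in docs if x in d) / n
    p_y = sum(1 for d in docs if y in d) / n
    p_xy = sum(1 for d in docs if x in d and y in d) / n
    return math.log2(p_xy / (p_x * p_y)) if p_xy else float("-inf")


print(tf_idf("the", docs[0], docs))  # 0.0, since "the" appears in every document
print(tf_idf("dog", docs[1], docs))  # positive, since "dog" appears in only one document
print(pmi("cat", "sat", docs))       # negative: they co-occur less often than chance predicts
```

A word that appears everywhere carries little information and scores zero under TF-IDF, while a rarer word scores higher.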

Module 8 Word Embeddings

Learn how to define approaches to representing words as dense vectors.
You will:

Summarize nontechnical justification for word embeddings and intrinsic
and extrinsic evaluation of word embeddings

Investigate various efficiency techniques for representing word
embeddings

Identify alternative techniques for word embedding

Implement sentiment analysis

Improve modeling by examining results

Describe tools and resources of word embeddings



Module 9 Phrase Structure and Dependency Syntax

Learn how to interpret syntax, the representation of sentence and phrase
structure, and facets of dependency grammar. You will:

Utilize syntax through different perspectives, specifically phrase structure
and bilexical dependencies

Define constituency, types of constituents, and the use of constituency in
the English language

Manually implement phrase structure grammars to generate parse trees
and automatically learn grammars from tree banks

Summarize dependency grammar, dependency relations in the English
language, and components of dependency parsing

Module 10 Sentence Semantics

Learn how to construct meaning representations based on semantic
roles. You will:

Apply FrameNet and PropBank to the task of semantic role labeling (SRL)

Choose the best meaning representation language for a particular task

Utilize lambda calculus to apply semantic parsing of English sentences to
compositionality based on meaning
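
To give a flavor of what compositional semantics with lambda calculus can look like, here is a toy sketch (an assumption for illustration, not the program's own material). Each word denotes either an entity or a curried function, and the meaning of a sentence is built by function application. The tiny "world" of facts and all names are hypothetical.

```python
# A toy model of the world: which facts hold among which entities.
loves_facts = {("john", "mary")}  # (lover, loved) pairs
sleeps_facts = {"mary"}

# Lexicon: proper names denote entities, verbs denote curried functions.
john = "john"
mary = "mary"
sleeps = lambda subj: subj in sleeps_facts                   # λx. sleeps(x)
loves = lambda obj: lambda subj: (subj, obj) in loves_facts  # λy. λx. loves(x, y)


def apply(function, argument):
    """Function application, the only composition rule used here."""
    return function(argument)


# "John loves Mary": the verb first combines with its object to form a
# verb phrase, and the verb phrase then combines with the subject.
verb_phrase = apply(loves, mary)
john_loves_mary = apply(verb_phrase, john)

mary_loves_john = apply(apply(loves, john), mary)
mary_sleeps = apply(sleeps, mary)

print(john_loves_mary, mary_loves_john, mary_sleeps)  # True False True
```

The same words in a different order yield a different truth value because the composition follows the sentence structure.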



Program Experience

Programming Assignments
Discussions
Guest Speakers
Knowledge Checks
Capstone Project

The Carnegie Mellon School of Computer Science Difference

Expertise
Instructors who are experts in their fields blend thought leadership
with practical experience

Integration
The ability to develop a suite of interconnected learning modules
that leverage resources from across Carnegie Mellon University
School of Computer Science

Engagement
Program structured around small-group learning, which allows for
direct interaction with both instructors and peers

Reputation
A globally recognized leader in academic research



Program Faculty

David R. Mortensen
Systems Scientist, Language Technologies
Institute, School of Computer Science, Carnegie
Mellon University
David Mortensen is a systems scientist and assistant
professor in the Language Technologies Institute,
which is part of CMU's School of Computer Science.
A computational linguist, David focuses his research
on two strands: uncovering how linguistic knowledge (especially of phonology
and morphology) can contribute to NLP and using computational models to
uncover linguistic knowledge and investigate linguistic hypotheses. Before
coming to CMU in 2014, he was an assistant professor in the University of
Pittsburgh's Department of Linguistics.
David completed his graduate work at the University of California, Berkeley,
where he received a Ph.D. in linguistics for a thesis on theoretical phonology.
He also holds a bachelor's degree in English literature, with minors in
linguistics and psychology, from Utah State University and attended the LSA
Summer Institute at Massachusetts Institute of Technology.

Alan Black
Professor, Language Technologies Institute, School
of Computer Science, Carnegie Mellon University
Alan Black is a professor in the Language Technologies
Institute at CMU. A world leader in the area of speech
synthesis, Alan is the principal author of the Festival
Speech Synthesis System, a free software system
used worldwide by academic and industrial groups.
He is also the author of the FestVox Voice Building tools, which have been used
to create speech synthesizers in more than 700 different languages. He
specializes in spoken dialogue systems, language generation, code switching,
and language technologies for low-resource languages.
Alan received his Ph.D. from the University of Edinburgh and holds a B.S. (with
honors) in computer science from Coventry University. Before coming to CMU
in 1999, he was a research fellow at the Centre for Speech Technology Research
at the University of Edinburgh. He has published around 300 peer-reviewed
papers covering many aspects of speech and language technologies and has
served on committees at many international conferences and workshops.



Guest Speaker

Burr Settles
Burr Settles leads the research group at Duolingo, an
award-winning website and mobile app offering free
language education globally. He also runs
FAWM.ORG, a global annual songwriting experiment.
He is the author of Active Learning—an intermediate
text on machine learning algorithms that are
adaptive, curious, and exploratory (if you will).

His research has been published in NeurIPS, ICML, AAAI, ACL, EMNLP,
NAACL-HLT, and CHI, and has been covered by The New York Times, Slate,
Forbes, WIRED, and the BBC, among others. Previously he was a postdoctoral
researcher at Carnegie Mellon and earned a Ph.D. from University of
Wisconsin—Madison. Burr currently lives in Pittsburgh, where he gets around
by bike and plays guitar in the pop band Delicious Pastries.



Certificate
Upon successful completion of the program, participants will receive a
verified digital certificate of completion from Carnegie Mellon University
School of Computer Science Executive Education.

This document confirms that

[Recipient Name]
has successfully completed the program

Natural Language Processing

Ram Konduru
Director of Executive Education

Prof. David R. Mortensen & Prof. Alan Black
Department of Computer Science

Your digital certificate will be issued in your legal name and emailed to you
at no additional cost upon completion of the program, per the stipulated
requirements. All certificate images are for illustrative purposes only and
may be subject to change at the discretion of CMU School of Computer
Science Executive Education.



About Carnegie Mellon University’s School of Computer Science
The School of Computer Science (SCS) at Carnegie Mellon University is
recognized and respected internationally as a center for unparalleled
research and education in computer science. A home to world-class faculty,
SCS offers undergraduate and graduate education and research opportunities
that are second to none, along with executive education programs designed
for professionals who work in a variety of technical leadership roles. SCS
is known for being at the forefront of, and often setting the course for,
advanced computer science disciplines, including artificial intelligence,
computational biology, human-computer interaction, language technologies,
machine learning, robotics, and software research.

About Emeritus
Carnegie Mellon University’s School of Computer Science is collaborating
with online education provider Emeritus to offer a portfolio of high-impact
online programs. By working with Emeritus, we are able to broaden access
beyond our on-campus offerings in a collaborative and engaging format that
stays true to the quality expected from the CMU School of Computer Science
Executive Education.

The Emeritus approach to learning is grounded in a cohort-based design to
maximize peer-to-peer sharing and includes live teaching with world-class
faculty and hands-on project-based learning. In the last year, more than
100,000 students from over 80 countries have benefited professionally from
Emeritus courses.

DURATION
10 Weeks

FORMAT
Online

PROGRAM FEE
US$2,500

CONNECT WITH A
PROGRAM ADVISOR

Email: CMUSCS@emeritus.org
Phone: +1 412-314-2432

Easily schedule a call with a program advisor to learn more
