Professional Documents
Culture Documents
UNIT-III (10LECTURES)
TEXT REPRESENTATION:
Vector Space Models, Basic Vectorization Approaches, One-Hot Encoding, Bag of Words, Bag of N-Grams, TF-IDF,
Distributed Representations, Word Embeddings, Going Beyond Words, Distributed Representations Beyond Words and
Characters, Universal Text Representations, Visualizing Embeddings, Handcrafted Feature Representations.
Learning Outcomes: At the end of the module student will be able to:
Describe vector space models. (L2)
Choose one of the various vectorization approaches. (L3)
Prepare Handcrafted Feature Representations. (L3)
Learning Outcomes: At the end of the module student will be able to:
Describe pipeline for building text classification systems. (L2)
Describe how to use existing text classification APIs. (L3)
Apply pre-trained models to classify the data. (L3)
Learning Outcomes: At the end of the module student will be able to:
Describe temporal information extraction. (L2)
Explain various relationship extraction approaches. (L2)
Apply various APIs to relationship extraction. (L2)
TEXTBOOKS:
1. Sowmya Vajjala, Bodhisattwa Majumder, Anuj Gupta and Harshit Surana, “Practical Natural Language
Processing”, O’Reilly Media Inc., 2021, ISBN: 978-93-8588-918-9.
Computer Applications R-2022
REFERENCE BOOKS:
1. Daniel Jurafsky and James H Martin, ”Speech and Language Processing: An Introduction to Natural Language
Processing, Computational Linguistics and Speech Recognition”, Prentice Hall, 2nd Edition, 2009.
2. C. Manning and H. Schutze, “Foundations of Statistical Natural Language Processing”, MIT Press. Cambridge,
MA,1999.
3. James Allen, “Natural Language Understanding”, The Benjamin/Cummings Publishing Company Inc.
4. ToolkitSteven Bird, Ewan Klein, and Edward Loper, “Natural Language Processing with Python – Analyzing Text
with the Natural Language”, 1st Edition, 2009.