CSM344 – NATURAL LANGUAGE PROCESSING

SAMPLE QUESTIONS

UNIT – 1

2 Marks:

1. Describe the application of regular expressions in the realm of text
manipulation and analysis.
2. Elucidate the intricacies inherent in Natural Language Processing (NLP)
and the associated computational challenges.
3. Describe a specific application of Natural Language Processing (NLP).
4. Provide a brief overview of different methodologies employed in text
encoding.

10 Marks:

1. Detail the role of Natural Language Processing (NLP) in practical
implementations, such as its utilization within spam detection systems,
considering its technical intricacies and effectiveness.
2. Detail the application of regular expressions for information extraction
from textual documents, elucidating with an illustrative example,
emphasizing technical considerations.
3. Describe the sequential procedures constituting a text processing pipeline,
emphasizing technical nuances and considerations.

UNIT – 2

2 Marks:

1. Outline the methods utilized for tokenization in text analysis.
2. Detail the procedures involved in stemming and lemmatization within
NLP.
3. Elucidate the techniques employed for stop word removal in NLP.
4. Explain the fundamental principle underlying TF-IDF representation.

10 Marks:

1. Describe in detail the implementation of tokenization on a textual
document, including an example to illustrate the process, while
emphasizing technical aspects and considerations.
2. Elaborate on the technical intricacies involved in the process of
canonicalization, detailing its steps and significance in data processing
and normalization, with supporting examples to illustrate its application.
3. Detail the technical aspects encompassing both the advantages and
limitations associated with the removal of stop words in natural language
processing tasks, considering their impact on data processing and
analysis.
4. Provide a comprehensive technical analysis comparing and contrasting
stemming and lemmatization techniques in text analysis, elucidating their
respective methodologies, advantages, and limitations, supported by
illustrative examples for each approach.

UNIT – 3

2 Marks:

1. Compare and contrast constituency parsing and dependency parsing.
2. Give the syntactic structure of the following sentence using constituency
parsing: “The lazy cat killed the big fat rat.”
3. Briefly describe Part-of-Speech (PoS) tagging, highlighting its
advantages and limitations in natural language processing.
4. Briefly describe how HMMs can be used for understanding sequential
data.

10 Marks:

1. Detail the technical intricacies of Part-of-Speech (PoS) tagging within
text processing, including its role in facilitating comprehension of textual
documents, emphasizing its implementation, significance, and impact on
natural language understanding, with supporting examples to illustrate its
effectiveness.
2. Elaborate on the technical aspects of different parsing methods,
delineating their strengths and weaknesses in the context of natural
language processing. Provide examples to illustrate each parsing method's
application and discuss how they contribute to text analysis tasks.
3. Detail the technical significance of syntactic processing in enhancing the
efficacy of Natural Language Processing (NLP) applications, elucidating
its role in optimizing performance. Provide a comprehensive analysis,
including examples, to illustrate how syntactic processing contributes to
the advancement of NLP tasks.
