1. Describe the application of regular expressions in the realm of text manipulation and analysis.
2. Elucidate the intricacies inherent in Natural Language Processing (NLP) and the associated computational challenges.
3. Detail a specific instance of Natural Language Processing (NLP) application.
4. Provide a brief overview of different methodologies employed in text encoding.
10 Marks:
1. Detail the role of Natural Language Processing (NLP) in practical implementations, such as its utilization within spam detection systems, considering its technical intricacies and effectiveness.
2. Detail the application of regular expressions for information extraction from textual documents, elucidating with an illustrative example, emphasizing technical considerations.
3. Describe the sequential procedures constituting a text processing pipeline, emphasizing technical nuances and considerations.

UNIT – 2
2 Marks:
1. Outline the methods utilized for tokenization in text analysis.
2. Detail the procedures involved in stemming and lemmatization within NLP.
3. Elucidate the techniques employed for stop word removal in NLP.
4. Explain the fundamental principle underlying TF-IDF representation.
10 Marks:
1. Describe in detail the implementation of tokenization on a textual document, including an example to illustrate the process, while emphasizing technical aspects and considerations.
2. Elaborate on the technical intricacies involved in the process of canonicalization, detailing its steps and significance in data processing and normalization, with supporting examples to illustrate its application.
3. Detail the technical aspects encompassing both the advantages and limitations associated with the removal of stop words in natural language processing tasks, considering their impact on data processing and analysis.
4. Provide a comprehensive technical analysis comparing and contrasting stemming and lemmatization techniques in text analysis, elucidating their respective methodologies, advantages, and limitations, supported by illustrative examples for each approach.

UNIT – 3
2 Marks:
1. Compare and contrast constituency parsing and dependency parsing.
2. Give the syntactic structure of the following sentence using constituency parsing: "The lazy cat killed the big fat rat."
3. Briefly describe Part-of-Speech (PoS) tagging, highlighting its advantages and limitations in natural language processing.
4. Briefly describe how Hidden Markov Models (HMMs) can be used to model sequential data.
10 Marks:
1. Detail the technical intricacies of Part-of-Speech (PoS) tagging within text processing, including its role in facilitating comprehension of textual documents, emphasizing its implementation, significance, and impact on natural language understanding, with supporting examples to illustrate its effectiveness.
2. Elaborate on the technical aspects of different parsing methods, delineating their strengths and weaknesses in the context of natural language processing. Provide examples to illustrate each parsing method's application and discuss how they contribute to text analysis tasks.
3. Detail the technical significance of syntactic processing in enhancing the efficacy of Natural Language Processing (NLP) applications, elucidating its role in optimizing performance. Provide a comprehensive analysis, including examples, to illustrate how syntactic processing contributes to the advancement of NLP tasks.