Table of Contents
Abstract
1. Introduction
Types/Classification
Importance/Significance in Linguistics
Ethical Considerations
2. Literature Review
TensorFlow
PyTorch
SpaCy
Preprocessing Techniques
Feature Engineering
4. Conclusion
5. References
Abstract:
A new era of multidisciplinary study has been brought about by the convergence of computational linguistics and artificial intelligence (AI), which has completely changed the field of language processing and comprehension. This term paper explores the symbiotic link between artificial intelligence (AI) and computational linguistics, examining how these fields interact to advance machine translation, sentiment analysis, natural language processing, and other areas. This study seeks to clarify the critical role artificial intelligence (AI) will play in influencing language technology going forward by thoroughly examining important ideas, approaches, and applications. Through comprehending the complex interplay between computers and language, we may explore the possibilities, difficulties, and moral dilemmas that lie ahead.
1. Introduction:
The fusion of artificial intelligence and computational linguistics marks a paradigm shift in how humans use, interact with, and comprehend language. Understanding and processing natural language has become essential for technological advancement in the digital age, opening up new and innovative applications in a variety of fields. Computational linguistics, grounded in theoretical linguistics, offers the theoretical framework for deciphering and comprehending human
language. Conversely, artificial intelligence (AI) gives machines the capacity for learning,
reasoning, and decision-making—abilities that are critical for handling the complexity of
language. The theoretical foundation for artificial intelligence (AI) and computational linguistics
was established in the middle of the 20th century by trailblazers like Alan Turing and Noam
Chomsky. In his ground-breaking work "Computing Machinery and Intelligence" (1950), Alan
Turing introduced the renowned Turing Test, which asked whether a machine might display intelligent behaviour that could be mistaken for human behaviour.
Early researchers, encouraged by this vision, attempted to create machine translation systems in the 1950s and 1960s. Nevertheless, the intricacy of natural language presented formidable obstacles, culminating in the infamous "AI winter" of the 1970s, when funding and interest in the field waned.
1.1.2 Re-emergence and Progress: In the 1980s, the area saw a rebirth thanks to the use of
statistical models and the introduction of more potent computational resources. Language
processing jobs started to heavily rely on machine learning techniques like Hidden Markov Models. Even so, ambiguity, limited context awareness, and linguistic cultural quirks continue to be major challenges. More advanced deep
learning models, the integration of linguistic theories, and the resolution of ethical issues remain active areas of research.
1.3 Applications:
Machine translation systems, speech recognition tools, virtual assistants (like Siri and Alexa), and more demonstrate the influence of AI and
computational linguistics. The partnership between AI and Computational Linguistics has the potential to transform language technology. Natural Language Processing (NLP) lies at the intersection of computational linguistics and artificial intelligence. It entails creating models and algorithms that let computers comprehend, interpret, and produce human language. NLP applications include part-of-speech tagging, named entity recognition, sentiment analysis, and machine translation.
Machine Translation: With the development of neural networks and deep learning, machine
translation has advanced, building on the initial attempts. These technologies are used by
systems like Google Translate to produce translations that are more precise and appropriate for
the context.
Speech Recognition: The creation of systems that can translate spoken language into written text
is another essential component. Applications for this technology can be found in transcription services, voice assistants, and accessibility tools.
Semantic Analysis: Understanding the meaning behind words and sentences is a complex task, and AI techniques are increasingly used to tackle it.
Artificial Intelligence is the field of study of how to build or program computers to make them able to do what human minds are capable of doing (Boden, 1996). The goal of the field is to create machines and systems that are able to carry out tasks that normally require human intelligence; learning, reasoning, perception, and problem solving are among these tasks. Artificial intelligence (AI) thus seeks to build devices and systems that can mimic or recreate the cognitive processes involved in human intellect. The academic definition of AI
often encompasses various subfields and approaches. Here's a more detailed breakdown:
Problem Solving and Search: Artificial Intelligence entails the creation of methods and algorithms for resolving issues and exploring potential solutions. This covers techniques such as heuristic search and optimisation.
Knowledge Representation and Reasoning: AI systems must be able to represent knowledge about the world in a way that enables them to draw conclusions and take action. Formal languages, logic, and ontologies are commonly used for this purpose.
Planning: AI systems need to be able to organise and schedule tasks in order to accomplish
particular objectives. This involves creating algorithms for formulating strategies and making
decisions.
Machine Learning: A key component of artificial intelligence is machine learning, which is the
creation of models and algorithms that let systems learn from data and get better over time. This includes supervised, unsupervised, and reinforcement learning.
Natural Language Processing (NLP): NLP is concerned with making machines capable of comprehending, interpreting, and producing human language. This covers jobs like question-answering, machine translation, and text summarisation.
Perception: AI systems frequently require the capacity to sense and comprehend their surroundings. This includes computer vision, speech recognition, and other sensory abilities.
Robotics: Robotics, where intelligent systems are created to interact with the real environment,
is closely related to artificial intelligence. This covers activities including path planning, navigation, and manipulation.
Cognitive Computing: This area aims to create artificial intelligence (AI) systems that imitate human thought processes such as reasoning and learning.
The science of artificial intelligence is fast developing, and scientists are always looking for new
methods and strategies to increase the power of intelligent machines. It has connections to many other disciplines, including philosophy, psychology, and neuroscience (Ertel, 2018).
Computational linguistics, for its part, develops models, algorithms, and applications to let computers analyse and comprehend human language by fusing ideas and techniques from computer science and linguistics. This entails applying
computational methods to analyse, simulate, and model elements of natural language in order to
create applications that include speech recognition, machine translation, information retrieval, and text analysis.
The study of the computational components of the human language faculty is a common characterisation of the field, which encompasses several subareas, such as:
Syntax and Grammar: In order to study the formal structures and laws of language,
computational linguistics creates models and algorithms for sentence parsing and the generation of grammatically correct output.
Semantics: Language meaning is the main topic of this field. By representing and manipulating the meaning of words, phrases, and sentences, computational techniques aim to help computers grasp what language conveys.
Pragmatics: This area studies how meaning is influenced by context. Discourse structure, speech acts, and reference resolution are among the topics it covers.
Speech Processing: Speech synthesis and recognition rely heavily on computational linguistics. It
entails creating algorithms to translate spoken language into written language and vice versa.
Text Mining and Information Retrieval: In this field, researchers create methods for gleaning
valuable information from massive amounts of textual data. This covers document classification, summarisation, and search.
Machine Translation: The creation of machine translation systems, which translate text or speech from one language to another, builds directly on computational linguistic research. More broadly, computational linguistics is a science that studies how computers process human language. It covers areas such as:
Corpus Linguistics: In order to help construct language models and comprehend linguistic phenomena, computational linguists examine patterns and frequencies of language usage using large text corpora.
Cognitive Modelling: This subfield investigates the relationship between computer models and human cognition in order to better understand how people interpret, produce, and process language.
Computational linguistics has practical applications in many different fields. It is a rapidly developing discipline that makes important contributions to the creation of intelligent systems that can comprehend and produce human language as technology continues to advance.
Artificial Intelligence and Computational Linguistics have a lot of potential to work together to
create novel solutions in machine translation, sentiment analysis, chatbots, and other areas. The
goal of this synergy is to close the gap between machine comprehension and human
communication. The advancements in recent years, such as chatbots that can converse naturally
and language models that produce text that appears human, highlight the partnership's
transformational potential.
We shall explore the domains of artificial intelligence and computational linguistics in the pages that follow, peeling back the layers of innovation and learning that have elevated these subjects to the forefront of modern technology.
a. Ambiguity and Context Sensitivity: One of the main challenges is the ambiguity and context sensitivity of natural language. Artificial intelligence models often struggle to comprehend the nuanced connotations that words acquire in different contexts.
b. Multilingual Understanding: It is still difficult to create models that can comprehend and generate many languages equally well.
c. Common-Sense Knowledge: Everyday, real-world knowledge is largely absent from current models, which makes it difficult for them to comprehend and react appropriately to everyday circumstances.
d. Lack of Annotated Data: A significant amount of annotated data is needed for training on many linguistic tasks, including named entity recognition and sentiment analysis. The creation of such datasets is costly and time-consuming.
e. Interdisciplinary Collaboration:
It's critical to close the communication gaps between linguists, cognitive scientists, and AI
researchers. To create language models that are more successful, collaborative efforts are
required to integrate computational methods with linguistic models (Perc, Ozer, & Hojnik,
2019).
a. Bias: Biased datasets used to train AI algorithms have the potential to reinforce and even magnify societal prejudices. One of the most important ethical issues in language modelling is ensuring fairness in both training data and model outputs.
b. Privacy Breach:
Language models frequently handle delicate textual data. It might be difficult to strike a balance between model utility and user privacy.
c. Misuse:
As language models become more sophisticated, there is a risk that they will be misused to produce false information, deep fakes, or other harmful content. Governance frameworks and safeguards are needed to prevent such misuse.
e. Informed Consent:
When implementing language models in chatbots or other applications, users must give their informed consent, and providers must communicate clearly about how user data will be used, especially when sensitive information is involved.
b. Zero-shot and Few-shot Learning: Future language models might improve their ability to generalise to new tasks from only a few examples, allowing them to adapt to a variety of linguistic settings.
c. Interpretability: To understand how models reach their decisions and to address ethical issues, efforts to improve their interpretability will be essential.
d. Multimodal NLP: More and more data will be integrated from many modalities (text, image, and audio), allowing models to comprehend context and human intent more fully.
e. Applied NLP: Research will increasingly target domain-specific problems with an eye towards healthcare, education, and customer assistance, among other real-world applications.
f. Continual Learning: Adaptive models that can learn continually over time without forgetting previously acquired knowledge will be another important direction.
In order to fully utilise natural language processing in a responsible and advantageous way, it
will be essential to address these issues and ethical concerns as AI and linguistics develop.
This term paper explores the fundamental ideas, practices, and applications that connect artificial intelligence and computational linguistics. It attempts to provide a thorough picture of the significant influence these fields have on one
another and the larger field of technology by examining the historical background, contemporary
cutting-edge innovations, and future directions. We will also negotiate the difficulties presented
by prejudices, ethical issues, and the ongoing search for more empathetic and effective
human-machine communication.
1.8 Significance
Within the science of linguistics, artificial intelligence (AI) and computational linguistics
have become essential fields that are transforming language studies in previously unheard-of
ways. These interdisciplinary fields provide sophisticated computational models that mimic and
evaluate linguistic processes by combining concepts from cognitive science, computer science,
and linguistics in a synergistic way. The potential of artificial intelligence and computational
linguistics to decipher the complex structures of human language is one of its main significances,
since it allows scholars to learn more about grammatical patterns, syntactic structures, and semantic relationships. These fields harness machine learning and natural language processing methods to enable automated analysis of large linguistic datasets
and to develop intelligent systems that can produce and comprehend language similar to that of
humans. Through this symbiotic relationship between linguistics and AI, new perspectives on
language evolution, acquisition, and usage can be gained, leading to a more sophisticated understanding of language itself.
Furthermore, there are significant ramifications for language technology and human-
computer interaction when AI and computational linguistics are combined in linguistic study.
Artificial intelligence (AI)-powered Natural Language Processing (NLP) systems are now a
necessary part of daily life, powering sentiment analysis software, language translation tools, and virtual assistants, improving accessibility while also bridging communication gaps across linguistically varied societies. The importance of AI and computational linguistics in academia goes beyond the creation of novel instruments and techniques.
The interaction between AI and linguistics promises to advance linguistic research, encourage
interdisciplinary cooperation, and influence the direction of these subjects as they develop.
2. Literature Review
The research explores the ideas and design of a communication system intended for usage in large distributed AI systems for natural language processing. The system's goals, such as efficient communication across local networks of workstations, have been described. The authors contended that the adoption of sensible theoretical ideas, such as those contained in Hoare (1978), leads to substantially more powerful solutions than the impromptu communication devices used when a communication demand emerges. A slightly modified channel paradigm was implemented on top of PVM, the de facto standard for distributed system communication. The system structure represents a collection of parts that exchange information bilaterally between themselves without the need for a central mechanism or data structure to take part in each exchange; communication becomes purely local once the partners' identities have been confirmed (Amtrup & Benra, 1996).
The research implemented a central name server to handle creation requests and to register the components operating within an application. There are two types of channels:
those that ensure successful communication between any two partners and those that allow the
parameters of the message channel to be customised to suit specific preferences. Split channels
also make it simple to configure a system with regard to interchangeable system components and
associated visualisation.
Further, it demonstrated the advantages of the communication system achieved with this approach in a variety of scenarios and system contexts, from highly interactive to purely batch-oriented systems.
In "Analyzing collaborative learning processes automatically: Exploiting the advances of computational linguistics in computer-supported collaborative learning", the researchers (Rosé et al., 2008) provide an overview of the newly-emerging field of text categorization research that is centred on the issue of collaborative learning process analysis, both generally and more narrowly in terms of a set of freely accessible tools known as TagHelper tools. It takes time and effort to analyse the range of
pedagogically valuable aspects of learners' interactions. Adapting and applying modern text categorization technology to the analysis of collaborative learning processes will make it easier to extract insights from corpus data.
Many industrial applications, including question answering, legal text or news summarisation, and headline generation systems, are actively integrating automatic text summarization (ATS), a subset of natural language processing. In the context of the big data and Industrial Revolution 4.0 age, the explosion in the amount of text data from diverse sources necessitates the development of reliable techniques for automatically generated summaries evaluation (AGSE), a subtask of ATS. The study (Ayed, Biskri, & Meunier, 2021) suggests an explainable and cognitive approach to AGSE: the Kintsch reading comprehension model has been computationally adapted into the proposed model, which was put to the test and contrasted with existing approaches.
This paper focuses on the knowledge acquisition aspects of an ongoing project that deals with setting up a general environment for the creation and use of Large Knowledge Bases (LKBs). The project is being carried out within the framework of the French National Centre for Scientific Research (CNRS) with support from public and private bodies. The LKBs have the feature that every bit of data, "facts" and "rules", stored in secondary memory is represented in a uniform format. Knowledge for the "assertional" component of the fact base (episodic memory) is acquired from natural language messages (Zarri, 1990). The messages are translated into the internal Knowledge Description Language (KDL) and "filtered" to remove irrelevant information.
The research "The Importance of Advancing Computational Linguistics" (Nilufar, 2023) explores the vital importance of developing computational linguistics in the quickly changing digital world of today. Natural language processing has been completely transformed by systems that produce, understand, and communicate with human language at previously unheard-of levels of accuracy. The strategic value of computational linguistics is emphasised in this work. The capacity to create multilingual and cross-lingual models promotes inclusivity and makes it easier for varied language populations to communicate. Global connectivity is supported, data-driven insights are fostered, user experiences are enhanced, ethical AI is ensured, and linguistic diversity is celebrated. It is critical to support and
invest in computational linguistics if we are to lead our society towards a time when technology understands and serves every language community.
This article examines how bibliometric text file mining is made possible by the Defence
Advanced Research Projects Agency (DARPA) initiative, which is developing the Technology
Opportunities Analysis System (TOAS). With the help of software called TOAS, relevant data
can be extracted from literature abstract files. These files contain fields that have been found to
repeat in each abstract record of a number of databases, including U.S. Patents, Engineering
Index (ENGI), INSPEC, Business Index, and National Technical Information Service (NTIS)
Research Reports. Natural language processing (NLP), computational linguistics (CL), fuzzy
analysis, latent semantic indexing, and principal components analysis (PCA) are just a few of the
technologies that the TOAS uses (Watts, Porter, Cunningham, & Zhu, 1997). This software applies list-processing operations (counts, comparisons, and sorting of the field results of aggregated records retrieved by search terms) to uncover patterns.
The aforementioned frameworks are extensively employed in the domains of linguistics and
artificial intelligence (AI) for a variety of applications, including deep learning, machine
learning, and natural language processing (NLP). Here is a quick synopsis of each:
i) TensorFlow:
Purpose: TensorFlow is a popular open-source machine learning framework, developed by Google, for creating and training machine learning models.
Key Features: It offers a thorough ecosystem for deep learning and machine learning. It supports deep learning models as well as conventional machine learning methods and makes it possible to deploy trained models across servers, browsers, and mobile devices.
ii) PyTorch:
Purpose: Facebook's AI Research lab (FAIR) created the open-source machine learning library PyTorch. It is renowned for its dynamic computational graph, which improves its intuitiveness.
Key Features: Provides dynamic computation, which facilitates model understanding and debugging. Popular among researchers because of its user-friendliness and versatility. Has a vibrant community and a rich ecosystem of libraries.
iii) NLTK:
Purpose: The Python library NLTK (Natural Language Toolkit) is used to work with data related to human language. It is frequently used for activities involving text processing and analysis and offers user-friendly interfaces to corpora and lexical resources.
Key Features: Provides a large selection of tools to handle tasks including parsing, tokenization, stemming, tagging, and semantic reasoning; it is widely utilised for teaching and prototyping in natural language processing.
iv) SpaCy:
Purpose: An open-source Python package called SpaCy is used for advanced natural language processing. It is designed for production use and high performance, which qualifies it for practical applications. Tools for named entity recognition, part-of-speech tagging, and dependency parsing are included.
These frameworks are essential for creating applications involving AI and NLP. While NLTK and SpaCy are specifically designed for natural language processing, providing tools and functions for linguistic analysis and understanding, TensorFlow and PyTorch are more general-purpose frameworks for building and training machine learning models.
i) Pre-processing Techniques:
Pre-processing is the process of sanitising and converting unprocessed text data into a format suitable for analysis.
Purpose: Inconsistencies, extraneous information, and noise are frequently found in raw text
data. Pre-processing aids in the elimination of these problems to improve the data's quality.
Common Techniques: Text can be tokenized (broken down into words or subwords), stemmed (words are reduced to their base or root form), lemmatized (words are reduced to their dictionary form), stripped of stop words (common words that don't add much to the meaning), and cleaned of special characters.
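The steps above can be sketched in a few lines of plain Python. The stop-word list and suffix rules here are deliberately tiny and purely illustrative; production systems would use NLTK's or SpaCy's tokenizers, stop-word lists, and stemmers:

```python
import re

# Toy stop-word list; real lists (e.g., NLTK's) contain hundreds of entries.
STOP_WORDS = {"the", "a", "an", "is", "are", "and", "or", "to", "of"}

def tokenize(text):
    # Lowercase and split on any run of non-alphanumeric characters.
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]

def remove_stop_words(tokens):
    return [t for t in tokens if t not in STOP_WORDS]

def naive_stem(token):
    # Crude suffix stripping; a real stemmer (e.g., Porter) has many more rules.
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

text = "The cats are chasing the mice and playing."
tokens = remove_stop_words(tokenize(text))
stems = [naive_stem(t) for t in tokens]
print(stems)
```

The sketch only shows where each step fits in the pipeline; note how crude stemming can produce non-words such as "chas", which is why dictionary-based lemmatization is often preferred.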
ii) Feature Engineering:
In feature engineering, features are chosen, transformed, or created from the raw data in order to make it more useful to a learning algorithm.
Goal: Robust features are essential to the performance of machine learning models. The goal of feature engineering is to describe the data in a way that captures pertinent patterns and relationships. Examples of features in linguistic analysis are syntactic characteristics, semantic features, word frequencies, and n-grams, which are contiguous sequences of n words. It is also possible to represent words as dense numerical vectors (word embeddings).
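As a small illustration of the word-frequency and n-gram features just described, the following pure-Python snippet counts unigrams and bigrams; it is a sketch only, and libraries such as scikit-learn's CountVectorizer do the same at scale:

```python
from collections import Counter

def ngrams(tokens, n):
    # Collect contiguous n-word sequences from a token list.
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bag_of_ngrams(text, n=2):
    # Combine word frequencies (unigrams) with n-gram counts
    # into one simple feature dictionary.
    tokens = text.lower().split()
    counts = Counter(tokens)
    counts.update(ngrams(tokens, n))
    return counts

features = bag_of_ngrams("to be or not to be")
print(features["to be"], features["be"])
```

Such count dictionaries can then be mapped to fixed-length vectors over a shared vocabulary, which is the form most classifiers expect.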
Supervised Learning: In supervised learning, input data and matching output labels are paired, and the model is trained on a labelled dataset. From the given examples, the model learns how to map
inputs to outputs.
Unsupervised Learning: Using unlabelled data, models are trained using unsupervised learning.
Without explicit direction in the form of labelled outputs, the model investigates the underlying
Application: Supervised learning is useful in linguistic analysis for tasks such as named entity
recognition, sentiment analysis, and part-of-speech tagging. Unsupervised learning can be used
for tasks such as topic discovery in a corpus of texts or clustering similar documents.
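A toy illustration of the supervised setting: the labelled sentences below are invented, and a 1-nearest-neighbour rule over word overlap stands in for a real learning algorithm. An unsupervised method would instead cluster the same texts without any "pos"/"neg" labels:

```python
def jaccard(a, b):
    # Word-overlap similarity between two sentences.
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

# Labelled training examples guide the prediction (supervised learning).
train = [("this film was great fun", "pos"),
         ("great acting and a great plot", "pos"),
         ("a dull and boring film", "neg"),
         ("boring plot with terrible acting", "neg")]

def classify(text):
    # Predict the label of the most similar training example.
    return max(train, key=lambda ex: jaccard(text, ex[0]))[1]

print(classify("great fun"))          # resembles the positive examples
print(classify("terrible and dull"))  # resembles the negative examples
```

Real systems replace the hand-written similarity rule with a trained model, but the principle is the same: labelled examples supply the mapping from inputs to outputs.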
3.2 Coding Examples
Because of its extensive library ecosystem, Python is a popular programming language in the
Natural Language Processing (NLP) sector. For NLP tasks, the Natural Language Toolkit
(NLTK) is an effective library. To conduct tokenization with NLTK, for example:

import nltk
from nltk.tokenize import word_tokenize

nltk.download('punkt')  # tokenizer models, downloaded once
text = "Computational linguistics meets AI."  # illustrative sentence
tokens = word_tokenize(text)
print(tokens)

Here, we import the nltk library and specifically use the word_tokenize function for tokenization.
Training a model to predict the next word in a sequence is the first step in creating a basic language model. An elementary illustration of a character-level language model built with TensorFlow and Keras (the vocabulary and layer sizes below are illustrative placeholders):

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

vocab_size = 128  # number of distinct characters (illustrative)
model = Sequential()
model.add(Embedding(vocab_size, 64))  # map character ids to vectors
model.add(LSTM(128))  # model the character sequence
model.add(Dense(vocab_size, activation='softmax'))  # next-character probabilities
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')

This example uses a simple LSTM (Long Short-Term Memory) neural network for sequence prediction.
Sentiment analysis is the process of categorising a text's sentiment. Creating a simple sentiment classifier with scikit-learn (assuming X_train, X_test, y_train, and y_test already hold the split texts and labels):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score

vectorizer = TfidfVectorizer()
X_train_vectorized = vectorizer.fit_transform(X_train)
X_test_vectorized = vectorizer.transform(X_test)
model = MultinomialNB()
model.fit(X_train_vectorized, y_train)
predictions = model.predict(X_test_vectorized)
accuracy = accuracy_score(y_test, predictions)
print("Accuracy:", accuracy)

This example uses the Multinomial Naive Bayes classifier with TF-IDF vectorization.
NLP techniques are used in the chatbot building process to comprehend user input and provide relevant responses. For building chatbots, the ChatterBot library is a straightforward tool:

from chatterbot import ChatBot
from chatterbot.trainers import ChatterBotCorpusTrainer

chatbot = ChatBot('MyBot')
trainer = ChatterBotCorpusTrainer(chatbot)
trainer.train('chatterbot.corpus.english')
response = chatbot.get_response('Hello, how are you?')
print(response)

This example uses ChatterBot to create a chatbot and train it on English corpus data.
These examples provide a starting point for various NLP tasks and can be expanded upon based on specific requirements.
4. Conclusion
The convergence of artificial intelligence and computational linguistics has produced both notable advancements and enduring difficulties. Important discoveries highlight the power of modern language models in tasks such as sentiment analysis and machine translation. Nonetheless, there are still issues that need to be resolved, such as the complex
nature of linguistic ambiguity, the requirement for large amounts of annotated data, and the
necessity of taking privacy and prejudice into account. Since linguistic findings continue to shape how models are built, collaboration between linguists, cognitive scientists, and AI researchers is essential to the field's advancement.
Linguistic research and its practical applications have a transformational potential as AI models
increasingly use linguistic concepts. With its roots in the study of language meaning, structures,
and communication, linguistics is today deeply entwined with cutting-edge technologies that
have the potential to improve human language comprehension. The ethical issues and difficulties
that have been found highlight the significance of developing AI responsibly and exhort
professionals to give equity, openness, and user privacy top priority. The potential for linguistics
and AI to work together in new ways is intriguing. New developments could completely change the way we engage with language, close communication barriers, and push the boundaries of what machines can understand.
References
Amtrup, J. W., & Benra, J. (1996). Communication in large distributed AI systems for natural language
processing. Paper presented at the COLING 1996 Volume 1: The 16th International Conference
on Computational Linguistics.
Ayed, A. B., Biskri, I., & Meunier, J.-G. (2021). An efficient explainable artificial intelligence model of
automatically generated summaries evaluation: a use case of bridging cognitive psychology and
computational linguistics. Explainable AI Within the Digital Transformation and Cyber Physical
Systems: XAI Methods and Applications, 69-90.
Boden, M. A. (1996). Artificial intelligence: Elsevier.
Ertel, W. (2018). Introduction to artificial intelligence: Springer.
Grishman, R. (1986). Computational linguistics: an introduction: Cambridge University Press.
Ledeneva, Y., & Sidorov, G. (2010). Recent advances in computational linguistics. Informatica, 34(1).
Nilufar, N. (2023). THE IMPORTANCE OF ADVANCING COMPUTATIONAL LINGUISTICS. Paper presented at
the International Scientific and Current Research Conferences.
Perc, M., Ozer, M., & Hojnik, J. (2019). Social and juristic challenges of artificial intelligence. Palgrave
Communications, 5(1).
Rosé, C., Wang, Y.-C., Cui, Y., Arguello, J., Stegmann, K., Weinberger, A., & Fischer, F. (2008). Analyzing
collaborative learning processes automatically: Exploiting the advances of computational
linguistics in computer-supported collaborative learning. International journal of computer-
supported collaborative learning, 3, 237-271.
Watts, R. J., Porter, A. L., Cunningham, S., & Zhu, D. (1997). Toas intelligence mining; analysis of natural
language processing and computational linguistics. Paper presented at the European
Symposium on Principles of Data Mining and Knowledge Discovery.
Webber, S. S., Detjen, J., MacLean, T. L., & Thomas, D. (2019). Team challenges: Is artificial intelligence
the solution? Business Horizons, 62(6), 741-750.
Zarri, G. P. (1990). A cognitive (artificial intelligence+ computational linguistics) approach to the analysis
of natural language messages. Poetics, 19(1-2), 167-189.