
M.Sc. IT (AI&ML) - III
P83A1NLP: Natural Language Processing
Practical List
1. Write a python program to explain various methods of the OS Module.
Code:-
import os

print(os.name)                         # name of the OS-dependent module (e.g. 'nt')
os.mkdir("D:\\New_folder")             # create a new directory
print(os.getcwd())                     # current working directory
os.chdir("D:\\")                       # change the working directory
os.rmdir("D:\\New_folder")             # remove the (empty) directory

# create and write a text file, then rename it with os.rename
fw = open("D:\\02file.txt", 'w')
fw.write("This is awesome")
fw.close()
os.rename("D:\\02file.txt", "D:\\Python1.txt")
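A few more commonly used os methods can be sketched as below; the directory name is illustrative, and `tempfile` is used so the sketch works on any drive.

```python
import os
import tempfile

# work in a temporary directory so the sketch is portable
base = tempfile.mkdtemp()
sub = os.path.join(base, "demo")       # os.path.join builds portable paths
os.mkdir(sub)

print(os.path.exists(sub))             # True: the directory was created
print(os.listdir(base))                # ['demo'] — directory contents
print(os.environ.get("PATH") is not None)  # read an environment variable

os.rmdir(sub)                          # clean up
os.rmdir(base)
```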

Output:-

Yash Amin 21084341001 1


2. Write a python program to show various ways to read and write as well
as append the data in a text file.
Code:-
with open("File12.txt", "w") as fw:
    fw.write("Hello\n")
    fw.write("World\n")
    print("Written in file")

with open("File12.txt", "a") as fa:
    fa.write("Nice\n")
    fa.write("to\n")
    fa.write("Meet\n")
    fa.write("You\n")
    print("Appended in file")

with open("File12.txt", "r") as fr:
    a = fr.read()
    print(a)
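The question asks for various ways to read; besides `read()`, a file can be read line by line. A minimal sketch, reusing the same illustrative file name:

```python
# write a small file, then read it back in three different ways
with open("File12.txt", "w") as fw:
    fw.write("Hello\nWorld\n")

with open("File12.txt") as fr:
    print(fr.readline().strip())   # readline() returns one line at a time

with open("File12.txt") as fr:
    print(fr.readlines())          # readlines() returns a list of lines

with open("File12.txt") as fr:
    for line in fr:                # iterating reads lazily, line by line
        print(line.strip())
```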

Output:-



3. Write a python program to show various ways to read and write as well
as append the data in a word file.
Code:-
import docx

doc = docx.Document()
doc.add_paragraph("This is first paragraph of a MS Word file.")
doc.add_paragraph("This is the second paragraph of a MS Word file.")
doc.add_heading("This is level 1 heading", 0)
doc.add_heading("This is level 2 heading", 1)
doc.save("D:/file1.docx")

all_paras = doc.paragraphs
for para in all_paras:
    print(para.text)
    print("-------")

Output:-



4. Write a python program to demonstrate the words and sentences
tokenizing using NLTK. Also show the concept of Bigrams, Trigrams &
Ngrams.
Code:-
import nltk
import pandas as pd
from nltk.tokenize import word_tokenize, sent_tokenize
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
from nltk import bigrams, trigrams, ngrams

# 'df' is assumed to be a DataFrame with a 'text' column, e.g.:
df = pd.DataFrame({'text': ["Natural language processing makes computers understand text."]})

df['sentences'] = df['text'].apply(sent_tokenize)
df['tokenized'] = df['text'].apply(word_tokenize)
df['lower'] = df['tokenized'].apply(lambda x: [word.lower() for word in x])
stop_words = set(stopwords.words('english'))
df['stopwords_removed'] = df['lower'].apply(
    lambda x: [word for word in x if word not in stop_words])
wnl = WordNetLemmatizer()
df['lemmatized'] = df['stopwords_removed'].apply(
    lambda x: [wnl.lemmatize(word) for word in x])
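The question also asks for bigrams, trigrams, and n-grams. A minimal sketch using NLTK's helpers on an illustrative sentence, tokenized with `.split()` so no extra corpora are needed:

```python
from nltk import bigrams, trigrams, ngrams

tokens = "natural language processing is fun".split()

print(list(bigrams(tokens)))     # pairs of adjacent tokens
print(list(trigrams(tokens)))    # triples of adjacent tokens
print(list(ngrams(tokens, 4)))   # general n-grams, here n = 4
```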

Output:-



5. Write a python program to demonstrate the concept of Frequency
Distribution in text or document.
Code:-
from nltk.corpus import stopwords, webtext
from nltk import bigrams
from nltk.probability import FreqDist

text_data = webtext.words('D:\\abc.txt')
stop_words = set(stopwords.words('english'))
f_w = []
for word in text_data:
    if word not in stop_words:
        if len(word) > 3:
            f_w.append(word)

bigram = bigrams(f_w)
freq_dist = FreqDist(bigram)
print(freq_dist.most_common(10))   # ten most frequent bigrams
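FreqDist itself can be checked on a small illustrative word list, with no corpus file required:

```python
from nltk.probability import FreqDist

words = ["data", "text", "data", "word", "data", "text"]
fd = FreqDist(words)

print(fd["data"])          # count of a single word
print(fd.most_common(2))   # the two most frequent samples with counts
print(fd.N())              # total number of samples seen
```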

Output:-



6. Write a python program to implement the removal of stop words from a
document according to the English dictionary using NLTK.
Code:-
from nltk.corpus import stopwords, webtext

text_data = webtext.words('D:\\abc.txt')
stop_words = set(stopwords.words('english'))
f_w = []
for word in text_data:
    if word not in stop_words:
        f_w.append(word)
print(f_w)

Output:-



7. Write a python program to implement Part-of-Speech tagging in NLTK.
Code:-
import nltk
import pandas as pd
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords, wordnet
from nltk.stem import WordNetLemmatizer

# 'df' is assumed to be a DataFrame with a 'text' column, e.g.:
df = pd.DataFrame({'text': ["The striped bats are hanging on their feet."]})

df['tokenized'] = df['text'].apply(word_tokenize)
df['lower'] = df['tokenized'].apply(lambda x: [word.lower() for word in x])
stop_words = set(stopwords.words('english'))
df['stopwords_removed'] = df['lower'].apply(
    lambda x: [word for word in x if word not in stop_words])

nltk.download('averaged_perceptron_tagger')
df['pos_tags'] = df['stopwords_removed'].apply(nltk.tag.pos_tag)

# map Treebank tags to WordNet POS constants (default: noun)
def get_wordnet_pos(tag):
    if tag.startswith('J'):
        return wordnet.ADJ
    elif tag.startswith('V'):
        return wordnet.VERB
    elif tag.startswith('N'):
        return wordnet.NOUN
    elif tag.startswith('R'):
        return wordnet.ADV
    else:
        return wordnet.NOUN

df['wordnet_pos'] = df['pos_tags'].apply(
    lambda x: [(word, get_wordnet_pos(pos_tag)) for (word, pos_tag) in x])

# lemmatize using the mapped POS tags
wnl = WordNetLemmatizer()
df['lemmatized'] = df['wordnet_pos'].apply(
    lambda x: [wnl.lemmatize(word, tag) for (word, tag) in x])
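The tag-mapping logic can be checked on its own. WordNet represents parts of speech as single characters (`wordnet.ADJ == 'a'`, `VERB == 'v'`, `NOUN == 'n'`, `ADV == 'r'`), so this sketch uses those literals directly and needs no corpus download:

```python
# WordNet POS constants are single characters: ADJ='a', VERB='v', NOUN='n', ADV='r'
def get_wordnet_pos(tag):
    if tag.startswith('J'):
        return 'a'          # adjective (Treebank JJ, JJR, JJS)
    elif tag.startswith('V'):
        return 'v'          # verb (VB, VBD, VBG, ...)
    elif tag.startswith('N'):
        return 'n'          # noun (NN, NNS, NNP, ...)
    elif tag.startswith('R'):
        return 'r'          # adverb (RB, RBR, RBS)
    return 'n'              # default to noun

print(get_wordnet_pos('VBG'))  # v
print(get_wordnet_pos('JJ'))   # a
print(get_wordnet_pos('XYZ'))  # n (fallback)
```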

Output:-



8. Write a python program to implement stemming and lemmatization with
NLTK.

Code:-

from nltk.stem import PorterStemmer, WordNetLemmatizer

ps = PorterStemmer()
df['stemming'] = df['stopwords_removed'].apply(
    lambda x: [ps.stem(word) for word in x])

wnl = WordNetLemmatizer()
df['lemmatized'] = df['stopwords_removed'].apply(
    lambda x: [wnl.lemmatize(word) for word in x])

df.head()
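The difference between stemming and lemmatization shows up clearly on a few illustrative words; PorterStemmer needs no corpus download:

```python
from nltk.stem import PorterStemmer

ps = PorterStemmer()
for w in ["running", "studies", "flies", "happily"]:
    print(w, "->", ps.stem(w))
# note: stems are not always dictionary words (e.g. 'studies' -> 'studi'),
# whereas a lemmatizer returns valid words ('studies' -> 'study')
```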
Output:-

9. Write a python program to implement the Named Entity Recognition
(NER) using NLTK.
Code:-
import spacy
from spacy import displacy

# note: this solution uses spaCy's pre-trained pipeline for NER
NER = spacy.load("en_core_web_sm")

raw_text = ("The Indian Space Research Organisation (ISRO) is the national space "
            "agency of India, headquartered in Bengaluru. It operates under the "
            "Department of Space which is directly overseen by the Prime Minister "
            "of India while the Chairman of ISRO acts as executive of DOS as well.")

text1 = NER(raw_text)
for word in text1.ents:
    print(word.text, word.label_)

Output:-

10. Write a complete NLP task for cleaning and pre-processing text using
NLTK.
Code:-
import nltk
import pandas as pd
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords, wordnet
from nltk.stem import WordNetLemmatizer

# 'df' is assumed to be a DataFrame with a 'text' column, e.g.:
df = pd.DataFrame({'text': ["The striped bats are hanging on their feet."]})

df['tokenized'] = df['text'].apply(word_tokenize)
df['lower'] = df['tokenized'].apply(lambda x: [word.lower() for word in x])
stop_words = set(stopwords.words('english'))
df['stopwords_removed'] = df['lower'].apply(
    lambda x: [word for word in x if word not in stop_words])

nltk.download('averaged_perceptron_tagger')
df['pos_tags'] = df['stopwords_removed'].apply(nltk.tag.pos_tag)

# map Treebank tags to WordNet POS constants (default: noun)
def get_wordnet_pos(tag):
    if tag.startswith('J'):
        return wordnet.ADJ
    elif tag.startswith('V'):
        return wordnet.VERB
    elif tag.startswith('N'):
        return wordnet.NOUN
    elif tag.startswith('R'):
        return wordnet.ADV
    else:
        return wordnet.NOUN

df['wordnet_pos'] = df['pos_tags'].apply(
    lambda x: [(word, get_wordnet_pos(pos_tag)) for (word, pos_tag) in x])

# lemmatize using the mapped POS tags
wnl = WordNetLemmatizer()
df['lemmatized'] = df['wordnet_pos'].apply(
    lambda x: [wnl.lemmatize(word, tag) for (word, tag) in x])



df.head()
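The whole pre-processing pipeline above can also be wrapped in one reusable function. A pure-Python sketch with a tiny hand-written stop-word list (NLTK's `stopwords.words('english')` is far larger), so no corpus download is needed:

```python
import re

# illustrative mini stop-word list; replace with NLTK's for real use
STOP_WORDS = {"the", "a", "an", "is", "in", "of", "and", "to"}

def clean_text(text):
    """Lowercase, strip punctuation, tokenize, and drop stop words."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", "", text)   # remove punctuation and digits
    tokens = text.split()
    return [t for t in tokens if t not in STOP_WORDS]

print(clean_text("The quick, brown Fox is in the Garden!"))
# ['quick', 'brown', 'fox', 'garden']
```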

Output:-

