In [ ]:

import nltk
nltk.download('punkt')
nltk.download('stopwords')
nltk.download('wordnet')
nltk.download('averaged_perceptron_tagger')

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data] Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data] Package stopwords is already up-to-date!
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data] Package wordnet is already up-to-date!
[nltk_data] Downloading package averaged_perceptron_tagger to /root/nltk_data...
[nltk_data] Package averaged_perceptron_tagger is already up-to-date!

Out [4]: True

In [ ]:
text= "Tokenization is the first step in text analytics. The process of breaking down a text paragraph into

from nltk import sent_tokenize


tokenized_text=sent_tokenize(text)
tokenized_text

Out [5]: ['Tokenization is the first step in text analytics.',
'The process of breaking down a text paragraph into smaller chunks such as words or sentences is called Tokenization.']

In [ ]:
from nltk import word_tokenize
tokenized_word=word_tokenize(text)
tokenized_word

Out [6]: ['Tokenization',
'is',
'the',
'first',
'step',
'in',
'text',
'analytics',
'.',
'The',
'process',
'of',
'breaking',
'down',
'a',
'text',
'paragraph',
'into',
'smaller',
'chunks',
'such',
'as',
'words',
'or',
'sentences',
'is',
'called',
'Tokenization',
'.']
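
Beyond listing the tokens, nltk's FreqDist can count how often each one occurs. This is a minimal optional sketch built on the tokenized_word list above; FreqDist lives in nltk.probability.

In [ ]:
from nltk.probability import FreqDist
# Count occurrences of each word token
fdist = FreqDist(tokenized_word)
# Tokens such as 'Tokenization', 'is', 'text' and '.' each appear twice in this text
print(fdist.most_common(4))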

In [ ]:
from nltk.corpus import stopwords
stop_words=set(stopwords.words("english"))
print(stop_words)

{'they', 'was', 'did', 'were', "that'll", 'can', "wouldn't", 'yourself', 'off', 'than', 'shouldn', 'out', 'a', 'doing', 'mustn', 'having',

In [ ]:
import re
text= "How to remove stop words with NLTK library in Python?"
text= re.sub('[^a-zA-Z]', ' ',text)
text

Out [11]: 'How to remove stop words with NLTK library in Python '

In [ ]:
tokens=word_tokenize(text.lower())
tokens

Out [14]: ['how',
'to',
'remove',
'stop',
'words',
'with',
'nltk',
'library',
'in',
'python']

In [ ]:
filtered_text=[]
for w in tokens:
    if w not in stop_words:
        filtered_text.append(w)
print("Tokenized Sentence:",tokens)
print("Filtered Sentence:",filtered_text)

Tokenized Sentence: ['how', 'to', 'remove', 'stop', 'words', 'with', 'nltk', 'library', 'in', 'python']
Filtered Sentence: ['remove', 'stop', 'words', 'nltk', 'library', 'python']
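
The same stop-word filtering can be written more compactly as a list comprehension; this sketch is equivalent to the loop above.

In [ ]:
# One-line equivalent of the filtering loop
filtered_text = [w for w in tokens if w not in stop_words]
print("Filtered Sentence:", filtered_text)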

In [ ]:
from nltk.stem import PorterStemmer
e_words=["studies", "studying", "cries", "cry"]
# e_words=["watch", "watching", "watches", "watched"]
ps=PorterStemmer()
for w in e_words:
    rootword=ps.stem(w)
    print(rootword)

studi
studi
cri
cri
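
The stemmer can also be applied to the filtered tokens from the stop-word example; a minimal sketch (Porter stems are truncated word forms, e.g. 'library' typically becomes 'librari').

In [ ]:
# Stem the tokens that survived stop-word removal
stemmed = [ps.stem(w) for w in filtered_text]
print(stemmed)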

In [ ]:
from nltk.stem import WordNetLemmatizer
wordnet_lemmatizer = WordNetLemmatizer()
text = "studies studying cries cry"
tokenization = nltk.word_tokenize(text)
for w in tokenization:
    print("Lemma for {} is {}".format(w, wordnet_lemmatizer.lemmatize(w)))

Lemma for studies is study
Lemma for studying is studying
Lemma for cries is cry
Lemma for cry is cry
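
By default lemmatize() treats each word as a noun, which is why 'studying' comes back unchanged. Passing a part-of-speech argument changes the lookup; a minimal sketch using pos='v' to treat the words as verbs:

In [ ]:
# Lemmatize the same words as verbs instead of nouns
for w in tokenization:
    print("Lemma for {} is {}".format(w, wordnet_lemmatizer.lemmatize(w, pos='v')))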

In [ ]:
import nltk
from nltk.tokenize import word_tokenize
data="The pink sweater fit her perfectly"
words=word_tokenize(data)
for word in words:
    print(nltk.pos_tag([word]))

[('The', 'DT')]
[('pink', 'NN')]
[('sweater', 'NN')]
[('fit', 'NN')]
[('her', 'PRP$')]
[('perfectly', 'RB')]
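
Tagging each word in isolation gives the tagger no context, which is why 'fit' is tagged as a noun above. Passing the whole token list to nltk.pos_tag lets it use the surrounding words; a minimal sketch (the tags it assigns may differ from the word-by-word run):

In [ ]:
# Tag the full sentence at once so the tagger can use context
print(nltk.pos_tag(words))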
