Welcome to Scribd!

Skip carousel

Name: V.J.Karthik REG. NO. 18BCE0413: Web Mining Lab DA-1

Uploaded by

Akash Yadav

0% found this document useful (0 votes)

13 views7 pages

Original Title

18BCE0413_VL2019205001959_AST01

Copyright

Available Formats

PDF, TXT or read online from Scribd

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Report this Document

Copyright:

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

0% found this document useful (0 votes)

13 views7 pages

Name: V.J.Karthik REG. NO. 18BCE0413: Web Mining Lab DA-1

Uploaded by

Akash Yadav

Copyright:

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

Jump to Page

You are on page 1of 7

Search inside document

WEB MINING LAB

DA-1

NAME: V.J.KARTHIK
REG. NO. 18BCE0413

Reading the text file

text=open("/home/matlab/Downloads/text.txt","r").read()
print(text)

Write a python program for following questions:

a) Take dynamic input from user as a paragraph (give
input >= 25 lines) and remove punctuations first (only).
Then print resulting paragraph (after removing
punctuations) with all lower case
CODE

import string
#TEXT IS TAKN FROM TXT FILE AS ABOVE
t=str.maketrans(dict.fromkeys(string.punctuation))
new_t=text.translate(t)
new_t=new_t.lower()
print(new_t)

b) Afterwards, remove stop words and print the whole

paragraph again.
CODE:
#18BCE0413-V.J.KARTHIK

import nltk
import string
nltk.download('stopwords')
nltk.download('punkt')

from nltk.corpus import stopwords

from nltk.tokenize import word_tokenize

t=str.maketrans(dict.fromkeys(string.punctuation))
new_t=text.translate(t)
new_t=new_t.lower()
example_sent = new_t
stop_words = set(stopwords.words('english'))
print("\n STOP WORDS \n",stop_words)

word_tokens = word_tokenize(example_sent)
filtered_sentence = [w for w in word_tokens if not w in stop_words]
print("\n FILTERED PARAGRAPH:\n"," ".join(filtered_sentence))
c) Collect those words that occur only once in the
paragraph and print them.
CODE:

#18BCE0413-V.J.KARTHIK

import nltk
import string
nltk.download('stopwords')
nltk.download('punkt')

from nltk.corpus import stopwords

from nltk.tokenize import word_tokenize

t=str.maketrans(dict.fromkeys(string.punctuation))
new_t=text.translate(t)
new_t=new_t.lower()
example_sent = new_t
stop_words = set(stopwords.words('english'))

word_tokens = word_tokenize(example_sent)
filtered_sentence = [w for w in word_tokens if not w in stop_words]
S=" ".join(filtered_sentence)
I=S.split(" ")
print("\n FILTERED PARAGRAPH \n",S)
print("\n LIST OF WORDS APPEARING ONLY ONCE IN FILTERED
PARAGRAPH-")
for j in I:
if S.count(j)==1 and j.isalpha():
print(j,end=",")
d) Collect those words that occur only twice in the
paragraph and print them.
CODE:
#18BCE0413-V.J.KARTHIK

import nltk
import string
nltk.download('stopwords')
nltk.download('punkt')

from nltk.corpus import stopwords

from nltk.tokenize import word_tokenize

t=str.maketrans(dict.fromkeys(string.punctuation))
new_t=text.translate(t)
new_t=new_t.lower()
example_sent = new_t
stop_words = set(stopwords.words('english'))

word_tokens = word_tokenize(example_sent)
filtered_sentence = [w for w in word_tokens if not w in stop_words]
S=" ".join(filtered_sentence)

I=S.split(" ")
print("\n FILTERED PARAGRAPH \n",S)
print("\n LIST OF WORDS APPEARING ONLY TWICE IN FILTERED
PARAGRAPH-")
for j in I:
if S.count(j)==2 and j.isalpha():
print(j,end=",")
e) Make a dictionary of these 4 types of words and apply
stemming to reduce inflected words to their word stem,
base or root form.
CODE:

#18BCE0413-V.J.KARTHIK

import nltk

from nltk.stem import PorterStemmer

#create an object of class PorterStemmer

porter=PorterStemmer()

#Making list out of the above words

D=['read','reads','reading','read']
for x in D:
print(x)
print(porter.stem(x))

NLP Tushar
Document21 pages
NLP Tushar
Yash Amin
No ratings yet
Natural Language Processing
Document17 pages
Natural Language Processing
coding ak
No ratings yet
Pradip
Document73 pages
Pradip
nileshlawande09
No ratings yet
Atharv Bhambare
Document73 pages
Atharv Bhambare
nileshlawande09
No ratings yet
Ir Practical
Document13 pages
Ir Practical
Ravishankar Gautam
No ratings yet
Python Test Series Eight
Document36 pages
Python Test Series Eight
mahesh palem
No ratings yet
Manya
Document25 pages
Manya
Swati Parihar
No ratings yet
Chinar Public School
Document18 pages
Chinar Public School
Swati Parihar
No ratings yet
ENROLLMENT NO.:-160280107033 PYTHON PROGRAMMING (2180711) : Be - Comp. - Sem-8 - Ldce Page
Document23 pages
ENROLLMENT NO.:-160280107033 PYTHON PROGRAMMING (2180711) : Be - Comp. - Sem-8 - Ldce Page
upendra
No ratings yet
Cs Practical
Document56 pages
Cs Practical
Materialistic Study
No ratings yet
New Python
Document4 pages
New Python
Sachin singh
No ratings yet
Alm Co-1 PDF
Document9 pages
Alm Co-1 PDF
Thota Deep
No ratings yet
Shreyas Practical Doc Final
Document44 pages
Shreyas Practical Doc Final
shreyassantoshkurup
No ratings yet
Python Manual
Document45 pages
Python Manual
akruti raj
No ratings yet
Activity 1: Example Output-Enter N1: 4 Enter N2: 3 4 + 3 12 4 - 3 1 4 3 12 4 / 3 1.3333 4//3 1 4%3 1
Document28 pages
Activity 1: Example Output-Enter N1: 4 Enter N2: 3 4 + 3 12 4 - 3 1 4 3 12 4 / 3 1.3333 4//3 1 4%3 1
angkur
No ratings yet
Name - Rushabh Shah Sem - 5 Roll No - 19BCP108 Compiler Design
Document6 pages
Name - Rushabh Shah Sem - 5 Roll No - 19BCP108 Compiler Design
Rushabh Shah
No ratings yet
ASTW RA03 PracticalManual
Document18 pages
ASTW RA03 PracticalManual
Diksha Nasa
No ratings yet
Reading Merged Dataset Reading Merged Dataset: 'Import Successfull'
Document7 pages
Reading Merged Dataset Reading Merged Dataset: 'Import Successfull'
Cookies Keeping
No ratings yet
Jihzn
Document3 pages
Jihzn
djihenebenbouha
No ratings yet
File Handling (FINAL)
Document37 pages
File Handling (FINAL)
Nitasha Garg
No ratings yet
IR - 754 All Practical
Document21 pages
IR - 754 All Practical
754Durgesh Vishwakarma
No ratings yet
Index
Document25 pages
Index
Swati Parihar
No ratings yet
Rica Python
Document7 pages
Rica Python
Samielyn Ingles Ramos
No ratings yet
I Am Sharing 'CS 12 - Practical - File - Sample-2023 - NK' With You
Document23 pages
I Am Sharing 'CS 12 - Practical - File - Sample-2023 - NK' With You
RITA SINGH
No ratings yet
OS PracticalSlipsQues
Document72 pages
OS PracticalSlipsQues
kolhesarthak7
No ratings yet
Python File
Document13 pages
Python File
kumarjass5251
No ratings yet
Unit Iv - Python Functions, Modules and Packages
Document102 pages
Unit Iv - Python Functions, Modules and Packages
unreleasedtrack43
No ratings yet
Programs With Outputs - Cycle 1
Document9 pages
Programs With Outputs - Cycle 1
Akhil Chowdary Maddineni
No ratings yet
Input and Output Statements PDF
Document11 pages
Input and Output Statements PDF
Rajendra Buchade
No ratings yet
Aped For Fake News
Document6 pages
Aped For Fake News
Bless Co
No ratings yet
BERT - Assignment - Jupyter Notebook
Document8 pages
BERT - Assignment - Jupyter Notebook
sriharsha bsm
0% (2)
Pyhton Practicals PDF
Document17 pages
Pyhton Practicals PDF
Megha Sundriyal
No ratings yet
BCSL-33 Lab-Manual-Solution PDF
Document27 pages
BCSL-33 Lab-Manual-Solution PDF
sparamveer95_4991596
0% (2)
Natural Language Processing
Document5 pages
Natural Language Processing
shivaybhargava33
No ratings yet
Python Code Examples
Document30 pages
Python Code Examples
Asaf Katz
No ratings yet
Program Python by Shailesh
Document48 pages
Program Python by Shailesh
officialblackfitbaba
No ratings yet
Questions
Document16 pages
Questions
Neeraj Kumar
No ratings yet
IGNOU MCA 2nd Semster Data Structure Lab Record Solved MCSL 025
Document77 pages
IGNOU MCA 2nd Semster Data Structure Lab Record Solved MCSL 025
fajer007
86% (7)
Computer Science Lab Mannual
Document23 pages
Computer Science Lab Mannual
shubhra.kuldeep2003
No ratings yet
Week 08 Tutorial Sample Answers
Document4 pages
Week 08 Tutorial Sample Answers
MP
No ratings yet
CSD101: Introduction To Computing and Programming: Marks Obtained 40
Document4 pages
CSD101: Introduction To Computing and Programming: Marks Obtained 40
aaaa
No ratings yet
Enter Your Name Nishigandha Enter Your Age 22 Nishigandha You Will Turn 100 Years Old in 2098
Document14 pages
Enter Your Name Nishigandha Enter Your Age 22 Nishigandha You Will Turn 100 Years Old in 2098
Nishigandha Patil
100% (1)
Python Practical
Document18 pages
Python Practical
Divya Rajput
No ratings yet
NLP - Practical List
Document14 pages
NLP - Practical List
Yash Amin
No ratings yet
Python Final Assignment Imp
Document12 pages
Python Final Assignment Imp
anjali.pandey0501
No ratings yet
Top 25 Unix Interview Questions With Answers
Document8 pages
Top 25 Unix Interview Questions With Answers
sanjay.gupta8194
No ratings yet
Python Programming Lab (21AML38)
Document31 pages
Python Programming Lab (21AML38)
Simran Tabassum
No ratings yet
Rohan Panda 1841012123 CSE D IR LAB ASSIGNMENT
Document32 pages
Rohan Panda 1841012123 CSE D IR LAB ASSIGNMENT
Sandeep Sourav
No ratings yet
Python Programs
Document36 pages
Python Programs
Sohail Ansari
No ratings yet
21221-Ghogare Priyanka Ashok
Document21 pages
21221-Ghogare Priyanka Ashok
Priyanka ghogare
No ratings yet
Csi 2120 Midterm Cheat Sheet
Document3 pages
Csi 2120 Midterm Cheat Sheet
Jason Matta
No ratings yet
Lab - 7
Document18 pages
Lab - 7
Delvin company
No ratings yet
Phytoncodesxx 2
Document12 pages
Phytoncodesxx 2
Jennifer Verona
No ratings yet
Practical 12th CS
Document18 pages
Practical 12th CS
riya Riya
No ratings yet
Class 12 Practicals 20 Prgs 5
Document59 pages
Class 12 Practicals 20 Prgs 5
Aakarsh Tiwari
No ratings yet
PYTHON LAB MANUAL Name: Jitendra Kumar Roll No.: 05035304421 Section: 2
Document20 pages
PYTHON LAB MANUAL Name: Jitendra Kumar Roll No.: 05035304421 Section: 2
Daily Vlogger
No ratings yet
MATLAB-Fall 11-12 Introduction To MATLAB Part I
Document33 pages
MATLAB-Fall 11-12 Introduction To MATLAB Part I
Husam Qutteina
No ratings yet
The Essential R Reference
From Everand
The Essential R Reference
Mark Gardener
No ratings yet
Profound Python Data Science
From Everand
Profound Python Data Science
Onder Teker
No ratings yet
Gd Script
From Everand
Gd Script
Marijo Trkulja
No ratings yet
Image Enhancement in Spatial Domain: Spatial Filtering Anisha M. Lal
Document12 pages
Image Enhancement in Spatial Domain: Spatial Filtering Anisha M. Lal
Akash Yadav
No ratings yet
Image Processing Cse4019 G2
Document3 pages
Image Processing Cse4019 G2
Akash Yadav
No ratings yet
Image Processing Cse4019 G1
Document3 pages
Image Processing Cse4019 G1
Akash Yadav
No ratings yet
Smart Garbage Collection System: Submitted by
Document32 pages
Smart Garbage Collection System: Submitted by
Akash Yadav
No ratings yet
Image Processing Cse4019 E1
Document3 pages
Image Processing Cse4019 E1
Akash Yadav
No ratings yet
18BCB0107 Da2
Document5 pages
18BCB0107 Da2
Akash Yadav
No ratings yet
Lean Startup Management MGT1022: Business Plan Writing For Physicians
Document6 pages
Lean Startup Management MGT1022: Business Plan Writing For Physicians
Akash Yadav
No ratings yet
Image Processing: Digital Assignment - 1
Document6 pages
Image Processing: Digital Assignment - 1
Akash Yadav
No ratings yet
18bcb0107 LSM Da1
Document12 pages
18bcb0107 LSM Da1
Akash Yadav
No ratings yet
Course Projects PDF
Document1 page
Course Projects PDF
sanjog kshetri
No ratings yet
LDSD Godse
Document24 pages
LDSD Godse
Kiranmai Srinivasulu
No ratings yet
CSEC SocStud CoverSheetForESBA Fillable Dec2019
Document1 page
CSEC SocStud CoverSheetForESBA Fillable Dec2019
chrissaine
No ratings yet
Tree Pruning
Document15 pages
Tree Pruning
rita44
No ratings yet
Bushcraft Knife Anatomy
Document2 pages
Bushcraft Knife Anatomy
Cristian Botozis
No ratings yet
Functions: Var S Add
Document13 pages
Functions: Var S Add
Revati Menghani
No ratings yet
Kefauver Harris Amendments
Document7 pages
Kefauver Harris Amendments
Anil kumar
No ratings yet
TRICARE Behavioral Health Care Services
Document4 pages
TRICARE Behavioral Health Care Services
Matthew X. Hauser
No ratings yet
Toi Su20 Sat Epep Proposal
Document7 pages
Toi Su20 Sat Epep Proposal
Talha Siddiqui
No ratings yet
Building For The Environment 1
Document3 pages
Building For The Environment 1
api-133774200
No ratings yet
Countable 3
Document2 pages
Countable 3
Pio Sulca Tapahuasco
100% (1)
Mitsubishi Fan
Document2 pages
Mitsubishi Fan
Kyaw Zaw
No ratings yet
N50-200H-CC Operation and Maintenance Manual 961220 Bytes 01
Document94 pages
N50-200H-CC Operation and Maintenance Manual 961220 Bytes 01
ANDRES
No ratings yet
Buildingawinningsalesforce WP Ddi
Document14 pages
Buildingawinningsalesforce WP Ddi
Mawaheb Contracting
No ratings yet
Optimal Dispatch of Generation: Prepared To Dr. Emaad Sedeek
Document7 pages
Optimal Dispatch of Generation: Prepared To Dr. Emaad Sedeek
AhmedRaafat
No ratings yet
LG Sigma+Escalator
Document4 pages
LG Sigma+Escalator
강민호
No ratings yet
Ch06 Allocating Resources To The Project
Document55 pages
Ch06 Allocating Resources To The Project
Josh Chama
No ratings yet
Obara Bogbe
Document36 pages
Obara Bogbe
Ojubona Aremu Omotiayebi Ifamoriyo
0% (1)
Stewart, Mary - The Little Broomstick
Document159 pages
Stewart, Mary - The Little Broomstick
Yunon
100% (1)
Fire Technical Examples DIFT No 30
Document27 pages
Fire Technical Examples DIFT No 30
Daniela Haneková
No ratings yet
Buss40004 - Balance of Power
Document3 pages
Buss40004 - Balance of Power
Vishwa Nirmala
No ratings yet
LKG Math Question Paper: 1. Count and Write The Number in The Box
Document6 pages
LKG Math Question Paper: 1. Count and Write The Number in The Box
Kunal Naidu
60% (5)
IPM Guidelines
Document6 pages
IPM Guidelines
Hittesh Solanki
No ratings yet
CDR Writing: Components of The CDR
Document5 pages
CDR Writing: Components of The CDR
indikuma
100% (3)
F5 Chem Rusting Experiment
Document9 pages
F5 Chem Rusting Experiment
Prashanthini Janardanan
No ratings yet
Amp DC, Oa
Document4 pages
Amp DC, Oa
Fantastic Kia
No ratings yet
Zygosaccharomyces James2011
Document11 pages
Zygosaccharomyces James2011
edson escamilla
No ratings yet
Avid Final Project
Document2 pages
Avid Final Project
api-286463817
No ratings yet
Handsout
Document3 pages
Handsout
loraine mandap
No ratings yet
Morse Potential Curve
Document9 pages
Morse Potential Curve
jagabandhu_patra
No ratings yet