
Seminar report – 5th Semester

WEB SCRAPING

A Seminar Report

Submitted by

RAVI KUMAR
[20106107028]

in partial fulfilment for the award of the degree


of

Bachelor of Technology
IN
BRANCH OF STUDY
At

Department of Information Technology


Muzaffarpur Institute of Technology, Muzaffarpur
June 2023

Dept. of IT, MIT Muzaffarpur



ACKNOWLEDGEMENT

I particularly want to thank Sudhir Kumar for his continuing support and encouragement
throughout the completion of this seminar topic, and for having faith in us.

Ravi Kumar
Roll No.: 20IT31
University Reg. No.: 20106107028
Session: 2020-24
Sem.: 5th




TABLE OF CONTENTS

1. INTRODUCTION

2. USES OF WEB SCRAPING

3. TECHNIQUES

4. PROCEDURE

5. SUMMARY

6. REFERENCES




INTRODUCTION

Web scraping is a technique for fetching data from websites. Many websites do not let the user save
data for personal use. One option is to copy and paste the data manually, which is both tedious and
time-consuming. Web scraping automates the process of extracting data from websites. This is done with
web scraping software known as web scrapers, which automatically load and extract data from websites
based on user requirements. A scraper can be custom built to work for one site or configured to work
with any website.

USES OF WEB SCRAPING


Web scraping has many uses at both the professional and the personal level. Needs differ between
these levels, but some popular uses of web scraping are:

• Price Monitoring
• Market Research
• News Monitoring
• Sentiment Analysis
• Email Marketing




TECHNIQUES

Web scraping is the process of automatically mining data or collecting information from
the World Wide Web. Some websites use methods to prevent web scraping, such as detecting
bots and disallowing them from crawling (viewing) their pages. In response, some web
scraping systems rely on techniques such as DOM (Document Object Model) parsing, computer
vision and natural language processing to simulate human browsing, so that web page
content can be gathered for offline parsing. Current web scraping solutions range from
the ad hoc, requiring human effort, to fully automated systems that can convert entire
websites into structured information, with limitations. Common techniques include:

• Human copy-and-paste

• Text pattern matching

• HTTP programming

• HTML parsing

• DOM parsing
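To make two of the listed techniques concrete, the sketch below contrasts text pattern matching with HTML parsing on the same input. It uses only Python's standard library. The sample markup, the class names "product", "name" and "price", and the helper names are hypothetical and chosen purely for illustration; a real page would need its own pattern and tag checks.

```python
import re
from html.parser import HTMLParser

# Hypothetical page fragment, used only for illustration.
SAMPLE_HTML = """
<ul>
  <li class="product"><span class="name">Pen</span> <span class="price">Rs. 10</span></li>
  <li class="product"><span class="name">Notebook</span> <span class="price">Rs. 45</span></li>
</ul>
"""


def prices_by_pattern(html):
    """Text pattern matching: pull prices out with a regular expression."""
    return [int(amount) for amount in re.findall(r"Rs\.\s*(\d+)", html)]


class NameCollector(HTMLParser):
    """HTML parsing: collect the text inside every <span class="name">."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._in_name = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (attribute, value) pairs
        if tag == "span" and ("class", "name") in attrs:
            self._in_name = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_name = False

    def handle_data(self, data):
        if self._in_name:
            self.names.append(data.strip())


def names_by_parsing(html):
    """Run the parser over the HTML and return the collected names."""
    parser = NameCollector()
    parser.feed(html)
    return parser.names


if __name__ == "__main__":
    print(prices_by_pattern(SAMPLE_HTML))
    print(names_by_parsing(SAMPLE_HTML))
```

Pattern matching is quick to write but depends on the exact text layout, while parsing follows the tag structure and tends to survive small changes in spacing or formatting.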

PROCEDURE

The libraries we can use for this project are:

• Requests Library

• Beautiful Soup Library

• Pandas
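One possible way these three libraries fit together is sketched below: Requests downloads the page, Beautiful Soup extracts the fields, and pandas holds the result as a table. This is a minimal sketch, not the report's actual procedure. The URL, the CSS selectors ("li.product", ".name", ".price") and the function names are assumptions made for illustration and would need to be adapted to the target site.

```python
import requests
from bs4 import BeautifulSoup
import pandas as pd


def fetch_html(url):
    """Download a page with Requests and return its HTML text."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # raise an error on 4xx/5xx responses
    return response.text


def parse_products(html):
    """Use Beautiful Soup to turn product markup into a list of dicts."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for item in soup.select("li.product"):  # hypothetical selector
        name = item.select_one(".name")
        price = item.select_one(".price")
        if name is None or price is None:
            continue  # skip entries that lack either field
        rows.append({
            "name": name.get_text(strip=True),
            "price": price.get_text(strip=True),
        })
    return rows


def to_table(rows):
    """Load the extracted rows into a pandas DataFrame."""
    return pd.DataFrame(rows, columns=["name", "price"])


if __name__ == "__main__":
    # Placeholder URL (an assumption); replace with a page you may scrape.
    url = "https://example.com/products"
    table = to_table(parse_products(fetch_html(url)))
    table.to_csv("products.csv", index=False)
    print(table)
```

Keeping the download, parsing and tabulation steps in separate functions allows the parsing logic to be tried on a saved HTML string without any network access.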




SUMMARY

Web scraping is an interesting and extremely popular technique that is quite handy to learn.
There are several other libraries apart from Beautiful Soup. Scrapy is a very popular
open-source web crawling framework that is also written in Python. It is well suited to web
scraping and to extracting data using APIs. Beautiful Soup is used to create a parse tree and
extract data from the HTML of a webpage.

REFERENCES

https://www.google.com
https://www.flipkart.com/

