Project
Group Members:
1. Imran Ahsanullah
2. Sheeraz Ali
SCOPE
As the volume of data on the internet keeps growing, web scraping is becoming more important.
Many companies now offer customized web scraping tools to their clients. These tools gather data
from across the internet and arrange it into a useful, easily understandable form, which saves
the manpower otherwise needed to visit each website and collect the data manually. A web scraper
is designed and coded for one individual website, whereas a crawler scrapes broadly across many
sites. A website with a complicated structure requires more code to scrape than a simple one.
Web scraping is likely to become more and more essential to businesses over time.
Ten different Mango websites will be scraped, and all the required information will be extracted
from them: name, description, SKU, ID, images, features, and options. The extracted content will
be saved to files in CSV, XML, JSON, and Excel formats.
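
The field extraction described above can be sketched with Python's standard html.parser module. This is a minimal illustration, not the project's actual scraper: the class names in FIELD_CLASSES, the SAMPLE markup, and the extract_product helper are all hypothetical, since each real website will use its own markup and need its own mapping.

```python
from html.parser import HTMLParser

# Hypothetical mapping from CSS class names to the field each one holds.
# A real site will use different class names; adjust per website.
FIELD_CLASSES = {
    "product-name": "name",
    "product-sku": "sku",
    "product-description": "description",
}


class ProductParser(HTMLParser):
    """Collect the text of elements listed in FIELD_CLASSES, plus image URLs."""

    def __init__(self):
        super().__init__()
        self.record = {"images": []}
        self._field = None  # field currently being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and attrs.get("src"):
            self.record["images"].append(attrs["src"])
            return
        for cls in (attrs.get("class") or "").split():
            if cls in FIELD_CLASSES:
                self._field = FIELD_CLASSES[cls]
                self.record[self._field] = ""

    def handle_data(self, data):
        if self._field is not None:
            self.record[self._field] += data

    def handle_endtag(self, tag):
        # Simplifying assumption: field elements contain no nested tags.
        self._field = None


def extract_product(html):
    """Return a dict of the fields found in one product page."""
    parser = ProductParser()
    parser.feed(html)
    parser.close()
    return {k: v.strip() if isinstance(v, str) else v
            for k, v in parser.record.items()}


# Made-up markup standing in for a downloaded product page.
SAMPLE = """
<div class="product">
  <h1 class="product-name">Example item</h1>
  <span class="product-sku">EX-001</span>
  <p class="product-description">A short description.</p>
  <img src="/img/example.jpg">
</div>
"""

product = extract_product(SAMPLE)
```

For pages with nested or irregular markup, a dedicated parsing library would probably be a better fit than this hand-written parser.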
The project will proceed through the following steps:
1. First, select the scraping category: either e-commerce website information, or information
   about businesses or schools.
2. Select the websites from which the required information will be extracted.
3. Analyze each website's interface to determine how it can be scraped.
4. Write the code that extracts the information from each website.
   a. Each website will need its own extraction logic.
5. Select the data storage type.
   a. Create a database and the required tables.
      i. Add the required columns and assign each a data type appropriate to the data it holds.
      ii. Insert the data into the tables.
   b. Create a JSON file.
   c. Create a CSV file.
6. Create reports that display the stored, scraped data in an informative way.
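
Steps 5 and 6 can be sketched as follows, using only the Python standard library. This is an illustration under assumptions rather than the project's final design: the records list, the products table, its columns, and the helper function names are all hypothetical, and an in-memory SQLite database stands in for whichever database the project ends up using.

```python
import csv
import io
import json
import sqlite3

# Hypothetical records, shaped like the output of a scraper.
records = [
    {"id": 1, "sku": "A-001", "name": "Example item one", "source": "site-a.example"},
    {"id": 2, "sku": "B-002", "name": "Example item two", "source": "site-b.example"},
    {"id": 3, "sku": "A-003", "name": "Example item three", "source": "site-a.example"},
]
FIELDS = ["id", "sku", "name", "source"]


def save_to_db(conn, rows):
    """Step 5a: create the table with typed columns, then insert the rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS products ("
        "id INTEGER PRIMARY KEY, sku TEXT NOT NULL, name TEXT NOT NULL, source TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO products (id, sku, name, source) "
        "VALUES (:id, :sku, :name, :source)",
        rows,
    )
    conn.commit()


def to_json(rows):
    """Step 5b: serialize the rows as JSON text."""
    return json.dumps(rows, indent=2)


def to_csv(rows):
    """Step 5c: serialize the rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def report(conn):
    """Step 6: a simple report, the number of products scraped per website."""
    return conn.execute(
        "SELECT source, COUNT(*) FROM products GROUP BY source ORDER BY source"
    ).fetchall()


conn = sqlite3.connect(":memory:")  # swap in a file path to persist the data
save_to_db(conn, records)
summary = report(conn)
json_text = to_json(records)
csv_text = to_csv(records)
```

To write actual files, the JSON and CSV text can be saved with open(path, "w", newline="") and the file's write method. XML and Excel output are left out of this sketch.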