
Industrial Training

Presentation

Prepared By:
Keval Rajeshbhai Katrodiya
(201903103510130)
B. Tech. (8th Semester) (CE)
CGPIT.

Guided By:
Mr. Vibha Patel
Assistant Professor (CE)
CGPIT.
Outline
 Company Profile
 Consent from Internal and External Guides
 Timeline Chart
 Training Work
 Project Introduction
 Project Objective
 Scope of Project
 Technologies/Platform
 Modules of Project
 Database Schema
 Diagrams
 Implementation
 Testing
 Future Work
 Conclusion
 References

Project Title
Company Profile
 Accrue is a financial technology company that provides data analytics and
portfolio management solutions to investment firms, wealth managers, and
individual investors. The company was founded in 2014 and is headquartered
in New York City.
 Accrue's platform uses advanced algorithms and machine learning to analyze
market data and provide insights on investment opportunities. The platform
also allows users to build customized portfolios based on their investment
objectives and risk tolerance.
 Accrue provides an API for extensive historical data and events on stocks,
futures, and forex.
 Website: https://accrue.com/

Consent from Internal and External Guides
 Consent from external guide

 Consent from internal guide

Timeline Chart
[Figure: timeline chart covering weeks 7 to 15, with milestones dated Dec 1, Dec 8, Jan 5, Feb 2, Feb 9, Feb 16, Feb 22, Feb 28, Mar 2, and Mar 16. Milestone labels: setting up the environment; gained knowledge; learned to store data in a CSV file; learned to solve errors while scraping; learned about testing; started the admin panel; cleaned and organized the collected data; continued scraping; successfully completed.]

Training Work
 Learn the basics of Python programming: Before diving into web scraping, it is important to have a
solid understanding of Python programming fundamentals. This includes data types, control
structures, functions, and object-oriented programming.
 Get familiar with web technologies: To effectively scrape data from websites, it is important to
have a basic understanding of HTML, CSS, and JavaScript. This will help you identify the relevant
tags and elements needed to scrape data from a website.
 Select a web scraping library: Python offers several web scraping libraries, including Beautiful
Soup, Scrapy, and Selenium. Each library has its own strengths and weaknesses, so it is important to
research and choose the library that best suits your project needs.
 Practice web scraping with sample projects: Start with small, simple web scraping projects to
practice your skills and get comfortable with the library you've chosen. This can include scraping
data from a single webpage or a small set of web pages.
 Explore advanced techniques: As you become more comfortable with web scraping, you can
explore more advanced techniques such as scraping data from dynamic web pages, using APIs, and
scraping data from social media platforms.
 Stay up-to-date on legal and ethical considerations: Web scraping can raise ethical and legal
concerns, so it is important to stay informed on the latest laws and regulations related to web
scraping.
 Collaborate and seek feedback: Working with other web scraping practitioners or seeking feedback
on your projects can help you identify areas for improvement and accelerate your learning process.

Project Introduction
 Python is a popular programming language that can be used for a variety of applications, including
web scraping. Web scraping involves the automated collection of data from websites and can be
used for a variety of purposes, such as market research, data analysis, and content aggregation.
 In a Python web scraping project, the goal is to write code that can automatically navigate to a
website, collect the desired data, and store it in a usable format, such as a CSV file or a database.
Python offers a variety of libraries for web scraping, including Beautiful Soup, Scrapy, and
Selenium.
 To start a web scraping project in Python, you should first identify the website or websites you want
to scrape and determine the data you want to collect. You will then need to write code that can
navigate to the website, locate the relevant data, and extract it. This may involve inspecting the
website's HTML structure, identifying the appropriate tags or classes, and writing code to parse the
data.
 Web scraping can raise ethical and legal concerns, particularly if you are scraping data without
the website owner's permission. It is therefore important to research and understand the legal and
ethical implications of web scraping before embarking on a project.
 Overall, a Python web scraping project can be a powerful tool for collecting data and extracting
insights from the web. With careful planning and consideration of ethical and legal issues, it can be
a valuable addition to your data analysis toolkit.
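
The locate-and-extract step described above can be sketched with the Python standard library alone. This is a minimal illustration, not the project's actual scraper: the HTML fragment, the class names hotel-name and price, and the helper extract_by_class are all assumptions made for the example. A real project would more likely use one of the libraries named above, such as Beautiful Soup.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag; they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class ClassTextExtractor(HTMLParser):
    """Collect the text of every element that carries a given CSS class."""

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.results = []   # extracted strings, in document order
        self._depth = 0     # greater than 0 while inside a matching element
        self._buffer = []   # text pieces of the element being read

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self._depth:
            self._depth += 1            # nested tag inside a match
        elif self.target_class in classes:
            self._depth = 1             # entering a matching element
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:            # matching element just closed
            self.results.append("".join(self._buffer).strip())

    def handle_data(self, data):
        if self._depth:
            self._buffer.append(data)


def extract_by_class(html, target_class):
    """Return the text of all elements in `html` that have `target_class`."""
    parser = ClassTextExtractor(target_class)
    parser.feed(html)
    parser.close()
    return parser.results


# Hypothetical page fragment; a real scraper would download this HTML first.
SAMPLE_HTML = """
<div class="hotel">
  <h2 class="hotel-name">Eurostars Blue Coruña</h2>
  <span class="price">120.00</span>
</div>
"""

names = extract_by_class(SAMPLE_HTML, "hotel-name")
prices = extract_by_class(SAMPLE_HTML, "price")
```

The class names would normally come from inspecting the target page in the browser's developer tools, as the text above suggests.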

Project Objective

 The objective of a web scraping project in Python is to write code that can
automatically navigate to a website, collect the desired data, and store it in a usable
format.
 The project involves identifying the website or websites to scrape, determining the
data to collect, and writing code to parse the data.
 Python offers a variety of libraries for web scraping, including Beautiful Soup,
Scrapy, and Selenium.
 Web scraping can raise ethical and legal concerns, so it is important to research and
understand the legal and ethical implications of web scraping before embarking on a
project.
 The ultimate goal of a Python web scraping project is to collect data and extract
insights from the web.
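
As a rough sketch of the "store it in a usable format" step, the snippet below writes collected records to CSV with the standard csv module. The column subset in FIELDNAMES, the sample record, and the helper write_records are assumptions for illustration; the price value is made up.

```python
import csv
import io

# Hypothetical subset of the collected fields, used as CSV columns.
FIELDNAMES = ["hotel_name", "city", "total_price"]


def write_records(records, file_obj):
    """Write a header row plus one row per record; return the row count."""
    writer = csv.DictWriter(
        file_obj,
        fieldnames=FIELDNAMES,
        extrasaction="ignore",   # silently skip keys that are not columns
    )
    writer.writeheader()
    for record in records:
        writer.writerow(record)  # missing keys are written as empty cells
    return len(records)


# Illustrative record; the values are not real scraped data.
records = [
    {"hotel_name": "Eurostars Blue Coruña", "city": "A Coruña", "total_price": "132.00"},
]

# An in-memory buffer keeps the example self-contained.
buffer = io.StringIO()
count = write_records(records, buffer)
csv_text = buffer.getvalue()
```

To write to disk instead, the same function can be given a file opened with open(path, "w", newline="", encoding="utf-8").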

Project Scope
 The scope of a web scraping project in Python depends on the goals of the project
and the data that needs to be collected.
 The project may involve scraping data from a single website or multiple websites,
depending on the needs of the project.
 The data collected may include text, images, videos, or other media. The scope may
also include organizing and cleaning the collected data for analysis, as well as
conducting exploratory data analysis and extracting insights from the data.
 It is important to consider ethical and legal implications when determining the scope
of a web scraping project, as scraping data without permission can be a violation of
website terms of service and even be illegal in some cases.
 Therefore, the scope should be defined in a way that is both effective and ethical,
ensuring that the project meets the needs of the organization or individual while
respecting the rights of website owners and users.
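
One common, partial way to respect a site owner's wishes is to check the site's robots.txt before requesting a page. The sketch below uses urllib.robotparser from the standard library; the robots.txt content, the user agent name example-bot, and the URLs are hypothetical. Passing this check does not replace reading the site's terms of service.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real check would download the file
# from the target site before scraping.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if `user_agent` may fetch `url` under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


allowed_public = is_allowed(ROBOTS_TXT, "example-bot", "https://example.com/hotels/1")
allowed_private = is_allowed(ROBOTS_TXT, "example-bot", "https://example.com/private/data")
```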

Technologies/Platform (to be used for development)
 Front end:
• Python
 Back end:
• Python
 Development tools:
• Visual Studio Code, Jupyter Notebook

Modules of Project
 Address : The address of the hotel. Example: "address": "Rúa Juana de
Vega, 7, A, 15004 A Coruña"
 Availability : When the hotel is not available for the selected dates, this
field should be set to "Not available"
 Check_in_date : Check-in date collected from the mobile app. String
format: "YYYY-MM-DD". Example: "ci_date": "2022-05-01"
 Check_out_date : Check-out date collected from the mobile app. String
format: "YYYY-MM-DD", the same as the check-in date
 City : The city as seen on the mobile app (if it exists)
 Country : The country as seen on the mobile app (if it exists)

 Offers : List of all offers available for this shop. Please see the list of
Offers fields below
 geo_location : Geolocation, if available on the mobile app.
Example: "geo_location": { "latitude": "43.3663834", "longitude": "-8.4055085" },
 hotel_name : Hotel name from the mobile app page that was opened.
Example: "hotel_name": "Eurostars Blue Coruña"
 Hotel star rating : Number of stars if present
 Number_of_reviews : Number of customer reviews
 Number_of_guests : Number of guests collected from mobile app
 Pos : Used pos

 Snapshot_url : URL of the screenshot, if one was requested in the input
 Status : Error status or message, if any. Otherwise, could be “200” or
“OK”
 Room_name : Name of the room. Example: double, twin, double with a
nice view, etc
 Rate_name : Rate of the room. Example: early booking, all inclusive, hot
deal rate, etc
 Rate_type : Pay now / pay later
 Promotion : Text that describes the promotion of the room rate (if exists)
 Available : True if the rate is available, false if not

 Shown_currency : The shown currency
 Shown_price : The price that is shown for the room-rate
 Net_price : The base price (without taxes)
 Total_price : The total price (including all the taxes)
 Breakfast : True if breakfast is included, false if not
 Breakfast_policy : Text describing the breakfast policy, or the
breakfast itself (what it includes, what it costs, etc.)
 Free_cancellation : True if free cancellation is included, false if not
 Free_cancellation_policy : Text describing the policy of the free
cancellation

 Number_of_guests : Number of guests the room is good for
 Taxes_info : Text that describes which taxes are included / not included in
the shown_price
 Taxes : List of elements which describes Taxes:
● amount
● currency
● tax_name
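
To show how these fields might fit together in one output record, here is a small sketch using dataclasses and JSON. It assumes the room and rate fields belong to each entry of Offers and that each offer carries its own Taxes list; the class names, the choice of fields, and the price and tax values are illustrative and not taken from the project. Only the hotel name, address, check-in date, and geolocation examples come from the field list above.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import date


@dataclass
class Tax:
    amount: str
    currency: str
    tax_name: str


@dataclass
class Offer:
    room_name: str
    rate_name: str
    shown_currency: str
    shown_price: str
    available: bool = True
    breakfast: bool = False
    free_cancellation: bool = False
    taxes: list = field(default_factory=list)   # list of Tax


@dataclass
class HotelRecord:
    hotel_name: str
    address: str
    ci_date: str            # "YYYY-MM-DD"
    geo_location: dict
    status: str = "OK"
    offers: list = field(default_factory=list)  # list of Offer


record = HotelRecord(
    hotel_name="Eurostars Blue Coruña",
    address="Rúa Juana de Vega, 7, A, 15004 A Coruña",
    ci_date=date(2022, 5, 1).isoformat(),       # gives "2022-05-01"
    geo_location={"latitude": "43.3663834", "longitude": "-8.4055085"},
    offers=[
        Offer(
            room_name="double",
            rate_name="early booking",
            shown_currency="EUR",                # hypothetical value
            shown_price="120.00",                # hypothetical value
            taxes=[Tax(amount="12.00", currency="EUR", tax_name="VAT")],
        )
    ],
)

# asdict() converts nested dataclasses, including those inside lists.
record_json = json.dumps(asdict(record), ensure_ascii=False, indent=2)
```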

Database Schema
Table 1: Table Name

Column Name    Data Type    Size    Constraint     Description
hotel_id       INT          10      Primary Key    Hotel Id
hotel_name     VARCHAR      250     ---            Hotel name
total_price    VARCHAR      50      ---            The total price
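
The schema in Table 1 could be created as follows. This is only a sketch using SQLite from the Python standard library; the presentation does not say which database engine was used, and the table name hotels and the helper functions are assumptions. SQLite accepts the VARCHAR sizes but does not enforce them.

```python
import sqlite3

# Table name "hotels" is hypothetical; the columns follow Table 1.
CREATE_TABLE_SQL = """
CREATE TABLE IF NOT EXISTS hotels (
    hotel_id    INTEGER PRIMARY KEY,
    hotel_name  VARCHAR(250),
    total_price VARCHAR(50)
)
"""


def open_database(path=":memory:"):
    """Open the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(CREATE_TABLE_SQL)
    return conn


def insert_hotel(conn, hotel_name, total_price):
    """Insert one row and return the generated hotel_id."""
    cursor = conn.execute(
        "INSERT INTO hotels (hotel_name, total_price) VALUES (?, ?)",
        (hotel_name, total_price),
    )
    conn.commit()
    return cursor.lastrowid


conn = open_database()
new_id = insert_hotel(conn, "Eurostars Blue Coruña", "132.00")  # made-up price
rows = conn.execute(
    "SELECT hotel_id, hotel_name, total_price FROM hotels"
).fetchall()
```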

Diagrams
 The company does not share its data dictionary; access is restricted, and
the non-disclosure certificate is attached.

Implementation

Fig. 1: Hotel Details



Fig. 2: Room Details



Fig. 3: Tax and Price



Fig. 4: Not available cases


Testing
Table 2: Request Table

Test Case ID    Test Data                             Expected Result                       Actual Result                         Pass/Fail
1               Request response status code = 200    Data is stored in the database        Data is stored in the database        Pass
2               Request response status code = 404    Data is not stored in the database    Data is not stored in the database    Pass
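
The two test cases above can be expressed as a small automated check. The sketch below assumes a hypothetical helper store_if_ok that writes a row only when the response status code is 200, and it uses an in-memory SQLite table named hotels; neither is taken from the project's code, and the price is a made-up value.

```python
import sqlite3


def store_if_ok(conn, status_code, hotel_name, total_price):
    """Store the row only for a 200 response; return True if stored."""
    if status_code != 200:
        return False
    conn.execute(
        "INSERT INTO hotels (hotel_name, total_price) VALUES (?, ?)",
        (hotel_name, total_price),
    )
    conn.commit()
    return True


def count_rows(conn):
    """Return the number of rows currently in the hotels table."""
    return conn.execute("SELECT COUNT(*) FROM hotels").fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hotels ("
    "hotel_id INTEGER PRIMARY KEY, hotel_name TEXT, total_price TEXT)"
)

# Test case 1: status 200, so the data should be stored.
stored_200 = store_if_ok(conn, 200, "Eurostars Blue Coruña", "132.00")
count_after_200 = count_rows(conn)

# Test case 2: status 404, so nothing new should be stored.
stored_404 = store_if_ok(conn, 404, "Eurostars Blue Coruña", "132.00")
count_after_404 = count_rows(conn)
```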
Future Work
 Automated data analysis: As the amount of data available online continues to grow,
web scraping tools will become increasingly important for businesses and researchers
who need to collect and analyze data from multiple sources. In the future, we can
expect to see more advanced data analysis tools that use web scraping to gather
information from websites and other online sources.
 Personalization: Web scraping can also be used to create personalized experiences for
users. For example, an online retailer could use web scraping to gather information
about a customer's preferences and purchase history in order to offer personalized
recommendations.
 AI training: Web scraping can be used to collect large amounts of data that can be
used to train machine learning algorithms. In the future, we can expect to see more
advanced AI applications that rely on web scraping to gather data.
 Monitoring: Web scraping can be used to monitor websites for changes or updates.
For example, a news organization could use web scraping to monitor social media
and other online sources for breaking news.

Conclusion
 Web scraping is a valuable tool for extracting data from websites and other online
sources. It has many practical applications, including data analysis, personalization,
AI training, and monitoring. Web scraping can save time and resources, and it can
provide access to information that might not be available through other means.
 However, it is important to use web scraping tools ethically and responsibly. Some
websites have policies that prohibit web scraping, and scraping can also raise privacy
concerns if personal information is being collected. It is important to be aware of
these issues and to use web scraping tools in a way that respects the rights and
privacy of others.
 Overall, web scraping is a powerful tool that can be used to gather data and gain
insights into a wide range of topics. As technology continues to evolve, we can
expect to see even more innovative uses for web scraping in the years to come.

References
 "Web Scraping" on Wikipedia: https://en.wikipedia.org/wiki/Web_scraping
 "The Ultimate Guide to Web Scraping" by Octoparse:
https://www.octoparse.com/blog/the-ultimate-guide-to-web-scraping
 "Web Scraping with Python" by DataCamp:
https://www.datacamp.com/community/tutorials/web-scraping-python-nlp
 "Web Scraping Tutorial: How to Scrape Data From Any Website" by Moz:
https://moz.com/blog/web-scraping-tutorial-python-tips
 "Ethics in Web Scraping" by Scrapinghub:
https://www.scrapinghub.com/ethics-in-web-scraping/
 "Web Scraping: A Guide to Best Practices for Developers" by ProgrammableWeb:
https://www.programmableweb.com/news/web-scraping-guide-to-best-practices-for-developers/2019/07/22


Thank You
