How To Test With Simple HTML Tags Using Beautifulsoup

Uploaded by

nafisa salim

HOW TO TEST WITH SIMPLE HTML TAGS USING BEAUTIFULSOUP

1. Open Anaconda Navigator

2. Launch Jupyter Notebook

3. New  python 3

4. Enter the following codes:

a. from bs4 import BeautifulSoup as bs SHIFT + ENTER

b. test_url = "C:\\Users\\User\\Desktop\\simple.html" SHIFT + ENTER

c. soup = bs(open(test_url), 'html.parser') SHIFT + ENTER

d. print (soup) SHIFT + ENTER

e. print (soup.prettify()) SHIFT + ENTER

f. soup.title SHIFT + ENTER

g. soup.body SHIFT + ENTER

h. soup.body.contents[1] SHIFT + ENTER

i. soup.get_text()

j. print(soup.get_text())

k. print(soup.get_text(strip=True))

l. print(soup.get_text(' ', strip=True))

m. soup.find_all('p')

n. soup.find_all('p', {'id': 'First content'})
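
The steps above read simple.html from the Desktop, and the contents of that file are not shown here. As a rough sketch, the same calls can be tried against an inline HTML string. The markup and the variable names below are assumptions made up for illustration, not the contents of the real file.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for simple.html; the real file's contents are not shown.
SAMPLE_HTML = """<html>
<head><title>Simple page</title></head>
<body>
<p id="First content">Hello</p>
<p id="Second content">World</p>
</body>
</html>"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

title_text = soup.title.string              # text inside <title>
first_child_tag = soup.body.contents[1]     # contents[0] is the newline after <body>
joined_text = soup.get_text(" ", strip=True)  # strip each string, join with a space
all_paragraphs = soup.find_all("p")         # every <p> tag
first_paragraph = soup.find_all("p", {"id": "First content"})  # filter by id

print(title_text)
print(first_child_tag)
print(joined_text)
print(len(all_paragraphs), first_paragraph[0].get_text())
```

In this sketch soup.body.contents[0] is only a newline string, which is probably why the step above asks for index 1 rather than 0.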

Example: scraping https://www.newegg.com/Laptops-Notebooks/Category/ID-223?Tid=17489

from bs4 import BeautifulSoup as soup

from urllib.request import urlopen as uReq

my_url = "https://www.newegg.com/Laptops-Notebooks/Category/ID-223?Tid=17489"

uClient = uReq(my_url)  # open a connection to the specified URL

my_page = uClient.read()  # read the raw HTML of the page

uClient.close()  # close the connection once the page has been read

page_soup = soup(my_page, "html.parser")  # parse the page content

# select every <div class="item-container"> tag

my_content = page_soup.find_all("div", {"class": "item-container"})

print(my_content)  # display what is in my_content
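
The dictionary filter {"class": "item-container"} is one of several ways to select tags by class. The sketch below compares it with the class_ keyword and a CSS selector on a small made-up snippet. The markup is an assumption for illustration only; the live Newegg page is far larger and its structure may have changed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, not taken from the real page.
html = """
<div class="item-container featured"><a class="item-title">Laptop A</a></div>
<div class="item-container"><a class="item-title">Laptop B</a></div>
<div class="banner">Not a product</div>
"""
page = BeautifulSoup(html, "html.parser")

by_dict = page.find_all("div", {"class": "item-container"})   # attribute dictionary
by_keyword = page.find_all("div", class_="item-container")    # class_ keyword
by_css = page.select("div.item-container")                    # CSS selector

print(len(by_dict), len(by_keyword), len(by_css))
```

All three calls match a tag when item-container is any one of its classes, so the first div is selected even though it also carries a second class.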

for x in my_content:  # loop through every item-container
    model = x.div.div.a.img['title']  # scrape the image title by navigating the tag tree
    print(model)
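
The chained lookup x.div.div.a.img['title'] only works when every tag along the path exists. A minimal sketch of the same navigation, plus a more forgiving alternative, is shown here on invented markup. The class names inside the containers and the helper image_title are hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second container has no image at all.
html = """
<div class="item-container">
  <div class="item-badges">
    <div class="item-image">
      <a href="#"><img title="Sample Laptop 15" src="laptop.jpg"></a>
    </div>
  </div>
</div>
<div class="item-container">
  <div class="item-info">No image here</div>
</div>
"""
soup = BeautifulSoup(html, "html.parser")
containers = soup.find_all("div", {"class": "item-container"})


def image_title(container):
    """Return the title of the first <img> in the container, or None."""
    img = container.find("img")
    return img.get("title") if img is not None else None


# Each dotted step returns the first matching descendant tag.
chained = containers[0].div.div.a.img["title"]
titles = [image_title(c) for c in containers]

print(chained)
print(titles)
```

Running the chained form on the second container would raise an AttributeError, because its inner .div lookup returns None before .a is reached.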

for x in my_content:
    model = x.div.div.a.img['title']  # each image has a different title
    item_desc = x.find_all('a', {'class': 'item-title'})  # find every <a class="item-title"> tag
    print(len(item_desc))  # how many matches are there?
    print(item_desc[0])  # list indexes start at 0

for x in my_content:
    model = x.div.div.a.img['title']
    item_desc = x.find_all('a', {'class': 'item-title'})
    print(len(item_desc))
    print(item_desc[0].text)  # display only the text

for x in my_content:
    model = x.div.div.a.img['title']
    item_desc = x.find_all('a', {'class': 'item-title'})
    print('Model : ' + model)
    print('Product Name : ' + item_desc[0].text + '\n')

for x in my_content:
    model = x.div.div.a.img['title']
    item_desc = x.find_all('a', {'class': 'item-title'})
    shipping = x.find_all('li', {'class': 'price-ship'})  # shipping information
    print(shipping[0].text.strip())

for x in my_content:
    model = x.div.div.a.img['title']
    item_desc = x.find_all('a', {'class': 'item-title'})
    shipping = x.find_all('li', {'class': 'price-ship'})
    print('Model : ' + model)
    print('Product Description : ' + item_desc[0].text)
    print('Shipping : ' + shipping[0].text.strip() + '\n')
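
The final loop can also be tried without a network connection. The sketch below runs the same three lookups on made-up markup and collects the results in a list of dictionaries instead of printing them. The sample HTML, the function name parse_items, and the dictionary keys are all assumptions for illustration; the real page layout may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second item has no shipping line.
SAMPLE = """
<div class="item-container">
  <div><div><a href="#"><img title="Laptop A"></a></div></div>
  <a class="item-title" href="#">Laptop A 15.6 inch</a>
  <li class="price-ship">  Free Shipping  </li>
</div>
<div class="item-container">
  <div><div><a href="#"><img title="Laptop B"></a></div></div>
  <a class="item-title" href="#">Laptop B 14 inch</a>
</div>
"""


def parse_items(html):
    """Return one dict per item-container, with None for missing fields."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for item in soup.find_all("div", {"class": "item-container"}):
        img = item.find("img")
        title = item.find("a", {"class": "item-title"})
        shipping = item.find("li", {"class": "price-ship"})
        rows.append({
            "model": img.get("title") if img is not None else None,
            "description": title.get_text(strip=True) if title is not None else None,
            "shipping": shipping.get_text(strip=True) if shipping is not None else None,
        })
    return rows


rows = parse_items(SAMPLE)
for row in rows:
    print(row)
```

Using find with a None check, rather than indexing the first element of a find_all result, keeps the loop running when an item lacks a field such as the shipping line.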
